Solid-state NMR Study Reveals Collagen I Structural Modifications of Amino Acid Side Chains upon Fibrillogenesis

Background: Collagen fibrillogenesis is a fundamental process both in vivo and in vitro. Results: Fibrillogenesis increases the heterogeneity of conformations of amino acid side chains, impacting 40% of imino acids. Conclusion: A temperature-induced attractive force component comes from a significant amount of the collagen amino acids. Significance: This work brings new perspectives to studies of collagen assembly and interactions with other matrix molecules.

In vivo, collagen I, the major structural protein in the human body, is found assembled into fibrils. In the present work, we study a highly concentrated collagen sample in its soluble, fibrillar, and denatured states using one- and two-dimensional {1H}-13C solid-state NMR spectroscopy. We interpret 13C chemical shift variations in terms of dihedral angle conformation changes. Our data show that fibrillogenesis increases the side chain and backbone structural complexity. Nevertheless, only three to five rotameric equilibria are found for each amino acid residue, indicating a relatively low structural heterogeneity of collagen upon fibrillogenesis. Using side chain statistical data, we calculate equilibrium constants for a great number of amino acid residues. Moreover, based on a 13C quantitative spectrum, we estimate the percentage of residues implicated in each equilibrium. Our data indicate that fibril formation greatly affects hydroxyproline and proline prolyl pucker ring conformation. Finally, we discuss the implication of these structural data and propose a model in which the attractive force of fibrillogenesis comes from a structural reorganization of 10 to 15% of the amino acids. These results allow us to further understand the self-assembling process and fibrillar structure of collagen.

Collagen I, the most abundant protein in the human body, is a very long (300 nm) and thin (1.5 nm) protein. It is formed by three left-handed polypeptide chains supercoiled into a right-handed helix. These polypeptide chains, composed of ~1040 amino acids each (1), are formed almost exclusively by repeats of three amino acid units, -(Gly-X-Y)n-, where Gly is a glycine and X and Y can be any other amino acid, but often, in mammals, X is a proline and Y is a hydroxyproline (2). The collagen molecule has a propensity to self-associate into fibrils that further organize into hierarchical architectures both in vivo and in vitro (3,4). Within those units, collagen molecules are arranged longitudinally and laterally so as to form a crystal-like period of 67 nm, called the "D period," evidenced both by x-ray diffraction and electron microscopy studies (5,6). So far, the highest resolution obtained on the whole molecule by x-ray diffraction is 5-11 Å (lateral and axial resolution, respectively) (5). High-resolution local structural data were obtained on several crystals of collagen-like peptides. However, due to the limited number of these crystals (~20 resolved structures currently in the Protein Data Bank), the data collected are far from representing the great diversity of -(Gly-X-Y)n- sequences present in collagen I (7). 1H NMR has shown only moderate potential to solve collagen-like peptide structures because of many overlaps of the resonances (8). Moreover, reported 13C NMR experiments did not display sufficient spectral resolution to allow a clear distinction among the different amino acid carbon resonances (9,10). NMR studies using specific residue labeling have given valuable information on dynamics (11)(12)(13). To deduce the backbone conformation, Saitô et al. (9) have compared 13C chemical shifts of fibrillar collagen with those obtained on small triple helical peptides. Torchia et al. (14) have compared native and denatured resonances of collagen fragment CB2.
More recently, Zhu et al. (10) have studied structural modifications of bovine cortical bone induced by dehydration. Even though all of these studies reveal important structural characteristics of collagen, many 13C chemical shifts were "hidden" due to poor spectral resolution.
In the present work, using solid-state NMR, we have investigated structural changes in collagen at high concentrations in four different regimes: three native conditions defined as (i) soluble at acidic pH (pH 2.5; T = 20°C), (ii) and (iii) fibrillar at basic pH (pH 8.5; T = 20°C and T = 30°C), and a denatured condition (iv) (pH 8.5; T = 35°C). This last sample was used as a reference. We have observed spectrally resolved one-dimensional and two-dimensional {1H}-13C magic angle spinning (MAS) spectra (13C typical line width from 20 to 40 Hz). We have identified side chain and backbone 13C chemical shift modifications upon fibrillogenesis passing from acidic to basic pH. We have shown that one given residue can adopt different conformations along the molecule. We have interpreted our data in terms of side chain conformational modifications using statistical data from Ref. 15. We have also quantified those structural modifications by comparing the 13C chemical shifts of denatured collagen to side chain structural library data.

EXPERIMENTAL PROCEDURES
Sample Purification and Fibrillogenesis - Collagen was prepared as described previously (6). Collagen concentration was assessed by hydroxyproline titration (16), and purity was determined by SDS-PAGE. Collagen stock solution (500 mM CH3COOH, pH 2.5) was dialyzed against HCl (3 mM, pH 2.5) and concentrated to 410 mg·ml⁻¹ by a centrifugation filtration process using PES tubes of 30 kDa cut-off (3000 × g at 10°C, Eppendorf 5702R™ variable angle centrifuge). The pH was further increased by exposing the acidic sample to ammonia vapors (30%) at 20°C for 7 min. Two-dimensional solid-state NMR spectra were recorded 6 h after the exposure of the sample to ammonia and acquired over 6 h. The stability of the sample was checked by recording one-dimensional 13C spectra before and after each two-dimensional {1H}-13C experiment. The spectra were similar for each condition. Finally, the "basic pH" sample was centrifuge-filtered to collect enough liquid (~150 µl) to measure the pH with pH paper. The resulting pH was 8.5 ± 0.5. Denaturation of the sample was obtained by exposing the sample to high power 1H decoupling (ν(1H) = 60 kHz) for 16 h.
NMR Parameters - Solid-state NMR analyses were performed on a Bruker Avance III 500 spectrometer at 11.7 Tesla operating at 500.21 MHz for 1H and 125.79 MHz for 13C. A Bruker MAS double resonance probe was used with 7-mm rotors spun at a frequency νMAS = 2.5 kHz. A BCU-X unit was used to regulate the temperature between 20 and 35°C. For all experiments, t90(1H) and t90(13C) were 11 and 7 µs, respectively. Relaxation delays of 1 and 3 s for 1H and 13C, respectively, were sufficiently long to record quantitative spectra. 512 scans were acquired for the 13C single pulse experiment, and 48 scans were acquired for the two-dimensional {1H}-13C Insensitive Nuclei Enhanced by Polarization Transfer (INEPT) MAS experiment. For all 13C spectra, low power proton decoupling (4.5 kHz) was applied during acquisition. The INEPT evolution delay (1.7 ms) was matched to an averaged 1JCH value of 145 Hz, and the refocusing delay (1.1 ms) was matched to get the CH, CH2, and CH3 resonances similarly phased. 256 t1 transients were collected for two-dimensional {1H}-13C INEPT spectra. 1H and 13C chemical shifts were referenced (δ = 0 ppm) to tetramethylsilane. No 13C signal was detected through a cross-polarization MAS experiment (contact time of 10 ms, 440 scans, and 1H decoupling of 40 kHz). See supplemental Fig. S2 for details.
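The acquisition parameters quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' procedure: it assumes the common 1/(4J) setting for the INEPT evolution delay, uses the IUPAC frequency ratio Ξ(13C) for tetramethylsilane to relate the 1H and 13C frequencies, and converts the 20 to 40 Hz line widths into ppm. All function names are hypothetical.

```python
# Consistency checks for the reported NMR parameters (illustrative sketch).

# IUPAC-recommended frequency ratio Xi for 13C relative to 1H of TMS, in percent.
XI_13C_PERCENT = 25.145020


def inept_evolution_delay_ms(j_hz: float) -> float:
    """Evolution delay 1/(4J) in milliseconds (assumed convention)."""
    return 1e3 / (4.0 * j_hz)


def carbon_frequency_mhz(proton_frequency_mhz: float) -> float:
    """13C reference frequency derived from the 1H frequency via Xi."""
    return proton_frequency_mhz * XI_13C_PERCENT / 100.0


def hz_to_ppm(width_hz: float, carrier_mhz: float) -> float:
    """Convert a width in Hz into ppm at the given carrier frequency (MHz)."""
    return width_hz / carrier_mhz


if __name__ == "__main__":
    print(f"INEPT delay for 1JCH = 145 Hz: {inept_evolution_delay_ms(145.0):.2f} ms")
    print(f"13C frequency at 500.21 MHz 1H: {carbon_frequency_mhz(500.21):.2f} MHz")
    for width in (20.0, 40.0):
        print(f"{width:.0f} Hz at 125.79 MHz: {hz_to_ppm(width, 125.79):.2f} ppm")
```

Under these assumptions the 145 Hz coupling gives a delay of about 1.72 ms, in line with the 1.7 ms quoted, and the 20 to 40 Hz line widths correspond to roughly 0.16 to 0.32 ppm at 125.79 MHz.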
Assignment and Fit - 1H and 13C chemical shift assignments were based on two-dimensional {1H}-13C INEPT MAS spectra (see Fig. 1 and supplemental Fig. S1) and on previous work on a collagen-like peptide (8) and native collagen I (9,10,17). No ambiguity was found for carbon resonances except for some Cα and Cβ, for which we observed overlaps in the two-dimensional {1H}-13C INEPT MAS spectra obtained in native conditions. In our assignments, we did not consider tryptophans and cysteines because collagen I from rat tendon contains neither (1). We were not able to assign any resonance to methionine, histidine, and tyrosine due to their very small amounts (<0.6% of the total amino acid content). We were not able to assign 3-hydroxyprolines (~5% of total hydroxyproline content (18)) because such modifications shift the carbon resonances far downfield (19). This point, together with the lack of data in the literature, makes the assignment unreliable.
One-dimensional 13C spectra were fitted using Dmfit software (20). This allows the determination of the number of underlying resonances within a given peak and, consequently, allows the extraction of either the number of residues or the number of conformations for a given carbon. A direct quantification was not possible in the specific case of Arg Cγ and Pro Cγ, which overlapped in the one-dimensional 13C spectrum (supplemental Fig. S2). However, we used Arg Cδ peaks, free of overlap in the one-dimensional spectra, to deduce the relative intraresidual intensity of Arg Cγ (supplemental Table S3 and supplemental data). Because there is half as much arginine as proline, we considered the arginine intensity to be half that of proline in the quantitative spectrum. We then subtracted this intensity from the overlap peak to deduce the intensity of Pro Cγ. Dmfit software generated S.D. values based on the match between experimental data and peak simulation. Supplemental Table S1 shows that the S.D. values for these fits range from 1 to 10% depending on carbon type and physicochemical condition. Notably, in our case, a change of 0.01 to 0.02 ppm in peak position always induced an increase of the S.D. of ~1 unit or greater. We thus estimated our resolution limit to be 0.02 ppm. We were not able to fit correctly the α and some β carbons; however, all linear side chain carbons have been fitted (S.D. always below 6% at pH 8.5).
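The subtraction used for the overlapping Arg Cγ and Pro Cγ resonances can be sketched as follows. This is an illustration under stated assumptions rather than the authors' actual procedure (which is given in their supplemental data): it assumes that the overlapped region contains only Arg Cγ and Pro Cγ intensity, that the total arginine intensity is half the total proline intensity, that the Arg Cγ intensity is distributed over its peaks in the same fractions as the overlap-free Arg Cδ peaks, and that each Arg Cγ component sits under one fitted component of the overlapped region. The function name and all numerical values are hypothetical.

```python
# Illustrative deconvolution of an Arg Cgamma / Pro Cgamma overlap region.

def pro_cgamma_intensities(overlap_peaks, arg_cdelta_fractions, arg_to_pro_ratio=0.5):
    """Return Pro Cgamma intensities after removing the Arg Cgamma share.

    overlap_peaks: fitted intensities of the overlapped components.
    arg_cdelta_fractions: relative intensities of the Arg Cdelta peaks
        (must sum to 1), used as a proxy for the Arg Cgamma distribution.
    arg_to_pro_ratio: total Arg intensity divided by total Pro intensity.
    """
    if len(overlap_peaks) != len(arg_cdelta_fractions):
        raise ValueError("one Arg fraction is needed per overlapped component")
    if abs(sum(arg_cdelta_fractions) - 1.0) > 1e-6:
        raise ValueError("Arg Cdelta fractions must sum to 1")

    total = sum(overlap_peaks)
    # total = I_arg + I_pro with I_arg = ratio * I_pro
    arg_total = total * arg_to_pro_ratio / (1.0 + arg_to_pro_ratio)
    arg_peaks = [arg_total * fraction for fraction in arg_cdelta_fractions]
    return [observed - arg for observed, arg in zip(overlap_peaks, arg_peaks)]


if __name__ == "__main__":
    # Hypothetical fitted intensities (arbitrary units), not values from the paper.
    pro = pro_cgamma_intensities([30.0, 45.0, 15.0], [0.25, 0.5, 0.25])
    print(pro)
```

With these made-up inputs, one third of the overlapped intensity is attributed to arginine and the remainder to proline, which is the proportion implied by the two-to-one residue ratio.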

RESULTS
Dense collagen I matrices were synthesized through a two-step procedure making use of the lyotropic properties of collagen molecules at acidic pH and their self-assembling properties at neutral to basic pH. As it has been previously observed that collagen molecules assume the classical fibrillar staggered arrangement in both conditions (6), we used solid-state NMR to analyze structural modifications of collagen molecules in four different regimes: (i) soluble (pH 2.5) at 20°C, (ii) and (iii) fibrillar (pH 8.5) at 20 and 30°C, and (iv) denatured (pH 8.5) at 35°C. In (iv), 13C chemical shifts only reflect the primary structure of the collagen protein and are not subject to higher order structural constraints (triple helix or fibrillar) (14). Therefore, (iv) was used as a reference condition. The 13C one-dimensional and one-dimensional {1H}-13C INEPT MAS spectra of a fibrillar sample are depicted in supplemental Fig. S2. These spectra present a remarkable spectral resolution (full width at half-maximum ~0.2 to 0.4 ppm) compared with previous 13C NMR spectra reported in the literature (9,10,21). The 13C chemical shifts obtained from the quantitative spectrum and those obtained from the one-dimensional {1H}-13C INEPT spectrum are identical (supplemental Fig. S2). However, no signal was detected from a 1H cross-polarization MAS experiment (supplemental Fig. S2), meaning that all carbons from the sample exhibit fast dynamics. To analyze the impact of pH and temperature upon the collagen structure, we used a two-dimensional {1H}-13C INEPT MAS experiment to correlate the 1H to the 13C chemical shifts (supplemental Fig. S3). 1H and 13C chemical shifts of the denatured sample are in very good agreement with previous studies of collagen-like peptides (8,14). In the denatured state (iv), most of the side chain and backbone carbons display only one single resonance (Fig. 1 and supplemental Fig. S1, bottom row).
Backbone and side chain 13C chemical shifts found in the present work are in good agreement with data extracted from a random coiled protein library (S.D. of 0.07 after calibration (22,23)). Considering all native conditions, whether at acidic or at neutral to basic pH, the most striking difference from (iv) is a split of resonances in the carbon dimension for all detectable side chain chemical shifts. At acidic pH, although this split is less pronounced, peak fitting indicates clearly that each side chain carbon is composed of two to five peaks, depending on the residue (Fig. 1 and supplemental Fig. S1, second row from bottom). Interestingly, increasing the pH induces chemical shift dispersion for a given carbon (Fig. 1 and supplemental Fig. S1, third row from bottom). For some carbons, the peak splitting is further increased by increasing temperature from 20 to 30°C (Fig. 1 and supplemental Fig. S1, top row). In native conditions, most α carbons are composed of more than one peak but, because of the poor α carbon spectral resolution, we were not able to make a proper fit of these peaks. Thus, a change of pH and/or temperature does induce changes in carbon chemical shift for α carbons, although we cannot quantify it. For all native conditions, Gly Cα displays two resonances centered at 43.37 ppm and 42.60 ppm with the same relative intensity and at positions quite close to that of the denatured state (iv) (supplemental Figs. S1 and S4). In the proton dimension, the Gly Hα less intense peak presents a clear upfield shift (Δδ = −0.1 ppm) compared with (iv) (supplemental Figs. S1 and S4). Considering the native conditions, the only α carbon that presents a measurable change in chemical shift in the carbon dimension is Asp Cα, for which an increase in pH induces a clear downfield shift of Δδ = +0.15 ppm (supplemental Fig. S1). This shift difference is in agreement with that expected considering deprotonation effects (24,25).
Among β carbons, Asp Cβ shows an important downfield shift of +0.6 ppm with the pH increase and, at 30°C, pH 8.5 ± 0.5, it is composed of four peaks (Fig. 1C). Ala Cβ presents two peaks at acidic pH, and increasing pH induces the appearance of one less intense peak at a downfield position (Fig. 1A and supplemental Table S3). For β carbons possessing a hydroxyl group (Ser, Thr), no significant changes were seen.
To better describe the effects of pH and temperature, we divided the side chain carbons into three classes based on similarity of changes observed in carbon chemical shift. The first class corresponds to Arg, Lys, Glu, and Gln residues. In native conditions, side chain carbons (Cγ, Cδ, and Cε) of these residues present three peaks in the carbon dimension. The central resonance is always very close to that of the denatured state (iv), and the two other peaks are located upfield and downfield. The increase in pH and temperature does not influence much the middle peak position but induces a further shift of the two others (Fig. 1 and supplemental Fig. S1, second row from top). The second class contains only Pro and Hyp residues. The increase in pH and temperature induces an important (Δδ ~ −0.6 ppm) upfield shift of the proline Cβ and Cγ resonances (~40% of the total intensity) (Fig. 1B). Hyp Cβ presents mainly three peaks at acidic pH and mainly five peaks at pH 8.5 ± 0.5 both at 20 and 30°C (Fig. 1C). No significant changes are observed for Hyp Cγ (supplemental Fig. S1), similarly to other carbons possessing hydroxyl groups. We were not able to fit correctly the Cδ resonance because of peak overlap. Hyp Cδ and Pro Cδ show larger line widths than the other prolyl carbons, making measurements of the less intense peaks difficult. In the third class, we find Leu (Cδ1 and Cδ2), Ile (Cδ1), Val (Cγ1 and Cγ2), and Thr (Cγ) residues.

Fibrillogenesis of Collagen I Studied by Solid-state NMR
The pH increase induces a split in the carbon dimension, while increasing temperature does not induce important changes except for Leu (Cδ1 and Cδ2) and Thr (Cγ), which present less intense peaks and show a slight increase in chemical shift dispersion (Fig. 1A).
The effect of physicochemical conditions on the resonance dispersion of arginine Cγ, proline Cγ, and hydroxyproline Cβ is shown in supplemental Fig. S5. Furthermore, Fig. 2, A-C, compares quantitatively the increases in 13C chemical shift dispersion for these residues. The graphs are normalized using chemical shift values from (iv). In particular, Fig. 2A reveals that, for Arg and Lys, pH and temperature have a greater impact on the most upfield peaks (Δδ = +0.75 ppm) than on the more downfield ones (Δδ = −0.17 ppm). In contrast, Gln exhibits a similar evolution of the chemical shift dispersion for the two extreme peaks (Δδ = ±0.3 ppm). Concerning Glu Cγ peaks, the pH increase induces an important downfield shift (Δδ = ±1.65 ppm). However, one cannot directly compare the acidic condition with (iv) (pH 8.5 ± 0.5) due to the deprotonations occurring at basic pH. For the second class, Hyp Cβ, Pro Cβ, and Pro Cγ, only part of the peaks shows an upfield shift (Fig. 2B). An increase in temperature induces an important shift of the resonances for Hyp and Pro (Fig. 2B and supplemental Fig. S5, B and C). Because our one-dimensional 13C experiments were made under quantitative conditions, it is worthwhile to compare the signal intensities between the two-dimensional INEPT experiment (sensitive to local motion) and the one-dimensional quantitative experiment. The difference between the two-dimensional and one-dimensional quantitative experiments is relatively small (<20%) (supplemental Fig. S2 and Table S3).

DISCUSSION
Collagen Dynamics - The spectra obtained through two-dimensional {1H}-13C INEPT MAS and one-dimensional 13C experiments are in good agreement with previous 13C NMR experiments reported on native and denatured samples (14). We observe that varying the pH does not significantly affect the 1H line widths for the native conditions, indicating that the structural changes induced by fibrillogenesis affect only moderately the collagen dynamics reflected by the line width of the 1H resonances (supplemental Table S1). Mosser et al. (26) have shown that increasing the pH of a very dense collagen solution (400-800 mg·ml⁻¹) resulted in the formation of very small fibrils (~5-15 nm in diameter). Nano-fibrils are also present within larger collagen I fibrils (5). As in this study we are dealing with very small fibrils subject to fast dynamics, this explains why the spectral resolution and the INEPT signal intensity are the same in the soluble and the fibrillar state. Moreover, this also explains the lack of spectral resolution in former studies carried out on Achilles tendons. In this tissue, the fibrils display a diameter at least five times larger, in which the nanofibrils must be more constrained (for example, 50 to 200 nm in Ref. 27). This also explains why we were unable to record any 13C cross-polarization signal.

FIGURE 2. [Plots not reproduced; beginning of legend missing.] Chemical shift differences are given with δden as the corresponding chemical shift in condition (iv) and δ as the corresponding chemical shift in conditions (i), (ii), and (iii). A, arginine, lysine, and glutamine γ carbons. For these residues, we have omitted the central peaks in the graph because they were not greatly affected by pH or temperature changes. B, hydroxyproline Cβ and proline Cγ and Cβ. C, leucine Cδ1, threonine Cγ2, and isoleucine Cγ2. D, equilibrium constants between the conformers +gauche and trans/−gauche for arginine, lysine, and glutamine χ1 dihedral angles in the different states. Denat, denatured; Sol, soluble; Fibril, fibrillar.

Considering side chain carbons (γ, δ, ε) for one given side chain type, our data show that the relative distribution of intensity in the two-dimensional INEPT spectra is always close to that found in one-dimensional quantitative spectra (supplemental Table S3). This result is consistent with the fact that in hydrated fibrils, there is no strong interaction between amino acid residues (11)(12)(13). In future work, a more quantitative analysis will be made concerning collagen dynamics. In the present work, we have chosen to focus our study on the structural changes revealed by the chemical shift analysis.
Chemical Shift Analysis - Previous works have shown that the chemical shifts of Cα, Cβ, and the overall side chain carbons (Cγ, Cδ, Cε) are sensitive to changes in backbone (φ, ψ) and side chain (χ1, χ2, χ3) dihedral angles, respectively. In contrast, it is considered that the chemical shifts of C=O carbons or of protons are sensitive to hydrogen bonding and degree of hydration (15,24,28,29). The physical origin of the correlation between chemical shift and residue conformation is known as the "γ effect": some side chain conformations induce a gauche interaction for a given carbon that increases steric conflict and generates an upfield 13C chemical shift (supplemental Fig. S6) (15). For proteins, this effect overcomes all other effects such as ring currents (29). Because at room temperature amino acid side chains of rat tendon fibrils show a very fast interconversion rate between different conformations (11)(12)(13), each 13C chemical shift in our spectra must represent an equilibrium between those conformations in specific locations along the polypeptide chain. In the denatured condition (iv), when no specific protein fold is present, almost all carbons display only one 13C chemical shift per carbon type, and thus, only a single equilibrium exists for all amino acids wherever their location in the primary sequence. On the contrary, in the triple helix and in the fibrillar state, conformations will be stabilized or destabilized according to the amino acid location in the molecule, thus generating different equilibria. For example, in our study, which deals with the full collagen triple helix, the Arg Cγ 13C resonances display three different chemical shifts, indicating the presence of three major equilibria along the molecule. Moreover, the work of Dunbrack et al. (30) shows that there is a strong correlation between dihedral angles inside side chains and between side chain and backbone dihedral angles.
Therefore, any change in chemical shift for a given side chain carbon (Arg Cγ, for example) must induce some changes in the chemical shift of other carbons (Arg Cδ) of the same residue. Indeed, one sees that all residues always display the same number of resonances for each of their carbons (three for Lys Cε, Cδ, and Cγ). Moreover, this indicates that, at least for residues having a carbon free of overlap in the one-dimensional quantitative experiment, all of the resonances are detected in the INEPT experiments.
Backbone Conformation - An extended study has been conducted to investigate backbone conformation based on sequence analysis and collagen-like peptide crystal structures (7). In particular, Bella (7) has predicted the presence of five major helical conformations in the human collagen I triple helix. This work shows that for regions poorer in imino acids, the triple helical twist is less important (7). Alanine Cβ is the most useful "probe" to detect backbone conformations due to its abundance in collagen (11%). Our results show that, at acidic pH, Ala Cβ displays four resonances representing four different backbone conformations. However, in the fibrillar state, Ala Cβ chemical shifts indicate that five major backbone conformations exist, in agreement with the prediction of Ref. 7 (Fig. 1A and supplemental Table S2). The Ala Cβ peak at 16.6 ppm is not affected by the association into a triple helix because this chemical shift remains comparable with that of (iv) (Fig. 1A and supplemental Table S2). In random coiled peptides, residues immediately followed by a proline present an upfield chemical shift of ~1 ppm for Cα and Cβ carbons (23). Because the peak fitting for (iv) (supplemental Table S2) agrees well with the number of Ala followed by an imino acid in the primary sequence (1), we assigned this peak (16.6 ppm) to a region rich in (Gly-Ala-Hyp) triplets. On this basis, the other peaks, at 17.14, 17.52, and 17.85 ppm, correspond to regions rather depleted in imino acids, and the 18.2-ppm peak corresponds to the poorest region.
Side Chain Analysis (Equilibrium Quantification) - To interpret our chemical shifts in terms of conformational changes, we have compared our data with the work of London et al. (15), in which the authors have built a library that correlates side chain conformations found in the Protein Data Bank with 13C chemical shifts. In this library, chemical shifts are those of proteins in solution at room temperature and, because of rotamer averaging, the differences between them are much smaller than those obtained for pure conformers (15). Using these data from the library, we were able to associate a given upfield or downfield shift with a given conformer stabilization. For example, the library shows for glutamine (Gln) that, statistically, when the side chain dihedral angle χ1 adopts the trans (180°) or −gauche (60°) conformation, the Gln Cγ chemical shift is ~0.5 ppm lower than when χ1 assumes the +gauche (−60°) conformation (15). Based on these data, a downfield shift (higher ppm value) for Gln Cγ means a stabilization of the +gauche conformation for χ1. Hansen et al. (29) have obtained chemical shift differences between pure conformations using the J coupling constant and estimated a chemical shift difference of 5.5 ppm between Ile Cδ1 conformers. These calculated chemical shift differences are always close to the difference found between the two most extreme values of the London et al. library (15). To extract structural information from our data, we made the assumption that the chemical shift values displayed for the denatured sample (iv) are close to the statistical conformer distribution found in conformation libraries (30). Indeed, as we have argued before, the lack of higher structural order in the denatured state liberates backbone and side chains into "random coiled" conformations. Then, the equilibrium between side chain conformations (+gauche, −gauche, and trans) only depends on the residue.
For example, this explains why in (iv), all arginines present the same 13C chemical shift. Thus, the chemical shift values from (iv) are in agreement with chemical shift values in the random coiled peptide side chain library (23) and with those of denatured collagen (14) found in the literature, even though the primary sequence and the physicochemical conditions are not the same (14,23). We thus used 13C chemical shift values from (iv) as internal references of "random coiled side chain chemical shift" and have directly associated them with a random coiled side chain statistical distribution using an appropriate statistical structural library (30). We also made the assumption that the amplitude of the chemical shift differences between two pure conformations could be deduced from the subtraction of the most extreme chemical shift values in the London et al. library (15). Detailed calculations with glutamine, glutamate, arginine, lysine, and leucine are given as examples in the supporting information section (supplemental data and dihedral angle analysis). From the calculations summarized in Table 1, we show that some Arg and Lys residues adopt a +gauche/trans conformation. This conformation is characterized by a side chain location that is extended away from the triple helix (supplemental Fig. S7). The calculations also show that fibrillogenesis has an impact on Leu residues, with half of them stabilized in an extended +gauche/trans conformation and the other half in a trans/−gauche conformation. It is important to know how many residues are implicated in those conformations. For glutamine, glutamate, and leucine, this information is directly accessible from the one-dimensional quantitative spectrum (supplemental Table S3). In the case of Arg and Lys, the strong correlation between dihedral angle conformations (30) makes a safe estimation possible (supplemental data and dihedral angle analysis).
Therefore, from one-dimensional 13C peak quantification analysis (supplemental Table S3), one can see that pH and temperature changes affect many residues (about one-third of the total glutamine, glutamate, arginine, and lysine residues).
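The reasoning above (a fast-exchange averaged shift, a denatured reference tied to a library population, and a pure-conformer shift span) can be illustrated with a two-state sketch. This is an assumption-laden simplification and not the authors' calculation, which is detailed in their supplemental data: it treats the equilibrium as two-state (+gauche versus trans/−gauche), assumes that a downfield shift relative to the denatured reference increases the +gauche population linearly, and uses placeholder numbers. The function names, the reference population, and the shift values are all hypothetical.

```python
# Two-state fast-exchange sketch: from a chemical shift change to an
# equilibrium constant (illustrative only).

def plus_gauche_population(delta_obs_ppm, delta_den_ppm, p_ref, pure_span_ppm):
    """Population of the +gauche conformer under fast two-state exchange.

    delta_obs_ppm: observed shift of one resonance in a native condition.
    delta_den_ppm: shift of the same carbon in the denatured reference.
    p_ref: +gauche population assumed for the denatured reference.
    pure_span_ppm: assumed shift difference between the pure conformers
        (+gauche minus trans/-gauche).
    """
    population = p_ref + (delta_obs_ppm - delta_den_ppm) / pure_span_ppm
    if not 0.0 < population < 1.0:
        raise ValueError("shift change is outside the assumed pure-conformer span")
    return population


def equilibrium_constant(population):
    """K = [+gauche] / [trans/-gauche] for a two-state equilibrium."""
    return population / (1.0 - population)


if __name__ == "__main__":
    # Hypothetical inputs: 0.3 ppm downfield of the reference, a 5.5 ppm span,
    # and an equal-population reference state.
    p = plus_gauche_population(30.3, 30.0, p_ref=0.5, pure_span_ppm=5.5)
    print(f"population = {p:.3f}, K = {equilibrium_constant(p):.3f}")
```

In this picture a resonance that coincides with the denatured reference keeps the reference equilibrium constant, while upfield or downfield components correspond to equilibria displaced toward one conformer or the other.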
Imino Acids - Because proline and hydroxyproline play an important role in collagen structure, they have been extensively investigated in the past (31)(32)(33). In native collagen I, as backbone dihedral angle conformations are always trans (8), there are only two possible stable conformations for imino acids: pucker up and pucker down. Actually, because the prolyl ring structure is rigid, any change in one dihedral angle affects all of the others (8). It has been shown that there is a clear correlation between imino acid positioning within the Gly-X-Y collagen triplet and pucker conformation. Residues in the X and Y positions preferentially adopt a down and an up conformation, respectively (2). However, our data show that proline Cβ and Cγ resonances display an important downfield shift that increases (~0.6 ppm for β and γ) with increasing pH, suggesting that fibrillogenesis strongly influences proline conformation (supplemental Tables S1 and S3). According to statistical data, a downfield shift for proline Cγ means a stabilization of the χ1 angle in the +gauche conformation, indicating an up pucker conformation. Thus, our data suggest that fibrillogenesis favors the up pucker conformation. Okuyama et al. (28) have shown that at room temperature, the crystal structure of -(Gly-Pro-Hyp)n- displays all prolines in a down conformation, but the same peptide at low temperature (100 K) displays nearly half of all prolines (three of seven) in an up conformation. Our data show that the chemical shift variation of Cδ is about half (~0.3 ppm) of that detected for Cβ and Cγ. This is in agreement with the simulated relative physical amplitude for these two carbons between down and up conformations (31). Peak quantification from one-dimensional 13C spectra reveals that ~40% of prolines undergo this stabilization.
In the case of hydroxyproline, our results show an increase of the Hyp Cβ chemical shift dispersion as the pH and temperature increase. On the contrary, Hyp Cγ does not display any significant dispersion. This is in agreement with the fact that the 13C chemical shift of carbons bonded to hydroxyl groups is governed by the strength of the hydrogen bond (19). For Hyp Cβ, a direct quantification can be made from the one-dimensional spectrum, indicating how many residues are implicated in these conformational changes. Considering the same arguments as for Pro residues, Hyp Cβ chemical shift analysis indicates that fibril formation stabilizes some hydroxyprolines in the up pucker conformation (~20%) and others in the down conformation (~20%) (supplemental Table S3). This is the first time that this Pro and Hyp pucker conformational change is evidenced in the collagen I molecule. This effect seems to be essential for fibril formation as it affects a great amount of imino acids, which altogether represent ~22% of collagen amino acids (1).
TABLE 1. 13C chemical shift (CS) and calculated equilibrium constant (K) for side chain carbons of the collagen sample in denatured, fibrillar, and soluble conditions. Chemical shift (δ) and peak intensity were measured using Dmfit 2008 as described under "Experimental Procedures." The equilibrium constant calculation is detailed in the supplemental data. Because the state of protonation of glutamates at pH 2.5 (i) is different from that of the reference (iv), we were not able to calculate its equilibrium constant. g, gauche; t, trans. [Table body not reproduced.]

Impact on Collagen Fibrillogenesis - The fact that there is a limited number of side chain equilibria (three to five) per amino acid is a very important point for understanding collagen fibrillogenesis. Indeed, one sees that in the quaternary fibrillar structure model (5), a given amino acid type from one triple helix will "face" a great diversity of amino acids of the neighboring triple helices. However, our data show that these numerous combinations of neighboring amino acids in the quaternary structure induce only a restricted number of side chain conformations. The present data also suggest that the relatively weak temperature-induced attractive force component responsible for fibrillogenesis comes from a rather large proportion of amino acids of the molecule.

Indeed, previous force measurement studies at neutral pH demonstrated that the net self-assembly force of collagen comes mainly from hydration forces (34). The same work indicates that the attractive component of this net force is strongly temperature-dependent, in contrast to the repulsive, "hard wall"-like component. To explain this temperature dependence, Leikin et al. hypothesize that this attractive component is due to the reorientation of some side chains, which improves the complementarity between the water layer surfaces of adjacent molecules. Because this attractive component is very weak (~3.5 kcal/mol per molecule), they suggest that those side chains must be located in histidine-rich regions that represent <1% of the whole molecular surface. In contrast, according to their model, the repulsive forces should come from the hydration layer of the non-histidine-rich regions (~99% of collagen surface).
In our case, increasing the pH "turns off" the repulsive electrostatic force, so that the molecules come close together and become subject to this water layer hard wall repulsion. Two hypotheses can be made concerning the local level of this mechanism: 1) there is no interpenetration of water layers, and/or 2) there is an energetically unfavorable interpenetration inducing a reorganization of the water shell. The first hypothesis could fit with side chains that do not display important changes in equilibrium conformation with pH increase (the resonances have a chemical shift similar to those found in the denatured state, i.e. about one-half of all side chains). In the second hypothesis, one would expect that any modification concerning the water shell and the side chain/side chain or side chain/backbone interactions would automatically induce a change in side chain conformation. Our data show that the pH increase induces a structural reorganization for some side chains, indicating that they belong to regions where water shell interpenetration has occurred. Our data also show that, for some of these side chains, the increase in temperature increases the dispersion of the equilibrium away from random coil values. As the intermolecular repulsive force is not affected by temperature, all side chains that show an increase in equilibrium dispersion with both temperature and pH must necessarily be located in regions where energetically favored complementary surfaces are formed. Moreover, considering all side chains that meet this criterion (equilibrium changes with temperature and pH), one can estimate that these correspond to 10 to 15% of the total collagen residue content (supplemental Table S1). Our results provide experimental evidence supporting the hypothesis of Leikin et al. (35) but further show that the proportion of residues concerned is much greater than previously postulated.
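To give a sense of scale for this comparison, the following back-of-the-envelope sketch spreads the ~3.5 kcal/mol attractive component over the residues involved. It assumes roughly 3 × 1040 residues per molecule and an even split of the energy across participating residues. For the 1% scenario it also assumes that a surface fraction can be read as a residue fraction. None of these simplifications come from the measurements themselves, and the function name is illustrative.

```python
RESIDUES_PER_CHAIN = 1040   # approximate chain length
CHAINS_PER_MOLECULE = 3     # triple helix
ATTRACTIVE_ENERGY = 3.5     # kcal/mol per molecule (approximate)

def energy_per_residue(fraction_involved, total_energy=ATTRACTIVE_ENERGY):
    """Average energy (kcal/mol) per participating residue, assuming an even split."""
    if not 0.0 < fraction_involved <= 1.0:
        raise ValueError("fraction_involved must be in (0, 1]")
    n_residues = fraction_involved * RESIDUES_PER_CHAIN * CHAINS_PER_MOLECULE
    return total_energy / n_residues

for label, fraction in [("1% of residues", 0.01),
                        ("10% of residues", 0.10),
                        ("15% of residues", 0.15)]:
    print(f"{label}: {energy_per_residue(fraction):.4f} kcal/mol per residue")
```

Under these assumptions, spreading the same weak attraction over 10 to 15% of residues implies an average contribution per residue ten to fifteen times smaller than confining it to 1%.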

CONCLUSIONS
13C and 1H-13C solid-state MAS NMR experiments were used to identify conformational changes in collagen I in three different structural states: denatured, native repulsive regime, and native attractive regime (fibril state). Chemical shift analysis indicates that fibrillar packing induces an increase in heterogeneity of conformations. More specifically, using Ala Cβ chemical shifts as a probe, we show the existence of five different backbone conformations in collagen, with one of those being specific to the fibrillar state. Analysis of 13C chemical shifts demonstrates that linear side chains show moderate changes in the average conformation. Using statistical structural data present in the literature, we assigned changes in 13C chemical shift to the most spread-out side chain conformations and thus calculated equilibrium distributions between conformations. We also show that fibril formation affects many imino acids, stabilizing prolines in the down pucker conformation and hydroxyprolines in both the up and down conformations. These data suggest that the relatively weak temperature-induced attractive force component responsible for fibrillogenesis comes from a large percentage of amino acids, contrary to what was thought thus far. Finally, our data provide a more precise view of the impact of fibrillogenesis on local collagen conformations and open broad perspectives for the study of collagen I in interaction with other extracellular matrix molecules.