IN SITU D-PERIODIC MOLECULAR STRUCTURE OF TYPE II COLLAGEN


Collagens are essential components of extracellular matrices in multicellular animals. Fibrillar type II collagen is the most prominent component of articular cartilage and other cartilage-like tissues such as notochord. Its in situ macromolecular and packing structures have not been fully characterized, but an understanding of these attributes may help reveal mechanisms of tissue assembly and degradation (as in osteo- and rheumatoid arthritis). In some tissues such as lamprey notochord, the collagen fibrillar organization is naturally crystalline and may be studied by X-ray diffraction. We used diffraction data from native and derivative notochord tissue samples to solve the axial, D-periodic structure of type II collagen via Multiple Isomorphous Replacement. The electron density maps and heavy atom data revealed the conformation of the non-helical telopeptides and the overall D-periodic structure of collagen type II in native tissues, data that were further supported by structure prediction and transmission electron microscopy (TEM). These results help to explain the observed differences in collagen type I and type II fibrillar architecture and indicate the collagen type II crosslink organization, which is crucial for fibrillogenesis. TEM data show the close relationship between lamprey and mammalian collagen fibrils, even though the respective larger scale tissue architecture differs.
Type II collagen fibrils are major components of cartilage, intervertebral discs, and the vitreous humor of the eye, and are vital to the normal development of bones and teeth. These fibrils are formed from highly ordered 67 nm staggered arrays of collagen molecules, producing the characteristic D-periodic structure of fibrillar collagens (S. Fig. 1). Because the length of the type II collagen molecule (~300 nm) is not an integer multiple of 67 nm, the fifth molecular segment does not extend across the whole D-period. Each D-period is therefore divided into a gap region, where there are only four molecular segments, and an overlap region, where there are five. This arrangement is known as the Hodge-Petruska scheme (see packing / microfibril in S. Fig. 1) (1). Each D-period is composed of arrays of five fragments of five different collagen molecules, which between them contain the complete sequence of one collagen molecule. The type II collagen molecule has three identical α1 polypeptide chains of 1060 amino acid residues each, with a large uninterrupted triple-helical region and relatively short, non-helical telopeptides (19 amino acid residues in the N-telopeptide, and 27 amino acid residues in the C-telopeptide) (2) that do not possess the Gly-Xaa-Yaa repeating primary structure found in the triple-helical region. The α-chains are of equal length, but they are displaced from one another by one residue in the triple-helix, to allow for its proper super-coiling (2).
In vitro, the polymerization of collagen molecules into fibrils is an entropy driven self-assembly process (3), whereas in living tissues it also involves cellular and specific extracellular matrix (ECM) interactions (4). Proteoglycans (PGs), such as decorin, fibromodulin, and biglycan, bind type II collagen fibrils to stabilize the larger fibril-bundles (fibers composed of multiple fibrils). The diameter of the latter is regulated through interactions with the PG protein core and anionic glycosaminoglycan (AGAG) chains (5). However, the fibril's internal order and stability are chiefly maintained by intermolecular crosslinking between collagen molecules. Intra-fibrillar links are formed through covalent bonds between lysine and hydroxylysine (Hyl, a modified amino acid specific to collagen, single letter abbreviation U) through a Schiff base reaction (6,7). These bonds are formed between Lys and Hyl residues within both the helical and non-helical telopeptide domains of collagen molecules within each D-period. Thus, although the telopeptides do not have triple-helical structure, they are very important for collagen crosslinking and the assembly of collagen molecules into fibrils (8). Proper crosslink formation ensures the structural integrity and stability of fibrils and the tissue. Abnormal crosslinking, either enhanced or inhibited, leads to connective tissue diseases such as Ehlers-Danlos syndrome, Marfan syndrome, glaucoma and skin fragility in normal aging (9). Furthermore, the proper conformation of the telopeptides has been shown to be important for normal fibril structure (10)(11)(12)(13). Accurately defining the structure of the D-periodic, Hodge-Petruska (1) arrangement of type II collagen, that is, the gap and overlap ratio, telopeptide conformations and crosslink locations, may facilitate our understanding of these conditions.
Structural studies of type II collagen packing have been complicated by the technical limitations its fibrillar nature presents. The insolubility and size of mature collagen molecules make nuclear magnetic resonance (NMR) inviable, and only small fragments of the collagen molecule or synthetic, short collagen-like peptides have been crystallized and studied by (single-crystal) X-ray crystallography so far (14). Furthermore, disruption of the fibril structure in order to study the constituent collagen molecules in isolation compromises the goal of understanding the molecular packing within fibrils and the native collagen structure within the D-period. Electron microscopy offers significant capabilities for studying fibrillar collagens, but does not provide sufficient resolution for characterizing the telopeptide conformation, and sample preparation may disturb subtle, small features such as the telopeptides.
Type II collagen fibrils have an intrinsically crystalline D-periodic structure, even though their lateral arrangement is less well ordered (15,16). This means that X-ray diffraction experiments on whole intact tissue samples can provide information concerning the axial organization of collagen. Such structural data have been obtained previously (15,16), but were less detailed than those obtained recently by our group for collagen type I (17)(18)(19). In this study, we applied techniques developed during earlier investigations of type I collagen structure (17,19,20,21) to better define the D-periodic structure of type II collagen fibrils in situ. We present here the axial D-periodic structure of native type II collagen in lamprey notochord, determined by Multiple Isomorphous Replacement (MIR) to 1.9 nm resolution.

EXPERIMENTAL PROCEDURES
Materials-Adult female lampreys were kindly provided by the Ludington Biological Station of the U.S. Fish and Wildlife Service and J. Ellen Marsden of the University of Vermont.
Fiber diffraction-The notochords of adult lampreys were carefully harvested and dissected from their sheaths, nearby muscle and other tissues, and stored at pH 7.5 in Phosphate Buffered Saline (PBS, 8 mM Na2HPO4, 2 mM KH2PO4, 137 mM NaCl) or Tris Buffered Saline (TBS, 25 mM Tris, 150 mM NaCl) for 1 to 6 h at 4°C before each experiment. The notochord samples were cut along their longitudinal axis into pieces (1 x 2 x 25 mm) and mounted in custom made sample holders that provided for 1-2% stretching and preservation of their hydrated state. Isomorphous staining was performed as previously described (17,20). Briefly, samples were treated with iodine (KI) and platinum (PtCl3) for 1 h, followed by 15 min of washing in a large excess of buffer. I- ions covalently bind to tyrosine and in some cases histidine, whereas Pt3+ ions associate with the sulfur group of methionine. X-ray diffraction studies were performed primarily using the small and wide angle fiber diffraction instruments on the BioCAT Beamline 18ID at the Advanced Photon Source (APS), Argonne National Laboratory, IL, USA. Some preliminary studies used the wide angle instrument on the BioCARS beamline 14BMC at the APS. The sample-to-detector distance was 2000 mm for SAXS (1-7 meridional orders) and 150 mm for WAXS and microfocus WAXS (5-37 meridional orders, medium to high resolution fiber crystallography), and the wavelength was 1.033 Å. Diffraction images were recorded on a custom made 'Brandeis' detector (48 x 48 micron pixel size) (22)(23)(24) for WAXS, and on an AVIEX PCCD 16080 detector (38.4 x 38.4 micron pixel size) (22) for SAXS. The exposure time was limited to 1 second to safeguard against radiation/heat damage to the exposed portion of the sample.
Data analysis-Diffraction images were processed using CCP13 Fibrefix (25) and Fit2D (26) software for background subtraction, and the CCP13 software for measurements of meridional intensities (17). The measured intensities (squares of amplitudes; S. Fig. 2) were used for calculation of difference Patterson maps and a one-dimensional electron density map of the collagen type II axial unit cell (the D-period) by reverse Fourier transformation (in-house software). The scaling of the experimental amplitudes and the solution of the phases were performed as described previously (17,19,20).
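The reverse Fourier transformation step can be illustrated with a minimal sketch. This is not the in-house software used in the study; it assumes the conventional one-dimensional Fourier synthesis, in which each meridional order h contributes a cosine term weighted by its amplitude (the square root of the measured intensity) and shifted by its phase. The function names and the three-order data set are hypothetical and serve only as an illustration.

```python
import math

def amplitudes_from_intensities(intensities):
    """Amplitudes are the square roots of the measured intensities."""
    return [math.sqrt(i) for i in intensities]

def density_profile(amplitudes, phases, n_points=200):
    """One-dimensional Fourier synthesis over one D-period.

    amplitudes[h-1] and phases[h-1] (radians) belong to meridional order h.
    Returns the relative density at n_points fractional positions x in [0, 1).
    The zero-order term is omitted, so the profile averages to zero.
    """
    if len(amplitudes) != len(phases):
        raise ValueError("one phase is needed per amplitude")
    profile = []
    for i in range(n_points):
        x = i / n_points
        rho = sum(
            2.0 * amp * math.cos(2.0 * math.pi * h * x - phi)
            for h, (amp, phi) in enumerate(zip(amplitudes, phases), start=1)
        )
        profile.append(rho)
    return profile

# Hypothetical three-order data set, for illustration only.
amps = amplitudes_from_intensities([1.0, 0.25, 0.0625])  # -> [1.0, 0.5, 0.25]
phis = [0.0, math.pi, 0.0]
rho = density_profile(amps, phis, n_points=8)
```

A difference Fourier map follows the same pattern, with the derivative-minus-native amplitude differences taking the place of the native amplitudes.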
Lamprey collagen type II amino acid sequence-The entire collagen type II sequence is known for human, rat and mouse and partial sequences are known for lamprey (the primary source for this study) [ExPASy sequence data bank codes P02458, P05539, P28481, Q2I8Y0 and Q2I8X9]. Unlike collagen type I, collagen type II is highly homologous across the species (27), and its full sequence may therefore be reasonably estimated by homologous comparison (N-terminal half from rat, mouse and human; C-terminal half from the lamprey sequence, Q2I8X9) (28).
Model electron density map-The model electron density map was calculated (17,20), using the residue level scattering factors of Hulmes et al. (21), the amino acid sequence described previously, the Hodge-Petruska scheme, and assuming a regular unit height spacing of the molecular helix (residue to residue distance). The telopeptides were initially modeled as elongated structures extending straight from the ends of the triple-helical region. After comparison with the difference Patterson and Fourier maps, the telopeptides were modeled as folded structures, similar to that of the type I collagen C-telopeptide (although the N-telopeptide required a double folded conformation to fit the experimental data, see Results).
Determination of the gap/overlap ratio-In this study, we took the approach that the gap/overlap ratio is determined by the extents of the telopeptides (17), along the lines of the height/depth ratio approach of Bradshaw et al. (20). Hence the overlap start and end points were determined by considering: 1) The position of the iodine peaks for the telopeptides, which give the location of the telopeptide ends. The distance from the telopeptide termini to the fold can be estimated from the number of known amino acids spanning the distance.
2) The point of the overlap region above average height versus gap region below average depth. The low resolution of the study means that the influence of the first few orders (particularly the first) produces a tail effect that makes the overlap appear larger than it is. This was partly countered through the calculation of the native electron density map with the amplitude of the first order reduced ten times (S. Fig. 3).
3) The model parameters that fit the difference Fourier heavy atom positions and the difference Patterson peak separation values, and that best fit the native electron density, provided there is compliance with points 1 and 2.
4) Examination of the TEM images of type II fibrils to confirm that the values determined from 1-3 were reasonable. We hold that the X-ray data are the most accurate approach due to the use of native samples, better resolution and due to the multiple lines of approach used to reach it.
Transmission electron microscopy (TEM)-Collagen type II fibrils were studied at the University of Chicago Electron Microscopy Center. Images were recorded at 300 kV using an FEI Tecnai F30 microscope with a Gatan CCD digital micrograph (4k x 4k) as detector. Sample preparation was the same as for the diffraction experiments, followed by fixation, embedding and sectioning into 90 nm sections on a Reichert-Jung Ultracut E microtome. Sections were stained with uranyl acetate and lead citrate to enhance the image contrast. We found that suspensions greatly enhanced the clarity of the human type II collagen fibrils. For these preparations, cartilaginous tissues were homogenized in TBS on ice for 1 min to release collagen fibrils and other matrix components. These samples were then placed on grids and stained with uranyl acetate.
The D-period from the X-ray structure was confirmed against the TEM data by sampling the gap/overlap ratio in the clearest TEM image (such as that shown in S. Fig. 4F) over ten unit cells. The standard deviation was 0.0049 and the percentage error (as measured by the mean discrepancy between the sum of gap and overlap measurements versus the single D-band measurement) was <0.05%, arriving at a value of 0.419 : 0.579 D for the overlap:gap. The scale bars showing the gap-overlap and D-band extents were then superimposed on the other TEM images (see Results).
Telopeptide structure prediction and minimization-The extent of folding or compression required of the telopeptides to fit the native electron density map was determined from the difference Patterson and Fourier maps. In addition, structure prediction calculations were performed for the conformation of the telopeptides. Both N- and C-termini were examined using the Chou-Fasman (29) and SOPMA (30) structure prediction methods, and the results were compared with the diffraction and electron microscopy data (D-period gap/overlap ratio, position of heavy atom peaks). The results were found to be in good agreement; therefore atomic coordinates of the telopeptides for one alpha chain were generated de novo from the amino acid sequence and energy minimized using NAMD (31) as a further check of the hypothesis that the telopeptides are folded.
Molecular model and surface rendering-The molecular surfaces were calculated using 'spock' (32)(33)(34) with the default options, except the surface polygon parameter which was set to 120 for improved surface definition. The display option was set to 'mesh' to allow the underlying bonds of the model to be seen. The calculated mesh is not an electron density map, simply a rendering of the molecular surface.

RESULTS
X-ray diffraction data. A medium-wide angle diffraction pattern from native (unstained) type II collagen is shown in Fig. 1, with the central section containing the meridional series indicated. Diffraction images from native lamprey notochord and two heavy-atom derivatives were recorded, showing meridional reflections up to the 37th order or ~18 Å (integrated and scaled intensities of orders 1-35 for native samples and derivatives are presented in S. Fig. 2).
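The relationship between meridional order and resolution quoted above can be checked with a short calculation. The sketch below assumes only that the h-th meridional order of a 67 nm axial repeat corresponds to a spacing of D/h; the function name is hypothetical.

```python
D_PERIOD_NM = 67.0  # hydrated D-period quoted in the text

def order_spacing_angstrom(order, d_period_nm=D_PERIOD_NM):
    """Axial spacing (in Å) sampled by a given meridional order: d = D / h."""
    if order < 1:
        raise ValueError("meridional order must be a positive integer")
    return d_period_nm * 10.0 / order

# The highest order recorded (37) and the highest order used in the maps (35).
spacing_37 = order_spacing_angstrom(37)  # about 18.1 Å
spacing_35 = order_spacing_angstrom(35)  # about 19.1 Å, i.e. 1.9 nm
```

On this basis the 37th order corresponds to roughly 18 Å, and the 35 orders used for the electron density maps correspond to the 1.9 nm resolution stated for the structure.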
Location of heavy-atom-binding sites and difference Fourier and Patterson maps. The amplitudes of the native samples were subtracted from the amplitudes of the derivative samples and the resulting differences for each order, together with the calculated phases, were used in a reverse Fourier transformation. This operation produces an electron density map whose peaks correspond to the positions of heavy atoms within the native-like unit cell (the D-period) of the collagen type II fibril. The iodine derivative difference Fourier map is shown in Fig. 2. Although the majority of these heavy atom binding positions appear to correspond to the expected collagen amino acid residue labeling positions, some heavy atoms appear to have bound isomorphously at an unexpected location (where the collagen sequence has no amino acid residues liable to labeling). Given the corresponding native electron density, this may indicate the presence of the small leucine rich repeat protein (sLRRP), biglycan, or a related sLRRP at this location. It is known that sLRRPs are attached to the surface of type II collagen fibrils at around 0.7-0.8D (see Discussion).
The presence of tyrosine residues in the telopeptides and telopeptide region allows the detection of the ends of the telopeptides (difference Fourier map, Fig. 2B, and Patterson maps, Fig. 2C and 3D). From these data it was clear that the telopeptides are highly contracted or folded (Figs. 3 and 4), a possibility supported by the native electron density map (see below and Fig. 2A).
Collagen type II one-dimensional electron density profile. The native electron density map was calculated using reverse Fourier transformation of the experimentally derived structure factors (Fig. 2). This map represents the sum of the five collagen molecular segments in lateral projection (perpendicular to the fiber axis) in the Hodge-Petruska scheme (S. Fig. 1), and is a direct function of the axial D-periodic structure of the type II collagen fibril. This map allows deductions to be made concerning the conformation of the collagen molecules, the collagen fibril parameters, and the telopeptide conformations (in conjunction with the heavy atom labeling data).
From the one-dimensional electron density map, the overlap is measured to be 0.42D, which is shorter than that of the collagen type I fibril, and the gap is approximately 0.58D (Fig. 2). This overlap to gap ratio of 0.42:0.58 is confirmed by the TEM data (Fig. 5 and S. Fig. 4). Compared with type I collagen, these parameters might appear confusing, because the type II molecule is four amino acid residues longer than the type I molecule, which would suggest that the overlap to gap ratio should be closer to 1:1 than the type I ratio of 0.46:0.54 (17,35). After considering the difference Patterson and Fourier data, it is reasonable to assume that the shorter collagen type II overlap region is a consequence of the telopeptides adopting a conformation that makes them shorter in linear projection (see S. Fig. 3) (17). The telopeptides would have to be either highly compressed ('contracted') or tightly turned to fit the measured gap-overlap ratio and difference Fourier and Patterson data. The compressed model does not seem plausible due to peptide bond constraints; compressed telopeptides would require a height translation of ~0.8 Å to fit within the 0.42D overlap region.
One-dimensional model of collagen type II D-periodic structure and structure prediction. To aid the analysis and interpretation of the one-dimensional electron density map, an electron density model of one D-period was calculated from the collagen type II amino acid sequence and the individual whole amino acid residue structure factors (Fig. 2) (21). In the model, the collagen molecules are staggered by 234 residues (~67 nm) and the individual peptide α-chains are staggered by one amino acid residue within each triple-helix segment (17,19). The initial model assumed a conformation of the telopeptides where they were straight and relaxed (Figs. 3 and 4, and S. Fig. 3). This model provided parameters for the D-period that disagreed with the one-dimensional map obtained from the experimental data (S. Fig. 3 and Fig. 2). The model overlap region (0.53D) was longer than the experimentally determined value (0.42D) and the gap region was shorter (theoretical 0.47D, experimental 0.58D).
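The overlap and gap of the initial, straight-telopeptide model can be reproduced with simple arithmetic. The sketch below is an illustration rather than the modeling procedure used in the study: it assumes that every one of the 1060 residues, including those in the telopeptides, has the same axial rise, and it ignores the one-residue stagger between α-chains. The function name is hypothetical.

```python
RESIDUES_PER_MOLECULE = 1060  # α1(II) chain length quoted in the introduction
RESIDUES_PER_D = 234          # molecular stagger, about 67 nm

def overlap_and_gap(n_residues, residues_per_d=RESIDUES_PER_D):
    """Return (overlap, gap) as fractions of D for a fully extended molecule.

    A molecule spanning k + f D-periods (k integer, 0 <= f < 1) gives an
    overlap of f and a gap of 1 - f in the Hodge-Petruska scheme.
    """
    length_in_d = n_residues / residues_per_d
    overlap = length_in_d - int(length_in_d)
    return overlap, 1.0 - overlap

model_overlap, model_gap = overlap_and_gap(RESIDUES_PER_MOLECULE)
```

Under these assumptions the molecule spans about 4.53 D, which gives the 0.53D overlap and 0.47D gap of the initial model. Any conformation that shortens the telopeptides in axial projection reduces the fractional part and therefore the overlap.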
Assuming that the telopeptides may form turns, unstructured coils and relaxed strands (17,21,36), their amino acid sequences were examined and possible conformations were predicted using the Chou-Fasman and SOPMA methods (S. Tables 1, 2). Both these methods and the difference Fourier and Patterson data suggested tight turns in the N- and C-telopeptides, which would change the gap/overlap ratio in the model to better fit the experimentally determined values.
Telopeptide conformations. According to the structure predictions (S. Table 1, Fig. 3), the N-telopeptide can form two turns, one at position 11G-12G and another at position 4G-5G. The telopeptide section between the triple-helix and the first turn has the greatest propensity to form β-strands. The peptide chains of the triple-helix itself are considered to be in almost a relaxed β-strand conformation, which is close to the polyproline II helical conformation. Given that the telopeptide is directly connected to the triple-helical regions, the N-telopeptide may have some 'structural memory' (17,37) of the polyproline II helix, despite not having a triple-helical conformation until it makes a 180° turn. A second turn is predicted after a straight section. The fragment between the two turns and the rest of the telopeptide after the second turn has a greater propensity to form a 'random coil' (or turn), forming a high electron density area relative to the rest of the triple-helix, between amino acid positions 11 and 15 (Fig. 3). This N-telopeptide conformation makes the overlap region shorter and also brings the Hyl9 residue closer for crosslinking to the Lys949 in monomer five (a close molecular packing neighbor in type I collagen). If the N-telopeptide were simply straight and relaxed, Hyl9 would be too distant from Lys949, and the alternative crosslinking partners Lys238 (monomer 2), Lys703 (monomer 4) and Lys937 (monomer 5) are even further distant. The average spacing between collagen type II monomers in the fibril is about 1.3 nm, based on equatorial diffraction data (not shown) and collagen type I parameters (19). There are eight C-C and two C-N bonds in the lysyl-hydroxylysine crosslink between the two peptide backbones. The C-C bond is 1.54 Å and the C-N bond is 1.47 Å (38). Hence, the length of the whole lysyl-hydroxylysine link from peptide backbone to peptide backbone may be around 1.5 nm at a maximum. Therefore both the lysine and the hydroxylysine have to be in approximately the same lateral plane, or no more than 0.8 nm apart in the axial direction (within three amino acid residue positions) in order to form the crosslink.
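The crosslink length and axial tolerance quoted above can be rationalized with a short calculation. The sketch below is one possible reading and not necessarily the authors' derivation: it assumes a fully extended link (bond lengths simply summed), a right-triangle geometry in which the 1.3 nm lateral spacing and the axial offset are the two perpendicular sides, and an axial rise per residue of 67 nm / 234. The function names are hypothetical.

```python
import math

C_C_BOND_A = 1.54                    # Å, from the text
C_N_BOND_A = 1.47                    # Å, from the text
LATERAL_SPACING_NM = 1.3             # average monomer spacing, from the text
RISE_PER_RESIDUE_NM = 67.0 / 234     # assumed uniform axial rise per residue

def max_crosslink_length_nm(n_cc=8, n_cn=2):
    """Upper bound on link length: all bonds laid end to end."""
    return (n_cc * C_C_BOND_A + n_cn * C_N_BOND_A) / 10.0

def max_axial_offset_nm(link_nm, lateral_nm=LATERAL_SPACING_NM):
    """Largest axial separation a link of this length can bridge (Pythagoras)."""
    if link_nm < lateral_nm:
        raise ValueError("link is too short to span the lateral spacing")
    return math.sqrt(link_nm ** 2 - lateral_nm ** 2)

link = max_crosslink_length_nm()                    # about 1.53 nm
axial = max_axial_offset_nm(link)                   # about 0.80 nm
residue_offset = axial / RISE_PER_RESIDUE_NM        # about 2.8 residues
```

Under these assumptions the maximum link length is about 1.5 nm and the maximum axial offset is about 0.8 nm, or just under three residue positions, in line with the figures given in the text.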
The C-telopeptide shows a propensity to form one sharp turn at the Gly13 position of the telopeptide (Gly1046 of the full molecular sequence) (S. Table 2 and Fig. 4). The fragment of the telopeptide between the turn and triple-helical region has a predicted structure close to a relaxed helix due to methionine clusters (S. Table 2), disturbed by glycine residues. The rest of the C-terminus after the turn is more likely to be a 'random coil' (S. Table 2, Fig. 4). This folded conformation causes additional shortening of the overlap and extension of the gap region in the electron density model. It also moves Lys1050 by approximately eight amino acid positions, moving it much closer to Hyl106 for covalent crosslinking (Fig. 4) (see previous discussion of Lys-Hyl crosslink length).
The structures of the collagen type II telopeptides, as determined from the experimental data (difference Patterson and Fourier maps) and supported by the predicted conformations, were incorporated into the refined one-dimensional model of the D-periodic molecular packing (Fig. 2 and S. Fig. 3a). This model electron density map provided a much better approximation to the experimental data. Structure factors were calculated from the refined model and used to generate a model electron density map at the same resolution as the experimental map (35 meridional orders, Fig. 2). This allowed the simple R-factor (19) to be calculated as an estimate of the goodness of fit between the two maps. The calculated R-factor was 0.19.
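For readers unfamiliar with the R-factor, the sketch below shows the conventional crystallographic definition, the sum of absolute differences between observed and calculated amplitudes divided by the sum of observed amplitudes. It is assumed here that the 'simple R-factor' of reference 19 takes this form; the amplitude values are hypothetical and are not data from this study.

```python
def r_factor(observed, calculated):
    """Conventional R-factor: sum(| |Fo| - |Fc| |) / sum(|Fo|)."""
    if len(observed) != len(calculated):
        raise ValueError("observed and calculated lists must have equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(observed, calculated))
    denominator = sum(abs(o) for o in observed)
    if denominator == 0:
        raise ValueError("observed amplitudes sum to zero")
    return numerator / denominator

# Hypothetical scaled amplitudes for four orders, for illustration only.
f_obs = [10.0, 5.0, 2.0, 3.0]
f_calc = [9.0, 6.0, 2.0, 2.5]
r = r_factor(f_obs, f_calc)  # (1 + 1 + 0 + 0.5) / 20 = 0.125
```

A value of zero indicates identical amplitude sets, so lower values indicate closer agreement between the model and the experimental data.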
TEM data: mammalian and lamprey fibrils are homologues but tissue architecture differs. Micrographs of collagen type I (rat-tail tendon, human quadriceps tendon) and type II fibers (lamprey notochord, human articular cartilage, bovine articular cartilage) were examined (Fig. 5 and S. Fig. 4). The collagen fibers in these images show the typical pattern for fixed collagenous tissues: black and white bands with approximately 64 nm periodicity (the D-period is shortened during fixation and embedding) (15,39). These images also confirm previous observations that collagen type II fibrils from lamprey notochord do not show any detectable differences from those of mammalian tissues (15,16). The lamprey notochord fibrils have similar diameters of about 35 nm and the typical positive staining pattern observed for mammalian fibrils (Fig. 5 and S. Fig. 4).
Despite the similarities in fibril morphology, the architecture of collagen fibril arrangement differs between mammalian and lamprey tissues. There are also some differences in cellular and PG content in these tissues. Lamprey notochord has a very specific cell distribution and most cells were mechanically removed during sample preparation, whereas cartilage cells are embedded in the collagen meshwork and can be seen throughout the whole tissue (40,41). Lamprey notochord principally contains the PGs biglycan types I and II, which are similar in sequence to the bovine and human biglycans (comparison of lamprey sequence fragments Q9DE00 and Q9DDZ9 with human P21810 and bovine P21809 (42)), and hence are most likely to be structurally related to decorin and fibromodulin. In contrast, mammalian cartilage contains several different types of PGs, glycoproteins and other types of collagen (27,42). The specific composition of the ECM plays a crucial role in the formation of collagen fibrils and the organization of the fibrillar meshwork. This influence appears to give rise to the complexity of the articular cartilage matrix, as observed by TEM (S. Fig. 4d), in comparison with the more 'primitive' architecture of notochord tissue (S. Fig. 4b).

DISCUSSION
TEM images of collagen type I and type II fibers from the same tissues used in the X-ray diffraction experiments provided complementary information that correlated well with the results of the X-ray diffraction experiments and with the derived one-dimensional model of the collagen type II D-period structure (S. Fig. 3 and Figs. 2, 5).
The experimental one-dimensional electron density map (Fig. 2) indicated that a previous observation reported for type I collagen (19,20,43) also applies to type II collagen: there are 234 amino acid residues in the molecular segments within the Hodge-Petruska D-periodic fibril packing structure, which is 67 nm long in hydrated samples, and 64-65 nm long in dehydrated, fixed samples (Fig. 5). However, the overlap is 0.42D +/- 0.03 (28.14 nm or 98 amino acid residues long) and the gap is 0.58D (38.86 nm or 136 amino acid residues) in type II collagen, whilst in type I collagen the overlap is 0.46D and the gap is 0.54D. The overlap/gap ratio for type II collagen has previously been suggested to be closer to 0.4D/0.6D than that of type I (39,40,44,45) and is also confirmed by the TEM data (Fig. 5 and S. Fig. 4). This indicates that although there are some obvious similarities, collagen types I and II differ significantly at the level of molecular packing. These differences may arise due to the specific telopeptide conformation of each collagen type.
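The conversions between fractions of D, nanometres and residue counts quoted above follow directly from the 67 nm, 234-residue D-period. The short sketch below reproduces them; the function name is hypothetical, and a uniform axial rise per residue is assumed.

```python
D_PERIOD_NM = 67.0    # hydrated D-period
RESIDUES_PER_D = 234  # residues per D-period

def fraction_to_nm_and_residues(fraction_of_d):
    """Convert a fraction of D into (length in nm, nearest residue count)."""
    length_nm = fraction_of_d * D_PERIOD_NM
    residues = round(fraction_of_d * RESIDUES_PER_D)
    return length_nm, residues

overlap_nm, overlap_res = fraction_to_nm_and_residues(0.42)  # 28.14 nm, 98
gap_nm, gap_res = fraction_to_nm_and_residues(0.58)          # 38.86 nm, 136
```

The two residue counts sum to 234 and the two lengths sum to 67 nm, as expected for a complete D-period.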
Experimental and model electron density maps of the type II collagen D-period. Fig. 2 shows both the one-dimensional map of the collagen type II D-periodic structure (blue) and the model constructed from the amino acid sequence and residue scattering factors (green). These maps have similar features, although some peaks are not common to both. The initial theoretical model and experimental peak density positions differ significantly, indicating that the real molecular conformation is not simply linear, particularly in the configuration of the telopeptides (Figs. 2-4 and S. Fig. 3). A major peak in the gap region of the electron density map for native type II collagen that is not present in the model (see below) may correspond to dermatan sulfate PG (DSPG) binding sites, which in the case of lamprey notochord is biglycan or a biglycan-like protein (see above) (40). This is in agreement with the principal attachment sites for decorin in the e/d band of type I collagen, as determined by TEM (5,35,47). Other peaks on the map may correspond to electron dense side-chain residues of the collagen molecules or other ECM molecules ligated to the surface of the collagen type II fibril, and/or differences in the amino acid residue-to-residue spacing. The latter may in part arise from differences in helical symmetry (e.g. 10/3 vs. 7/2 symmetry). Other significant differences between the experimental and model maps may be explained by the presence of other PGs, fibronectin and aggrecan molecules that may be associated with collagen fibers in lamprey notochord. Some of these peaks may be present in the type I collagen electron density map (17), but if present, are much more subtle. Their greater prominence in the type II collagen one-dimensional map is due to the fact that type II fibrils are much thinner than type I fibrils (~35 nm and 100-200 nm, respectively, although larger fibers, i.e. bundles of fibrils, may also be present), thereby increasing the effective occupancy and electron density contrast of PG molecules attached to the outside of the fibril.
Informative differences between model and experimental electron density maps. The model of the collagen type II D-periodic structure has several peaks in both the gap and overlap regions that correspond to clusters of amino acids with electron dense side chains, and in general these peaks are also present in the experimental electron density map. However, there are two major differences between the experimental electron density map and the model: a trough in the map at 0.275D and a peak at 0.8D that are not readily explained by the collagen amino acid sequence or electron scattering density alone. As already noted, the data for the experimental electron density map were obtained from native tissues, which contain not only collagen, but also other ECM molecules. Biglycan may bind to several sites on the collagen monomer, but the electron density map indicates that one of these attachment sites on fibrillar type II collagen, at 0.8D (905G-915P, monomer four), is occupied in a highly ordered fashion. This is in agreement with the principal attachment sites for the biglycan homologue, decorin, in the e and d staining bands of type I collagen (48,49). The presence of a model peak at 0.275D, in contrast to the trough in the experimental electron density map, may suggest a significant and common molecular inflection amongst the five molecules in the overlap region, before and after this position, such that the molecular sections passing through 0.275D are straight but tilted on either side of this position. This would account for the greater electron density on either side of this point and the lower relative electron density at 0.275D itself. This type of inflection is also present in the type I collagen microfibril (19). Neither of these two differences between the experimental map and the model affects the measurement of the gap-overlap ratio, however.
Telopeptide conformations. The conformations of the telopeptides are important for collagen fibril assembly due to the covalent Lys-Hyl crosslinks formed between the collagen molecules within these non-helical domains (50,51). Lysine or hydroxylysine is deaminated by lysyl oxidase and a covalent link is formed spontaneously; this may occur at two places in the collagen type II molecule (7) (Figs. 3 and 4). The distribution of Lys and Hyl residues within the collagen molecule (Figs. 3 and 4) determines the pattern of crosslinks and therefore the molecular stagger within the fibril. The N- and C-telopeptides are involved in crosslinking and their conformations affect the axial position of the crosslinking residues, Hyl9 and Lys1050, which bind to Lys949 and Hyl106 in the triple-helical region respectively, thereby influencing crosslink formation and therefore fibrillogenesis.
Previous investigations of the structure of fibrillar collagen indicated that the telopeptides form contracted and folded structures (9,17,21,39). Electron microscopy studies of human collagen type II fibers suggested that the folding of the telopeptides is similar to that proposed here (39), but the 20 nm resolution of those studies was not sufficient to provide conclusive evidence (the stretched N- and C-telopeptides are 4.5 nm and 5.4 nm in length, respectively). Similarly, studies with synthetic or ex vivo collagen N- and C-telopeptides indicated an axial compression of the fragments, with possible folding (52-54). However, the specific telopeptide conformations were not determined. Subsequently, in X-ray diffraction investigations of native rat-tail tendon collagen type I in situ, we showed the one-dimensional packing structure of the collagen molecules in the D-period. That work revealed a sharp turn in the C-telopeptides around residues Pro13 and Gln14, as well as the specific locations of the lysine-hydroxylysine crosslinks (17). These results were later confirmed when the full three-dimensional, higher resolution structure was determined (19). In this study the conformations of the collagen type II telopeptides were inferred from the native and difference Fourier density maps, predicted by the Chou-Fasman method and confirmed by SOPMA (S. Tables 1 and 2). Both telopeptides have a propensity to form a sharp turn in the middle of their sequences (Figs. 3 and 4), which makes the whole collagen molecule 2% (5.7 nm) shorter and changes the gap/overlap ratio of the model. These predictions agreed well with the observation that the overlap region must be shorter than linear telopeptide structures would allow, and also with the peak distributions in the difference Patterson and Fourier maps, since the heavy atom binding locations included positions within the telopeptides themselves.
Significantly, the N-telopeptide structure of type II collagen appears to differ from that of type I collagen, which does not have any sharp turns (17,19). In contrast, the proposed collagen type II C-telopeptide structure is much closer to that observed for the collagen type I C-telopeptide and to the prediction of Ortolani (39). In the folded type II C-telopeptide, the tyrosine (Tyr1058) and methionine (Met1055) residues at the end of the chain are moved deeper into the overlap region, closer to the N-terminus. This conformation is also supported by the difference Patterson and Fourier maps.
Implications of telopeptide structure for crosslinking. The derived conformation of the collagen type II telopeptides suggests a specific, well ordered crosslinking pattern in the fibril that is different from that of collagen type I. It may also be the molecular basis of the differences in fibril diameter and fibril bundle organization between two of the most prominent fibrillar collagens, types I and II. The type II collagen molecule has more candidate residues for crosslinking than the type I molecule. At the C-terminus, there are three potential crosslink-forming Lys residues, due to the homotypic chain composition of type II molecules. In contrast, crosslinking at the C-terminus of the type I molecule appears to occur primarily through its two α1 chains. Similarly, at the N-terminus of collagen type II, Hyl9 may form three covalent bonds with Lys residues on monomer five, whereas the potential for N-terminus crosslinking in collagen type I is diminished in comparison (19). Therefore, the greater potential for crosslinking in collagen type II may lead to greater numbers of supporting covalent bonds within and possibly even between the fibrils (within fibril bundles) than in collagen type I. Thus, the type II fibril is likely to be more stable than the collagen type I fibril due, in part, to a higher crosslink content. The greater potential for interfibrillar crosslinking in type II collagen may also result in a more complex and possibly more stable network-like tissue organization. At the same time, because the type II N-telopeptide is significantly bulkier than that of type I collagen, it may also sterically inhibit the formation of type II fibrils as large as those observed for type I. This in turn would lead to more loosely organized fibril bundles (or thick fibrils) composed of these small and relatively thin fibrils (55).
Note: The scaled amplitudes (S. Fig. 2) were also used to calculate difference Patterson maps for the iodine derivative. The peak position in a difference Patterson plot corresponds to the distance between two points within the unit cell (plotted from 0 to 0.5 unit cell length in fractional coordinates). The peak height is a function of the relative electron density at the termini of each distance. In the present case, a peak represents the distance between heavy atoms that have labeled the unit cell contents isomorphously.
Fig. 3. Modeling of the N-telopeptide of collagen type II: in its extended conformation (A), possible cross-linking pattern of hexagonally packed type II collagen monomers with the N-telopeptide-carrying monomer 1 marked in yellow (B), folded telopeptide conformation (C) supported by hydrogen bonds; Lys residue is shown in blue. The Hyl shifts towards the triple-helix (to the right), making the formation of a covalent bond with Hyl947 of monomer 5 more favorable (D; two α-peptides are shown extended, blue, and one is shown folded, green); the crosslink is indicated by a light-red arrow. Black bars alongside blue bars in D show the large axial distance that crosslinks would need to reach if the telopeptide is extended (blue, see also A). The red bar shows the short axial distance spanned by the folded telopeptide (green, see also C). Monomers are approximately 1.3 nm distant from each other in lateral space (B). Hydroxylysine is defined as single letter code "U". The mesh displayed around the model in parts A and B is a molecular surface rendering to show the model outline (see methods), not electron density.
Fig. 4. Modeling of the C-telopeptide of collagen type II: in extended (A) and folded (B) conformations, supported by hydrogen bonds; Hyl residue is shown in blue. The possible cross-linking pattern of hexagonally packed type II collagen monomers is shown with the C-telopeptide-carrying monomer 5 marked in yellow (C). The Hyl shifts towards the triple-helix (to the left), making its covalent bonding with Lys106 of monomer 1 more favorable (D; two α-peptides are shown extended, blue, and one is shown folded, green); the crosslink is indicated by a light-red arrow. Black bars alongside blue bars in D show the large axial distance that crosslinks would need to reach if the telopeptide is extended (blue, see also A). The red bar shows the short axial distance spanned by the folded telopeptide (green, see also E). Monomers are approximately 1.3 nm distant from each other in lateral space (B). Hydroxylysine is defined as single letter code "U". The mesh displayed around the model in parts A and C is a molecular surface rendering to show the model outline (see methods), not electron density.
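The relationship between heavy atom separation and peak position in a one-dimensional Patterson plot, as described in the note on the difference Patterson maps, can be sketched numerically. The example below is idealized: it assumes the difference amplitudes equal the heavy-atom contribution alone, which real isomorphous data only approximate. The site positions, weights, number of orders (`h_max`) and all function names are hypothetical and are not taken from the data of this study.

```python
import math


def heavy_atom_intensities(positions, weights, h_max):
    """|F_h|^2 for orders h = 1..h_max from point scatterers at fractional positions."""
    intensities = []
    for h in range(1, h_max + 1):
        re = sum(w * math.cos(2 * math.pi * h * x) for x, w in zip(positions, weights))
        im = sum(w * math.sin(2 * math.pi * h * x) for x, w in zip(positions, weights))
        intensities.append(re * re + im * im)
    return intensities


def patterson_1d(intensities, u):
    """P(u) = 2 * sum_h |F_h|^2 cos(2*pi*h*u); the h = 0 term is omitted."""
    return 2.0 * sum(
        i_h * math.cos(2 * math.pi * h * u)
        for h, i_h in enumerate(intensities, start=1)
    )


def strongest_nonorigin_peak(intensities, n_points=501, exclude_below=0.05):
    """Scan u from 0 to 0.5 (fractional) and return the u of the highest value,
    skipping the region near u = 0 that is dominated by the origin peak."""
    grid = [0.5 * k / (n_points - 1) for k in range(n_points)]
    candidates = [u for u in grid if u >= exclude_below]
    return max(candidates, key=lambda u: patterson_1d(intensities, u))


# Hypothetical example: two equal heavy-atom sites 0.28 of a D-period apart.
sites = [0.10, 0.38]
intensities = heavy_atom_intensities(sites, [1.0, 1.0], h_max=20)
peak_u = strongest_nonorigin_peak(intensities)
print(f"site separation = {sites[1] - sites[0]:.2f}, strongest peak near u = {peak_u:.3f}")
```

In this idealized case the strongest non-origin peak falls close to the separation of the two sites, and its height scales with the product of the two site weights. With experimental difference amplitudes, series truncation and lack of perfect isomorphism would broaden and perturb the peaks.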