Analyses of lysine aldehyde cross-linking in collagen reveal that the mature cross-link histidinohydroxylysinonorleucine is an artifact

Lysyl oxidase-generated intermolecular cross-links are essential for the tensile strength of collagen fibrils. Two cross-linking pathways can be defined, one based on telopeptide lysine aldehydes and another on telopeptide hydroxylysine aldehydes. Since the 1970s it has been accepted that the mature cross-linking structures on the lysine aldehyde pathway, which dominates in skin and cornea, incorporate histidine residues. Here, using a range of MS-based methods, we re-examined this conclusion and found that telopeptide aldol dimerization is the primary mechanism for stable cross-link formation. The C-telopeptide aldol dimers formed labile addition products with glucosylgalactosyl hydroxylysine at α1(I)K87 in adjacent collagen molecules that resisted borohydride reduction and after acid hydrolysis produced histidinohydroxylysinonorleucine (HHL), but only from species with a histidine in their α1(I) C-telopeptide sequence. Peptide MS analyses and the lack of HHL formation in rat and mouse skin, species that lack an α1(I) C-telopeptide histidine, revealed that HHL is a laboratory artifact rather than a natural cross-linking structure. Our experimental results also establish that histidinohydroxymerodesmosine is produced by borohydride reduction of N-telopeptide allysine aldol dimers in aldimine intermolecular linkage to nonglycosylated α1(I) K930. Borohydride reduction of the aldimine promotes an accompanying base-catalyzed Michael addition of α1(I) H932 imidazole to the α,β-unsaturated aldol. These aldehydes are intramolecular at the N terminus but at the C terminus they can be both intramolecular and intermolecular according to present and earlier findings.

Collagen is the most abundant vertebrate protein. The highly evolved collagen family includes over 40 different gene products and many additional splice variants with diverse tissue-dependent roles (1,2). Fibril-forming collagens make up the bulk of an organism's collagen mass and are responsible for the underlying strength of most tissues throughout metazoan evolution. The strength of individual collagen fibrils rests heavily on covalent cross-links being formed between individual molecules by the action of lysyl oxidase during fibril assembly and growth (3,4). The content, placement, and chemistry of the resulting cross-linking amino acids are highly regulated and variable, dependent on the tissue type and in particular on the quality of the post-translational modifications of the newly synthesized collagen molecules (5,6). Covalent cross-links can vary among tissues even across a single genetic type of collagen such as collagen type I, the most abundant molecular type in vertebrates (7,8).
In essence, two pathways of lysyl oxidase-mediated cross-linking operate in fibril-forming collagens, one based on telopeptide lysine aldehydes and the other on telopeptide hydroxylysine aldehydes (3). The latter features in bone, cartilages, and many other skeletal and tough connective tissues, whereas the former is characteristic of skin, cornea, and certain tendons, notably rodent tail tendons. The aldehyde side chains created by lysyl oxidase initially interact with specific helical domain hydroxylysines on neighboring collagen molecules to form divalent cross-links. These undergo further reactions to produce mature, chemically more stable cross-linking structures. On a purely hydroxylysine-aldehyde pathway these are the trivalent pyridinolines, hydroxylysyl pyridinoline (HP) and lysyl pyridinoline. In tissues featuring purely a lysine-aldehyde pathway, the only known stable natural maturation product is histidinohydroxylysinonorleucine (HHL), which is prominent on analysis of mature human and bovine skin (9-12), but not rat or mouse skin or tail tendon type I collagen (11-14). Originally this cross-linking structure had been misidentified as hydroxyaldol-histidine (15), which later studies corrected (11-14).
Bone type I collagen in contrast has a unique cross-linking pattern that depends on the reactions between a mix of both telopeptide lysine and telopeptide hydroxylysine aldehydes and their helical domain partners. The mature structures in bone include a mix of trivalent pyridinolines and pyrroles and divalent ketoimine cross-links (16,17). This chemistry is largely controlled during collagen synthesis in the endoplasmic reticulum of osteoblasts by chaperone protein complexes that regulate the activity of lysyl hydroxylase isozymes against telopeptide and triple-helical domain substrate sites (5,8,18).
This work was supported in whole or in part by National Institutes of Health NIAMS Grants AR037318 and AR036794 (to D. E.) and NICHD Grant HD070394 (to D. E.). The authors declare that they have no conflicts of interest with the contents of this article. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. 1 To whom correspondence should be addressed. Tel.: 206-543-4700; Fax: 206-685-4700; E-mail: deyre@uw.edu.
The well-recognized initial reaction products on the lysine aldehyde pathway after lysyl oxidase has converted telopeptide lysines to aldehyde side chains are intramolecular aldol dimers formed between α1 and α2 and α1 and α1 N-telopeptides within the individual type I collagen molecules (19). These aldols and remaining telopeptide lysine aldehyde monomers form aldimine cross-links intermolecularly with adjacent helical domain hydroxylysine side chains, which on borohydride reduction of tissue produce tetravalent histidinohydroxymerodesmosine (HHMD) and the reduced divalent aldimine cross-link hydroxylysinonorleucine (HLNL) (20,21). The reactions progress spontaneously in the fibril in vivo, catalyzed by their local protein environment within the highly ordered collagen polymer. The original methods by which the lysyl oxidase-catalyzed cross-linking products were identified required borohydride reduction to stabilize labile aldimines as secondary amines. After acid hydrolysis the novel cross-linking amino acids were resolved by column chromatography and their structures were determined by MS, NMR, and other analytical methods (22-29). The essential elements of lysyl oxidase-mediated cross-linking in collagen and elastin were thus established in the late 1960s through the 1970s (30). There remained some controversy as to the source and origin of HHMD and the structurally related aldol histidine (31) on borohydride reduction, whether the histidine was added as an artifact of base-catalyzed Michael addition (21-28) or not (29). The controversy was never clearly resolved by methods available at the time.
Peptide MS is a powerful tool for revealing site-specific collagen post-translational modifications including collagen cross-link analysis (32). It helped us discover effects on cross-linking in bone collagen in various human genetic variants and mouse models of osteogenesis imperfecta and in skin and tendon collagens of mice with null genes that affect collagen cross-linking (8,17,32). From these studies we discovered that the allysine aldol dimer, the first cross-link to be identified in collagen (19), which is responsible for the prominent β-chain dimers seen on SDS-PAGE of type I collagen extracts from skin and other soft tissues, was also formed between α1(I) C-telopeptide lysine aldehydes, both intramolecularly and intermolecularly (34). None of our cross-linking findings on skin, however, were consistent with the proposed origin of the stable maturation product, HHL, predicted to be an addition product between the α1(I) C-telopeptide to K87 helix aldimine cross-link, dehydro-HLNL, and a third chain histidine residue at α2(I) H92 (10). To understand better the nature of the mature cross-links on the lysine aldehyde pathway we applied MS and different proteolytic cleavage and detection methods to examine the structure of cross-linked peptides containing C-telopeptide and N-telopeptide cross-linking sites from various skin, cornea, and tendon collagens that are known to be particularly rich in or to lack HHL cross-links (35).
The findings show that allysine aldol dimers form the major stable cross-linking bonds at both ends of the type I collagen molecule in tissues that use the lysine aldehyde pathway. Together with literature data they provide evidence for a mechanism by which HHL is produced on acid hydrolysis artifactually from a C-telopeptide allysine aldol in labile linkage to glcgal-Hyl at α1(I) K87, but only in those species that have a conserved histidine in their α1(I) C-telopeptide sequence.

Comparison of cross-linked α-chain SDS-PAGE patterns across tissue collagen extracts
Some quantitative differences in extractability of type I collagen are evident between tissues that depend on lysine aldehyde cross-linking (Fig. 1), but the patterns of α, β, γ, and higher oligomers are very similar. For comparison, an extract of demineralized bone was run as a contrasting type I collagen-based matrix that is cross-linked by a mix of lysine aldehydes and hydroxylysine aldehydes (17). The results show a consistent pattern of β-chain dimers being more prominent than α-chains in extracts of skin, cornea, and tail tendon collagens compared with bone collagen. The skin, cornea, and tendon extracts also reveal higher order oligomers of which γ112 is the most prominent in nondenaturing, acetic acid extracts. The latter γ112 trimer was previously shown to be derived from extracted native type I collagen molecules that contain an allysine aldol between the α1(I) N- and the α2(I) N-telopeptide and between the two α1(I) C-telopeptides. Under the denaturing conditions of SDS-PAGE this runs as the γ112 band.

Tissue-dependent hydroxylation and glycosylation differences between helical domain cross-linking lysines
In-gel trypsin and Lys-C digestion were also used to measure the post-translational status of the known triple-helical domain cross-linking sites in α1(I) and α2(I) chains from the various tissues. The results are summarized in Fig. 2. Type I collagen of skin and cornea differs from that of tail tendon in having K87 fully hydroxylated and glycosylated as glcgal-Hyl, whereas in tail tendon K87 is fully hydroxylated but is not glycosylated. Bone collagen has primarily a mix of glcgal-Hyl and gal-Hyl at K87, the ratio of which varies between species (17).

Figure 1. ... profiles for collagen extracts from skin, cornea, and tendon. Native collagen molecules extracted by 3% acetic acid were run in the left five lanes and heat-denatured collagen from a separate tissue sample in the right five lanes. Loads based on the following original tissue weights extracted were: bovine cornea, 60 g; bovine skin, 30 g; rat skin, 30 g; tail tendon, 15 g; demineralized bone, 300 g.

Collagen cross-linking revisited

Western blot analysis of CNBr-digested tissue collagens for C-telopeptide cross-linked structures
The β-chain dimers, β11 and β12 (Fig. 1), were the first established products of lysyl oxidase cross-linking in collagen, the result of allysine aldol intramolecular dimerization of N-telopeptides (19). Only recently were C-telopeptide aldol dimers discovered, initially from studying the tail tendon collagen of fibromodulin-null mice (34) but subsequently in the skin and tendons of other mutant and WT mice (8,33). These studies showed evidence for both intramolecular and intermolecular C-telopeptide aldol cross-link formation. We therefore designed an analytical screen to profile all forms of C-telopeptide-containing structures in CNBr digests of tissue collagen. To do this a mAb, mAb 1G7, that recognizes an epitope specific to the α1(I) C-telopeptide was applied to SDS-PAGE resolved tissue digests. The results are shown in Fig. 3.
Although immunoreactive yields vary somewhat between individual tissue sources as seen from the lane intensities (which may depend on efficiency of C-telopeptide epitope generation by trypsin), all tissues show similar patterns except bone. The fastest positive band in each lane from skin, cornea, and tendon is the linear α1(I)CB6 sequence that ends with the C-telopeptide. Looking at the rat skin lane as an example, the next slower band is an aldol dimer of CB6 and above that is the linear CNBr partial cleavage product, α1CB7,6. The same bands repeat in the lanes for rat tail tendon, bovine cornea, and skin. In the latter two lanes, with stronger overall band intensities, the higher band is an α1CB6 × α1CB7,6 dimeric partial cleavage product. No other significant cross-linked forms of CB6 are revealed. For example, no band that would be consistent with the proposed molecular origin of HHL as a mature cross-link linking the three CB peptides, α1CB6 × α1CB5 × α2CB4 (10), was recovered from skin or cornea.
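As an illustration of how the CB-peptide nomenclature used here arises, the following minimal Python sketch splits a sequence after each methionine, which is where CNBr cleaves, and models a single missed cleavage of the kind that gives partial products such as CB7,6. The function names and the toy sequence are hypothetical and chosen only for illustration; they are not real collagen chain data.

```python
def cnbr_fragments(sequence):
    """Split a one-letter protein sequence after every methionine (M)."""
    fragments = []
    start = 0
    for i, residue in enumerate(sequence):
        if residue == "M":
            # CNBr cleaves the peptide bond C-terminal to methionine.
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # The C-terminal fragment has no trailing methionine.
        fragments.append(sequence[start:])
    return fragments


def partial_cleavage_products(fragments):
    """Join each pair of adjacent fragments to model one missed cleavage."""
    return [a + b for a, b in zip(fragments, fragments[1:])]


# Hypothetical toy sequence (assumption), ending in a C-telopeptide-like motif.
toy_sequence = "GPAGMGERGLMEKAHDGGR"
fragments = cnbr_fragments(toy_sequence)
print(fragments)
print(partial_cleavage_products(fragments))
```

In this toy case the last fragment plays the role of CB6 and the joined final pair plays the role of a CB7,6-like partial cleavage product.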
The lane from rat bone collagen gives a different band pattern that is nevertheless consistent with the known cross-linking mechanism of C-telopeptides in bone type I collagen. Stable

Figure 2. The four known lysine sites of helical domain cross-linking were interrogated for post-translational status by in-gel trypsin or endo Lys-C digestion followed by LC-MS. A, summary of the results for bovine cornea and skin, rat skin, tail tendon, and bone. B, example of the mass spectra from cornea for the trypsin-digested α1(I) K87 peptide and (D) its fragmentation pattern. C, example of the mass spectrum from cornea of the Endo Lys-C-digested α1(I) K933 peptide and (E) its fragmentation pattern.

Mass spectral identification of CB-peptides
To confirm the CB-peptide band identities, histidine-containing peptides were enriched from tissue CNBr digests by an immobilized metal ion affinity column, then fractionated on a C8 reverse-phase column (Fig. 4). As shown, collected fractions were run serially across lanes of an SDS-PAGE gel. Individual bands were excised and subjected to in-gel trypsin digestion and tandem LC-MS analysis, which established their indicated CB-peptide identities. These results also helped confirm the identities of the bands revealed by Western blot analysis in Fig. 3.

Prominence of allysine aldol cross-linked N- and C-telopeptide dimers in bacterial collagenase-digested cornea collagen
To dissect the cross-linking sites more finely down to smaller peptides using mass spectrometric analysis, the same tissues were digested with bacterial collagenase under conditions that generated reproducible and informative cross-linked peptide structures (8,16). The digests were fractionated by reverse-phase HPLC, individual fractions were analyzed by LC-MS, and their structures were determined manually from the MS/MS fragmentation patterns (8,16). In Fig. 5, the elution positions and the structures of the major N-telopeptide and C-telopeptide cross-linked dimers are identified.

Effect of borohydride reduction on telopeptide aldol dimeric peptide yields
Reduction of tissue with sodium borohydride failed to change the yield of C-telopeptide aldol dimeric peptides, but N-telopeptide dimers were no longer found. Instead, the same N-telopeptide dimeric peptides were recovered in covalent linkage to hydroxylysine at K930 in a peptide fragment from the α1(I) chain.
The masses and fragmentation profile of these three-chained cross-linked structures (Fig. 6) are exactly consistent with reduction of the aldimine bond formed between the α,β-unsaturated allysine aldol aldehyde and the ε-amino group of K930 hydroxylysine. Apparently on tissue reduction, the N-telopeptide aldol adducted to the helical domain hydroxylysine is susceptible to borohydride reduction but the C-telopeptide aldol adduct is not.
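The mass consequence of reducing an aldimine can be made concrete with a short calculation. Reduction of a C=N bond to a secondary amine adds two hydrogen atoms, so a reduced peptide is expected to be heavier by about 2.016 Da per reduced bond. The sketch below is illustrative only: the neutral mass used is a hypothetical placeholder, not a value from this study, and the function names are assumptions.

```python
H_ATOM = 1.007825   # monoisotopic mass of a hydrogen atom, Da
PROTON = 1.007276   # mass of a proton, Da


def reduced_mass(neutral_mass, bonds_reduced=1):
    """Neutral mass after reducing the given number of C=N bonds (+2 H each)."""
    return neutral_mass + 2 * H_ATOM * bonds_reduced


def mz(neutral_mass, charge):
    """m/z of the protonated ion [M + zH]z+ for a given neutral mass."""
    return (neutral_mass + charge * PROTON) / charge


# Hypothetical neutral mass of a cross-linked peptide (assumption, Da).
example_mass = 3000.0
for charge in (2, 3, 4):
    shift = mz(reduced_mass(example_mass), charge) - mz(example_mass, charge)
    print(f"{charge}+ ion: expected m/z shift on reduction = {shift:.4f}")
```

The shift observed in the spectrum scales inversely with charge, which is why a reduced bond appears as roughly 1.008 m/z units at 2+ but only about 0.504 at 4+.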

Identification of an α1(I) C-telopeptide dimer in labile linkage to glycosylated K87 in bacterial collagenase-digested cornea collagen
In further analyses of borohydride-reduced and nonreduced cornea after bacterial collagenase digestion, a large peptide was recovered spread across several fractions of the C8 HPLC chromatogram. The peptide was recovered with the same characteristic properties whether tissue had been reduced or not (Fig. 7). The structure was determined manually based on fragmentation properties of the parent ions and of MS3 of the helical arm throw off (Fig. 7, C and D). The parent ion was unusual in our experience with collagen cross-linked peptides in that the intact helical peptide arm (ions at m/z 1082.6 [1+] and 541.8 [2+]) containing K87 hydroxylysine with sugars (glucosylgalactosyl) attached was released on MS/MS as a major, intact fragment ion. This implied a labile covalent bond between the allysine aldol cross-linking residue of the C-telopeptide dimer and the glycosylated K87 hydroxylysine. The prominent ion at m/z 1049 [4+] (Fig. 7B) matches the mass of the C-telopeptide dimer being released as the α,β-unsaturated allysine aldol peptide dimer.

Figure 3. Western blotting detection of α1(I) C-telopeptide cross-linked structures from CNBr-digested skin, cornea, and tendon gives a similar, characteristic pattern distinct from that of bone. A, illustration of the cross-linking interaction sites between telopeptides and helical domains of 4D-staggered neighboring collagen molecules packed in fibrils. Underneath, a line drawing shows the position of CNBr peptides defined by methionine residues (marked by vertical lines) in rat and bovine α1(I) and α2(I) chains. The mAb 1G7 recognizes an epitope at the C terminus of the α1(I) C-telopeptide whether or not its cross-linking lysine is hydroxylated, oxidized to aldehyde by lysyl oxidase, or cross-linked in oligomeric peptides. B, aliquots of CNBr-digested samples of bovine cornea and skin, rat skin, tail tendon, and demineralized bone were run on 12.5% SDS-PAGE, stained with Coomassie Blue (left panel) or transblotted to polyvinylidene difluoride membrane and α1(I) C-telopeptide-containing bands detected by probing with mAb 1G7 and chemiluminescence development (right panel). Lane loads as follows were adjusted to give approximately equal stained band intensities across the left gel (bovine cornea, 10 g; bovine skin, 9 g; rat skin 8 g; tail tendon, 5 g; bone, 7 g) and the right immunoblot (bovine cornea, 10 g; bovine skin, 2.5 g; rat skin 8 g; tail tendon, 6 g; bone, 2 g).
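The two helical arm ions quoted in this section, m/z 1082.6 (1+) and 541.8 (2+), can be checked against each other with simple charge-state arithmetic: both should correspond to the same neutral mass. The sketch below assumes ordinary protonated ions, [M + zH]z+, and uses hypothetical function names; it is a consistency check, not a reconstruction of the authors' data processing.

```python
PROTON = 1.007276  # mass of a proton, Da


def neutral_from_mz(mz_value, charge):
    """Neutral mass M implied by an observed m/z for [M + zH]z+."""
    return charge * (mz_value - PROTON)


def mz_from_neutral(neutral_mass, charge):
    """m/z expected for [M + zH]z+ given a neutral mass M."""
    return (neutral_mass + charge * PROTON) / charge


# Helical arm: singly charged ion reported at m/z 1082.6.
arm_mass = neutral_from_mz(1082.6, 1)
print(round(arm_mass, 2))                       # 1081.59
print(round(mz_from_neutral(arm_mass, 2), 1))   # 541.8, the reported 2+ ion

# C-telopeptide dimer: ion reported at nominal m/z 1049 with charge 4+.
dimer_mass = neutral_from_mz(1049.0, 4)
print(round(dimer_mass))
```

The agreement between the predicted and reported 2+ values supports reading the two arm ions as charge states of one intact fragment.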

LC-MS analysis of HHL, HHMD, and total cross-linking amino acids after acid hydrolysis
The LC total ion-current profiles of the cross-linking amino acid products of acid hydrolysis of borohydride-reduced bovine tendon, skin, and cornea, and mouse skin are shown in Fig. 8A, together with mass spectra from scrolled regions of these profiles. The results confirm that bovine cornea and skin are rich sources of HHL and HHMD, whereas mouse skin yields only HHMD but no HHL. Bovine flexor tendon reveals products of cross-linking from both the lysine aldehyde and hydroxylysine aldehyde pathways as expected, with hydroxylysyl pyridinoline (HP), HHMD, HHL, and the reduced divalent cross-links HLNL and LNL all present. From skin, desmosine (Des, from elastin) is also detected. Samples of the same tissues hydrolyzed without prior borohydride reduction gave HHL but no HHMD, HLNL, or LNL as expected.

Discussion
The SDS-PAGE patterns of extracted collagen chains from skin, cornea, and tendon are essentially similar, with more prominent β-chain dimers than α-chains (Fig. 1). The cross-linked trimer, γ112, is also consistently present. We have shown that this is the product of C-telopeptide aldol dimer formation in mouse tail tendon and skin collagen, and is especially prominent in fibromodulin-null mice due to increased telopeptide lysine oxidation by lysyl oxidase (34). Similar, abnormally high levels of γ112 and C-telopeptide aldol dimeric cross-links in skin collagen were also observed in mice with null genes encoding Sc65 or P3h3, two members of the prolyl 3-hydroxylase gene family (8,33). The latter two mouse models phenocopy a biochemical abnormality in the post-translational modification of type I collagen that results in underhydroxylation of triple-helical domain lysines at cross-linking sites α1(I) and α2(I) K87 and α1(I) K930 and α2(I) K933 in all tissues but especially skin (8). This results in mostly lysine and little glcgal-Hyl at α1(I) K87 in skin collagen. The increase in γ112 and intramolecular C-telopeptide aldol dimers in skin collagen was attributed to the lack of sugar residues at K87 allowing intramolecular aldols to form when both α1(I) C-telopeptides in the same molecule had their lysines oxidized to aldehydes by lysyl oxidase. In WT skin, intermolecular aldols at the C terminus appeared to be the norm based on the γ and higher cross-linked chain patterns (8).
For the present study, this implies that the state of glycosylation at K87 may sterically determine the reaction properties of C-telopeptide allysines in the growing fibril during spontaneous cross-link formation.
Analysis of the state of hydroxylation of lysine and glycosylation of hydroxylysine at the triple-helical domain cross-linking sites established that in skin and cornea collagens K87 was fully glycosylated, but in tail tendon type I collagen K87 was exclusively Hyl with no attached sugars (Fig. 2). This lack of glycosylation of tendon type I collagen has been noted before (35). To determine whether this had any tissue-dependent, selective effect on C-telopeptide cross-linking interactions, not evident by SDS-PAGE analysis of extracted collagen chains, we next looked at CNBr digests of whole tissue, matrix collagen. The mAb 1G7, which recognizes the C terminus of the C-telopeptide sequence, EKAHDGGR, where K can be Lys, Hyl, or a cross-link to one or more additional peptides, provides a useful screen on Western blotting for all linear and cross-linked structures that have this terminal peptide sequence. Whole tissue was treated with trypsin under nondenaturing conditions to enhance the yield of the C-terminal neoepitope that 1G7 recognizes. All tissues, except a bone control, were selected for comparison based on their collagen being cross-linked exclusively by the lysine aldehyde pathway. Bovine skin and particularly cornea were known to be rich in HHL (36), whereas this cross-link was reported absent from rat and mouse skin and tail tendon collagens (10, 12-14). The results in Fig. 3, showing essentially the same patterns of linear and cross-linked forms of CB6 from skin, cornea, and tendons, are not consistent with the proposed intermolecular sites linked by HHL residues. The origin of HHL as a trivalent cross-link of three chains, an α1(I) C-telopeptide in aldimine linkage to α1(I) Hyl87 adducted to α2(I) H92, was proposed based on N-terminal Edman sequencing analyses of a partially purified tryptic peptide prepared from bovine dermis (10). No band fitting the size of such a combination of CB-peptide fragments is evident in the 1G7 Western blots from bovine skin or cornea, the latter tissue collagen in our hands being the richest source of HHL at about 1 mol/mol of collagen (36).
The only cross-linked form of CB6 on SDS-PAGE from all tissues except bone was a CB6 dimer and a larger CB7,6 partial cleavage form of it (Fig. 3). From a bone CB-digest run as a control, the main cross-linked forms of CB6 are the divalent ketoimine cross-linked peptide, CB6 × α1(I)CB5, and trivalent pyridinoline/pyrrole cross-linked CB6 × CB6 × α1(I)CB5. The latter serves as a useful size marker to compare with the smaller, faster running CB6 aldol dimer from skin, cornea, and tendon CNBr digests.
To further refine the identity of the native cross-linked structures, smaller peptides were prepared by bacterial collagenase digestion. On MS these gave prominent yields of the same C-telopeptide and N-telopeptide aldol dimeric peptides from all skin, cornea, and tendon digests (Fig. 5 shows the results for bovine cornea), whether or not the tissue had been reduced with sodium borohydride. In addition, the bovine cornea digest revealed a larger peptide consisting of the CB6 dimer in labile linkage to α1(I) Hyl87 contained in a short segment of helix (Fig. 7). This adduct was recovered consistently from repeat digests of cornea and also from skin but not rat or mouse tail tendon (data not shown), but in relatively low and variable yields compared with the free C-telopeptide aldol dimeric peptide (Fig. 5). The mass spectral fragmentation behavior of this larger peptide with a strong throw off of the intact GMK(galglc)-GHRG K87 arm is consistent with an interpretation that this was an adduct in labile linkage to the allysine aldol aldehyde group of the C-telopeptide dimer. Variable yields of this peptide together with the more abundant free C-telopeptide dimer were recovered from digests whether or not the tissue was previously treated with sodium borohydride.
This was in sharp contrast to the behavior of the N-telopeptide dimers that were absent from digests of borohydride-reduced tissue and quantitatively converted to the reduced aldimine trimeric peptides identified in Fig. 6. Whether the tissue was reduced or not, the labile C-telopeptide trimer behaved the same on MS, with the characteristic throw off of the intact helical arm shown in Fig. 7. Again in sharp contrast, the MS/MS profile of the borohydride-reduced N-telopeptide trimeric peptides did not yield any intact helical arm fragmentation products (Fig. 6). The low yields of the C-telopeptide aldol/K87 adduct on bacterial collagenase digestion, and of HHL on acid hydrolysis of bacterial collagenase-digested cornea (Fig. 8), are consistent with the C-telopeptide aldol forming a labile adduct with glcgal-Hyl at K87 in vivo, which can break on proteolysis but produces HHL on acid hydrolysis. The mAb 1G7 Western blotting results are consistent with this conclusion (Fig. 3).
The present findings can explain why skin collagen from certain species yields HHL on acid hydrolysis but rat and mouse skin do not. Specifically, bovine, human, and most vertebrates contain the α1(I) C-telopeptide sequence, EKAHDGGR (NCBI accession P02453.3), with histidine (H18C) two residues C-terminal to the cross-linking lysine, whereas the rat and mouse sequence, EKSQDGGR (NCBI accession numbers P02454.5 and P11087.4), does not (Fig. 9). Skin from other species (sheep, goat, deer, as well as cow) all yield HHL on hydrolysis (37) and have the EKAHDGGR sequence conserved. From the present structural results, it is clear that it is this telopeptide histidine imidazole that is incorporated into HHL on acid hydrolysis of native fibrils. If so, this raises the question, what happens to the C-telopeptide aldol dimers in rat and mouse skin on acid hydrolysis? Without the neighboring telopeptide histidine, these native aldol complexes are probably broken down on acid hydrolysis as early studies of allysine aldol properties in collagen concluded (38).
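The sequence comparison described here can be expressed as a simple positional check. The sketch below takes the two C-telopeptide octapeptides quoted in the text and tests for a histidine two residues C-terminal, and an arginine six residues C-terminal, to the cross-linking lysine. The function and variable names are hypothetical, and the lysine position is an assumption fixed by the quoted octapeptides rather than by full-length chain numbering.

```python
# C-telopeptide octapeptides as quoted in the text, keyed by source.
C_TELOPEPTIDES = {
    "bovine (P02453.3)": "EKAHDGGR",
    "rat and mouse (P02454.5, P11087.4)": "EKSQDGGR",
}

LYS_INDEX = 1  # position of the cross-linking lysine within the octapeptide


def motif_report(sequence, lys_index=LYS_INDEX):
    """Return (has_his_at_plus_2, has_arg_at_plus_6) relative to the lysine."""
    if sequence[lys_index] != "K":
        raise ValueError("expected the cross-linking lysine at lys_index")
    has_his = sequence[lys_index + 2] == "H"
    has_arg = sequence[lys_index + 6] == "R"
    return has_his, has_arg


for source, sequence in C_TELOPEPTIDES.items():
    has_his, has_arg = motif_report(sequence)
    print(f"{source}: {sequence} His(+2)={has_his} Arg(+6)={has_arg}")
```

Both sequences retain the arginine, whereas only the bovine-type sequence carries the histidine, which matches the species distribution of HHL described above.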
The intramolecular aldol dimers formed between N-telopeptides on the other hand clearly behave differently from the C-telopeptide aldol dimers in their chemical reaction properties in native fibrils, being quantitatively and stably reduced as aldimine adducts to K930 of an α1(I) chain by sodium borohydride (Fig. 6). Although we did not recover equivalent reduced aldol dimer peptide adducts to α2(I) K933, they are possible and we cannot rule them out.

Proposed origin and mechanism of formation of the histidine-containing adducts HHL and HHMD
The above properties support the proposed mechanisms outlined in Fig. 10 whereby the C-telopeptide aldol dimer resists borohydride reduction as an α1(I) K87-linked enamine tautomer, whereas N-telopeptide aldol aldimine adducts are sodium borohydride-reduced producing HHMD, which is released on acid hydrolysis. The speculated retro-aldol mechanism for generating HHL from the enamine tautomer on acid hydrolysis is proposed based on current knowledge of the base-catalyzed mechanisms that stabilize Schiff's base to enamine intermediates at the active sites of a range of aldolase enzymes (39-42), together with the following unique sequence features of the C-telopeptide to helix cross-linking site.
All vertebrate α1(I) C-telopeptides including rat and mouse have a conserved arginine 6 residues C-terminal (R22C) to the cross-linking lysine (see Fig. 9). Also in skin and cornea, α1(I) K87 is almost all glucosylgalactosyl hydroxylysine, whereas α1(I) K930 and α2(I) K933 are not glycosylated. In contrast to skin and cornea, neither K87 nor K930/933 are glycosylated in rodent tail tendon type I collagen. We believe these chemical features are responsible for the distinctive tissue-dependent reactivity properties of C-telopeptide aldehydes, and their ensuing aldol dimers, compared with N-telopeptides. Specifically, we propose that R22C, which will be in close proximity to the aldehyde in the hairpin loop conformation of each C-telopeptide (43), base-catalyzes both aldol formation and the subsequent enamine tautomer structure as the dominant tautomeric form of the C-telopeptide aldol reaction product with K87. When in addition, K87 is glcgal-Hyl, as it is in skin and cornea, the product is HHL on acid hydrolysis. The hexoses may provide the latent, local reducing agent required by the proposed reaction mechanism. In support of this mechanism, it has been established that short synthetic peptides containing the α,β-unsaturated aldehyde cinnamaldehyde inhibit certain classes of phosphatases by forming a reversible covalent enamine adduct with the guanido group of an arginine residue in their Src homology 2 domain active site (44,45). As enamines, these adducts resisted borohydride reduction. We propose that R22C in α1(I) C-telopeptides can similarly interact with the aldol formed between two K16C allysines and, in native fibrils, influence the equilibrium of the addition reactions between K16C allysines and K87 helix hydroxylysines, favoring the enamine tautomer shown in Fig. 10.

Figure 8. Cross-linking amino acid products in acid hydrolysates of whole tissues were separated from bulk amino acids by organic solvent partition on a hydrated cellulose column, then resolved and analyzed by electrospray LC-MS on the LTQ XL using a Cogent 4 diamond hydride column essentially according to the method of Naffa et al. (37). A, total ion current elution profile from the LC column for hydrolysates of sodium borohydride-reduced samples of bovine tendon, bovine skin, mouse skin, and bovine cornea. The elution positions of the various major collagen cross-linking amino acids, HP, HHMD, HHL, HLNL, LNL, lysinonorleucine, and Des, desmosine (from elastin), are shown. B-E, mass spectra scrolled across the regions of the LC eluent for each tissue sample showing the masses of the various cross-linking entities. Both HHL and HHMD are prominent in bovine skin and cornea but absent from mouse skin. F, acid-hydrolyzed pooled fractions from the C8 RP-HPLC chromatogram of bacterial collagenase-digested cornea containing the C-telopeptide aldol dimer to α1(I) K87 adduct (Fig. 6) reveal a low but significant yield of HHL, as did acid hydrolysates of whole bacterial collagenase-digested cornea samples.

Figure 9. ... The histidine, which is incorporated on acid hydrolysis into HHL, is shown in blue, and the arginine (R22C) that helps catalyze the cross-link formation is shown in green. It is notable that both mouse and rat lack the histidine residue and do not form HHL. Sequences are from Ensembl databases.

Significance
Our findings show that the long-accepted collagen cross-linking pathway that involves histidine addition to produce the mature cross-links in skin, cornea, and other tissues is incorrect. The new concept has important implications for how tissues that use this pathway grow and remodel their fibrillar extracellular matrix.
We conclude that aldol dimers are the main, stable cross-linking bonds formed by C-telopeptides and N-telopeptides in skin, cornea, and certain tendon type I collagens. Their further reaction products form relatively labile intermolecular cross-links. If the aldol-forming bonds are all intramolecular at the N terminus and mostly intramolecular at the C terminus, as is the case with rodent tail tendon, this can explain why rodent tail tendon collagen is highly soluble as native monomers in dilute acetic acid (8,34), because the intermolecular cross-links will be mostly acid-labile aldimines and enamines (Fig. 10). In normal skin collagen, most of the C-terminal aldol bonds appear to be intermolecular (8), which can explain the lower acid solubility of skin collagen versus tail tendon (8).

Figure 10. Proposed mechanism of formation of HHL from C-telopeptide aldols compared with HHMD from N-telopeptide aldols. A, HHL formation on acid hydrolysis is restricted to skin and cornea from species that have a conserved α1(I) C-telopeptide histidine residue. Both this histidine and glucosylgalactosyl hydroxylysine at α1(I) K87 appear to be necessary for HHL formation from tissue on acid hydrolysis. The proposed reaction mechanism involves a retro-aldol cleavage and covalent addition of the telopeptide histidine imidazole to give the accepted structure for HHL (10, 47). B, HHMD is a product of borohydride reduction of the aldimine addition product between N-telopeptide α,β-unsaturated aldol cross-links and K930 hydroxylysine, together with the Michael addition of the histidine at H932 in the same helical sequence through its imidazole group. The peptide structures obtained from borohydride-reduced, bacterial collagenase-digested skin and tail tendon collagen show that the histidine has to be from this same chain, not another α-chain. If the latter occurred, more complex peptides should have been found. The accepted structure of HHMD with the ring nitrogen of histidine imidazole added to the β-carbon of the aldol is shown (10, 37).
The findings are also significant when considering how collagenous tissues that depend on the lysine-aldehyde cross-linking pathway for strength can build and remodel their extracellular fabric (46). Growing tendons, for example, may benefit from having labile intermolecular cross-links that allow collagen fibril remodeling to occur through bond breakage and reformation. In this way, new collagen molecules could be added to increase the diameter of existing fibrils, and fibrils could fuse laterally, whereas the tissue fabric as a whole could continue its mechanical function as it grows with the skeleton and remodels as a mature tissue.
Cornea is a unique tissue that depends on the lysine aldehyde pathway for the natural strength of its highly organized type I collagen fibrillar array (47). The new insight that aldol dimers are the principal source of stable cross-linking bonds in cornea is potentially significant with respect to the recognized clinical benefits of artificially cross-linking keratoconus corneas with riboflavin and then UV-A light (48). The underlying molecular mechanism by which this treatment strengthens corneal tissue and prevents bulging is unclear, but there is experimental evidence that endogenous carbonyl groups are the reactive components that form new cross-links (49). Because cornea type I collagen contains roughly 2 mol of allysine aldol per mole of collagen, the aldol is a strong candidate for the proposed reactive species when corneas are treated with riboflavin and UV-A light.

Tissue sources and preparation
Adult bovine tissues (skin, eyes, and superficial digital flexor tendon) were purchased from Sierra for Medical Science (Whittier, CA). Adult normal Sprague-Dawley laboratory rats and adult normal (C57B6/j) mice were obtained as byproducts from approved and completed animal studies. Tissues were scraped clean and washed in saline (0.15 M NaCl, 0.05 M Tris), pH 7.4, with protease inhibitors (2 mM EDTA, 5 mM benzamidine, 10 mM 1,10-phenanthroline, and 2 mM phenylmethylsulfonyl fluoride) for 24 h at 4°C. Tendon and skin were defatted with chloroform/methanol (1:1, v/v) for 24 h at room temperature. Cornea was extracted in 4 M guanidine HCl, 0.05 M Tris, pH 7.4, with protease inhibitors overnight at 4°C to remove proteoglycans before collagen extraction. Bone was demineralized in 0.1 M HCl for 24 h at 4°C.

Collagen extraction and proteolysis
Type I collagen was solubilized from 2- to 3-mg samples of prepared bone, tendon, skin, and cornea by heat denaturation for 3 min at 90°C in Laemmli buffer (denatured extract), by extraction in 3% acetic acid at 4°C for 24 h, or by cyanogen bromide digestion in 70% formic acid at room temperature for 24 h. Prepared tissues were also digested with bacterial collagenase as described (16) with the addition of 10% (v/v) acetonitrile to the digestion buffer, and collagenase-generated peptides were separated by reversed-phase HPLC (C8, Brownlee Aquapore RP-300, 4.6 mm × 25 cm) with a linear gradient of acetonitrile/n-propyl alcohol (3:1, v/v) in aqueous 0.1% (v/v) TFA and monitored by absorbance at 220 nm.

Cornea CNBr peptide chromatography
An IMAC (immobilized metal ion affinity chromatography) column was prepared by charging HiTrap chelating HP beads (Amersham Biosciences) with CuCl2. CNBr-digested cornea was loaded onto the IMAC column in 0.05 M Tris-HCl, 0.15 M NaCl, 2 M guanidine HCl, pH 8.0, and the bound peptides were eluted with 0.1 M sodium acetate, pH 4.6, containing 2 M guanidine HCl and 0.2 M imidazole. Eluted CNBr peptides were run on reversed-phase HPLC, and individual fractions were run on 12.5% SDS-PAGE.

SDS-PAGE
The method of Laemmli (50) was used with 5 or 12.5% gels for extracts of tissue collagen. Type I α-chains and cyanogen bromide peptides were cut from gels and digested in-gel with trypsin or Endo Lys-C (Mass Spec Grade, Promega, Madison, WI) for mass spectral analyses.

Immunoblotting with mAb 1G7
A mouse mAb (1G7; isotype IgG1,κ) that recognizes a C-terminal neoepitope of the α1(I) collagen C-telopeptide was generated against the synthetic peptide EKAHDGGR. This neoepitope can be generated by trypsin digestion of native collagen. Peptide conjugation, immunization, and hybridoma selection were done as described in Ref. 51 using RBF/DnJ mice. The synthetic peptide EKAHDGGR was used for screening and characterization. Using variable-length C-telopeptide synthetic peptides, we determined that the C-terminal arginine is necessary for 1G7 antibody recognition. Tissue samples were activated with 0.1 mg/ml of trypsin (proteomics grade, Roche Diagnostics) in 0.1 M NH4HCO3 for 24 h and then digested with CNBr in 70% formic acid for 24 h, both at room temperature. CNBr peptides were separated by 12.5% SDS-PAGE and transblotted to polyvinylidene difluoride membrane. Membranes were blocked in 5% nonfat milk, probed with 1G7 mAb (1:5,000 dilution), and detected with a horseradish peroxidase-conjugated secondary antibody (1:10,000, anti-mouse IgG). Blots were developed using SuperSignal West Pico PLUS Chemiluminescent Substrate (Thermo Scientific).

LC-MS peptide analysis
Electrospray LC-MS/MS was performed on tryptic and Lys-C peptides generated from gel bands and individual collagenase HPLC fractions using an LTQ XL ion-trap mass spectrometer (Thermo Scientific) equipped with in-line LC using a C4 5-μm capillary column (300 μm × 150 mm; Higgins Analytical RS-15M3-W045) eluted at 4.5 μl/min. The LC mobile phase consisted of buffer A (0.1% formic acid in MilliQ water) and buffer B (0.1% formic acid in 3:1 acetonitrile/n-propyl alcohol, v/v). The LC sample stream was introduced into the mass spectrometer by electrospray ionization (ESI) with a spray voltage of 4 kV. Proteome Discoverer search software (Thermo Scientific) was used for peptide identification using the NCBI protein database. Proline and lysine modifications were examined manually by scrolling or averaging the full scan over several minutes so that all of the post-translational variations of a given peptide appeared together in the full scan. Cross-linked peptides were identified manually by calculating theoretical parent ion masses and possible MS/MS ions and matching these to the actual parent mass and MS/MS spectrum.
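The manual matching of theoretical parent ion masses to observed ions can be illustrated with a short calculation. The following is a minimal sketch, not the procedure used in this study: the function names, the 0.5-Da matching tolerance, and the `crosslink_delta` parameter (the net mass change introduced by a given cross-link chemistry, which the user must supply) are all assumptions made for illustration. Residue masses are standard monoisotopic values.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # added once per linear peptide (free N and C termini)
PROTON = 1.007276   # mass of the charge carrier in positive-mode ESI


def peptide_mass(sequence, modifications=0.0):
    """Neutral monoisotopic mass of a linear peptide.

    `modifications` is the summed mass shift of any post-translational
    modifications (e.g. hydroxylation), supplied by the caller.
    """
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER + modifications


def mz(neutral_mass, charge):
    """m/z of the [M + zH]z+ ion for a given neutral mass."""
    return (neutral_mass + charge * PROTON) / charge


def crosslinked_mass(sequences, crosslink_delta):
    """Neutral mass of peptides joined by a cross-link.

    `crosslink_delta` is a hypothetical, user-supplied net mass change
    for the cross-link chemistry; no value is assumed here.
    """
    return sum(peptide_mass(s) for s in sequences) + crosslink_delta


def matching_charges(neutral_mass, observed_mz, charges=(1, 2, 3, 4), tol_da=0.5):
    """Charge states whose theoretical m/z lies within tol_da of the observed m/z.

    The 0.5-Da default is an illustrative tolerance, not a published setting.
    """
    return [z for z in charges if abs(mz(neutral_mass, z) - observed_mz) <= tol_da]


if __name__ == "__main__":
    m = peptide_mass("EKAHDGGR")
    print(round(m, 4), round(mz(m, 2), 4))
```

For the linear EKAHDGGR peptide, this gives a neutral monoisotopic mass near 868.42 Da and a doubly charged ion near m/z 435.21. The same functions extend to cross-linked species once the appropriate mass change for the cross-link is supplied.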

LC-MS of cross-linking amino acids
Samples of freeze-dried tissues, both untreated and reduced with sodium borohydride (36), bacterial collagenase-digested tissue, and subsequent chromatographic peptide pools were hydrolyzed in 6 N HCl at 108°C for 24 h. Cross-linking amino acids were enriched from hydrolysates by partition on a hydrated cellulose column (37). Briefly, a column was prepared by pouring a 5% cellulose (CF1, Whatman) slurry in butanol/water/acetic acid (4:1:1, v/v/v) into a 10-ml syringe plugged at the bottom with glass wool. The column was washed three times with butanol/water/acetic acid (4:1:1, v/v/v) before the sample, dissolved in a 5% cellulose/butanol/water/acetic acid slurry, was applied. Free amino acids were removed by washing the column with butanol/water/acetic acid solution. The multicharged bound cross-linking amino acids were eluted with water and freeze-dried for mass spectral analysis. Electrospray LC-MS/MS was performed on the LTQ XL using a Cogent 4 diamond hydride column (15 cm × 1 mm; Microsolv Technology, 70000-15P-1) eluted at 50 μl/min. The LC mobile phase consisted of buffer A (0.1% formic acid in MilliQ water) and buffer B (0.1% formic acid in 80% acetonitrile).
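The volumes needed for the butanol/water/acetic acid (4:1:1, v/v/v) solvent and the cellulose slurry can be worked out with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the 5% cellulose figure is assumed here to mean weight per volume, which the text does not state.

```python
def split_by_ratio(total_ml, ratio):
    """Divide a total volume among components according to a v/v ratio."""
    parts = sum(ratio)
    return [total_ml * r / parts for r in ratio]


def slurry_recipe(total_ml, cellulose_percent_wv=5.0, ratio=(4, 1, 1)):
    """Component amounts for a cellulose slurry in butanol/water/acetic acid.

    Assumes the cellulose percentage is w/v (g per 100 ml of solvent);
    this interpretation is an assumption, not taken from the source.
    """
    butanol, water, acetic_acid = split_by_ratio(total_ml, ratio)
    return {
        "butanol_ml": butanol,
        "water_ml": water,
        "acetic_acid_ml": acetic_acid,
        "cellulose_g": total_ml * cellulose_percent_wv / 100.0,
    }


if __name__ == "__main__":
    for name, amount in slurry_recipe(60).items():
        print(name, amount)
```

Under these assumptions, 60 ml of slurry would require 40 ml of butanol, 10 ml each of water and acetic acid, and 3 g of cellulose.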