Post-translational Modifications in Cartilage Oligomeric Matrix Protein

Analysis of the carboxymethylated subunit of human cartilage oligomeric matrix protein (COMP) by matrix-assisted laser desorption time-of-flight mass spectrometry indicated a protonated molecular mass of 86949 ± 149 Da, compared with 83547.0 Da calculated from the sequence. Treatment with N-glycanase caused a reduction in mass of 3571 ± 219 Da, but there was no loss of mass after treatment with O-glycanase or neuraminidase. Peptides containing two putative sites of N-glycosylation were purified and characterized. Analysis of the masses of these after N-glycanase treatment indicated that one was substituted at Asn-101 with an oligosaccharide of mass 1847.2 ± 6.6 Da, and the other was unsubstituted at Asn-124. The remaining site of attachment, at Asn-722, was, therefore, also substituted with an oligosaccharide of mass 1724 ± 226 Da. Analysis of the total monosaccharide content by chemical methods indicated that there were no additional oligosaccharide substituents. The MALDI-TOF mass spectra of COMP from bovine fetal and adult cartilage were compared, indicating a more heterogeneous pattern of substitution at Asn-101 in the fetal form. Since COMP is distributed throughout the pericellular and territorial environments in developing cartilage but occupies the interterritorial zone in mature cartilage, these changes in glycosylation may allow for different intermolecular interactions.

Cartilage oligomeric matrix protein (COMP) 1 is a pentameric glycoprotein consisting of identical monomeric subunits of apparent molecular weight 100,000 (1). It is found in the extracellular matrix of tracheal, nasal, and articular cartilage and is also synthesized by Swarm rat chondrosarcoma cells (2). In tendon, it is distributed around and within tendon bundles (3). The sequences of rat (4) and human (5) COMP have been determined from cDNA clones. The rat and human sequences contain several regions that show high homology with the thrombospondins, having a series of four contiguous type 2 epidermal growth factor (EGF) repeats followed by a series of seven type 3 calcium-binding domains. There is also an RGD cell-binding motif in human, but not in rat, COMP. This motif is also present in bovine COMP, for which a partial sequence is available (4). The human COMP gene is located on chromosome 19p13.1 (5). The interactions between the COMP subunits that give rise to the pentameric structure are mediated by an α-helical portion of the protein close to the N terminus. Oligomerization is independent of disulfide bond formation, although disulfide bonds stabilize the assembled structure (6). The crystal structure of the oligomerization domain of COMP was recently described (6). This structure consists of a bundle of five α-helical strands, each 46 residues in length, that form a parallel coiled coil. The resulting structure is a pore that may act as an ion channel, but the hydrophilic outer surface of this structure indicates that it is not a transmembrane domain (6).
The function of COMP in cartilage is unknown, but a good deal of information has been obtained about its distribution in immature and mature tissues. During development of the rat femoral head, COMP shows a predominant pericellular and territorial distribution around chondrocytes in the cartilage extracellular matrix and in the growth plate (7). As the secondary center of ossification forms, COMP protein disappears from the calcified tissues but persists in the growth plate. In the mature articular cartilage, COMP is prominent both in its mRNA levels, as studied by in situ hybridization, and in protein abundance. Interestingly, the protein is primarily found within the interterritorial matrix compartment at this stage (7). This observation may indicate different roles of COMP in immature and adult tissue. During limb chondrogenesis in mice, COMP expression begins at an early stage, where it is seen in the peripheral region of the developing humerus as the 100 kDa subunit protein (8). Later, it is more uniformly distributed throughout the cartilaginous layer. These observations suggest that COMP, because of its early expression during cartilage development, may play a role in the assembly of the extracellular matrix.
The importance of the role that COMP plays in cartilage development is underscored by the observation that mutations in the COMP gene give rise to pseudoachondroplasia (9) and multiple epiphyseal dysplasia (10), conditions characterized by short stature and cartilage abnormalities. These mutations occur within the type 3 calcium-binding repeats, suggesting an essential role for COMP in calcium-mediated interactions.
COMP has three sites of potential N-glycosylation (4, 5), asparagines 101, 124, and 722 (the residue numbers refer to human COMP), but the extent to which they are substituted has not been determined. The role of carbohydrate substituents in modulating the interactions and stability of extracellular proteins has been well described. In some cases, the nature and degree of substitution vary with the developmental state. For example, aggrecan, which is an abundant proteoglycan in cartilage, has several sites of N-glycosylation within the N-terminal globular G1 domain that become substituted with sulfated polylactosamine chains during maturation (11). In this report, the results of studies on the distribution and structure of carbohydrate substituents in bovine and human COMP, using chemical analysis and mass spectrometry, are described. It is shown that there are some important maturation-related differences in the structure of the N-linked sugars.

¶ To whom correspondence should be addressed: Osiris Therapeutics Inc., 2001 Aliceanna St., Baltimore, MD 21231-2001. Tel.: 410-522-5005, ext. 233; Fax: 410-522-6999; E-mail: fbarry@osiristx.com.
1 The abbreviations used are: COMP, cartilage oligomeric matrix protein; EGF, epidermal growth factor; GdnHCl, guanidine hydrochloride; MALDI-TOF MS, matrix-assisted laser desorption ionization time-of-flight mass spectrometry; PVDF, polyvinylidene difluoride; TFA, trifluoroacetic acid.

EXPERIMENTAL PROCEDURES
Purification of COMP. COMP was purified from mature bovine articular cartilage obtained from the metacarpophalangeal joints by dissociative extraction (1). Briefly, this involved cesium chloride density gradient centrifugation of the 4 M guanidine hydrochloride (GdnHCl) cartilage extract. The fraction of lowest buoyant density (1.35 g/ml), representing one-fourth of the total volume, was collected, dialyzed against water, and concentrated by drying. The concentrate was applied to a Superose 6 column (Pharmacia Biotech Inc.) and eluted with 4 M GdnHCl, 20 mM Tris-HCl, pH 7.4. The material recovered from the void volume peak was dialyzed and applied to a Mono-Q HR 5/5 column (Pharmacia) equilibrated in 10 mM Tris-Cl, 7 M urea, 0.15 M NaCl, pH 7.4. Elution was with a linear gradient of NaCl from 0.15 M to 1.5 M. Human COMP was prepared from femoral head cartilage from a 35-year-old individual using the same procedure. COMP was also prepared from steer fetlock cartilage and from third trimester bovine fetuses.
Monosaccharide Analysis. Samples were hydrolyzed in the liquid phase in the presence of either 2 M TFA or 4 M HCl at 100°C for 3 and 5 h, respectively. To check for losses of monosaccharides during hydrolysis, fetuin was used as a control and was hydrolyzed under conditions identical to those used for COMP. Analysis was carried out on a Dionex (Sunnyvale, CA) LC500 system using a PA-1 column (13). Detection was by pulsed amperometry using a Dionex ED-40 system.
MALDI-TOF Mass Spectrometry. This was carried out using a Hewlett-Packard G2025A instrument in the positive ion mode. Matrix solution (10 μl) was mixed with 1 μl of sample, and a 0.7-μl aliquot was dried under vacuum. Crystal formation was monitored using the Hewlett-Packard G2024A sample prep station. For intact proteins, the matrix used was sinapinic acid (Hewlett-Packard), approximately 100 shots were summed, and the total laser irradiance was 5-6 μJ. For peptides derived from proteolytic digests, α-cyano-4-hydroxycinnamic acid (Hewlett-Packard) was used as matrix, approximately 30 shots were summed, and the total laser irradiance was 0.6-1.0 μJ. An alternative method for sample preparation was carried out as follows. Sample and matrix were applied as already described and allowed to dry, and then 0.7 μl of a formic acid solution (water:acetonitrile:formic acid 100:100:35) was overlaid and allowed to dry (see Footnote 2).
Other Methods. Protein sequencing was carried out on a Hewlett-Packard G1000A protein sequencer using Routine 3.0 methods. For lectin blotting, COMP was reduced and carboxymethylated, separated by sodium dodecyl sulfate polyacrylamide gel electrophoresis on a 10% gel (14), and electrotransferred onto polyvinylidene difluoride (PVDF) membrane. The membrane was probed with lectins SNA (15) and MAA (16) using the CHO kit (Boehringer Mannheim), with bovine fetuin as the standard.

RESULTS
Analysis by MALDI-TOF mass spectrometry of reduced and carboxymethylated COMP from adult human articular cartilage indicated a protonated subunit molecular mass of 86949 ± 149 Da (the average of three measurements, Fig. 1, upper spectrum). The predicted value was 83547.0 Da (5), with the mass difference of 3402 ± 149 Da indicating a low level of post-translational addition. The carboxymethylated subunit protonated molecular mass of COMP prepared by associative extraction from adult bovine articular cartilage was 86560 ± 163 Da (Fig. 1, lower spectrum), close to the value obtained for human COMP. There is a high degree of homology between the bovine, rat, and human sequences (4, 5) in the region where comparisons may be made (residues 299-737 of human COMP, or 59% of the sequence). It was assumed that there is an equally high homology between the sequences in the regions where a direct comparison is not possible. Therefore, many of the mapping studies described here were carried out using bovine COMP and related to predictions made by comparison with the human and rat sequences.

2 A. Woods, personal communication.

FIG. 2. A, lysine-C proteinase peptide map of adult bovine COMP after reduction and carboxymethylation. Separation was carried out on a Vydac C18 column. The peptides were referred to by the notation K-1 through K-17. The retention times are shown in Table I. Absorbance at 217 nm was monitored. B, peptide K-17, eluting at 40.1 min, was analyzed by MALDI-TOF MS before (upper spectrum) and after (lower spectrum) treatment with N-glycanase.
Bovine COMP was digested with lysine-C proteinase, and the peptides were separated by reverse-phase HPLC (Fig. 2A). Several peptides were collected and identified by N-terminal sequencing, MALDI-TOF mass analysis, or both. These are referred to as peptides K-1 through K-17 (Table I). Of particular interest was peptide K-17, with a protonated mass of 11744.3 ± 3.3 Da. The N-terminal sequence was identified as LTVRPLSQCRPGFCFPGVAXTXT by Edman degradation. When aligned with the human COMP sequence (5), this peptide corresponds to residues 62-155. This peptide carries two putative N-glycosylation sites, at Asn-101 and Asn-124. The protonated mass was reduced to 9897.1 ± 3.3 Da by digestion with N-glycanase, a difference of 1847.2 ± 6.6 Da (Fig. 2B). This peptide was digested with trypsin and fractionated by HPLC (Fig. 3A). The tryptic peptides, referred to as T-1 through T-4, were identified by N-terminal sequencing, and their protonated masses were determined by MALDI-TOF MS, with and without treatment with N-glycanase. The results for peptides T-1 and T-3 are shown in Fig. 3, B and C, and are summarized in Table I. One of these peptides (peptide T-1, with sequence CGPCPEGFTGXGSHCADVNECXAHP . . . , corresponding to residues 91-121 of the human sequence, see Table I) was reduced by 1849.8 Da after N-glycanase treatment. The mass of peptide T-3, which contains the other putative N-linkage site with sequence CINTSPGFRCEACPPGFSGPTHEXV . . . , corresponding to residues 122-152 of human COMP, agrees with the calculated mass and was unchanged by N-glycanase treatment. These results indicate that, of the two possible sites of N-glycosylation on peptide K-17, Asn-101 is glycosylated and Asn-124 is not.
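The mass shift reported for peptide K-17 can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the names `Measurement` and `mass_shift` are hypothetical, and the linear addition of the two uncertainties is an assumption inferred from the reported value of 6.6 Da (3.3 + 3.3), not a stated method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """A protonated mass in Da with its reported uncertainty (the ± value)."""
    mass: float
    error: float


def mass_shift(before: Measurement, after: Measurement) -> Measurement:
    """Mass lost on enzyme treatment.

    Assumption: the uncertainties of the two measurements are added
    linearly, which matches the ± 6.6 Da quoted in the text.
    """
    return Measurement(before.mass - after.mass, before.error + after.error)


# Peptide K-17 before and after N-glycanase digestion (values from the text).
k17_untreated = Measurement(11744.3, 3.3)
k17_deglycosylated = Measurement(9897.1, 3.3)

shift = mass_shift(k17_untreated, k17_deglycosylated)
print(f"{shift.mass:.1f} ± {shift.error:.1f} Da")  # prints "1847.2 ± 6.6 Da"
```

The same subtraction applies to any pair of spectra taken before and after a glycosidase treatment.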
Monosaccharide analysis of adult bovine COMP was carried out as described under "Experimental Procedures." An identical amount of sample was subjected to amino acid analysis to gain an accurate measurement of the quantity of protein applied. The result (Table II) showed that adult COMP had approximately 14 mol glucosamine and 9 mol mannose/mol protein.
The MALDI-TOF MS spectrum of the intact fetal COMP after reduction and carboxymethylation showed a double peak indicating protonated mass values of 84661 ± 126 and 86240 ± 46 Da as well as some fragment ions at 59578 and 48171 Da and a doubly charged intact ion (Fig. 4). This mass spectrum differed somewhat from that obtained for adult bovine COMP (Fig. 1) and suggested that the fetal protein may carry a different substitution pattern. Both adult and fetal bovine COMP were digested with lysine-C proteinase and separated by reverse-phase HPLC. The chromatograms, shown in Fig. 5, indicate two areas where a difference can be seen between the adult and fetal samples. These are marked with bold arrows. The adult COMP digest showed an additional peak, at 32.65 min, not present in the fetal sample; furthermore, there was a side-peak at 46 min, unique to the fetal digest. Fraction 1 was separated using the same HPLC column and shown by peptide sequencing to contain two peptides, 270-305 and 426-492 (Table III, numbering refers to the human COMP sequence).

Table footnotes: a ... Fig. 2A) as described under "Experimental Procedures," eluted peaks were collected, and their protonated mass determined by MALDI-TOF. b The number preceding each sequence is the residue number of the corresponding sequence of human COMP (5). c ND, not determined.
The C-terminal portions of the peptides were extrapolated using the measured protonated mass for each and the sequence of bovine COMP, which is known in these regions (4). These derive from repeats 1 and 6, respectively, of the calcium-binding domain of COMP. The MALDI-TOF MS spectrum of the adult COMP fraction (Fig. 6A) showed an ion of protonated mass 3821.8 ± 0.8 Da, which was absent in the fetal COMP fraction. It is likely that this peptide is a variant of peptide 270-305. Fraction 2 from the adult bovine COMP digest was identified as peptide K-17 (Table I) and has already been described (Fig. 2B). The corresponding peptide isolated from fetal COMP showed a cluster of peaks ranging in protonated mass from 11134 to 11714 Da (Fig. 6B).
To confirm that COMP contained sialic acid, its reactivity with lectins SNA and MAA was determined. These lectins show specificity for oligosaccharides that carry terminal α2-6-linked and α2-3-linked sialic acids, respectively (15, 16). Reduced and carboxymethylated adult bovine COMP was separated by SDS-polyacrylamide gel electrophoresis and electroblotted onto PVDF (Fig. 7). Reaction was detected with SNA and not with MAA, indicating the presence of α2-6-linked sialic acid.
In these mapping experiments, the peptide carrying the third potential N-glycosylation site at Asn-722 was not isolated. One possible explanation for this would be that the protein core in the COMP preparations used was truncated and did not possess an intact C-terminal end. This could lead to the loss of Asn-722, which is located within 16 residues of the mature C terminus. An experiment was therefore conducted to establish that the COMP preparations used in this study had the predicted C terminus. Reduced and carboxymethylated human COMP was digested with aspartate-N proteinase, and the unseparated digest was analyzed by MALDI-TOF MS. One of the peptides identified had a protonated mass of 1814.6 ± 0.5 Da (Fig. 8), consistent with the C-terminal peptide (DTIPEDYETHQLRQA), with predicted mass of 1816.92 Da. This indicated that the C terminus was not in fact truncated.
Analysis of human COMP by MALDI-TOF MS following treatment with a variety of glycosidases (Fig. 9) was carried out. Treatment of the reduced and carboxymethylated monomer with N-glycanase led to a reduction in mass of 3571 ± 219 Da. Further treatment with neuraminidase and O-glycanase did not lead to any significant reduction in mass, indicating the absence of O-linked oligosaccharides. The mass difference resulting from N-glycanase treatment, however, was consistent with the presence of 2 N-linked oligosaccharides. Since Asn-101 carries a substituent of average mass of 1847.2 ± 6.6 Da, and Asn-124 is unsubstituted, the additional N-linked oligosaccharide resides on Asn-722.

FIG. 3. ... Table I. Peptides were digested with N-glycanase and analyzed by MALDI-TOF MS. The spectra obtained before (upper spectrum) and after (lower spectrum) N-glycanase treatment of T-1 (B) and T-3 (C) are shown. The protonated masses shown are for individual measurements and, therefore, differ slightly from the averaged values given in the text.
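The predicted mass of the C-terminal peptide can be checked from its sequence. The following is a minimal sketch, assuming standard average (not monoisotopic) residue masses and an unmodified peptide; the peptide contains no cysteine, so carboxymethylation does not affect it. The function name `protonated_average_mass` is hypothetical.

```python
# Average residue masses (Da) of the 20 standard amino acids.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153   # average mass of H2O, added once for the free termini
PROTON = 1.0073   # mass of the charge-carrying proton in [M+H]+


def protonated_average_mass(sequence: str) -> float:
    """Average mass of the singly protonated, unmodified peptide [M+H]+."""
    residues = sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence)
    return residues + WATER + PROTON


c_terminal_peptide = "DTIPEDYETHQLRQA"
print(f"{protonated_average_mass(c_terminal_peptide):.2f} Da")  # 1816.92 Da
```

Under these assumptions the calculation reproduces the predicted value of 1816.92 Da quoted in the text, against which the measured 1814.6 ± 0.5 Da was compared.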

DISCUSSION
We have used MALDI-TOF and chemical analysis to obtain detailed information on the structure and distribution of carbohydrate moieties on COMP. On the basis of the data presented, it is concluded that in adult COMP there are two sites of N-glycosylation, at Asn-101 and Asn-722. The third N-linkage site (Asn-124) is unoccupied. The difference between the measured mass and that predicted from the sequence of human COMP indicated that the mass of all post-translationally added groups was 3402 ± 149 Da. Analysis of the total neutral monosaccharide and hexosamine content (Table II) indicated that carbohydrate would account for all of these substituents since they contribute an additional mass of 3700-4300 Da, depending on the degree of acetylation of hexosamine residues, to the mass of the COMP subunit. The mass of the N-oligosaccharide at Asn-101 was estimated to be 1847.2 ± 6.6 Da, and this is consistent with the expected mass of a high mannose oligosaccharide such as that indicated in Table IV, with the structure (HexNAc)2-(Man)8-(Fuc)1. The oligosaccharide on Asn-722 had a mass of 1723.8 ± 225 Da. The error associated with this measurement means that a number of interpretations can be made, but the structure must contain sufficient glucosamine and sialic acid to account for the presence of these sugars, as measured by monosaccharide analysis and lectin blotting.
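The consistency between the proposed composition and the measured mass at Asn-101 can be illustrated numerically. This is a sketch under stated assumptions: the residue masses are standard average values for monosaccharide residues (free sugar minus water), mannose is counted as a hexose and fucose as a deoxyhexose, and the roughly 1 Da shift from conversion of asparagine to aspartate by N-glycanase is ignored because it lies well inside the reported error. The names used are hypothetical, and errors are again added linearly.

```python
from typing import Mapping

# Average masses (Da) of monosaccharide residues (free sugar minus water).
RESIDUE_MASS = {
    "HexNAc": 203.19,  # N-acetylhexosamine, e.g. GlcNAc
    "Hex": 162.14,     # hexose, e.g. mannose
    "dHex": 146.14,    # deoxyhexose, e.g. fucose
}


def glycan_residue_mass(composition: Mapping[str, int]) -> float:
    """Sum of residue masses for a glycan composition."""
    return sum(RESIDUE_MASS[sugar] * n for sugar, n in composition.items())


# Proposed composition at Asn-101: (HexNAc)2-(Man)8-(Fuc)1.
asn101_composition = {"HexNAc": 2, "Hex": 8, "dHex": 1}
calculated = glycan_residue_mass(asn101_composition)

observed, observed_error = 1847.2, 6.6  # Da, from peptide K-17
consistent = abs(calculated - observed) <= observed_error
print(f"calculated {calculated:.1f} Da, within reported error: {consistent}")

# Mass left for the remaining site after subtracting the Asn-101 glycan
# from the total loss on N-glycanase treatment of the intact subunit.
total_loss, total_error = 3571.0, 219.0
remaining = total_loss - observed
remaining_error = total_error + observed_error
print(f"remaining site: {remaining:.1f} ± {remaining_error:.1f} Da")
```

The calculated composition mass falls within the ± 6.6 Da window, and the subtraction gives the value of about 1724 Da attributed to the second occupied site.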
In fetal COMP, the oligosaccharide substituent at Asn-101 exists in several structural forms (Fig. 6B). The most likely structural arrangements of oligosaccharide substituents that would give rise to this cluster are shown in Table IV. As in the case of the adult structure, the masses are consistent with a core structure of (GlcNAc)2-(Man)3 and additional substitutions that contain fucose, sialic acid, glucosamine, and mannose residues. Addition of these outer chain structures occurs in the trans region of the Golgi prior to secretion (17), and the final oligosaccharide structure depends on the rate at which the protein core moves through the Golgi apparatus. It may be that in fetal tissue there is an additional level of tight regulation relating to the rate of movement of the protein through the Golgi. This may in turn regulate exactly where the molecule is located in the extracellular matrix since changes in glycosylation may influence binding to other matrix components. COMP has been shown by immunohistochemical analysis to be distributed throughout the pericellular and territorial environments in developing cartilage and the interterritorial compartment in mature cartilage (7). These age-related changes could also reflect differences in the content or binding affinity of COMP binding molecules in these compartments. It is very likely that the changes in glycosylation of COMP described here allow for different intermolecular interactions.

Table III footnote: The underlined residues were sequenced by Edman degradation. The remaining portion of the sequence is extrapolated using the known portion of the bovine COMP sequence (4). The numbers in brackets refer to the start position of the sequence when aligned with the human COMP sequence (5).

FIG. 7. ... 1 and 3). The membrane was probed with lectin SNA (lanes 1 and 2) and MAA (lanes 3 and 4). 1 μg of reduced and carboxymethylated COMP (lanes 2 and 4) was used. The arrows indicate the positions of molecular weight markers.
FIG. 8. MALDI-TOF mass spectrum of a total aspartate-N proteinase digest of human COMP after reduction and carboxymethylation. The peak with mass 1814.6 Da, marked with an asterisk, is the C-terminal peptide, indicating that the protein core of COMP is intact.
FIG. 9. MALDI-TOF mass spectra of reduced and carboxymethylated COMP subunit from adult human articular cartilage after matrix acidification with formic acid. The untreated COMP subunit is shown, together with the sample after digestion with N-glycanase (N-gly) and subsequent digestion with neuraminidase (Neu) and O-glycanase (O-gly).