Distinct maturations of N-propeptide domains in fibrillar procollagen molecules involved in the formation of heterotypic fibrils in adult sea urchin collagenous tissues.

We have characterized the primary structure of a new sea urchin fibrillar collagen, the 5alpha chain, including nine repeats of the sea urchin fibrillar module in its N-propeptide. By Western blot and immunofluorescence analyses, we have shown that 5alpha is co-localized in adult collagenous ligaments with the 2alpha fibrillar collagen chain and fibrosurfin, two other extracellular matrix proteins possessing sea urchin fibrillar modules. At the ultrastructural level, the 5alpha N-propeptide is detected at the surface of fibrils, suggesting the retention of this domain in mature collagen molecules. Biochemical characterization of pepsinized collagen molecules extracted from the test tissue (the endoskeleton) together with a matrix-assisted laser desorption ionization time-of-flight analysis allowed us to determine that 5alpha is a quantitatively minor fibrillar collagen chain in comparison with the 1alpha and 2alpha chains. Moreover, 5alpha forms heterotrimeric molecules with two 1alpha chains. Hence, as in vertebrates, sea urchin collagen fibrils are made up of quantitatively major and minor fibrillar molecules undergoing distinct maturation of their N-propeptide regions and participating in the formation of heterotypic fibrils.

Among the components of extracellular matrix, collagens are the most abundant of the glycoproteins. In vertebrates, 27 collagen types have been identified (1-4). All of them consist of three identical or different α chains that contain at least one collagenous or triple helical segment, consisting of repeating Gly-Xaa-Xaa triplets. The collagen domains of the three α chains coil around each other into a triple helical structure. The fibrillar collagens, including types I-III, V, and XI, are the best known, and their precursor α chains consist of a main triple helix made up of ~338 Gly-Xaa-Xaa triplets flanked by two non-collagenous regions containing the N- and the C-propeptides. During maturation of procollagens into collagen molecules, the two propeptide domains are generally removed by the action of specific proteinases (1). The resulting collagen molecules participate in the formation of supramolecular structures called fibrils. The invertebrate fibrillar α chains so far described present the same overall structure as that of their vertebrate counterparts (5).
In different sea urchin species, studies have indicated the presence of two fibrillar α chains (1α and 2α) involved in the formation of heterotrimeric molecules [(1α)₂-2α] (6-8). Moreover, traces of a homotrimeric 1α chain have been described in the sea urchin Paracentrotus lividus (9). The complete primary structure of these two fibrillar collagen chains has been characterized in the sea urchin Strongylocentrotus purpuratus (10,11), whereas partial sequences have been described in P. lividus (12-14) and Hemicentrotus pulcherrimus (15). The 2α chain presents a large N-propeptide region including 12 repeats of a 140-145 amino acid module that we have named SURF for sea urchin fibrillar module (14). During sea urchin embryogenesis, collagen fibrils show a uniform diameter of 25 nm and present at their surface periodically distributed extensions corresponding to the 2α N-propeptide (16). In adults, the collagen fibrils are thicker and have been found in various tissues such as the sutural ligament of the test, the spine ligament, the peristomial membrane that bridges the Aristotle's lantern or Echinoid jaw to the test, jaw, and tube feet (17-19). Some of these tissues have been defined as mutable collagenous tissues that exhibit a property specific to echinoderms (20). Indeed, mutable collagenous tissues have the ability to modulate their tensile properties in a time scale of seconds under nervous control without the requirement of muscle cells. In sea urchin, we have previously characterized an interfibrillar component of the spine collagenous ligament, fibrosurfin (19). This extracellular matrix protein consists of a series of epidermal growth factor-like and SURF modules. It is worth noting that the two sea urchin proteins that include SURF modules (2α and fibrosurfin) are almost exclusively distributed in adult collagenous ligaments. Moreover, we have also partially characterized in the sea urchin S. purpuratus a genomic region that could potentially encode an N-propeptide domain evolutionarily related to the 2α N-propeptide (14). The putative protein encoded by this genomic region includes nine repeats of the SURF module and has been given the name 5α.
In this report, we present data concerning the P. lividus 5α chain. This fibrillar α chain presents the same expression pattern as both the 2α chain and fibrosurfin in adult tissues. From its structure, its maturation, and its involvement in the formation of heterotypic fibrils, this new fibrillar α chain might be of importance in the unique properties of mutable collagenous tissues.
The nucleotide sequence reported in this paper has been submitted to the GenBank/EBI Data Bank with accession number AJ601384.

EXPERIMENTAL PROCEDURES
Embryo Culture and Nucleic Acid Purification-P. lividus were purchased from the Arago laboratory (Banyuls-sur-mer, France). Gamete collection, fertilization, and embryo culture were done according to standard protocols (21). Total RNA from embryonic and adult tissues was purified as described previously (19).
cDNA Synthesis and PCR-For all of the RT-PCR experiments, 500 ng of test RNA were reverse-transcribed using random primers and the reverse Expand kit (Roche Applied Science) according to the manufacturer's recommendations. For RACE experiments, the 5′- and 3′-RACE kits from Invitrogen were used. The oligonucleotides used in this study are listed in Table I and were synthesized by Isoprim (Toulouse, France). For PCR, 35 amplification cycles of target single-stranded cDNA were usually carried out using the Taq Expand polymerase kit (Roche Applied Science). PCR conditions, purification, cloning, and sequencing of PCR fragments were carried out as described previously (19).
Antibody Production-To prepare anti-5α monoclonal antibodies, DNA fragments encoding the SURF module R8 were generated by PCR using the RT-PCR cDNA clone X as template with Pfu DNA polymerase (Promega). The 5′ primer (5′-ATAGATCTAGTGCTGTTGCCACCGATGTTG-3′) and the 3′ primer (5′-TATCTGCAGACCTCTACACTTCTGCTC-3′) contained a BglII and a PstI site, respectively. The DNA insert was cloned between the BamHI and PstI sites of a derivative of pT7/7 (U. S. Biochemical Corp.). In this overproducing plasmid, six His codons had been included between the PstI and HindIII sites with a stop codon following the last His codon (22). Production in the BL21(DE3) Escherichia coli strain and purification of the recombinant module R8 were carried out as described previously (16). Mouse monoclonal antibody production, titration by enzyme-linked immunosorbent assay, and characterization by immunoblotting were done using established protocols (23). Generally, SURF modules shared a 20-30% identity, but each SURF module of the 5α chain had its 2α counterpart. Hence, the R8 module of the 5α chain presented 69% identity with the R8 motif of the 2α chain. The specificities of the 41 monoclonal antibodies made against the 5α chain were tested, and only 7 of them cross-reacted with the 2α chain SURF module R8, whereas none of them cross-reacted with SURF module 8 of fibrosurfin. Among the 34 monoclonal antibodies that did not cross-react with the 2α chain, the one designated number 27-7F10 was chosen for use in the experiments described here.
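The cloning strategy above relies on the two primers carrying the stated restriction sites and on BglII-cut ends being ligatable into a BamHI-cut vector. The following is a minimal sketch of how this could be checked; the primer strings are taken from the text with line-break hyphens removed, whereas the function names and the dictionary layout are illustrative assumptions and not part of the original protocol.

```python
# Primer sequences as quoted in the text (5'->3'), hyphens from line breaks removed.
FORWARD_PRIMER = "ATAGATCTAGTGCTGTTGCCACCGATGTTG"
REVERSE_PRIMER = "TATCTGCAGACCTCTACACTTCTGCTC"

# Recognition sequences (5'->3') of the enzymes mentioned in the text.
SITES = {"BglII": "AGATCT", "BamHI": "GGATCC", "PstI": "CTGCAG"}


def find_site(seq, enzyme):
    """Return the 0-based index of the first recognition site, or -1 if absent."""
    return seq.upper().find(SITES[enzyme])


def five_prime_overhang(enzyme):
    """Return the 4-nt 5' overhang left by BglII (A^GATCT) or BamHI (G^GATCC).

    Both enzymes cut after the first base of their site, so the overhang
    is the four bases that follow the cut on the top strand.
    """
    if enzyme not in ("BglII", "BamHI"):
        raise ValueError("overhang only defined here for BglII and BamHI")
    return SITES[enzyme][1:5]


if __name__ == "__main__":
    print("BglII site in 5' primer at index", find_site(FORWARD_PRIMER, "BglII"))
    print("PstI site in 3' primer at index", find_site(REVERSE_PRIMER, "PstI"))
    compatible = five_prime_overhang("BglII") == five_prime_overhang("BamHI")
    print("BglII and BamHI ends compatible:", compatible)
```

Because both enzymes leave the same GATC overhang, a BglII-digested insert can be ligated into the BamHI site of the vector, which is consistent with the cloning described above.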
Isolation of Pepsin-solubilized Fibrillar Collagen-To obtain fibrillar collagen molecules, we used the method described by Omura et al. (7) with slight modifications. All of the preparations were done at 4°C. Test tissue overlaid by epithelia was crushed (40 g wet weight) and partially demineralized overnight in 0.5 M EDTA, 50 mM Tris-HCl, pH 8.0. The tissue was treated with disaggregating solution (0.5 M NaCl, 50 mM EDTA, 0.2 M β-mercaptoethanol, 0.1 M Tris-HCl, pH 8.0) for 3 days, centrifuged at 20,000 × g, and washed three times with distilled water.
The disaggregated collagen molecules then were solubilized by limited pepsin digestion. The pellet was suspended in a pepsin solution with a collagen/pepsin ratio of 100/0.3 (wet weight) in 0.5 M acetic acid, and incubated for 16 h at 4°C. Pepsin was immediately inhibited by the addition of pepstatin A (1 µg/ml, Sigma). The viscous solution was centrifuged at 20,000 × g for 30 min. The supernatant was submitted to sequential dialysis in acidic or neutral solvent with a high salt concentration to resolve all of the fibrillar collagen molecules present in the initial solution (24). First, the supernatant was dialyzed against 1.5 M NaCl in 0.5 M acetic acid for several days. After centrifugation (20,000 × g, 1 h), the precipitate was suspended in 0.1 M acetic acid and frozen. The supernatant was submitted to a second dialysis with 4 M NaCl in 50 mM Tris-HCl, pH 8.5, for several days. The small amount of precipitate was centrifuged and suspended in 0.1 M acetic acid and frozen. Bovine type I collagen was purchased from Colética (Lyon, France).
Collagen samples were separated by 6% SDS-PAGE and analyzed by staining with Coomassie Blue or by Western blotting, the latter after electrotransfer onto polyvinylidene difluoride membranes overnight in 10 mM CAPS, pH 11, 5% methanol. Blots were incubated with a rabbit polyclonal antibody made against the N-telopeptide and the N terminus of the triple helix of the 2α chain (1/55 dilution, Ref. 16) and then visualized with alkaline phosphatase-conjugated goat anti-rabbit IgG (Bio-Rad) secondary antibody developed using the substrate kit from Bio-Rad.
The "1.5 M NaCl/acetic acid precipitate" was separated by two-dimensional electrophoresis using the Bio-Rad Protean isoelectric focusing cell. The precipitate, suspended in 0.1 M acetic acid, was dialyzed against distilled water, lyophilized, and resuspended in the sample buffer (8 M urea, 4% CHAPS, 20 mM dithiothreitol, 0.2% Bio-lyte ampholytes). The sample was applied to a ReadyStrip pH 3-10 immobilized pH gradient strip by active loading and focused in the isoelectric focusing cell. Separation in the second dimension was carried out by loading the IPG strip onto a 6% SDS-PAGE gel followed by staining with Coomassie Blue.
Matrix-assisted Laser Desorption Ionization-Time of Flight (MALDI-TOF)-Spots of interest were excised from the polyacrylamide gels, and proteins were cleaved in-gel using trypsin. For MALDI-TOF sample preparation, dry tryptic peptide extracts were dissolved in 10 µl of 0.1% trifluoroacetic acid. The matrix was a 1 mg/100 µl solution of α-cyano-4-hydroxycinnamic acid (LaserBiolabs, Sophia-Antipolis, France) in CH3CN/H2O (50/50, v/v) containing 0.1% trifluoroacetic acid. We used the dried-droplet method for sample deposition. In this method, 1 µl of sample solution and 1 µl of matrix solution were mixed on the target and allowed to dry. MALDI-TOF mass spectra were recorded on a Voyager DE-PRO mass spectrometer (Applied Biosystems, Courtaboeuf, France) in the 700-5,000 Da mass range using 300-400 shots of laser/spectrum. Delayed extraction source and reflector equipment permitted sufficient resolution to consider monoisotopic peptide masses of [M+H]+ ions. Internal calibration was done using trypsin autolysis fragments at m/z 842.5100 and 2211.1046 Da. Theoretical masses were predicted using the PeptideMass program at www.expasy.org/tools/peptide-mass.html (25) and the 5α sequence and the available P. lividus sequences of the 1α and 2α chains (GenBank accession numbers M25282 and J05422). Furthermore, we took into account hydroxylation of proline residues in position Xaa. To validate the identity of the tryptic peptides, we used a stringent 0.001% maximum deviation between theoretical and experimental masses.
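The peptide mass fingerprinting logic described above (in silico tryptic digestion, prediction of monoisotopic [M+H]+ masses with optional proline hydroxylation, and a 0.001% tolerance) can be illustrated with a minimal sketch. This is not the PeptideMass program itself; the function names are hypothetical, the cleavage rule is the commonly used simplification (cleave after Lys or Arg unless followed by Pro), and the example sequence is invented for illustration.

```python
import re

# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056     # H2O added to the residue sum of a free peptide
PROTON = 1.00728     # charge carrier of the [M+H]+ ion
HYDROXYL = 15.99491  # one oxygen per hydroxylated proline


def tryptic_peptides(seq):
    """Cleave C-terminal to K or R, except when the next residue is P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", seq) if p]


def mh_plus(peptide, n_hydroxy=0):
    """Monoisotopic [M+H]+ mass with n_hydroxy hydroxylated prolines."""
    residues = sum(MONO[aa] for aa in peptide)
    return residues + WATER + PROTON + n_hydroxy * HYDROXYL


def within_tolerance(theoretical, experimental, max_rel_dev=1e-5):
    """True if the relative deviation is at most 0.001% (1e-5)."""
    return abs(experimental - theoretical) / theoretical <= max_rel_dev


if __name__ == "__main__":
    # Invented collagen-like sequence, for illustration only.
    for pep in tryptic_peptides("GPRGAKPGER"):
        print(pep, round(mh_plus(pep), 4))
```

At m/z 1000, a relative deviation of 0.001% corresponds to 0.01 Da, so only peaks within that window of a predicted mass would be accepted as matches in this sketch.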
Immunofluorescence Microscopy-For immunofluorescence analysis, tissues were prepared as described previously (19). Spine bases were dissected, rinsed with artificial sea water (480 mM NaCl, 10 mM KCl, 26 mM MgCl2, 29 mM MgSO4, 10 mM CaCl2, 2.4 mM NaHCO3, pH 8.0) and fixed for 4 h at 4°C in 2.5% paraformaldehyde in artificial sea water. Tissues were demineralized with 0.5 M EDTA at 4°C and then rinsed and frozen in liquid nitrogen. Thin sections were cut on a Cryostat (Leitz) and immunolabeled with 27-7F10 and 11-4E11 (anti-SURF module R2, 2α chain, Ref. 16) antibodies (1 µg/ml) or without antibody for negative controls. Sections were incubated with fluorescein-conjugated goat anti-mouse IgG (1/400, Jackson ImmunoResearch, West Grove, PA) and observed on a Zeiss AxioPlan microscope. For electron microscopy, tests were dissected from individual P. lividus, rinsed with artificial sea water, and fixed at room temperature for 12 h in 3% glutaraldehyde in cacodylate buffer (0.1 M, pH 7.8). Samples were rinsed in the same buffer and post-fixed for 1 h in 1% osmium tetroxide in PIPES buffer (0.1 M, pH 7.4). After rapid washing in water, sections were dehydrated in a graded ethanol series and embedded in London Resin White at 60°C. Blocks were cut to the surface of the tissue and demineralized in 0.5 M EDTA for several days. After reembedding in London Resin White, ultrathin sections were cut on a Reichert-Jung Ultracut ultramicrotome and then immunolabeled with 27-7F10 and 11-4E11 monoclonal antibodies (1 µg/ml) and anti-N-telopeptide rabbit polyclonal serum antibody (1/100). Negative controls were carried out by omitting the primary antibody. The secondary antibodies used were a goat anti-mouse IgG conjugated to 5-nm gold particles and a goat anti-rabbit IgG conjugated to 10-nm gold particles (1/30, British Biocell International, Cardiff, United Kingdom), respectively.
Sections were contrasted with methanolic uranyl acetate and lead citrate and observed with a CM120 Philips electron microscope at the "Centre Technique des Microstructures" (Université Claude-Bernard, Lyon I, Villeurbanne, France).

RESULTS
Using a probe coding for SURF module R8 of the 2α chain, we had previously isolated from a P. lividus genomic DNA library 54 positive clones exhibiting variable labeling intensities. Hence, shotgun analysis of two weakly positive overlapping genomic clones coupled to RT-PCR and RACE approaches led us to characterize fibrosurfin, an extracellular matrix protein that contains 13 SURF modules (19). Using the same strategy, we analyzed two moderately positive overlapping genomic clones (Fig. 1A). Shotgun sequence analysis revealed that we had identified part of a gene that corresponded to the P. lividus ortholog of the S. purpuratus COLP5α gene (14). Hence, as for the 1α and the 2α chains, comparable domains of the 5α chain characterized in these two sea urchin species presented 86-95% identity (Fig. 1A).
Using the available coding sequences, RT-PCR and 5′-RACE were carried out using RNA extracted from test tissue (Fig. 1B; RT-PCR fragments 5-1 to 5-5 and 5′-RACE clone). This permitted us to characterize the first 1,788 residues of the 5α chain. At this point, the sequence revealed that 5α probably represented a new sea urchin fibrillar collagen chain evolutionarily related to the 2α chain. The sequence included 1 putative signal peptide (residues 1-31), 1 large N-propeptide made up of a Tsp-2 module, 9 SURF modules and 1 minor interrupted triple helix, 1 N-telopeptide region, and 90 Gly-Xaa-Xaa triplets. Comparable domains of the 2α and the 5α chains presented an average of 69% identity (Fig. 1B). To determine the C-terminal sequence of the 5α chain, two properties of the 5α chain were used to generate a set of 3′ primers, i.e. the strong similarity between the 2α and 5α chains, and the presence of highly conserved stretches of amino acids in the C-propeptides of all known fibrillar α chains. This led to the synthesis of two RT-PCR products encoding the major triple helix and the N-terminal region of the C-propeptide (RT-PCR products 5-6 and 5-7 in Fig. 1B). This RT-PCR approach, coupled to a 3′-RACE, led us to complete the primary structure of the 5α chain, confirming that we had identified a new sea urchin fibrillar collagen chain (Fig. 1B). It is worth indicating that the N-telopeptide region represents the least conserved region between these two fibrillar collagen chains (Fig. 1C).
To investigate the nature of the 5α chain, we first prepared monoclonal antibodies against a recombinant protein consisting of SURF module R8 of this chain. These antibodies were tested for their cross-reactivity (see "Experimental Procedures") with the SURF module R8 of the 2α chain because these comparable domains presented 69% identity, whereas the identity with other SURF modules of 2α, 5α, and fibrosurfin was <35%. The specificity of these antibodies was also confirmed by the absence of cross-reactivity with recombinant proteins built of SURF modules of 2α and fibrosurfin (data not shown). The presence of the 5α chain was analyzed by Western blot experiments during embryogenesis and in different adult tissues. Similar to fibrosurfin (19), the 5α chain is not detectable during embryogenesis (data not shown). In adults, immunoreactive bands for the 5α chain were detected in the test, peristomial membrane, Aristotle's lantern, the base of the spine, and the tube feet (Fig. 2). The same positive adult tissue pattern has previously been defined for both 2α and fibrosurfin (19). Two clusters of immunoreactive bands for the 5α chain were detected in these adult tissues. Cluster A included bands of molecular mass higher than 200 kDa, whereas cluster B comprised bands of 100-180 kDa (Fig. 2). The significance of the two clusters will be discussed later in the light of the next results. The identification of multiple bands in cluster A and of the lowest apparent molecular mass bands in cluster B might result from proteolytic events during urea extraction from the mineralized tissues. A similar degradation pattern has previously been shown for proteins that include SURF modules, i.e. fibrosurfin and the 2α chain (19). Fig. 3 shows immunostaining of the catch apparatus using monoclonal antibodies specific for the N-propeptide of both the 2α and 5α chains. These two α chains were co-localized to the spine collagenous ligament of the catch apparatus (Fig. 3, A and B) and were also present under the epithelium surrounding the muscle layer at the base of the spines (Fig. 3, D and E). A background of autofluorescence was detected within cells in these regions, resulting from a brown pigmentation (Fig. 3, C and F). Co-localization of these two α chains was also demonstrated in the peristome and in the sutural ligaments of the test (data not shown).
The co-localization of the 2α and 5α chains in adult tissues and the Western blots obtained both in this study and in a previous work (19) are difficult to reconcile with the data indicating that pepsinized sea urchin fibrillar collagen molecules are made of two α chains, the 1α and 2α (6,7,9). Moreover, Edman degradation analysis of pepsinized fibrillar collagens extracted from test after SDS-PAGE separation reveals only sequences specific to the 1α and 2α chains (8). To clarify the discrepancy between our immunological and biochemical studies, we first performed immunoelectron microscopy using monoclonal antibodies specific for the N-propeptide of the 2α or 5α chain or polyclonal antibodies made against the N-telopeptide and the N terminus of the triple helix of the 2α chain. For 5α, 5-nm gold particles were detected on the surface of all of the collagen fibrils in sutural ligaments (Fig. 4, A-C), indicating that its N-propeptide was either unprocessed or closely associated with these fibrils. For the 2α chain (Fig. 4D), its N-propeptide was not present on the surface of those fibrils that contained the 2α chain as shown in Fig. 4, E and F, where 10-nm gold particles coupled to antibodies specific for the 2α N-telopeptide were used. This agrees with a previous study (19) indicating the processing of the 2α N-propeptide in adult tissues and its localization at the periphery of the bundles made of collagenous fibrils aligned in parallel. The low number of gold particles in Fig. 4, E and F, results from the availability of the 2α N-telopeptide at the surface of the fibrils. Hence, we have previously used this polyclonal antibody specific for the 2α telopeptide to localize the 2α chain during embryogenesis (16). Only a treatment of embryos by 8 M urea at 40°C prior to immunolabeling has permitted us to detect the 2α chain in the tissues (16).
A similar treatment of tissues analyzed by electron microscopy could not give accurate results because of the damage of the tissue structure.
The second analysis was carried out to identify the 5α chain at the protein level. Pepsinized fibrillar collagen purified from test was sequentially precipitated with 1.5 and 4 M NaCl (see "Experimental Procedures"). These salt precipitates were analyzed by SDS-PAGE (Fig. 5A). The amount loaded on the gel corresponded to 0.05 and 50% of the total precipitate, respectively. Most of the collagen molecules were recovered after the 1.5 M NaCl precipitation (Fig. 5A, lane 1.5M). The same pattern of bands was obtained for the 4 M NaCl precipitate (Fig. 5A, lane 4M), although the fastest migrating band seemed to have a slightly lower apparent molecular mass than the comparable band in the 1.5 M NaCl precipitate. In the 4 M NaCl lane, the higher and lower molecular mass bands were called 1 and 2, respectively. To identify the faster migrating band in these two salt precipitations, a Western blot was made using a polyclonal antibody specific to the 2α N-telopeptide. As shown in Fig. 5B, the faster migrating band in the 1.5 M lane corresponded to the 2α chain, whereas band 2 in the 4 M lane was apparently not the 2α chain. The 1.5 M NaCl precipitate was resolved in a two-dimensional gel electrophoresis system (Fig. 5C). Several spots of molecular mass identical to the 1α chain were detected, suggesting distinct types of post-translational modification to this fibrillar chain. At the molecular mass level of the 2α chain, two spots were present, a strongly stained spot and a lightly stained spot (called 3), which was more acidic and seemed to have an apparent molecular mass slightly less than the more basic spot. Bands 1 and 2 of the 4 M lane (Fig. 5A) and spot 3 after two-dimensional gel electrophoresis were analyzed by MALDI-TOF. Similar tryptic peptide profiles were obtained for band 2 of the 4 M lane and spot 3 after two-dimensional gel electrophoresis, whereas band 1 of the 4 M lane generated a distinct pattern.
Experimental data were compared with the theoretical molecular masses of tryptic fragments generated from the 5α chain and the available sequences of the P. lividus 1α and 2α chains (12,13). This comparison clearly indicated that spot 3 and band 2 corresponded to the 5α chain, whereas band 1 corresponded to the 1α chain. Peptide fragments identified by MALDI-TOF are shown in Fig. 6 and represent 31.8 and 39.8% of the available triple helical sequences, respectively. The ratio between the upper and the lower bands in the 4 M NaCl precipitate (Fig. 5A) suggested that molecules present in this solution had a [(1α)₂5α] stoichiometry. Moreover, data from Fig. 5, A and C, indicate that in sea urchin adult tissues, the 5α chain represents just a few percent of the fibrillar α chains.
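The coverage percentages quoted above express the fraction of a sequence accounted for by the identified tryptic peptides. A minimal sketch of such a calculation is given below; it assumes that each matched peptide is described by 1-based inclusive start and end positions in the reference sequence, and the function name and example values are hypothetical rather than taken from the data in Fig. 6.

```python
def sequence_coverage(seq_length, peptide_spans):
    """Percentage of residues covered by at least one peptide.

    peptide_spans holds (start, end) pairs, 1-based and inclusive.
    Overlapping peptides are counted only once.
    """
    covered = 0
    last_end = 0  # last residue already counted
    for start, end in sorted(peptide_spans):
        start = max(start, last_end + 1)  # skip residues counted earlier
        if end >= start:
            covered += end - start + 1
            last_end = end
    return 100.0 * covered / seq_length


if __name__ == "__main__":
    # Invented example: three peptides on a 100-residue sequence,
    # the first two overlapping.
    print(sequence_coverage(100, [(1, 10), (5, 20), (50, 59)]))
```

Counting overlapping residues once matters for tryptic maps, because peptides with missed cleavages overlap their fully cleaved counterparts.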

DISCUSSION
In this report, we have shown for the first time, as in vertebrates, that invertebrates possess major and minor fibrillar collagens that are involved in the formation of heterotypic fibrils. Hence, the 5α chain is the first quantitatively minor invertebrate fibrillar collagen characterized to date. From its N-propeptide maturation, its presence in small amounts in comparison to the 1α and 2α chains, the 5α chain might play a similar structural function to that of the vertebrate types V/XI collagens in heterotypic fibrils. In addition, this report confirms that proteins that include SURF modules are located in the vicinity of the mineralized regions of the sea urchin and in adult collagenous tissues.
Two-dimensional gel electrophoresis analysis (Fig. 5C) indicated that 5α represents a small percentage of the fibrillar collagen in test. This low amount and its similarity with the 2α chain (primary structure, co-migration in SDS-PAGE, molecular association with two 1α chains) explain why the 5α chain has not been reported in previous biochemical studies (6-9, 15). Two-dimensional electrophoresis also reveals that the band corresponding to the 1α chain in SDS-PAGE is distributed into several spots. Even though we did not analyze these spots by MALDI-TOF, we suggest that they correspond to distinct post-translational modifications of the 1α chain. These modifications could correspond to different levels of glycosylation because it has been shown in three sea urchin species that the 1α chain is rich in carbohydrate (15).
FIG. 5. Isolation of the 5α chain from sea urchin test. A mixture of pepsinized fibrillar collagen from the test was subjected to two successive salt precipitations, first with 1.5 M NaCl in 0.5 M acetic acid, pH 2.6 (1.5M). The supernatant then was precipitated with 4 M NaCl in 50 mM Tris-HCl, pH 8.5 (4M). The precipitates were separated by 6% SDS-PAGE followed by Coomassie Blue staining (A) and analyzed by Western blotting (B) using the anti 2α N-telopeptide antibody. Bovine type I collagen (Ib) was analyzed for comparison. The 1.5 M NaCl precipitate was analyzed by isoelectric focusing (IEF) two-dimensional SDS-PAGE (C). The locations of the 1α and 2α chains are indicated. Three samples were resolved by mass spectroscopy and are annotated with small numbers (1, 2, and 3).
From their primary structure, the 2α and the 5α chains are closely related. Previously, we suggested that the duplication event leading to the creation of their related genes happened prior to the formation of the genomic region encoding the SURF modules R2-R5 of the 2α chain (14). Similarities between the 2α and 5α chains are also shown at the molecular level. Hence, the 2α chain forms heterotrimeric molecules with two 1α chains. From Fig. 5A (lane 4M) and the MALDI-TOF analysis, it seems that some molecules may be assembled containing a 5α chain rather than a 2α chain to generate [(1α)₂5α] molecules. First, the ratio between the upper (1α) and lower (5α) bands seems to be compatible with this stoichiometry. Second, from the strong similarity between their C-propeptides (70%), the 5α chain might replace the 2α chain during molecular formation. In vertebrates, heterotypic molecules have been shown to include types V and XI chains (26,27). Hence, α1(XI) might replace the α1(V) chain to form [(α1(XI))₂α2(V)] molecules (27). Interestingly, their C-propeptides present 73% identity. Third, the sea urchin α chains lack part of the sequence involved in chain recognition in vertebrates (28), similar to all invertebrate fibrillar collagens characterized to date (29). For this reason, it is tempting to speculate that only the 1α chain might form homotrimeric molecules, whereas the 2α and 5α chains need the 1α fibrillar collagen to be included in heterotrimeric molecules. In this model, the ratio between the [(1α)₂2α] and [(1α)₂5α] molecules will depend on the amount of 2α and 5α chains synthesized.
Although the 2α and 5α chains are closely related and seem to make comparable molecular stoichiometries, their maturation in adult tissues is distinct. Hence, the use of monoclonal antibodies specific to the SURF module R8 of the 2α or 5α chains indicated that the N-propeptide of 2α is cleaved in adult tissues, whereas the 5α N-propeptide is present on the fibril surface. These results suggest that for the 2α chain an N-proteinase absent during embryogenesis cleaves the N-propeptide in a region included between SURF module 9 and the N-telopeptide. A putative Ala-Gln N-proteinase cleavage site is present in the 2α N-telopeptide (Fig. 1C) similar to vertebrate types I-III collagens (30-32). This site could be functional even though no sea urchin N-proteinase has been characterized to date. As for the 5α chain, the N-propeptide is either unprocessed or cleaved in a region preceding SURF module R8. We could not exclude the possibility that the N-propeptide has been processed but is still associated with the fibrils. From Western blot analysis, using a monoclonal antibody specific to the 5α N-propeptide (Fig. 2A), the detection of high apparent molecular mass bands (higher than 300 kDa, cluster A) argues in favor of retention of the complete N-propeptide in mature 5α collagen chains. Hence, the 5α N-propeptide has a theoretical molecular mass of 160 kDa, whereas its triple helix has an apparent molecular mass of 120 kDa (Fig. 5A, lane 4M). Hence, when the antibody specific to the 2α N-propeptide is used in Western blot experiments, positive bands ranging from 60 to 200 kDa are detected (19) for a theoretical molecular mass of 205 kDa for the 2α N-propeptide. The lack of higher molecular mass bands in this blot is in agreement with the removal of the N-propeptide during 2α maturation. All of these data agree with a distinct maturation of collagen molecules including the 2α or 5α chains.
Because of the strong similarity between these two fibrillar chains, it is difficult to understand why 5α does not undergo a similar N-propeptide maturation to the 2α chain. Indeed, like the 2α chain and the 1α chain (10), the 5α chain possesses a putative N-proteinase site in its N-telopeptide (Fig. 1C). One explanation might be that the N-telopeptide is the only region that is poorly conserved between these two chains (Fig. 1C). The structure of this region is important in vertebrates for the activity of the N-proteinase (33). It has also been demonstrated that heat denaturation of type I procollagen prevents processing of the N-propeptide by the N-proteinase (34). Furthermore, in a native type I procollagen molecule, the cleavage of the first two chains is faster than that of the third because of a partial unfolding of the N-telopeptide region (35).
Immunolocalization of the 5α chain in adult tissues agrees with our previous observation that proteins that incorporate SURF modules are present in collagenous tissues and are located around mineralized tissues. Hence, in these tissues, fibrils present SURF modules from the 5α chain at their surface, fibrosurfin is located between the collagen fibrils, and the N-propeptide of the 2α chain is present around fibril bundles. SURF modules have so far been characterized only in sea urchin, and their biological and/or structural functions are still unknown. In mutable collagenous tissues like the spine ligaments, proteins harboring SURF modules might be involved in the variable tensility of these tissues resulting from the modulation of interfibrillar cohesion. However, these proteins are also present in tissues that have not been defined as mutable collagenous tissues, such as the sutural ligaments (20). It is interesting to note that a clear relationship between sutural loosening and skeletal flexibility during growth has been reported previously (36). Johnson et al. (36) indicated that growth-associated changes in ligamental material properties might explain the sutural loosening. One of the hypotheses is that the properties of mutable collagenous tissues may have had evolutionary origins in tissues that were mutable during growth (36,37). At this time, the relation of SURF modules with some unique properties of sea urchin collagenous tissues is rather speculative. However, the presence of 5α N-propeptides on the fibril surfaces might be important for the linking and/or the correct organization of the collagen fibrils present in these tissues. We show here that sea urchin fibrils are more complex than previously shown and that the formation of heterotypic fibrils arose early during evolution.