Collagen XXIV, a Vertebrate Fibrillar Collagen with Structural Features of Invertebrate Collagens

Tissue-specific assembly of fibers composed of the major collagen types I and II depends in part on the formation of heterotypic fibrils, using the quantitatively minor collagens V and XI. Here we report the identification of a new fibrillar-like collagen chain that is related to the fibrillar α1(V), α1(XI), and α2(XI) collagen polypeptides and which is coexpressed with type I collagen in the developing bone and eye. The new collagen was designated the α1(XXIV) chain and consists of a long triple helical domain flanked by typical propeptide-like sequences. The carboxyl propeptide is classic, with 8 conserved cysteine residues. The amino-terminal peptide contains a thrombospondin-N-terminal-like (TSP) motif and a highly charged segment interspersed with several tyrosine residues, like the fibril diameter-regulating collagen chains α1(V) and α1(XI). However, a short imperfection in the triple helix distinguishes α1(XXIV) from other chains of the vertebrate fibrillar collagen family. The triple helical interruption and additional select features in both terminal peptides are common to the fibrillar chains of invertebrate organisms. Based on these data, we propose that collagen XXIV is an ancient molecule that may contribute to the regulation of type I collagen fibrillogenesis at specific anatomical locations during fetal development.

Vertebrate collagens are a large family of extracellular proteins that provide mechanical stability to the connective tissue of virtually every organ system. There are at least 40 collagen chains that trimerize into 27 types (1-9), which, in turn, form a large variety of specialized macroaggregates. Structural considerations and the architecture of the resulting polymers have segregated individual collagen molecules into functionally distinct groups (10). Among them, the fibril-forming (fibrillar) collagens represent the most abundant product synthesized by connective tissue cells and include the highly expressed types I-III and the quantitatively minor collagen types V and XI. Types I, III, and V are distributed widely in non-cartilaginous tissues, and types II and XI collagen are found almost exclusively in cartilage and the eye. The pleiotropic manifestations of human fibrillar collagenopathies have dramatically underscored the importance of this ancient group of extracellular proteins in maintaining tissue integrity (1, 11-15).
Fibrillar collagens display a common molecular structure that consists of a long collagenous domain made of ~330 Gly-X-Y triplets flanked at both ends by non-collagenous propeptides. Extracellular removal of propeptides initiates the process of maturation, facilitating trimer self-assembly into fibrils (for review, see Ref. 16). Ultimately, fibers are organized into specific spatial arrays that are responsible for the properties of individual tissues. It follows that the regulation of fibril diameter is an important determinant of connective tissue function. For instance, large diameter fibrils, spatially arranged in unidirectional bundles, are appropriate for the integrity of tendons. However, such structures would impair the tissue integrity of the cornea, where transparency is dependent upon having thin diameter fibrils arranged in orthogonal layers. One method of controlling fibril diameter is by assembling heterotypic fibrils, incorporating either type V or XI collagen into fibrils with types I or II collagen, respectively (17, 18). Types V and XI collagen most likely serve as fibril diameter regulators because they retain a bulky amino-terminal portion attached to the triple helix after their final extracellular processing (17, 19). The retained N-peptide domain cannot be embedded in fibrils but instead projects from the fibrillar surface, inhibiting the lateral aggregation of additional molecules onto the fibril (20, 21). Heterotypic assembly of isolated fibrillar collagen molecules and loss-of-function mutations of minor fibrillar collagen types in mice have provided in vitro and in vivo support for this regulatory model of fibrillogenesis (17-24).
Here we report the identification of a novel fibrillar collagen chain, designated α1(XXIV), which contains an amino-terminal domain closely related to those of the types V and XI collagen subunits. We show that embryonic expression of the mouse gene (Col24a1) is confined to the developing eye and skeleton.
We therefore propose that collagen XXIV may participate in regulating type I collagen fibrillogenesis at specific anatomical locations during fetal development. Interestingly, the α1(XXIV) chain also displays structural features unique to invertebrate fibrillar collagens.

EXPERIMENTAL PROCEDURES
Identification and Full-length Cloning of α1(XXIV) Collagen cDNA-The strategy for isolating α1(XXIV) collagen is presented first, followed by descriptions of the methods employed. A BLAST search (25) of the data base of expressed sequence tags (dbEST) (26) looking for amino acid sequence homology with the COOH-terminal third of the α1(V) collagen chain (Swiss-Prot accession number BAA14323) yielded one clone (GenBank accession number AA331798) as a possible candidate for a novel human collagen cDNA. The clone, EST 35663, contained an insert about 2.4 kb in length. About 255 bp had been sequenced, encoding 85 amino acid residues with a perfect Gly-X-Y triplet structure. The EST was purchased from the American Type Culture Collection and sequenced in its entirety using the method described below. The sequence revealed codons for an additional 33 amino acid residues of Gly-X-Y structure (i.e. 11 more triplets) and 235 amino acid residues of a non-collagenous (NC) domain. This was followed by a stop codon and ~1.4 kb of 3′-untranslated region (UTR). From the sequence of the EST clone, nested primers were designed for 5′-RACE, using human placental Marathon ready cDNA (Clontech) as template, as described previously (27). The method was adapted as follows. Because the collagen XXIV mRNA was rare in the pool of placental mRNA, a Long Expand PCR Kit (Roche Applied Science) was used for PCR with the following conditions: denaturation, 94°C for 3 min; 10 cycles of 94°C for 30 s, 63°C (−0.5°C/cycle) for 30 s, 68°C for 4 min; 65 cycles of 94°C for 30 s, 58°C for 30 s, 68°C for 4 min (+10 s/cycle); a final extension period at 68°C of 8 min. The PCR samples from the first round were purified (PCR Purification Kit; Qiagen, Valencia, CA), and 2% of the sample volume was used in the second round of PCR using the PCR protocol given above. These PCR products were purified from agarose gels using Gel Purification Kits (Qiagen) and either subcloned into pCRII or pCR2.1 vectors (Invitrogen) or used directly for sequencing. By performing RACE repeatedly with primer sequences derived from each previous RACE, overlapping segments representing the full-length ~6.5 kb mRNA were obtained.
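The cycling conditions given above can be written out step by step. The following is a minimal sketch in Python, not the authors' instrument program: the function name and the tuple layout are hypothetical, and it assumes that the 0.5°C annealing decrement and the 10-s extension increment each take effect from the second cycle of their phase.

```python
def touchdown_program():
    """Expand the RACE PCR conditions into a flat list of
    (step name, temperature in deg C, duration in seconds) tuples.

    Assumption: the first cycle of each phase uses the stated starting
    value, and the per-cycle change applies from the second cycle on.
    """
    steps = [("initial_denaturation", 94.0, 180)]

    # Phase 1: 10 touchdown cycles, annealing starts at 63 deg C (-0.5 per cycle).
    for cycle in range(10):
        steps.append(("denature", 94.0, 30))
        steps.append(("anneal", 63.0 - 0.5 * cycle, 30))
        steps.append(("extend", 68.0, 240))

    # Phase 2: 65 cycles at a fixed 58 deg C annealing temperature,
    # extension starts at 4 min (+10 s per cycle).
    for cycle in range(65):
        steps.append(("denature", 94.0, 30))
        steps.append(("anneal", 58.0, 30))
        steps.append(("extend", 68.0, 240 + 10 * cycle))

    steps.append(("final_extension", 68.0, 480))
    return steps


if __name__ == "__main__":
    program = touchdown_program()
    anneal_temps = [temp for name, temp, _ in program if name == "anneal"]
    print("touchdown annealing temperatures:", anneal_temps[:10])
    print("last extension time (s):", program[-2][2])
```

Under these assumptions the last touchdown cycle anneals at 58.5°C, just above the fixed 58°C used in the second phase.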
To confirm the nucleotide sequence and control for PCR-induced nucleotide substitutions, gene-specific primers were used to reamplify the entire cDNA by RT-PCR from human hip (tissue composed of both bone and cartilage) mRNA. The first strand cDNA was synthesized using random primers; routine PCR was used to generate overlapping products that were sequenced directly or were subcloned into pCRII vector (Invitrogen) and sequenced. The resulting composite hip cDNA corresponded exactly with the human placental α1(XXIV) collagen cDNA sequence. The human α1(XXIV) collagen cDNA sequence is deposited in GenBank under accession number AY244357. The human protein accession number is NP_690850.
Mouse cDNA sequences were obtained from the GenBank data base by homology searches with the human sequence. All sequences in the data base were a mix of exon and intron structure, so that the coding regions were not immediately obvious. XM_149399 encodes the 5′-UTR and about 94 amino acid residues at the NH2 terminus. XM_143611 begins halfway into the minor triple helix and continues coding about 200 amino acid residues of the major triple helix. We closed the gap between XM_149399 and XM_143611 by amplifying a fragment of α1(XXIV) from mouse liver cDNA and cloning the piece into pCRII vector for sequence analysis. This mouse clone (GenBank accession number AY243578) overlaps XM_149399 by 27 codons and XM_143611 by 9 codons, and represents the bulk of the N-peptide, extending into the minor triple helix. It also contains an additional 20 amino acid residues after the TSP domain which are not found in the human cDNA. Additional clones identified from the data base included XM_143612 and XM_131211. The former encodes a fragment covering 420 residues of the triple helix, ending about 23 codons prior to the imperfection. XM_131211 corresponds to the 3′-end, beginning about 74 amino acid residues into the C-propeptide. It continues into the 3′-UTR for more than 1,000 nucleotides. All mouse cDNAs were compared with collagen XXVII sequences to ensure they represent murine α1(XXIV) collagen.
Sequence Analysis-Nucleotide sequences were determined with a Thermo Sequenase Cycle Sequencing Kit and 33P-labeled dideoxy-NTPs (Amersham Biosciences) using either M13 forward or reverse primers when the cDNA fragment had been cloned into a vector. Alternatively, gene-specific primers, synthesized on an ABI oligonucleotide synthesizer, were used to sequence PCR products directly. A 1.5:1 ratio of inosine to guanosine was included in the sequencing mix. Sequence data were assembled and manipulated using Genetyx-Max 8.0 and Genestream-1 at www2.igh.cnrs.fr/ (Software Development Co., Ltd.; EERI, France). The signal peptide cleavage site, between residues 38 and 39, was predicted by visual inspection, using the rules of von Heijne (28), and by using genome.cbs.dtu.dk/services/SignalP/ (29).
Sources of RNA and Synthesis of First Strand cDNA-Human corneas were obtained from the Lions Eye Bank of Oregon (Portland). Other corneas were dissected from whole eyes obtained from this same eye bank. Mouse corneas, as well as skin, were dissected from C57 mice purchased from Jackson Laboratories (Bar Harbor, ME). A segment of human hip was a gift from Maria Nurminskaia (Tufts University School of Medicine). Human retina was obtained from William Brunken (Tufts University School of Medicine) and from whole eyes from the Lions Eye Bank of Oregon. Mouse retina, as well as bone and tendon, were purchased from Pel-Freez Biologicals (Rogers, AR). All tissues were immediately frozen in liquid nitrogen until RNA isolation. For PCR, RNA was isolated from each tissue with Trizol (Invitrogen), following the manufacturer's instructions. Then Qiagen RNeasy Kit columns were used for on-column DNase digestion, following appendix D instructions within the RNeasy Mini handbook. Total RNA was quantified by absorbance at 260 nm. 200 ng of RNA was converted to first strand cDNA in a 20-µl volume and treated with RNase H as described previously (2, 30, 31). To have additional sources of eye mRNA, Clontech Marathon ready human retina cDNA and whole human eye RNA were purchased.
Routine PCR-Nonradioactive 35-cycle PCRs were performed as described previously (2). Taq polymerase was from Invitrogen, 10× PCR buffer was from PE Applied Biosystems, and dNTPs were from Roche Applied Science. A PerkinElmer 9600 was used with the following program: a preliminary cycle of 94°C for 30 s, 35 cycles of 94°C for 15 s, 56°C for 25 s, and 72°C for 1.5 min, and a final elongation cycle of 72°C for 10 min. The identity of all PCR products was confirmed by sequencing.
Relative Reverse Transcription-PCR-Relative RT-PCR was performed as described previously (30, 31), radiolabeling the products for quantitation. 27 cycles proved to be in the linear range of amplification for each collagen mRNA in each tissue tested (mouse bone, cornea, skin, retina, and tendon). For collagen XXIV, a 315-bp product was amplified with sense primer CCAGGACCACATGGAAATCC and antisense primer AGAACCGATGAAACCTCGGAC. For a 340-bp mα2(I) collagen product, the forward primer was CAGAGTGGAACAGCGATTACT, and the reverse was GCCCGTCTCCTCATCCAGGTACG. To make a 349-bp collagen III cDNA product, the forward and reverse primers were GATG(C/T)(A/T)(C/G)CCA(C/T)CTTGGTCAGTCCTATG and CAGT(A/T)GG(AG)CATGATTCACAGATTC, respectively. For α1(V) collagen, the primer pair to make a 294-bp product was CACA(A/G)CTACGTGGA(C/T)TA(C/T)GC and GGGCCAAGAAGTGATTCTGG. A ratio of 1:20 for the 18 S primer:Competimer (Ambion, Inc., Austin, TX) was found to produce an amplification signal comparable with the collagen XXIV signal in 27 cycles of PCR.
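Several of the primers above are degenerate, with alternative bases written as (A/G) or (C/T). As an illustration only, the sketch below enumerates the individual oligonucleotides that such a notation stands for; the function name is hypothetical, and the parser handles only the slash-separated form, so a position written without a slash would be treated as literal text.

```python
import itertools
import re

# Matches one degenerate position written as (A/G), (C/T), and so on.
DEGENERATE_SITE = re.compile(r"(\([ACGT](?:/[ACGT])+\))")


def expand_degenerate(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = []
    for part in DEGENERATE_SITE.split(primer):
        if part.startswith("("):
            # "(A/G)" -> ["A", "G"]
            choices.append(part[1:-1].split("/"))
        elif part:
            # Fixed stretch of sequence: a single choice.
            choices.append([part])
    return ["".join(combo) for combo in itertools.product(*choices)]


if __name__ == "__main__":
    # Sense primer for the alpha1(V) collagen product, as listed in the text.
    for variant in expand_degenerate("CACA(A/G)CTACGTGGA(C/T)TA(C/T)GC"):
        print(variant)
```

With three two-fold degenerate positions, this primer corresponds to eight distinct 20-mers.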
Products were loaded onto 5% denaturing acrylamide gels. Radioactive bands were excised from the dried gel and counted, and collagen cDNA products were normalized to the attenuated 18 S signal. All samples were run at the same time and loaded on the same gel to minimize variability.
In Situ Hybridization-In situ hybridizations were carried out using 35S-labeled riboprobes on tissue sections of staged mouse embryos (32), as described previously (33, 34). A ~1,000-bp fragment of the 3′-UTR of the mouse α1(XXIV) collagen cDNA was made by designing primers (nucleotides 521-553 for the sense strand and the reverse complement of 1508-1527 for the antisense strand) from XM_131211. The template used for PCR was C57 mouse genomic DNA. This fragment was cloned into pCR2.1. The insert was excised with restriction enzymes and ligated into pBluescript II (Stratagene, La Jolla, CA) to employ the T3 and T7 promoters for riboprobe generation. The accuracy of the insert was verified by nucleotide sequence analysis. Selective linearization of the plasmid allowed for generation of sense and antisense riboprobes using T3 and T7 polymerases. Embryos and tissues were fixed in paraformaldehyde, embedded in paraffin, and sectioned at 8 µm.
Deparaffinized sections were proteinase K-treated and incubated with 35S-UTP-labeled riboprobes. Slides were dipped in emulsion and exposed at 4°C for 5-8 days and counterstained with hematoxylin and eosin for brightfield views. Darkfield photographs were taken with a Zeiss Axioscope or Nikon dissecting microscope and a digital camera system. Positive hybridization signal was verified in several ways. One was multiple hybridizations on the same tissue from different dissections. Another was sense strand control hybridization. A third method, performed on all sections, was to compare the sections by brightfield and darkfield: the silver grains are easily seen in brightfield, and their presence confirms positive hybridization signals observed in darkfield photos. This method was particularly useful for detecting nonspecific binding that occurs at external surfaces and the false positive signals caused by blood cells (i.e. erythrocyte scatter). For post-processing of the data, Photoshop 7.0 imaging software (Adobe) was used.

RESULTS
Identification of Collagen XXIV cDNA-A search of the GenBank dbESTs for collagen domain homologies identified three cDNAs, each encoding a potentially unique collagen. These EST clones were purchased and sequenced. Full-length copies of each mRNA were obtained by 3′- and 5′-RACE using a commercial human library as a template. Once it became clear from sequence analysis that the cDNAs did, indeed, represent novel collagens, they were designated as the α1 chains for the next available numbers in the collagen family, i.e. types XXII, XXIII, and XXIV. (The structures of collagens XXV-XXVII have since been elucidated (6-9).) The α1(XXII) collagen chain, composed of four triple helical domains, is related in sequence to FACIT-like collagens XVI and XIX (35-37). α1(XXIII) collagen (38) is part of a transmembrane trimeric molecule similar to collagens XIII and XXV (6, 39, 40). The α1(XXIV) collagen (41) was retrieved from the data base in a search for a homology with the triple helical and carboxyl propeptide domains of α1(V) collagen and constitutes the focus of the present report.
The α1(XXIV) collagen EST encoded 118 residues of perfect Gly-X-Y triplets and a COOH-terminal sequence that fit the characteristics of a typical fibrillar collagen C-propeptide. Marathon ready cDNA screening yielded overlapping cDNA clones covering ~6.5 kb of mRNA. The composite open reading frame of the cDNA codes for a polypeptide chain with a predicted molecular mass of ~175 kDa and which consists of a 931-amino acid residue collagenous sequence containing a single imperfection, flanked by a non-collagenous domain of 544 residues at the amino end (termed the N-peptide), and a 235-residue stretch at the carboxyl end (the C-propeptide) (see Fig. 1 for schematic and Fig. 2 for sequence).
The most distinctive feature of the N-peptide domain of α1(XXIV) collagen is the presence of a ~250-residue long thrombospondin NH2-terminal-like motif (Figs. 1 and 2), referred to as TSP, which is also found in some fibrillar and non-fibrillar collagen chains (42-44). A comparison of the fibrillar collagen N-peptides, including the TSPs, is shown in Fig. 2. Pairwise sequence analyses revealed that the TSP motif of α1(XXIV) collagen is most closely related to those of the α1(V) and α1(XI) chains (27.3% identity with both α1(V) and α1(XI) and 45.5% and 43.3% similarity, respectively). Of the 5 cysteine residues (in red in Fig. 2) in the α1(XXIV) TSP motif, 3 are found in analogous regions of collagens V, XI, and some of the non-fibrillar FACIT molecules. The most downstream cysteine in the α1(XXIV) collagen TSP motif is conserved with a cysteine found in the α1(XII) and α1(XIV) FACIT chains but not in the fibrillar collagen chains (44). The placement of this cysteine in α1(XXIV) collagen suggests that the disulfide bonding of the TSP domain may be more like that of the FACIT collagens than of the fibrillar collagens. Another homology with the type V and XI collagen chains includes a stretch of 37 residues (N446-N482 in Fig. 2B) which contains several acidic amino acids and tyrosine residues (45-50). Finally, there is no recognizable consensus sequence for bone morphogenetic protein-1 or furin convertase (51) in the NC1 of α1(XXIV) collagen to suggest where cleavage of the N-propeptide portion of the NC1 domain might occur. However, furin, which normally cleaves at the tetrapeptide sequence (R/K)XRR, is also known to recognize other cleavage sites. These include RAGR, KSAR, VFRR, RQPR, and RVAR (52); the α1(XXIV) collagen N-peptide contains two such potential cleavage sites, namely RYVK and VYKR (indicated in Fig. 2 in cyan with dot over- and underlines). If one of these is used, the propeptide cleaved from the N-peptide would consist of either ~210 or ~260 residues.
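A motif scan of the kind described above is straightforward to sketch. The example below is a hedged illustration, not the procedure used in this study: it searches a protein sequence for the canonical furin tetrapeptide (R/K)XRR and for the alternative sites listed from Ref. 52. The function name and the short test sequence are hypothetical; the real N-peptide sequence is not reproduced here.

```python
import re

# Canonical furin tetrapeptide (R/K)XRR; the lookahead allows overlapping hits.
CANONICAL = re.compile(r"(?=([RK].RR))")

# Alternative furin recognition sites listed in the text (Ref. 52).
ALTERNATIVE_SITES = ("RAGR", "KSAR", "VFRR", "RQPR", "RVAR")


def find_furin_like_sites(sequence):
    """Return sorted (1-based position, motif, class) tuples for each hit."""
    hits = []
    for match in CANONICAL.finditer(sequence):
        hits.append((match.start() + 1, match.group(1), "canonical"))
    for motif in ALTERNATIVE_SITES:
        start = sequence.find(motif)
        while start != -1:
            hits.append((start + 1, motif, "alternative"))
            start = sequence.find(motif, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical toy peptide, used only to exercise the function.
    toy_peptide = "AAKSRRAAARVARAA"
    for position, motif, kind in find_furin_like_sites(toy_peptide):
        print(position, motif, kind)
```

Positions are reported as 1-based residue numbers so that hits can be compared directly with residue numbering in a sequence figure.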
Within the N-peptides of all the fibril diameter-regulating fibrillar collagen chains (i.e. α1(V), α3(V), α1(XI), and α2(XI)) is the conserved sequence GXKGXKGEP in the first minor triple helix (53). In the single minor triple helix of the α1(XXIV) collagen (dot underlined in Fig. 2), a homolog of this sequence, GPKGPKGDP, is found. This homology lends support to the idea that there is likely to be a retained portion of the collagen XXIV N-peptide after processing and that collagen XXIV may play a role in fibril diameter regulation.
The C-propeptide domain of α1(XXIV) collagen displays the most conserved feature of fibrillar collagen chains, i.e. the characteristically spaced 8 cysteines that participate in the formation of inter- and intrachain disulfide bonds (cysteines are highlighted in red in Fig. 2). Pairwise sequence comparison of the C-propeptide domains once again confirmed close kinship between α1(XXIV) and α1(V) collagen (28.4% sequence identity and 46.4% sequence similarity, considering conservative changes). Like the N-peptide, there is no apparent endopeptidase cleavage site in the C-propeptide of α1(XXIV) collagen.
In addition, structural similarities with the fibril-forming chains extend to the organization of the collagen XXIV gene, in that exons encoding the collagenous domain are 54 bp in length or multiples of 54 bp (data not shown). Altogether, these data indicate that collagen XXIV is a new member of the fibrillar group of collagens and is most closely related to the α1(V) collagen polypeptide chain.
Collagen XXIV Contains Features of Invertebrate Collagens-Sequence analyses also revealed several features in the N-peptide, triple helix, and the C-propeptide of α1(XXIV) which are unique to invertebrate fibrillar collagen chains. The most distinctive of these characteristics is a single imperfection in the collagenous domain (residues 649-652) consisting of 4 inserted amino acid residues, STVL, in bold in Fig. 2B. The interruption is flanked by 216 perfect Gly-X-Y triplets on the N side and 93 perfect triplets on the C side. To rule out that the imperfection was an artifact, we used human corneal and hip mRNAs as template and performed RT-PCR to produce cDNA fragments encoding the imperfection and its surrounding area. Sequence analysis of the amplified fragments, as well as analysis of the gene, confirmed the existence of the imperfection in the triple helix. Although imperfections have not been observed in vertebrate fibrillar collagen triple helices prior to the discovery of collagens XXIV and XXVII (8, 9), they are not rare in invertebrate fibrillar collagens (54-57). The kind of imperfection found in α1(XXIV) collagen is most like that of a sponge fibrillar collagen (57). This sponge imperfection has been referred to as a Gly-X-Y-Z insertion; however, it can also be viewed as a 4-residue insertion that coincidentally begins with a glycine residue. From this viewpoint, it becomes an analog of the STVL imperfection in α1(XXIV) collagen. Also, the sponge fibrillar collagen imperfection is located at a relatively similar position in the triple helix, being 81 triplets from the C-propeptide (57).
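The numbers given for the imperfection are internally consistent: 216 triplets (648 residues) plus the 4-residue STVL insertion plus 93 triplets (279 residues) gives the 931-residue collagenous domain, and places the insertion at residues 649-652. The sketch below checks this arithmetic and shows one simple way to locate such an interruption. It is an illustration under stated assumptions: the placeholder sequence uses Gly-Pro-Pro for every triplet (the real sequence is not reproduced), the function name is hypothetical, and the scan assumes the inserted residues contain no glycine.

```python
def find_triplet_interruptions(collagenous_domain):
    """Return (first, last) 1-based residue ranges that break Gly-X-Y phasing.

    Simplified scan: read in steps of three while a triplet starts with Gly;
    otherwise collect residues until the next Gly restarts the repeat.
    Assumes interruptions themselves contain no glycine.
    """
    interruptions = []
    i = 0
    while i < len(collagenous_domain):
        if collagenous_domain[i] == "G":
            i += 3  # one Gly-X-Y triplet
        else:
            start = i
            while i < len(collagenous_domain) and collagenous_domain[i] != "G":
                i += 1
            interruptions.append((start + 1, i))
    return interruptions


# Placeholder domain with the triplet counts stated in the text;
# "GPP" stands in for every Gly-X-Y triplet (an assumption for illustration).
placeholder_domain = "GPP" * 216 + "STVL" + "GPP" * 93

if __name__ == "__main__":
    print("domain length:", len(placeholder_domain))
    print("interruptions:", find_triplet_interruptions(placeholder_domain))
```

A real collagen sequence would need a more careful scan, because glycine can also occupy the X or Y position of a triplet.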
There is great diversity within the first ~50 amino acid residues of the C-propeptides of fibrillar collagens, as shown by the peppered appearance of green and cyan highlighting of charged residues in Fig. 2C. However, after this divergent region, the C-propeptides of all species of all fibrillar collagens show a high degree of conservation. The overall identity of the α1(XXIV) C-propeptide with the annelid (54) and sponge (57) counterparts (27.2% and 26.8%) is not very different from its 29.4% identity with the human α1(V) collagen chain (46, 47); conceivably, this closeness may reflect the necessity of maintaining particular residues for function. However, some regions in the α1(XXIV) collagen polypeptide are conspicuously invertebrate-like. There is an unusually high sequence identity (33 out of 56 residues) in C-propeptide residues 52-107 of collagen XXIV and the analogous region of EmF1α, the sponge fibrillar collagen (57); and the chain selection sequence in the C-propeptide (9), which falls between the 5th and 6th cysteinyl residues (highlighted in yellow in Fig. 2C), is unusually short compared with the vertebrate fibrillar chains but is more consistent with the length of the invertebrate counterparts.
Another evolutionarily conserved feature that α1(XXIV) collagen shares with invertebrate, but not vertebrate, chains was noted in the N-peptide. The N-peptides of fibrillar collagen chains contain a variable number (one to three) of minor triple helical sequences just upstream of the major collagenous domain. Collagen XXIV contains only one minor triple helix separated from the major helix by 6 residues. Such an arrangement closely resembles the 8-amino acid spacer between the major and minor triple helices of the H. vulgaris collagen (58). Altogether, these observations suggest that collagen XXIV is of ancient origin. Similar conclusions have been reached independently for the closely related collagen, type XXVII (8, 9).
Collagen XXIV Expression Is Restricted in Adult and Embryonic Tissues-We attempted to examine the expression profile of the Col24a1 gene in embryonic, perinatal, and adult mouse tissues by Northern analyses, but the nonabundance of the collagen XXIV mRNA made such analyses difficult. However, preliminary results of a staged mouse embryo Northern blot suggested that the Col24a1 transcripts were first detectable around E15. Because the Northern analyses were not perfectly definitive and because commercial blots do not contain connective tissue mRNAs, we used other approaches to look at α1(XXIV) collagen RNA expression. RNA was isolated from adult mouse bone, cornea, retina, skin, and tendon for analysis by semiquantitative relative RT-PCR (Fig. 3). Not only did this show which of the tissues contained the mRNA, but it also allowed us to estimate the relative abundance of α1(XXIV) collagen transcripts compared with mRNAs for chains of fibrillar collagen types I, III, and V. The RT-PCR products are shown in Fig. 3A. The upper band is the normalizer, the reproducibly attenuated 18 S signal, whereas the lower band in each panel represents the product of a particular collagen mRNA amplification. The collagen XXIV primers (rightmost panel of Fig. 3A) robustly amplify mRNA from bone and retina, and, to a lesser degree, also amplify the mRNA from cornea, skin, and tendon. (For a negative signal, see collagen III mRNA in cornea.) This radiograph indicates that, first, as expected, retina is not an abundant source of fibrillar collagen mRNAs. It contains the least type I, III, and V collagen mRNA compared with the other tissues. However, it is a significant source of α1(XXIV) collagen mRNA. Second, the radiograph shows that, as expected, α2(I) collagen mRNA is expressed very highly in bone, cornea, skin, and tendon. Also, α1(III) collagen mRNA is expressed very highly in skin, but, as mentioned, is not present in cornea. Finally, it demonstrates that α1(XXIV) collagen mRNA is a minor product compared with the fibrillar collagen mRNAs.
FIG. 3. Relative competitive RT-PCR on mouse connective tissue mRNAs. A, film of the radioactive 18 S and collagen cDNA products run on an acrylamide gel. B, relative amount of each collagen mRNA (normalized to the 18 S signal) in each tissue tested: bone (B), cornea (C), retina (R), skin (S), and tendon (T). C, calculated ratios of various collagen mRNAs to α1(XXIV) collagen mRNA as well as the ratio of α2(I) to α1(V) collagen mRNA and the estimated ratio of the total amount of mRNA for these collagens considering chain composition. (The α2(I) collagen transcript represents one-third of the total RNA used to synthesize collagen I, and the α1(V) collagen transcript represents two-thirds of the amount of total mRNA to synthesize the predominant form of collagen V.)
The RT-PCR data are represented in histogram form with arbitrary y axis units in Fig. 3B, showing the normalized relative abundance for each of the collagen chain mRNAs tested. Here again, it is clear that collagen XXIV mRNA is a nonabundant species, representing only about 70% of the amount of type V collagen in bone. Fig. 3C presents the ratio of various fibrillar collagen mRNAs compared with the α1(XXIV) transcript. These data again emphasize the nonabundant nature of α1(XXIV) collagen mRNA, showing that α2(I) mRNA is about 10 times more abundant than α1(XXIV) collagen mRNA in bone and more than 90 times more abundant in tendon. To validate further the conclusions derived from these data, the α2(I):α1(V) collagen mRNA ratio was also calculated to show the expected higher percentage of collagen V in mouse cornea compared with bone, tendon, or skin. As indicated, when chain composition is taken into account, the total collagen I:collagen V mRNA ratio predicts that the type V collagen mRNA expression is 5.9% of that of type I collagen in adult mouse tendon but is 14.3% of type I in adult mouse cornea. The ultimate conclusion from this experiment is that, of the adult mouse connective tissues tested, collagen XXIV is most highly expressed in bone.
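The chain composition correction used for these percentages can be made explicit. Because the α2(I) transcript accounts for one-third of collagen I mRNA and the α1(V) transcript for two-thirds of collagen V mRNA, the total collagen I:collagen V ratio is twice the measured α2(I):α1(V) ratio. The sketch below works through that arithmetic. The function names are hypothetical, and the input ratios of 8.5 and 3.5 are illustrative values back-calculated from the reported 5.9% and 14.3%; they are not measurements quoted in the text.

```python
def total_collagen_I_to_V(alpha2_I_to_alpha1_V):
    """Convert a chain-level mRNA ratio into a total collagen I:V mRNA ratio.

    alpha2(I) is one-third of collagen I mRNA      -> total I = 3 x alpha2(I)
    alpha1(V) is two-thirds of collagen V mRNA     -> total V = 1.5 x alpha1(V)
    """
    total_I = 3.0 * alpha2_I_to_alpha1_V  # expressed per unit of alpha1(V)
    total_V = 1.5
    return total_I / total_V


def collagen_V_percent_of_I(alpha2_I_to_alpha1_V):
    """Collagen V mRNA as a percentage of collagen I mRNA."""
    return 100.0 / total_collagen_I_to_V(alpha2_I_to_alpha1_V)


if __name__ == "__main__":
    # Illustrative chain-level ratios (assumed, back-calculated from the text).
    for tissue, chain_ratio in (("tendon", 8.5), ("cornea", 3.5)):
        percent = collagen_V_percent_of_I(chain_ratio)
        print(f"{tissue}: collagen V is {percent:.1f}% of collagen I")
```

The same two-step pattern (normalize each band to the 18 S signal, then scale by chain stoichiometry) would apply to any of the tissues in Fig. 3.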
The seemingly restricted tissue distribution of collagen XXIV from Northern blots (not shown) and RT-PCR (Fig. 3) raised the possibility of a specialized role for this matrix product. Unfortunately, our attempts to generate antibodies suitable to analyze the tissue distribution of collagen XXIV in more detail have not yet been successful. As an alternative, we resorted to refining the expression profile of the Col24a1 gene by means of RNA in situ hybridization. This analysis was also undertaken to facilitate the future analysis of knockout mice deficient in collagen XXIV, the production of which is currently under way. To distinguish genuine hybridization from background, in addition to sense controls, all sections were examined in darkfield and brightfield. In brightfield, silver grains are very clearly distinguishable, confirming positive signals.
In situ hybridization was performed on a series of midsagittal sections spanning the period from E10.5 to E16.5 (Fig. 4). No expression was identified before E14.5 (data not shown). Around this time, collagen XXIV transcripts were first detected in the emerging skeletal elements of the head and the appendicular skeleton (Fig. 4A). These same sites of expression were even more evident in hybridizations of E15.5 (Fig. 4B) and E16.5 embryos (Fig. 4C). In addition, we detected Col24a1 transcripts in the axial skeleton and in selected regions of the eye (Fig. 4, B and C).
As shown in Fig. 5A, at E14.5 the Col24a1 transcripts labeled the emerging skeletal elements of the head, namely the outer table of membranous primordial parietal bone, just outside of what will become the cerebral cortex, and the mesenchymal precursor of the parietal bone. The embryonic brain itself showed no detectable signal. Positive signals were also present within intramembranous ossification centers of the maxilla and mandible around Meckel's cartilage, but Meckel's cartilage itself, circled by a dotted line in C, was negative (Fig. 5, B and C). The basisphenoid bone and femur were positive as well (Fig. 5, D and E).
At E15.5, collagen XXIV expression in areas of ossification continued and was again identified in the body of the mandible, the maxilla, and the calvaria (Fig. 4B). Expression in the eye (Fig. 4B and Fig. 6A) was strongly localized to the embryonic cornea. (The lens expression is artifactual from probe accumulating in a ruptured area of tissue.) In the retina, if a signal is present at all, it is very weak and is limited to the nerve fiber layer of the optic cup and the inner nuclear layer. The expression in the E15.5 skeleton coincided with the formation of primary ossification centers, but, additionally, transcripts were detected in the developing otic capsule (see the area indicated by e in Fig. 6B). For the most part, collagen XXIV expression occurred in sites of robust type I collagen fibril assembly and deposition.
At E16.5, ossification sites continued to be the predominant sites of hybridization, with transcripts detected in the teeth (Fig. 7, A and B), mandible (Fig. 7C), and other developing bones, such as the temporal bone in the head (Fig. 7D). Moreover, α1(XXIV) collagen mRNA accumulated in cells of the ossifying metatarsal bones (Fig. 7E) and mid-shaft region of the humerus (Fig. 8A). Furthermore, the mRNA was found in ossifying regions of cervical bodies, i.e. around the foramen transversarium (Fig. 8B), and lower regions of the vertebral column (Fig. 8, C and D).
Taken together, these data indicate that collagen XXIV is a marker for the process of bone formation in the embryonic mouse. It is first expressed at E14.5, concomitant with the appearance of the first ossification centers, and remains restricted to these, as well as to ossification centers that emerge during subsequent stages of development. The embryonic cornea is the main exception. Nevertheless, the expression pattern of collagen XXIV in the fully developed skeleton and other tissues remains to be determined. Likewise, the regional localization of Col24a1 in the cornea has yet to be fully determined.
DISCUSSION
The α1(XXIV) polypeptide represents a novel collagen chain with structural features of both vertebrate and invertebrate fibrillar chains. Together with the newly described chain of collagen XXVII (8,9), collagen XXIV represents a separate clade within the fibrillar group of collagens. Structural properties unique to these invertebrate-like fibrillar chains, such as the size of the triple helix and the triple helical imperfection, suggest that trimeric molecular assembly with polypeptides of types I, II, III, V, and XI collagen does not occur. Additionally, the distinct expression patterns of the Col24a1 and Col27a1 genes during mouse embryogenesis strongly indicate that they, too, do not coassemble into heterotrimeric molecules. Lacking additional evidence, we therefore surmise that type XXIV collagen is probably a homotrimeric collagen molecule with specialized functions at specific anatomical sites of fibrillogenesis.
Collagen XXIV is expressed during development at sites undergoing intense type I collagen fibrillogenesis. Of major interest is whether collagen XXIV is assembled into growing fibrils of type I collagen, i.e. forming a heterotypic fibril. If this turns out to be true, it is likely that lateral growth of the fibril would be terminated. The shortness of the triple helix and the interruption within the α1(XXIV) collagen would probably not be conducive to further fibrillar assembly. Also, collagen XXIV most likely retains an N-peptide domain that would inhibit lateral growth, regulating the fibril diameters as the retained N-peptides of collagens V and XI do. The question of whether or not collagen XXIV assembles into fibrils with collagen I will be easier to dissect when specific antibodies are generated.
The sum of our in situ hybridization and RT-PCR data also suggests that, although bone expresses collagen XXIV during development as well as during adult life in the mouse, the cornea and retina temporally regulate the expression of the molecule. The cornea contains robust amounts of the mRNA primarily during development, whereas the enhanced retinal expression is postnatal. The possibility that samples had been reversed in performing RT-PCR led us to verify the adult expression pattern using multiple adult corneal (three total) and retinal (four total) mRNA preparations. Little is known about collagen expression in the retina, but a large reduction in corneal expression after maturity is not surprising: a dramatic decrease in fibrillar collagen mRNAs is observed after the cornea matures.³ Corneal expression of collagen XXIV is also consistent with the idea that the molecule may be a fibril diameter regulator. Within the human cornea is a stroma composed of many orthogonal layers of thin diameter fibrils (~25 nm). More anteriorly, under the corneal epithelial basement membrane, is Bowman's layer, a region of even smaller diameter fibrils (~18-20 nm). It is well documented that collagen V plays a large role in regulating the diameter of the fibrils assembled in the stroma by being synthesized as 15-20% of the total fibrillar collagen (17,59,60). Further experiments will determine whether collagen XXIV also plays a role in this regulation.
Although our results show that the two major producers of collagen XXIV during development are cornea and bone, it is also possible that the molecule plays a structural role in other tissues such as skin and tendon, where the α1(XXIV) collagen mRNA is expressed at very low levels.