Structural Variation of Type XII Collagen at Its Carboxyl-terminal NC1 Domain Generated by Tissue-specific Alternative Splicing*

This paper reports the identification of two structural variations in the NC1 domain of rat and mouse type XII collagen. The long NC1 domain encoding 74 amino acids showed homology to chicken type XII and XIV collagens. The short NC1 domain was composed of 19 amino acids. Through genomic DNA analyses, two alternative exons were identified, each of which contained the variable NC1 sequence. With the amino-terminal NC3 splicing alternatives, we propose here a new descriptive nomenclature: types XIIA-1 and XIIB-1, which include a long NC1 sequence encoded by exon 1 (from the 3′-end), and types XIIA-2 and XIIB-2, which include a short NC1 sequence encoded by exon 2. Types XIIA-1 and XIIB-1, the predominant transcripts in 15-day old mouse embryos, showed decreased expression in 17-day old embryos, whereas type XIIB-2 expression was sustained at constant levels. In adult mice, type XIIB-1 associates with ligament and tendon, whereas type XIIB-2 is expressed in various other tissues. The long NC1 domain contains an extended acidic region (pI = 3.4) followed by a terminal basic region (pI = 13.8). Because the short NC1 domain lacks these features, structural variations in the type XII collagen NC1 domain suggest different functional roles in a tissue-specific fashion.

The primary structure of chicken and mouse type XII collagen has been determined from overlapping cDNAs and suggests that type XII collagen may interact in several ways with other matrix molecules (1-3). For example, the amino-terminal region of type XII collagen forms three "finger-like" projections, the NC3 domains (see Fig. 1) (4), which contain multiple repeats of fibronectin type III-like subunits, the domain A of von Willebrand factor, a region homologous to the NC4 domain of type IX collagen, as well as RGD sequences. Because of its adhesive structure, it has been proposed that this region of type XII collagen may play a role in the interactions between extracellular matrix molecules and cells, thus contributing to the physical state of the extracellular matrix (5).
The carboxyl-terminal region of type XII collagen contains triple-helical domains (COL1 and COL2) interrupted and flanked by non-triple-helical domains (NC1 and NC2) (Fig. 1). This structural feature is characteristic of a class of collagens, the Fibril Associated Collagen with Interrupted Triple helices or FACIT (4, 6). Fibril association through the carboxyl-terminal region has been postulated for members of the FACIT class of molecules. The details of such association with cartilage type II collagen fibrils have been characterized for type IX collagen (7-10). In type XII collagen, the "fibril association region" makes up less than 10% of the molecule and is thought to be integral to molecular interactions with type I collagen fibrils. However, although there is indirect evidence available to support such an interaction (5, 11-13), the details of type XII collagen "fibril association" have not yet been fully elucidated.
One obstacle is the considerable difference in the NC1 domain structure of chicken and mouse type XII collagens. Chicken type XII collagen contains a long NC1 domain composed of 76 amino acids (1, 2), whereas mouse type XII collagen has only a 22-amino acid NC1 domain (3). The purpose of this study was to determine the structure of the carboxyl region of mammalian type XII collagen. In this report, we present evidence for two structural variations in the NC1 domain of type XII collagen that are generated by alternative splicing in a tissue-specific fashion.

EXPERIMENTAL PROCEDURES
cDNA Cloning and Sequencing-To obtain cDNAs encoding the COL2, NC2, COL1, and NC1 domains of rat and mouse type XII collagen, the 3′-rapid amplification of cDNA ends was adapted (14) using poly(A)+ RNA isolated from adult rat and mouse maxillary dentoalveolar tissues or total RNA from neonatal mouse skin. Site-directed coding and nested primers were designed based on the previously reported sequence of rat (15) and mouse cDNAs (3). Primers 5′-GGT CCT CCA GGC CCT CAG GG-3′ and 5′-(CAU)4 CCA GGA GAA CCG GGT CGC CA-3′ were used to clone rat type XII collagen cDNAs. Primers 5′-TGG CAG TGC TGG AGC CAG AGG AG-3′, 5′-(CAU)4 TAT TGT GAT TCA TCC-3′, and 5′-(CUA)4 TGA AAA ATG CCT CTT TTG-3′ were used to clone mouse type XII collagen cDNAs. PCR products were analyzed on Southern blots and cloned into the pAMP 1 vector (Cloneamp system, Life Technologies, Inc.) or the T vector (pT7 Blue vector, Novagen). The DNA sequences were determined by the chain-termination method (16).
Exon-Intron Structure Analysis and Sequencing of Genomic DNA Clones-Tail tissue from rats and mice was used to isolate genomic DNA samples. Two separate 3′-PCR primers (5′-CAA ACT GTA AGC AGC ACT-3′ and 5′-TGA AAA ATG CCT CTT TTG-3′) were designed to encode the reverse and complementary sequences of each of the different 3′-untranslated regions of type XII collagen cDNAs. Common 5′-primers were designed from the coding sequence of the COL1 and NC1 domains: 5′-TAT TGT GAT TCA TCC-3′ and 5′-ATT CAT CCC AGT GTG CCA GCA T-3′. Either a conventional PCR or long range PCR protocol (TaqPlus system, Stratagene) was used to amplify the α1(XII) collagen gene fragments, which were then confirmed by Southern blots. The initial experiments suggested that the exon involved in the short NC1 variant was located on the 5′-side of the exon involved in the long NC1 variant. To confirm this exon orientation, a separate PCR experiment was performed using a coding primer within the "short NC1" exon, 5′-GGT TCC GGC TAA CAA ACT-3′, and the anti-coding primer in the "long NC1" exon, 5′-GCA GCG ACT CTA TGT TCA TAG TCT TGC AAA-3′. These PCR products were cloned and sequenced to identify the exon-intron junctions.
* This study was supported in part by National Institutes of Health Grants DE07010 (to A. M. K.), DE00351 (to N. Y. K.), AR36820 (to B. R. O.), and ITI Foundation (to I. N. and N. Y. K.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact. The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™
RNA Transfer Blot Analysis-A multiple tissue RNA transfer blot with poly(A)+ RNA from whole mouse 7-, 11-, 15-, and 17-day old embryos was purchased from CLONTECH. Hybridization probes were generated from the subcloned fragments of exon 1 and exon 2, separately encoding "long NC1" and "short NC1" specific sequences. In some experiments, an antisense-rich linear amplification PCR was used to prepare the hybridization probes. As a positive control, cDNA of a 370-bp HindIII fragment (MRK) encoding the mouse type XII collagen NC3 domain was used (3). A mouse β-actin probe was used as a housekeeping gene control.
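The 3′-PCR primers in the exon-intron analysis are described as the reverse and complementary sequences of the 3′-untranslated regions. As a minimal illustrative sketch (the function name `reverse_complement` is hypothetical and not part of the original methods), the sense-strand stretch that each such primer anneals against can be recovered by reverse-complementing the primer:

```python
# Illustrative sketch only: recover the sense-strand sequence matched by an
# antisense (3'-side) PCR primer. The helper name is an assumption made for
# this example; the primer sequences are the two quoted in the text.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(primer: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'.

    Spaces (used in the text to group bases in triplets) are ignored.
    """
    cleaned = primer.replace(" ", "").upper()
    if set(cleaned) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G, T")
    # Complement every base, then reverse so the result also reads 5'->3'.
    return cleaned.translate(_COMPLEMENT)[::-1]


three_prime_primers = [
    "CAA ACT GTA AGC AGC ACT",
    "TGA AAA ATG CCT CTT TTG",
]

for primer in three_prime_primers:
    print(primer, "->", reverse_complement(primer))
```

The printed sequences are the stretches expected on the coding strand of the respective 3′-untranslated regions, under the assumption that each primer is an exact reverse complement of its target.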
Ribonuclease Protection Assay-Mouse clones SB#2 and SB#3, containing a 153-bp insert of the long NC1 variant and a 102-bp insert of the short NC1 variant, respectively, were used to synthesize riboprobes labeled with [α-32P]UTP (MAXIscript, Ambion). The MRK clone containing a 370-bp insert of mouse type XII collagen NC3 domain sequence and mouse β-actin were used as a positive control and as a housekeeping gene control, respectively. The ribonuclease protection experiment was performed with 20 μg of total RNA from each of the following mouse tissues: sternal cartilage, dento-alveolar tissue, eye, skin, and tail; the protected bands were analyzed using a computerized program (Kodak Digital Science 1D).
RESULTS
The isolated mouse type XII collagen cDNA, ER#K (GenBank™ U57095), exhibited a sequence highly homologous to the rat AK#G clone with a short NC1 domain (19 aa; 100% homology) and a short 3′-untranslated region (63 bp long) (Table I and Fig. 2). Four of four mouse clones, which were amplified from the neonatal mouse skin total RNA sample, contained identical sequences, with the exception that the poly(A) tail varied from A19 to A60. The separately isolated mouse cDNA from the dento-alveolar tissue total RNA sample, SB#1, contained a long NC1 sequence (74 aa long) that was (Table I and Fig. 2).
The data, based on these cDNAs, suggested the presence of alternative exons in the 3′-end of the α1(XII) gene. Rat and mouse genomic DNAs were examined by PCR analyses; two alternative exons were identified (Fig. 3). Exon 1 (from the 3′-end) encoded the variable sequence of the long NC1, whereas exon 2 (from the 3′-end) encoded the variable sequence of the short NC1. Both exons 1 and 2 contained separate translational termination codons (TAA and TGA, respectively) and separate 3′-untranslated regions. At the terminal end of exon 2, we detected the poly(A) signal consensus sequences, AATACT in the rat gene and AATACA in the mouse gene (Fig. 3). An exon-intron junction was detected at the beginning of each of the variable sequences. The split codon exon junction structure generates G AG (Glu) at the 17th amino acid residue of the long NC1 domain and G GT (Gly) at the 17th amino acid residue of the short NC1 domain (Figs. 2 and 3). Thus, an alternative splicing mechanism is strongly suggested to generate the long and short NC1 domains of type XII collagen. Exon 3 (from the 3′-end) encoded the common NC1 sequence and part of the COL1 domain. The intron sizes between exons 1 and 2 and between exons 2 and 3 were estimated at 1.4 and 1.1 kilobase pairs, respectively. DNA sequence analyses revealed that the previously reported mouse NC1 domain sequence (3) matched the intron sequence immediately to the 3′-side of exon 3 (Fig. 3).
Type XII collagen transcripts also contain separate alternative splicing variants, at the 5′-region of the α1(XII) gene, that generate the long and short NC3 domains, or type XIIA and XIIB, respectively. Poly(A)+ RNA from mouse embryos at days 15 and 17 showed the presence of both type XIIA (approximately 10 kb) and XIIB (approximately 7-8 kb) collagen mRNAs using MRK (mouse type XII collagen NC3 domain) as a probe. When a fragment of mouse genomic DNA containing the exon 1 sequence (long NC1) was used as a probe, both the 10- and 7-8-kb forms of type XII collagen mRNA were positively recognized. The expression pattern of long NC1 type XII collagen was similar to the pattern of the mRNA recognized by the MRK probe, which peaked in 15-day old mouse embryos and decreased in 17-day old embryos (Fig. 4). The exon 2 probe (short NC1) recognized primarily the 7-8-kb form of type XIIB collagen mRNA but only weakly hybridized with a 10-kb type XIIA collagen mRNA. The expression of short NC1 type XIIB collagen mRNA remained constant during the late stages (15- and 17-day old) of mouse embryo development (Fig. 4).
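The split-codon junction described above can be illustrated with a short sketch. It assumes, from the way the codons are written ("G AG" and "G GT"), that the first base of codon 17 is contributed by the common exon 3 and the remaining two bases by the alternative exon; the function and variable names are hypothetical, and the codon table is deliberately restricted to the two codons named in the text.

```python
# Illustrative sketch only: reassemble a codon that is split across an
# exon-exon junction and look up the residue it encodes.
# Assumption (labeled): codon 17 of NC1 is split after its first base, with
# "G" coming from the common exon 3 and the last two bases from exon 1 or 2.

CODON_TO_RESIDUE = {
    "GAG": "Glu",  # standard genetic code
    "GGT": "Gly",  # standard genetic code
}

COMMON_EXON_LAST_BASE = "G"  # contributed by exon 3 (assumed split position)

ALTERNATIVE_EXON_FIRST_BASES = {
    "long NC1 (exon 1)": "AG",
    "short NC1 (exon 2)": "GT",
}


def residue_at_junction(upstream_part: str, downstream_part: str) -> str:
    """Join the two halves of a split codon and return the encoded residue."""
    codon = (upstream_part + downstream_part).upper()
    if len(codon) != 3:
        raise ValueError("the two parts must add up to exactly three bases")
    return CODON_TO_RESIDUE[codon]


for variant, first_bases in ALTERNATIVE_EXON_FIRST_BASES.items():
    residue = residue_at_junction(COMMON_EXON_LAST_BASE, first_bases)
    print(f"{variant}: residue 17 = {residue}")
```

Splicing the same upstream base to either alternative exon therefore yields a different residue at position 17, consistent with the Glu/Gly difference reported for the long and short NC1 domains.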
The relative mRNA levels of type XII collagen varied among the different tissue specimens (cartilage, dento-alveolar tissue, eye, skin, and tail). The ribonuclease protection assays revealed an exceedingly high level of expression of type XII collagen with the long NC1 domain in dento-alveolar and tail tissue specimens compared with other mouse tissues and revealed moderate expression in cartilage (Fig. 5). Expression of type XII collagen with the short NC1 domain in the mouse was noticeably present in all tissues tested (Fig. 5).
DISCUSSION
The newly discovered sequence variations in the NC1 domain of rat and mouse type XII collagen indicated the following characteristic substructures. At the amino-terminal end of the NC1 domain immediately adjacent to the COL1 domain, there is a "common region" that contains 16 amino acid residues, which are 100% conserved in rat and mouse NC1 variants (Fig. 6). The short NC1 domain found in the rat AK#G and mouse ER#K clones extends three additional amino acid residues from the common region prior to the termination codon. The short NC1 domain sequence, with a total of 19 amino acid residues, shows great similarity to the previously reported 22-amino acid mouse NC1 domain sequence deduced from cDNA mc102T (3) (Table I). However, based on the differences between the amino acid and nucleotide sequences at the carboxyl end of the coding region and the 3′-untranslated regions, the newly discovered short NC1 sequence should be considered distinct. These short NC1 sequences contain two conserved locations of cysteinyl residues, one at the end of the COL1 domain and the other at the fifth position in the NC1 domain. This structure is homologous to the corresponding domain of type IX collagen chains, which were recently shown to possess all the necessary information for chain selection and trimer assembly in vitro (17).
The long NC1 sequence of rat and mouse type XII collagen exhibited structural homology with chicken type XII collagen. Following the common region, a stretch of amino acids containing aspartic acid and glutamic acid residues forms the "acidic extension region" (Fig. 6), with a calculated pI of 3.4. The acidic extension region is followed by a short polypeptide composed of nearly 40% basic amino acid residues, the "basic terminal region," with a calculated pI of 13.8. These characteristic substructures of the long NC1 domain of type XII collagen seem to be shared with the corresponding domain of type XIV collagen. Type XII and XIV collagens belong to the FACIT family and show significant structural homology. However, histolocalization of these collagens appears to be somewhat different (18-21). In vitro experiments indicate that type XIV collagen interacts with a dermatan sulfate glycosaminoglycan (GAG) side chain of decorin (22). Because the type XIV collagen NC1 domain contains a typical sulfated GAG binding sequence, it has been postulated that type XIV collagen interacts with type I collagen fibrils through decorin. In contrast, the molecular interaction between type XII collagen, type I collagen, and proteoglycans is not yet fully understood (13). The newly discovered NC1 domain isoforms may provide a long-awaited clue toward elucidating the molecular interaction of type XII collagen.
Our rat and mouse α1(XII) collagen genomic DNA sequence data further defined two alternative exons 1 and 2 that encode the long and short NC1 domain isoforms, respectively (Fig. 3). The previously reported NC1 domain sequence in mice (3) did not match the NC1 domain sequences found in this study; rather, it matched the sequence of the intron immediately following exon 3.
FIG. 4. A, poly(A)+ RNAs from 7-, 11-, 15-, and 17-day old mouse embryos were analyzed by RNA transfer blots. The mouse α1(XII) collagen probe, MRK, primarily hybridized two species: 10 kb (type XIIA collagen) and 8-9 kb (type XIIB collagen) (Common Probe). Two separate genomic DNA fragments each containing either the exon 1 sequence or the exon 2 sequence hybridized to type XIIA and type XIIB collagen mRNAs, respectively. B, diagrammatic sketches depicting the alternative type XII collagens and proposed nomenclature: type XIIA-1 contains long NC3 domain and long NC1 domain; type XIIA-2 contains long NC3 domain and short NC1 domain; type XIIB-1 contains short NC3 domain and long NC1 domain; and type XIIB-2 contains short NC3 domain and short NC1 domain.
RNA transfer blot analyses indicated that the poly(A)+ RNA sample isolated from 15- and 17-day old mouse embryos contained both type XIIA (10 kb) and type XIIB (7-8 kb) mRNAs that are due to alternative splicing in the NC3 domain (Fig. 1) (2, 23, 24). It was noted that the exon 1 and exon 2 type XII collagen genomic DNA probes hybridized to the types XIIA and XIIB mRNAs to different degrees (Fig. 4). In light of the NC3 splicing alternatives (type XIIA and XIIB), we propose here a new descriptive nomenclature: 1) type XIIA-1, contains long NC3 domain and long NC1 domain; 2) type XIIA-2, contains long NC3 domain and short NC1 domain; 3) type XIIB-1, contains short NC3 domain and long NC1 domain; and 4) type XIIB-2, contains short NC3 domain and short NC1 domain.
Types XIIA-1 and XIIB-1 include a long NC1 variable sequence encoded by exon 1 (from the 3′-end), and types XIIA-2 and XIIB-2 include a short NC1 variable sequence encoded by exon 2. These 5′-end (NC3) and 3′-end (NC1) alternative splicing mechanisms seem to be coordinately regulated during α1(XII) gene transcription. Type XIIA collagen (long NC3) is expressed primarily in embryonic tissues, whereas type XII collagen in adult tissues is predominantly the type XIIB species with the short NC3 domain. The ribonuclease protection assay examining adult mouse tissues further reveals that the type XII collagen long NC1 domain probe is protected in dentoalveolar tissue and tail tissue, which contain periodontal ligament and tail tendon, respectively. Because chicken type XII collagen derived from tendon is homologous to type XIIB-1, we speculate that the long NC1 domain may play a tissue-specific role essential to ligament/tendon extracellular matrix organization.
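The proposed nomenclature is a two-way combination of the NC3 and NC1 splicing choices, which can be summarized in a few lines. This is only a sketch of the naming rule as stated in the text; the function name `isoform_name` and the string labels "long"/"short" are hypothetical conventions chosen for the example.

```python
# Illustrative sketch only: the proposed type XII collagen isoform names as a
# lookup on the two alternatively spliced domains. The letter reflects the
# NC3 domain (A = long, B = short) and the number reflects the NC1 domain
# (1 = long, encoded by exon 1 from the 3'-end; 2 = short, encoded by exon 2).

NC3_LETTER = {"long": "A", "short": "B"}
NC1_NUMBER = {"long": "1", "short": "2"}


def isoform_name(nc3: str, nc1: str) -> str:
    """Return the proposed isoform name for a given NC3/NC1 combination."""
    try:
        return f"XII{NC3_LETTER[nc3]}-{NC1_NUMBER[nc1]}"
    except KeyError as exc:
        raise ValueError("nc3 and nc1 must each be 'long' or 'short'") from exc


for nc3 in ("long", "short"):
    for nc1 in ("long", "short"):
        print(f"{nc3} NC3 + {nc1} NC1 -> type {isoform_name(nc3, nc1)}")
```

The four combinations reproduce types XIIA-1, XIIA-2, XIIB-1, and XIIB-2 as enumerated above.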
In contrast, type XIIB-2 collagen with the short NC1 domain appears to be more widely expressed (Fig. 5). Sugrue et al. (25) reported that when monoclonal antibody 75d7 (generated against a synthetic peptide encoding a portion of the long NC1 domain of chicken type XII collagen) was used, the bone tissue including periosteum lacked any immunostaining. However, when another monospecific antibody (generated against a fusion peptide containing a nonvariant region of the NC3 domain of mouse type XII collagen) was used, the entire periosteum of the long bone and membranous bones was stained (26). The implication of these previously published conflicting results is that the newly defined type XIIB-2, which lacks the 75d7 epitope sequence in the long NC1 domain, may be predominantly present in periosteum. It is interesting to note that in nonligament/tendon tissues, type XII collagen localizes at the tissue interface zones such as the perichondrium of cartilage (18, 19, 25), the periosteum of bone (20, 26), the epimysium, perimysium, and endomysium of skeletal muscle (20), and Descemet's membrane of cornea (26, 27). The short NC1 domain of type XII collagen may play a functional role at these tissue interface zones. The deduced sequence of the short NC1 domain contains a potential recipient site for O-serine-linked oligosaccharides at the carboxyl end. It is unknown, however, whether this site is posttranslationally modified.
These results seem to suggest that the selection of the 3′-end alternative splicing is dependent on the tissue type and function. The newly discovered variations in the NC1 domain of type XII collagen can provide a clue as to the puzzling tissue distribution of type XII collagen in structurally and functionally diverse adult tissues. It is tempting to speculate that type XII collagen may contribute to different extracellular matrix organizations in these adult tissues, potentially through altering the fibril association schemes through its NC1 domain.
FIG. 5. The ribonuclease protection assay demonstrates the relative level of α1(XII) collagen mRNA in sternal cartilage, maxillary dento-alveolar tissue, eye, skin, and tail of adult mice. The relative amounts of mRNA encoding the type XII collagen alternative isoforms containing long and short NC1 domains were separately analyzed against a housekeeping control, β-actin.