Complete Sequence of the 23-Kilobase Human COL9A3 Gene

We report the complete sequence of the human COL9A3 gene that encodes the α3 chain of heterotrimeric type IX collagen, a member of the fibril-associated collagens with interrupted triple helices family of collagenous proteins. Nucleotide sequencing defined over 23,000 base pairs (bp) of the gene and about 3000 bp of the 5′-flanking sequences. The gene contains 32 exons. The domain and exon organization of the gene is almost identical to that of a related gene, the human COL9A2 gene. However, exon 2 of the COL9A3 gene codes for one fewer -Gly-X-Y- triplet than exon 2 of the COL9A2 gene. The difference is compensated by an insertion of 9 bp coding for an additional triplet in exon 4 of the COL9A3 gene. As a result, the number of -Gly-X-Y- repeats in the third collagenous domain remains the same in both genes and ensures the formation of an in-register triple helix. In the course of screening this gene for mutations, heterozygosity for separate 9-bp deletions within the COL1 domain was identified in two kindreds. In both instances, the deletions did not co-segregate with any disease phenotype, suggesting that they were neutral variants. In contrast, similar deletions in the triple helical domain of type I collagen are lethal. To study whether α3(IX) chains with the deletion participate in the formation of correctly folded heterotrimeric type IX collagen, we expressed mutant α3 chains together with normal α1 and α2 chains in insect cells. We show here that despite the deletion, mutant α3 chains were secreted as heterotrimeric, triple helical molecules consisting of three α chains in a 1:1:1 ratio. The results suggest that the next noncollagenous domain (NC2) is capable of correcting the alignment of the α chains, and this ensures the formation of an in-register triple helix.

Type IX collagen is a structural component of hyaline cartilage and the vitreous of the eye. It is a heterotrimeric molecule composed of three genetically distinct polypeptide chains: α1, α2, and α3 (1). The protein is characterized by interruptions in the triple helix, and it consists of three collagenous domains (COL1, COL2, and COL3, numbered from the C terminus) that are joined by four small noncollagenous domains (NC1 to NC4) (2, 3). Because it has interrupted triple helices and is associated with collagen fibrils, type IX collagen belongs to the FACIT (fibril-associated collagens with interrupted triple helices) subgroup of collagens (4).
Type IX collagen is attached to the surface of type II collagen fibrils by lysine-derived covalent cross-links between the COL2 domain and telopeptides of type II collagen (5-8). Because the flexible NC3 domain enables the COL3 and NC4 domains to project out of the fibril surface, it has been suggested that the COL3 and NC4 domains may play a role in mediating interactions between collagens and noncollagenous components of hyaline cartilage (6, 9, 10). The NC3 domain of the α2(IX) chain also has an attachment site for a glycosaminoglycan side chain (11). Results from a recent study indicate that the NC1 domain of the three α chains contains all of the necessary information for chain selection and assembly (12). The COL1 domain may play a critical role in the anti-parallel binding to the fibril surface, although this has not been directly demonstrated. There are no known or proposed functions for the NC2 domain as yet.
Transgenic mice expressing α1(IX) cDNA with a large in-frame deletion of the sequences encoding a part of the COL3 domain, the entire NC3 domain, and part of the COL2 domain develop abnormalities in cartilage collagen fibril structure and a phenotype similar to human osteoarthritis and a mild chondrodysplasia (13). Degenerative joint disease was also seen in separate lines of transgenic mice with inactivation of the Col9a1 gene (14). Together, these findings suggest that type IX collagen is not essential for cartilage development but is required for maintaining the normal structural integrity of cartilage.
Linkage to COL9A2 has been reported in two families with autosomal dominant multiple epiphyseal dysplasia (MED) (15, 16). In one of these families, a splice site mutation leading to an in-frame 12-amino acid residue deletion in the COL3 domain has been identified. Multiple epiphyseal dysplasia comprises a genetically heterogeneous group of disorders characterized by shared clinical findings ranging from mild joint stiffness and pain in large joints to early onset osteoarthritis (17). Mutations in cartilage oligomeric matrix protein have also been shown to cause MED (18-20), and linkage studies support the existence of additional MED loci (21). The COL9A1 and COL9A3 genes are logical candidates for such loci.
* The work was supported in part by grants from the Academy of Finland (to L. A.-K.) and the Arthritis Foundation (to M. L. W.) and National Institutes of Health Grants AR30481 and EY09908 (to R. M.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequences reported in this paper have been submitted to the GenBank/EBI Data Bank with accession numbers AF026801 and AF026802.
** To whom correspondence should be addressed.
The complete cDNA sequences for the α3(IX) chain are currently available for chick (22, 23) and human (24). Here we report the complete genomic organization and sequences of the human COL9A3 gene. Also, we report that two unrelated families have different 9-bp deletions in the same region of the COL1 domain that are neutral variants of the gene. These are the first examples of deletions within the triple helical domain of a collagenous protein that are neutral variants.

EXPERIMENTAL PROCEDURES
Isolation of Phage and P1 Clones for the Human COL9A3 Gene-A probe (p1061) was prepared by reverse transcription-PCR for screening of phage genomic libraries using primers H10 and H2 as described (24). A second probe was prepared using a sense primer H41 (5′-AAA TCA GGC TCT CGA AGC TC, nt 2032-2051) with an antisense primer H42 (5′-TCT TTA CAC AAA TGC TAT GC, nt 2355-2374) to amplify a 342-bp PCR product, p342, that originates in NC1 and extends into the 3′-untranslated region of the human α3(IX) cDNA (24). The probes p342 and p1061 were 32P-radiolabeled by nick-translation and used to screen replicate filters from a human EMBL3 SP6/T7 genomic library (CLONTECH Laboratories, Palo Alto, CA) as described previously (24). Two unique clones, gRB2B1 and gRB5B1, were isolated and sequenced (24).
For PCR screening of genomic P1 libraries for the human COL9A3 gene, the primer pairs were designed on the basis of published sequences for the human α3(IX) cDNA (24) and genomic organization of the mouse Col9a2 gene (25). To amplify the 5′-end sequences of the gene, the primer pair C93-F4 (5′-CAG GAA AGC CGG GGA AAC CAG, nt 200-220 from the start of translation in the human cDNA) and C93-R5 (5′-GTC CAT CTC GTC CAG TCA GAC, nt 277-257) was used.
Characterization of P1 Clones-To isolate DNA, the P1 clones were cultured overnight in 3 ml of LB and 25 μg/ml kanamycin, and 2.5 ml of the overnight culture was grown in 75 ml of LB for 1.5 h. After addition of isopropyl-1-thio-β-D-galactopyranoside to a final concentration of 0.5 mM, incubation was continued for 5 h followed by centrifugation at 10,000 × g for 10 min in 10-ml aliquots and P1 DNA isolation with a standard plasmid isolation protocol (26). Sequencing of the P1 clone was performed by cycle sequencing (Cycle Sequencing Kit, Amersham Pharmacia Biotech). Sequencing primers were designed on the basis of the published cDNA sequences for the human α3(IX) chain (24) and the genomic organization of the mouse Col9a2 gene (25). Intronic sequences between exons 10 and 11, 12 and 13, 26 and 27, and 31 and 32 were amplified using the Expand long template PCR system (Roche Molecular Biochemicals). The PCR products were purified using an agarose gel extraction protocol (QIAEX II gel extraction kit, Qiagen), followed by cloning into pUC18 vector (SURE clone ligation kit, Amersham Pharmacia Biotech), and sequencing (T7 sequencing kit, Amersham Pharmacia Biotech). Sequencing reactions were analyzed on a 6% polyacrylamide gel.
Heteroduplex Analysis-Human control and patient DNA was extracted from whole blood using standard methods and used for PCR amplifications. PCR primers were designed from the intronic sequences to amplify separately each exon of the COL9A3 gene. The product sizes varied from 200 to 400 bp and contained at least 80 bp of 5′ and 3′ intronic sequences. Genomic DNA was amplified in a 40-μl volume with thermal cycling of 45 s at 94°C, 45 s at 60-62°C, and 1 min at 72°C for 30 cycles followed by a final extension at 72°C for 10 min. Heteroduplexes were generated by denaturing the samples at 95°C for 5 min and reannealing for 30 min at 68°C. The concentration and quality of PCR products were estimated by analyzing 5 μl of each reaction in a 1.5% agarose gel. Conformation-sensitive gel electrophoresis (CSGE) was used to scan the PCR products for mutations that generated heteroduplexes (27). A CSGE gel consisted of 10% polyacrylamide, 99:1 ratio of acrylamide to 1,4-bis(acryloyl)piperazine (Fluka), 10% ethylene glycol, 15% formamide (Life Technologies, Inc.), 0.1% ammonium persulfate, and 0.07% TEMED in 0.5× TTE buffer (44 mM Tris, 14.5 mM taurine, 0.1 mM EDTA, pH 9.0). Gel electrophoresis was performed with a standard DNA sequencing apparatus (Life Technologies, Inc.) using 0.5× TTE as the electrode buffer. Prior to electrophoresis, 3-15 μl or 25-75 ng of sample was mixed with loading buffer (10× stock solution of 30% glycerol containing 0.25% of both xylene cyanol FF and bromphenol blue). The gel was pre-electrophoresed at 45 W for 15 min, and the samples were electrophoresed at 45 W for 5 h at room temperature. After electrophoresis, the gel was stained with ethidium bromide (1 μg/ml), destained with water, and photographed. Samples containing heteroduplexes were analyzed by direct PCR product sequencing (T7 Sequenase PCR product sequencing kit, United States Biochemical). Some PCR products were purified from agarose gel, and 60 ng of purified product was cloned into pUC18 vector and sequenced. Several clones were sequenced to obtain sequences for both alleles.
Expression and Analysis of Recombinant Type IX Collagen in Insect Cells-For amplification of the α3(IX) chain containing the Gly-Pro-Pro deletion, specific primers were designed on the basis of the published cDNA sequences (24). Two oligonucleotides, R9A3DEL (5′-CTT CTA CGG ACC GGG GGG GCC AGC TGG ACC GGG CCG ACC AAT GG, nt 1655-1685) and F9A3DEL (5′-GTT GTT CGG TCC GCC AGG CTC CAT TGG TCA CCC TGG CGC TCG, nt 1702-1730), both containing a generated CspI cleavage site, were designed to exon 30. R9A3DEL was used for PCR amplification with oligonucleotide M29B (5′-CCC GAC GCC GCA GTC TAG ACT CCG CCA CGC) that corresponded to the 5′-noncoding region and F9A3DEL with oligonucleotide MH30 (5′-TCG GGC GTC CTT GTC TCT AGA TTC CTC ACG) that corresponded to the 3′-noncoding region of the α3(IX) cDNA. The DNA template for PCR amplification was α3(IX) cDNA transcribed from total RNA extracted from human fetal cartilage. PCR was performed in a 40-μl volume with thermal cycling of 45 s at 94°C, 45 s at 60°C, and 1 min at 72°C for 30 cycles followed by a final extension at 72°C for 10 min. The primer pairs amplified 1729-bp 5′-end sequences (M29B and R9A3DEL) and 404-bp 3′-end sequences (F9A3DEL and MH30) of the α3(IX) cDNA. The PCR products were digested with CspI, purified using the QIAEX II gel extraction kit (Qiagen) followed by ligation into pVL1392 vector, and sequenced using cDNA specific primers (ABI PRISM model 377 sequencer, Perkin-Elmer; ABI PRISM dye terminator cycle sequencing ready reaction with AmpliTaq DNA polymerase, FS, Perkin-Elmer).
Expression of mutant recombinant type IX collagen was achieved by co-infecting Trichoplusia ni (High Five, Invitrogen) insect cells with the recombinant virus for the α3(IX) chain containing the Gly-Pro-Pro deletion and viruses for the α1(IX) and α2(IX) chains (43) together with a double promoter virus, 4PHαβ (28), coding for the α and β subunits of human prolyl 4-hydroxylase. For expression of wild-type collagen, a virus for the wild-type α3(IX) chain (43) was substituted for the mutant α3(IX) virus. Culture conditions were as described (43). Culture medium was collected after 72 h of infection, and the recombinant type IX collagen was precipitated with 25% saturation of ammonium sulfate. The precipitate was dissolved overnight at 4°C in 0.5 M urea, 0.2 M NaCl, 0.05 M Tris buffer, pH 7.4. For pepsin digestion, the samples were adjusted to pH 2, and digestion was performed at room temperature for 4 h. The undigested controls were incubated without pepsin, and denaturation of the samples was performed by heating at 60°C for 5 min prior to digestion. After pepsin treatment, the pH was adjusted to 7.5. All the type IX protein samples were analyzed by SDS-PAGE followed by staining with Coomassie Brilliant Blue. Further purification of the mutant recombinant type IX collagen was achieved by cation exchange chromatography as described (43).

RESULTS
Characterization of Genomic Clones-Screening of the human genomic phage library yielded two positive clones, gRB2B1 and gRB5B1, that contained the 3′-portion of the COL9A3 gene. gRB2B1 was approximately 12 kb in size and hybridized with both p1061 and p342. gRB5B1 was approximately 18 kb in size and hybridized only with the 3′-probe p342. A third genomic clone, gRB35, that was isolated previously, contained the 5′-most end of the gene (24). However, after DNA sequencing and restriction mapping of the genomic clones, it was found that the clones did not overlap and thus did not cover the entire gene. To obtain clones covering the entire gene, a human P1 library was screened with two PCR primer pairs designed from the cDNA sequence to amplify the 5′-end or the 3′-end of the gene. The screening yielded three positive clones. The P1 clones were analyzed for the presence of the most 5′-end and 3′-end sequences of the corresponding cDNA by sequencing and PCR amplification. All clones were found to contain the entire coding region, and one clone (P1-C93A) was selected for detailed characterization of the gene (Fig. 1). Nucleotide sequencing of the human COL9A3 gene was performed by direct sequencing of the P1 clone or by sequencing of subclones in plasmids. A total of over 26 kb of the nucleotide sequence was determined. The results indicated that the gene is about 23 kb and contains 32 exons (Fig. 1). Also, over 3 kb of 5′-flanking sequences are presented. The sequences extend to the 3′-end of the next gene, which is called 7-60. Sites for selected restriction enzymes are shown in Fig. 1.
Exon Organization and Domain Structure-The genomic organization of the human COL9A3 gene is indicated in Figs. 1 and 2. The genomic structure was compared with that of a related gene, the human gene for the α2(IX) collagen chain (29). Because the domain structures of the α2(IX) and α3(IX) collagen chains are almost identical, it was probable that the exon organization of the corresponding genes would also be conserved. As expected, the overall exon organization of the genes showed considerable similarities. There were, however, some unexpected differences in the sizes of the exons encoding the COL3 domain even though the size of the domain is identical in these genes. The COL3 domain is 411 bp or 137 amino acid residues and is encoded by exons 2 through 10 in both genes. Exon 2 in both genes is a junction exon between NC4 and COL3. Exon 2 in the COL9A3 gene codes for one fewer collagen triplet than exon 2 in the COL9A2 gene, but the COL3 domain is the same size because exon 4 in COL9A3 codes for one additional triplet compared with exon 4 in the COL9A2 gene (Fig. 3). In effect, the deletion of 9 bp coding for one -Gly-X-Y- triplet in exon 2 of the COL9A3 gene is compensated by an insertion of 9 bp coding for an additional triplet in exon 4.
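The compensation described above can be verified with simple arithmetic. The following is a minimal sketch that uses only the figures stated in the text (a 411-bp COL3 domain, and 9-bp differences in exons 2 and 4); the variable and function names are illustrative and not part of the original analysis.

```python
# Arithmetic check of the exon 2 / exon 4 compensation in the COL3 domain.
# Only the 411-bp domain size and the 9-bp differences come from the text;
# all names below are illustrative.

CODON_BP = 3                 # one amino acid residue
TRIPLET_BP = 3 * CODON_BP    # one -Gly-X-Y- triplet = 9 bp

COL3_BP = 411                # size of the COL3 coding region in both genes

# Difference in COL3-coding bp per exon (COL9A3 minus COL9A2), as reported.
EXON_DELTA_BP = {2: -TRIPLET_BP, 4: +TRIPLET_BP}


def residues(bp: int) -> int:
    """Number of amino acid residues encoded by an in-frame stretch of bp."""
    if bp % CODON_BP != 0:
        raise ValueError("length is not a whole number of codons")
    return bp // CODON_BP


def net_shift_bp(deltas: dict) -> int:
    """Net change in domain length summed over all exons."""
    return sum(deltas.values())


if __name__ == "__main__":
    print("COL3 residues:", residues(COL3_BP))
    print("net length change (bp):", net_shift_bp(EXON_DELTA_BP))
```

A net change of zero means that the COL3 domains of the two chains have the same length, so the chains can remain in register despite the different exon sizes.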
Noncoding Regions-The intron sizes of the human COL9A3 gene vary from 84 to over 2000 bp (Figs. 1 and 2). Six introns are over 1000 bp (Fig. 2). The largest intron is intron 26, which is about 2000 bp. This large size can partially be explained by insertion of two Alu repeats. Surprisingly, the first intron is relatively small, and the large introns are located mostly in the 3′-end of the gene. This arrangement is the opposite of that in the human COL9A2 gene (29) and in all of the fibrillar collagen genes characterized to date, in which the largest introns are located in the 5′-end (30-35).
Although the start of transcription was not determined due to a lack of human cartilage mRNA, sequences up to about −3000 from the start of translation were analyzed for the presence of binding sites for common transcription factors. The sequences contained several Sp1 consensus recognition sites, but no TATAA or CCAAT boxes. Altogether, 30 Sp1 sites were found in this sequence, 11 of them at positions −361 to −35 from the start of translation.
The putative promoter region and the introns were analyzed for the presence of binding sites for known cartilage-specific transcription factors. Results of recent reports indicate that SOX9 may play a role in chondrogenesis (36, 37). Two sites for the SRY/SOX protein binding motif (A/T)(A/T)CAA(A/T)G were found. One was at position −3174 to −3168 and the other at position 226-232 in the first intron. Both motifs were in reverse orientation. Other cartilage-specific motifs, such as the AT-rich element and C1 or C3 motifs, were not found (38). In addition, intron 10 was found to contain a sequence of 31 bp that was repeated 12 times. This repeat was not homologous to any known sequences.
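A search for the SRY/SOX motif in both orientations can be sketched as follows. This is an illustrative example only, not the software used in this study; the function names are hypothetical, coordinates are 0-based offsets into the input string, and any test sequence is made up.

```python
import re

# (A/T)(A/T)CAA(A/T)G written as a regular expression. The lookahead
# lets overlapping matches be reported as well.
SOX_MOTIF = re.compile(r"(?=([AT][AT]CAA[AT]G))")
MOTIF_LEN = 7

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sox_sites(seq: str):
    """Return sorted (start, strand, motif) tuples for both orientations.

    start is the 0-based position of the leftmost base of the site on the
    given (forward) strand; strand "-" marks a reverse-orientation site.
    """
    seq = seq.upper()
    hits = [(m.start(1), "+", m.group(1)) for m in SOX_MOTIF.finditer(seq)]
    rc = reverse_complement(seq)
    for m in SOX_MOTIF.finditer(rc):
        # Map the position on the reverse strand back to forward coordinates.
        start = len(seq) - (m.start(1) + MOTIF_LEN)
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    # Made-up sequence containing one reverse-orientation site (CATTGTT).
    print(find_sox_sites("CCCATTGTTCCC"))
```

Scanning the reverse complement is what allows sites in reverse orientation, such as the two reported here, to be detected with a single forward-strand pattern.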
Mutation Screening-A proband with MED was screened for mutations in the COL9A3 gene by PCR amplification of exons and flanking sequences and analyzing the products for heteroduplexes by CSGE. Several neutral polymorphisms and one potential disease-causing mutation were identified. The potential disease-causing mutation was in exon 30, and sequencing of the cloned PCR product identified a 9-bp deletion in the exon, removing a Gly-Pro-Hypro triplet in the 5′-end of the COL1 domain. The rest of the family members were analyzed for the presence of the mutation. As indicated in Fig. 4, two affected members of the family had the deletion. At the same time, two unaffected members had the deletion, and identical twins who were affected did not have the deletion. Hence, the deletion did not co-segregate with the phenotype in the family.
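The segregation argument can be made explicit with a small sketch. It encodes only the six individuals described in the text (the complete pedigree is in Fig. 4); the labels are hypothetical, and the strict test used here assumes full penetrance, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Individual:
    label: str      # hypothetical label, not a pedigree identifier
    affected: bool  # has the MED phenotype
    carrier: bool   # heterozygous for the 9-bp deletion


# The six family members whose status is stated in the text.
FAMILY_1 = [
    Individual("affected carrier 1", affected=True, carrier=True),
    Individual("affected carrier 2", affected=True, carrier=True),
    Individual("unaffected carrier 1", affected=False, carrier=True),
    Individual("unaffected carrier 2", affected=False, carrier=True),
    Individual("affected twin 1", affected=True, carrier=False),
    Individual("affected twin 2", affected=True, carrier=False),
]


def discordant(family):
    """Members whose genotype and phenotype disagree."""
    return [p.label for p in family if p.affected != p.carrier]


def cosegregates(family) -> bool:
    """Strict co-segregation: every carrier is affected and vice versa."""
    return not discordant(family)


if __name__ == "__main__":
    print("co-segregates:", cosegregates(FAMILY_1))
    print("discordant members:", discordant(FAMILY_1))
```

Both kinds of discordance are present (unaffected carriers and affected non-carriers), which is why the deletion cannot explain the phenotype in this family.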
A second family having a different 9-bp deletion in exon 30 that also removed a Gly-Pro-Hypro repeat (Fig. 5) was serendipitously identified while mapping the COL9A3 locus. An exon 30 intraexonic primer pair was used to identify an SSCP polymorphism suitable for linkage analysis in the CEPH linkage panel (24). During the course of determining the frequency of this polymorphism in the control population, a 9-bp deletion was found in a healthy relative of an extended family under study for another disorder. Segregation of the deletion was then studied in this family. Seven individuals within the family were found to have the deletion; however, there was no associated clinical phenotype. Specifically, there was no evidence of short stature, chondrodysplasia, precocious osteoarthritis, hearing loss, or myopia in heterozygous individuals. A fibroblast cell line from one of the family members was used to demonstrate that both alleles appeared to be equally expressed based upon reverse transcription-PCR amplification. The deletions were not found in 350 additional chromosomes, indicating that they are not common polymorphisms.
Sequencing of the region of the deletion in the first family (Fig. 5) indicated that the deletion occurred in a repetitive GC-rich region. Because of the repetitiveness of the sequences, the deletion could occur at two different sites (Fig. 5). Sequencing indicated that the deletion in the second family also consisted of 9 bp coding for a Gly-Pro-Hypro triplet in the 5′-end of the COL1 domain. Surprisingly, the deletion in the second family was different from the one in the first family (Fig. 5).
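Why the position of a deletion in a repetitive region is ambiguous can be illustrated with a short sketch. The sequences used here are hypothetical (a perfect tandem repeat of a 9-bp Gly-Pro-Pro unit with arbitrary flanks) and are not the exon 30 sequence shown in Fig. 5; the function name is also illustrative.

```python
def equivalent_deletion_starts(reference: str, mutant: str, del_len: int = 9):
    """Return every 0-based start at which deleting del_len bases from
    reference yields exactly the mutant sequence."""
    starts = []
    for i in range(len(reference) - del_len + 1):
        if reference[:i] + reference[i + del_len:] == mutant:
            starts.append(i)
    return starts


if __name__ == "__main__":
    unit = "GGACCCCCT"                      # hypothetical Gly-Pro-Pro unit
    reference = "ATG" + unit * 2 + "GAA"    # two tandem copies, made-up flanks
    mutant = "ATG" + unit + "GAA"           # one copy lost
    starts = equivalent_deletion_starts(reference, mutant)
    print("indistinguishable deletion start positions:", starts)
```

More than one start position produces the same mutant sequence, so sequencing alone cannot tell which copy of the repeat was lost. In the actual exon 30 sequence the repeat is not necessarily perfect, which limits the number of equivalent sites.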
Expression and Analysis of Recombinant Type IX Collagen Containing the Gly-X-Y Deletion in the α3(IX) Chain-To study whether α3(IX) chains with the deletion will participate in the formation of correctly folded type IX collagen, the Gly-Pro-Pro deletion was generated. Specific primers that contained a generated CspI cleavage site were used for the formation of the 9-bp (nt 1681-1689) deletion in exon 30. As a result of the generation of the CspI cleavage site, the third nucleotides of the CCT (nt 1686), GGA (nt 1698), and CCC (nt 1701) triplets were changed to C, T, and G, respectively. A recombinant virus coding for the Gly-Pro-Pro deleted α3(IX) chain was generated and used for the expression of trimeric type IX collagen, together with viruses for normal α1(IX), α2(IX), and α3(IX) chains. Mutant type IX collagen was purified by cation exchange chromatography and analyzed under reducing conditions by SDS-PAGE. Results indicate that the trimeric type IX collagen molecules consisted of the three α chains in a 1:1:1 ratio (Fig. 6). To study the triple helicity of the recombinant type IX collagen, pepsin treatment was performed on native and on denatured recombinant type IX collagen samples followed by SDS-PAGE analysis under nonreducing conditions. Pepsin was found to digest all of the material when the type IX collagen samples were denatured before pepsin treatment. Pepsin-resistant fragments were seen if the protein samples were not denatured (Fig. 7). The results indicate that the recombinant type IX collagen containing the Gly-X-Y deletion in the α3(IX) chain is secreted as correctly folded triple helical molecules.
FIG. 7. ... Fig. 6 and analyzed by 8% SDS-PAGE under nonreducing conditions. Samples in lanes 3, 5, 7, and 9 were subjected to limited pepsinization (P) for 4 h at room temperature, whereas samples in lanes 2, 4, 6, and 8 were incubated for 4 h at room temperature without pepsin. Samples in lanes 4, 5, 8, and 9 were denatured by heating at 60°C for 5 min before the incubation at room temperature with (lanes 5 and 9) or without (lanes 4 and 8) pepsin. Lane 1, molecular weight marker. HMW, high molecular weight fragments of type IX collagen obtained by pepsinization (39). Pepsin (arrow) indicates the position of pepsin after electrophoresis.

DISCUSSION
The results demonstrate that the genomic structure of the COL9A3 gene is similar to that of the human COL9A2 gene (29). However, there are two major differences. One is that the large introns of the COL9A3 gene are found primarily near the 3′-end of the gene, whereas the large introns of the COL9A2 gene are at the 5′-end of the gene (29). The COL9A3 gene appears to be the exception in this regard because most other genes for collagens have their large introns at the 5′-end of the gene (30-35). A second difference between the two genes is the size distribution among exons coding for the COL3 domain. The size of the COL3 domain is 137 amino acids in both genes. Exon 2 of the COL9A3 gene codes for one fewer -Gly-X-Y- triplet than exon 2 of the COL9A2 gene, but the number of -Gly-X-Y- triplets needed to form an in-register triple helix remains the same because exon 4 of the COL9A3 gene codes for an additional triplet compared with exon 4 of the COL9A2 gene.
The number of -Gly-X-Y- triplets is normally the same in each chain of a collagen triple helix, which ensures the formation of an in-register triple helix. However, the COL1 domain of the α3(IX) chain is one triplet shorter than the corresponding domain in the α1(IX) and α2(IX) chains. Surprisingly, we observed here an additional 9-bp deletion coding for a -Gly-X-Y- triplet in an individual with MED, suggesting that the deletion was disease-causing. However, examination of affected and unaffected members of the family indicated that there was no co-inheritance of the deletion with the disease phenotype. Individuals from a second family segregating a different 9-bp deletion within this domain had no skeletal phenotype. Therefore, the deletions must be neutral variants of the gene.
A likely explanation for this finding is that the NC2 domain can compensate for the size difference within COL1 created by the deletion by independently facilitating the correct register of the individual α chains prior to their folding to form the COL2 triple helical domain. Mechanisms leading to the precise alignment of individual α chains have been principally studied in the fibril-forming collagens. In these molecules, the C-terminal propeptides of the α chains associate through noncovalent interactions, ensuring the correct register of the α chains. The interaction is stabilized by intramolecular disulfide bonds. This is followed by triple helix formation that progresses from the C terminus to the N terminus. It has been shown that the C-terminal propeptides of fibrillar collagens contain all the necessary information for the correct chain selection and association (see Ref. 40). Accordingly, it has been shown recently that synthetic peptides of the three α chains of type IX collagen consisting of the entire NC1 domain and the C-terminal end of the COL1 domain contain all the necessary information for chain selection and assembly (12). The correct assembly of the α chains is critical because triple helix formation progresses in a zipper-like fashion. That a deletion within the COL1 domain of collagen IX does not cause a clinical phenotype suggests that it does not interfere with the normal folding of the remaining domains of this molecule. To study that possibility, recombinant human type IX collagen containing the deletion in the α3 chain was expressed in insect cells. Analysis of the protein indicated that the mutated α3 chain participates in the formation of correctly folded heterotrimeric molecules. Thus, the NC2 domain of collagen IX may function to align α chains prior to COL2 triple helix formation, similar to the function of the NC1 domain during COL1 formation (Fig. 8). Internal non-triple helical domains within other FACIT collagens may have similar roles.
Similar 9-bp deletions have been reported in direct repeat sequences of the COL1A1 gene in two unrelated probands with lethal osteogenesis imperfecta (41, 42). The deletions led to the loss of one of three consecutive Gly-Ala-Hypro triplets at positions 868-876. The mechanism by which the deletions caused a lethal phenotype is not entirely clear, but the results suggested that the deletions introduced a shift in the phase of the chains in the triple helix, and the shift was propagated from the site of the deletion toward the N terminus of the molecule. Even though the deletions did not abolish the N-proteinase or collagenase cleavage sites, the deletions might interfere with cross-link formation or prevent the lateral association of molecules to form fibrils. A splice mutation in the COL9A2 gene leading to an in-frame deletion of 12 amino acids in the COL3 domain has been reported in a family with MED (16). This mutation shortens the COL3 domain and interferes with the structure of the NC4 domain. Thus, it is unlikely that the 9-bp deletions identified here introduce a shift in the phase of the chains that propagates from the site of the deletion all the way to the N terminus of the molecule. These findings also suggest that the NC2 domain of type IX collagen may compensate for the size difference of the α chains and thus prevent the propagation of the deletion (Fig. 8).
In the clinical context, the fact that similar deletions can be associated with lethal phenotypes in one collagen molecule yet constitute benign variants in another calls for caution against overinterpreting the potential consequences of DNA mutations in the absence of complementary biochemical or cell biological studies.

FIG. 8. Schematic representation of the structure of the COL1, NC2, and COL2 domains. The common structure is shown in A; the proposed effect of the Gly-X-Y deletion is shown in B. The deletion in the COL1 domain of the α3(IX) chain shortens the COL1 domain by one triplet and increases the length of the α1(IX) and α2(IX) chains by three amino acid residues in the NC2 domain. The deletion does not interfere with disulfide bond formation between the α1 and α3 chains or the triple helical structure of the COL2 domain.