The human lumican gene. Organization, chromosomal location, and expression in articular cartilage.

A human lumican cDNA sequence was derived by polymerase chain reaction techniques from RNA obtained from intestine, placenta, and articular cartilage. A contiguous sequence of 1729 bases was obtained corresponding to an observed message size of 1.8 kilobases (kb). The cDNA sequence consists of an 80-base pair (bp) 5′-untranslated region, a 1014-bp coding sequence, and a 618-bp 3′-untranslated region terminating in a 17-bp poly(A) tail. The deduced lumican protein sequence has 338 amino acids, including a putative 18-residue signal peptide. The human lumican gene was shown to be spread over about 7.5 kb of genomic DNA and to be located on chromosome 12q22. The gene consists of 3 exons separated by introns of 2.2 and 3.5 kb. The shorter 5′-intron resides 21 bases prior to the translation initiation codon, and the 3′-intron resides 152 bases prior to the translation termination codon. The lumican message is expressed at high levels in adult articular chondrocytes but at low levels in the young juvenile. This age-related trend in message level is not, however, common to all tissues in which the lumican gene is expressed. Lumican is present in the extracellular matrix of human articular cartilage at all ages, although its abundance is far greater in the adult. In the adult cartilage lumican exists predominantly in a glycoprotein form lacking keratan sulfate, whereas the juvenile form of the molecule is a proteoglycan.

Lumican belongs to the family of relatively small leucine-rich proteoglycans that are present in the extracellular matrix of many tissues. In addition to lumican, the family includes decorin, biglycan, and fibromodulin (1,2), and each family member has a common structure consisting of a central region of leucine-rich repeats that are flanked at either side by a disulfide-bonded domain. The central leucine-rich region possesses attachment sites for N-linked oligosaccharides, which in fibromodulin and lumican may be modified by sulfation of their polylactosamine units to yield keratan sulfate. Fibromodulin and lumican may therefore also be classed as keratan sulfate-proteoglycans (KS-PG). Such modification does not usually occur in decorin and biglycan, and they retain unsulfated oligosaccharides. These proteoglycans do possess attachment sites for chondroitin sulfate in the amino-terminal regions of their core protein. In many tissues the chondroitin sulfate is modified to dermatan sulfate by epimerization, resulting in decorin and biglycan being classed as dermatan sulfate-proteoglycans (DS-PG).
The structure of the leucine-rich repeats places lumican, fibromodulin, decorin, and biglycan in the leucine-rich repeat (LRR) superfamily of proteins (3). Each member of the family is characterized by multiple adjacent leucine-rich regions, which may possess from 20 to 29 amino acid residues and be repeated up to 30 times. In the case of the four proteoglycans, the leucine-rich repeats consist of 24 amino acid residues, which is the most common size in other family members. Each of the proteoglycans also possesses 10 of the repeating units between the flanking disulfide-bonded domains. The presence of a common structural motif suggests that the four proteoglycans may share common functional properties. Such a common function may be the interaction with fibrillar collagen (4-6). The presence of the proteoglycans is thought to influence the interaction of the collagen fibrils with one another or with other matrix components (7), and as the binding site for each proteoglycan appears to be distinct (7,8), it is likely that each family member fulfills a different role.
The complete amino acid sequences for bovine and chicken lumican have been deduced from cDNA clones (9,10). The bovine and chicken cDNA clones encode proteins of 342 and 343 amino acids, respectively, and in both cases the first 18 amino acids are thought to represent signal peptides. In the case of the bovine cornea, several distinct KS-PG isoforms have been isolated with deglycosylated core protein sizes of 37 and 25 kDa (11). The 37-kDa core proteins could be further subdivided into two components, designated 37A and 37B, on a charge basis due to differences in KS content (12). Only the proteoglycan with the lowest KS content (37B) represents lumican (13). The other proteoglycans appear to have different protein structures (12) and represent the products of separate mRNAs (14). It is likely that one of these proteoglycans may be fibromodulin, although it has been shown that corneal decorin may exist in a KS-PG form, bearing both chondroitin sulfate and keratan sulfate chains in the adult chick (15). It is also apparent that under some conditions lumican can exist in a glycoprotein form rather than a proteoglycan form (16), and in the cornea the conversion from nonsulfated polylactosamine chains to keratan sulfate chains is developmentally regulated (17).
While there is considerable information on corneal lumican, much less is known about the expression of this molecule in other connective tissues, particularly in cartilage, which is known to be rich in decorin, biglycan, and fibromodulin (18). Furthermore, there is as yet no information on the organization of the lumican gene or its chromosomal localization. The aim of this work is to address these deficits in the human, where there is also no reported information on the protein sequence of lumican.

EXPERIMENTAL PROCEDURES
Materials-Guanidine isothiocyanate, penicillin, gentamycin, fungizone, Dulbecco's modified Eagle's medium, and biotin-14-dATP were from Life Technologies, Inc. Guanidine hydrochloride, bacterial collagenase (type 1A), and hyaluronidase (type IV) were from Sigma. Taq DNA polymerase was from Applied Biosystems-Perkin Elmer. 5′-RACE-ready cDNA and multiple tissue Northern blots were from Clontech. Qiagen plasmid kits and Qiaquick gel extraction kits were from Qiagen. The positively charged nylon membrane and the Expand Long Template PCR system were from Boehringer Mannheim. The Rapid-hyb buffer, radioactive nucleotides, and x-ray film were from Amersham Corp. The Sequenase sequencing kit was from U.S. Biochemical Corp. DNA suitable for PCR from human-hamster somatic cell hybrids was from Bios Corp. The TA-cloning kit was obtained from Invitrogen. Avidin-conjugated fluorescein isothiocyanate was from Vector Labs. Chondroitinase ABC (from Proteus vulgaris) and endo-β-galactosidase (from Escherichia freundii) were from ICN, and keratanase II (from Bacillus sp.) was from Seikagaku Kogyo. The alkaline phosphatase-conjugated goat anti-rabbit antibody was from Promega. Nitrocellulose membranes and prestained SDS-PAGE molecular weight standards were from Bio-Rad.
Sources of Human Tissue-Fetal intestine was a gift from Dr. Daniel Ménard and was obtained from therapeutic abortion performed at 16-20 weeks gestation. Human articular cartilage was collected from the distal femur at the time of autopsy and within 20 h of death. The specimens were from individuals aged 4 and 6 weeks and 3, 37, 39, 64, and 66 years. None of the individuals had any evidence of joint disease.
Isolation of RNA-RNA was extracted from chondrocytes or fetal intestine by the acid guanidinium thiocyanate/phenol/chloroform method (19). For chondrocyte isolation, tissue was collected in Dulbecco's modified Eagle's medium plus antibiotics (50 units/ml penicillin, 0.1 mg/ml gentamycin, 5 μg/ml fungizone). Where necessary, extraneous connective tissue was dissected off, and the cartilage was cut into pieces of approximately 1 mm³. The tissue was digested at 5 ml/g in Dulbecco's modified Eagle's medium containing 0.12% collagenase and 0.1% hyaluronidase in a spinner flask at 37°C overnight. The cell suspension was passed through a tissue sieve to remove any remaining tissue pieces, and the cells were recovered by centrifugation at 1,000 rpm (160 × g_av) at room temperature.
PCR Amplification of cDNA-First strand cDNA was synthesized from 10 μg of total RNA in a 20-μl reaction, as described previously (20), and 4 μl of this reaction mix was used for PCR amplification using Taq DNA polymerase and standard procedures (21). Primers 5′-GCATCTGCTTTAAGAAT and 5′-GTTTGAGTGATGTGATT, spanning nucleotides 30-46 and 1045-1061, respectively, in the published bovine lumican sequence (10) were used to amplify the cDNA fragment HFI.1 from human fetal intestine cDNA in a 40-cycle reaction (94°C for 45 s, 40°C for 30 s, 72°C for 2 min per cycle). The cDNA fragment HFI.2, completing the 3′-end of human lumican, was amplified from human fetal intestine cDNA by the 3′-RACE protocol (22), using a human lumican-specific primer 5′-GAGGATGCTGTTTCAGCT representing bp 597-614 (Fig. 1) from sequence data obtained from HFI.1 and an adaptor oligonucleotide directed toward the poly(A) tail in a 40-cycle reaction (94°C for 45 s, 50°C for 30 s, 72°C for 1.5 min per cycle). The cDNA fragment HP.1, representing the 5′-end of human lumican, was amplified from human placenta 5′-RACE-ready cDNA using human lumican-specific primers representing bp 231-248 and bp 113-132 (Fig. 1) from sequence data obtained from HFI.1 in two nested PCR amplifications following the manufacturer's directions.
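As a quick consistency check on the three primers quoted above, the following Python sketch compares each primer's length with its stated coordinate span and estimates a melting temperature with the Wallace rule (2°C per A or T, 4°C per G or C). The Wallace rule is a rough approximation for short oligonucleotides and is introduced here only for illustration; the text does not say how the annealing temperatures were chosen. The function names are hypothetical.

```python
# Primers quoted in the text (5'->3') with their stated first and last positions.
PRIMERS = {
    "HFI.1 forward (bovine 30-46)": ("GCATCTGCTTTAAGAAT", 30, 46),
    "HFI.1 reverse (bovine 1045-1061)": ("GTTTGAGTGATGTGATT", 1045, 1061),
    "HFI.2 3'-RACE (human 597-614)": ("GAGGATGCTGTTTCAGCT", 597, 614),
}


def span_length(start, end):
    """Number of bases covered by an inclusive coordinate range."""
    return end - start + 1


def wallace_tm(seq):
    """Rough melting temperature in degrees C: 2 per A/T plus 4 per G/C."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, (seq, start, end) in PRIMERS.items():
    length_matches = len(seq) == span_length(start, end)
    print(f"{name}: {len(seq)} nt, span matches: {length_matches}, "
          f"Wallace Tm about {wallace_tm(seq)} C")
```

Under this approximation the two bovine-derived primers come out near 46°C and the human-specific primer near 54°C, which is at least in line with the lower annealing temperature (40°C) used for the cross-species amplification than for the 3′-RACE reaction (50°C).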
In addition, the cDNA fragments HAC.1 and HAC.2, spanning the expressed lumican sequence, were amplified from adult human chondrocyte cDNA. Primers representing bp 42-59 and bp 1091-1111 for HAC.1 and bp 768-785 and bp 1507-1526 for HAC.2 (Fig. 1) were chosen from the human lumican sequence derived from the placental and intestinal cDNA. Both amplifications were for 40 cycles (using 94°C for 45 s, 42°C for 30 s, 72°C for 1 min per cycle for HAC.1 and 94°C for 45 s, 50°C for 30 s, 72°C for 1.5 min for HAC.2). All amplified fragments described above were cloned into the pCRII vector using the TA cloning kit of Invitrogen, following the manufacturer's instructions, and subjected to sequence analysis.
PCR Amplification of Genomic DNA-Human genomic DNA was isolated from normal peripheral blood (23). The following primer pairs from the human lumican cDNA sequence (Fig. 1) were used to screen genomic DNA for the location of intron sequences: 1) bp 42-61 and bp 113-132; 2) bp 301-320 and bp 850-867; 3) bp 768-785 and bp 1091-1111; and 4) bp 948-967 and bp 1507-1526. 250 ng of genomic DNA were amplified using Taq DNA polymerase for 40 cycles (94°C for 30 s, 45°C for 30 s, 72°C for 1 min per cycle). 20 μl of each reaction were analyzed on a 1.2% agarose gel. A second PCR screening of genomic DNA was performed with primers adjacent to identified intron/exon junctions to verify that there were no other introns in the described lumican sequence.
To obtain data on intron size and splice junction sites, the Expand Long Template PCR system was used to PCR amplify intron fragments from genomic DNA. Primer pairs representing bp 42-61 and bp 113-132, and bp 768-785 and bp 1091-1111, from the human lumican cDNA sequence (Fig. 1) were used with 250 or 500 ng of genomic DNA, as suggested by the manufacturer. After an initial denaturation for 2 min at 94°C, amplifications were performed for 30 cycles (94°C for 10 s, 42°C for 30 s, 68°C for 15 min per cycle). 20 μl from each reaction were analyzed on a 0.7% agarose gel. The major band from each reaction was purified from the gel using the Qiaquick gel extraction kit.
Purified fragments were blunt-ended using the Klenow fragment of Escherichia coli DNA polymerase I and phosphorylated with bacteriophage T4 polynucleotide kinase (23). They were then cloned into SmaI-digested pUC18 for identification by sequence analysis.
To obtain the chromosome location, a panel of human/hamster somatic cell hybrids covering the entire human genome was PCR screened in two separate reactions. The first reaction used primers representing bp 301-320 and bp 459-478 (Fig. 1) in a 40-cycle reaction (94°C for 45 s, 50°C for 30 s, 72°C for 30 s per cycle), while the second reaction used primers representing bp 768-785 and bp 850-867 (94°C for 45 s, 42°C for 30 s, 72°C for 45 s per cycle). The products were analyzed on a 1.5% agarose gel.
DNA Sequencing-Plasmid DNA was purified using Qiagen plasmid kits following manufacturer's instructions. Purified double-stranded plasmid DNA was sequenced using the dideoxy chain-termination method (24) and modified T7 DNA polymerase (Sequenase) with oligonucleotide primers from the cDNA sequence. Sequences were determined on both strands of two separate clones from each PCR reaction. This enabled the identity of each nucleotide to be determined by at least four independent analyses on each DNA strand.
Northern Blots-For chondrocyte samples, 10 μg of total RNA was electrophoresed in a 1% agarose, 1.11% formaldehyde gel and capillary-blotted to positively charged nylon membrane in 20 × SSC. For each hybridization, 10 ng of gel-purified probe were labeled using the Multiprime DNA-labeling system. Probes used were amplified by reverse transcription-PCR from human adult chondrocyte RNA. The following probes were used: lumican, bp 41-630 (Fig. 1); fibromodulin, bp 2-634 (25); decorin, bp −84 to 1223 (26); and biglycan, bp 70-1239 (27). The only probe not prepared in this manner was that for human glyceraldehyde-3-phosphate dehydrogenase, which was a 901-bp fragment excised from the 1.2-kb cDNA insert in the plasmid pHcGAPNR obtained from the Repository of Human DNA Probes and Libraries (American Type Culture Collection) (28). Hybridizations on all RNA blots were carried out using Rapid-hyb buffer for 2 h at 65°C. Blots were then washed with 2 × SSC for 10 min with several buffer changes at room temperature, followed by 2 × 15-min washes in 0.2 × SSC at 65°C and subsequent exposure to x-ray film.
Fluorescence in Situ Hybridization-The genomic plasmid clones spanning the two introns in the lumican gene were used as probes to map the gene to human chromosomes by using fluorescence in situ hybridization. Each plasmid clone was labeled with biotin-14-dATP using nick translation, and was then hybridized to metaphase chromosomes prepared from normal male peripheral blood lymphocytes by the bromodeoxyuridine synchronization method (29). Fluorescence in situ hybridization was performed essentially according to the method described previously (30). Briefly, the hybridization solution contained 200 ng of probe DNA, 5 μg of Cot 1 DNA, and 5 μg of sonicated salmon sperm DNA per 10 μl of hybridization mixture (70% formamide, 10% dextran sulfate, 2 × SSC). This was preannealed at 37°C for 15 min and then applied to denatured chromosome slides. Posthybridization washes were at 44°C in 2 × SSC, 50% formamide (4 × 5 min) followed by washing in 1 × SSC at 55°C (4 × 5 min). Hybridized DNAs were then detected with avidin-conjugated fluorescein isothiocyanate. Two amplifications were carried out using biotinylated anti-avidin. To generate clear reverse bands, metaphase chromosomes were counterstained with chromomycin A3 followed by distamycin A (29). The image was captured by using the Photometrics Cooled-CCD camera (CH250) and the BDS image analysis system (ONCOR Imaging).
Preparation of Cartilage Extracts-Cartilage was finely diced and then extracted with 4 M guanidinium chloride, 100 mM sodium acetate, pH 6.0, containing proteinase inhibitors, at 10 ml/g of tissue for 48 h at 4°C (31). The filtered extract was dialyzed against 0.1 M sodium acetate, 0.1 M Tris/HCl, pH 7.3, and treated with chondroitinase ABC at 0.1 units/ml of extract for 4 h at 37°C. The chondroitinase-treated extract was then analyzed directly by SDS-PAGE or following treatment with keratanase II or endo-β-galactosidase. For further enzyme treatments the extracts were dialyzed against 0.01 M sodium acetate, pH 6.0, and treated with keratanase II at 0.01 units/100 μl of extract or endo-β-galactosidase at 0.01 units/100 μl of extract. Both digestions were overnight at 37°C. Following enzyme digestions samples were heated at 100°C for 3 min in the presence of SDS-PAGE loading buffer.
SDS-PAGE and Immunoblotting-Proteins in the treated cartilage extracts were fractionated by SDS-PAGE, electroblotted onto nitrocellulose, and immunolocalized as described previously (32). Immunolocalization involved the use of a specific anti-peptide antiserum to human lumican used at a 1:200 dilution, followed by an alkaline phosphatase-conjugated goat anti-rabbit second step antibody for visualization. The specific antiserum was raised in a rabbit to a synthetic peptide conjugated to ovalbumin as described previously (31). The peptide possessed the sequence H-CGGLRVANEVTLN-OH, which comprises an amino-terminal cysteine residue to be used for conjugation, two spacer glycine residues, and the carboxyl-terminal 10 amino acids of human lumican (Fig. 2).

RESULTS
Human Lumican cDNA Sequence-A contiguous sequence of 1729 nucleotides was obtained for the human lumican cDNA using reverse transcription-PCR and a combination of placental and fetal intestinal RNA for reverse transcription and amplification. The initial amplification used oligonucleotide primers based on the published bovine lumican cDNA sequence (10), but subsequent amplification used the new human nucleotide sequence for primer generation. The sequence was verified in either direction on two independent PCR products for each of the three cDNA sequences from which the contiguous sequence was derived. The sequence contains an 80-bp 5′-untranslated region, a 1014-bp coding sequence, and a 618-bp 3′-untranslated region terminating in a 17-residue poly(A) tail (Fig. 1). There is a single polyadenylation signal (AATAAA) located 15 bases prior to the commencement of the poly(A) tail.
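The segment lengths quoted above can be checked against the 1729-nucleotide total with a few lines of Python. This is a minimal sketch; the helper name `segment_coordinates` is hypothetical, and the derived coordinates assume that the segments are contiguous and that the 1014-bp coding figure excludes the stop codon (which the 338-residue protein length implies, since 338 × 3 = 1014).

```python
# Segment lengths (bp) as reported for the human lumican cDNA, in 5'->3' order.
SEGMENTS_BP = [("5' UTR", 80), ("coding", 1014), ("3' UTR", 618), ("poly(A)", 17)]


def segment_coordinates(segments):
    """Return {name: (start, end)} using 1-based inclusive cDNA positions."""
    coords = {}
    position = 1
    for name, length in segments:
        coords[name] = (position, position + length - 1)
        position += length
    return coords


coords = segment_coordinates(SEGMENTS_BP)
total_bp = sum(length for _, length in SEGMENTS_BP)
residues = dict(SEGMENTS_BP)["coding"] // 3

print(total_bp)          # 1729
print(residues)          # 338
print(coords["coding"])  # (81, 1094)
```

The coding coordinates obtained this way (positions 81 to 1094) are the ones needed to relate cDNA positions to codon numbers.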
The cDNA encodes a protein of 338 amino acids (Fig. 2), of which the first 18 amino acids represent a putative signal peptide (33,34). The deduced amino acid sequence was verified by additional nucleotide sequencing using PCR products generated from adult human chondrocyte RNA. The sequence shows six cysteine residues at amino acid positions 37, 41, 43, 53, 295, and 328 of the primary translation product that could be involved in disulfide bond formation, with 10 leucine-rich regions separating the first four and last two cysteine residues. This central leucine-rich repeat region contains four asparagine residues at amino acid positions 88, 127, 160, and 252 that form part of the consensus sequence (NX(S/T)) necessary for substitution by N-linked oligosaccharides or keratan sulfate. These features are conserved in the published sequences for bovine (10) and chick (9) lumican.
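The NX(S/T) consensus mentioned above is straightforward to search for programmatically. The sketch below is illustrative only: the protein string is an invented toy sequence, not the lumican sequence, and the function name is hypothetical. It also excludes proline at the X position, which is a widely used refinement of the sequon definition rather than something stated in the text.

```python
import re

# N, then any residue except proline (assumed refinement), then S or T.
# The lookahead lets overlapping sequons be reported individually.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein):
    """Return 1-based positions of asparagines in N-X-(S/T) sequons."""
    return [match.start() + 1 for match in SEQUON.finditer(protein)]


toy_protein = "MNLSAANPTGNGTK"  # invented example, not lumican
print(find_sequons(toy_protein))  # [2, 11]
```

Applied to the deduced human lumican sequence of Fig. 2, a search of this kind would be expected to return the four positions listed in the text, although that has not been run here.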
Organization of Human Lumican Gene-Oligonucleotide primer pairs were synthesized to allow the PCR amplification of overlapping regions spanning the lumican cDNA. These primer pairs were then used to screen a human genomic DNA preparation for the presence of introns, which should result in a product of larger size or no product, depending on the intron size and location. Only two primer pair sets failed to give products of the expected size. These spanned bp 42-132 and bp 768-1111 in the cDNA. In both cases no product was obtained, suggesting the presence of a long intron in the equivalent region of the genomic DNA. To overcome this difficulty in synthesizing long PCR products, the amplification system was replaced with one designed for this purpose. Use of the Expand Long Template PCR system gave rise to products with sizes of 2.3 and 3.8 kb, respectively, for the 5′- and 3′-regions that had previously eluded amplification. This suggests the presence of introns with sizes of about 2.2 and 3.5 kb (Fig. 3).
The location of the putative introns was confirmed by nucleotide sequence analysis, with both showing classical GT/AG sequences flanking the intron splice junctions (35). The 5′-intron splice site occurs between nucleotides 59 and 60 in the cDNA sequence, a site that is within the 5′-untranslated region and 21 bases prior to the translation initiation codon (Fig. 3). The 3′-intron splice site occurs between nucleotides 942 and 943 and resides within the coding sequence at a site 152 nucleotides prior to the translation termination codon. This junction indicates a phase 1 intron, which splits the codon for lysine (AAG) at amino acid residue 288 in the deduced protein sequence between the first and second bases. The occurrence of the above splice junctions suggests that the human lumican gene consists of three exons spanning about 7.5 kb of genomic DNA, although it is possible that additional introns may exist at the extreme 5′- and 3′-ends of the gene. The first exon encodes most of the 5′-untranslated region, the second exon encodes most of the coding region, and the third exon encodes the remainder of the coding region and the 3′-untranslated region.
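The step from product sizes to intron sizes is a subtraction: each genomic PCR product contains the intron plus the cDNA sequence lying between the outer ends of the two primers. A minimal sketch of that arithmetic follows; the function names are hypothetical, and gel-based product sizes are approximate, so the results are only good to about 0.1 kb.

```python
def cdna_span_bp(first_base, last_base):
    """Length of cDNA covered by a primer pair (inclusive coordinates)."""
    return last_base - first_base + 1


def intron_estimate_kb(product_kb, first_base, last_base):
    """Genomic product size minus the exonic sequence it contains."""
    return product_kb - cdna_span_bp(first_base, last_base) / 1000


five_prime = intron_estimate_kb(2.3, 42, 132)     # primers spanning bp 42-132
three_prime = intron_estimate_kb(3.8, 768, 1111)  # primers spanning bp 768-1111

print(round(five_prime, 1))   # 2.2
print(round(three_prime, 1))  # 3.5
```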
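The distances and the intron phase quoted above follow directly from the cDNA coordinates. The sketch below reproduces them, assuming that the coding sequence occupies positions 81 to 1094 (an 80-bp 5′-untranslated region followed by 1014 coding bases, stop codon excluded) and that the transcribed exon sequence equals the cDNA without its poly(A) tail. The function name is hypothetical.

```python
CDS_START = 81   # first base of the initiation codon (after the 80-bp 5' UTR)
CDS_END = 1094   # last coding base, assuming 1014 bp excludes the stop codon


def intron_phase(last_exon_base, cds_start=CDS_START):
    """Phase and codon for an intron that follows cDNA position last_exon_base.

    Phase is the number of bases of a codon lying upstream of the intron.
    For phase 1 or 2 the returned codon is the one that is split; for
    phase 0 it is the last complete codon before the intron.
    """
    coding_bases_before = last_exon_base - cds_start + 1
    phase = coding_bases_before % 3
    codon = coding_bases_before // 3 + (1 if phase else 0)
    return phase, codon


bases_before_start = CDS_START - 60      # 5' intron lies between bases 59 and 60
bases_before_stop = CDS_END - 943 + 1    # 3' intron lies between bases 942 and 943
phase, codon = intron_phase(942)

# Approximate gene span in kb: exon sequence plus the two intron estimates.
gene_span_kb = ((1729 - 17) + 2200 + 3500) / 1000

print(bases_before_start, bases_before_stop)  # 21 152
print(phase, codon)                           # 1 288
print(gene_span_kb)                           # 7.412
```

The summed span of roughly 7.4 kb agrees with the estimate of about 7.5 kb, given the uncertainty in the intron sizes.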
Chromosome Location of Human Lumican Gene-To determine on which chromosome the human lumican gene resides, DNA from a panel of 12 human-hamster somatic cell hybrids was analyzed by PCR using two sets of primer pairs that give a unique product with human genomic DNA relative to hamster DNA. The expected PCR product was identified in three of the DNA samples from the somatic cell hybrids (Fig. 4). The only chromosome shared by these three hybrids was chromosome 12, and this chromosome was not present in any of the other nine hybrids. These data are concordant with the presence of the human lumican gene on chromosome 12 and discordant with its presence on all other chromosomes.
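The concordance reasoning used here can be expressed as a simple set comparison: a chromosome is a candidate only if it is present in every PCR-positive hybrid and absent from every PCR-negative one. The following sketch illustrates the logic with an invented five-hybrid panel; the hybrid names and chromosome contents are hypothetical and are not the panel used in this work.

```python
def concordant_chromosomes(panel, positives):
    """Return chromosomes present in all positive and no negative hybrids.

    panel: {hybrid_name: set of human chromosomes retained}
    positives: set of hybrid names giving the human-specific PCR product
    """
    all_chromosomes = set().union(*panel.values())
    return sorted(
        chrom
        for chrom in all_chromosomes
        if all((chrom in retained) == (hybrid in positives)
               for hybrid, retained in panel.items())
    )


# Invented panel for illustration only.
toy_panel = {
    "H1": {"3", "12", "X"},
    "H2": {"5", "12"},
    "H3": {"12", "21"},
    "H4": {"3", "5"},
    "H5": {"21", "X"},
}
toy_positives = {"H1", "H2", "H3"}

print(concordant_chromosomes(toy_panel, toy_positives))  # ['12']
```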
To determine the precise location of the gene on chromosome 12, fluorescence in situ hybridization analysis was performed using the 2.3- and 3.8-kb genomic plasmid clones spanning the first and second introns of the gene, respectively. Two independent experiments were performed, and over 100 metaphase cells were evaluated. Signals were clearly seen on two chromatids of at least one chromosome band 12q22 in 25% of cells using the 2.3-kb plasmid clone (Fig. 5) and in 5% using the 3.8-kb plasmid clone (data not shown). No other chromosomal sites with consistent signals were detected in more than 1% of cells.
Analysis of Lumican Message Expression-Since all other members of the LRR-proteoglycan family are expressed in cartilage in an age-dependent manner, it was of interest to determine whether the lumican gene was expressed in a similar manner. To this end, RNA from freshly isolated human articular chondrocytes was analyzed by Northern blotting. RNA from adult chondrocytes showed a single species of 1.8 kb hybridizing with the lumican cDNA probe (Fig. 6). This is compatible with the size of the contiguous sequence reported for the lumican cDNA (1729 bp) and the presence of a single polyadenylation signal (Fig. 1). In contrast to the adult, little lumican message expression was apparent in RNA isolated from young juvenile chondrocytes. Identical results were obtained when a probe derived from the published bovine lumican cDNA sequence was used to analyze the Northern blots (data not shown).
The increase in lumican message expression between juvenile and adult chondrocytes is not unique to the lumican gene but is also apparent for the fibromodulin and decorin genes. While quantitation from Northern blots can be misleading due to variation in probe specific activity and multiple size messages, the current data strongly suggest that the relative abundance of the lumican message is much higher than that for the messages of the three other LRR-proteoglycan genes in adult articular chondrocytes (Fig. 6). The biglycan gene is the only family member whose gene shows decreased expression in the adult relative to the juvenile for articular cartilage.
Expression of the lumican gene is not confined to articular cartilage but is widespread in different tissues (Fig. 7). Expression in the adult was high in heart, placenta, skeletal muscle, kidney, and pancreas but low in brain, lung, and liver. Expression in kidney showed the same age trend as observed in articular cartilage, with low expression in fetal RNA and high expression in the adult. In contrast, the lung showed the opposite trend, with higher message expression in the fetus than the adult. In general, the expression of the lumican message resembled that of the fibromodulin and decorin messages in the adult but was quite distinct from that of the biglycan message, which was expressed at high levels in the lung and liver. In the fetus, differences were observed for the relative expression of each LRR-proteoglycan message in different tissues.
Analysis of Lumican in Articular Cartilage-To determine whether lumican is present in the extracellular matrix of human articular cartilage, tissue extracts were examined by immunoblotting following SDS-PAGE (Fig. 8). Lumican was detected in extracts from cartilage of all ages, but there were distinct age variations in both the relative abundance and size heterogeneity of the molecule. Lumican was most abundant in the adult cartilage extracts, where it exhibited a molecular size ranging from 55-80 kDa, with the vast majority of the molecules residing in the lower end of this range. In the juvenile cartilage extracts, lumican had a more restricted size variation, corresponding to the higher molecular size range present in the adult. In the neonate the sizes ranged from 70-80 kDa. The specificity of the immune reaction was established by pretreatment of the anti-lumican antiserum with the peptide used for immunization, which resulted in blocking all of the immunoreactivity (data not shown).
The molecular structure of the human cartilage lumican was further studied by investigating the effect of keratanase II and endo-β-galactosidase, which are capable of degrading sulfated and nonsulfated polylactosamine chains, respectively. Both enzymes reduced the size heterogeneity of the cartilage lumican, indicating substitution by keratan sulfate. However, the greatest effect was by endo-β-galactosidase, which yielded the same size product from all ages studied. This product had an average size of about 57 kDa and was equivalent in size to the major form of lumican detected in the adult cartilage extracts prior to enzyme treatment. This suggests that in adult cartilage much of the lumican exists in a form devoid of keratan sulfate or polylactosamine chains. These molecules are still presumably substituted by oligosaccharides, since their size is greater than that expected for a mature core protein of 320 amino acids (Fig. 2). The size difference would suggest that all four of the potential N-linked oligosaccharide attachment sites might be occupied, as occurs in fibromodulin (36). In the mature adult there is some evidence for fragmentation of the lumican core proteins, as small amounts of immunoreactive material are detected with sizes ranging between 20 and 50 kDa.

DISCUSSION
The human lumican message contains an open reading frame of 1014 bases giving rise to a deduced protein sequence of 338 amino acids. This is slightly shorter than the corresponding primary translation products for the bovine and chick lumican messages, which give rise to proteins of 342 and 343 amino acids, respectively (9, 10). In all cases signal peptides representing the first 18 amino acids have been predicted, and at least in the bovine the subsequent amino acid has been shown to represent the start of the mature protein isolated from cornea (10).
The coding region of the human lumican message shows 86 and 84% identity at the nucleotide and amino acid levels, respectively, with the equivalent bovine sequences, but only 65 and 67% identity at the nucleotide and amino acid levels, respectively, with the equivalent chick sequences. The 5′- and 3′-untranslated regions for the lumican messages for the three species are also of similar size, giving rise to single components on Northern blotting of between 1.8 and 2.0 kb.
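Identity figures of this kind are derived from pairwise alignments. The text does not specify the alignment method or how gaps were counted, so the following is only a sketch of one common convention: identical positions divided by the number of aligned columns in which neither sequence has a gap. The sequences are invented toy strings and the function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = [(a, b) for a, b in zip(aligned_a, aligned_b)
                if a != gap and b != gap]
    if not compared:
        return 0.0
    matches = sum(1 for a, b in compared if a == b)
    return 100.0 * matches / len(compared)


# Invented pre-aligned toy sequences, not lumican data.
print(percent_identity("ATGCATGCAT", "ATGCTTGCAA"))  # 80.0
```

Other conventions (for example, dividing by the length of the shorter sequence or counting gap columns as mismatches) give somewhat different values, which is worth bearing in mind when comparing identity figures between reports.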
The coding sequence of the human lumican message is shorter than that for the three other human LRR-proteoglycan messages, which encode proteins of 376, 368, and 359 amino acids for fibromodulin, biglycan, and decorin, respectively (25,27). This difference is reflected mainly in the number of amino acids preceding the N-terminal conserved disulfide-bonded domain in the molecules. The coding region for the lumican message shows 53 and 47% identity at the nucleotide and amino acid levels, respectively, with the equivalent region of the fibromodulin message but lower identity with the decorin and biglycan messages, where identity at the amino acid level decreases to 36% in both cases. The size of the human lumican message is similar to that of decorin but is smaller than those for biglycan and fibromodulin, which are about 2.6 and 3.0 kb, respectively (27,37), due mainly to variation in the length of the 3′-untranslated regions.
The different members of the LRR-proteoglycan family show considerable conservation of amino acid sequence with respect to the presence of 10 leucine-rich repeats flanked by cysteine-rich domains. This homology is greatest in the KS-PG members of the family, which also show conservation of four asparagine residues that can act as potential sites for N-linked oligosaccharide substitution in the leucine-rich repeat region. In the case of bovine fibromodulin (38), it has been shown that each of these sites can also be occupied by an N-linked keratan sulfate chain (36). One would therefore predict that all members of this KS-PG family may be substituted with keratan sulfate at the equivalent sites, although not all sites need be occupied on a given molecule.
The greater homology of lumican with fibromodulin than with decorin or biglycan at both the protein and glycosaminoglycan levels also extends to gene organization. Both the human lumican and fibromodulin genes are composed of three exons, with the first intron just preceding the translation initiation site and the second intron just preceding the termination codon (25). In the case of the human fibromodulin gene, the first intron is only 4 bases prior to the initiation codon, whereas this separation increases to 21 bases in the lumican gene. The intron itself is also smaller in the fibromodulin gene relative to the lumican gene, with sizes of about 1.0 and 2.2 kb, respectively. The second intron resides 150 bases prior to the termination codon in the human fibromodulin gene, whereas the separation is 152 bases in the lumican gene. This two-base difference results in fibromodulin having a phase 0 intron that separates two distinct codons, whereas lumican has a phase 1 intron that divides a single codon. The second intron in fibromodulin is larger than that in lumican, with sizes of about 5 and 3.5 kb, respectively. The differences in intron size and a longer exon 3 encoding the 3′-untranslated region make the entire fibromodulin gene about 1 kb longer than the lumican gene.
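The link between the two-base difference and the intron phase can be made explicit. If the coding sequence ends on a codon boundary immediately before the termination codon, then the number of coding bases downstream of the intron determines the phase. The sketch below works through this for both genes and also tallies the intron sizes quoted above; it assumes the quoted distances count coding bases up to, but not including, the termination codon. The function name is hypothetical.

```python
def phase_from_distance_to_stop(coding_bases_after_intron):
    """Intron phase given the coding bases between intron and stop codon.

    A remainder r (mod 3) means r bases of a split codon lie downstream of
    the intron, so 3 - r lie upstream; r == 0 means no codon is split.
    """
    return (3 - coding_bases_after_intron % 3) % 3


print(phase_from_distance_to_stop(150))  # fibromodulin -> 0
print(phase_from_distance_to_stop(152))  # lumican -> 1

# Approximate intron sizes in kb (first intron, second intron).
INTRONS_KB = {"fibromodulin": (1.0, 5.0), "lumican": (2.2, 3.5)}
intron_difference_kb = round(
    sum(INTRONS_KB["fibromodulin"]) - sum(INTRONS_KB["lumican"]), 1
)
print(intron_difference_kb)  # 0.3
```

On these approximate figures the introns account for only about 0.3 kb of the difference in gene length, which would leave the longer third exon of fibromodulin to contribute most of the remainder.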
The human biglycan gene is of a size similar to that of the fibromodulin and lumican genes, but it is quite different in organization, being composed of eight exons (39). Here the majority of the coding region is composed of six exons in contrast to the single exon used in fibromodulin and lumican. The human decorin gene also consists of eight exons and has an organization similar to that of the biglycan gene (40,41). It is, however, much larger than the genes of the other family members, spanning at least 38 kb of genomic DNA (40).
Although the lumican and decorin genes show the greatest differences in genomic organization, they appear to reside quite close to one another on human chromosome 12. In this work the lumican gene has been shown to reside on chromosome 12q22, whereas others have shown that the human decorin gene resides between regions 12q21.3 (41) and 12q23 (40). The human biglycan and fibromodulin genes are on distinct chromosomes, with the biglycan gene residing on chromosome Xq28 (42) and the fibromodulin gene on chromosome 1q32 (36). The region of chromosome 12 encompassing the lumican and decorin genes is of interest, since it has been shown to be the locus for Holt-Oram syndrome (43). This is an autosomal dominant condition that causes skeletal abnormalities, particularly in the upper limbs, and cardiac abnormalities, which led to the more descriptive name of heart-hand syndrome. The cause of this disorder is unknown, but lumican can now be added as a candidate gene.
In articular cartilage, lumican message expression is higher in the adult than in the young juvenile, a trend also exhibited by the messages for fibromodulin and decorin but not that for biglycan, which shows the opposite trend with age. This trend may be related to the different roles played by the proteoglycans, since unlike the other family members biglycan localizes to the pericellular matrix rather than the more remote matrix rich in collagen fibrils (44). It is also apparent that lumican, fibromodulin, and decorin are not expressed in a similar age-related manner by all tissues, and this probably reflects differences in the functional properties of the proteoglycans and the functional needs of the tissues.
The higher level of lumican message expression in the adult chondrocytes is mirrored by a higher level of lumican residing in the adult cartilage matrix. However, there are distinct differences in the matrix form of cartilage lumican between adult and juvenile tissues. In the juvenile cartilage all the lumican exists as a keratan sulfate proteoglycan, being substituted with sulfated polylactosamine chains. In the adult, by contrast, most of the lumican exists as a glycoprotein form devoid of polylactosamine chains. Such glycoprotein forms of lumican have previously been described in early embryonic cornea, prior to later substitution with keratan sulfate (17), but they have not been previously described in mature connective tissues. The predominance of the glycoprotein form in adult cartilage is unlikely to be the product of glycosidase action within the extracellular matrix, since the keratan sulfate chains of aggrecan are present and are longer in the adult than in the juvenile (45). One therefore presumes that with age the human articular chondrocytes switch from the synthesis of a proteoglycan form of lumican to a glycoprotein form. The reason for this switch is not obvious, and neither is its functional effect on the tissue. It is interesting to note, however, that glycoprotein forms of fibromodulin, the other KS-PG member of the LRR-proteoglycan family, have also been shown to exist in mature bovine cartilage (46).