Characterization and Expression of the Mouse Lumican Gene*

Lumican is one of the major keratan sulfate proteoglycans (KSPG) in vertebrate corneas. We previously cloned the murine lumican cDNA. This study determines the structure of the murine lumican gene (Lum) and its expression during mouse embryonic development. The mouse lumican gene was isolated from a bacterial artificial chromosome mouse genomic DNA library and characterized by polymerase chain reaction and Southern hybridization. The lumican gene spans 6.9 kilobase pairs of the mouse genome. The gene consists of three exons and two introns. Exon 1 constitutes 88 bases (b) of untranslated sequence. Exon 2 is 883 b and contains most of the coding sequence of lumican mRNA, and exon 3 has 152 b of coding sequence and 659 b of 3′ noncoding sequence. The mouse lumican gene has a TATCA element, a presumptive TATA box, located 27 b upstream of the transcription initiation site. Northern hybridization and in situ hybridization indicate that in early stages of embryonic development, at day 7 post coitus, the embryo expresses little or no lumican. Thereafter, different levels of lumican mRNA can be detected in various organ systems, such as cornea stroma, dermis, cartilage, heart, lung, and kidney. The cornea and heart are the two tissues with the highest expression in the adult. Immunoblotting studies found that KSPG core proteins became abundant in the cornea and sclera by postnatal day 10 but that sulfated KSPG could not be detected until after the eyes open. These results indicate that lumican is widely distributed in most interstitial connective tissues. The modification of lumican with keratan sulfates in cornea is concurrent with eye opening and may contribute to corneal transparency.

Corneal strength and transparency depend upon the development and maintenance of an organized extracellular matrix, including collagen fibrils of uniformly small diameter arranged in lamellae with consistent interfibrillar spacing. The collagen fibrils of adjacent lamellae are perpendicular to one another (1,2).
The mechanism that governs the formation of collagen lamellae in cornea stroma is not well understood. It has been suggested, however, that the ratios of the different collagen types that make up the fibrillar corneal collagen, together with other specialized extracellular matrix components, e.g. proteoglycans and glycoproteins, are essential for the development of a transparent cornea (1,3-8). In addition to interacting with collagen fibrils, proteoglycans in the stroma also play a role in corneal hydration because of the high negative charge of their sulfated carbohydrate moieties (9-11).
Lumican belongs to the family of small leucine-rich proteoglycans (SLRPs) that includes decorin, biglycan, fibromodulin, keratocan, epiphycan, and osteoglycin (20). Each of these proteoglycans possesses 6-10 leucine-rich repeating units between the flanking cysteine-rich disulfide-bonded domains at the N and C termini of the core protein. The presence of a common structural motif implies that these proteoglycans may share common functional properties. One such common function is thought to be the interaction with fibrillar collagen. The tissue distribution of each proteoglycan is distinct; therefore, it is likely that each family member fulfills a different role in connective tissues (20-22). For example, lumican exists as a proteoglycan only in cornea; it is a glycoprotein in other connective tissues (11,14,15,23-25). The presence of sulfated lumican molecules in cornea suggests that in this tissue lumican may have unique functions, e.g. maintaining corneal transparency; however, its role in noncorneal tissues remains elusive.
Mouse lumican is a 338-amino acid protein with high sequence homology to bovine, human, and chicken lumican (16-18,26). To examine the structure-function relationship of the mouse lumican gene using transgenic mice and site-directed mutagenesis techniques, it is imperative to isolate and characterize the mouse lumican cDNA and genomic DNA and to determine the spatial-temporal expression of the lumican gene during mouse development. In the present studies, we have cloned and determined the primary structure of the mouse lumican gene (Lum). In situ and Northern hybridization were used to determine the temporospatial expression of Lum. Lumican isolated from eye shells (cornea plus sclera) at various developmental stages was also biochemically characterized. Our results indicate that lumican is widely expressed in a variety of connective tissues. Sulfation of lumican in cornea occurs concomitantly with eye opening and therefore may be an essential step in providing corneal transparency.

MATERIALS AND METHODS
Isolation and Characterization of Mouse Lumican Genomic DNA-A pair of primers, sense 5′-CATGTATGGGCAAATATC and antisense 5′-TGTAGAAGGTTGTGGTCA (16), derived from mouse lumican cDNA was used in polymerase chain reaction to screen a mouse bacterial artificial chromosome genomic DNA library (Research Genetics, Inc., Huntsville, AL). A positive clone of 200 kb was isolated. The clone was characterized by restriction enzyme digestion and Southern blot hybridization with 32P-labeled lumican cDNA. A 6-kb SalI-XbaI fragment and an 8-kb XbaI-XbaI fragment, together encoding the full-length cDNA, were subcloned into pBSSK vector (Stratagene, La Jolla, CA). The nucleotide sequence (both strands) of the lumican gene was determined by the dideoxynucleotide method by the DNA core in the Department of Molecular Genetics at University of Cincinnati.
S1 Nuclease Protection Assay-A 20-base antisense primer 5′-CTCTCTTGACACTGTTTCCG-3′ complementary to exon 1 (bases 65-84) and a phosphorylated universal sequence primer 5′-GGAAACAGCTATGACCATG-3′ were used to prepare a 1-kb DNA fragment by polymerase chain reaction using a 5′ SacI/XbaI fragment of the mouse lumican genomic DNA clone as template (Fig. 1). The sense strand was then degraded with exonuclease using a procedure recommended by the manufacturer (Boehringer Mannheim). The antisense strand was 5′ end labeled with [γ-32P]ATP by T4 polynucleotide kinase (New England Biolabs, Inc., Beverly, MA). The labeled DNA was hybridized to 100 μg of mouse poly(A)+ mRNA prepared from newborn mice at postnatal day 1 (P1). This reaction was then digested with 300 units of S1 nuclease (27). Yeast total RNA subjected to the same treatment was used as a control. The digested product was analyzed on an 8% denaturing acrylamide sequencing gel. To identify the transcription initiation nucleotide, a sequencing reaction using the same antisense primer mentioned above and the mouse lumican 5′ genomic DNA fragment as template was prepared with Sequenase® (U. S. Biochemical Corp.).
RNA Primer Extension Assay-A 41-nucleotide antisense primer, complementary to exon 1 (bases 44-84), was 5′ end labeled with [γ-32P]ATP by T4 polynucleotide kinase (New England Biolabs, Inc.) and hybridized to 100 μg of mouse poly(A)+ mRNA. Primer extension was performed as described previously (27). The reaction products were analyzed on the same sequencing gel as the one used in the S1 protection assay.
Northern Hybridization-For determination of lumican mRNA in whole embryos, a premade blot containing 2 μg of poly(A)+ RNA from embryos at different prenatal developmental stages, separated by electrophoresis on a 1% denaturing agarose gel, was purchased from CLONTECH Labs (Palo Alto, CA). The blot was probed with 32P-labeled mouse lumican cDNA as described previously (14,27). For tissue-specific examination, total RNAs were extracted from mouse tissues using TRI-reagent® (Molecular Research Center, Cincinnati, OH) as described previously (16). 10 μg of total RNA was electrophoresed in 1.3% agarose containing 2 M formaldehyde buffered with TBE (Tris/Borate/EDTA). The RNAs were then transferred to Magna-Charged membranes® and hybridized with 32P-labeled lumican and mouse GAPDH (glyceraldehyde 3-phosphate dehydrogenase) cDNA probes in a hybridization solution containing 50% formamide at 41°C overnight as described previously (16,27). The free 32P probes were removed by stringent washing three times with 0.1× SSC (1× SSC: 0.15 M NaCl, 15 mM citrate buffer, pH 7.0) and 1% SDS at 65°C for 30 min each. The hybridization signals were detected with a PhosphorImager. The amounts of lumican mRNA were normalized to the GAPDH mRNA in each sample.
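The normalization of lumican signal to GAPDH described above can be sketched as a simple per-lane ratio. This is a minimal illustration only: the function name `normalize_to_gapdh` and the band intensities are hypothetical and are not taken from the study, which does not specify how the PhosphorImager values were processed.

```python
def normalize_to_gapdh(lumican_signal: float, gapdh_signal: float) -> float:
    """Return the lumican band intensity relative to the GAPDH band of the same lane."""
    if gapdh_signal <= 0:
        raise ValueError("GAPDH signal must be positive to serve as a loading control")
    return lumican_signal / gapdh_signal


# Hypothetical PhosphorImager band volumes (arbitrary units), for illustration only.
lanes = {
    "sample A": {"lumican": 1200.0, "gapdh": 800.0},
    "sample B": {"lumican": 450.0, "gapdh": 900.0},
}

for name, band in lanes.items():
    ratio = normalize_to_gapdh(band["lumican"], band["gapdh"])
    print(f"{name}: lumican/GAPDH = {ratio:.2f}")
```

Expressing each lane as a ratio makes samples comparable even when different amounts of RNA were loaded, provided GAPDH expression itself is stable across the tissues and stages compared.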
In Situ Hybridization-To identify the cell types that express lumican, the mouse tissues were fixed with 4% paraformaldehyde and embedded in paraffin as described previously (28). Antisense and sense digoxigenin-labeled riboprobes (Boehringer Mannheim) of lumican were synthesized and used for in situ hybridization on paraffin sections (5-7 μm) mounted on Superfrost/Plus microscope slides (Fisher). To remove nonspecifically bound probes, slides were subjected to a stringent wash in 0.5× SSC at 65°C and treated with 20 μg/ml of RNase (Sigma) at room temperature for 1 h, followed by washing with 0.2× SSC at 65°C as described previously (28). The hybridization signals were visualized with anti-digoxigenin antibody-alkaline phosphatase conjugates using procedures recommended by Boehringer Mannheim.
Characterization of Keratan Sulfate Proteoglycans-Proteoglycans were isolated from eye shells (cornea plus sclera) of C57BL mice at prenatal day 18, postnatal days 1, 10, and 20, and 1 year of age in 20 volumes of a solution containing 4 M guanidine HCl and protease inhibitors as described by Funderburgh et al. (15). The tissue residue was collected by centrifugation and re-extracted for 12 h at 4°C. The combined supernatants were dialyzed against 6 M urea, 0.02 M Tris-HCl, pH 8. Proteoglycans were adsorbed on a 2-ml column of DEAE-Sepharose Fast Flow (Sigma) equilibrated in the same buffer. The column was washed with 0.1 M NaCl in the same buffer, and proteoglycans were eluted with 4 M guanidine HCl, 0.02 M Tris-HCl, pH 8. Dermatan sulfate-containing proteoglycans were precipitated in 50% ethanol for 14 h at −20°C, and keratan sulfate-containing proteoglycans were precipitated by the further addition of ethanol to the supernatant to make 75% (v/v). This procedure isolated both sulfated and unsulfated forms of lumican (23). The precipitated proteoglycans were collected by centrifugation, rinsed in 80% ethanol, dried in vacuum, and dissolved in 0.2 ml of 0.1 M Tris-phosphate, pH 6.8. Approximately 10 μg of protein of the corneal KSPG, either untreated or digested with 0.01 unit each of endo-β-galactosidase and keratanase II for 2 h at 37°C, was separated on a 3-12% gradient SDS-polyacrylamide gel. The gel was fixed in 25% isopropanol, 5% acetic acid for 2 h, stained overnight in 0.025% Alcian blue in 25% isopropanol, 5% acetic acid, and destained in the same solution. KSPG core proteins were released by digestion of 5 μg of KSPG protein with endo-β-galactosidase as described above and then separated on an 8-16% gel. The proteins were transferred to polyvinylidene difluoride membranes (Millipore, Bedford, MA), and KSPG proteins were detected using anti-bovine KSPG antibodies as described previously (14).
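The two-step ethanol fractionation (50%, then 75% v/v) implies a simple dilution calculation for the second addition. The sketch below assumes ideal mixing (no volume contraction) and a stock of absolute ethanol; neither the stock concentration nor the volumes used are stated in the text, and the function name `ethanol_to_add` is hypothetical.

```python
def ethanol_to_add(volume: float, start_fraction: float,
                   target_fraction: float, stock_fraction: float = 1.0) -> float:
    """Volume of ethanol stock needed to raise a solution from start to target (v/v).

    Solves (start*V + stock*x) / (V + x) = target for x, assuming volumes are additive.
    """
    if not 0 <= start_fraction < target_fraction < stock_fraction <= 1:
        raise ValueError("require 0 <= start < target < stock <= 1")
    return volume * (target_fraction - start_fraction) / (stock_fraction - target_fraction)


# Assumed example: 1 ml of 50% ethanol supernatant brought to 75% with absolute ethanol.
extra = ethanol_to_add(1.0, 0.50, 0.75)
print(f"Add {extra:.2f} ml of absolute ethanol per ml of supernatant")
```

Under these assumptions, one volume of absolute ethanol per volume of 50% supernatant gives 75% (v/v); a less concentrated stock would require proportionally more.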

RESULTS
Structure of Mouse Lum Gene-The 200-kb genomic DNA clone isolated from a mouse bacterial artificial chromosome genomic DNA library was characterized by Southern hybridization with 32P-labeled 5′ and 3′ end fragments of mouse lumican cDNA. The full-length mouse lumican gene spans 6.9 kb of the mouse genome. Fig. 1 illustrates that the murine Lum gene has three exons and two introns. The genomic DNA SalI/XbaI and XbaI/XbaI fragments were subcloned, and the nucleotide sequences of both strands were determined by the dideoxynucleotide method. Fig. 2 shows the primary structure of the lumican gene and the amino acid sequence deduced from the exons. The mouse Lum gene does not have a conventional TATA box; instead it has a TATCA element, a presumptive TATA box, located 27 bases upstream of the transcription initiation site. There is a 30-b GC-rich element at about −70 b relative to the transcription initiation site. The first translation initiation ATG codon is located at the 21st base from the beginning of exon 2. Thus, exon 1 encodes the 5′ end untranslated region of lumican mRNA. Exon 2 (883 bp) encodes most of the amino acids in lumican, and exon 3 (811 bp) encodes 51 C-terminal amino acid residues and a 659-bp 3′ end untranslated region. Introns 1 and 2 are 2342 and 2797 bp, respectively. All exon-intron junctions consist of the consensus splice signal dinucleotides GT (donor) and AG (acceptor). The polyadenylation signal AATAAA is found 635 bp downstream from the termination codon TAA in exon 3.
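As a consistency check on the reported gene structure, the exon and intron lengths can be summed and compared with the stated 6.9-kb span. This is a minimal sketch that uses only the lengths given in the text; the variable and function names are illustrative, and it assumes the span is measured from the start of exon 1 to the end of exon 3.

```python
# Lengths in base pairs, as reported in the text.
EXONS = {"exon 1": 88, "exon 2": 883, "exon 3": 811}
INTRONS = {"intron 1": 2342, "intron 2": 2797}


def gene_span(exons: dict, introns: dict) -> int:
    """Total genomic length covered by all exons and introns."""
    return sum(exons.values()) + sum(introns.values())


def spliced_length(exons: dict) -> int:
    """Length of the spliced transcript (exons only, excluding the poly(A) tail)."""
    return sum(exons.values())


span = gene_span(EXONS, INTRONS)
print(f"Gene span: {span} bp (about {span / 1000:.1f} kb)")
print(f"Spliced transcript: {spliced_length(EXONS)} b")

# Exon 3 is reported as 152 b of coding sequence plus 659 b of 3' untranslated sequence.
assert 152 + 659 == EXONS["exon 3"]
```

The sum comes to 6921 bp, in agreement with the 6.9 kb reported for the gene.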
Identification of Transcription Initiation Site-Fig. 3 illustrates PhosphorImager images of 32P-labeled products derived from primer extension and S1 nuclease protection assays. Both assays demonstrate that the transcription initiation site is the T nucleotide located 2450 b 5′ to the translation initiation codon ATG and 27 b 3′ of the TATA box-like element TATCA. The position was determined by comparison with a sequencing ladder of lumican genomic DNA generated with the same primer that was used to prepare the S1 protection probe (Figs. 2 and 3).
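The 2450-b distance between the transcription initiation site and the ATG follows from the gene structure: all of exon 1, all of intron 1, and the 20 bases of exon 2 that precede the ATG (which begins at the 21st base). The sketch below only restates that arithmetic; the function names are hypothetical.

```python
EXON1_LEN = 88        # b, entirely 5' untranslated
INTRON1_LEN = 2342    # bp
ATG_POS_IN_EXON2 = 21  # 1-based position of the A of the ATG within exon 2


def genomic_distance_tss_to_atg(exon1: int, intron1: int, atg_pos: int) -> int:
    """Bases in genomic DNA between the transcription start (+1) and the ATG."""
    return exon1 + intron1 + (atg_pos - 1)


def five_prime_utr_length(exon1: int, atg_pos: int) -> int:
    """Length of the 5' untranslated region in the spliced mRNA."""
    return exon1 + (atg_pos - 1)


print(genomic_distance_tss_to_atg(EXON1_LEN, INTRON1_LEN, ATG_POS_IN_EXON2))
print(five_prime_utr_length(EXON1_LEN, ATG_POS_IN_EXON2))
```

The genomic distance evaluates to 2450 b, matching the mapped position, and the same numbers imply a 108-b 5′ untranslated region in the spliced mRNA.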
Analysis of Mouse Lumican mRNA Expression-To predict the possible phenotypes of lumican mutant mice, it is important to determine the temporal and spatial expression of mouse lumican during development. Northern hybridization and in situ hybridization indicate that in early stages of embryonic development, before day 7 post coitus (PC), the embryo expresses very little lumican, if any (data not shown). Fig. 4A shows that the poly(A)+ RNA isolated from 7-day-old embryos contains very little lumican mRNA (lane 1). The levels of lumican mRNA in whole embryos increase substantially at 11 days PC and remain high afterward. To further elucidate the expression of lumican mRNA by various tissues during mouse development, Northern hybridization was performed with total RNAs isolated from cornea, heart, skin, muscle, lung, and kidney. Fig. 4B shows that the levels of lumican mRNA relative to GAPDH mRNA in cornea and skin are higher at embryonic days 16 and 18 than at postnatal days 1, 3, 7, 14, and 21, whereas in heart the reverse pattern is observed. The skeletal muscles express lumican mRNA at a lower but constant level. Lung and kidney have very low levels of lumican expression throughout the periods examined (data not shown).
FIG. 3. Identification of transcription initiation site of the mouse lumican gene by S1 nuclease protection and primer extension analysis. A, schematic representation of probes for S1 nuclease mapping. B, primer extension assay. C, the products of both the primer extension (lane 1) and S1 protection (lane 2) assays are compared with the PhosphorImager image of the sequence obtained from lumican genomic DNA using the same primer used in preparation of the S1 protection probe as described under "Materials and Methods." The T residue (*T (+1)) located 27 b downstream from the TATCA element, the presumptive TATA box, is recognized as the transcription initiation site. There is no product in the S1 assay using yeast total RNA subjected to the same treatment (lane 3).
To determine the cell types that express lumican mRNA, tissue sections prepared from mouse embryos and newborn mice were hybridized with digoxigenin-labeled riboprobes as described under "Materials and Methods." The in situ hybridization shows that the stroma cells in mouse cornea start to express lumican at embryonic day 12 (E12). Fig. 5 shows the lumican expression in cornea stroma cells at E14, postnatal day 1 (P1), and in the adult mouse. The in situ hybridization also detected the expression of lumican mRNA in several other organs, such as skin, heart, lung, and kidney. Most of these organs start to express lumican mRNA at E12. Fig. 6 shows that the lumican mRNA is expressed by dermal fibroblasts, cardiac muscle cells, the kidney glomerular cells, alveolar epithelial cells, endothelial cells, and fibroblasts in the lung. Table I [...] dermis. From day 14 PC, the dermis expresses lumican mRNA, but the epidermis does not. At the early developmental stages, E12 and E14, retina also expresses lumican mRNA (Fig. 5 and Table I). It is of interest to note that almost all interstitial cells of the various tissues examined express lumican mRNA after 12 days of gestation.
Accumulation of Sulfated Lumican in Cornea and Sclera-KSPG core proteins were released by treatment with endo-β-galactosidase from extracts of cornea and sclera and then detected by Western blotting using anti-KSPG antibodies. As shown in Fig. 7A, at postnatal day 1, very little KSPG core protein was detected with this antibody. At day 10 and thereafter, there were increasing amounts of the KSPG core proteins. Sulfated KS-containing proteoglycans were detected on SDS-polyacrylamide gels with Alcian blue, a dye that stains the sulfated glycosaminoglycan chains of the KSPG molecule. At postnatal day 1 sulfated KSPG could not be detected (Fig. 7B), but by day 10 trace amounts of KSPG smaller in size than that of adult mice were observed. A marked increase in the amount and size of KSPG occurred between days 10 and 20, with the amount at day 20 near that of 1-year-old mice. Predigestion of the KSPG fraction with keratanase II, an enzyme specific for sulfated moieties in the KS chains, abolished Alcian blue staining of the specimens. These observations indicate that cornea and sclera do not accumulate significant amounts of sulfated KSPG prior to 10 days after birth, a period in which the eyes of newborn mice are still closed.

DISCUSSION
The mouse lumican belongs to the SLRP family and has all the features of SLRPs: a central domain of leucine-rich repeats flanked by N- and C-terminal domains with highly conserved cysteines (16,18,20,26,29). The structure of the mouse Lum gene is similar to that of fibromodulin, which has three exons, with the second exon encoding all ten leucine-rich repeats (30). Both differ from decorin and biglycan, two members of another class of the SLRP family, whose genes are composed of eight distinct exons and whose products carry chondroitin sulfate/dermatan sulfate chains (20,31,32).
The mouse lumican gene does not contain a conventional TATA box; rather, an unusual TATCA box is present 27 b upstream of the transcription initiation site. About 5% of genes that have been characterized use this TATA box-like element for recognition of transcription initiation (33). In addition, a 30-nucleotide GC-rich stretch is located at −70 b from the transcription start site (Fig. 2). It has been suggested that the presence of the GC-rich stretch may facilitate the recognition of a weak TATA box-like element, such as TATCA, by transcription factors. For example, the DNA-binding factor SP1 recognizes GC-rich sequences, and genes lacking the TATA box may rely on these GC-rich sites and the proteins that bind to them to start transcription (34).
It has been speculated that SLRPs are regulators of tissue morphogenesis and cellular differentiation in vivo through their contribution to the organization of extracellular matrix in connective tissues (20,35,36). The deposition of a collagenous matrix with small and uniform fibril diameters and uniform interfibrillar spacings is essential for the development and maintenance of transparent cornea (37). During corneal wound healing and the pathogenesis of corneal macular dystrophy, it has been noted that there are changes in cornea stromal proteoglycan contents that may account for the formation of opaque corneas (38-42). In the opaque corneal scar tissues, the amount of KSPG is reduced, and the amount of dermatan sulfate/chondroitin sulfate proteoglycan is increased. A return to normal KSPG is observed upon restoration of corneal transparency (39). The corneal macular dystrophy is characterized by the alteration of the metabolism of sulfated glycosaminoglycan chains, possibly due to a deficiency in the catabolic dermatan sulfate/chondroitin sulfate proteoglycan enzyme, α-L-iduronidase, or the anabolic enzyme, sulfotransferase (41,43,44).
Our data revealed that there is a marked increase of KSPG core proteins in the mouse cornea at postnatal day 10 compared with that of day 1. Lumican becomes modified with keratan sulfate side chains at postnatal day 20 (Fig. 7). This is consistent with previous studies showing that KSPGs exist in a polylactosamine form and become sulfated keratan sulfate proteoglycans during the embryonic development of the chick cornea at about day 15, when the cornea begins to become transparent (45,46). Studies of changes in KSPG during chick corneal development suggest that both the core protein of lumican and its keratan sulfate side chains are important to the development of corneal transparency (45-47). Sulfation of the KS chain lags behind the core protein accumulation by about 10 days. Sulfated KSPG does not begin to accumulate until day 10, reaching near maximum levels at about day 20. During this time the mouse eyes open. This observation is consistent with the study by Cai et al. (48), who showed about a 2-3-fold increase in the mRNA level of chicken β-1,4-galactosyltransferase, an enzyme involved in the synthesis of the KS chain backbone, during embryonic days 8-13, the period when extracellular KS is first detected and begins an exponential accumulation in the growing stroma (46). The chicken corneas achieve transparency during this time. Thus, our observations and those of others are consistent with the notion that lumican and other proteoglycans may play an important role in the development and maintenance of corneal transparency.
It is of interest to note that the levels of lumican mRNA in developing corneas decrease in newborn mice and then remain at a relatively constant level in proportion to GAPDH mRNA, as compared with those of embryonic corneas at days 16 and 18 PC (Fig. 4B). Interestingly, the maximum accumulation of lumican protein in the tissue does not correspond with the peak of lumican mRNA. Possibly, the synthesis and secretion of lumican may be regulated at translational and/or post-translational levels, similar to what has been reported in collagen biosynthesis (19,49). Alternatively, the accumulation of lumican in tissues may be greatly enhanced due to an increased stability conferred by the attachment of KS glycosaminoglycan chains to the core protein. Further studies are needed to examine these possibilities.
Lumican bearing keratan sulfate side chains may be limited to the cornea; intriguingly, lumican is also present in a variety of noncorneal tissues, e.g. cartilage, heart, lung, skin, and kidney, as a smaller, more homogeneous, poorly sulfated or nonsulfated glycoprotein (11,23-25). This unsulfated molecule may play important roles that are yet to be identified in the maintenance of normal tissue functions. It seems likely that lumican in noncorneal tissues is a regulator of collagen fibrillogenesis, as it appears to be in cornea. Thus, ablation of the lumican gene may produce phenotypes not only in the cornea but also in multiple other organ systems whose functions are compromised in homozygous lumican-deficient mice.
FIG. 7. Western blot of KSPG proteins from mouse cornea and sclera at different ages. Extracts of mouse cornea and sclera at different ages after birth were subjected to ion exchange and alcohol precipitation to produce a fraction that contains KSPG proteins bearing both unsulfated and sulfated carbohydrate chains as described under "Materials and Methods." A, 5 μg of this protein was treated with endo-β-galactosidase to remove the KS chains, and the free core proteins were detected by immunoblotting with KSPG antibody after SDS-polyacrylamide gel electrophoresis as described under "Materials and Methods." Only the 48-kDa band reacted with the antibody. B, sulfated KSPG in cornea and sclera. KSPG was isolated from eye shells (cornea and sclera). 10 μg of protein was electrophoresed on a 3-12% SDS-polyacrylamide gel and then stained with Alcian blue as described under "Materials and Methods." Far right-hand lane, KSPG from 1-year-old mice was pretreated with endo-β-galactosidase and keratanase II before electrophoresis. The numbers over the lanes refer to the age in days postnatal.