Osteoadherin, a cell-binding keratan sulfate proteoglycan in bone, belongs to the family of leucine-rich repeat proteins of the extracellular matrix.

Osteoadherin is a recently described bone proteoglycan containing keratan sulfate. It promotes integrin (αvβ3)-mediated cell binding (Wendel, M., Sommarin, Y., and Heinegård, D. (1998) J. Cell Biol. 141, 839-847). The primary structure of bovine osteoadherin has now been determined by nucleotide sequencing of a cDNA clone from a primary bovine osteoblast expression library. The entire translated primary sequence corresponds to a 49,116-Da protein with a calculated isoelectric point for the mature protein of 5.2. The dominating feature is a central region consisting of 11 B-type, leucine-rich repeats ranging in length from 20 to 30 residues. The full, primary sequence contains four putative sites for tyrosine sulfation, three of which are at the N-terminal end of the molecule. There are six potential sites for N-linked glycosylation present. Osteoadherin shows highest sequence identity, 42%, to bovine keratocan and 37-38% identity to bovine fibromodulin, lumican, and human PRELP. Unique to osteoadherin is the presence of a large and very acidic C-terminal domain. The distribution of cysteine residues resembles that of other leucine-rich repeat proteins except for two centrally located cysteines. Northern blot analysis of RNA samples from various bovine tissues showed a 4.5-kilobase pair message for osteoadherin to be expressed in bone only. Osteoadherin mRNA was detected by in situ hybridization in mature osteoblasts located superficially on trabecular bone.

Connective tissues are dominated by an extensive extracellular matrix. In bone, the matrix is dominated by mineral in the form of hydroxyapatite. The mineral crystals are aligned along the fibrils of collagen I (1), which is the predominant organic constituent (2). The extracellular matrix of bone also contains small chondroitin sulfate proteoglycans like decorin and biglycan, as well as other non-collagenous proteins (for review, see Ref. 3). The role of decorin and biglycan in the bone tissue is still unclear. Decorin has high affinity for type I collagen (4) and has been proposed to be involved in mineralization. In histochemical studies, it has been shown that decorin disappears from the collagen fibrils in bone before or during the mineralization of the tissue (5). However, in a recently described inactivation of the decorin gene in mice, no skeletal changes could be detected (6). Biglycan has a different distribution pattern than decorin and does not appear associated with fibrils of collagen I. Biglycan can be found in the osteoid (7) and in preosteogenic cells, suggesting a function in early bone formation.
Osteoadherin has been isolated as a minor, leucine- and aspartic acid-rich keratan sulfate proteoglycan found in the mineralized matrix of bone (24). The protein is rather acidic and binds well to hydroxyapatite. Interestingly, the protein can bind osteoblasts via the αvβ3 integrin in a cation-dependent manner. In order to further characterize osteoadherin, we have determined its primary structure. This reveals that osteoadherin belongs to the LRR family of connective tissue proteins. Osteoadherin is primarily expressed by mature osteoblasts, as shown in studies of its expression by in situ hybridization.

MATERIALS AND METHODS
Protein Purification-Osteoadherin was isolated from the mineral compartment of bovine bone as described by Wendel et al. (24).
Peptide Isolation, Amino Acid Sequencing, and N-terminal Sequencing-Peptides were isolated from digests of purified protein in polyacrylamide gel with sequencing-grade trypsin (Promega) as originally described by Rosenfeld (25) and further developed by Hellman et al. (26). Peptides were separated by reversed-phase HPLC on an RPC C2/C18 column using a SMART system (Amersham Pharmacia Biotech) with a gradient of 0-40% acetonitrile over 160 min. Peptide peaks were collected and sequenced on an Applied Biosystems 477A automated sequencer with on-line analysis of phenylthiohydantoin derivatives on an Applied Biosystems 120A microbore HPLC. N-terminal sequencing of osteoadherin was performed on intact protein by standard methods.
RNA Extraction-Primary bovine osteoblasts were prepared using the method of Robey and Termine (27), and total RNA was extracted with guanidine isothiocyanate essentially according to Adams et al. (28). Chondrocytes from bovine tracheal cartilage were isolated by collagenase digestion (29). Total RNA from these cells was extracted by the same method. Various tissues from an approximately second-trimester bovine fetus and calvaria from 5-day-old rats were homogenized, and total RNA was extracted similarly.
Screening of cDNA Library and DNA Sequencing-Initially, a rat calvaria library was extensively screened with affinity-purified antibodies to osteoadherin, but no positive clones were obtained. A unidirectional Zap (Stratagene) cDNA library was therefore made from RNA isolated from primary bovine osteoblast cultures according to the instructions in the ZAP-cDNA® synthesis kit (Stratagene). Approximately 500,000 plaque-forming recombinants were screened using affinity-purified rabbit antiserum, and one clearly positive clone was found. The pBluescript® SK(+) plasmid containing the 2.1-kbp cDNA insert was rescued from the ZAP vector by in vivo excision. The cDNA was digested to various lengths for sequencing using the Erase-a-Base® system (Promega). The cDNA was sequenced in both directions by the standard double-stranded dideoxy termination method using T3, T7, and synthetic internal primers. Two additional clones were identified in the library using a cDNA probe corresponding to bases 1-225 of the original clone. These clones of approximately 4.5 kbp were partially sequenced from the ends using T3 and T7 primers and five internal sequencing primers, producing sequence covering most of the coding region.
Northern Blots-Ten μg of total RNA from various tissues and species were electrophoresed on a 1% agarose/formaldehyde gel, and the RNA was transferred to a nitrocellulose filter (NitroPure, Micron Separations Inc.) by standard procedures. An 800-base pair bovine cDNA fragment (AccI/SacI) from the original clone was random primer-labeled (Random Primed DNA labeling kit, Boehringer Mannheim) with [α-32P]dCTP (Redivue™, Amersham Pharmacia Biotech) and allowed to hybridize to the blotted mRNA from the tissues. Nonhybridized probe was removed by a final wash with 0.2× SSC, 0.1% SDS at 55°C, prior to detection of radiolabel with a Fujix BAS2000 bio-imaging analyzer.

RESULTS
cDNA Cloning and Sequencing-A unidirectional Zap cDNA expression library was prepared from cultured primary bovine osteoblasts. The library was screened with an affinity-purified rabbit polyclonal antiserum. From an initial screen of some 500,000 recombinants, one antibody-positive clone of 2,111 base pairs was selected. This clone was sequenced completely in both directions. The nucleotide and translated amino acid sequences are shown in Fig. 1. The overall nucleotide composition is unusual in that it contains 61% A or T nucleotides. The osteoadherin cDNA sequence corresponds to a 422-residue protein. The identity of the clone with isolated osteoadherin was unambiguously confirmed by amino acid sequencing of the N terminus and several internal tryptic peptide fragments distributed along the protein, as indicated in Fig. 1.
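The A+T content quoted above is a simple base-composition ratio. The following is a minimal sketch of how such a figure could be computed; the function name `at_fraction` and the toy fragment are hypothetical and are not taken from the osteoadherin cDNA.

```python
def at_fraction(seq: str) -> float:
    """Return the fraction of A or T among the unambiguous bases of seq.

    U is counted as T so that RNA sequences can be passed as well.
    Ambiguity codes such as N are ignored (an assumption of this sketch).
    """
    s = seq.upper().replace("U", "T")
    counted = [base for base in s if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T bases")
    return sum(base in "AT" for base in counted) / len(counted)


# Hypothetical toy fragment, used only to exercise the function.
toy = "ATGAAATTTCGATTAAC"
print(f"A+T content of toy fragment: {at_fraction(toy):.1%}")
```

Applied to the full 2,111-base pair clone, a function of this kind would be expected to return a value close to the 61% reported in the text.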
The original clone had an open reading frame from the beginning of the clone to the first methionine. This raised the possibility that it was not a full-length clone. In addition, no classical polyadenylation signal was present in the 3′ end. The library was screened again with a cDNA probe covering nucleotides 1-255. Two additional clones were found in a screen of approximately 500,000 plaques. These clones did not extend beyond the 5′ end of the original clone, but sequencing showed the new clones to be 12 and 14 bases shorter in the 5′ end, respectively. It is therefore likely that the very 5′ end of osteoadherin mRNA begins as in the original clone. In support, the nucleotide composition around the putative translation initiation codon conforms well to the consensus sequence described by Kozak (32). Efforts to extend cDNA from bovine osteoblast cultures by primer extension, priming with a reverse primer complementary to bases 43-68, gave only short extension products (data not shown). Both clones from the second screening of the library were approximately 4.5 kbp long. By sequencing with T3 primer and four internal primers in the coding region, both these clones were found to be identical in sequence to the original clone at their 5′ ends and in the coding region including the stop codon. Surprisingly, sequencing past the stop codon toward the 3′ end yielded an entirely different sequence from the first nucleotide after the stop codon. The first 25 bases after the stop codon in the two clones from the secondary screen were 5′-GTATACATTTTATAACTTCATTAAG-3′. Sequencing with T7 primer from the 3′ end of these clones and 336 bases in toward the 5′ end also gave an entirely different sequence, including a classical consensus AATAAA polyadenylation site 18 bases upstream of the beginning of the poly(A) end (data not shown).
It should be noted that the last 31 bases of the original clone contain three (A)ATAA(A) sequences, which may substitute for a classical polyadenylation signal. Efforts to obtain sequence from the two clones from the second screening with four internal sequencing primers from the 3′-UTR of the original clone were negative. In further support of a completely different 3′-UTR in the larger clones, we found that restriction enzyme mapping of the longer clones produced fragments of different size in the 3′-UTR, whereas fragments of identical size were obtained from the coding region.
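The search for polyadenylation signals described above amounts to locating short motifs near the 3′ end of a cDNA. A minimal sketch follows, assuming a plain string search is adequate; the helper names and the toy sequence are hypothetical, and the 31-base window simply mirrors the figure quoted in the text.

```python
import re


def find_motif(seq: str, motif: str) -> list[int]:
    """Return 1-based start positions of every occurrence of motif.

    A zero-width lookahead is used so that overlapping occurrences are
    also reported.
    """
    pattern = re.compile(f"(?={re.escape(motif)})")
    return [m.start() + 1 for m in pattern.finditer(seq.upper())]


def signals_near_3prime_end(seq: str, window: int = 31) -> dict[str, list[int]]:
    """Report the classical AATAAA signal and the shorter ATAA core
    within the last `window` bases, using full-sequence coordinates."""
    tail = seq[-window:]
    offset = len(seq) - len(tail)
    return {
        "AATAAA": [p + offset for p in find_motif(tail, "AATAAA")],
        "ATAA": [p + offset for p in find_motif(tail, "ATAA")],
    }


# Hypothetical toy 3' end, not the osteoadherin sequence.
toy = "GGGCCCAATAAAGGATAAGG"
print(signals_near_3prime_end(toy))
```

The ATAA core is reported separately because the text describes (A)ATAA(A) variants rather than the strict hexamer.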
Database searches with the nucleotide sequence against GenBank indicated that the 3′-UTR of the original clone contained SINE (short interspersed nuclear element) and art2 sequences in positions 1382-1687. These elements are common features in the bovine genome, with the art2 repeat present in some 100,000 copies per genome (33), and are usually found in introns.
Protein Structure-Examination of the deduced protein sequence indicates a signal peptide cleavage site between residues 20 and 21 from the first methionine (see Fig. 1). This site conforms well to the -1,-3 signal peptide cleavage site rule of von Heijne (34) and is preceded by a hydrophobic peptide. N-terminal sequence analysis of isolated osteoadherin gave a clear DEDYDQEP sequence, indicating that the protein most likely is first synthesized with a 7-amino acid propeptide. The calculated molecular mass of the complete preprotein is 49,116 Da, whereas the protein with and without propeptide is 46,874 and 45,889 Da, respectively. These sizes correspond well with the size of isolated osteoadherin after removal of N-linked oligosaccharides (24). The calculated, theoretical isoelectric point of the mature protein without propeptide is 5.2.
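Calculated molecular masses such as those above are obtained by summing average residue masses and adding one water molecule per chain. The sketch below illustrates the arithmetic under stated assumptions: it uses standard average residue masses, ignores all post-translational modifications (glycosylation, sulfation, disulfide bonds), and is not the program used by the authors. The function name `average_mass` is hypothetical.

```python
# Standard average residue masses in Da (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water per linear chain (free N and C termini)


def average_mass(protein: str) -> float:
    """Average molecular mass in Da of an unmodified linear polypeptide."""
    try:
        return sum(RESIDUE_MASS[aa] for aa in protein.upper()) + WATER
    except KeyError as err:
        raise ValueError(f"unknown residue: {err.args[0]}") from None


# N-terminal octapeptide reported in the text for the isolated protein.
print(f"DEDYDQEP: {average_mass('DEDYDQEP'):.1f} Da")

# Mass differences implied by the values quoted in the text.
print("signal peptide (20 residues):", 49_116 - 46_874, "Da")
print("propeptide (7 residues):", 46_874 - 45_889, "Da")
```

The two differences follow directly from the reported masses, because removing a peptide by hydrolysis lowers the mass of the remaining chain by the summed residue masses of the removed segment.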
A dominating feature of the primary structure is the presence of 11 LRRs as aligned in Table I. These follow the general pattern for extracellular matrix LRR proteins with a well preserved β-sheet forming region and a less well conserved α-helical region. All repeats in osteoadherin are of the B-type (35) with an asparagine in position 10 of the repeat. The length of the repeats varies from 20 to 30 amino acids. The length of the last repeat, as shown in Table I, cannot be unambiguously assigned from the primary sequence and is only tentative.
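One way to judge how well a candidate 20-residue window fits the LRR consensus given with Table I (X L X X L X L X X N c I X X a X X X X X) is to count matches at the defined positions. The sketch below does this; the membership of the "aliphatic" and "charged" classes is an assumption of this example, since the source does not list the residues, and the test window is a constructed toy sequence.

```python
# Consensus from Table I: X = any residue, a = aliphatic, c = charged.
CONSENSUS = "XLXXLXLXXNcIXXaXXXXX"

# Assumed class definitions (not specified in the source).
CLASSES = {
    "a": set("AVLI"),  # aliphatic
    "c": set("DEKR"),  # charged
}

# Number of positions that carry a constraint (everything except X).
MAX_SCORE = sum(symbol != "X" for symbol in CONSENSUS)


def consensus_score(window: str) -> int:
    """Count constrained consensus positions matched by a 20-residue window."""
    if len(window) != len(CONSENSUS):
        raise ValueError(f"window must be {len(CONSENSUS)} residues long")
    score = 0
    for residue, symbol in zip(window.upper(), CONSENSUS):
        if symbol == "X":
            continue
        allowed = CLASSES.get(symbol, {symbol})  # literal residue otherwise
        if residue in allowed:
            score += 1
    return score


# Constructed toy window that satisfies every constrained position.
toy = "ALAALALAANDIAAVAAAAA"
print(consensus_score(toy), "of", MAX_SCORE)
```

Because the repeats in osteoadherin vary from 20 to 30 residues, a fixed-length window like this one only covers the conserved β-sheet-forming start of each repeat.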
TABLE I. Alignment of the osteoadherin LRRs [...] Kobe and Deisenhofer (35) as for porcine ribonuclease inhibitor, beginning with the β-sheet forming residues. A consensus sequence is shown below. a = aliphatic residue; c = charged residue.
X L X X L X L X X N c I X X a X X X X X
The cysteine pattern with four residues in the N-terminal end is highly similar to other LRR proteins found in the extracellular matrix. In the fifth and sixth LRR, cysteines are present in position 8 in both repeats (see Table I). This position is likely to be in a turn of the peptide chain from the well conserved β-sheet to the α-helix domain, as shown for the model LRR protein, ribonuclease inhibitor (36). These two cysteines are probably very closely positioned and are thus likely to be disulfide-bonded. In the C-terminal end, two cysteines are present in repeats 10 and 11. These cysteines are in similar positions as in the other related LRR proteins (see below). The protein sequence contains six potential sites for N-linked glycosylation, as indicated in Fig. 1, conforming to the consensus sequence N-X-(T/S). The site at Asn 101 is most likely not used, as this residue is included in one of the tryptic peptides that were amino acid sequenced with good yield of Asn. Thus, no indication of a substitution on the asparagine at this site was seen by the amino acid sequencing. This residue is located in the putative β-sheet region of the LRR (see Table I). The lack of substitution at this site makes sense, as it is likely that the presence of a large carbohydrate in this location would disturb the folding of the protein. The other potential sites are not located in the β-sheet regions. These sites were, however, not part of any of the peptides that were sequenced.
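Potential N-glycosylation sites of the N-X-(T/S) type can be located with a short pattern search. The sketch below is illustrative only: it additionally excludes proline at the X position, a commonly used refinement that is not stated in the text, and the test sequence is a hypothetical toy peptide.

```python
import re

# N, then any residue except P (assumed refinement), then S or T.
# The lookahead keeps matches zero-width beyond the N, so sequons that
# share residues are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def n_glycosylation_sites(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues in N-X-(T/S) sequons."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Hypothetical toy peptide: N2 (NAT) and N9 (NNS) qualify; N5 is
# followed by proline and N10 (NSG) lacks S/T at the third position.
toy = "MNATNPSANNSG"
print(n_glycosylation_sites(toy))
```

A match only marks a potential site; as the Asn 101 example in the text shows, occupancy has to be established experimentally.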
Four potential sites for tyrosine sulfation were detected by screening against the PROSITE database, as indicated in Fig. 1. The first site is in the predicted propeptide on Tyr 5. Two more sites are found in the mature N-terminal region on Tyr 11 and Tyr 19. The fourth site is on Tyr 286 in the potential β-sheet region of repeat 10.
Similarity to Other Proteins-Similarity searches against the GenBank and SwissProt databases indicate similarity to several other LRR proteins. Comparison of primary sequences of mature LRR proteins shows that osteoadherin is most closely related to a group consisting of fibromodulin, lumican, keratocan, and PRELP (see Fig. 2). Construction of a dendrogram indicates that osteoadherin is not strongly related to any previously described protein, forming a distinct subgroup within the family. Highest overall similarity, 42% identical residues, was with bovine keratocan, followed by 37-38% identity with bovine lumican, fibromodulin, and human PRELP. An alignment against these proteins is shown in Fig. 3. On the other hand, it should be noted that the cysteines in the N-terminal region are separated by 3, 1, and 9 residues. This pattern is identical to the other proteins in the alignment. Thus, they all belong to the class II subfamily (37).
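Percent identity figures such as those above depend on how the aligned columns are counted. The following sketch shows one common convention, dividing identical residues by the number of columns in which neither sequence has a gap; the source does not state which denominator was used, so this choice is an assumption, and the aligned strings are toy examples.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identical residues over ungapped columns of a pairwise alignment.

    Both inputs must already be aligned (equal length, gaps as `gap`).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip columns containing a gap
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy alignment: 5 ungapped columns, 4 of them identical.
print(percent_identity("ACDEFG", "ACEE-G"))
```

Other conventions (for example, dividing by the length of the shorter sequence or by the full alignment length) give somewhat different values for the same alignment.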
A characteristic, distinguishing feature of osteoadherin is the large acidic C-terminal domain consisting of 69 amino acids after the last cysteine, compared with 10 or 9 residues in the other related proteins. Of the last 38 amino acids, 16 are strongly acidic, forming a very highly negatively charged C-terminal region not found in other members of the LRR family.
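The acidity of a C-terminal tail can be summarized by counting aspartic and glutamic acid residues in the last residues of the chain. A minimal sketch follows; the function name and the toy sequence are hypothetical, and only the final line uses numbers taken from the text.

```python
def acidic_count(protein: str, tail_len: int) -> int:
    """Count Asp (D) and Glu (E) residues in the last tail_len residues."""
    if tail_len <= 0:
        raise ValueError("tail_len must be positive")
    tail = protein[-tail_len:].upper()
    return sum(residue in "DE" for residue in tail)


# Hypothetical toy sequence with an acidic tail.
toy = "MKLAAGSDEDEEDD"
print(acidic_count(toy, tail_len=7), "acidic residues in the last 7")

# Fraction implied by the text: 16 acidic residues among the last 38.
print(f"{16 / 38:.0%} of the last 38 residues are acidic")
```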
Expression of Osteoadherin-Northern blot analysis of total RNA from various tissues and cultured cells showed a 4.5-kilobase mRNA in preparations from trabecular bone and a very strong signal in the cultured primary bovine osteoblasts (Fig. 4). The hybridization signal detected is rather diffuse. This could be a result of degradation of the mRNA. However, the quality of the primary osteoblast RNA preparation was high. This preparation was used to construct the expression cDNA library used in this work, from which we could isolate several clones of 4.5 kbp. Through a number of experiments, we have, however, not been able to obtain well defined hybridization signals for bovine osteoadherin with several different preparations of bone tissue RNAs or by using alternative probes. One possible explanation is that there is considerable alternative or inefficient splicing of the osteoadherin pre-mRNA.
Localization of Expression by in Situ Hybridization-In situ hybridization analysis for osteoadherin mRNA in sections from the growth plate of bovine fetal fetlock joints showed that the protein is primarily expressed in osteoblasts on trabecular bone surfaces (Fig. 5). Strongest signal was seen over osteoblasts in well developed trabecular bone (see Fig. 5C), indicating high expression in mature osteoblasts.

DISCUSSION
Through this work, osteoadherin has been shown to be an additional member of the family of leucine-rich repeat proteins in the extracellular matrix. Although osteoadherin is rather similar to lumican, fibromodulin, keratocan, and PRELP, it is not closely related to these. Osteoadherin contains 11 clearly identifiable LRRs, with a well conserved leucine repeat. In lumican, fibromodulin, keratocan, and PRELP, the repeats are arranged in triplets consisting of two repeats of 24-26 amino acids, followed by one shorter repeat of 20-21 residues. This kind of triplet pattern is also present in osteoadherin. The significance of the triplet pattern is not understood. One explanation could be that this enables the β-sheets of the 20-26-amino acid repeats to be aligned in parallel in the same fashion as in the known three-dimensional structure of ribonuclease inhibitor, which almost entirely consists of equally long 28-residue repeats. However, modeling studies (38) indicate that there appears to be room for considerable flexibility in both the loop region connecting the β-sheets with the α-helix region and the α-helix region itself, allowing the β-sheets to be in register despite differing lengths of the repeats. Indeed, modeling of the LRR region of decorin to the ribonuclease inhibitor (39) shows that decorin can be fitted well despite the shorter repeats. The decorin study suggests that all the members of this family could have a structure in the central part similar to ribonuclease inhibitor.
The relatively large and very acidic C-terminal peptide extension is a feature that distinguishes osteoadherin from the other LRR-containing proteins and proteoglycans. The C-terminal region after the last cysteine is thus considerably larger than in the related proteins, consisting of 69 amino acids compared with 9-10 in lumican, fibromodulin, keratocan, or PRELP. This C-terminal region in osteoadherin is extremely acidic, with 16 negatively charged aspartic or glutamic acids in the last 38 residues. Osteoadherin binds well to hydroxyapatite, a property utilized in the isolation of the protein from mineralized bone (24). Other hydroxyapatite-binding proteins in bone, like bone sialoprotein, have long stretches of acidic residues thought to mediate the binding to the mineral (40). A likely function for the very acidic and probably exposed C terminus could thus be to anchor the protein to the mineral. It should be noted that there are two arginines in positions 361-362 just prior to the acidic residues. Exposed dibasic sequences are often recognized by proteases, e.g. of the furin family (41). This presents an interesting possibility for proteolytic processing of the C terminus. Indeed, it has been shown for chondroadherin, another LRR protein of the extracellular matrix, that at least two forms of the protein exist, differing by 9 C-terminal amino acids (15,42). This difference is the result of proteolytic processing, as this region of chondroadherin is encoded by one exon in the mouse chondroadherin gene (43). However, in contrast to osteoadherin, the peptide that is removed in chondroadherin is basic. Interestingly, we have found in extracts of mineralized bone no lower Mr protein of the dimensions expected after removal of the C-terminal peptide. Thus, either this cleavage does not occur, or the major part of the protein released from this putative anchor to the mineral may be rapidly lost from the tissue.
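Dibasic pairs such as the Arg-Arg at positions 361-362 can be flagged with a simple scan for adjacent lysine or arginine residues. The sketch below does only that; it is not a protease cleavage predictor, it says nothing about whether a site is exposed, and the toy sequence is hypothetical.

```python
import re

# Any two adjacent basic residues (K or R); the lookahead allows
# overlapping pairs within runs such as RRR to be reported.
DIBASIC = re.compile(r"(?=[KR][KR])")


def dibasic_sites(protein: str) -> list[int]:
    """Return 1-based positions of the first residue of each dibasic pair."""
    return [m.start() + 1 for m in DIBASIC.finditer(protein.upper())]


# Hypothetical toy sequence: RR at 3-4 and KR at 9-10.
toy = "AARRDDKAKR"
print(dibasic_sites(toy))
```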
Another unusual feature of osteoadherin is the presence of two, presumably closely situated cysteines in LRR 5 and 6 (see Table I). In view of their close proximity, it is likely that these residues are disulfide-bonded. No such pairs of cysteines in the central LRR region are found in the other members of this family of extracellular matrix proteins. In chondroadherin, one single cysteine is found in repeat four. This residue appeared not to be free but to be involved in stabilizing the structure of the repeat as it could not be chemically derivatized (15). Another interesting possibility is that the cysteines could participate in disulfide exchange with other matrix proteins in bone. One example where this may occur is to osteonectin, inasmuch as an essential step in the complete separation of osteoadherin from osteonectin is reduction of disulfide bonds prior to the final purification step (16).
The N-terminal region of osteoadherin has similarities to lumican and fibromodulin in that it has potential sites for tyrosine sulfation. This fits well with our previous finding that removal of radioactive sulfate-labeled keratan sulfate from osteoadherin still leaves a molecule with radiosulfate attached (24). The functional significance of tyrosine sulfation is not known, but it has been suggested to be of importance for intracellular transport (44). The first potential site is found in the propeptide. It is therefore tempting to speculate that this region is involved in intracellular processing of the protein. A propeptide is also found in decorin (8). It has been suggested that this propeptide is of importance for intracellular transport, as its deletion leads to an increase in intracellular retention time during synthesis and secretion of the proteoglycan (45).
FIG. 3. Alignment of osteoadherin against the closest related LRR proteins. The primary sequence of the bovine forms of lumican, fibromodulin, keratocan, and the human form of PRELP was aligned to osteoadherin by the use of the CLUSTALW program and illustrated by the BOXSHADE program.
Osteoadherin has been shown to be a cell attachment protein binding primary osteoblasts in a cation-dependent interaction (24). This interaction was inhibited by RGD-containing peptides but not by RGE peptides, indicating the presence of a classical cell-binding region as in fibronectin. However, no RGD sequence is found in the primary sequence of osteoadherin. The best candidate region is an RID sequence in the last LRR. This sequence is positioned at the end of the presumed β-sheet region. The isoleucine is therefore likely to be oriented inward and the two charged residues exposed at the beginning of the loop region. In fibronectin, the arginine and aspartic acid have been shown to be exposed in a loop structure with the glycine pointing inward (46). This opens the possibility that the RID sequence in osteoadherin is a functional mimic of the RGD in fibronectin. Indeed, an RLD sequence has been found to be part of the minimal active site in fibrinogen responsible for binding to the αMβ2 integrin (47). However, all of the LRR proteins in the alignment in Fig. 3 have an RLD sequence in the same position as the osteoadherin RID sequence, but none of these proteins has been shown to possess cell binding activity.
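The comparison of RGD, RID, and RLD above can be framed as a search for R-x-D tripeptides with a restricted middle residue. A minimal sketch follows; the default middle residues (G, I, L) are simply the three variants named in the text, and the function name and toy sequence are hypothetical.

```python
import re


def rxd_motifs(protein: str, middle: str = "GIL") -> list[tuple[int, str]]:
    """Return (1-based position, tripeptide) for each R-x-D motif whose
    middle residue x is one of the letters in `middle`."""
    pattern = re.compile(f"(?=(R[{re.escape(middle)}]D))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(protein.upper())]


# Hypothetical toy sequence containing RGD, RID, RLD and a non-matching RAD.
toy = "AARGDKKRIDLLRLDRAD"
print(rxd_motifs(toy))
```

A sequence match of this kind identifies candidates only; as the text notes, the related LRR proteins carry RLD at the same position without demonstrated cell-binding activity.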
Osteoadherin appears to be rather specific for bone. By Northern blot analysis, it was found to be expressed only in bone cells. The expression seems to be restricted to mature osteoblasts as highest expression of mRNA for osteoadherin was detected by the use of in situ hybridization in osteoblasts located on the trabeculae at some distance from the growth plate. This pattern is similar to that found for osteopontin (48), whereas bone sialoprotein is expressed much earlier with some expression in late hypertrophic chondrocytes (49), but most prominently by osteoblasts at osteochondral junctions (50).
In primary cultures of bovine osteoblasts, strong osteoadherin expression was detected by Northern blot analysis. Samples of RNA from trabecular bone showed a weaker but easily detectable signal. The size of the mRNA was approximately 4.5 kilobases. Interestingly, with a probe covering the coding region, a rather diffuse band with a strong trailing was seen.
However, if a probe corresponding to the 3′-UTR of the first clone was used, no detectable signal was found. In contrast, if a 3′-UTR probe from the second set of 4.5-kbp clones was used, a better defined 4.5-kbp band with less prominent trailing was found (data not shown). Thus, the presence of SINE/art2 sequences, together with the absence of a classical polyadenylation signal in the first clone and the presence of such a signal in the second two clones, indicates that the first clone isolated could represent a not fully processed mRNA with remaining intron sequences. Alternatively, the first clone could represent an alternatively spliced form. In any case, alternative splicing in the 3′-UTR is rather unusual. A similar, but not identical, situation exists in the mouse gene for chondroadherin, where an intron is present in the 3′-UTR in close proximity after the stop codon (43).