The Biology of the Small Leucine-rich Proteoglycans

If one of the keys to biology is protein structure, then nature is an efficient operator, because it adopts a number of structurally related proteins to perform functions as diverse as maintaining the mineralized matrix of bones and teeth, the transparency of the cornea, the tensile strength of the skin and tendon, and the viscoelasticity of blood vessels. Proteoglycans play key roles in all of these fundamental biological processes and behave as potent effectors of cellular pathways. The past decade has witnessed an explosion of knowledge in the proteoglycan world, with significant advances in the genetics and cell biology of these complex macromolecules. This minireview describes recent advances in the biology of the small leucine-rich proteoglycan (SLRP) gene family with special emphasis on the biology of the archetype proteoglycan decorin. The focus is on the "functional network" created by these molecules in tissues, on genetic evidence for their functional roles during ontogeny, and on their activities as modulators of complex pathological processes such as fibrosis and cancer growth. Other more extensive reviews may serve to fill the gaps in this one (1-4).

A Family of Structurally Related but Distinct Genes
The SLRP gene family comprises at least nine members that, though structurally related, have evolved from different genes, have acquired unique functions, and have undergone a significant degree of structural sophistication (Fig. 1). They can be synthesized as either glycoproteins containing N-linked oligosaccharides or as proteoglycans containing chondroitin/dermatan sulfate or keratan sulfate chains. They can also contain Tyr sulfation, undergo proteolytic processing, and contain a pre-core that is cleaved under certain conditions and with advancing age. Moreover, the promoter architecture of various SLRP genes is quite distinctive, and this contributes to their differential tissue expression (5).
Three classes of SLRPs can be easily identified based on several parameters including their evolutionary protein conservation, the presence of a distinct cysteine-rich cluster in the N-terminal region, the number of the leucine-rich repeats (LRR), and their genomic organization (Fig. 1).
Class I-This group includes decorin (6) and biglycan (7), which show the highest homology (~57% identity) and are the only SLRP members that contain a pro-peptide. The pro-peptide is highly conserved across species and may function as a recognition signal for xylosyltransferase, the first enzyme involved in the synthesis of glycosaminoglycan (GAG) chains. These proteoglycans contain an N-terminal domain that is usually substituted with either one (decorin) or two …

chains, leading to pronounced polyanionic properties. The most salient feature of decorin and biglycan is the presence of 10 LRRs (see below) flanked by cysteine-rich regions (Fig. 1). We previously identified a pattern of amino acid spacing among the four N-terminal Cys residues and predicted that this spacing would be characteristic of each class of SLRPs (1). Not only is the spacing of the Cys residues conserved within each subfamily but also the nature of the intervening amino acids is maintained. For example, class I has an N-terminal Cys consensus sequence that is unique (CX3CXCX6C) and different from the other two classes (Fig. 1). Another notable feature of class I members is that they are both encoded by genes composed of eight exons with intron/exon junctions in highly conserved positions (8, 9). The 10 LRRs are encoded by six exons (exons III-VIII). The C-terminal domain is the least studied region and comprises about 50 amino acid residues and two disulfide-linked cysteine residues separated by ~32 amino acids.
Class II-This group comprises five members that can be further divided into three distinct subfamilies. Fibromodulin (10) and lumican (11, 12) constitute the first subfamily and exhibit ~48% protein sequence identity; keratocan (13) and PRELP (14) constitute the second subfamily with ~55% protein identity, whereas osteoadherin (15) constitutes a distinct subfamily with 37-42% protein identity to the other class II members. All of them share an identical cysteine-rich region consensus just before the LRRs. The assignment of novel SLRPs to various classes, as predicted by the consensus sequence for the N-terminal region, has so far held true because osteoadherin, the latest SLRP member to be cloned (15), has the greatest homology to class II SLRPs and indeed contains the predicted consensus (CX3CXCX9C) (Fig. 1). In contrast to the N-terminal region of decorin/biglycan, class II members contain clusters of Tyr-sulfate residues that would contribute to the polyanionic nature of the proteoglycan. Class II members are primarily substituted with keratan sulfate chains, and polylactosamine, essentially an unsulfated keratan sulfate, can be found in both fibromodulin (16) and keratocan (13). Finally, class II members are encoded by only three exons, with a large central exon encoding nearly all 10 LRRs (Fig. 1).
Class III-Epiphycan/PG-Lb (17-19) and mimecan/osteoglycin (20), which exhibit only ~40% protein sequence identity, are the two members of this class. These proteoglycans can be distinguished by a unique cysteine-rich region consensus (CX2CXCX6C) and by the presence of only six LRRs. In addition, they are encoded by a gene containing seven exons, and the LRRs are encoded by only three exons (exons V-VII). Epiphycan contains either chondroitin sulfate or dermatan sulfate and can be secreted as a glycoprotein. In cornea, mimecan is a keratan sulfate proteoglycan (20) with multiple transcripts generated by alternative polyadenylation and differential splicing (21).
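
The class-specific Cys spacing described above lends itself to a simple pattern search. The following is a minimal sketch, not a method from the reviewed work: it translates the three consensus strings quoted in the text into regular expressions and reports which class pattern occurs in a sequence. The function names (`consensus_to_regex`, `classify_slrp`) are hypothetical, and reading X as "any residue except Cys" is an assumption made here so that the four Cys positions stay fixed.

```python
import re

# Cys-spacing consensus for each SLRP class, as quoted in the text.
CYS_CONSENSUS = {
    "I": "CX3CXCX6C",
    "II": "CX3CXCX9C",
    "III": "CX2CXCX6C",
}


def consensus_to_regex(consensus):
    """Translate shorthand such as 'CX3CXCX6C' into a compiled regex.

    Assumption: X means any residue other than Cys.
    """
    parts = []
    for residue, count in re.findall(r"([A-Z])(\d*)", consensus):
        token = "[^C]" if residue == "X" else residue
        parts.append(token + ("{%s}" % count if count else ""))
    return re.compile("".join(parts))


PATTERNS = {name: consensus_to_regex(c) for name, c in CYS_CONSENSUS.items()}


def classify_slrp(n_terminal_seq):
    """Return the list of classes whose Cys consensus occurs in the sequence."""
    return [name for name, pat in PATTERNS.items() if pat.search(n_terminal_seq)]


# Synthetic test strings (not real protein sequences): Ala as filler residue.
synthetic_class_i = "GG" + "C" + "A" * 3 + "C" + "A" + "C" + "A" * 6 + "C" + "GG"
print(classify_slrp(synthetic_class_i))
```

A match only shows that the spacing is present; assigning a real protein to a class would also rest on the other criteria listed in the text (LRR number, genomic organization, overall sequence conservation).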

Leucine-rich Repeats
The common central domain, which can constitute up to ~80% of the protein moiety, is composed of ~10-fold repeats (with the exception of class III SLRPs) of a 24-amino acid residue LRR with Asn and Leu residues preferentially in conserved positions (LX2LXLX2NX(L/I)). If the consensus for the LRRs is interpreted with less stringency (2), then there could be two additional LRRs flanking either side of the central LRR domain. The LRR is a structural module used in molecular recognition processes as diverse as cell adhesion, signal transduction, DNA repair, and RNA processing. The crystal structure of the ribonuclease inhibitor, a leucine-rich protein with structural homology to decorin, defines a new class of α/β protein folds (22). The non-globular shape of the molecule and the exposed face of the parallel β-sheet could explain why LRRs are used to achieve strong protein/protein interactions. Molecular modeling of decorin (Fig. 2A) has revealed a more open structure than the ribonuclease inhibitor (23). The overall dimensions of the arch-shaped decorin, which are similar to those obtained with rotary shadowed electron microscopy (24), allow the interaction with a single triple helix of collagen. The open configuration of decorin allows an extensive binding area and thus the formation of several favorable contact points with biological ligands such as the triple helix of collagen (Fig. 2A) or the EGF receptor (see below). The overall structure provides a flexible platform that can adjust to specific requirements of a particular interacting molecule.
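
As a rough illustration of how the LRR consensus quoted above could be located in a sequence, the sketch below scans for the 11-residue core LX2LXLX2NX(L/I) and returns the start position of each occurrence. It assumes a strict reading of the consensus; as the text notes, a less stringent reading would recover additional repeats, and a strict regular expression will miss degenerate ones. The function name `find_lrr_cores` is hypothetical.

```python
import re

# Strict reading of the consensus LX2LXLX2NX(L/I): an assumption of this sketch.
# The lookahead lets overlapping candidates be reported as well.
LRR_CORE = re.compile(r"(?=(L..L.L..N.[LI]))")


def find_lrr_cores(sequence):
    """Return 0-based start positions of every strict LRR core match."""
    return [match.start() for match in LRR_CORE.finditer(sequence)]


# Synthetic example (not a real protein): two cores separated by a Gly spacer.
synthetic = "GG" + "LAALALAANAL" + "GGGG" + "LAALALAANAI"
print(find_lrr_cores(synthetic))
```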
The modeling (25) further shows that it is feasible to build horseshoe structures for all the members of the LRR superfamily, including the bacterial proteins with the shortest 20-residue LRRs. Indeed, the superfamily of LRR proteins has been recently divided into six subfamilies typified by distinct lengths (20-29 residues) and consensus sequences (25). LRRs from different subfamilies never occur concomitantly within a given LRR protein. Structural modeling provides an explanation for this mutually exclusive relationship; the orientation of the variable non-β structural parts of the LRRs is unique to each subfamily and cannot pack together well, whereas the packing of LRRs from one subfamily allows the formation of a specific hydrogen bond network between neighboring LRRs. Thus, it is likely that other members of the SLRP family would fold in a fashion similar to decorin with β-strands and α-helices parallel to a common axis. Conformational flexibility could be achieved, perhaps, by varying the angle of the protein, which may be more or less open as recently proposed (4), or by altering specific amino acid sequences that bestow functional specificity. For example, decorin and biglycan are 57% identical but also 43% different at the protein level! From various studies, it can be concluded that several independent evolutionary paths (for example, note the different genomic organization vis à vis the LRR in Fig. 1) converged to produce a similar superhelical fold (26). Thus, proteins with LRRs provide a unique solution for a multiplicity of functional activities, and their structural properties appear to be the principal reason for their effectiveness as protein binding motifs (4, 22).

Regulation of Matrix Assembly: Key Biological Roles in Skin, Bone, Tendon, and Cornea
The evidence favoring protein/protein interactions for the SLRP gene members is overwhelming. It is through these non-covalent, and presumably reversible, binding events that connective tissues are properly assembled. Several SLRPs bind fibrillar collagens including types I, II, III, V, VI, and XIV and inhibit fibril formation in vitro. Although it is clear from fibril-reconstitution experiments that the main information to build periodic fibrils resides in the amino acid sequence of the collagen, several macromolecules can regulate this complex process (27). The kinetics of assembly and the ultimate fibril diameter are modulated by these factors, and both acceleration and inhibition of fibril formation have been reported. In the SLRP case, the overall effects of this interaction include an initial delayed assembly and a final reduction in the average fibril diameter (28). Removal of the GAG chain or the N-terminal 17-amino acid residues of the decorin protein does not affect the ability of decorin to inhibit fibrillogenesis (29). However, reduction of disulfide bonds abolishes this interaction, whereas renaturation after exposure to dissociative solvents fails to restore all of the effects of decorin on fibrillogenesis (30). Thus, the collagen-regulating activity is mediated by the protein core, likely via the central LRR 4-6 (31-33), whereas the GAG chains maintain interfibrillar space by extending outward from the protein core. In the case of fibromodulin, inhibition of fibrillogenesis requires more than one binding site including the C-terminal end of the molecule (34), in agreement with the proposed model for decorin-collagen interaction (23).

FIG. 1. Evolutionary and structural relationships of the SLRP genes and their proteins. In the dendrogram, obtained with the CLUSTAL program, horizontal lines are proportional to evolutionary distances. All the compared sequences are human with the exception of osteoadherin, which is bovine. The consensus sequences for the N-terminal cysteine-rich region and the LRR region are also presented. The roman numerals indicate the exon number. The 5'- and 3'-exons encoding untranslated regions are represented by black rectangles.

FIG. 2. Decorin interacts with collagen and regulates collagen fibrillogenesis in vivo. A, three-dimensional model of decorin interacting with a triple helix of collagen (yellow). The concave surface of decorin is lined by charged residues with basic and acidic amino acid residues in blue and red, respectively (23). B, electron micrograph showing abnormal collagen fibers from the dermis of a decorin null mouse (35). Notice the variability in cross-sectional diameter and the irregular contours because of abnormal lateral fusion of small fibrils into larger ones along their major axes.

This connective tissue "cooperation" is evolutionarily conserved and physiologically relevant as exemplified by the diverse phenotypes of knockout animals in which specific class I and II SLRP genes have been disrupted by gene targeting. Decorin null animals show an abnormal skin fragility phenotype caused by a reduced tensile strength (35). Close analysis of the dermal collagen provides a structural basis for the skin fragility; the collagen fiber network of the null animals is more loosely packed and exhibits irregular collagen contours (Fig. 2B). This is confirmed by mass mapping of isolated collagen fibrils, which show a pronounced non-uniformity in their axial mass distribution. Thus, skin fragility in these mutant animals could be ascribed to this anomalous collagen network, which could allow for full body development but would lead to a reduced tensile strength with potential complications such as an increased incidence of injury and an abnormal healing process. Targeted disruption of the biglycan gene leads to an osteoporosis-like phenotype (36) consistent with the different tissue distribution and collagen binding ability of biglycan. The biglycan null animals show reduced bone mass detectable at 3 months of age that becomes more pronounced with aging. Thus, biglycan acts as a positive regulator of bone formation and bone mass by affecting the cellular processes of bone formation that occur during both development and adult life. Interestingly, mice lacking fibromodulin exhibit an abnormal tendon phenotype (37). In contrast to the decorin null mice, the fibromodulin-deficient animals have collagen fibrils thinner than the wild-type animals as a result of a predominance of a very thin fibril population in an overall similar range of fibril diameters.
A significant increase of lumican in connective tissues of the fibromodulin-deficient animals suggests that a coordinate transcriptional or post-transcriptional control for certain SLRP members may be operational in vivo. Disruption of the lumican gene also causes a skin fragility phenotype. In addition, the lumican-deficient animals develop bilateral corneal opacity (38). The underlying structural defect is somewhat reminiscent of the decorin null animals in that collagen fibrils are abnormally thicker. However, the lumican null animals show abnormal collagen formation not only in the dermis but also in the cornea. The presence of multiple SLRPs in the transparent cornea might explain why no apparent abnormality has been detected in mice deficient in decorin, biglycan, or fibromodulin. It is possible that decorin and biglycan might not play a significant role in corneal transparency because the binding of dermatan sulfate SLRPs (decorin and biglycan) occurs at the d and e bands of collagen, in contrast to the keratan sulfate SLRPs (lumican, keratocan, and mimecan) that bind to the a and c bands of collagen (39). This differential binding might affect corneal collagen fibril formation and interfibrillar spacing. Thus, the corneal clouding observed in the lumican-deficient mice may be multifactorial: abnormal fibril assembly, lateral fusion caused by the lack of lumican protein core, and altered interfibrillar spacing because of the lack of lumican-bound keratan sulfate.
The pathological consequences that animals suffer from lack of the above mentioned SLRP genes underscore the following two important facts. (a) Human diseases carrying abnormalities in the genome such as deletions or premature stop codons in one of the SLRP genes or mutations in their collagen-binding domain are likely to exist. (b) Mutations in the glycosaminoglycan-binding regions may also contribute to some of the pathological phenotypes.

Natural Antidotes: TGF-β Blocker and Anti-fibrotic Agents
The binding of growth factors to proteoglycans and the subsequent modulation of growth factor activities represent one of the major conceptual advances in the field; whether this binding is mediated by the protein core or the carbohydrate moiety, the final event is a perturbation (either negative or positive) of the growth factor biological activity with profound consequences on the affected cell population. Moreover, this biological interaction provides a mechanistic explanation for the growth- and differentiation-promoting ability of the extracellular matrix. Increased TGF-β production is the hallmark of a number of fibrotic diseases that are characterized by abundant accumulation of extracellular matrix components. At least four SLRP members (decorin, biglycan, lumican, and fibromodulin) interact with TGF-β, and affinity measurements indicate a two-site binding model with Kd values of 1-20 nM and 20-200 nM for the high and low affinity binding sites, respectively (40). These in vitro binding studies correlate well with the observation that ectopic expression of decorin leads to marked growth retardation and change in morphology and adhesion properties of TGF-β-dependent cells (41). Addition of recombinant decorin blocks TGF-β-dependent growth stimulation or inhibition of cells, indicating that the mechanism of decorin action is the neutralization of TGF-β activity. These initial observations have been subsequently exploited in an animal model of experimental glomerulonephritis in which rats are injected with anti-thymocyte antiserum, which, in turn, causes a profound immunological reaction against the renal glomeruli. Marked deposition of extracellular matrix in the glomeruli and mesangial matrix ensues, and the resulting fibrosis leads to renal failure.
Both blocking anti-TGF-β antibodies and decorin work equally well in preventing glomerulosclerosis (42), a pathological process that can be prevented by gene therapy utilizing decorin cDNA transfected into the skeletal muscle of affected animals (43). The levels of decorin remain high for several days post-transfection, and immunoreactive decorin is increased in glomeruli, liver, and lungs of glomerulonephritic rats (43). This provides strong evidence that SLRPs act as natural antidotes for renal fibrosis, essentially an incurable disease, and perhaps for other forms of fibrosis such as those affecting the liver and the lungs.
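
To make the two-site binding of TGF-β to the SLRPs more concrete, the block below writes out the standard isotherm for two independent classes of sites and evaluates the fractional occupancy of each site at one ligand concentration. This is a sketch under stated assumptions: the independent-site form is the textbook model and may differ from the exact fit used in the cited study (40), and the dissociation constants of 10 nM and 100 nM are illustrative values chosen from within the reported ranges, not measured values.

```latex
% Assumed form: two independent classes of binding sites.
% B = bound ligand, [L] = free ligand concentration,
% B_{\max,i} = capacity and K_{d,i} = dissociation constant of site i.
B([L]) = \frac{B_{\max,1}\,[L]}{K_{d,1} + [L]} + \frac{B_{\max,2}\,[L]}{K_{d,2} + [L]}

% Illustrative values only: K_{d,1} = 10\ \mathrm{nM}, K_{d,2} = 100\ \mathrm{nM}.
% Fractional occupancy of each site at [L] = 10\ \mathrm{nM}:
\theta_1 = \frac{10}{10 + 10} = 0.50, \qquad
\theta_2 = \frac{10}{100 + 10} \approx 0.09
```

Under these assumed values, the high affinity site is half-saturated at a ligand concentration where the low affinity site is still largely unoccupied.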

Control of Cell Proliferation: Interaction with Receptor Tyrosine Kinase
An emerging function of SLRPs is an intrinsic ability to affect cellular proliferation. For example, ectopic expression of decorin retards the growth of a wide variety of tumor cells. The decorin-induced growth arrest is associated with an induction of p21, a potent inhibitor of cyclin-dependent kinase activity (44-46). Ectopic expression of decorin or a mutated form lacking any glycosaminoglycan chain induces growth suppression, and this effect can be modulated by addition of exogenous recombinant decorin to a wide variety of cells. The similarity of the response to decorin in the various established cell lines suggests that a common signal-transducing pathway, a common co-receptor system for growth factors, or a common post-receptor mechanism is utilized by the various cells. The fact that p21 is induced across species by decorin further indicates that this is a well conserved signaling pathway operational in mammalian cells. Our results predicted that interaction between decorin and a surface receptor would play a biological role in controlling the endogenous levels of at least one negative modulator of cell cycle check points. These data were confirmed by the discovery that decorin specifically interacts with the EGF receptor (EGFR) and causes a sustained activation of the EGFR, which leads to activation of the mitogen-activated protein kinase signal pathway and eventually to an increase in endogenous p21 and cell cycle arrest. Recombinant decorin causes a rapid increase of intracellular Ca2+ levels (Fig. 3), and this effect persists in the absence of extracellular calcium (47). Several lines of evidence support a specific protein/protein interaction between decorin and the EGFR.
(a) Decorin induces dimerization of the EGFR, (b) specific binding occurs when decorin is immobilized on a nitrocellulose membrane or free in a physiologic salt solution, (c) decorin induces autophosphorylation of purified EGFR, and (d) decorin induces EGFR tyrosine kinase, and both the binding and activation require a properly folded protein moiety (48). These findings are notable because the discoidin domain receptors (DDR1 and DDR2), two orphan receptor tyrosine kinases, have been shown to be receptors for fibrillar collagen. Similarly to the decorin/EGFR interaction, stimulation of the DDR tyrosine kinase activity requires the native triple helical structure of collagen and occurs over an extended period of time (49,50). Collagen-induced activation of DDR2 results in induction of collagenase (MMP-1) expression, thus leading to a physiological loop whereby increased levels of extracellular collagen signal the cells to increase collagen degradation. Decorin, when present on the substrate with vitronectin, is also capable of affecting the remodeling of the extracellular matrix by inducing MMP-1 (51). Because decorin and other SLRP members are intimately associated with fibrillar collagen, a complex scenario in which multimeric interactions take place in an integrin-independent manner should be considered. An enhancement in decorin content in the newly formed tumor stroma could trigger functional interaction with the EGFR, which would, in turn, start a signaling cascade that directly influences the cell cycle machinery. In this light, it is noteworthy that a double knockout of decorin and p53, a well established tumor suppressor gene, shows a cooperative action between these two genes and an acceleration of lymphoma tumorigenesis (52). Mice lacking both genes show a faster rate of tumor development and succumb to thymic lymphoma within 6 months. 
This result indicates that the lack of decorin is permissive for lymphoma tumorigenesis in a mouse model predisposed to cancer and suggests that germline mutations in decorin and p53 may cooperate in the transformation of lymphocytes and ultimately lead to a more aggressive phenotype.

The Next Stage
There is still much to be learned about the biology of the SLRPs. New members are being cloned and characterized, and additional knockouts and double knockouts are being performed. Questions regarding redundancy and indispensability are being addressed at various experimental levels. Although the generation of SLRP gene knockout mice has established the importance of individual members in regulating various aspects of connective tissue biology, it has also revealed new layers of complexity that will require more systematic studies of gene expression. Distinctive promoter organizations may also explain some of the quantitative and qualitative signal variations observed in mutant animals. Future challenges include elucidation of the key signaling events and unique pathways through which SLRP members exert their specific biological action. Designing pharmacological strategies that utilize the power of the SLRPs, identifying mutant protein cores with more powerful activities, and synthesizing peptides that could block growth factor activities or alter collagen binding properties are some of the exciting challenges ahead. Gene therapy utilizing various SLRP proteins to treat fibrosis or cancer may not be too far in the future. The next stage is thrilling.