The Folded Protein Modules of the C-terminal G3 Domain of Aggrecan Can Each Facilitate the Translocation and Secretion of the Extended Chondroitin Sulfate Attachment Sequence*

Aggrecan is a multidomain proteoglycan containing both extended and folded protein modules. The C-terminal G3 domain contains a lectin-like, complement regulatory protein-like, and two alternatively spliced epidermal growth factor-like modules. It has been proposed that the lectin module alone has a necessary role in the intracellular translocation and secretion of proteins expressed containing G3. Constructs containing human aggrecan G3 together with 1155 bases of the adjacent chondroitin sulfate attachment region (CS-2) were prepared with different combinations and deletions of the protein modules and transfected into mammalian cells of monkey or hamster origin. The results showed that the products containing only the unfolded protein sequences (CS-2 with or without the C-terminal tail sequence) were translated and accumulated intracellularly but were not secreted. In contrast the constructs containing any of the folded protein modules and the extended CS-2 region were translated and secreted from the cells. The results show that the lectin module was not unique in facilitating the intracellular translocation and secretion of the G3 domain. The conservation of G3-like domains within the aggrecan family of proteoglycans may therefore result from their participation in other extracellular functions.

Aggrecan is a large multidomain proteoglycan produced by chondrocytes and found as an essential component of the extracellular matrix of cartilage (1). The protein core consists of three globular and two extended domains. There is an N-terminal G1 domain that interacts specifically with hyaluronan to form multimolecular aggregates in which up to 100 aggrecan molecules bind to each hyaluronan chain (2). In contrast, the C-terminal G3 domain has no clearly established function. It consists of a C-type lectin (LEC) module, a complement regulatory protein-like (CRP) module, and two alternatively spliced epidermal growth factor (EGF)-like modules. Its structure is highly conserved among aggrecans in different species, and closely related structures are present in other members of this family of proteoglycans, versican, neurocan, and brevican (3). Its possible functions include both carbohydrate (4-6) and protein ligand (7) interactions in the extracellular matrix, but it has also been proposed to have intracellular functions (8,9).
The cause of nanomelia in the chicken was identified as a mutation in the aggrecan gene that produced a premature stop codon in the extended CS-2 chondroitin sulfate attachment region (10). This resulted in the synthesis of aggrecan lacking the normal C-terminal G3 structure, and although the protein was translated and present intracellularly in chondrocytes, it was not fully glycosylated or secreted (11,12). Cartilage is an essential forerunner of skeletal development in the embryo, and in the nanomelic chick the lack of aggrecan stunted cartilage development and long bone formation.
These observations led to the proposal that the G3 domain was necessary intracellularly for the translocation and secretion of aggrecan. Expression of chicken aggrecan G3 in Chinese hamster ovary cells showed that the G3 domain was much more efficient than the G1 domain in facilitating the secretion of an extended CS attachment sequence, and this was independent of its position 5′ or 3′ to the extended sequence (13). Further refinement showed that the removal of the CRP module gave no loss of this activity, and finally, it was also shown that of the remaining LEC coding sequence, only the second of the three exons was necessary in the structure to ensure secretion of the product (9). From these results it was concluded that the LEC module was essential for aggrecan biosynthesis, to facilitate its intracellular translocation and secretion. To follow up these results, in this study the secretory behavior of a larger range of constructs based on the human aggrecan G3 domain has been investigated after their transfection into mammalian cells of monkey or hamster origin.

EXPERIMENTAL PROCEDURES
Reverse Transcriptase PCR-Total RNA was extracted with Tri-Reagent (Sigma) from a sample of human articular cartilage (kindly provided by Glynne Andrew, Hope Hospital, Salford, UK). Random-primed reverse transcriptase (RT) product was amplified with primers Agg 4 and Agg 11 (see Fig. 1 for primer details) in a PCR. The human EGF1 sequence was found in a 495-bp product. EGF2 cDNA was amplified using an overlapping primer PCR method (14) with primers based on the human EGF2 sequence (15). Primers Agg 4 and Agg 13 (inverse complement of bases 47 to 64 of the EGF2 sequence) were used to amplify the 5′ half of the EGF2 cDNA and 177 bp of the flanking upstream CS-2 chondroitin sulfate attachment region. Primers Agg 12 (bases 47 to 64 of the EGF2 sequence) and Agg 11 were used to amplify the 3′ half of the EGF2 motif cDNA and 204 bp of the downstream lectin-like motif. The final PCR product, amplified with primers Agg 4 and Agg 11, was a cDNA containing the EGF2 motif as though it had been alternatively spliced into the aggrecan mRNA. A cDNA containing both EGF motifs was prepared using a primer containing 20 bp at the 3′ end of the EGF1 motif and 20 bp at the 5′ end of the EGF2 motif and another primer containing the inverse complementary sequence. Human aggrecan signal sequence cDNA was amplified from the same total RNA pool by RT-PCR using primers Agg 18 (situated in the 5′-untranslated region of the published human aggrecan cDNA (16), bases 13 to 30 with an additional 5′ HindIII restriction enzyme site) and Agg 19 (inverse complement of bases 106 to 123 of the same sequence, with Glu-21 providing the GAA of an additional EcoRI restriction enzyme site).
Assembly of G3 Variant Constructs-To construct a panel of cDNAs containing varying combinations of the folded motifs present in the G3 domain, a human aggrecan cDNA encoding the C-terminal 385 amino acids of the CS-2 region, the LEC, CRP and tail motifs (17) was inserted into the EcoRI site of pBluescript KS (Stratagene) from which the XbaI and HincII sites had been deleted (Fig. 1A). EGF sequences were inserted by digesting the RT-PCR products described above with XbaI and HincII and ligating these into the corresponding restriction sites in the original aggrecan cDNA. Constructs in which the LEC and CRP motifs were deleted either singly or together were made using an overlapping primer PCR method (14) using plasmids as template. Each construct required two specific primers, one containing the last 20 bp of the first motif and the first 20 bp of the second motif to be joined together. The other specific primer was the inverse complementary sequence of the first. The sequence of the primers at the transition between motifs followed the predicted exon boundaries (18) as though the motifs had been alternatively spliced. The other primers used for the PCR were Agg 4, Agg 1, and the T3 primer from the Bluescript vector (Fig. 1A). PCR products and parent plasmids were digested with the appropriate restriction enzymes and ligated together to produce the panel of constructs shown in Fig. 1. Constructs were released from the Bluescript vector by digestion with EcoRI and subcloned into the expression vectors pcS or pcA.
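The junction-primer rule described above (the last 20 bp of the first motif joined to the first 20 bp of the second motif, with the second primer being its inverse complement) can be illustrated with a minimal Python sketch. The function names and the placeholder sequences are hypothetical and are not the primers or sequences used in this study; the sketch only shows the design logic under the assumption of plain A/C/G/T input.

```python
def reverse_complement(seq: str) -> str:
    """Return the inverse (reverse) complement of a DNA sequence."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def junction_primers(upstream: str, downstream: str, overlap: int = 20):
    """Build the two complementary primers that span a motif junction.

    The forward primer is the last `overlap` bases of the upstream motif
    followed by the first `overlap` bases of the downstream motif.
    The reverse primer is the inverse complement of the forward primer.
    """
    if len(upstream) < overlap or len(downstream) < overlap:
        raise ValueError("each motif must be at least `overlap` bases long")
    forward = upstream[-overlap:].upper() + downstream[:overlap].upper()
    return forward, reverse_complement(forward)


# Placeholder sequences for illustration only (not aggrecan sequence).
first_motif = "ACGT" * 10
second_motif = "GGCA" * 10
fwd, rev = junction_primers(first_motif, second_motif)
```

In an overlapping primer PCR, each of these junction primers would be paired with an outer primer in separate reactions, and the two products would then be joined in a final amplification with the outer primers alone.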
Vector pcS was derived from the mammalian expression vector pcDNA3 (Invitrogen) by replacing the cytomegalovirus promoter region with an NruI and BamHI fragment of the Sig pIg vector (R & D Systems) containing the cytomegalovirus promoter and the sequence encoding the CD33 signal peptide. To make vector pcA, human aggrecan signal sequence RT-PCR product (as described above), digested with HindIII and EcoRI, was subcloned into pcDNA3 digested with the same enzymes, placing the aggrecan AUG translation initiation codon under the control of the cytomegalovirus promoter of the vector. Insert CS (Fig. 1B) was made by digesting pcS.CS.L.C.t with XbaI, which removed the last 43 amino acids of the CS-2 region and the entire G3 region and also cut in the multiple cloning site of the pcS vector at the 3′ end of the insert. Re-ligation of these XbaI sites fused 23 amino acids encoded by the pcS multiple cloning site onto the truncated CS-2 region. This was the only construct with any non-native sequence. All constructs were sequenced to verify in-frame addition of signal sequences and correct sequences across newly constructed motif junctions. In addition, all constructs were translated in a cell-free transcription/translation system (TNT-coupled transcription/translation system; Promega) to confirm full-length translation products.
Cell Culture and Transfection-COS-1 cells (American Type Culture Collection) were grown in Dulbecco's modified Eagle's medium (Life Technologies, Inc.) supplemented with 10% fetal bovine serum, 2 mM L-glutamine, penicillin (100 IU/ml), and streptomycin (100 μg/ml) in a humidified atmosphere (5% CO2) at 37°C. Cells were transfected at ~60% confluency with plasmid DNA (5 μg/60-mm culture dish) using DEAE-dextran (Promega) in chloroquine-containing medium for 4 h. Fresh medium was applied, and the following day the cells were incubated in medium (as above) containing 0.5% fetal bovine serum for a further 48 h.
Embryonic Syrian hamster cells (DES4+.2) transformed with diethylstilbestrol (19,20) were grown in minimum essential medium (Life Technologies, Inc.) supplemented with 10% fetal bovine serum, 2 mM L-glutamine, penicillin (100 IU/ml), streptomycin (100 μg/ml), and minimum essential medium nonessential amino acids (1×) (Life Technologies, Inc.). Cells were transfected at ~70% confluency with plasmid DNA (5 μg/60-mm culture dish) in Pfx-5 lipid solution (Invitrogen) in serum-free medium for 4 h. The transfection medium was replaced with complete culture medium for 18-24 h and then replaced with medium (as above) containing 0.5% fetal bovine serum for 48 h incubation. The harvested medium was centrifuged at 1000 × g for 2 min, and protease inhibitors were added (1 mM EDTA, 2 mM phenylmethylsulfonyl fluoride, 10 mM N-ethylmaleimide, 0.5 μg/ml leupeptin, 1 μg/ml antipain, 5 μg/ml benzamidine HCl, 0.5 μg/ml aprotinin, 0.5 μg/ml chymostatin, and 0.5 μg/ml pepstatin). It was dialyzed against deionized water at 4°C, freeze-dried, dissolved in chondroitinase ABC digestion buffer (0.01 M Tris-HCl, pH 7.4, 0.15 M NaCl) and stored at −20°C. Total RNA was extracted from COS-1 and DES4+.2 cell layers in Tri-Reagent (Sigma), and cellular proteins were isolated from the organic phase of the extract using the manufacturer's alternative procedure. The volumes of cell protein extracts were adjusted to those of the medium samples by centrifugal evaporation, and 10× chondroitinase digestion buffer was added to ensure that comparisons were made between equal fractions of the total medium and the total cell layer extract. Chondroitin ABC lyase (Sigma) digestion of medium and cell layer extract samples was with 1.5 units/ml at 37°C for 3 h, and in a similar buffer, digestion of N-linked oligosaccharides was with 0.23 units/ml endoglycosidase F, 0.29 units/ml peptide N-glycosidase F (Oxford Glycosciences, Abingdon, UK) incubated at 37°C for 18 h.
SDS-PAGE and Immunoblotting-Samples from medium and cell extracts were electrophoresed under reducing conditions in 7.5% SDS-PAGE gels (21), in 4-20% gradient SDS-PAGE gels (Bio-Rad), or in NuPAGE 4-12% Bis-Tris gels (NOVEX), transferred to nitrocellulose membranes (22) (with Tris-Bicine transfer buffer (NOVEX) for NuPAGE gels), and blocked in 10 mM phosphate-buffered saline (138 mM NaCl, 2.7 mM KCl, pH 7.4) containing 4% nonfat dried milk powder. Transfer was confirmed by Ponceau R staining. Immunodetection of expressed products was carried out with JD5, a rabbit polyclonal antiserum raised against a bacterial GST.CS2.LEC.CRP.t fusion protein (17) and also with mouse monoclonal antibodies, 1B5, 2B6, and 3B3, which recognize nonsulfated, 4-sulfated, and 6-sulfated chondroitin sulfate chains, respectively, after chondroitinase ABC digestion (23). Bound antibody was detected with horseradish peroxidase-conjugated anti-rabbit or anti-mouse secondary antibodies and chemiluminescence (NEN Life Science Products). Immunoblots were quantitated using a GS-700 imaging densitometer and Molecular Analyst software (Bio-Rad). The integrated optical density was determined for each band and corrected for background. For each construct the percentage of product secreted was calculated for three separate transfections from blots with nonsaturating densities and in a range that showed a linear correlation with loading.
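The quantitation described above can be sketched in Python as follows. The exact formula is not stated in the text, so this sketch assumes that the percentage secreted is the background-corrected density of the medium band divided by the sum of the corrected medium and cell layer bands, which is reasonable only because equal fractions of medium and cell layer extract were compared. The function names and the density values are hypothetical and are not data from this study.

```python
from statistics import mean


def corrected_density(band_iod: float, background_iod: float) -> float:
    """Subtract background from an integrated optical density (floor at 0)."""
    return max(band_iod - background_iod, 0.0)


def percent_secreted(medium_iod: float, cell_iod: float) -> float:
    """Assumed definition: medium / (medium + cell layer) x 100.

    Both inputs must already be background-corrected and must represent
    equal fractions of the total medium and total cell layer extract.
    """
    total = medium_iod + cell_iod
    if total <= 0:
        raise ValueError("no detectable product in either fraction")
    return 100.0 * medium_iod / total


# Hypothetical readings for three separate transfections of one construct:
# (medium band, cell layer band, background), arbitrary densitometer units.
readings = [(9.5, 1.5, 0.5), (8.5, 2.5, 0.5), (9.0, 2.0, 0.5)]

per_transfection = [
    percent_secreted(corrected_density(m, bg), corrected_density(c, bg))
    for m, c, bg in readings
]
average_secreted = mean(per_transfection)
```

Averaging the per-transfection percentages, rather than pooling the raw densities, keeps each blot's own exposure and background from dominating the result.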

RESULTS
Constructs based on a cDNA coding for the C-terminal third of the human aggrecan mRNA were prepared in a vector, pcS, based on pcDNA3 (Invitrogen). This cDNA sequence included a major part (1155 bp) of the CS-2 chondroitin sulfate attachment region (CS) with 27 potential CS attachment sites, the C-lectin-like module (L), the complement regulatory protein-like module (C), and the natural short (74 bp) C-terminal sequence (t). This CS.L.C.t sequence (Fig. 1) is the form found most commonly in chondrocyte aggrecan mRNA and lacks the two alternatively spliced EGF-like modules (16). The vector contained a signal sequence from the protein CD33 to direct protein translation to membrane-bound ribosomes and into the secretory pathway. Other cDNA constructs were prepared with deletions of each of the LEC and CRP modules and with inserted EGF-1 (E1) and EGF-2 (E2) modules generated by RT-PCR from human chondrocyte mRNA. This provided a range of constructs containing the CS-2 region and a variable number of folded protein modules (Fig. 1). All the constructs were prepared with the folded protein modules delineated with junctions at the natural exon boundaries.
The constructs were assessed initially by in vitro translation, and all gave protein products; however, many appeared to be larger than expected on SDS-PAGE. As only a construct lacking the CS-2 region was of the expected size (not shown), it appeared that anomalous migration was a property of the CS-2 region. This may result from poor binding of SDS by this protein region or possibly because it has an extended or stiffened conformation. It increased the apparent mass of all products containing the CS-2 region by about 40 kDa in the electrophoresis system used. Because it was also observed with CS-2 products from in vitro translation, the anomalous migration was not a result of the glycosylation of CS-2.
Initial transfection studies were carried out in COS-1 cells (monkey origin). Transfection of the basic CS.L.C.t construct with DEAE-dextran produced transient expression, which was investigated by immunoblotting of cell layer and medium samples after SDS-PAGE (Fig. 2). The results showed that the protein was synthesized and secreted and was well detected by the antiserum in the cell layer and in the medium after culture for 48 h. The separate removal of either of the LEC or CRP modules had no significant effect on synthesis and secretion in transfected cells. However, constructs containing the CS-2 region but lacking both LEC and CRP modules (CS.t and CS) were much less abundant in the medium at 48 h, although they were well expressed in the cell layer. The relatively long culture time used for these analyses showed a clear distinction between protein products that were secreted into the medium (80-90% in the medium in 48 h) and products that appeared intracellularly but were not efficiently secreted into the medium (<10% in the medium in 48 h). This established a pattern of synthesis and secretion in which the presence of either the LEC or the CRP module was necessary for the efficient secretion of a CS-2 region (Table I).
Having obtained this result in COS-1 cells, the pattern of secretion was also investigated in DES4+.2 cells. These cells are derived from Syrian hamster embryonic stem cells by mutagenesis (19) and have been characterized as chondrocyte-like as they express collagen type II and IX mRNAs (20). Transfection of these cells with the various constructs showed a pattern of secretion similar to that of COS-1 cells (Fig. 3). Only the constructs containing the CS-2 region but lacking a folded protein module failed to be efficiently secreted. The addition of EGF-1 or -2 modules to the construct with the CS-2 region and both LEC and CRP modules again resulted in the synthesis and secretion of the product. Even the addition of a single EGF-1 module to the CS-2 region was sufficient to give secretion of the protein into the medium well above that observed with CS-2 region alone, although over a number of transfections it appeared less efficient than with the other protein modules. This pattern of secretion of products was consistent between transfection experiments with DES4+.2 cells and COS-1 cells. The extended CS-2 region was poorly translocated and secreted if expressed on its own or together with the natural C-terminal sequence (t), but if it was attached to any of the folded protein modules in the G3 domain, this was sufficient to permit its translocation and secretion. However, no specific module was required for this to occur.
In further transfection experiments in DES4+.2 cells, the CD33 signal sequence in the construct was replaced by the natural aggrecan signal sequence, but this had no effect on synthesis or secretion or on the apparent size of the products (not shown). There was therefore no suggestion that the signal sequence was responsible for directing the translated protein into different pathways.
The size of the products expressed in transfected cells was larger than those produced by cell-free translation, and the size of most of the expressed products secreted in the medium of both cell types was larger than that in the cell layer (Fig. 4). These differences in size are likely to result from various processes of glycosylation in the rough endoplasmic reticulum and Golgi regions during translocation through the secretory pathway. The level of chondroitin sulfate synthesis on the secreted products was low, as chondroitinase ABC digestion only sharpened the main bands but caused little reduction in their apparent size (Figs. 2-4). The digested products were reactive with chondroitin sulfate monoclonal antibodies (23) (Fig. 5A) and most reactive with that specific for 4-sulfated terminal disaccharide groups (Fig. 5B). The level of chondroitin sulfate also showed no significant variation with constructs of different protein module composition. It thus appeared that only a fraction of the secreted products received some chondroitin sulfate chains during synthesis. A minor product band of faster mobility, which was reactive with the aggrecan antiserum and with CS antibodies, was seen with different constructs most prominently in DES4+.2 cells (Fig. 3) and is likely to result from differences in glycosylation among the secreted products. The medium from the DES4+.2 cells contained additional bands of higher molecular weight reactive with the chondroitin sulfate monoclonal antibodies (Fig. 5, A and B). These bands were present in the medium from nontransfected control cells, but they did not react with the aggrecan-specific JD5 antiserum (Fig. 5A) and are therefore other proteoglycans. The cell layer products from transfected cells were not affected by chondroitinase digestion and were unreactive with CS antibodies, showing that they had no chondroitin sulfate attached.
Digestion of the cell layer product from cells transfected with the CS.L.C.t construct with enzymes (endoglycosidase-F/peptide N-glycosidase F) that remove N-linked oligosaccharides reduced its size to that of the cell-free translation product (Fig. 4). Similar treatment of the secreted product after chondroitinase ABC digestion also reduced its size, but it remained significantly larger than the digested cell layer product.

DISCUSSION
These results with human aggrecan cDNA constructs expressed in mammalian cells show that, as in the nanomelic chick, the extended CS-2 sequence is not translocated or secreted without a C-terminal folded protein motif. In the current study care has been taken in the preparation of constructs to splice all sequences of the folded protein modules at natural junctions to avoid interfering with their secondary or tertiary structure. Under these conditions, any of the protein modules predicted to fold in the G3 domain (24) were able to facilitate secretion of the adjacent CS-2 region, and there was little detectable difference between the LEC and CRP modules in this function. These experiments on human aggrecan G3 expression show some results that differ from those reported (9) on chicken aggrecan G3 expression in Chinese hamster ovary cells, from which it was concluded that the LEC module was essential for the secretion of G3 constructs. In support of this, the interaction of the LEC module with chaperone Hsp-25 was demonstrated, and this highlighted the potentially important function of the chaperone in the folding of the LEC module and in facilitating its translocation to the Golgi and subsequent secretion. However, from the present results, the secretory role of LEC can be replaced by CRP and at least partly by the EGF modules. These protein modules may also interact with chaperones in the process of folding, and this may similarly facilitate secretion.
Previous studies on aggrecan synthesis in chondrocytes have shown the protein to be in the ER for 20-30 min, during which N-linked oligosaccharides are synthesized (25). This is followed by more rapid translocation through the Golgi and by secretion within 5-10 min (26). Much evidence shows the addition of chondroitin sulfate and O-linked oligosaccharides to occur in the medial/trans Golgi (27). The absence of any attached chondroitin sulfate in the cell layer product (Fig. 5A) suggests that most of the molecules in this fraction are in the ER and have not yet reached the medial/trans Golgi. The cell layer product was also shown to contain N-linked oligosaccharides, which supported its location within the ER and explained its increased size compared with the cell-free product. The size of the chondroitinase-digested culture medium product was reduced by the removal of N-linked oligosaccharides, but it remained larger than the cell layer product. This difference in size is likely to result from the residual chondroitin sulfate linkage region sugars and the presence of other O-linked oligosaccharides synthesized during transit through the Golgi before secretion. The relatively low level of chondroitin sulfate synthesis on the expressed constructs in both COS-1 cells and DES4+.2 cells may reflect a limited capacity for chondroitin sulfate synthesis. Neither cell type showed high endogenous levels of [35S]sulfate incorporation into proteoglycans (not shown). The amount of proteoglycan expressed in the transfected cells was therefore possibly too high for the efficient attachment of chondroitin sulfate. However, this is unlikely to have affected the pattern of secretion observed in this study, as transfection of chicken aggrecan G3 constructs in cells deficient in glycosaminoglycan synthesis has previously been shown to give similar results to transfection in normal cells (13).
There are clearly important mechanisms within the cell (28,29) that prevent unfolded proteins from leaving the rough endoplasmic reticulum and progressing along the secretory pathway when they have been mistranslated, mis-spliced, or are from a defective gene, such as the nanomelic chick aggrecan. The expression of proteins with unfolded and extended sequences may require protection from such mechanisms. In this study the presence of a single folded protein module attached C-terminal to the extended and unfolded CS-2 sequence appeared sufficient to avoid this surveillance and rejection mechanism. In this family of proteoglycans the extended sequences are positioned between folded domains. In their normal biosynthesis this structural arrangement may ensure that they avoid being mistaken for incorrectly expressed products. The capping of extended sequences with folded domains in secretory proteins may have evolved as part of a prerequisite for normal secretion.
The results suggest that the LEC module in the aggrecan G3 domain is not essential for aggrecan intracellular translocation and secretion. It is therefore likely that it is highly conserved within the C-terminal domains of the aggrecan family of proteoglycans for other reasons. Calcium-dependent interaction of the LEC module of all members of this family has been shown to occur with tenascin-R by binding to the fibronectin type III repeats (7). Calcium-dependent interactions with carbohydrate ligands have also been detected (4-6). The LEC, CRP, and EGF-like modules may therefore together contribute to extracellular functions of the G3-like domains of the aggrecan family, and the alternative splicing of EGF-1 and EGF-2 may occur to modulate these functions.