Expression of Collagen XVIII and Localization of Its Glycosaminoglycan Attachment Sites

Collagen XVIII is the only currently known collagen that carries heparan sulfate glycosaminoglycan side chains. The number and location of the glycosaminoglycan attachment sites in the core protein were determined by eukaryotic expression of full-length chick collagen XVIII and site-directed mutagenesis. Three Ser-Gly consensus sequences carrying glycosaminoglycan side chains were detected in the middle and N-terminal part of the core protein. One of the Ser-Gly consensus sequences carried a heparan sulfate side chain, and the remaining two had mixed chondroitin and heparan sulfate side chains; thus, recombinant collagen XVIII was a hybrid of heparan sulfate and chondroitin sulfate proteoglycan. In contrast, collagen XVIII from all chick tissues so far assayed has exclusively heparan sulfate side chains, indicating that the posttranslational modification of proteins expressed in vitro is not entirely identical to the processing that occurs in a living embryo. Incubating the various mutated collagen XVIIIs with retinal basement membranes showed that the heparan sulfate glycosaminoglycan side chains mediate the binding of collagen XVIII to basement membranes.

Collagens XVIII and XV are members of the multiplexins, a collagen subfamily that is characterized by multiple alternating collagenous and non-collagenous domains in the protein sequence (1,2). Collagen XVIII came into the public spotlight with the discovery that its C-terminal peptide, named endostatin, has anti-angiogenic and anti-tumor activities (3,4). The anti-angiogenic activity of the peptide led to the idea that collagen XVIII might be involved in the development of the vascular system. The targeted deletion of collagen XVIII in mice (5) and a naturally occurring mutation in humans (6), however, showed that its main function is restricted to the development of the vasculature in the eye and that it has no function in blood vessel development in other parts of the body. The fact that collagen XVIII is, next to collagen IV, the only collagen that is conserved from Drosophila (7) and Caenorhabditis elegans (8) to humans suggests an important function of the protein in evolution. The deduced amino acid sequences from collagen XVIII cDNA in mouse and human suggested that collagen XVIII is a proteoglycan (1,9). It was later revealed by investigating the naturally occurring protein from chick embryos (10) that collagen XVIII is a heparan sulfate proteoglycan (HSPG), which is, next to perlecan and agrin, the third extracellular matrix HSPG currently known (10,11). HSPGs are members of a family of cell surface proteins with polysaccharide side chains characterized by alternating uronic acid and glucosamine units (12,13). They occur either as integral membrane proteins (14,15) or as secreted extracellular matrix proteins (12).
The polysaccharide chains are connected to the core proteins through serine of the Ser-Gly (SG) consensus sequence whereby the presence of nearby acidic amino acids and the repetition of SGs are factors that promote attachment of glycosaminoglycan (GAG) side chains and favor synthesis of heparan sulfate (HS) over chondroitin sulfate (CS) side chains (16,17). Although the locations of HS chains in syndecan and glypican have been determined (16,17), their locations in agrin and collagen XVIII are unknown. In this study, we expressed the full-length chick collagen XVIII and determined the number and locations of the GAG chains. Our results showed that the GAG glycosylation was affected by cell culture conditions such as the presence or absence of fetal calf serum, and the preference for attachment of HS and CS was determined by the consensus sequence.

EXPERIMENTAL PROCEDURES
Molecular Cloning of Chick Collagen XVIII DNA by Nested PCR-To obtain cDNA sequences covering the entire length of collagen XVIII, a 2.4-kb chick collagen XVIII 3′ cDNA (p10d; Ref. 10) was extended by nested PCR from a random-primed E6 chick amnion library (λ-ZapII; Stratagene, La Jolla, CA). The nucleotide sequence of the T3 or T7 promoter from the library vector was used as the forward primer, and two nucleotide sequences designed from the internal 5′ cDNA sequences of p10d were used as the reverse and nested reverse primers. The first round of PCR contained 4 µl of supernatant of the boiled and centrifuged library (2 × 10^10 plaque-forming units/ml) and elongase mix as the DNA polymerase (Invitrogen) in a total volume of 40 µl. Bands of the expected sizes were extracted from a 1% agarose gel and re-amplified in the second round of PCR with the T7 or T3 forward primer and the nested reverse primer. The second round of PCR was carried out using Pfu turbo DNA polymerase (Stratagene) following the recommendations provided by the manufacturer. The re-amplified DNA was extracted from the agarose gel, ligated into pPCR-Script AMP SK(+) vector, and transformed into Epicurian coli (Ep. coli) XL 10-Gold ultracompetent cells using the PCR-Script AMP cloning kit (Stratagene). Plasmids from white colonies were purified and sequenced in the University of Pittsburgh Sequencing Facility. The sequences were analyzed using the Sequencher software (DNA Codes Inc., Ann Arbor, MI) and compared with published databases using the Blast search algorithm (17). The amino acid sequence was deduced from the DNA sequence using Sequencher. Three clones were obtained (P53T7P64, P12T7P21, INT2T7ANTI22) that covered the remaining collagen XVIII cDNA.
Molecular Cloning of the Full-length Coding Sequence for Chick Collagen XVIII-The three PCR clones obtained by nested PCR and the previously obtained p10d were connected by PCR to obtain a cDNA for the full-length coding sequence of chick collagen XVIII. The reaction contained 0.07 pmol each of the DNA segments, a forward primer (5′-TTCGCTAGCTGAGCCCGAGAACCTGAGC-3′) that contains an NheI cutting site in addition to 3 nucleotides to protect the cutting site at the 5′ end, and a reverse primer (5′-TCTCGAGCTCATTTTTTGGCGGCAGTCATG-3′) that contains an XhoI cutting site and 4 protection nucleotides at the 5′ end. The PCR was carried out using Pfu turbo DNA polymerase. The cycling conditions were 94°C for 1 min and 72°C for 7 min for 2 cycles; 94°C for 1 min, 55°C for 1 min, and 72°C for 6 min for 2 cycles; and 94°C for 1 min and 72°C for 7 min for 30 cycles, following denaturing at 94°C for 3 min. A 4-kilobase band extracted from the 1% agarose gels was ligated into pPCR-Script AMP SK(+) vector and transformed into Ep. coli XL 10-Gold ultracompetent cells as described above. One of the clones (Full-12) contained the complete chick collagen XVIII-coding sequence, as confirmed by DNA sequencing.
Site-directed Mutagenesis-The oligonucleotides used to generate the mutant constructs are shown in Table I. Site-directed mutagenesis was carried out on the Full-12 cDNA clone using the QuikChange site-directed mutagenesis kit (Stratagene). Briefly, a mutated duplicate of the original plasmid was produced by PCR using Pfu DNA polymerase and primers containing the desired mutation. After digestion of the original plasmid by DpnI, Ep. coli XL 10-Gold ultracompetent cells were transformed with the nicked plasmid, which contains the desired mutation.
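The serine-to-alanine substitutions introduced by the mutagenic primers can be illustrated with a minimal Python sketch. It is not the procedure or the primers of Table I: the function name `mutate_ser_to_ala`, the codon choices, and the toy coding sequence are assumptions made only for illustration. For each serine codon the sketch picks an alanine codon that differs in as few bases as possible, which is the usual aim when designing a mismatch primer.

```python
# Serine codons mapped to an alanine codon with the fewest base changes.
SER_TO_ALA = {
    "TCT": "GCT", "TCC": "GCC", "TCA": "GCA", "TCG": "GCG",  # one mismatch
    "AGT": "GCT", "AGC": "GCC",                              # two mismatches
}


def mutate_ser_to_ala(cds: str, residue: int) -> str:
    """Return `cds` with the serine codon at 1-based `residue` changed to alanine."""
    cds = cds.upper()
    start = (residue - 1) * 3
    codon = cds[start:start + 3]
    if codon not in SER_TO_ALA:
        raise ValueError(f"residue {residue} is encoded by {codon!r}, not a serine codon")
    return cds[:start] + SER_TO_ALA[codon] + cds[start + 3:]


# Hypothetical toy coding sequence: Met-Asp-Ser-Gly-Glu
toy_cds = "ATGGATTCTGGCGAA"
mutant = mutate_ser_to_ala(toy_cds, 3)  # destroys the Ser-Gly site
print(mutant)
```

In this toy case a single base change (TCT to GCT) converts the Ser-Gly dipeptide into Ala-Gly.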
Construction, Expression, and Purification of Recombinant Collagen XVIII-Plasmids from Full-12 and the mutated Full-12 clones were digested with NheI and XhoI (New England Biolabs, Beverly, MA) and de-phosphorylated with alkaline phosphatase (0.05 units/pmol ends; Roche Molecular Biochemicals). The digested and de-phosphorylated DNA inserts were isolated from 1% agarose gels and ligated into an NheI- and XhoI-digested pCEP-PU expression vector (Ref. 18; kindly provided by Dr. Mats Paulsson, University of Cologne, Germany). The plasmid contains a BM40 signal peptide sequence, a puromycin resistance gene, Myc and histidine tags, and a Factor X digestion site at the N terminus. 293-EBNA cells (Invitrogen) and chick meningeal cells were transfected with the constructs using LipofectAMINE (Invitrogen). Stably transfected 293-EBNA cells were cultured in Dulbecco's modified Eagle's medium, 10% fetal bovine serum (Invitrogen), 1 mM ascorbic acid, 1% glutamine-penicillin-streptomycin, 350 µg/ml G418, and 1 µg/ml puromycin (Sigma), and transiently transfected chick meningeal cells were cultured for 5 days in the culture medium without G418 and puromycin. Cell culture supernatants (10 ml) were collected after 24 h of incubation and applied to a Q-Sepharose column (2 ml; Amersham Biosciences) equilibrated with 0.05 M phosphate buffer (PB) containing 0.15 M NaCl, pH 7.2, to adsorb the collagen XVIII. The columns were washed with PB containing 0.15 M NaCl (10 ml). Non-glycosylated collagen XVIII was eluted with 0.5 M NaCl in PB (10 ml), and the glycosylated version was eluted with 1.5 M NaCl in PB (10 ml). Fractions containing collagen XVIII were combined, concentrated to 0.5 ml using ultrafree-50 centrifugal filter devices (Millipore, Bedford, MA), and dialyzed against Hanks' solution before storage at −80°C.
SDS-PAGE and Western Blotting-Vitreous body was collected from E5-E7 chick eyes and centrifuged at 15,000 rpm for 5 min. Recombinant collagen XVIII was purified by Q-Sepharose chromatography as described above and dialyzed against calcium- and magnesium-free Hanks' solution. The samples were subjected to 3.5-15% SDS-PAGE under reducing conditions. Some of the samples were also run under non-reducing conditions and without boiling the sample. The proteins were transferred onto nitrocellulose (Millipore), and the blots were probed with 6C4, a monoclonal antibody to the NC1 domain of chick collagen XVIII (10,19). The proteins were visualized using alkaline phosphatase-conjugated goat anti-mouse IgG (Jackson ImmunoResearch, West Grove, PA) and 4-nitro blue tetrazolium and 5-bromo-4-chloro-3-indolyl phosphate (Roche Molecular Biochemicals) as coloring reagents.
Northern Blots-Northern blotting was performed as described previously (10). Briefly, total RNA was isolated from E7-E10 chick brain, embryo, liver, and kidney using TRIzol Reagent (Invitrogen). Poly(A)+ RNA was purified using the PolyATrac mRNA isolation system IV (Promega, Madison, WI), and samples of 2 µg of poly(A)+ RNA were separated in 1% agarose gels containing 2.2 M formaldehyde, transferred to positively charged nylon membranes (Immobilon N; Millipore), and cross-linked with UV. A digoxigenin-labeled cRNA hybridization probe was synthesized from the linearized P12T7P21 collagen XVIII clone template using the RNA-polymerase labeling kit (Roche Molecular Biochemicals). The digoxigenin-labeled cRNA hybridized blot was visualized using alkaline phosphatase-conjugated anti-digoxigenin Fab fragment with CSPD ready-to-use chemiluminescent substrate (Roche Molecular Biochemicals). RNA molecular weight marker II (Roche Molecular Biochemicals) was used to determine the size of the collagen XVIII mRNA.
Binding of Collagen XVIII to Retinal Basement Membranes-A PAPpen (EMS, Washington, PA) was used to circle an ~1-cm² area of a plastic dish coated with nitrocellulose (20). The hydrophobic ring reduces the volume of the incubation solutions to less than 100 µl. Retinal basement membranes were isolated from E6 chick retinae as flat sheets within the encircled areas of the dishes as described (25). The basement membranes were incubated in 2% Triton X-100 for 1 h to remove cellular debris, and the basement membranes with the surrounding plastic were blocked with 1% bovine serum albumin in 0.2% Triton. The samples were incubated with the supernatant from the transfected EBNA cells for 2 h. After washing in calcium- and magnesium-free Hanks' solution, the preparations were incubated with a monoclonal antibody to the Myc epitope (Santa Cruz Biotechnology, Santa Cruz, CA) for 1 h, washed, and visualized with a Cy-3-labeled goat-anti-mouse secondary antibody. Binding of collagen XVIII to tissues was done by

TABLE I
Mutagenic oligonucleotides for site-directed mutagenesis of chick collagen XVIII
Mismatches with the template are indicated by boldface letters. Two single mutants, M2 and M7, were produced using Full-12 as the template with forward and reverse primers 2 and 7, respectively. The PCR system for double mutant DM78 contained M7 as the template and forward and reverse primer 8. For DM27 and DM28, M2 was used as the template with forward and reverse primers 7 and 8, respectively. For triple mutants TM278, TM378, and TM478, DM78 was used as the template with forward and reverse primers 2, 3, and 4, respectively.

RESULTS
Extension of the cDNA Sequence of Chick Collagen XVIII by Nested PCR-A cDNA clone (p10d) for chick collagen XVIII was previously obtained by conventional screening of an E5 chick yolk sac library with the 6C4 monoclonal antibody. It covers 2.4 kb of the 3′ end of collagen XVIII with 1.24 kb of the 3′-untranslated region (10). To extend the sequence in the 5′ direction, we sequentially amplified three collagen XVIII cDNAs from a randomly primed amnion library using a PCR procedure. Taking the T3 or T7 primer of the library vector as a forward primer and a specific reverse primer from the 5′ end of p10d, we amplified two DNA bands in the first round of PCR with the T3 primer and five bands with the T7 primer. In the second round of PCR using the nested reverse primer, only two DNA bands were obtained with the T7 primer. These bands were slightly smaller in size than the DNA bands from the first round of PCR, as expected from the close proximity of the specific reverse primer and the nested reverse primer. One of the DNA sequences, named INT2T7ANTI22, was homologous to human and mouse collagen XVIII cDNA. The INT2T7ANTI22 clone contained a 1270-bp insert and overlapped with p10d by 150 bp. We repeated the procedure using a specific reverse primer and a nested reverse primer designed from INT2T7ANTI22 to obtain P12T7P21 and then from P12T7P21 to obtain P53T7P64. Both P12T7P21 and P53T7P64 overlapped with their respective 3′ predecessors (Fig. 1). The deduced amino acid sequence of P53T7P64 contained the starting Met codon and a 26-residue signal peptide.
The Complete cDNA Sequence of Chick Collagen XVIII-A continuous chick collagen XVIII cDNA sequence was obtained by the alignment of P53T7P64, P12T7P21, INT2T7ANTI22, and p10d. The sequence contained 5279 bp with an open reading frame of 4032 bp (Fig. 1) that corresponded to the short version of human and mouse collagen XVIII α1. The sequence is deposited in the NCBI GenBank database under accession number AF083440.
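Joining overlapping clones into one continuous sequence can be sketched in a few lines of Python. This is only an illustration of the idea, not the alignment software used for the study: the function `merge_by_overlap`, the minimum overlap length, and the toy fragments are assumptions, and the sketch requires exact matches, whereas real assembly tools tolerate sequencing errors.

```python
def merge_by_overlap(upstream: str, downstream: str, min_overlap: int = 4) -> str:
    """Join two sequences on the longest suffix of `upstream`
    that exactly matches a prefix of `downstream`."""
    longest = min(len(upstream), len(downstream))
    for k in range(longest, min_overlap - 1, -1):
        if upstream[-k:] == downstream[:k]:
            return upstream + downstream[k:]
    raise ValueError("no overlap of the required length")


# Hypothetical toy fragments, ordered 5' to 3'
fragments = ["ATGGCTAGGTC", "AGGTCCATTGA", "ATTGACCGTAA"]
contig = fragments[0]
for frag in fragments[1:]:
    contig = merge_by_overlap(contig, frag)
print(contig)

# Consistency check on the reported numbers: a 4032-bp open reading
# frame corresponds to 4032 / 3 = 1344 codons.
orf_bp = 4032
codons = orf_bp // 3
print(codons)
```

The same bookkeeping applies to the real clones: a 1270-bp insert that overlaps a neighboring clone by 150 bp extends the known sequence by 1120 bp.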
The deduced protein sequence from the cDNA of chick collagen XVIII α1 chain is 1344 amino acids long, with the first 26 amino acid residues comprising the signal peptide. The chick sequence is slightly longer than the human (1336) and mouse (1315) short variants (Ref. 21; Fig. 1). Between chick and human, the overall identity is 61%, and the homology is 71%, lower than those between human and mouse of 75 and 95%, respectively. Like the human counterpart, chick collagen XVIII consists of 11 non-collagenous (NC) domains, which are separated by 10 collagenous domains (COL). The N-terminal COL-10 is 71 amino acid residues long with 2 imperfections in the GXY repeats, whereas the corresponding COL-10 in human collagen XVIII is 25 residues (21). Six cysteine residues are present in chick collagen XVIII, two of which are located in NC11, and four are located in the endostatin domain of NC1. The presence of cysteine residues was conserved between chick and human. A potential N-linked glycosylation site located in COL8 was conserved between chick and human. Another potential N-linked glycosylation site, present at the beginning of NC11, was not conserved. There are eight potential GAG glycosylation sites in chick collagen XVIII, three of which are in fact GAG attachment sites (see below).
In human and mouse tissues, the mRNAs of collagen XVIII α1 exist in a short and a long splice variant, with the long version occurring prominently in liver (1,9,21). Attempts to obtain the long version by rapid amplification of cDNA 5′ ends using poly(A)+ RNAs from chick E6-7 liver and kidney tissues failed (data not shown). The absence of the long collagen XVIII splice variant in chick was confirmed by Northern blot analysis: poly(A)+ RNAs from E6-7 chick liver, kidney, heart, and brain probed with a cRNA probe synthesized from the 5′ clone P12T7P21 (see Fig. 1) showed a single band of 5.5 kb (Fig. 2). The finding was consistent with the previous Northern blots probed with the 3′ p10d cRNA probe (see Fig. 1), which also showed a single band at 5.5 kb (10).
Expression and Characterization of Recombinant Chick Collagen XVIII-Using P53T7P64, P12T7P21, INT2T7ANTI22, and p10d as templates and nucleotide sequences from the 5′ and 3′ end of the collagen XVIII-coding sequence as primers, we obtained a continuous cDNA for collagen XVIII by PCR and cloned the cDNA into an expression vector. The expression construct was stably transfected into 293-EBNA cells, and the expressed protein was isolated from the cell culture supernatant using Q-Sepharose. The fraction eluted with 0.5 M NaCl contained non-glycosylated collagen XVIII core protein with an apparent molecular mass of 187 kDa. The GAG-free core protein accounted for less than 5% of total expressed collagen XVIII. The high salt fraction eluted from the ion exchanger contained GAG-glycosylated collagen XVIII, which appeared as a smear with an average molecular mass of 350 kDa (Fig. 3, lane 3). In some batches a small portion of non-glycosylated core protein was also detectable in the high salt fraction. When non-reduced, unboiled samples were loaded onto the gels, the recombinant collagen XVIII appeared as a smear with a molecular mass between 187 and 1000 kDa (Fig. 3, lane 4). The main part of the smear showed a molecular mass between 700 and 1000 kDa, identical to the size of non-reduced, unboiled collagen XVIII from vitreous body (Fig. 3, lane 2). It most likely represents the homotrimer of the collagen XVIII α1 chain (10). The lower part of the smear migrated at the same position as the boiled and reduced collagen XVIII from vitreous body and most likely represents the collagen XVIII monomer. The presence of monomer in recombinant collagen XVIII indicates that trimerization is incomplete.
Collagen XVIII expressed in serum-free medium contained a higher proportion of core protein, and the mobility of the GAG-glycosylated product on SDS-PAGE was higher on average (Fig. 3, lane 5) than that of the protein expressed in the presence of serum (Fig. 3, lane 3). This indicates that serum in the culture medium not only promoted the degree of glycosylation of collagen XVIII but also affected the length or negative charge of the GAG side chains.
To prove the collagenous nature of the expressed collagen XVIII, we digested samples with collagenase and found that the protein was degraded to several fragments. The molecular mass of the main fragment was 35 kDa (Fig. 4, lane 4), identical to the size of the dominant, collagenase-resistant fragment of vitreous body collagen XVIII (Fig. 4, lane 2). The fact that the band was recognized by the NC1-specific 6C4 monoclonal antibody and that its size was identical to that of collagen XVIII NC1 confirmed that the fragment is the NC1 domain. To test the collagenase for potential protease contamination, vitreous body samples were treated with collagenase and probed with an anti-tenascin antibody. Tenascin, a 220-kDa non-collagenous protein, was not degraded by the collagenase (Fig. 4, lanes 5 and 6), showing that the collagenase preparations were free of other proteases.
Collagen XVIII expressed in EBNA cells was very sensitive to proteolysis. The degradation fragments on the blots ranged from 51 to 75 kDa (Fig. 3, lane 3, and Fig. 4, lane 3). Based on their reactivity with the 6C4 monoclonal antibody, they were from the C-terminal part of the molecule. To prevent degradation, we added a mixture of commercially available protease inhibitors including 3 µM aprotinin, 50 µM cathepsin B inhibitor II (Calbiochem), 100 µM E-64, 50 µM leupeptin, and 100 µM pepstatin to the cell culture medium and the solutions used for Q-Sepharose chromatography. The presence of inhibitors resulted in slightly different degradation fragments; however, the average intensity and size of the bands were roughly the same as those from collagen XVIII expressed without protease inhibitors present (data not shown). Cathepsin B inhibitor II and E-64, which have been reported to inhibit the release of endostatin at the C terminus (27), were not effective in preventing degradation of the whole molecule. In addition, the results showed that the yield of collagen XVIII increased with longer incubation periods; however, at the same time, the degradation of collagen XVIII increased even more. As a compromise, we collected the culture supernatant for purification in 24-h intervals.
GAGs and Their Attachment Sites of the Recombinant Collagen XVIII-Similar to the highly glycosylated collagen XVIII from vitreous, amnion, kidney, and meninges (10), recombinant collagen XVIII appeared in Western blots as a smear with an apparent molecular mass of 350 kDa (Fig. 5A, lane 1). Treatment with heparitinase caused a slight decrease in size to 320 kDa and the appearance of a faint band in the size of the core protein at 187 kDa (Fig. 5A, lane 2). When treated with chondroitinase ABC, the size of the protein shifted to 240 kDa and the core protein band of 187 kDa became more prominent (Fig. 5A, lane 4). When treated with a mixture of heparitinase and chondroitinase, the recombinant protein appeared as two sharp, overlapping bands with a molecular mass around 187 kDa, showing that the recombinant collagen XVIII expressed in 293-EBNA cells is a hybrid HSPG/chondroitin sulfate proteoglycan (Fig. 5A, lane 3). An identical GAG composition of HS/CS was also detected with recombinant collagen XVIII expressed in chick meningeal cells (data not shown). To compare the GAG composition of recombinant collagen XVIII with that of the natural protein, we treated vitreous body collagen XVIII with the same enzymes. The results confirmed previous experiments (10,19), showing that collagen XVIII from all chick tissues so far tested (vitreous body, kidney, amnion, and meninges) is a heparan sulfate proteoglycan with no participation of chondroitin sulfate (Fig. 5B).
Recombinant collagen XVIII has eight potential GAG glycosylation sites, five of which are located in NC-11. The other three are located one each in COL-10, NC-9, and NC-8. The sequences of potential GAG attachment sites are listed in Fig. 6. Sites 2, 3, and 4 appear as a group of three SG residues separated by 3 and 7 residues, respectively. Sites 3, 7, and 8 are conserved between chick, Xenopus (22), mouse, and human (Fig. 1). To reveal which sites are in fact glycosylated, we used site-directed mutagenesis and expressed the mutated proteins in EBNA cells. TM278, a triple mutant in which the serine residues in sites 2, 7, and 8 were substituted by alanines, appeared as a non-glycosylated protein of 187 kDa (Fig. 7, lane 1), showing that the GAG side chains are attached at these 3 sites. A double mutant DM78, in which the serine residues of sites 7 and 8 were substituted with alanines, showed a smear with an average molecular mass of 235 kDa (Fig. 7, lane 2). The GAG side chain remaining in DM78 molecules (site 2) shifted in molecular mass only after digestion with heparitinase (Fig. 8A, lane 2) or a mixture of heparitinase and chondroitinase ABC (Fig. 8A, lane 3) but not by chondroitinase alone (Fig. 8A, lane 4), demonstrating that the carbohydrate side chain on site 2 is a heparan sulfate GAG. The double mutant DM28, in which the serine residues of sites 2 and 8 were substituted by alanines, showed a smear with an average molecular mass of 280 kDa (Fig. 7, lane 3). The remaining GAG (site 7) in the DM28 mutant collagen XVIII was completely digested with the mixture of heparitinase and chondroitinase ABC (Fig. 8B, lane 3) and partially digested with either heparitinase (Fig. 8B, lane 2) or chondroitinase (Fig. 8B, lane 4), suggesting that the GAG attached at site 7 is either a chondroitin sulfate or a heparan sulfate GAG. The abundance of the core protein after heparitinase or chondroitinase ABC treatment (Fig. 8B, lanes 2 and 4) showed that the major proportion of GAGs attached to site 7 is chondroitin sulfate. The double mutant DM27, in which the serine residues of sites 2 and 7 were substituted with alanines, showed a smear with an average molecular mass of 290 kDa (Fig. 7, lane 4). Similar to site 7, the carbohydrate at site 8 was sensitive to heparitinase (Fig. 8C, lane 2) and chondroitinase ABC (Fig. 8C, lane 4) and could only be completely digested with both enzymes (Fig. 8C, lane 3), suggesting that the GAG attached at site 8 is either chondroitin sulfate or heparan sulfate. Based on the gel pattern of lanes 2 and 4 in Fig. 8C, the dominant GAG at site 8 is chondroitin sulfate.
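The apparent size shifts reported for the double mutants can be tabulated with a short Python sketch. It is rough bookkeeping only: the masses are the average positions of SDS-PAGE smears quoted in the text, and the variable and function names are assumptions introduced for illustration.

```python
CORE_KDA = 187  # apparent mass of the non-glycosylated core protein

# Average apparent mass (kDa) of each double mutant and the one
# GAG attachment site that it retains, as quoted in the text.
double_mutants = {
    "DM78": {"retained_site": 2, "mass_kda": 235},
    "DM28": {"retained_site": 7, "mass_kda": 280},
    "DM27": {"retained_site": 8, "mass_kda": 290},
}


def apparent_gag_shift(mass_kda: float, core_kda: float = CORE_KDA) -> float:
    """Apparent mass added by the GAG chain(s) on the retained site."""
    return mass_kda - core_kda


shifts = {
    entry["retained_site"]: apparent_gag_shift(entry["mass_kda"])
    for entry in double_mutants.values()
}
for site, shift in sorted(shifts.items()):
    print(f"site {site}: about {shift} kDa above the core protein")

naive_total = CORE_KDA + sum(shifts.values())
print(f"naive sum of core plus all three shifts: {naive_total} kDa")
```

The naive sum (431 kDa) exceeds the average of about 350 kDa reported for the unmutated recombinant protein, so these shifts are best read as apparent gel mobilities rather than additive chain masses.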
The observation that EBNA cells secrete more collagen XVIII core protein when transfected with doubly or triply mutated as compared with singly or non-mutated collagen XVIII cDNAs indicates that not all of the three attachment sites were occupied with GAG side chains in every collagen XVIII molecule. This would also explain why core protein bands appeared in all of the mutant samples in Fig. 7. The fact that the collagen XVIII core protein appeared after heparitinase (Fig. 5A, lane 2) and chondroitinase ABC treatment (Fig. 5A, lane 4) suggests that some of the molecules contain only chondroitin sulfate GAG or only heparan sulfate GAG.
Potential glycosylation sites 2, 3, and 4 are present as three SGs following each other in short sequence. The SG residues of sites 2 and 3 are separated by three residues (DFG; Fig. 6), and the SG residues of sites 3 and 4 are separated by seven residues (AGDRHHP; Fig. 6). The results presented above showed that the serine at site 2 was glycosylated, and the GAG chain attached to this site was heparan sulfate. To investigate whether the serine residues in sites 3 and 4 have an effect on glycosylation of the serine in site 2, we substituted the serine residues in sites 3, 7, and 8 with alanines (TM378) and the serine residues in sites 4, 7, and 8 with alanines (TM478). TM478 appeared as a smear with an average molecular mass of 235 kDa (Fig. 9A, lane 3) and a banding pattern similar to that of DM78 shown in Fig. 9A, lane 4. The GAG of TM478 was a heparan sulfate (data not shown), showing that the mutation of serine at site 4 does not affect the glycosylation at site 2. Interestingly, TM378 appeared as a smear with an average molecular mass of 290 kDa (Fig. 9A, lane 2). The GAG at site 2 was sensitive to both heparitinase (Fig. 9B, lane 1) and chondroitinase ABC (Fig. 9B, lane 2) but was entirely digested only with the mixture of both enzymes (Fig. 9B, lane 3). According to the staining intensities of the core protein bands after the enzyme digestions (Fig. 9B), the dominant GAG side chain of TM378 at glycosylation site 2 is chondroitin sulfate. The result suggests that the assembly of HS or CS at site 2 is affected by the adjacent serine at site 3.
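Locating Ser-Gly dipeptides and the spacing between them can be sketched in Python as follows. The fragment is assembled only from the spacer residues quoted above (DFG and AGDRHHP) and does not include the residues that flank sites 2 to 4 in the real protein, so the acidic-residue counts it yields are illustrative. The function name `find_sg_sites` and the flanking window of six residues are assumptions, not criteria taken from this study.

```python
import re

ACIDIC = set("DE")  # aspartate and glutamate


def find_sg_sites(seq: str, window: int = 6):
    """List every Ser-Gly dipeptide with its 1-based serine position,
    the number of acidic residues within `window` residues on either
    side, and the number of residues separating it from the previous SG."""
    sites = []
    prev_end = None
    for match in re.finditer("SG", seq):
        start, end = match.start(), match.end()
        flank = seq[max(0, start - window):start] + seq[end:end + window]
        sites.append({
            "serine_pos": start + 1,
            "acidic_nearby": sum(aa in ACIDIC for aa in flank),
            "gap_from_previous": None if prev_end is None else start - prev_end,
        })
        prev_end = end
    return sites


# SG (site 2) - DFG - SG (site 3) - AGDRHHP - SG (site 4)
fragment = "SGDFGSGAGDRHHPSG"
sites = find_sg_sites(fragment)
for site in sites:
    print(site)
```

For this fragment the scan reports gaps of 3 and 7 residues between consecutive SG dipeptides, matching the spacing described for sites 2, 3, and 4.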
Binding of Collagen XVIII to Retinal and Tectal Basement Membranes-It is well established that collagen XVIII is a constituent of basement membranes (10,11). To investigate whether the recombinant collagen XVIII and its mutated versions could bind to basement membranes, we incubated isolated retinal basement membranes and sections of chick embryos with collagen XVIII and tested for binding by using the Myc tag fused to the recombinant protein. As shown in Fig. 10a, the recombinant collagen XVIII, which contained mixed CS and HS, and the mutant DM78 (Fig. 10, c and e), which has exclusively HS, bound to the retinal basement membrane. The mutant collagen XVIII TM278, which has no GAG side chains (Fig. 10b), and DM27 (Fig. 10, d and f) and DM28 (not shown), which have predominantly CS side chains, did not bind to the basement membrane. The binding data showed that the heparan sulfate side chain on site 2 was solely responsible for the basement membrane binding.

DISCUSSION
Cloning of Chick Collagen XVIII by PCR-To clone the entire collagen XVIII we amplified the missing segments from a random-primed cDNA library by extending a previously isolated 3′ sequence (p10d) (10). The T3 or T7 promoter sequence, part of the vector, was used as the forward primer. The only requirement for the PCR extension was to design a specific and a nested reverse primer from the known collagen XVIII cDNA. The method turned out to be a very reliable alternative to the more elaborate 5′ rapid amplification of cDNA ends. Furthermore, it not only was successfully used to obtain the 5′ half of collagen XVIII but also allowed us to extend the extremely long sequence of laminin α1 from the very 5′ end in the 3′ direction. Thus, the method was applicable to extend DNA sequences from any location of the molecule in both directions.
Chick Collagen XVIII-The protein sequence of the chick collagen XVIII showed a high homology to human and mouse collagen XVIII with all 10 collagenous and all 11 non-collagenous domains preserved. The most conserved regions were in the C-terminal NC1 domain, with 90% homology in the endostatin region. Recombinant chick collagen XVIII was more sensitive to proteolytic degradation than the endogenous one in vitreous body. The reason, we speculate, is that triple helix formation is incomplete, as shown by the presence of collagen XVIII monomers in non-boiled, non-reduced samples (Fig. 3). The dominant fragment was from the C-terminal part of the protein, suggesting that collagen XVIII has unique features in its C-terminal NC1 domain that make this part of the molecule very sensitive to degradation. We also have preliminary data showing that collagen XVIII from vitreous body and the meninges has its C-terminal part clipped off, and a naturally existing endostatin-like peptide was identified in vitreous body. Collagen XVIII in human and mouse exists in a short and a long version (1,9,21), whereby the long version is highly expressed in liver. Northern blots with both 3′ and 5′ probes and 5′ rapid amplification of cDNA ends did not reveal a long version of collagen XVIII in chick tissues. We speculate that the chick has only the short version of the protein.
GAG Glycosylation of Collagen XVIII-Eight potential GAG attachment sites were located in the chick collagen XVIII. Three of these sites carry GAGs, confirming that collagen XVIII is a proteoglycan (10). All three attachment sites and their adjacent amino acids were conserved in human, mouse, and Xenopus collagen XVIII (22), consistent with the notion that collagen XVIIIs from other species are also proteoglycans (21). The chick SG consensus sites 1-6 were reduced in human and mouse to a single site, suggesting an evolutionary pressure to maintain at least one SG site at this position of the protein. The recombinant chick collagen XVIII expressed in either human 293-EBNA cells or chick meningeal cells turned out to be a hybrid chondroitin sulfate proteoglycan/HSPG, in contrast to the endogenous collagen XVIII from chick vitreous body, amnion, kidney, and meninges, which has only heparan sulfate side chains. Evidently, the in vitro expression of the protein does not entirely recapitulate the normal posttranslational modifications that occur in the chick embryo, and the unusual posttranslational modification appears to be independent of the species from which the cells were derived.

FIG. 7 (legend fragment). ... TM278 (lane 1), DM78 (lane 2), DM28 (lane 3), and DM27 (lane 4). The blot was probed with the 6C4 antibody. The serines at sites 2, 7 and 8, 7 and 8, 2 and 8, and 2 and 7, were substituted by alanines in TM278, DM78, DM28, and DM27, respectively. TM278 appeared as sharp bands with a molecular mass around 187 kDa, corresponding to the collagen XVIII core protein. The other mutants appeared as diffuse bands, indicating that long GAG chains are attached to the core protein. The locations of the mutations are indicated by stars in the diagrams, and the remaining GAG-glycosylated sites are circled. Standard molecular masses are indicated. The collagen XVIII core protein band is indicated by an arrowhead.

FIG. 8 (legend fragment). ... Untreated DM28 and DM27 appeared as broad bands with a molecular mass around 320 kDa (B and C, lanes 1). The sizes of the proteins were only reduced to a sharp band of 187 kDa after treatment with both heparitinase and chondroitinase (B and C, lanes 3). Treatment with heparitinase shifted the molecular weight only slightly and led to a minor increase in core protein (B and C, lanes 2). Chondroitinase digestion led to a more prominent shift and a major increase of core protein (B and C, lanes 4). The collagen XVIII core protein band is indicated by an arrowhead.
Previous studies show that the GAGs are connected to the core proteins via SG consensus sequences. By comparing peptide sequences close to GAG attachment sites (16,17), it was found that the presence of acidic amino acids before or after the SG sites enhances the chance of a protein being glycosylated (16). Furthermore, multiple SGs in short sequence increase the chance of a site becoming connected with a heparan sulfate side chain (17). Recent studies, however, showed that neither the clear-cut identification of an SG consensus sequence as a glycosylation site nor the prediction of the type of glycosylation at a specific site is possible. Some peptide domains distant from the SG sites were shown also to be important in the glycosylation (23,24). Furthermore, the fact that endogenous and recombinant chick collagen XVIII differ in their GAG glycosylation suggests that additional regulatory factors such as the cell type in which the protein is expressed and cell culture conditions are also important for glycosylation.
All three GAG attachment sites identified in collagen XVIII fulfill the requirements postulated for GAG glycosylation, as they have acidic residues ahead of the SG consensus sequences, and in one case (site 2), three SGs follow in short distance to each other. Indeed, site 2 is connected to the HS GAG, as expected, and the mutation of the SG at site 3 converts the HS into a CS side chain. However, sites 1, 3, and 4 also fulfill the requirements for GAG attachment sites, yet they are not glycosylated. Additional factors that turned out to be important for glycosylation were (a) the cell type expressing the proteins and (b) the growth condition of the cells. Collagen XVIII from vitreous body, amnion, the meninges, and kidney is an HSPG and has no CS side chains, in contrast to the recombinant protein, in which 2 of the 3 GAG side chains are predominantly CS. It could mean that cells from the ciliary body, amnion, kidney, and the meninges, all of which express collagen XVIII (10,26), are capable of producing the fully heparan sulfate-glycosylated protein, whereas the 293-EBNA cells cannot. This is probably not the case, since chick meningeal cells and 293-EBNA cells produce the same incorrect GAGs in vitro. The different glycosylation in vivo and in vitro could mean that growth conditions of cells in a living organism promote the glycosylation in a way that has not been reproduced in vitro. It was remarkable that the presence of fetal calf serum in the culture medium had a strong influence on the properties of the GAG chains, and it is conceivable that growth factors or signaling molecules are required to promote the correct glycosylation. The most reliable predictor for GAG glycosylation in collagen XVIII was the conservation of the SG consensus sites in different species: all three sites with GAGs were conserved between human, mouse, Xenopus, and chick.