Human Recombinant α1(V) Collagen Chain

Human embryonic kidney cells (293-EBNA) were transfected with the full-length human α1 chain of collagen V using an episomal vector. High yields (15 μg/ml) of recombinant collagen were secreted into the culture medium. In the presence of ascorbate, the α1(V) collagen is correctly folded into a stable triple helix, as shown by electron microscopy and pepsin resistance. Circular dichroism data confirm the triple-helix conformation and indicate a melting temperature of 37.5 °C for the recombinant homotrimer. The major secreted form is a 250-kDa polypeptide (α1FL). N-terminal sequencing and collagenase digestion indicate that α1FL retains the complete N-propeptide but lacks the C-propeptide. However, α1FL might undergo further N-terminal trimming into a form (α1TH) corresponding to the main triple-helix domain plus the major part of the NC2 domain. This processing differs from that of the heterotrimer (α1(V))2α2(V) and could have some physiological relevance. Analysis of cell homogenates indicates the presence of a 280-kDa polypeptide that is disulfide-linked through its C-terminal globular domain. This C-propeptide is rapidly cleaved after secretion into the medium, giving the first evidence of C-terminal processing of a recombinant fibrillar collagen. Rotary shadowing observations not only confirm the presence of a globular domain at the N-terminal end of the molecule but also reveal a kink within the triple helix in a region poor in imino acids. This region could represent a target for proteases. Together with the thermal stability data, these results might explain the low amount of (α1(V))3 recovered from tissues.

Fibrillar collagens are the most abundant structural proteins in the extracellular matrix. They all participate in building the fibrillar network and thus contribute to the architecture of the extracellular matrix (1). However, the minor collagens V and XI can be distinguished from the others by their capacity to control fibrillogenesis (2,3). All fibrillar collagens are composed of a major triple-helix domain (COL1) flanked by two noncollagenous domains, namely the N-propeptide (NC2, COL2, and NC3) and the C-propeptide (NC1). Whereas collagens I, II, and III undergo processing that reduces the molecule mainly to the triple-helix domain, collagen V retains a large part of the N-propeptide in the mature molecule (4,5). This propeptide forms a globular domain that could dictate the fibril diameter by sterically inhibiting the accretion of collagen I within heterotypic fibrils (6). In addition to the importance of N-propeptide retention, heterotypic fibrils were shown to be thinner in tissues where the amount of collagen V is particularly high (7-11), and conversely, a reduction in the proportion of collagen V molecules alters the regulation of fibrillogenesis (12). Significantly, genetic alteration of collagen V molecules impairs the control of matrix assembly (13-16). Therefore, despite being a quantitatively minor collagen, collagen V is involved in fundamental processes such as development and in human connective tissue disorders.
So far, studies on collagen V function have concerned the most abundant and widely distributed molecular form, i.e. the heterotrimer (α1(V))2α2(V). Nevertheless, collagen V also occurs with different chain associations: α1(V)α2(V)α3(V), described only in human placenta (17), and the homotrimer (α1(V))3. Homotrimers have not been isolated from tissues but were reported in Chinese hamster cell cultures (18). Moreover, previous reports inferred that the homotrimer also exists in vivo, based on the ratio of α1 versus α2 chains found in chick embryo crop (19) and human bone extracts (20). However, because this molecular form could not be isolated from tissues, it has not been possible to formally study the (α1(V))3 homotrimer, which is thus almost completely uncharacterized.
We used a eukaryotic recombinant approach to engineer and generate α1(V) molecules in sufficient amounts for biochemical characterization and further functional analysis. The data presented here constitute the first structural and biochemical study of α1(V) homotrimers.

MATERIALS AND METHODS
Construction and Transfection of the Expression Vector - Four clones coding for the α1(V) collagen chain were kindly provided by Dr. Takahara (Biotechnology Research Laboratories, Takara Shuzo Co., Japan).
Clone 508 encodes the region from base 1 to base 1021, clone 302 contains the sequence from 717 to 3430, clone 401A contains the sequence from 3430 to 5240, and clone 401D contains the sequence from 5240 to 5676 (21).
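The clone coordinates given above can be checked for gap-free coverage of the cDNA with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the coordinates are copied from the text rather than taken from a sequence record.

```python
# Clone coordinates (first base, last base, inclusive) as stated in the text.
CLONES = {
    "508": (1, 1021),
    "302": (717, 3430),
    "401A": (3430, 5240),
    "401D": (5240, 5676),
}


def shared_bases(first, second):
    """Number of bases shared by two inclusive (start, end) intervals."""
    start = max(first[0], second[0])
    end = min(first[1], second[1])
    return max(0, end - start + 1)


def is_contiguous(clones):
    """True if the clones, ordered by start, leave no uncovered base between them."""
    ordered = sorted(clones.values())
    return all(nxt[0] <= prev[1] + 1 for prev, nxt in zip(ordered, ordered[1:]))


if __name__ == "__main__":
    print("contiguous:", is_contiguous(CLONES))
    print("508/302 overlap (bases):", shared_bases(CLONES["508"], CLONES["302"]))
```

With these coordinates, clones 508 and 302 overlap substantially, whereas the later clones only abut at a single shared base, which is consistent with the junctions at 3430 and 5240 being restriction sites used in the assembly.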
The cDNAs were first subcloned from an EcoRI site in M13 into an EcoRI site of the Bluescript plasmid SK. As the α1(V) sequence itself comprises three internal EcoRI sites, several intermediate plasmids had to be designed.
The first step consisted of subcloning a BamHI-EcoRI fragment (bases 750-3430) of clone 302 into a Bluescript SK plasmid; this plasmid was called A. The main part of the vector was then constructed by cloning an EcoRI-AccI fragment (bases 3430-5160) from clone 401 into plasmid A. The resulting plasmid contained bases 750-5160 and was called B.
The 5′ end of the cDNA was generated by combining clones 508 and 302 up to the SphI site. A BamHI-XhoI fragment of clone 302 was subcloned in KS Bluescript; this plasmid was called plasmid C. Clone 508 was digested with BamHI, and the fragment was subcloned into plasmid C, generating a plasmid called D, which encoded bases 1-1900. A 1620-base pair NotI-SphI fragment from plasmid D containing the translational start site and the cDNA sequence up to SphI was subcloned into plasmid B digested with NotI-SphI. The resulting clone, plasmid E, contained sequences 1-5160 of collagen V cDNA.
Concerning the 3′ end of the cDNA, a polymerase chain reaction product consisting of the last 400 bases of the cDNA (bases 5242-5643) was generated in order to remove an EcoRI site located at base 5671 and to keep the stop codon. The primers were TATATCGATCTAGCCCATGAAGCAAGCCGG, which generates a ClaI site, and AGTGAATTCAAGCGTGGGAAACTGCTCTCC. The EcoRI-ClaI fragment was subcloned in Bluescript SK and sequenced. It was then cloned into the 3′-most EcoRI site (5240) of a plasmid 401 first digested with BamHI and religated to remove the EcoRI site at base 3430. A SalI fragment containing the sequences 5160-5676 was excised from this plasmid and cloned into the single SalI site in plasmid E to generate the full-length cDNA (bases 1-5750) in the Bluescript vector KS. The full-length cDNA was excised using KpnI and subcloned into the mammalian episomal expression vector pCEP-4 (Invitrogen).
Sequences at junction points were checked. The expression plasmid was amplified by Qiagen prep and transfected into human embryonic kidney 293-EBNA cells by electroporation (960 microfarads, 250 V). This cell line constitutively expresses the EBNA-1 protein from the Epstein-Barr virus, allowing episomal replication of the vector. 18 μg of DNA were used for 7 million cells. The transfected cells were selected with hygromycin (300 μg/ml) for 15 days.
Protein Production and Characterization - Media from hygromycin-resistant 293-EBNA cells were tested for expression of α1(V) chains by 6% SDS-PAGE, followed by Coomassie Blue staining. For immunostaining, proteins were electrotransferred onto polyvinylidene difluoride membranes (Immobilon-P; Millipore, Molsheim, France) overnight in 10 mM CAPS, pH 11, 5% methanol. After saturation, membranes were incubated with polyclonal antibodies against collagen V (22), followed by secondary antibodies conjugated to alkaline phosphatase. Collagen V standards (pepsinized bovine bone and intact human embryonic bone heterotrimeric collagen V) were purified and characterized as described previously (23). For further protein characterization and purification, cells were grown in serum-free medium in the presence or absence of sodium ascorbate (50 μg/ml). In some cases, cell layers were solubilized on ice for 20 min in 1 ml of lysis buffer (100 mM NaCl, 20 mM Tris-HCl, pH 7.6, 25 mM EDTA, 5 mM N-ethylmaleimide, 2 mM phenylmethanesulfonyl fluoride, 0.1% SDS, 1% Nonidet P-40, 0.1% Triton X-100). The cell lysate was centrifuged, and the supernatant was analyzed by SDS-PAGE followed by electrotransfer and immunostaining as described above.
Protein Purification - Large amounts of serum-free medium from transfected 293-EBNA cells were collected every 48 h and stored at −20°C before dialysis against 50 mM Tris-HCl, pH 8.6. After centrifugation, the pellet was resuspended in 0.1 M acetic acid and stored at −20°C. The supernatant was passed over a DEAE column (Econo column, 2.5 × 15; Bio-Rad) and subsequently eluted with a linear 0-0.5 M NaCl gradient. Pools containing purified recombinant α1(V) triple-helix domain were recovered from the 0.25 M NaCl elution, dialyzed against 0.1 M acetic acid, and stored at −20°C until used.
Proteolytic Digestion - The different samples were digested with pepsin in 0.5 M acetic acid for 3 h at 20°C at an enzyme/substrate ratio of approximately 1:5. For bacterial collagenase digestions, freshly collected serum-free medium was dialyzed against 50 mM Tris-HCl, pH 8.6, and centrifuged, and the pellet was resuspended in a small volume of 50 mM Tris-HCl, 150 mM NaCl, 6 mM CaCl2, pH 7.4. Collagenase digestion (Advanced Biofacture) was performed at 37°C for 3 h at an enzyme/substrate ratio of 1:7. Digestion products were analyzed by SDS-PAGE.
Analytical and Electron Microscopy Methods - Amino acid compositions were determined after hydrolysis under vacuum (6 N HCl, 115°C, 24 h) in the presence of 2-mercaptoethanol in a Pico Tag system (Waters) with a Beckman amino acid analyzer. Amino acid sequence analysis was performed by automated Edman degradation using an Applied Biosystems 473A protein sequencer.
Triple-helix conformation and thermal stability of the recombinant homotrimer (α1TH) were analyzed by circular dichroism. Spectra were recorded at 4°C in 0.05 M acetic acid on a CD6 Jobin Yvon spectropolarimeter equipped with a variable temperature unit. For comparison, we used pepsinized heterotrimeric collagen V extracted from amniotic/chorionic membrane of human placenta as described previously (23). However, it is worth mentioning that the N-terminal extensions of the pepsin-treated heterotrimer are completely cleaved off, whereas the homotrimeric α1TH molecules retain the major part of the NC2 domain. Measurements were done with a 1-mm path length cuvette at a constant rate of 1 nm/min with a 0.2-nm resolution. Thermal transition curves were obtained by monitoring the ellipticity [Θ] at 222 nm as a function of temperature.
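One common way to read a melting temperature from such a thermal transition curve is to take the midpoint of the normalized signal. The sketch below illustrates that idea only; it is not the procedure used in this study. It assumes a simple two-state transition with flat plateaus, the function names are hypothetical, and the curve is synthetic (generated from a sigmoid centered on 37.5°C), not measured data.

```python
import math


def fraction_folded(signal):
    """Normalize [Θ]222 readings between the folded (first) and unfolded (last) plateau."""
    folded, unfolded = signal[0], signal[-1]
    return [(value - unfolded) / (folded - unfolded) for value in signal]


def melting_temperature(temps, signal):
    """Temperature at which the folded fraction crosses 0.5 (linear interpolation)."""
    frac = fraction_folded(signal)
    points = list(zip(temps, frac))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 >= 0.5 >= f1 and f0 != f1:
            return t0 + (f0 - 0.5) * (t1 - t0) / (f0 - f1)
    raise ValueError("no midpoint crossing found")


# Synthetic curve for illustration: 20 to 50 degrees C in 0.5-degree steps.
temps = [20 + 0.5 * i for i in range(61)]
signal = [1.0 / (1.0 + math.exp(t - 37.5)) for t in temps]

if __name__ == "__main__":
    print(f"estimated Tm: {melting_temperature(temps, signal):.1f} C")
```

Real curves usually have sloping baselines, in which case the plateaus would need to be fitted before normalization rather than taken from the first and last points.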
For rotary shadowing, samples were diluted to 10 μg/ml with 0.1 M acetic acid, mixed with glycerol (1:1), sprayed onto freshly cleaved mica sheets, and immediately placed on the holder of a MED 010 evaporator (Balzers). Rotary shadowing was carried out as described previously (23). Observations of replicas were performed with a Philips CM120 microscope at the CMEABG (Centre de Microscopie Electronique Appliquée à la Biologie et à la Géologie, Université Claude Bernard, Lyon I).

RESULTS
Expression of Recombinant α1(V) Chains - Electrophoresis analysis of serum-free medium from α1(V)-transfected 293-EBNA cells demonstrated an additional 250-kDa protein band, referred to as α1FL, which is absent in nontransfected cell medium (Fig. 1A). A concentration in the range 15-20 μg/ml for the recombinant α1FL chains was estimated based on the intensity of Coomassie Blue staining after electrophoresis.
The secreted α1FL chains rapidly underwent degradation into a band, referred to as α1TH, that migrated to a position identical to that of pepsinized α1(V) collagen isolated from tissue (Fig. 1A). The identity of the recombinant α1(V) chain products was confirmed by immunoblot analysis with polyclonal antibodies against pepsinized collagen V (Fig. 1B).
The N-terminal sequence AQPA for the upper band, α1FL, was determined by Edman degradation after electrotransfer. This sequence starts with the first amino acid of the α1(V) chain after the predicted signal peptide cleavage site and indicates that the entire N-propeptide is present. This is in agreement with the slower migration of α1FL compared with intact α1(V) chains isolated from human bone (Fig. 1A). Indeed, the tissue form of collagen V undergoes N-terminal processing that removes a large part of the N-terminal propeptide including the PARP domain (5,20). The α1TH migration is not affected by pepsin treatment, attesting that this band corresponds to the main triple-helix domain of the α1(V) chain (data not shown). However, N-terminal sequencing performed on the α1TH form gave the peptide sequence RFGGGGDAGS, starting at residue 525. This result indicates that the α1TH form retains the major part of the NC2 domain.
A first purification step showed that α1FL chains precipitate selectively at pH 8.6 in Tris-HCl buffer. Although α1TH forms coprecipitate at this pH, large amounts of α1TH remain soluble and can be eluted from a DEAE column with 0.25 M NaCl as a single protein (Fig. 1A).
Recombinant α1(V) Chains Characterization - When samples were not reduced before electrophoresis (Fig. 1A), the α1FL band was not converted into a high molecular mass product, indicating the absence of disulfide-bonded trimers in the culture medium. Immunoblotting of transfected cell homogenates showed a band that stained with collagen V antibodies and migrated more slowly than α1FL from medium (Fig. 2). The difference in migration is in good accordance with the predicted molecular mass of the α1(V) C-propeptide (less than 30 kDa). Moreover, a high molecular mass product is observed in unreduced cell homogenate and likely corresponds to disulfide-bonded trimers. Taken together, these results indicate that, although the C-propeptide of synthesized recombinant chains is involved in disulfide bonding, it is rapidly cleaved after secretion into the medium. In addition, collagenase digestion of α1FL chains from cell medium revealed a unique 86-kDa fragment (Fig. 3) with the same N-terminal sequence as α1FL (AQPA), indicating that this band corresponds to the entire unprocessed N-propeptide. When nonreduced (Fig. 3), the 86-kDa fragment migrates slightly faster. This difference in migration between the reduced and nonreduced forms of this fragment attests to a correct folding of the N-propeptide via intrachain disulfide bonds.
The α1FL chains secreted by transfected cells were triple helices, as indicated by resistance of the COL1 domain to pepsin digestion (Fig. 4). When ascorbate was omitted, α1FL was secreted as single chains or loosely formed triple helices sensitive to pepsin digestion (Fig. 4). Amino acid composition of media in the presence or absence of ascorbate revealed the absence of hydroxyproline in the medium devoid of ascorbate. In contrast, purified α1FL homotrimers contain 112 Hyp residues/1,000, which is in agreement with the maximal value predicted from the α1(V) cDNA sequence (115 Hyp residues/1,000 for the COL1 domain and 124 including COL2). Although the recombinant α1(V) chains were poorly hydroxylated at lysyl residues (6/1,000 instead of 31/1,000 for tissue-purified α1(V) chains), the triple-helix domain is, at least in part, glycosylated, since the corresponding band (α1TH) stains with Schiff's reagent (data not shown).
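The hydroxylation figures quoted above can be put on a common scale by expressing each observed value as a fraction of its reference value. The following is a minimal arithmetic sketch; the helper name is hypothetical, and only the residues/1,000 figures come from the text.

```python
def fraction_of_reference(observed_per_1000, reference_per_1000):
    """Observed residues/1,000 expressed as a fraction of the reference value."""
    return observed_per_1000 / reference_per_1000


# Hydroxyproline: observed 112, predicted maximum 115 (COL1) or 124 (COL1 + COL2).
hyp_vs_col1 = fraction_of_reference(112, 115)
hyp_vs_col1_col2 = fraction_of_reference(112, 124)

# Hydroxylysine: observed 6 in the recombinant chains versus 31 in tissue-purified chains.
hyl_vs_tissue = fraction_of_reference(6, 31)

if __name__ == "__main__":
    print(f"Hyp vs COL1 maximum:        {hyp_vs_col1:.0%}")
    print(f"Hyp vs COL1 + COL2 maximum: {hyp_vs_col1_col2:.0%}")
    print(f"Hyl vs tissue chains:       {hyl_vs_tissue:.0%}")
```

On these numbers, prolyl hydroxylation is close to the predicted maximum, whereas lysyl hydroxylation reaches only about one-fifth of the tissue value.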
Thermal stability of the recombinant homotrimers was investigated by circular dichroism (Fig. 5). The CD spectrum obtained for α1TH presents features similar to those of the human pepsinized heterotrimeric collagen V (Fig. 5A). Both show a negative peak at 197 nm and a positive peak around 222 nm, attesting to the triple-helix conformation in collagen (24). The thermal transition curves exhibit a transition at 37.5°C for the recombinant homotrimer and a transition at 39.5°C for the heterotrimer.
Rotary Shadowing - Rotary shadowing of α1FL homotrimers revealed 300-nm-long molecules with a large globular domain at one extremity, the N-propeptide (Fig. 6A). The small triple-helix domain COL2 within the N-propeptide is not visible in our preparations. The absence of a small globular domain at the C-terminal end of the molecule confirms the cleavage of the C-propeptide on secreted molecules. In addition, a kink at about 70 nm from the N-terminal extremity of the homotrimers is almost always observed (Fig. 6B). A 120-kDa fragment often occurred in our preparations and seems to correspond to further trimming of the triple-helix domain α1TH (Fig. 6C). This fragment is also generated by pepsin digestion of α1FL (Fig. 4B), likely indicating a micro-unfolding area within the triple-helix domain of the α1(V) homotrimer. Interestingly, the amino acid sequence of α1(V) contains a unique segment of 13 Gly-X-Y triplets between Gly793 and Ile831 containing only one proline. This region, poor in imino acids, is liable to generate a flexible site in the triple helix. The position of this sequence corresponds to the location of the kink observed in the triple-helix domain of rotary-shadowed recombinant molecules. To test whether the 120-kDa fragment arose from proteolytic cleavage of the flexible site observed by rotary shadowing, amino acid sequencing was performed on the corresponding band by Edman degradation. The sequence obtained, DGPPGHPGKEGP, perfectly matched the α1(V) triple-helix sequence at positions 759-770, about 30 amino acids upstream of the micro-unfolded region. Although this cleavage site did not occur in the imino acid-poor region, it is difficult to estimate to what extent the triple-helix regions flanking the flexible site might be altered. Thus it cannot be excluded that these neighboring regions represent a target for proteases as well.
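The residue arithmetic in this paragraph can be verified directly from the positions quoted. The sketch below is illustrative only: the variable and function names are hypothetical, and the only inputs are the residue numbers and the Edman sequence given in the text.

```python
# Positions and sequence as quoted in the text.
LOW_IMINO_START, LOW_IMINO_END = 793, 831   # Gly793 to Ile831
FRAGMENT_START, FRAGMENT_END = 759, 770     # N terminus of the 120-kDa fragment
FRAGMENT_SEQUENCE = "DGPPGHPGKEGP"


def residues_in_span(first, last):
    """Number of residues in an inclusive span of positions."""
    return last - first + 1


# 39 residues, i.e. 13 complete Gly-X-Y triplets with nothing left over.
triplets, remainder = divmod(residues_in_span(LOW_IMINO_START, LOW_IMINO_END), 3)

# Distance from the fragment N terminus to the start of the imino acid-poor region.
upstream_offset = LOW_IMINO_START - FRAGMENT_START

# Positions of the glycines in the sequenced fragment, in chain numbering.
glycine_positions = [
    FRAGMENT_START + i for i, aa in enumerate(FRAGMENT_SEQUENCE) if aa == "G"
]

# If the numbering is self-consistent, every glycine sits a whole number of
# triplets away from Gly793.
in_register = all((LOW_IMINO_START - pos) % 3 == 0 for pos in glycine_positions)

if __name__ == "__main__":
    print(triplets, remainder, upstream_offset, glycine_positions, in_register)
```

The check shows that the sequenced fragment starts 34 residues before Gly793 and that its glycines fall in the same Gly-X-Y register as the imino acid-poor segment.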
DISCUSSION
Together with collagen XI, collagen V was called a minor collagen because of its low level of expression in tissue. Additionally, the biochemical extraction of its intact form is difficult, and the amount of material available is quite limited. Its recombinant production therefore seemed an indispensable step to obtain enough high grade material. Recombinant approaches have been widely used to obtain human full-length collagen chains including α1(I), α1(II), and α1(III) (25-27). Baculovirus-infected insect cells and human cells have both proven useful in this regard. Human embryonic kidney 293 cells are known to synthesize few, if any, extracellular matrix proteins and to produce recombinant proteins in high quantity when manipulated by genetic engineering (28,29). Moreover, as has been shown recently, these cells ensure correct post-translational modifications specific to collagens (30). For these reasons, we chose these cells to produce α1(V) chains. As shown here, 15-20 μg/ml α1(V) chains were obtained, which is above the yields obtained for other human fibrillar collagens in other eukaryotic systems (2 μg/ml for α1(III) and 0.6 μg/ml for α1(II)). The results presented here demonstrate that the 293 cell expression system is a highly efficient method for the production of substantial amounts of α1(V) homotrimer. This quantitative aspect, although important to consider for further experiments, had to be correlated with qualitative investigations to ensure the biochemical and functional integrity of the recombinant protein.
The α1(V) chains we produced are present in two major species called α1FL and α1TH, which correspond, respectively, as will be discussed below, to an N-terminal unprocessed form and a fully processed one. Both are resistant to pepsin digestion and are thus triple helical in the presence of ascorbate. As commonly observed with fibrillar collagens, ascorbate is a prerequisite for triple-helix formation in our experiments (31,32). It allows the hydroxylation of proline in a range comparable to the one predicted from the cDNA sequence. It has been shown for collagen III and for collagen XII that proline hydroxylation could be essential not only for the stability of the helix but also for the triple-helix nucleation itself, thus canceling the role of the disulfide bonds in this particular process (33,34). This has to be investigated for our model. Concerning the hydroxylation of lysine, a low level is observed, as is the case for other recombinant fibrillar collagens (27). The subsequent glycosylation of the lysines that are hydroxylated does occur, however, since carbohydrates are present in the truncated molecule corresponding to the COL1 domain only. As a consequence of correct hydroxylation of the produced homotrimer, the triple helix formed is properly folded and stable at physiological temperature. Although the major form of collagen V in tissues is the heterotrimer (α1(V))2α2(V), several observations hint at the existence of the homotrimer not only in cell and tissue culture systems (18,19,36) but also in tissues (20). So far, the only experiments showing that α1(V) chains are able to reform stable triple-helix molecules came from in vitro renaturation experiments (35). However, the renatured product showed a Tm of 35°C, whereas our results clearly indicate that the melting temperature of the recombinant homotrimer is more compatible with in vivo conditions (Tm = 37.5°C).
Furthermore, our results provide the first analysis of biochemical and structural properties of the homotrimers. Indeed, rotary shadowing observations showed that the homotrimers exhibit a regular kink at 70 nm from the N-terminal end of the molecule. This kink is correlated with the paucity of prolyl residues in this region. Since the N-terminal sequence of the 120-kDa degradation product corresponds to this area of the molecule, this region is likely to be a target for proteases in vivo as well. The susceptibility of the homotrimers to proteases, together with the melting temperature close to 37°C, may explain the difficulties in detecting substantial amounts from tissues. Fibrillar collagens undergo processing involving two specific proteinases that remove the N- and C-propeptides. This processing is complete in collagens I, II, and III, leaving only two telopeptides flanking the triple helix (37). The N-terminal processing is, however, different in collagen V. In the (α1(V))2α2(V) tissue form, the α2(V) chain is unprocessed, and different authors agree that in the α1(V) chains only sequences of NC3 are lost and that COL2 and NC2 remain intact (4,5,20,38). Thus, in the heterotrimer, a globular domain persists at the N-terminal end of the triple-helix domain, and this is of functional importance since it could sterically inhibit the accretion of heterotypic collagen V/I fibers (6,10). However, the tissue form of the α1(V) homotrimer was suggested to exist only as a fully processed molecule (20). This could be of physiological importance, since the absence of the N-propeptide could promote the existence of thicker heterotypic fibrils. In this regard, the different processing occurring in the various collagen V isoforms could influence fibril diameter modulation.
Interestingly, we observed two molecular forms in the culture media, N-terminal unprocessed (α1FL) and fully processed homotrimers (α1TH). Although a putative N-proteinase cleavage site analogous to that of collagen I was designated in the α1(V) chain at positions Ala541-Gln542 (39,40), the determination of the N-terminal sequence of the recombinant product indicates that the cleavage occurs at positions Phe524-Arg525. Thus, as it generally occurs in fibrillar collagens, the fully processed form of the recombinant homotrimer retains the major part of the NC2 domain.
Concerning the C-terminal domain, three pieces of evidence indicate that it is processed in our model: (i) collagenase digestion of cell medium generates only a single band of 86 kDa whose sequence indicates that it is the NC3 region, (ii) the molecule observed by rotary shadowing lacks a small globular domain at the C terminus, and (iii) the form obtained from cell homogenates exhibits a molecular mass about 30 kDa greater than that of α1FL. This difference matches the size of the C-propeptide. Taken together, these results mean that the cells synthesize a precursor that contains the C-propeptide and that this C-propeptide is rapidly cleaved in the medium after secretion, as occurs in vivo.
Interestingly, our results show that 293 cells contain proteases able to cleave the C-propeptide to generate pN-α1(V) homotrimers. So far, this constitutes the only expression system in which the C-propeptide is removed. As these cells do not synthesize any detectable amounts of collagen, further investigations are needed to elucidate which proteases are involved. However, the process occurring in 293 cells provides an undeniably interesting tool for further studies of the role of the collagen V homotrimer, particularly in fibrillogenesis.