Vascular Ehlers-Danlos Syndrome Mutations in Type III Collagen Differently Stall the Triple Helical Folding

Background: Mutations in collagen may lead to posttranslational over-modifications and a delay in secretion.
Results: Mutations locally delay the propagation of the triple helix.
Conclusion: Position and nature of a mutation are the key factors in the triple helix folding delay.
Significance: Consequences of natural collagen mutations can now be analyzed in terms of rate of folding.

Vascular Ehlers-Danlos syndrome (EDS) type IV is the most severe form of EDS. In many cases the disease is caused by a point mutation of Gly in type III collagen. A slower folding of the collagen helix is a potential cause for over-modifications. However, little is known about the rate of folding of type III collagen in patients with EDS. To understand the molecular mechanism of the effect of mutations, a system was developed for bacterial production of homotrimeric model polypeptides. The C-terminal quarter, 252 residues, of the natural human type III collagen was attached to (GPP)7 with the type XIX collagen trimerization domain (NC2). The natural collagen domain forms a triple helical structure without 4-hydroxylation of proline at a low temperature. At 33 °C, the natural collagenous part is denatured, but the C-terminal (GPP)7-NC2 remains intact. Switching to a low temperature triggers the folding of the type III collagen domain in a zipper-like fashion that resembles the natural process. We used this system for the two known EDS mutations (Gly-to-Val) in the middle at Gly-910 and at the C terminus at Gly-1018. In addition, wild-type and Gly-to-Ala mutants were made. The mutations significantly slow down the overall rate of triple helix formation. The effect of the Gly-to-Val mutation is much more severe than that of Gly-to-Ala. This is the first report on the folding of collagen with EDS mutations, and it demonstrates local delays in the triple helix propagation around the mutated residue.

Ehlers-Danlos syndrome (EDS) comprises a wide variety of soft connective tissue disorders (1). Genetic studies have shown that mutations in the genes of types I, III, and V collagens (2), lysylhydroxylase 1 (coded by the PLOD1 gene) (3), procollagen-N-proteinase (coded by the ADAMTS2 gene) (4), FKBP22 (coded by the FKBP14 gene) (5), and tenascin X (coded by the TNXB gene) (6) cause EDS. Mutations in lysylhydroxylase 3 (coded by the PLOD3 gene) cause an EDS-like phenotype (7). Vascular EDS, type IV, is the most serious form of the disorder because of the lethal rupture of arteries or hollow organs; it is caused by autosomal dominant mutations in the COL3A1 gene. The skin of the patients is usually thin and translucent; therefore, the veins can be clearly seen. The patients frequently have large eyes, lobeless ears, a small chin, short stature, and thin scalp hair. Extensive bruising is also frequently observed. Type III collagen is ubiquitously distributed as a minor component of collagen fibrils but is relatively abundant in skin, artery, and intestine. Bone (8) and cartilage (9) also contain type III collagen. The type III procollagen molecule secreted from cells is not fully processed at its N-terminal noncollagenous part. ADAMTS-2 is the processing enzyme of the N-terminal propeptide, which is also linked to dermatosparaxis EDS (type VIIC) (10,11). Immunoelectron microscopic analysis with a monoclonal antibody against the triple helical domain suggested that type III collagen can be present on banded collagen fibrils regardless of fibril diameter (12). The type III collagen molecules interact with the cartilage collagen fibril surface (9) and also with type I collagen fibrils in many tissues (13). Type III collagen has a slightly longer major triple helical domain (1,029 residues) than type I collagen (1,014 residues).
Because the chain stagger of the type I collagen molecule and the molecular packing of even pure type I collagen fibrils remain unclear (14), the molecular recognition between type III and type I collagen and their role in maintaining the structure and tensile strength of the fibril are largely unexplored. The col3a1 knockout mouse embryos develop phenotypically normal organs, indicating that type III collagen is not critically important for the early organ morphogenesis stage. At weaning age, the survival rate of col3a1 null mice was 5%, mainly due to rupture of the aorta (15), indicating that type III collagen is essential for maintaining the mechanical strength of the extracellular matrix of the aorta so that it withstands the high blood pressure produced by the heartbeat.
As of now, 276 unique DNA mutations in the COL3A1 gene have been reported in the EDS variant database at the University of Leicester (16,17). About half of them are point mutations of glycine residues of the G-X-Y tripeptide unit that repeatedly constitutes the triple helical sequence. About a third of the point mutations of the repeated Gly residues have been reported to cause vascular EDS. The phenotypes of patients are somewhat similar. In addition, haploinsufficiency of type III collagen in humans (18) and mice (19,20) also leads to EDS, suggesting that the amount of type III collagen is also important. Therefore, the amount and structural integrity of the type III collagen molecule in the extracellular matrix must be essential for the full function of the aorta and other organs. The mutation of the smallest residue to any bulkier residue should obviously disturb the molecular structure (21,22). Furthermore, the resulting change of the three-dimensional molecular surface can disturb intermolecular interactions and also change the normal fibril packing and cross-linking patterns. Potentially, the mutation of Gly can also slow down the rate of triple helix folding. Delayed secretion of overglycosylated type I collagen has been previously demonstrated in cultured cells of osteogenesis imperfecta (OI) patients (23). OI can be caused not only by mutations in the collagen genes but also by the absence of, or defects in, the proteins involved in collagen synthesis and secretion. For example, the lack of prolyl 3-hydroxylase 1 (P3H1) causes the delayed secretion of overglycosylated type I collagen from cultured skin fibroblasts (24). In one report of vascular EDS, a broadening of the electrophoresis gel band of type III collagen, presumably indicating over-modification, and an increase of type III collagen inside the patient's fibroblasts were observed (25).
It is reasonable to speculate that the slower folding of type III collagen molecules leads to a prolonged exposure of single (non-triple helical) procollagen chains to modifying enzymes. However, there are no studies that elucidate the effect of mutations on the kinetics of the triple helix folding.
We have recently found that certain noncollagenous domains of several collagen types, such as XVIII (26), XV (27), and XIX (28), form a stable trimerization domain (29). These trimerization domains are attractive tools for protein engineering (29) and are of special interest for studying collagen structure and folding (30,31). Here we recombinantly attached the C-terminal quarter fragment of human type III collagen to a (GPP)7 stretch stabilized by the type XIX collagen NC2 domain. The natural collagenous domain forms the correctly staggered collagen triple helical structure. Importantly, the (GPP)7-NC2XIX portion of this chimeric molecule is thermally more stable than the collagenous domain. This allows us to thermally unfold the collagenous domain located N-terminally while keeping the C termini in a correctly pre-staggered trimeric conformation (locked by the (GPP)7-NC2XIX portion). By lowering the temperature, the kinetics of the triple helix propagation can be immediately monitored. This system allows the study of triple helix propagation when it is not hindered by other processes such as chain association and triple helix nucleation. In the type III collagen quarter fragment, two EDS mutation sites located either in the middle or close to the C terminus, G910V (32; see Footnote 3) or G1018V (33), were chosen to uncover their aberrant role in the collagen folding kinetics. Interestingly, the probability of a Gly-to-Ala point mutation is expected to be equal to that of a Gly-to-Val substitution. Nevertheless, the number of reported EDS cases with Gly-to-Ala mutations is much lower than that with Gly-to-Val. Many of the Gly-to-Ala mutations likely do not produce EDS symptoms. We made both Val and Ala variants at sites Gly-910 and Gly-1018 to uncover the effects of the nature of the mutation and its position on the folding kinetics. For the first time we demonstrate a direct effect of EDS mutations on the rate of triple helical propagation.

EXPERIMENTAL PROCEDURES
Cloning, Expression, and Purification of the C-terminal Quarter Fragment of Human Type III Collagen with (GPP)7 and a Short NC2 Domain of Type XIX Collagen-Human cDNA clone of COL3A1 was kindly gifted by Prof. Takako Sasaki (Oita University, Oita, Japan). The C-terminal quarter fragment, residues 775-1026 of the major triple helical region, was PCR-amplified using two oligonucleotides, 5′-TGC GGA TCC GGA GCT CCA GGC CCA CTT G-3′ (forward primer containing the BamHI site, GGA TCC) and 5′-AGA GGG CCC AGG GGC ACC AGG AGG TCC A-3′ (reverse primer containing the ApaI site, GGG CCC). The amplified DNA was inserted into the pET23-HisTrx_gpp7-XVIIItR vector (31) with a 6× His tag-thioredoxin-thrombin cleavage site Gly-Ser type XVIII collagen trimerization domain (26). Preliminary experiments revealed an unexpected aggregation problem (data not shown). Thus, we replaced the type XVIII collagen trimerization domain with the NC2 domain of human type XIX collagen. The 5′-NcoI-His6-thioredoxin-thrombin cleavage site-BamHI-type III collagen (775-1026) (wild-type, G910A/V, or G1018A/V)-(GPP)7-XmaI fragment was recloned into pET23-His-Trx_gpp7-NC2XIXB (31) using the cloning sites NcoI and XmaI. As a result, the sequence encoding the type XVIII trimerization domain was replaced with the human type XIX collagen NC2 domain containing the last COL2 domain tripeptide Gly-Ile-Pro (residues 1007-1040 of collagen type XIX human protein, accession number CAI42716 version CAI42716.1 GI:57208771). The exact protein sequence attached after the (GPP)7 repeat is GIPADAVSFEEIKKYINQEVLRIFEERMAVFLSQ. The final sequence of the fusion protein is shown in Fig. 1. Plasmids encoding point mutations G910A, G910V, G1018A, and G1018V were prepared with the QuikChange site-directed mutagenesis kit (Agilent). The oligonucleotides were designed on the company's website.
For the G910A mutation, 5′-CTGGTGCTCCCGCTCCTGCTGGTTC-3′ and 5′-GAACCAGCAGGAGCGGGAGCACCAG-3′ (mutated nucleotides are underlined) were used as the sense and antisense primers, respectively. For the G910V mutation, 5′-GCTGGTGCTCCCGTTCCTGCTGGTTCC-3′ and 5′-GGAACCAGCAGGAACGGGAGCACCAGC-3′ were used as the sense and antisense primers. For G1018A, 5′-CAGGGCAACCAGCCCCTCCTGGACC-3′ and 5′-GGTCCAGGAGGGACTGGTTGCCCTG-3′ were used as the sense and antisense primers, respectively. For G1018V, 5′-CAGGGCAACCAGTCCCTCCTGGACC-3′ and 5′-GGTCCAGGAGGGACTGGTTGCCCTG-3′ were used as the sense and antisense primers, respectively. All DNA sequences were verified by Sanger dideoxy DNA sequencing.
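QuikChange-style mutagenesis primers are normally exact reverse complements of each other, so the printed pairs can be sanity-checked mechanically. The sketch below is only an illustration of that check on the sequences as they are transcribed here (line-break hyphens removed); the helper names are hypothetical. Under this check the G910A, G910V, and G1018V pairs are exact reverse complements, whereas the G1018A antisense primer as printed differs from the reverse complement of its sense primer at one position and is identical to the G1018V antisense primer, which may be a typographical slip in the printed sequence.

```python
# Primer pairs (sense, antisense) as printed in the text, 5'->3',
# with line-break hyphens removed. Helper names are hypothetical.
PRIMERS = {
    "G910A": ("CTGGTGCTCCCGCTCCTGCTGGTTC", "GAACCAGCAGGAGCGGGAGCACCAG"),
    "G910V": ("GCTGGTGCTCCCGTTCCTGCTGGTTCC", "GGAACCAGCAGGAACGGGAGCACCAGC"),
    "G1018A": ("CAGGGCAACCAGCCCCTCCTGGACC", "GGTCCAGGAGGGACTGGTTGCCCTG"),
    "G1018V": ("CAGGGCAACCAGTCCCTCCTGGACC", "GGTCCAGGAGGGACTGGTTGCCCTG"),
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def mismatch_positions(sense: str, antisense: str) -> list:
    """0-based positions where antisense differs from revcomp(sense)."""
    expected = reverse_complement(sense)
    if len(expected) != len(antisense):
        raise ValueError("primers differ in length")
    return [i for i, (a, b) in enumerate(zip(expected, antisense)) if a != b]


def check_pairs(primers=PRIMERS) -> dict:
    """Map each mutation name to its list of mismatch positions."""
    return {name: mismatch_positions(s, a) for name, (s, a) in primers.items()}


if __name__ == "__main__":
    for name, positions in check_pairs().items():
        status = "exact reverse complement" if not positions else f"mismatch at {positions}"
        print(f"{name}: {status}")
```

An empty list means the pair is fully complementary; a non-empty list flags positions that are worth re-checking against the original oligonucleotide order.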
The recombinant proteins were expressed at 15°C in the Escherichia coli BL21(DE3) host strain (Invitrogen) after isopropyl β-D-1-thiogalactopyranoside induction (final concentration 1 mM) for 3 days in two flasks, each with 500 ml of 2×TY medium (16 g of tryptone, 10 g of yeast extract, and 5 g of NaCl per liter). Cells were chilled, harvested by centrifugation, and lysed by ultrasonication, and soluble material was precipitated with a saturated solution of ammonium sulfate. The expressed proteins were then purified on a HiTrap™ chelating HP column (GE Healthcare) with immobilized nickel ions. The His-thioredoxin tag fragment was cleaved by thrombin (BD BaculoGold™, BD Biosciences) as described in the manufacturer's instructions. The cleavage was performed with thrombin protease (1 unit/ml) in 40 mM Tris/HCl buffer, pH 8.2, containing 150 mM NaCl on ice. The resulting fragments had two additional amino acid residues, GS, before the natural type III collagen sequence. After thrombin cleavage, the protein solutions were treated with 1 mM diisopropyl fluorophosphate for 3 h on ice. Then the solutions were applied to the HisTrap™ HP column again to remove the His-thioredoxin tag. The natural type III collagen fragments with the trimerization knot were eluted with ~10-20 mM imidazole. After the addition of 0.05% Tween 20, the protein solutions were dialyzed against 20 mM HEPES buffer, pH 7.4, containing 50 mM NaCl and 0.05% Tween 20, and purified on a 5-ml SP-Sepharose Fast Flow column (GE Healthcare). The fractions were collected using a stepwise NaCl gradient. The fractions with the pure proteins (150-200 mM NaCl) were dialyzed and concentrated in 20 mM Tris/HCl buffer, pH 7.0, containing 200 mM NaCl and 0.05% Tween 20 for further analysis. Hereafter, the quarter fragment of human natural type III collagen with (GPP)7 and a short NC2 domain of type XIX collagen is indicated as QF.
Footnote 3: In this study we number the type III collagen sequence starting from the first residue of the major triple helical domain. Thus the residue numbers are 167 smaller than in the translated molecule.
Rotary Shadowing-The purified proteins (0.1 mg/ml) were dialyzed against 100 mM ammonium bicarbonate at 4°C, sprayed onto freshly cleaved mica, and shadowed with platinum carbon (34). The images were viewed by transmission electron microscopy. The length of the molecules from the images was analyzed using the software ImageJ (Version 1.47c; Rasband, W. S., ImageJ, National Institutes of Health, Bethesda, MD; http://imagej.nih.gov/, 1997-2012).
Circular Dichroism (CD) Analysis-CD spectra were recorded on an AVIV model 202 spectrometer (AVIV Biomedical, Inc., Lakewood Township, NJ) with a thermostatted quartz cell of 1 mm path length (Starna Cells, Inc. Atascadero, CA). The spectra were normalized for concentration and path length to obtain the mean molar residue ellipticity after subtraction of the buffer contribution. Thermal scanning curves were recorded at 223 nm. Peptide concentrations were determined by amino acid analysis.
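The normalization described above can be written out explicitly. The following is a minimal sketch of the standard conversion from a buffer-subtracted CD signal to mean residue ellipticity, not the authors' own processing script; the function name and the numerical inputs in the usage line are hypothetical, and only the 1-mm (0.1-cm) path length is taken from the text.

```python
def mean_residue_ellipticity(theta_sample_mdeg, theta_buffer_mdeg,
                             conc_molar, path_cm, n_residues):
    """Mean residue ellipticity in deg cm^2 dmol^-1.

    Uses the standard relation
        [theta] = theta_obs(mdeg) / (10 * c(mol/l) * l(cm) * N),
    where theta_obs is the buffer-subtracted signal and N is the number
    of residues in the chain.
    """
    theta_obs = theta_sample_mdeg - theta_buffer_mdeg
    return theta_obs / (10.0 * conc_molar * path_cm * n_residues)


if __name__ == "__main__":
    # Hypothetical readings; 0.1 cm corresponds to the 1-mm cell in the text.
    value = mean_residue_ellipticity(
        theta_sample_mdeg=12.0, theta_buffer_mdeg=2.0,
        conc_molar=1.0e-5, path_cm=0.1, n_residues=100,
    )
    print(f"[theta] = {value:.0f} deg cm^2 dmol^-1")
```

Dividing by the residue count is what makes signals from constructs of different length comparable on one axis.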
Detection of Refolding Intermediates by Trypsin Resistance and Polyacrylamide Gel Electrophoresis-Detection of refolding intermediates by analysis of trypsin-resistant fragments on SDS-PAGE was performed as described (35) with some modification. Twenty µl of type III collagen quarter fragment stabilized by the (GPP)7-NC2XIX knot, at a concentration of 0.2 mg/ml or higher, was heat-treated in a 0.65-ml thin-wall polypropylene tube at 33°C for 10 min. The natural type III collagen sequence was unfolded under this condition, but the (GPP)7-NC2XIX part remained stable. The sample solutions were then immediately placed on ice. After time periods ranging from 0 to 6 h, 2 µl of 50 mM Tris/HCl buffer, pH 7.4, containing 0.2 M NaCl and 5 mg trypsin/ml was added at 4°C to the sample. After a 2-min incubation on ice, 2 µl of 50 mM 4-(2-aminoethyl)benzenesulfonyl fluoride hydrochloride in cold water was added to the tube. Then, 20% SDS was added, and the mixture was brought to a boiling temperature by immersing it into an oil bath heated to 140°C. The samples were boiled for 5 s after the addition of the SDS buffer and were placed into a heat block kept at 96°C for another 3 min. The samples were analyzed by bis-glycine-buffered SDS-PAGE. Photos of the gels taken after staining with GelCode Blue (Thermo Scientific) were analyzed quantitatively using the software GelQuant.NET provided by biochemlabsolutions.com. The band intensities were integrated and normalized with respect to the undigested sample. The N-terminal sequences of the trypsin-cleaved fragments were analyzed after blotting the SDS-PAGE gel onto an Immobilon™-P membrane (Millipore) with a Bio-Rad Mini Trans-Blot Cell at 40 V for 2.5 h in 50 mM sodium tetraborate. The blot was stained with GelCode Blue. The bands were analyzed on a Procise Protein Sequencer (Applied Biosystems, Inc.).

RESULTS
Rotary Shadowing Images of the Expressed Protein-The QF protein (human type III collagen C-terminal quarter fragment with (GPP)7 and the type XIX collagen NC2 domain, Fig. 1) was expressed in BL21(DE3) cells and purified as described under "Experimental Procedures." Rotary shadowing images showed rope-like structures with neither significant kinks along the molecules nor globular domains at either end (Fig. 2). From the CD spectrum (28), the secondary structure of the NC2 domain of type XIX collagen is estimated to be mainly α-helical. If all the 31 residues of the NC2 domain form the α-helical coiled coil, the corresponding length of the domain is estimated to be roughly 5 nm (based on a rise of 0.15 nm per residue). The thickness of the trimeric α-helical coiled coil structure is about 2.3 nm as determined by the crystal structure of the trimeric coiled coil leucine zipper GCN4 from yeast (36). On the other hand, the diameter of the collagen helix is about 1.4 nm from the type III collagen peptide crystal structure (37). The resolution of the rotary shadowing images is about a few nanometers (38). Therefore, the absence of a globular object at the end of the molecules is in accordance with the extended coiled coil structure of the type XIX collagen NC2 domain. The average length of the observed molecules is 73.8 nm (S.D. 6.9 nm). For the collagen triple helix in hydrated tissues or crystals, the residue pitch along the molecular axis is 0.285 nm (39,40). Thus, the expected length of the triple helical domain is about 78 nm including the (GPP)7 segment. The observed average value of the shadowing image is a little smaller than expected (83 nm for the collagen triple helix extended with the coiled coil). This can possibly be caused by shrinkage during the sample preparation (dehydration) or a partial unfolding of the N-terminal triple helical domain (breathing). 
As determined by trypsin digestion (presented below), the N-terminal part of the collagen sequence is somewhat unstable. As previously reported, the unfolded single chain collagenous polypeptide cannot be observed with this technique (41).
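The length estimate above is simple arithmetic and can be reproduced as follows. This is a sketch under stated assumptions: the residue count of the triple helix is taken as the 252 collagen residues plus the 21 residues of (GPP)7, and the short GS and GIP linker residues are ignored, which is a guess about how the quoted values were obtained. Variable names are hypothetical.

```python
# Axial rise per residue, as quoted in the text.
RISE_TRIPLE_HELIX_NM = 0.285   # collagen triple helix
RISE_ALPHA_HELIX_NM = 0.15     # alpha-helical coiled coil

# Residue counts per chain (assumption: GS and GIP linkers not counted).
N_COLLAGEN = 252               # type III collagen residues 775-1026
N_GPP = 7 * 3                  # (GPP)7
N_NC2 = 31                     # NC2 domain of type XIX collagen

triple_helix_nm = (N_COLLAGEN + N_GPP) * RISE_TRIPLE_HELIX_NM  # about 77.8 nm
coiled_coil_nm = N_NC2 * RISE_ALPHA_HELIX_NM                   # about 4.7 nm
total_nm = triple_helix_nm + coiled_coil_nm

# Observed rotary shadowing statistics from the text.
OBSERVED_MEAN_NM = 73.8
OBSERVED_SD_NM = 6.9

# How far the expected triple-helix length lies above the observed mean,
# expressed in units of the observed standard deviation.
shortfall_in_sd = (triple_helix_nm - OBSERVED_MEAN_NM) / OBSERVED_SD_NM

if __name__ == "__main__":
    print(f"triple helix: {triple_helix_nm:.1f} nm")
    print(f"coiled coil:  {coiled_coil_nm:.1f} nm")
    print(f"total:        {total_nm:.1f} nm")
    print(f"shortfall:    {shortfall_in_sd:.2f} S.D.")
```

Under these assumptions the expected triple-helix length sits well within one standard deviation of the observed mean, consistent with the statement that the measured value is only a little smaller than expected.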
Thermal Stability of Bacterially Expressed Type III Collagen C-terminal Quarter Fragment with (GPP)7-NC2XIX Knot-The QF constructs were purposely designed to keep the C termini of the collagenous domain of interest in a trimeric prestaggered conformation even at a higher temperature at which the collagenous domain is unfolded. This is achieved through the stable (GPP)7-NC2XIX structure at the C terminus. The melting temperature of the (GPP)7 portion is around 50°C (31). All temperature experiments were performed below 35°C to ensure the intact structure of the trimerization and stagger-determining knot. Fig. 3 shows the equilibrium unfolding/refolding experiments with the wild-type (WT) and mutated QF proteins monitored by CD. The heating and cooling rates are 10 K/h. Upon heating, the protein samples demonstrated a linear base line at the lower temperature intervals, which corresponds to the triple helical structure. Two transition phases are observed for WT, which indicates the presence of two cooperative domains. The first transition around 20°C makes up almost 75% of the net signal change and must be assigned to the unfolding of the major N-terminal part of the QF; the second transition around 25°C points to a more stable C-terminal end of the QF (Fig. 3A). The thermal transitions observed for the mutated QF proteins further support these assignments. Indeed, the QF proteins with G910A and G910V (mutations in the middle of the triple helix) demonstrate a decrease in the melting temperature only for the first major transition (Fig. 3A). Mutants G1018A and G1018V (located close to the C terminus) show a destabilization of the second domain, which apparently leads to a shift of the melting temperature to around the position of the first transition (Fig. 3B). The results indicate that the thermal stability of the mutated proteins is globally affected within two separate cooperative domains. Because the proteins were expressed in E. coli, the proline residues in the Y position are not posttranslationally modified to 4-hydroxyproline, resulting in a lower thermal stability of the collagenous domain.
Upon cooling, the refolding profiles show a hysteresis for all the QF proteins, a known phenomenon for collagen unfolding-refolding experiments (30). In general, the WT QF has a propensity to refold faster than any mutated QF. Interestingly, the QF proteins with G910A and G910V demonstrate similar refolding profiles in Fig. 3A around the second transition (with the higher temperature). This additionally supports the above assignment of the second transition to the C-terminal end, as it should not be affected by the distant mutations. It is also remarkable that the refolding of the mutants with the bulkier Val is significantly diminished after a certain point when compared with the Ala substitution.
FIGURE 2. Rotary shadowing images of the QF. The purified QF protein (0.1 mg/ml) was dialyzed against 100 mM ammonium bicarbonate at 4°C, sprayed onto mica, and shadowed with platinum (34). The images were viewed by transmission electron microscopy. A representative field is shown in A. Only molecules with defined, extended tails were measured. The observed average length of the strings is 73.8 nm (S.D. 6.9 nm). Statistical analysis was performed, and an S.D. value was calculated. The length distribution is shown in B.
The N-terminal Sequences of Trypsin-resistant Fragments of QF-The thermal transition profiles determined by CD showed that the stability of the collagen triple helix of QF is not homogeneous. To search for potentially unstable regions, we treated the QF with trypsin under mild conditions. Fig. 4 shows two examples of blots generated from the transfer of SDS-PAGE gels. Nine trypsin fragments were observed under different conditions. Even at 4°C, the N-terminal end of the WT QF was partially cleaved. Apparently, the Gly-775 to Arg-789 stretch does not completely form a trypsin-resistant stable collagen helix, which is consistent with the observation of shorter fragments in the rotary shadowing experiments.
Refolding Kinetics Analysis of QF by CD Signal-Refolding kinetics of the natural collagenous domain was monitored by the ellipticity change at 223 nm (Fig. 5). As previously reported (35), collagen molecules with a long triple helical domain (above ~250 residues) linked by a trimerization knot at one end demonstrate a zero-order refolding (linear gain of refolded structure) at the beginning of refolding. The rate-limiting factor in this system is the slow prolyl cis/trans isomerization (35,42). The initial apparent first order rates, k_app, of WT, G1018A, and G1018V peptides were calculated from the linear fit of the first 500 s of the refolding data. Table 1 shows the k_app of the QF peptides. We compared our bacteria-expressed protein data with the tissue-extracted type III collagen quarter fragment. For the bovine type III quarter fragment, the k_app value was reported to be 0.115 min⁻¹ (or 1.9 × 10⁻³ s⁻¹) at 25°C (35). Applying the activation energy, Ea, of cis/trans isomerization of 53.5 kJ/mol published for the knotted (Gly-Pro-Pro)10 molecules (43), k_app at 4°C can be calculated as 3.7 × 10⁻⁴ s⁻¹ if we assume that the relative rate constants, k_1 and k_2, for the same reaction at two different absolute temperatures, T_1 and T_2, are calculated with the Arrhenius equation,

$$\ln\frac{k_2}{k_1} = \frac{E_a}{R}\left(\frac{1}{T_1} - \frac{1}{T_2}\right)$$

where R is the gas constant. Although the natural type III collagen quarter fragment contains the posttranslationally modified 4-hydroxyprolines, its k_app value appears basically the same as that for non-hydroxylated molecules (Table 1). As described in detail by Bächinger et al. (35), the initial apparent rate constant, k_app, for the long collagen peptide is related to the number of tripeptide units, n, and the equilibrium constant of trans/cis, K, in the following way, where F is the fraction folded, and k is the rate of cis to trans prolyl-bond isomerization. In 1979, the whole amino acid sequence of bovine type III collagen had not been reported yet. The QF fragment used in this study has the human sequence, which is known. The number of the tripeptide units per chain is 84. Then, the total number of the tripeptide units incorporated into the triple helix is n = (3 × 84) − 2 = 250. In the QF, the number of proline residues per trimer is 57 × 3 = 171.
Replacing n with the number of prolyl bonds seems to be more reasonable, although the distribution of proline is not homogeneous. If we assume that K = 6.25 (as has been determined for acetyl-Pro-O-methyl (42)), k for WT is 9.4 × 10⁻³ s⁻¹ (calculated using k_app = 4.0 × 10⁻⁴ s⁻¹ at 4°C). From the Arrhenius equation, k at 25°C can be calculated as 4.8 × 10⁻² s⁻¹. For the natural bovine type III collagen quarter fragment used by Bächinger et al. (35), the k value can be calculated as 4.7 × 10⁻² s⁻¹ at 25°C if we use n = 153 (Pro-819 was determined as the first N-terminal residue of the bovine type III collagen quarter fragment), 51 proline residues per chain, 153 proline residues per trimer, and K = 6.25. Again, the rates of cis to trans isomerization are essentially similar whether the peptides have 4-hydroxyproline or not. If we used K = 30 as in Ref. 35, the k values would decrease.
FIGURE 3. Samples were equilibrated at 4°C for more than 2 days and heated at a rate of 10°C/h. The CD signal at 223 nm was monitored. At 35°C the heating was stopped for 10 min to achieve complete unfolding of the QF. The temperature was then lowered at a rate of 10°C/h. The lines are colored as follows: WT (black), G910A (red), and G910V (blue) shown in A and WT (black), G1018A (green), and G1018V (violet) shown in B. The arrows indicate the direction of the temperature change.
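The temperature conversions quoted in this section follow from a two-temperature Arrhenius rescaling and can be reproduced numerically. The sketch below is illustrative only. The Arrhenius step uses the standard form of the equation. The conversion from k_app to k assumes the relation k = k_app · n / (1 + K); this relation is inferred here because it reproduces the quoted WT value, and it should be checked against Ref. 35 before being relied on. All function and variable names are hypothetical.

```python
import math

R_GAS = 8.314  # gas constant, J mol^-1 K^-1


def arrhenius_rescale(k_ref, t_ref_celsius, t_new_celsius, ea_j_per_mol):
    """Rescale a rate constant from one temperature to another.

    Implements ln(k_new / k_ref) = (Ea / R) * (1 / T_ref - 1 / T_new)
    with temperatures converted to kelvin.
    """
    t_ref = t_ref_celsius + 273.15
    t_new = t_new_celsius + 273.15
    return k_ref * math.exp((ea_j_per_mol / R_GAS) * (1.0 / t_ref - 1.0 / t_new))


def isomerization_rate(k_app, n_bonds, equilibrium_k):
    """ASSUMED relation k = k_app * n / (1 + K); see the note above."""
    return k_app * n_bonds / (1.0 + equilibrium_k)


EA = 53.5e3                      # J/mol, knotted (Gly-Pro-Pro)10 (43)
K_TRANS_CIS = 6.25               # acetyl-Pro-O-methyl (42)

# Bovine quarter fragment: 0.115 min^-1 at 25 C, rescaled to 4 C.
k_app_bovine_25 = 0.115 / 60.0   # s^-1
k_app_bovine_4 = arrhenius_rescale(k_app_bovine_25, 25.0, 4.0, EA)

# Counts for the human QF construct, as given in the text.
n_tripeptides = 3 * 84 - 2       # tripeptide units in the triple helix
n_prolines = 57 * 3              # proline residues per trimer

# WT QF: k at 4 C from k_app = 4.0e-4 s^-1, then rescaled to 25 C.
k_wt_4 = isomerization_rate(4.0e-4, n_prolines, K_TRANS_CIS)
k_wt_25 = arrhenius_rescale(k_wt_4, 4.0, 25.0, EA)

if __name__ == "__main__":
    print(f"k_app (bovine, 4 C): {k_app_bovine_4:.2e} s^-1")
    print(f"k (WT, 4 C):         {k_wt_4:.2e} s^-1")
    print(f"k (WT, 25 C):        {k_wt_25:.2e} s^-1")
```

With these inputs the rescaled bovine k_app and the two WT values round to the figures quoted in the text (3.7 × 10⁻⁴, 9.4 × 10⁻³, and 4.8 × 10⁻² s⁻¹). Because K appears in the denominator of the assumed relation, a larger K such as 30 lowers k, in line with the remark that the k values would decrease.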
The initial apparent rates of folding of WT, G910A, and G910V are faster than those of the C-terminal mutated peptides, G1018A and G1018V (Fig. 5B). The QF with mutations at Gly-1018 demonstrated slower refolding from the very beginning, whereas the Gly-910 mutants refolded very similarly to the WT QF during the initial phase. The delay in refolding is more significant in the Gly-to-Val mutations than in the Gly-to-Ala mutations. It is clear that the limiting factor for the apparent rate constants of the peptides G1018A and G1018V is a renucleation of the triple helix behind the mutated Gly-1018. The ratios of k_app to that of WT, 0.19 for G1018V and 0.43 for G1018A (Table 1), demonstrate the link between the nature of the mutation and the decrease in the renucleation rate. The Gly-to-Val mutation obviously has a more severe effect and slows down the apparent rate to ~20% of that of WT.
Analysis of Refolding Kinetics after Trypsin-resistant Fragments of QF-The refolding process was monitored by analyzing the appearance/disappearance of trypsin-resistant fragments of varying sizes as previously described (35) (Fig. 6). The type III collagen part was denatured at 33°C for 10 min. Possibly because of the lack of 4-hydroxyproline, the transition temperature of the natural sequence is around 15°C (Fig. 3). To minimize the protease treatment time we used trypsin at a high concentration. We found that 0.2 mg/ml trypsin is a suitable concentration to chop off the unfolded chains within 2 min on ice. Because of the high concentration of trypsin and/or the partial instability of the N-terminal triple helix, a part of the natural sequence was cleaved off with trypsin even without the heat treatment. At the beginning of the refolding, a peptide with an apparent molecular mass of 10 kDa was observed for WT on SDS-PAGE (Fig. 6A). This band corresponds to the cleavage after Arg-1005 as determined by N-terminal peptide sequencing. This is the most C-terminal potential trypsin cleavage site in the natural collagen part. The calculated mass of this fragment is only 7.6 kDa, but the slower mobility of collagenous peptides is a well known fact. A band corresponding to a slightly shorter fragment with an apparent molecular mass of 6 kDa was also observed. This small fragment had the N-terminal sequence of the cleaved NC2 domain, which contains four potential sites for trypsin digestion (Fig. 1). The appearance and disappearance of ever larger trypsin-resistant fragments in the WT QF is indicative of the zipper-like folding of the triple helix (Fig. 6B). A similar initial pattern (covering only the QF part) was observed for the refolding of the tissue-extracted type III collagen molecule (35).
The overall refolding profiles correspond to the result observed during the CD refolding experiments. The Gly-to-Ala-mutated peptides refold faster than the Gly-to-Val-mutated peptides. Also, the initial refolding rate is faster for the Gly-910-mutated peptides than for the Gly-1018-mutated peptides.
There are differences in the band patterns of the mutated proteins, reflecting delays at the point of mutation. The Gly-910-mutated peptides have fewer larger-sized fragments, whereas the Gly-1018-mutated peptides do not show much of the 12-kDa bands. The appearance of band e of all mutants and WT is shown in Fig. 6C. It is evident that the nature of the mutation is more important than the location of the mutation for the appearance of fully refolded molecules. These curves differ from the CD measurements (Fig. 5) at earlier folding times that are dominated by partially folded molecules but converge at later refolding times.
FIGURE 6. Refolding kinetics of the QF analyzed by trypsin resistance of folded fragments. QFs of WT, G910A, G910V, G1018A, and G1018V proteins were unfolded at 33°C for 10 min. The refolding of the triple helix was monitored by trypsin cleavage resistance of the folded helical portions. The unfolded collagen is readily cleaved by trypsin, whereas the refolded triple helical domain is resistant. The time points indicated are when the trypsin solution was added to the sample. After the 2-min incubation on ice, the enzyme was inactivated. A, the proteins were analyzed by 12% SDS-PAGE followed by GelCode Blue staining. The time point is indicated in minutes unless specified as hours. St, starting material; E, trypsin; +E, trypsin-treated sample without heat denaturation; M, protein marker. Trypsin shows two bands indicated by arrows in the WT panel. The molecular weight of the globular protein markers is indicated in kDa. B, the gel of WT was scanned, and the relative abundance of several bands was plotted. The bands are as indicated on the first gel (WT). The relative abundance of the e band (fully folded chains) of all peptides was quantitated and is shown in C.
Fitting of the Folding Kinetics-We attempted to fit the data shown in Fig. 5 with the model of linear propagation steps reflecting all cis-to-trans isomerization events as described (35). However, the calculations required unrealistic computer time when we added a single mutation interruption to the model. Therefore, we applied a simpler model in which the refolding kinetics of the QF is simplified as schematically drawn in Fig. 7A. We assume only three different triple helical species: zero, half, and full-length. All other intermediate species are ignored to simplify the equations. We assume that all reactions are first order and unidirectional. The rate of the reaction for propagation of the helix without mutations is defined as k_n. If there is a mutated site between the different lengths of the species, we introduce the rate for that reaction, k_m, which is defined as a renucleation rate without propagation of the triple helix (Fig. 7B).
For the C-terminal-mutated molecular species, we assume four molecular species: Y, the quarter fragment completely unfolded without renucleation at Gly-1018 (F (fraction folded) = 0); Y1, renucleated at Gly-1018 (F = 0); Z1, half-folded (F = 0.5); and W, fully folded (F = 1). With the rate constant k_m, which is assumed to be the rate of the renucleation at the N-terminal side of Gly-1018 to propagate the helix, the reaction can be expressed as follows.
The differential equations are shown below.
FIGURE 7. A, after heat treatment at 33°C, the type III collagen part is unfolded, but (GPP)7 keeps three chains at the correct stagger. Refolding starts at the C terminus of QF. For the peptides with the Gly-910 mutation, the refolding trace is about the same as for WT until the residue Gly-910 is encountered. At the mutated site, the renucleation of the triple helix is necessary to restart the triple helix formation. The mutated peptides with the Val mutations take a longer time to renucleate than those with the Ala mutations. For the Gly-1018 peptides, the initial refolding is affected from the very beginning. Once the triple helix is renucleated after the Gly-1018 site, the zipper-like folding proceeds with the same rate. B, the model used for the fitting is illustrated. Details are described in the text. All the reaction steps are assumed to be first order. The folding rate for the helix fragment without a mutation is defined as k_n. The renucleation rate (without propagation of helices) at the site of a mutation is defined as k_m.
For the mid-mutated molecular species, G910A and G910V, the equation is changed as below. Here, Y is the completely unfolded species (F = 0), Z and Z1 are the half-length-folded triple helices (F = 0.5), and W is the fully folded triple helix (F = 1).
The differential equations are shown below.
For WT, the reactions are expressed as simply as below.
The differential equations are shown below.
With the initial conditions Z(time = 0) = 0, Z1(0) = 0, W(0) = 0, Y1(0) = 0, and Y(0) = 1, these differential equations can be solved. The integral forms of the solutions are available in supplement A. Sets of differential equations for all three models were iteratively solved by the MicroMath Scientist algorithm for Windows (Version 2.01; MicroMath, St. Louis, MO). The equations are also available in supplement B. First, we introduced one k_m each for the Ala- and Val-mutated peptides as fitting parameters (Fig. 7C). Table 2 shows the fitted results using the initial 7,500 s and also up to 24,000 s of refolding. The k_m/k_n for Gly-to-Ala mutants is 0.63-0.90, whereas the k_m/k_n for Gly-to-Val is 0.13-0.22. Second, because the renucleation of the mutated sites must depend on the surrounding amino acid sequences, we also introduced a different k_m for the Gly-910 and Gly-1018 peptides (Fig. 7D). Although the model is highly simplified, the results are reasonable in that the effect of the Gly-to-Val mutation is much more severe than that of the Gly-to-Ala mutation. However, the simplified model ignores the CD contribution of partially formed triple helical intermediates, because only zero, half, or full-length triple helices are considered. As a result, the model underestimates the rate of folding at the beginning of the refolding kinetics for the Gly-1018 mutants (Fig. 7, C and D).
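For reference, one set of reaction schemes and rate equations consistent with the species and rate constants defined above can be written as follows. This is a sketch under stated assumptions rather than the printed equations: the position of the k_m step in each chain and the species labels used for WT (Y, Z, W) are inferred from the definitions in the text, and F(t) denotes the folded fraction.

```latex
% Assumed reconstruction. k_n: propagation; k_m: renucleation at the mutated site.

% Gly-1018 (C-terminal) mutants: Y -> Y1 -> Z1 -> W
\begin{align*}
\mathrm{Y} &\xrightarrow{k_m} \mathrm{Y1} \xrightarrow{k_n} \mathrm{Z1} \xrightarrow{k_n} \mathrm{W} \\
\frac{d[\mathrm{Y}]}{dt} &= -k_m[\mathrm{Y}], &
\frac{d[\mathrm{Y1}]}{dt} &= k_m[\mathrm{Y}] - k_n[\mathrm{Y1}], \\
\frac{d[\mathrm{Z1}]}{dt} &= k_n[\mathrm{Y1}] - k_n[\mathrm{Z1}], &
\frac{d[\mathrm{W}]}{dt} &= k_n[\mathrm{Z1}], \\
F(t) &= 0.5\,[\mathrm{Z1}] + [\mathrm{W}]
\end{align*}

% Gly-910 (mid-fragment) mutants: Y -> Z -> Z1 -> W
\begin{align*}
\mathrm{Y} &\xrightarrow{k_n} \mathrm{Z} \xrightarrow{k_m} \mathrm{Z1} \xrightarrow{k_n} \mathrm{W} \\
\frac{d[\mathrm{Y}]}{dt} &= -k_n[\mathrm{Y}], &
\frac{d[\mathrm{Z}]}{dt} &= k_n[\mathrm{Y}] - k_m[\mathrm{Z}], \\
\frac{d[\mathrm{Z1}]}{dt} &= k_m[\mathrm{Z}] - k_n[\mathrm{Z1}], &
\frac{d[\mathrm{W}]}{dt} &= k_n[\mathrm{Z1}], \\
F(t) &= 0.5\,\bigl([\mathrm{Z}] + [\mathrm{Z1}]\bigr) + [\mathrm{W}]
\end{align*}

% WT: Y -> Z -> W
\begin{align*}
\mathrm{Y} &\xrightarrow{k_n} \mathrm{Z} \xrightarrow{k_n} \mathrm{W} \\
\frac{d[\mathrm{Y}]}{dt} &= -k_n[\mathrm{Y}], &
\frac{d[\mathrm{Z}]}{dt} &= k_n[\mathrm{Y}] - k_n[\mathrm{Z}], &
\frac{d[\mathrm{W}]}{dt} &= k_n[\mathrm{Z}], \\
F(t) &= 0.5\,[\mathrm{Z}] + [\mathrm{W}]
\end{align*}

% With Y(0) = 1 and all other species zero, the WT case integrates to
\begin{align*}
[\mathrm{Y}] &= e^{-k_n t}, \qquad
[\mathrm{Z}] = k_n t\, e^{-k_n t}, \qquad
[\mathrm{W}] = 1 - (1 + k_n t)\, e^{-k_n t}, \\
F_{\mathrm{WT}}(t) &= 1 - \Bigl(1 + \tfrac{1}{2} k_n t\Bigr) e^{-k_n t}
\end{align*}
```

Under these assumed schemes the fully folded species W forms at the same rate in both mutant chains, because the same three rate constants act in sequence; the two traces differ only in how much half-folded material contributes to F at intermediate times.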

DISCUSSION
We designed a system that is able to align three polypeptide chains in a specifically staggered manner without the need to nucleate the triple helix. We believe that this is the most advanced system available to date for studying the role of mutations in triple helical folding. An interesting system has been recently developed that is based on a bacterial collagen (46,47). Although this system provided useful insights into the triple helix folding, it has several limitations. First, the triple helical nucleation starts at the N-terminal trimerization domain, which is opposite to most collagen types where the trimerization domain is located C-terminally, and the triple helix folding proceeds from the C to the N terminus in a zipper-like fashion (35,48,49). Second, the stability of the trimerization domain is insufficient to keep three unfolded chains aligned at the beginning of the experiment, which complicates the refolding mechanism by introducing two additional steps, i.e. the trimerization and triple helix nucleation. And third, a collagen site of interest is placed in the alien context of a bacterial sequence, which can significantly disturb any cooperative intra- and interchain interactions. Our system is free of these complications. The only drawback of the system reported in this study is the lack of posttranslational modifications, such as hydroxylation of proline and lysine residues as well as glycosylation. Our experimental data indicate that at least the effect of 4-hydroxylation on the refolding kinetics is small. If necessary, this bacterial system could be transferred to an appropriate eukaryotic expression system with the desired posttranslational modifications. We analyzed the triple helix folding of the type III collagen fragment using the model peptides with vascular EDS mutations, G910V (32) and G1018V (33), as well as G910A and G1018A. Our study indicates that the mutations of Gly cause a delay in the triple helical folding.
The smaller mutated residue Ala can be easily skipped, and the triple helix can be renucleated faster after the mutation site (Table 1). The polypeptides with alanine mutations can thus extend the triple helical region faster than those with a larger valine residue at the mutation sites. Because the delay is more significant in Gly-to-Val mutations than in Gly-to-Ala mutations, it can somewhat simulate the autosomal dominant phenotype with only one or two mutated chains in the trimeric molecule. One, two, or three chains with a mutation can be associated with their trimerizing C-propeptide in the rough endoplasmic reticulum (ER). Thus the rate of renucleation will vary among these different molecules. Our model describes the worst case scenario with three mutated chains. The substitution of a single nucleotide base in a Gly codon may cause nine possible mutations (1/4 for Arg, 1/6 each for Ala and Val, 1/12 each for Asp, Glu, Ser, and Cys, 1/24 for Trp, and 1/24 for a stop codon). If all substitutions occurred with equal frequency, the proportion of Gly-to-Ala mutation cases should be higher than that currently reported in the database (16,17). Presumably some Ala mutations might not be severe enough to cause EDS.
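The fractions quoted above can be checked by enumeration. The sketch below is an illustration, assuming the standard genetic code, equal use of the four Gly codons (GGN), and equal probability for every single-base substitution; synonymous changes (Gly to Gly) are excluded. The function name is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def glycine_substitution_spectrum():
    """Return {residue: fraction} over all non-synonymous single-base
    substitutions of the four Gly codons, each weighted equally."""
    counts = Counter()
    gly_codons = [codon for codon, aa in CODON_TABLE.items() if aa == "G"]
    for codon in gly_codons:
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue  # not a substitution
                mutant = codon[:pos] + base + codon[pos + 1:]
                residue = CODON_TABLE[mutant]
                if residue != "G":  # skip synonymous changes
                    counts[residue] += 1
    total = sum(counts.values())
    return {residue: Fraction(n, total) for residue, n in counts.items()}


if __name__ == "__main__":
    spectrum = glycine_substitution_spectrum()
    for residue, fraction in sorted(spectrum.items(), key=lambda kv: -kv[1]):
        print(residue, fraction)
```

Of the 36 possible single-base changes, the 12 at the third codon position are silent, which leaves 24 non-synonymous changes distributed over nine outcomes.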
The delayed triple helix formation might cause overglycosylation of hydroxylysine residues as seen in osteogenesis imperfecta (23). The effect of the number of mutated chains as well as staggering positions is an interesting factor that should be investigated. Until recently, there were no easy methods to make heterotrimeric collagen model proteins. Now we have found that three chains of the type IX collagen NC2 domain form the α1(IX)α2(IX)α3(IX) heterotrimer (50), which also determines the stagger of the adjacent collagenous sequences (31). This system is suitable for exploring the composition and staggering effects on the structure, stability, and folding of various collagen types with any sort of mutations including EDS.
The mechanism of the biosynthesis of collagen molecules is still far from being completely understood. A number of processes, such as transcription, mRNA splicing, translation, transport to ER, trimerization of the propeptides, N-glycosylation of the propeptide, posttranslational modifications of Pro and Lys, O-glycosylation, peptidyl cis-to-trans isomerization, triple helix folding, secretion, and maturation are orchestrated by a series of enzyme complexes with many chaperone proteins as well as carbohydrate and lipid molecules in the ER lumen and on the ER membrane. The defect of any of these players affects the whole sound of the symphony. Dilation of the rough ER is reported in G910V EDS patient dermal fibroblasts (33). A slower folding rate increases the chance of overglycosylation of hydroxylysine residues, and the degradation system might be stimulated. Interestingly, the vascular EDS mutations are found not only C-terminally but also N-terminally of the collagen domain, i.e. at Gly-13, Gly-16, Gly-25, Gly-34, Gly-37, Gly-43, Gly-49, Gly-52, Gly-55, Gly-58, Gly-61, Gly-70, Gly-73, Gly-76, and Gly-82 (16,17). Lys-84 is the most N-terminal Yaa position Lys in the major Gly-Xaa-Yaa-repeat of human type III collagen. Thus, there are no possibilities for the over-modification of Lys if we assume that the delay of the triple helical formation is only in the N-terminal part starting from the mutation site. We do not have any information about the supramolecular association of type III collagen in tissues. It is possible that the N terminus of the triple helical chain might have an important role in associating with other collagen molecules or with some chaperones. The occurrence of vascular EDS mutations at the N terminus of the collagenous domain suggests that the over-modification of lysine is not the critical factor for the EDS disorder in some cases. Future basic biochemical studies are necessary to understand the molecular mechanism of EDS.
Our study has clearly demonstrated that the triple helix propagation of type III collagen with the EDS mutation is locally delayed around the mutated residue. The use of the thermally stable staggered trimerization domain (GPP)7-NC2XIX has the potential to advance the understanding of molecular mechanisms of EDS and other collagen-related heritable disorders.