Trimerization and Triple Helix Stabilization of the Collagen XIX NC2 Domain

The mechanisms of chain selection and assembly of fibril-associated collagens with interrupted triple helices (FACITs) must differ from those of fibrillar collagens, since FACITs lack the characteristic C-propeptide. We analyzed two carboxyl-terminal noncollagenous domains, NC2 and NC1, of collagen XIX as potential trimerization units and found that NC2 forms a stable trimer and substantially stabilizes a collagen triple helix attached to either end. In contrast, the NC1 domain requires formation of an adjacent collagen triple helix to form interchain disulfide bridges. The NC2 domain of collagen XIX, and probably of other FACITs, is responsible for chain selection and trimerization.

The fibril-associated collagens with interrupted triple helices (FACIT collagens) include types IX, XII, XIV, XVI, XIX, XX, XXI, and XXII. Collagen IX is a heterotrimer composed of three different α-chains, and all others are homotrimers whose α-chains are characterized by short collagenous (COL) domains interrupted by several noncollagenous (NC) domains (1,2). Unlike the fibril-forming collagens, the FACITs have significantly shorter carboxyl-terminal NC domains (NC1 domains): 37 residues for collagen XIV, 20–25 residues for collagen IX, and even fewer than 20 residues for collagen types XII and XIX (based on human sequences). In contrast, the carboxyl-terminal NC domains of fibrillar collagens (C-propeptides) are of a different type and contain about 260 residues. The FACITs share a remarkable sequence homology at their COL1/NC1 junctions, each having two strictly conserved cysteine residues separated by four residues in their NC1 domain. Several studies suggest that the COL1 domain and the NC1 domain are involved in the mechanism of chain selection in the assembly of collagens XII and XIV (3–6). Contrary to these studies, very recent studies on collagen IX show that three α-chains can associate in the absence of the COL1 and NC1 domains to form a triple helix, although the COL2-NC2 region alone is not sufficient for trimerization (7). This suggests that folding and chain selection of collagen IX is a cooperative process involving multiple COL and NC domains (7). It has also been hypothesized that the NC2 domain of all FACIT collagens is able to form an α-helical coiled-coil and thus to trimerize those collagens (8), but no experimental evidence has been reported so far.
Collagen XIX was identified from independently isolated clones from a human rhabdomyosarcoma cell line (9,10). The type XIX chain is composed of a 268-residue, noncollagenous amino terminus, an 832-residue discontinuous collagenous region, and a 19-residue carboxyl-terminal peptide (10–12). It is by far the least abundant collagen so far purified, with a composition of ~10⁻⁶% of the dry weight of umbilical cord (13). Several features in the type XIX sequence place this collagen in the largest subclass of the nonfibrillar group, the FACIT collagens. These include a ~250-residue thrombospondin module at the amino terminus, the position of two 2-amino acid interruptions in the collagenous subdomain closest to the carboxyl terminus, and a Cys-Xaa4-Cys motif situated at the junction of the collagenous region and carboxyl peptide (COL1/NC1 junction) (11,12).
Characterization of mice harboring null or structural mutations in the collagen XIX (Col19a1) gene has revealed the critical contribution of this matrix protein to muscle physiology and differentiation (14). The phenotype includes smooth muscle motor dysfunction and a hypertensive sphincter. Mice without collagen XIX also display impaired smooth-to-skeletal muscle cell conversion in the abdominal segment of the esophagus (14). Electron microscope images of protein purified from human umbilical cord revealed a sharply kinked and highly polymorphic collagenous region as well as higher order complexes (13). The higher order complexes involve the amino-terminal domain, which is responsible for intermolecular disulfide linkages and contains a heparin-binding site (13).
To explore the trimerization abilities of collagen XIX at the biophysical level, we have studied the NC1 and NC2 domains in their isolated form as well as in conjunction with a triple helix. We show that folding and interchain disulfide bridge formation in the NC1 domain are driven by triple helix formation; in other words, triple helix formation in the COL1 domain is necessary for the folding of the NC1 domain. The artificial triple helical sequence (GPO)6, which forms a triple helix by itself, efficiently induces the folding of the NC1 domain. The NC2 domain alone is an effective trimerization domain. It forms a stable trimer and substantially stabilizes the triple helical domains attached to it.
Oxidation of NC1-containing Peptides and Separation of Oligomers-The lyophilized, reduced peptides were dissolved in degassed and N2-saturated 50 mM sodium phosphate buffer, pH 8.0, under N2 atmosphere and were kept at 4°C for 24 h in order to allow triple helix formation prior to oxidation. Oxidation was induced by the addition of reduced (10 mM) and oxidized (1 mM) glutathione and exposure to atmospheric O2. The pH was raised to 8.3 with a saturated solution of Tris. Oxidation was carried out for 5–7 days, and the peptide mass was periodically analyzed by liquid chromatography/mass spectrometry. To separate covalently linked oligomers, the oxidized crude material was dissolved in deionized 8 M urea containing 0.1% trifluoroacetic acid to prevent disulfide exchange and applied to a sieve column (BioSep-SEC-S 2000, 600 × 20 mm; Phenomenex). Fractions were characterized by electrospray/quadrupole/time-of-flight mass spectrometry. Trimer-containing fractions were pooled and further purified by reverse phase HPLC using a C18 column with an acetonitrile/water gradient and 0.1% trifluoroacetic acid.
Cloning, Expression, and Purification of NC2, (GPP)10NC2, and NC2(GPP)10-To facilitate expression and folding of short peptides containing the NC2 domain of human collagen XIX, they were cloned as part of fusion molecules with a His-tagged minifibritin that has a thrombin cleavage site (HT-mf-thr). Minifibritin is an obligatory trimer with both its amino and carboxyl termini exposed to solvent. The modified version of minifibritin (HT-mf-thr) has recently been used to successfully express, initiate folding of, and stabilize a set of collagen fragments (15,16). The plasmid pET23-HisMf (17) has multiple cloning sites just after the HT-mf-thr gene. The plasmid containing the full-length human collagen XIX cDNA (clone ID 8322416, NCBI accession number BC113364) was purchased from Open Biosystems. The NC2 domain was PCR-amplified using two oligonucleotides: 5′-TGCAGATCTCCGGCTGATGCAGTTTCATTTGAA-3′ (forward primer fNC2; contains the BglII site AGATCT) and 5′-GTCAGTCGACTTAGCCATAAGCTTGGGCAGCCAACATT-3′ (reverse primer rNC2; contains the SalI site GTCGAC). Two other constructs, for (GPP)10NC2 and NC2(GPP)10, were PCR-amplified in two steps. The DNA fragment coding for (GPP)10NC2 was prepared as follows. A synthetic oligonucleotide coding for (GPP)10 (5′-GGTCCGCCTGGCCCACCGGGCCCGCCCGGTCCGCCGGGCCCGCCGGGTCCCCCAGGTCCGCCCGGCCCGCCGGGCCCTCCGGGTCCACCG-3′) was used as a template to amplify the (GPP)10 part, including a complementary end for the NC2 domain, using two oligonucleotides: 5′-TCAGGATCCGGTCCGCCTGGCCCACCGGGC-3′ (forward primer fGPP; contains the BamHI cloning site GGATCC) and 5′-TTCAAATGAAACTGCATCAGCCGGAATCCCGGGTGGACCCGGAGGGCCCGGC-3′ (reverse primer; one half of it is complementary to the end of (GPP)10, and the other half is complementary to the beginning of the NC2 domain).
The product of this PCR was purified and used together with the plasmid containing the NC2 domain sequence as a complex annealed template to PCR-amplify the (GPP)10NC2 fragment using two oligonucleotides: fGPP and rNC2. The same strategy was applied to amplify the fragment NC2(GPP)10, using the following primers: 5′-GTTGGCTGCCCAAGCTTATGGGCGGCCGGGTCCGCCTGGCCCACCGGGC-3′ (forward primer; one half of it is complementary to the end of the NC2 domain, and the other half is complementary to the beginning of (GPP)10) and 5′-GTCAGTCGACTTACGGTGGACCCGGAGGGCCCGG-3′ (reverse primer rGPP; contains the SalI site GTCGAC) for the first step and fNC2 and rGPP for the second step.
The PCR-amplified fragments NC2, (GPP)10NC2, and NC2(GPP)10 were cut by BglII/BamHI and SalI and cloned into the pET23-HisMf vector using the restriction sites BamHI and SalI, so that all three fragments were a part of the fusion encoding the His6 tag, minifibritin, the flexible linker with the thrombin cleavage site, and the insert.
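The primer design described above can be sanity-checked with a few lines of Python. This is a minimal sketch, not part of the original protocol: it confirms that each named primer carries the recognition sequence of the enzyme assigned to it in the text (BglII AGATCT, BamHI GGATCC, SalI GTCGAC) and that the synthetic 90-nucleotide template translates to ten Gly-Pro-Pro triplets. The helper names (`has_site`, `translate_gly_pro`) are illustrative, and the translation routine deliberately handles only glycine (GGN) and proline (CCN) codons.

```python
# Recognition sequences of the restriction enzymes named in the text.
SITES = {"BglII": "AGATCT", "BamHI": "GGATCC", "SalI": "GTCGAC"}

# Primer sequences copied from the text, with line-break hyphens removed.
PRIMERS = {
    "fNC2": ("TGCAGATCTCCGGCTGATGCAGTTTCATTTGAA", "BglII"),
    "rNC2": ("GTCAGTCGACTTAGCCATAAGCTTGGGCAGCCAACATT", "SalI"),
    "fGPP": ("TCAGGATCCGGTCCGCCTGGCCCACCGGGC", "BamHI"),
    "rGPP": ("GTCAGTCGACTTACGGTGGACCCGGAGGGCCCGG", "SalI"),
}

# Synthetic (GPP)10 template from the text, joined across line breaks.
GPP10_TEMPLATE = (
    "GGTCCGCCTGGCC"
    "CACCGGGCCCGCCCGGTCCGCCGGGCCCGCCGGGT"
    "CCCCCAGGTCCGCCCGGCCCGCCGGGCCCTCCGGGT"
    "CCACCG"
)


def has_site(primer: str, enzyme: str) -> bool:
    """Return True if the primer contains the enzyme's recognition sequence."""
    return SITES[enzyme] in primer


def translate_gly_pro(dna: str) -> str:
    """Translate DNA made only of Gly (GGN) and Pro (CCN) codons.

    Illustrative helper: any other codon raises ValueError.
    """
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        if codon[2] not in "ACGT":
            raise ValueError(f"invalid base in codon {codon}")
        if codon[:2] == "GG":
            residues.append("G")
        elif codon[:2] == "CC":
            residues.append("P")
        else:
            raise ValueError(f"codon {codon} is neither Gly nor Pro")
    return "".join(residues)


site_check = {name: has_site(seq, enz) for name, (seq, enz) in PRIMERS.items()}
peptide = translate_gly_pro(GPP10_TEMPLATE)

print(site_check)
print(peptide)
```

BglII and BamHI both leave 5′-GATC overhangs, which is why a BglII-cut NC2 fragment can be ligated into the BamHI site of the vector.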
The DNA inserts were verified by Sanger dideoxy DNA sequencing. The recombinant proteins were expressed at 30°C in the Escherichia coli BL21(DE3) host strain (Novagen) after isopropyl 1-thio-β-D-galactopyranoside induction (final concentration 1 mM) for 12 h. Purification of the His6-tagged fusion proteins by immobilized metal affinity chromatography on a HisTrap™ HP column (Amersham Biosciences) and separation of the fragments NC2, (GPP)10NC2, and NC2(GPP)10 after thrombin cleavage were carried out as described in the manufacturer's instructions. Thrombin cleavage was performed at 4°C for 24 h with thrombin protease (ICN) in 50 mM Tris/HCl buffer, pH 8.3, containing 150 mM NaCl. The resulting fragments had two additional amino acid residues, GS, before the native amino acid sequence. The peptides were then purified by reversed phase HPLC using a C18 column with an acetonitrile/water gradient and 0.1% trifluoroacetic acid as an ion-pairing agent. The peptides eluted at ~60% acetonitrile, a concentration that is highly denaturing. Finally, the peptides were characterized by electrospray/quadrupole/time-of-flight mass spectrometry and amino acid analysis.
Analytical Ultracentrifugation-Sedimentation equilibrium measurements were performed on a Beckman model XL-A analytical ultracentrifuge. Absorbance was measured at 240 nm. Runs were carried out at 20°C in an An60-Ti rotor using 12-mm cells with Epon two-channel centerpieces.
Circular Dichroism Analysis-CD spectra were recorded on an AVIV model 202 spectropolarimeter (AVIV Instruments, Inc.) with thermostatted quartz cells of 0.1–1-mm path length. The spectra were normalized for concentration and path length to obtain the mean molar residue ellipticity after subtraction of the buffer contribution. Thermal scanning curves were recorded at 222 nm for the NC2 peptide to monitor the α-helical secondary structure transition and at 230 nm for (GPP)10NC2 and NC2(GPP)10 to monitor the collagen triple helix transition. Peptide concentrations were determined by amino acid analysis.
Evaluation of the Thermodynamic Data-CD transition curves were interpreted based on a two-state mechanism in which three unfolded chains, u, cross-linked by the NC2 domain, combine to a native collagen triple helix, n:

$$u \rightleftharpoons n$$

The equilibrium constant $K$ is as follows,

$$K = \frac{c_n}{c_u} = \frac{1 - F_u}{F_u}$$

where $c_u$ and $c_n$ are the concentrations of unfolded and native species, and $F_u$ is the fraction of unfolded species,

$$F_u = \frac{\Theta - \Theta_n}{\Theta_u - \Theta_n}$$

where $\Theta_n$ and $\Theta_u$ are the extrapolated native and unfolded ellipticities in the transition region and $\Theta$ is the observed ellipticity at the given temperature. From Equation 3,
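As an illustration of how a two-state analysis of this kind can be carried out numerically, the following Python sketch converts ellipticities into the fraction of unfolded species, computes an equilibrium constant for the folding direction (K = c_n/c_u, an assumption about the convention), and extracts ΔH° and ΔS° from a van't Hoff plot (ln K against 1/T). The van't Hoff fit assumes that ΔH° and ΔS° are temperature-independent and is a standard approach, not necessarily the authors' exact procedure. The baseline ellipticities and the "true" thermodynamic parameters are invented solely to generate synthetic data.

```python
import math

R = 8.314  # gas constant, J/(mol K)


def fraction_unfolded(theta, theta_n, theta_u):
    """F_u from the observed, native, and unfolded ellipticities."""
    return (theta - theta_n) / (theta_u - theta_n)


def equilibrium_constant(f_u):
    """K = c_n / c_u for the monomolecular reaction u <-> n (assumed convention)."""
    return (1.0 - f_u) / f_u


def vant_hoff_fit(temps_k, ks):
    """Least-squares fit of ln K = -dH/(R T) + dS/R; returns (dH, dS)."""
    xs = [1.0 / t for t in temps_k]
    ys = [math.log(k) for k in ks]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    return -slope * R, intercept * R


# Synthetic data: all four values below are hypothetical, chosen for illustration.
DH_TRUE = -300e3            # J/mol, folding enthalpy
TM_TRUE = 341.15            # K (68 degrees C)
DS_TRUE = DH_TRUE / TM_TRUE  # J/(mol K), so that K = 1 at TM_TRUE
THETA_N, THETA_U = 4000.0, -1000.0  # baseline ellipticities, arbitrary units

temps = [331.15 + 2.0 * i for i in range(11)]
thetas = []
for t in temps:
    k = math.exp(-(DH_TRUE - t * DS_TRUE) / (R * t))
    f_u = 1.0 / (1.0 + k)
    thetas.append(THETA_N + f_u * (THETA_U - THETA_N))

ks = [equilibrium_constant(fraction_unfolded(th, THETA_N, THETA_U)) for th in thetas]
dh_fit, ds_fit = vant_hoff_fit(temps, ks)
tm_fit = dh_fit / ds_fit  # midpoint temperature, where K = 1

print(f"dH = {dh_fit / 1000:.1f} kJ/mol, dS = {ds_fit:.1f} J/(mol K), Tm = {tm_fit:.2f} K")
```

With real data, only points inside the transition region (where F_u is well away from 0 and 1) should enter the fit, since ln K becomes very sensitive to baseline errors outside it.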

Sequences of the NC1 and NC2 Domains of FACIT Collagens-All FACIT collagens except human type XX have at least two collagenous domains, COL1 and COL2, and two noncollagenous domains, NC1 and NC2 (Fig. 1), where the NC2 domain is positioned between COL2 and COL1. Sequence analysis shows no significant sequence homology among either the NC1 or the NC2 domains, with the exception of the cysteine pattern at the junction between COL1 and NC1. The length of the NC1 domain varies from 19 to 37 residues (collagens XII/XIX and XIV, respectively), assuming that the first cysteine in the conserved cysteine pattern belongs to COL1. The length of NC2 domains, including one GXY triplet on both ends, varies from 37 to 50 residues (Fig. 1).
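The conserved cysteine pattern at the COL1/NC1 junction (two cysteines separated by four residues) is straightforward to locate computationally. The sketch below uses a regular expression with a lookahead so that overlapping occurrences are also reported. The input string is an invented toy sequence, not a real FACIT sequence, and the function name is illustrative.

```python
import re

# Cys, any four residues, Cys; the lookahead allows overlapping matches.
CYS_MOTIF = re.compile(r"(?=(C.{4}C))")


def find_cys_motif(sequence: str):
    """Return (0-based start, matched segment) for every Cys-Xaa4-Cys motif."""
    return [(m.start(), m.group(1)) for m in CYS_MOTIF.finditer(sequence)]


# Hypothetical toy sequence used only to exercise the function.
toy_sequence = "GPPGPACQASTCMM"
hits = find_cys_motif(toy_sequence)
print(hits)
```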
Folding and Oxidation of the NC1 Domain-Formation of covalently cross-linked trimers was chosen as an indicator of the correct folding of the NC1 domain. To test the ability of the NC1 domain alone to form a covalently linked trimer, we oxidized the NC1 peptide (Table 1). Mass spectrometry analysis of oxidation products revealed the presence of only two peaks corresponding to a monomer and a dimer. Those products were separated on a sieve column under denaturing conditions. Two prominent peaks (Fig. 2A) were determined to be the dimer and the monomer by mass spectrometry. No trimer was detected.
The h1NC1 peptide (Table 1) contains the last uninterrupted segment of the collagenous domain and the full-length NC1 domain. The length of this collagenous segment appeared insufficient to form a triple helix by itself at 4°C, since no thermal transition was observed by CD. The oxidation at 4°C produced only monomers and dimers, as in the case of the NC1 peptide. Interestingly, the oxidation experiments of the h1NC1 peptide performed at −20°C in the presence of 30% methanol to avoid freezing, as described in Ref. 5, were more successful. In addition to monomeric and dimeric oxidative forms, a small fraction (~15–20%) of a trimer was formed (Fig. 2B). The purified trimeric form showed the formation of a moderately stable collagen triple helix (Table 2). This led us to the idea that collagen triple helix formation might be a prerequisite for the correct folding and oxidation of the NC1 domain. Indeed, the addition of two GPO triplets to the beginning of the h1NC1 peptide dramatically changed the efficiency of the NC1 domain oxidative folding. The reduced (GPO)2h1NC1 peptide showed collagen triple helix formation (Table 2), and the subsequent oxidation produced about 90% of the correctly oxidized trimeric product (Fig. 2C).
To analyze the importance of the triple helical sequence composition for trimer formation, we substituted the native sequence with (GPO)6 (Table 1, peptide (GPO)6NC1), which is known to form a stable triple helix. The yield of the correctly oxidized product of the (GPO)6NC1 peptide was even slightly higher (data not shown). Thus, the oxidative folding of the NC1 domain does not depend on the composition of the triple helical sequence.
The sequence of the NC1 domain is rather short, being only 19 amino acids long. To test its importance for oxidative folding, we synthesized a peptide, (GPO)6NC1Δ, in which the last 10 residues of the NC1 domain are deleted (Table 1). The yield of the correctly oxidized trimer was the same as for the (GPO)6NC1 peptide. The comparison of the triple helix stability in the oxidized form of the two peptides shows a difference of 5°C (Table 2). Thus, the missing part of the NC1 domain might be important for additional stabilization of the triple helix, either via a direct interaction of the missing part with the triple helix or by sterically restricting the dynamics of the disulfide knot.
Trimeric Structure of the NC2 Domain-Fig. 3A shows an SDS-polyacrylamide gel of the expressed NC2 domain, (GPP)10NC2, and NC2(GPP)10. The far ultraviolet CD spectrum shows that the NC2 domain has a predominantly α-helical structure (Fig. 3B) at 20°C. The oligomeric state of the NC2 domain was investigated by analytical ultracentrifugation. Sedimentation equilibrium runs revealed a molecular mass of 15.7 ± 3.0 kDa, which is consistent with a trimeric structure (calculated molecular mass of the single chain is 5.3 kDa) (Table 3). Thermal transition curves monitored at 222 nm showed that the NC2 domain forms a highly stable structure that requires a substantial concentration of guanidine hydrochloride for thermal unfolding (Fig. 4). The reversibility of unfolding is shown by the gain of the signal upon reverse temperature scanning of the NC2 domain in a buffer supplemented with 3 M guanidine hydrochloride. Without the denaturant, the α-helical structure was stable up to 90°C.
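The trimer assignment follows from simple arithmetic on the sedimentation equilibrium result, sketched here in Python. The numbers are those quoted in the text; treating the ±3.0 kDa as a symmetric error band on the mass is an assumption made for illustration, and the function name is hypothetical.

```python
def oligomer_estimate(measured_kda, error_kda, monomer_kda):
    """Return (best, low, high) estimates of the number of chains per oligomer."""
    best = measured_kda / monomer_kda
    low = (measured_kda - error_kda) / monomer_kda
    high = (measured_kda + error_kda) / monomer_kda
    return best, low, high


# Values from the text: 15.7 +/- 3.0 kDa measured, 5.3 kDa per chain.
best, low, high = oligomer_estimate(15.7, 3.0, 5.3)
print(f"chains per oligomer: {best:.2f} (range {low:.2f} to {high:.2f})")
```

The best estimate is close to 3, and the error band (roughly 2.4 to 3.5 chains) excludes a monomer, a dimer, and a tetramer.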
The NC2 Domain Stabilizes the Triple Helix-The single chain peptide with the sequence (PPG)10 is able to form a collagen triple helix of moderate stability. The midpoint transition temperature (Tm) for (PPG)10 is a concentration-dependent value and is about 25°C at 1 mg/ml concentration in phosphate-buffered saline (18). It has been previously shown that the Tm value of (GPP)10 can be significantly increased by fusion to an obligatory stable trimer (19) or by covalent cross-linking of all three chains (18,20). Stabilization of the (GPP)10 triple helix by forcing trimerization is possible from both the amino- and carboxyl-terminal ends, although the effectiveness varies widely (18).
Far ultraviolet CD spectra of both (GPP)10NC2 and NC2(GPP)10 are similar in shape and deviate significantly from the CD spectrum of the isolated NC2 domain (Fig. 3B). Equimolar subtraction of the NC2 domain spectrum from either (GPP)10NC2 or NC2(GPP)10 and subsequent adjustment of the mean molar ellipticity demonstrates the presence of the collagen triple helical structure in both peptides (Fig. 3C). The (GPP)10 part in the (GPP)10NC2 peptide has a higher content of triple helix than in NC2(GPP)10. Thermal transition curves monitored at 230 nm, at which the collagen triple helix transitions can be observed, show a substantial stabilization of the triple helix in both cases (Fig. 5). The Tm values are 68 and 55°C for (GPP)10NC2 and NC2(GPP)10, respectively. The thermal range of the triple helical transitions is below the melting point of the NC2 domain (Fig. 5A); thus, the chains remain associated by the NC2 domain. The difference in the Tm values also correlates with the difference in the extent of the triple helical content. Both heating and cooling transition curves are shown to demonstrate the known hysteresis effect observed for collagen transitions (16,21,22). Unfolding of the collagen triple helices observed in both peptides should be considered a monomolecular reaction, since the three chains remain linked by the NC2 domain. A thermodynamic analysis of the transitions shows that the absolute values of ΔH° and ΔS° are smaller for the NC2(GPP)10 peptide compared with the (GPP)10NC2 peptide (Fig. 5B and Table 3).
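The practical consequence of treating the transition as monomolecular can be made explicit with a short derivation. It is a sketch under the usual two-state assumption with temperature-independent ΔH° and ΔS°, not a reproduction of the authors' equations.

```latex
% Two-state, monomolecular transition u <-> n with equilibrium constant K
\Delta G^{0} = -RT \ln K = \Delta H^{0} - T\,\Delta S^{0}

% At the midpoint, half of the molecules are unfolded (F_u = 1/2), so K = 1
\Delta G^{0}(T_m) = 0
\quad\Longrightarrow\quad
T_m = \frac{\Delta H^{0}}{\Delta S^{0}}
```

No concentration term appears in this expression, so the Tm of chains held together by the NC2 domain is expected to be independent of peptide concentration, in contrast to free (PPG)10, whose Tm is concentration-dependent because three separate chains must associate.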

Summary of biophysical data of the NC1-containing peptides
The number of residues, calculated molecular masses (in Da), molecular masses of the reduced and oxidized peptides determined by mass spectrometry (in Da), and midpoint transition temperatures Tm of the triple helix of the reduced and oxidized peptides at a 1 mg/ml concentration in 50 mM sodium acetate buffer, pH 4.5, are shown. NA, not applicable; ND, not detected.
Columns: Number of residues | Mass (calc.) | Mass (reduced) | Mass (oxidized) | Tm (reduced) | Tm (oxidized)
Trimerization of Collagen XIX NC2. DECEMBER 5, 2008 • VOLUME 283 • NUMBER 49

DISCUSSION
Oxidative refolding experiments performed on the peptides containing the NC1 domain (Table 1) of collagen XIX demonstrate that a stable triple helical region is absolutely required for the correct disulfide knot formation in the junction region between the COL1 and NC1 domains. The full-length NC1 domain together with the preceding 9 residues of the COL1 domain is not capable of trimerization and disulfide knot formation. The presence of a sequence that can form a stable collagen triple helix is enough to initiate disulfide knot formation. The nature of the amino acid sequence in the triple helix is not important for disulfide knot formation; indeed, the natural sequence can be successfully substituted with the artificial (GPO)6 sequence. Furthermore, the deletion of the carboxyl-terminal half of the NC1 domain sequence has no effect on the yield of the correctly oxidized trimers. The same features are observed in the collagen III carboxyl-terminal disulfide knot (16,23). The presence of the correctly formed disulfide knot significantly increases the thermal transition temperature of the collagen triple helix. This is very important for all FACIT collagens, since the carboxyl-terminal collagenous region COL1 has a number of interruptions that should significantly destabilize the triple helix. The presence of the disulfide knot just at the end of the last collagen domain serves as a covalent clamp preventing the carboxyl-terminal end of the collagen triple helix from "breathing" or "fraying."
The NC2 domain alone forms a highly stable trimer with a Tm value higher than 90°C under physiological conditions. This is the first time that the trimerization abilities of the NC2 domain have been demonstrated experimentally. The NC2 domain has an α-helical structure and might adopt a coiled-coil structure, as suggested by McAlinden et al. (8). Coiled-coils are versatile motifs that can form a variety of different oligomeric states, including parallel and antiparallel arrangements of chains (24). Coiled-coil proteins contain a characteristic seven-residue sequence repeat whose positions are designated a to g. The interacting surface between α-helices in a classical coiled-coil is formed by interspersing nonpolar side chains at the a and d positions with hydrophilic residues at the flanking e and g positions (25). The heptad repeat patterns in the NC2 domains of FACIT collagens have interruptions and an unusual distribution of apolar and polar residues (8). The NC2 domain is located between two adjacent collagenous domains, and its structure might resemble the staggered conformation of triple helices. Thus, the unusual heptad repeat pattern might be a way to form a staggered coiled-coil. There are a few known examples of natural and designed α-helical parallel coiled-coils having four or seven chains with shifted register (26–28), although none of them would resemble the collagen triple-helical staggered packing. Mapping of NC2 residues on a theoretical α-helical wheel shows preferential distribution of polar side chains on one side that could possibly interact with two other chains to form a hydrophobic core. The trimerization properties of the NC2 domain and its stabilization of the collagen triple helix on either end suggest a universal role for this domain in chain selection, chain registration, and initiation of triple helix formation in all FACIT collagens. This is somewhat in contradiction with studies on collagen IX, in which the COL2-NC2 region alone was reported to be insufficient for trimerization (7). Although this might reflect a different role of the NC2 domain in type IX collagen, a more detailed analysis is required. In future studies, we will investigate whether the NC2 domain of type IX collagen is responsible for chain selection of the only heterotrimeric FACIT collagen.
Since no mutations are described in NC2 domains of FACITs, it remains to be elucidated which residues are important for folding, stability, and especially chain selection. To verify the proposed staggered conformation of the NC2 domain, the determination of an atomic structure by x-ray crystallography or NMR is required.
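The heptad and helical wheel analysis referred to in the Discussion can be sketched in a few lines of Python. The example assigns heptad positions a to g, places residues on an ideal α-helical wheel (3.6 residues per turn, i.e. 100° per residue), and reports how many a and d positions are occupied by apolar residues. The toy sequence, the simplified apolar residue set, and the choice of starting register are all assumptions for illustration; no real NC2 sequence is used.

```python
HEPTAD = "abcdefg"
APOLAR = set("AVLIMFWC")  # simplified apolar set, an assumption for this sketch


def wheel_angles(sequence):
    """Angular position (degrees) of each residue on an ideal alpha-helical wheel."""
    return [(i * 100) % 360 for i in range(len(sequence))]


def heptad_register(sequence, first="a"):
    """Heptad position of each residue, given the position of the first residue."""
    offset = HEPTAD.index(first)
    return [HEPTAD[(offset + i) % 7] for i in range(len(sequence))]


def core_apolar_fraction(sequence, first="a"):
    """Fraction of a and d positions occupied by apolar residues."""
    register = heptad_register(sequence, first)
    core = [aa for aa, pos in zip(sequence, register) if pos in "ad"]
    if not core:
        return 0.0
    return sum(aa in APOLAR for aa in core) / len(core)


# Hypothetical two-heptad toy sequence with leucines at a and d.
toy = "LKELEDKLEELKSK"
for aa, pos, angle in zip(toy, heptad_register(toy), wheel_angles(toy)):
    print(aa, pos, angle)
print("apolar fraction at a/d:", core_apolar_fraction(toy))
```

Scanning all seven possible starting registers and comparing the resulting fractions is one simple way to look for the interrupted or shifted heptad patterns described for the NC2 domains.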