Crystal Structure of Human Type III Collagen Gly991–Gly1032 Cystine Knot-containing Peptide Shows Both 7/2 and 10/3 Triple Helical Symmetries*

Type III collagen is a critical collagen that comprises extensible connective tissue such as skin, lung, and the vascular system. Mutations in the type III collagen gene, COL3A1, are associated with the most severe forms of Ehlers-Danlos syndrome. A characteristic feature of type III collagen is the presence of a stabilizing C-terminal cystine knot. Crystal structures of collagen triple helices reported so far contain artificial sequences like (Gly-Pro-Pro)n or (Gly-Pro-Hyp)n. To gain insight into the structural properties exhibited by the natural type III collagen triple helix, we synthesized, crystallized, and determined the structure of a 12-triplet repeating peptide containing the natural type III collagen sequence from residues 991 to 1032 including the C-terminal cystine knot region, to 2.3Å resolution. This represents the longest collagen triple helical structure determined to date with a native sequence. Strikingly, the Gly991–Gly1032 structure reveals that the central non-imino acid-containing region adopts 10/3 superhelical properties, whereas the imino acid rich N- and C-terminal regions adhere to a 7/2 superhelical conformation. The structure is consistent with two models for the cystine knot; however, the poor density for the majority of this region suggests that multiple conformations may be adopted. The structure shows that the multiple non-imino acids make several types of direct intrahelical as well as interhelical contacts. The looser superhelical structure of the non-imino acid region of collagen triple helices combined with the extra contacts afforded by ionic and polar residues likely play a role in fibrillar assembly and interactions with other extracellular components.

Collagens are the most abundant proteins in animals, comprising an estimated one-third of the total protein by weight (1)(2)(3). At least 27 collagen types, which are formed from 42 distinct polypeptide chains, exist in vertebrates (1)(2). In addition, more than 20 other proteins, such as collectins, ficolins, and scavenger receptors, adopt collagen-like structures (2). Collagen is an essential molecule in vertebrates because it plays the dominant role in maintaining the structure of tissues. However, collagen and collagen-like proteins have many other important roles, such as cell adhesion, chemotaxis, cell migration, and the regulation of tissue remodeling during cell growth, differentiation, morphogenesis, and wound healing (2).
All collagen molecules consist of three polypeptide chains, called α chains, which contain at least one region of repeating Gly-Xaa-Yaa sequences (1)(2). In the collagen molecule, the three α chains each fold into a polyproline II-like left-handed structure, and the three polyproline II-like chains twist around each other to form a right-handed superhelix, called the collagen triple helix (4-7). Critical to the formation of the triple helix is the presence of a glycine residue at every third position in the chain because this residue is the only one that can fit in the small space at the center of the triple helix (8). Each of the three chains therefore has the repeating structure Gly-Xaa-Yaa, in which Xaa and Yaa can be any amino acid but are frequently the imino acids proline in the Xaa position and hydroxyproline (Hyp) in the Yaa position. Because both proline and Hyp are rigid, cyclic imino acids, they limit rotation of the polypeptide backbone and thus contribute to the stability of the triple helix. Collagen polypeptides that lack Hyp can fold into a triple helical conformation at low temperatures, but the triple helix formed is not stable at mammalian body temperature (8).
* This work was supported by a grant from the Shriners Hospital for Children, in part by Burroughs Wellcome Career Development Award 992863 (to M. A. S.), and in part by United States Department of Energy Contract DE-AC03-78SF00098. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact. The atomic coordinates and structure factors (
The number of Gly-Pro-Hyp repeats is the main, but not exclusive, factor in determining collagen thermostability (9). Approximately 90% of collagen tripeptide units contain at least one non-imino acid residue in the Xaa and/or Yaa position, and these residues likely play a role in collagen structure, stability, and function (5)(6)(7). Indeed, a notable feature of the collagen triple helix is that the amino acids occupying Xaa and Yaa positions are solvent-accessible. Because of this, these residues would be predicted to play important roles in interactions with other molecules, such as extracellular matrix proteins. In addition, these residues are predicted to be important in collagen triple helix self-association leading to fibril formation. The best characterized and most common collagen fibril form is the 67-nm (D) periodic fibril, which is observed in most connective tissues (10,11). Collagen types I, II, III, V, and XI self-associate to form these characteristic fibrils. Studies suggest that the axial staggering of collagen molecules that gives rise to these fibrils is due to electrostatic and hydrophobic interactions between neighboring molecules (12)(13)(14). Specifically, oppositely charged residues in the Xaa and Yaa positions of the collagen triple helix are predicted to play a role in determining the axial stagger of the fibril (11).
Such charged residues have also been implicated in the interactions between macrophage scavenger receptor and other molecules and in the hexamer formation of the serum complement protein C1q (15)(16)(17). Type III collagen contains multiple charged residues in the Xaa and Yaa positions of its chain (18). Type III collagen is the second most abundant collagen in human tissues after type I and is primarily found in tissues exhibiting elastic properties, such as skin, blood vessels, and internal organs. Type III collagen is common in fast growing tissue, particularly at the early stages of wound repair (19,20). Mutant mice have been generated by gene targeting in which the type III collagen gene (COL3A1) has been knocked out (20). These knock-out mice exhibit irregularly sized collagen fibers in the skin dermis as well as the aortic adventitia, which indicates the importance of type III collagen in regulating collagen fiber size. Notably, type III collagen null mice die perinatally, usually because of blood vessel or intestinal rupture. Thus, these data demonstrate that type III collagen is necessary for proper organ and tissue function, especially in distensible organs. These findings are consistent with the fact that mutations of type III collagen cause the most severe form of Ehlers-Danlos syndrome, EDS IV, which affects internal organs, arteries, joints, and skin. Indeed, EDS IV can result in sudden death when the large arteries rupture (20). The most severe forms of EDS IV are correlated with point mutations that replace a glycine with another residue near the C-terminal end of the triple helix, including G1012R, G1018V, and G1021E (21).
Structurally, type III collagen is a homotrimer composed of three α1(III) chains and resembles other fibrillar collagens. A key feature in the formation of type III collagen is a so-called disulfide or cystine knot, which is located between the triple helical region and the C-terminal telopeptide (22,23). The knot is formed by three interchain disulfide bonds, and it significantly stabilizes the triple helical structure. The stability imparted by the disulfide knot has been successfully used to produce collagenous peptides that otherwise would be too unstable to study (24,25). Production of these peptides involves the C-terminal extension by the bis-cysteinyl-sequence GPCCG, followed by air or glutathione oxidation at lower temperature under slightly basic conditions.
In the 1950s two structural models for collagen with different fiber periods and helical symmetries were proposed based on the fiber diffraction pattern of native collagen. These were the 7/2 helical model with a 20-Å axial repeat (26,27) and the 10/3 helical model with a 30-Å axial repeat (27). These single-helical models were discarded after the proposal of a triple helical structure with 10/3 helical symmetry and a 28.6-Å axial repeat by Ramachandran and Kartha (28). The first crystal structure of (Pro-Pro-Gly)10 showed a triple helix with 7/2 helical symmetry and a 20-Å axial repeat (29). More recently, fiber diffraction analysis of native collagen using advanced data acquisition techniques revealed that the x-ray diffraction data can be explained not only by the prevailing 10/3 helical model but also by the 7/2 helical model (7). Almost all of the high resolution structures of model peptides adopt 7/2 helical symmetry, and a conformation close to the 10/3 helix appears only in the guest region of host-guest peptides, such as the T3-785 peptide and the integrin-binding protein complexed with integrin (7).
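The N/M symmetry labels used above can be translated into per-unit numbers. The short sketch below assumes that an N/M helix places N repeating units (Gly-Xaa-Yaa triplets) in M turns over one axial repeat; the function name `helix_unit_parameters` is ours and handedness is ignored.

```python
def helix_unit_parameters(units, turns, axial_repeat):
    """Return (twist_deg, rise) per repeating unit for a units/turns helix.

    Assumes `units` repeating units complete `turns` full turns over one
    axial repeat of length `axial_repeat` (in angstroms). Handedness is ignored.
    """
    twist_deg = 360.0 * turns / units
    rise = axial_repeat / units
    return twist_deg, rise


# Axial repeats quoted in the text: 20 A for 7/2 and 28.6 A for the 10/3 triple helix.
for label, units, turns, repeat in [("7/2", 7, 2, 20.0), ("10/3", 10, 3, 28.6)]:
    twist, rise = helix_unit_parameters(units, turns, repeat)
    print(f"{label}: {units / turns:.2f} units/turn, "
          f"twist {twist:.2f} deg/unit, rise {rise:.3f} A/unit")
```

Under these assumptions the two symmetries have almost the same rise per triplet (about 2.86 Å) and differ mainly in the twist per triplet, roughly 103° for 7/2 versus 108° for 10/3.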
Crystallization and structure determination of collagen peptides are challenging. All of the available crystal structures are of either artificial mimics (like (GPP)n or (GPO)n, where O = 4(R)-hydroxyproline) or host-guest peptides in which a short stretch of one to three native tripeptide units is flanked by three to five GPO repeats.
Unfortunately, these structures give only limited insight into how side chains other than imino acid residues in the Xaa and Yaa positions contribute to collagen structure and stability (30-43). The length of a collagen peptide is an important determinant in crystallization. Longer fragments impart higher flexibility, which may impede crystallization, whereas the production of a thermally stable triple helix actually requires longer sequences and the addition of artificial stabilizing tripeptide units, like GPO. These factors have limited the length of collagen fragments that have crystallized to only nine to eleven tripeptide units. Furthermore, this has restricted the number of integrated native tripeptide units to only one to three. One strategy used to obtain structural information on a longer triple helix was to fuse a stable trimeric molecule, foldon, which is a trimerization domain of fibritin, to a (GPP)10 collagen mimic (44). The resulting crystal structure revealed that there was a dramatic kink between the (GPP)10 triple helix and the foldon domain, which permitted adaptation to the mismatch between the 3-fold rotation symmetry of the foldon domain and the one residue stagger of the collagenous structure (45). Here we describe another approach to stabilize a long collagen triple helix structure by the use of a native type III collagen C-terminal disulfide knot (46-49). Specifically, we crystallized a 42-residue collagen peptide containing the C-terminal type III collagen sequence from residues 991 to 1032, which contains the residues most commonly mutated in very severe forms of Ehlers-Danlos syndrome IV. This peptide also contains the native C-terminal cystine knot region. Thus, we demonstrate that utilization of the cystine knot opens doors to crystallization of lengthy and native collagen fragments.

EXPERIMENTAL PROCEDURES
Peptide Sequences-The sequence of the type III collagen peptide used in this study is as follows: (GPI GPO GPR GNR GER GSE GSO GHO GsMO GPO GPO GAO GPCCGG)3 (O is 4(R)-hydroxyproline, and sM is L-selenomethionine). These residues correspond to residues 991-1032 of the human collagen III α chain, with the exception of the L-selenomethionine, which was substituted for the wild type glutamine for use in multiple wavelength anomalous diffraction (MAD) phasing.
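The bookkeeping implied by this sequence can be checked with a few lines of code. The sketch below is illustrative only: the helper names are ours, the two-letter code sM is treated as a single residue, and the offset of 990 between peptide and native numbering is taken from the statement that the peptide spans residues 991-1032.

```python
SEQUENCE = "GPI GPO GPR GNR GER GSE GSO GHO GsMO GPO GPO GAO GPCCGG"
IMINO_ACIDS = {"P", "O"}  # proline and 4(R)-hydroxyproline


def parse_residues(sequence):
    """Split the one-letter sequence into residues, keeping 'sM' as one residue."""
    compact = sequence.replace(" ", "")
    residues, i = [], 0
    while i < len(compact):
        if compact.startswith("sM", i):
            residues.append("sM")
            i += 2
        else:
            residues.append(compact[i])
            i += 1
    return residues


def native_number(peptide_number):
    """Map peptide numbering (1-42) to native numbering (991-1032)."""
    return 990 + peptide_number


residues = parse_residues(SEQUENCE)
cysteines = [i + 1 for i, r in enumerate(residues) if r == "C"]

# Triplets of the 12 collagenous repeats (residues 1-36) with no imino acid
# in either the Xaa or the Yaa position.
imino_free = []
for start in range(0, 36, 3):
    gly, xaa, yaa = residues[start:start + 3]
    if xaa not in IMINO_ACIDS and yaa not in IMINO_ACIDS:
        imino_free.append((start + 1, gly + xaa + yaa))

print(len(residues), "residues per chain")
print("Cys (peptide/native):", [(c, native_number(c)) for c in cysteines])
print("Imino-free triplets (first residue, triplet):", imino_free)
```

Under this reading, the triplets lacking imino acids cluster in the middle of the peptide, and the two cysteines sit near the C terminus.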
Peptide Folding, Oxidation, and Purification of Disulfide-linked Trimer-The lyophilized, reduced peptide was dissolved in degassed and N2-saturated 50 mM sodium acetate buffer, pH 4.5, under an N2 atmosphere and was kept at 4°C for 24 h to allow triple helix formation prior to oxidation. Two strategies of oxidation were used: exposure to atmospheric O2 alone, or addition of reduced (10 mM) and oxidized (1 mM) glutathione together with exposure to atmospheric O2. In both cases, the pH was raised to 8.3 with a saturated solution of Tris. Oxidation was carried out for 5-7 days, and the peptide mass was periodically analyzed by liquid chromatography-mass spectrometry. The maximum yield of covalently linked trimeric peptide was ~60-70% and was slightly higher when glutathione was used for oxidation. To separate covalently linked trimeric peptide from other oligomers, the oxidized crude material was dissolved in deionized 8 M urea solution with 0.1% trifluoroacetic acid to prevent disulfide exchange and applied to a sieve column. Trimer-containing fractions were pooled and further purified by reverse phase HPLC using a C18 column.
Peptide Crystallization, Data Collection, Structure Determination, and Refinement-The purified and lyophilized covalently linked trimeric collagen III peptide, Gly991–Gly1032, was dissolved at a concentration of 15 mg/ml in 5 mM acetic acid. The peptide was crystallized at 22°C using the hanging drop vapor diffusion method. For crystallization, 2 μl of the peptide solution was mixed with 2 μl of the reservoir solution of 20% polyethylene glycol monomethyl ether 550. The crystals appeared as very thin plates in a period of 1-5 days and are monoclinic, space group P21, with a = 31.98 Å, b = 21.52 Å, c = 68.97 Å, and β = 92.58°. Although the crystals diffracted beyond 2.5-Å resolution, they displayed extremely high mosaic spread (>3.0°) even when x-ray intensity data were collected at room temperature. Thus, several strategies for cryoprotection were tried in an attempt to improve the quality of the diffraction. The data were collected at ALS Beamline 8.2.1. Good quality data were obtained for only one crystal. For this data collection, glycerol was first added to the drop containing the crystal to a final concentration of 10%. The drop sat for 8 h, and the crystal was placed directly in the cryostream. However, at this point the mosaic spread was still unacceptably high. Thus, the crystal was annealed several times by removing it from the cryostream and placing it back in the drop solution. After two annealing cycles the diffraction, although still highly mosaic (1.7°), was of sufficient quality to collect data. A complete three wavelength MAD data set was collected on this crystal and was used for structure determination.
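As a quick consistency check on the reported cell, the volume of a monoclinic unit cell follows from V = a·b·c·sin β. The sketch below uses the cell constants quoted above; the function name is ours, and the division by two relies on the general fact that space group P21 has two asymmetric units per cell.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume (cubic angstroms) of a monoclinic cell: V = a * b * c * sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


# Cell constants reported in the text.
volume = monoclinic_cell_volume(31.98, 21.52, 68.97, 92.58)
# P2(1) has two general positions, so each asymmetric unit occupies half the cell.
volume_per_asu = volume / 2.0

print(f"Cell volume: {volume:.0f} A^3")
print(f"Volume per asymmetric unit (one triple helix): {volume_per_asu:.0f} A^3")
```

The cell volume comes out at roughly 4.74 × 10^4 Å^3, so each triple helix in the asymmetric unit has about half of that available, including solvent.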
The positions of the three selenomethionines in the triple helix were obtained using SOLVE. The phases obtained from these positions were improved by density modification in CNS, and the resulting density modified map was used for model building in O (50-52). After two-thirds of the structure was built, phase combination using phases from the partial model was used to improve the map. This permitted the current structure, which consists of one complete triple helix in the crystallographic asymmetric unit (ASU), to be built. Multiple cycles of simulated annealing, xyzb refinement, and rebuilding in O resulted in an Rwork/Rfree of 24.3%/27.4% to 2.30-Å resolution. The current model has excellent stereochemistry (Table 1) (53). Multiple omit maps were calculated throughout the process to confirm the correctness of the model. Nonetheless, the electron density of the cystine knot residues remained poor, and the C-terminal residues that precede the cysteine residues display high B-factors, consistent with this region being disordered or adopting multiple conformations.
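For readers less familiar with the statistic, Rwork and Rfree share one definition, R = Σ||Fobs| - |Fcalc|| / Σ|Fobs|, evaluated over the working reflections and over a held-out test set, respectively. The following is a minimal sketch rather than what the refinement program does internally: the amplitudes are made-up toy numbers, the function names are ours, and the scale factor between observed and calculated amplitudes is assumed to be 1.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic R: sum(||Fobs| - |Fcalc||) / sum(|Fobs|)."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def r_work_and_free(f_obs, f_calc, is_test):
    """Split reflections by the test-set flag and return (R_work, R_free)."""
    work = [(o, c) for o, c, t in zip(f_obs, f_calc, is_test) if not t]
    free = [(o, c) for o, c, t in zip(f_obs, f_calc, is_test) if t]
    return r_factor(*zip(*work)), r_factor(*zip(*free))


# Toy amplitudes (hypothetical, NOT taken from this structure).
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 85.0, 50.0, 45.0]
is_test = [False, False, False, True]  # last reflection plays the test set

r_work, r_free = r_work_and_free(f_obs, f_calc, is_test)
print(f"R_work = {r_work:.3f}, R_free = {r_free:.3f}")
```

Because the test reflections are excluded from refinement, Rfree is normally somewhat higher than Rwork, as in the values reported for this structure.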
Analysis of Triple Helix Geometry-Helical parameters were calculated based on the method of Sugeta and Miyazawa (54).
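To illustrate what such helical parameters are, the sketch below recovers the twist and rise per unit from four consecutive, equivalent points on a regular helix. This is a simpler second-difference construction, not the Sugeta and Miyazawa procedure itself, and the helper names and the synthetic test helix (radius 1.5 Å) are our own assumptions.

```python
import math


def _sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def twist_and_rise(p1, p2, p3, p4):
    """Return (twist_deg, rise) from four consecutive points on a regular helix.

    The axis is oriented so that the rotation is counterclockwise about it;
    a negative rise therefore indicates a left-handed helix.
    """
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    d2 = _sub(b2, b1)  # p1 - 2*p2 + p3, points from p2 toward the axis
    d3 = _sub(b3, b2)  # p2 - 2*p3 + p4, points from p3 toward the axis
    axis = _cross(d2, d3)
    length = math.sqrt(_dot(axis, axis))
    if length == 0.0:
        raise ValueError("points are collinear or twist is 0/180 degrees")
    axis = tuple(x / length for x in axis)
    cos_twist = _dot(d2, d3) / math.sqrt(_dot(d2, d2) * _dot(d3, d3))
    twist = math.degrees(math.acos(max(-1.0, min(1.0, cos_twist))))
    return twist, _dot(b2, axis)


def helix_points(radius, twist_deg, rise, n=4):
    """Generate n points on an ideal helix around the z axis (synthetic data)."""
    t = math.radians(twist_deg)
    return [(radius * math.cos(k * t), radius * math.sin(k * t), k * rise)
            for k in range(n)]


twist, rise = twist_and_rise(*helix_points(1.5, 108.0, 2.86))
print(f"twist = {twist:.2f} deg, rise = {rise:.3f} A")
```

With real coordinates the points would be equivalent atoms (for example, glycine Cα atoms) of consecutive symmetry-related units, and local irregularities would make the result depend on which four points are chosen.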

TABLE 1 (partial; only the following rows and footnote fragments survive)
Ramachandran analysis
Most favored (%/no.): 92.9/52
Additionally allowed (%/no.): 7.1/4
Generously allowed (%/no.): value not recovered
Footnote fragments: [...] is the observed intensity and Ihkl is the final average value of intensity. b, the values in parentheses are for the highest resolution shell. [...] the reflections belong to a test set of 5% randomly selected data.

RESULTS AND DISCUSSION
Folding and Oxidation of the Gly991–Gly1032 Peptide-Previous studies demonstrated that the correct folding and oxidation of the disulfide knot requires preformation of the collagen triple helix (47,48). Triple helix formation of the reduced Gly991–Gly1032 peptide was confirmed by thermal unfolding as monitored by the change of CD signal at 225 nm. The midpoint transition temperature was 12°C, measured at a peptide concentration of 1 mg/ml in 0.1 M sodium acetate buffer, pH 4.5. Pronounced hysteresis was observed for the reduced peptide transitions, as expected (48). Oxidation was performed at 5 mg/ml peptide concentration and 4°C, the temperature required for triple helix formation. Notably, the efficiency of trimer formation was ~60-70%, which is similar to what has previously been reported (47). The Tm measured for the oxidized peptide in the same buffer was substantially higher, 30°C. Crystallization was performed at 22°C, at which temperature the triple helix is stable.
Overall Structure of Collagen Type III (Gly991–Gly1032) Peptide-The structure of the Gly991–Gly1032 peptide was determined by MAD. For simplicity the residues in each chain have been numbered from 1 to 42 (i.e. residue 1 corresponds to 991 and residue 42 corresponds to 1032 in the native sequence). For MAD analysis, the Gln, corresponding to Gln 16 was replaced by selenomethionine (see "Experimental Procedures"). The crystallographic ASU consists of one complete triple helix (Figs. 1 and 2). As has been observed in other collagen peptide crystal structures, the N- and C-terminal ends of each of the three triple helical chains display weak electron density and appear mostly disordered. The C-terminal residues are particularly disordered, and the C-terminal residues that are present display significantly elevated B-factors (Fig. 1B). The final model consists of residues 3-39 of chain A, residues 4-40 of chain B, and residues 2-38 of chain C. There are also 61 water molecules in the structure. The final Rwork/Rfree is 24.3%/27.4% to 2.3-Å resolution. The packing of triple helices in the crystal appears to be pseudotetragonal (Fig. 3).
FIGURE 1. Overall structure of the Gly991–Gly1032 peptide. A, the overall triple helix is shown for the Gly991–Gly1032 structure and colored according to superhelix symmetry: region 1 is colored yellow, region 2 is red, region 3 is colored yellow, and region 4, the cystine knot region, is green. Residue 4 from each chain is numbered (to indicate stagger), and the last residues that are observed in each chain are also labeled. B, the Gly991–Gly1032 triple helix shown in the same orientation as A but colored according to B-factor, with the gradation of blue to red reflecting increasing B-factors (i.e. blue represents low B-factors, and red represents high B-factors). This figure was made using PyMOL (55).
The Disulfide Knot-The Gly991–Gly1032 peptide contains a GPCCGG sequence at the C terminus, which has been shown to form a so-called cystine knot. The connectivity of the cystine knot has not been determined despite the fact that it has been extensively studied by several techniques including NMR (47). There are eight possible ways to connect the six cysteines to form the disulfide knot. Two models were previously suggested based on steric compatibility (47,56). According to these modeling studies, the three collagen chains in our structure are designated as A, B, and C, where chain A has a one-residue stagger toward the N terminus, followed by the B chain, and finally the C chain (Fig. 1A). Based on this type of stagger, the first model, proposed by Bruckner et al. (56), has the connectivity A1-B1/A2-C1/B2-C2 (Fig. 4A). The second model, proposed by Barth et al. (47), has the connectivity A1-B2/A2-C1/B1-C2 (Fig. 4A). The two models share the A2-C1 disulfide bridge yet differ in the other disulfide connectivities. These two models also differ in the final conformation that would be adopted by the peptides containing the cysteines. Specifically, in the first model, the cystine knot residues adopt φ/ψ dihedral angles consistent with a collagen triple helix, whereas this is not the case for the second model. We see weak electron density for this region of the structure. Clear density is only observed for one disulfide bond, between the A1 and B2 cysteines (Fig. 4B), which is in conflict with the first model. In addition, the φ/ψ dihedral angles of the cysteines in our structure deviate significantly from those observed in a collagen triple helix. Thus, although the electron density for the two remaining disulfide bonds is poor, these combined data are supportive of the model proposed by Barth et al. (47) or a third model with the connectivity A1-B2/A2-C2/B1-C1 (Fig. 4A).
FIGURE 4. [...] The stagger of the triple helix that corresponds to the Gly991–Gly1032 structure is indicated at the left side (A to B to C) in red. B, composite omit map contoured at 1.0σ showing the density for the cystine knot region. Clear density is only observed for the A1-B2 connection, which is consistent with models 2 and 3. The map is shown as a blue mesh. The structure is shown as sticks and colored according to atom type with carbon, nitrogen, oxygen, and sulfur shown in yellow, blue, red, and green, respectively.
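The count of eight possible knot connectivities, and the two that remain once the A1-B2 bond is fixed, can be reproduced by enumeration. The sketch below assumes that each of the six cysteines forms exactly one disulfide and that all three bonds are interchain; that is our reading of how the number eight arises, and the function names are ours.

```python
CYSTEINES = ["A1", "A2", "B1", "B2", "C1", "C2"]  # chain letter + Cys index


def perfect_matchings(items):
    """Yield every way to split `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in perfect_matchings(remaining):
            yield [(first, partner)] + tail


def interchain_knots():
    """Pairings in which no disulfide joins two cysteines of the same chain."""
    knots = []
    for matching in perfect_matchings(CYSTEINES):
        if all(a[0] != b[0] for a, b in matching):
            knots.append("/".join(f"{a}-{b}" for a, b in matching))
    return knots


knots = interchain_knots()
with_a1_b2 = [k for k in knots if "A1-B2" in k.split("/")]

print(len(knots), "interchain connectivities")
for knot in knots:
    print(" ", knot)
print("Containing A1-B2:", with_a1_b2)
```

Of the 15 ways to pair six cysteines, 8 contain no intrachain bond. Requiring the A1-B2 bond leaves only the Barth et al. connectivity and the third model, which is why the single well-defined bond cannot discriminate between them.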
The poor electron density we observed in the disulfide knot region might be due to several reasons. One may be the flexibility of the last residues of the polypeptide chain. Indeed, there are no crystal contacts to the C-terminal region that could stabilize these residues (Fig. 4B). However, the interchain disulfide bridges formed in the disulfide knot region are expected to produce an extremely rigid structure and thus might be expected to be visible in electron density maps. If this is the case, another possible explanation for the lack of electron density is the existence of multiple disulfide connectivities, which are averaged over the whole ensemble of molecules in the crystal. Indeed, as noted, only 60-70% of the peptide is correctly oxidized into trimers, whereas the rest is trapped as covalently linked dimers with two interchain disulfide bridges and monomers with intrachain disulfide bonds. Therefore, it seems highly probable that even the covalently linked trimers contain more than one possible structure of the disulfide knot. The presence of multiple disulfide connectivities would explain why NMR also failed to delineate a single structure for the cystine knot (47). Multiple connectivities are highly unlikely in the real collagen structure. It is possible that the formation of the correct disulfide connections may require other regions of the collagen III chain not present in our structure. Indeed, the peptide used in our studies does not include the C-terminal telopeptide sequence, which normally follows the cystine knot, nor does it contain the C-terminal propeptide that initiates trimerization of the triple helix. These regions could play important roles in selecting the native conformation of the disulfide knot.
The Gly991–Gly1032 Structure Contains Regions of Distinct Superhelical Symmetry-The Gly991–Gly1032 peptide can be divided into four main regions based on amino acid sequence type: an N-terminal imino acid-containing region, a middle stretch that contains non-imino acids in the Xaa and Yaa positions [...]
NOVEMBER 21, 2008 • VOLUME 283 • NUMBER 47
As noted, the residues in the cystine knot region show strong deviation from any type of triple helical symmetry. The fact that region 3, residues 18-36 (corresponding to collagen III residues 1010-1027), is a hot spot for mutations leading to the most severe forms of EDS IV suggests that the formation of this tighter triple helix symmetry is important in maintaining the stability of the collagen III triple helix structure. Indeed, the stretch of residues from 6 to 17 that precedes region 3 adopts a looser conformation. The looser conformation observed in this region is likely attributable to the presence of non-imino acids in the Xaa and Yaa positions of the triple helical chains, as similar alterations in collagen superhelical symmetry have been observed in the T3-785 peptide, which contains a short stretch of biologically relevant sequence (37).
Intra- and Interstrand Contacts-The Gly991–Gly1032 structure contains a number of non-imino acid residues including charged and polar residues and therefore provides examples of several kinds of intrachain, interchain, and interhelical contacts that may aid collagen stability and self-assembly. Previous studies examining collagen stability at different pH values indicated that ion pairing interactions increase the stability of the triple helix (57). In the Gly991–Gly1032 structure there are numerous intra- and interchain ion pair and hydrogen bonding interactions between charged and polar residues (Fig. 6 and Table 2). There are multiple examples of interchain contacts, but the only intrachain ion pairing interaction is that between the side chains of Arg15(A) and Glu18(A). Interchain ion pairs include those between Arg9(C) and Glu14(A) and between Arg12(A) and Glu14(B) (Fig. 6, A and C). In addition to ion pairing interactions, there are numerous interchain hydrogen bonds. Interestingly, many of the contacts involve serine residues. For example, Ser17(B) contacts Glu18(A), Ser17(A) interacts with both Arg12(C) and Arg15(C), and Ser20(A) hydrogen bonds with Arg15(C) (Fig. 6, A and C). These hydrogen bonds form a continuous stretch of interactions: Arg12(C) to Ser17(A) to

TABLE 2
Interchain and interhelical side chain interactions
Each residue in the peptide is indicated at the left and is found three times in the triple helix (chains C, A, and B). The amino acids contacted by this residue are shown in the table. Contacts between residues from different triple helices are indicated by the residues and the chain with a prime designation (i.e. A′, B′, or C′). CK, cysteine knot region; NP, residue is not present in structure; Wat indicates that the residue is involved in water contacts; Sol indicates that the residue is exposed to solvent but not observed to be making direct contacts to any water molecule; - indicates glycine residue; C=O, carbonyl; SeMet, L-selenomethionine.
Interhelical Contacts-As found in other collagen peptide structures, in the Gly991–Gly1032 structure water molecules play a key role in mediating interactions between residues in different triple helices. Such water-mediated contacts are thought to be important in fibril formation. However, the Gly991–Gly1032 structure differs from other peptide structures in that there are numerous direct ionic and polar contacts between triple helices as well as direct nonpolar interactions. One type of direct hydrogen bonding contact between collagen triple helices that has been observed in structures of other collagen triple helices is that between two Hyp residues located on different triple helices. However, in our Gly991–Gly1032 peptide structure there are no such contacts. Instead, direct hydrogen bonding interactions between non-imino acid side chains dominate and, in fact, appear critical to the formation of the staggered stacking of collagen molecules observed in the crystal. The staggered axial packing is between helices arranged in an antiparallel manner. However, in vivo, helices primarily pack in a parallel fashion to form fibrils. Nonetheless, data suggest that these types of interactions must play crucial roles in helical packing and fibril formation in vivo (12,13).
One extensive hydrogen bonding network is composed of arginine and serine residues. In this Arg-Ser network there are hydrogen bonds from Ser17(A) to Arg15(C) to Ser20(A) on one triple helix to Ser17(C)′ and Arg15(B)′ (where prime indicates residues from a different triple helix) (Fig. 6C and Table 2). This complicated arrangement of interactions serves to stabilize the packing of the molecules in the crystal. This is supported by the fact that the atoms of the residues involved in this network display among the lowest B-factors in the structure (Fig. 1B). Interestingly, another contact that appears crucial for stabilization of the staggered packing is a unique His(C)-His(B)′ stacking interaction between residues on different collagen triple helices (Fig. 6B). This appears to be the first instance of a hydrophobic interhelical stacking interaction observed in a collagen structure; yet the importance of such contacts in fibril formation has been indicated by previous studies (12,13).
The finding that non-imino acid side chains can contribute to the stacking of triple helices into staggered arrays is consistent with data from Doyle et al. (11) indicating that such residues participate in determining the axial stagger of collagen fibrils in vivo (7,58). However, the variability of the polar and ionic interactions observed from the same residue in different chains suggests that the interactions we observe are likely an interchangeable subset of many possible interactions (Table 2). Indeed, the lack of highly specific side chain-side chain interactions supports data from magnetic resonance studies, which suggested that collagen molecules in fibrils experience a large degree of rotational freedom about the helical axis, and therefore interhelical contacts are not composed of a single set of interactions (59).
Gly991–Gly1032 Peptide φ/ψ Dihedral Angles Show Deviation from Typical Collagen Structures-In the collagen structures solved to date, the φ/ψ dihedral angles of residues in the peptides all fall within a very narrow range of values (3, 30-41). This likely reflects the fact that these dihedral angles are critical in preorganizing the formation of the triple helix while also allowing for the formation of optimal Gly-NH-OC-Pro Xaa interchain "signature" hydrogen bonds. As seen in Table 3, the average Gly-NH-OC-Pro Xaa interchain hydrogen bond distances and angles observed in our structure (3.03 Å and 167° for region 2; 2.97 Å and 164° for region 3) are essentially the same as those in other high resolution collagen triple helical structures (30-41). Moreover, the φ/ψ dihedral angles of residues in region 3 of our structure are also essentially the same as those observed in other high resolution collagen peptide structures.
However, examination of the φ/ψ dihedral angles of the residues in region 2 of the Gly991–Gly1032 peptide shows significant deviations of ψ for glycine and Xaa position residues (Gly ψ value of ~163° compared with 175° and an Xaa ψ value of ~154° compared with 163°) as well as of the dihedral angles of the Yaa position arginine residues (Table 4). Although region 1, the N-terminal region of the structure, also shows some divergence from optimal triple helical geometry, similar deviations have been observed in the extreme N-terminal and C-terminal ends of other collagen peptide structures and appear to be caused by end fraying. Moreover, the small number of residues in region 1 makes these averaged φ/ψ dihedral angle values statistically insignificant.
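For reference, a backbone dihedral angle is computed from four consecutive atoms (C, N, Cα, C for φ and N, Cα, C, N for ψ, by the usual convention). The sketch below is a generic implementation with the IUPAC sign convention and a helper for wrapped angular differences; the function names and the toy coordinates are ours, and only the two pairs of angle values come from the text.

```python
import math


def _sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points."""
    a, b, c = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    b_len = math.sqrt(_dot(b, b))
    y = b_len * _dot(a, _cross(b, c))
    x = _dot(_cross(a, b), _cross(b, c))
    return math.degrees(math.atan2(y, x))


def angle_difference_deg(observed, reference):
    """Signed difference observed - reference, wrapped to [-180, 180)."""
    return (observed - reference + 180.0) % 360.0 - 180.0


# Toy geometry (hypothetical coordinates): the fourth point is rotated by
# 60 degrees about the central bond relative to the first.
alpha = math.radians(60.0)
toy = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0),
       (math.cos(alpha), math.sin(alpha), 1.0)]
print(f"toy dihedral: {dihedral_deg(*toy):.1f} deg")

# Deviations for the glycine and Xaa values quoted in the text.
print("Gly deviation:", angle_difference_deg(163.0, 175.0), "deg")
print("Xaa deviation:", angle_difference_deg(154.0, 163.0), "deg")
```

The wrapped difference matters for angles near ±180°, where a naive subtraction would report a deviation of almost 360° for two nearly identical conformations.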
Thus, unlike the notable departure of region 2 from triple helical geometry, the deviations in region 1 are likely not significant. Interestingly, the multiple deviations from ideal dihedral angles observed for residues in region 2 are consistent with the looser triple helical conformation in this area. These data indicate that region 2 likely does not form as stable a triple helix as does region 3 or other (GPP)n-containing stretches. It should be noted that the low B-factors of the residues in region 2 do not reflect the stability of the triple helical conformation in this region but rather are indicative of the numerous interactions mediated by these residues with symmetry-related molecules, which act to stabilize their positions within the crystal. Thus, it appears that the presence of non-imino acid residues plays an important role in collagen structure/function by providing a looser overall conformation of the triple helix while also permitting the formation of important hydrogen bonding and ionic interactions that not only stabilize the individual triple helix but may also mediate interhelical contacts that are important in collagen packing. However, more examples of structures of collagen molecules with non-imino acid residues in the Xaa and Yaa positions and concomitant biochemical studies of such peptides are needed to elucidate the importance of such contacts and the energetic trade-off of a looser structure versus the added intrachain, interchain, and interhelical interactions.
In conclusion, the structure of the collagen III peptide, Gly991–Gly1032, opens the possibility of obtaining structural information for native sequences of collagen peptides that had previously been too unstable to crystallize. More importantly, the structure, which is the first of a long, non-imino acid-containing sequence of a native collagen triple helix, reveals properties imparted by such amino acid residues that may be important in collagen folding, fibril formation, and interactions with other proteins.