Conformational Features of a Natural Break in the Type IV Collagen Gly-X-Y Repeat*

Fibrillar collagens have an absolute requirement for Gly as every 3rd residue, whereas breaks in the Gly-X-Y repeating pattern are found normally in the triple helix domains of non-fibrillar collagens, such as type IV collagen in basement membranes. In this study, a model 30-mer peptide is designed to include the interruption GPOGAAVMGPOGPO found in the α5 chain of type IV collagen. The GAAVM peptide forms a stable triple helix, with Tm = 29 °C. When compared with a control peptide with Gly as every 3rd residue, the GAAVM peptide has a marked decrease in the 225 nm maximum of its CD spectrum and a 10 °C drop in stability. A 50% decrease in calorimetric enthalpy is observed, which may result from disruption of ordered water structure anchored by regularly placed backbone carbonyls. NMR studies on specific 15N-labeled residues within the GAAVM peptide indicate a normal triple helical structure for Gly-Pro-Hyp residues flanking the break. The sequence within the break is not disordered but shows altered hydrogen exchange rates and an abnormal Val chemical shift. It was previously reported that a peptide designed to model a similar kind of interruption in the peptide (Pro-Hyp-Gly)10, (GPOGPOPOGPO), is unable to form a stable triple helix, and replacement of GAA by GPO or VM by PO within the GAAVM break decreases the stability. Thus, rigid imino acids are unfavorable within a break, despite their favorable stabilization of the triple helix itself. These results suggest some non-random structure typical of this category of breaks in the Gly-X-Y repeat of the triple helix.

The repeating (Gly-X-Y)n sequence of the collagen triple helix allows for easy identification of this structural motif from amino acid sequences. The collagen triple helix structure consists of three supercoiled polypeptide chains, each in a polyproline II-like conformation (1-4). The repeating pattern of Gly as every 3rd residue is generated by steric requirements of the close packing of the three supercoiled polypeptide chains near a central axis. The collagen amino acid sequence is also characterized by a high frequency of proline and hydroxyproline (Hyp) in the X and Y positions, respectively, which serves to stabilize the polyproline II-like helix of the individual chains. The collagen triple helix is the defining motif of all extracellular matrix proteins classified as collagens and is also found as a domain in various other proteins, including C1q, macrophage scavenger receptor, collagenous tail of asymmetric acetylcholinesterase, and the bacterial proteins Streptococcus pyogenes scl and Bacillus anthracis bcl1 (5-9).
Although the (Gly-X-Y)n repeating sequence is a requirement to form a collagen triple helix, this pattern is perfectly maintained in some collagen domains but not in others (9). The most abundant collagens are those found in fibrils with a 670 Å axial period (types I, II, III, V, XI), and these all maintain a precise Gly-X-Y repeat throughout their ~1000-residue triple helix domain. The replacement of even one Gly by another residue as a result of a single base change results in a pathological condition (10-12). For instance, missense mutations leading to osteogenesis imperfecta have been identified at more than 200 Gly sites along the α1 and α2 chains of type I collagen (13). However, such a strict regularity of Gly as every 3rd residue throughout the triple helix domain is not required in non-fibrillar collagens, which may be found as membrane proteins, on the surface of periodic collagen fibrils, as networks, or in anchoring fibrils (9). For instance, type IV collagen, which forms a network-like array in basement membranes, contains more than 20 breaks in the (Gly-X-Y)n pattern of its ~1350-residue-long triple helix (14-16). These normal interruptions found in non-fibrillar collagens may play a role in molecular structure, self-association within the network, or binding (17).
A number of possible structural perturbations have been proposed to occur at interruptions in the Gly-X-Y repeating pattern of a triple helix. A kink in the triple helix is known to occur in C1q at a site where the A chain has an extra Thr, the C chain has a Gly-to-Ala replacement, and the B chain has an uninterrupted Gly-X-Y sequence (18,19). Alternatively, an interruption could lead to a flexible site within the triple helix (16,20). Visualization of type IV collagen chains by rotary shadowing shows some flexible sites, and the positions of flexible sites within the molecule correlate with several sequence interruptions. Two crystal structures were solved for the homotrimeric model peptide (Pro-Hyp-Gly)10 containing an interruption. Peptides containing either a Gly-to-Ala replacement (3) or a deletion of one Hyp residue show straight rod-like triple helices but with a loss of angular or axial register at the interruption site.
In addition to affecting molecular and higher order structure, breaks in the (Gly-X-Y)n pattern can also serve as interaction sites of biological importance. For example, the sequence of the α1 chain of type IV collagen contains a break, GAKGEPGEFYFDLRLKGDKGDP, which was shown to promote the adhesion of melanoma, ovarian carcinoma, and Jurkat cells (21,22). In the mannose-binding protein, the serine proteinases are shown to bind near a kink in the triple helix domain (23). A kink, flexible site, or loss of helix registration would introduce a structurally distinct site that could be involved in recognition.
Structural and functional consequences could be influenced by the nature of the break and by the sequence of surrounding residues (24). Breaks can be classified by the way in which they interrupt the Gly-X-Y pattern. In many cases, the presence of Gly as every 3rd residue is altered such that glycines are separated by 1 or 4 residues, rather than the normal 2 residues, whereas in other cases, there are larger stretches of non-repeating residues separating Gly-X-Y repeats. It has been suggested that the Gly-X-Y residues surrounding breaks have some characteristic features with respect to charges and imino acids (24). Peptide models provide an approach to characterizing the effects of breaks on triple helix structure and stability (3,25,26). Here, studies are presented on a peptide that models a break in the triple helix found in a chain of type IV collagen. Type IV collagen is a family of six chains, α1(IV), α2(IV), α3(IV), α4(IV), α5(IV), and α6(IV), which can form three distinct heterotrimeric collagen molecules with tissue-specific locations: (α1(IV))2α2(IV) in all mammalian basement membranes; α3(IV)α4(IV)α5(IV) in renal glomerular basement membrane; and (α5(IV))2α6(IV) in smooth muscle basement membrane (27). Each molecule has a long triple helix domain, with a globular C terminus involved in chain selection and dimerization of molecules. There are over 20 breaks in the Gly-X-Y repeat of each type IV chain (14). Despite the presence of these interruptions in native type IV collagen, an additional break due to the replacement of one Gly residue in the Gly-X-Y repeat is not tolerated. More than 150 mutations leading to Alport syndrome have been defined within the X-linked α5(IV) chain, and these are predominantly single base mutations replacing one Gly by another residue (15,28,29).
Studies are presented here on a homotrimer peptide model for a natural break found in the α5(IV) chain of type IV collagen, GPOGAAVMGPOGPO (residues 386-399), and a set of homologous peptides with imino acids introduced within the break. A peptide in which the Gly as every 3rd residue is restored was designed as a control. CD spectroscopy, differential scanning calorimetry (DSC), and NMR studies indicated that the GAAVM break is successfully incorporated within a stable triple helical structure, and information was obtained concerning the energetics and conformation within the break.

MATERIALS AND METHODS
Peptides-Peptides are denoted by the single amino acid code for the residues surrounded by Gly-Pro-Hyp triplets. In this notation, Hyp is denoted as O. Peptides GAAVM, GAAGVM, GAAPO, GPOVM, GPOPO, and GPGPO were synthesized by Tufts University Core Facility (Boston, MA). The peptides include a Tyr at the C terminus, and the concentration was determined using the molar extinction coefficient of 1400 M⁻¹ cm⁻¹ at 275 nm on a Beckman model DU640 spectrophotometer. Peptides were purified on a Shimadzu reversed-phase high pressure liquid chromatography system, and the identity of the peptides was confirmed by laser desorption mass spectrometry.
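The concentration determination described above follows the Beer-Lambert law, c = A/(εl). The following is a minimal sketch of that arithmetic, assuming a single C-terminal Tyr per chain with ε = 1400 M⁻¹ cm⁻¹ at 275 nm as stated; the function name and the absorbance reading are hypothetical and are not taken from the study.

```python
def peptide_concentration_molar(absorbance_275, path_cm=1.0, epsilon=1400.0):
    """Beer-Lambert law: c = A / (epsilon * l), returned in mol/L.

    epsilon is the molar extinction coefficient in M^-1 cm^-1 (1400 at 275 nm
    for the single Tyr, as given in the text); path_cm is the cell path length.
    """
    if path_cm <= 0 or epsilon <= 0:
        raise ValueError("path length and extinction coefficient must be positive")
    return absorbance_275 / (epsilon * path_cm)


# Hypothetical reading (not from the paper): A275 = 0.70 in a 1-cm cell.
conc = peptide_concentration_molar(0.70)
print(f"chain concentration: {conc * 1e3:.2f} mM")
```

The value returned is the concentration of peptide chains, since each chain carries one Tyr; the trimer concentration would be one-third of this.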
Circular Dichroism Spectroscopy-CD spectra were recorded on an Aviv model 62DS spectropolarimeter. Cuvettes of 1- and 0.2-mm path lengths were used, and the temperature of the cells was controlled using a Peltier temperature controller. Samples were prepared at different concentrations in either phosphate-buffered saline buffer, pH 7, or 50 mM sodium phosphate buffer, pH 2.2, and equilibrated at 0°C for at least 48 h prior to recording spectra. Wavelength scans were collected in 0.5-nm steps with 4-s averaging time and repeated three times. For temperature-induced denaturation, the ellipticity was monitored at 225 nm. The peptides were equilibrated for 2 min at each temperature, and the temperature was increased at an average rate of 0.1°C/min. The melting curves were obtained under standard conditions used in our laboratory for comparison, although equilibrium is not fully reached (30). The Tm was determined as the temperature at which the fraction folded is equal to 0.5 in a trimer-to-monomer transition that was fit to the experimental curve.
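To illustrate the definition of Tm as the temperature at which the fraction folded equals 0.5, the sketch below converts ellipticity to fraction folded and locates the midpoint. It is a simplification, not the procedure used here: it assumes flat folded and unfolded baselines and uses linear interpolation between data points in place of the trimer-to-monomer fit. All names and data values are hypothetical.

```python
def fraction_folded(theta, theta_folded, theta_unfolded):
    """Fraction folded from ellipticity, assuming flat baselines (a simplification)."""
    return (theta - theta_unfolded) / (theta_folded - theta_unfolded)


def tm_by_interpolation(temps, thetas, theta_folded, theta_unfolded):
    """Temperature at which the fraction folded crosses 0.5.

    Linear interpolation between neighboring points stands in for the
    trimer-to-monomer model fit described in the text.
    """
    fracs = [fraction_folded(th, theta_folded, theta_unfolded) for th in thetas]
    points = list(zip(temps, fracs))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 >= 0.5 >= f1 and f0 != f1:
            return t0 + (f0 - 0.5) * (t1 - t0) / (f0 - f1)
    raise ValueError("melting curve does not cross a fraction folded of 0.5")


# Synthetic melting curve (hypothetical values, not data from this study).
temps = [10.0, 20.0, 30.0, 40.0, 50.0]          # degrees C
thetas = [2400.0, 2000.0, 1600.0, 400.0, 0.0]   # MRE at 225 nm
tm = tm_by_interpolation(temps, thetas, theta_folded=2400.0, theta_unfolded=0.0)
print(f"Tm = {tm:.1f} C")
```

Because the curves are recorded at a finite heating rate, a midpoint obtained this way is an apparent Tm and is comparable only among samples run under the same conditions.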
Differential Scanning Calorimetry-DSC transition curves were recorded on a NANO-DSC II model 6100 (Calorimetry Sciences Corp.) calorimeter. Samples were dialyzed against phosphate-buffered saline buffer (20 mM sodium phosphate, 150 mM NaCl), pH 7, and 50 mM sodium phosphate buffer, pH 2.2, respectively, before each run. Samples were equilibrated at 0°C for at least 48 h prior to recording the melting curves. The melting transition curves were monitored at a heating rate of 1°C/min. Since this heating rate gives a scan that is far from equilibrium (30), the denaturation temperature Tm observed is scan rate-dependent and higher than that observed for CD studies done at slower rates. The enthalpy was calculated from the first scan since the scans were not reversible upon cooling. Enthalpy measurements under these conditions should be valid since it was shown for similar collagenous peptides that the calorimetric enthalpy determined by DSC is independent of the heating rate (30).
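The calorimetric enthalpy is the area under the excess heat capacity curve. A minimal sketch of that integration is given below, assuming the baseline has already been subtracted and using the trapezoidal rule; the function name and the triangular test peak are hypothetical and do not represent the instrument software or measured data.

```python
def calorimetric_enthalpy(temps, cp_excess):
    """Integrate excess heat capacity over temperature (trapezoidal rule).

    temps in degrees C (or K), cp_excess in kJ/(mol K) after baseline
    subtraction; the result is in kJ/mol.
    """
    if len(temps) != len(cp_excess) or len(temps) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    total = 0.0
    for i in range(1, len(temps)):
        mean_cp = 0.5 * (cp_excess[i] + cp_excess[i - 1])
        total += mean_cp * (temps[i] - temps[i - 1])
    return total


# Synthetic triangular peak (hypothetical values, not data from this study).
temps = [20.0, 30.0, 40.0, 50.0, 60.0]
cp_excess = [0.0, 10.0, 20.0, 10.0, 0.0]
print(f"dH_cal = {calorimetric_enthalpy(temps, cp_excess):.0f} kJ/mol")
```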
NMR Spectroscopy-Two GAAVM peptides of identical sequence were synthesized with 15N amino acids labeled at different positions to optimize NMR assignments. One peptide is selectively 15N-labeled at positions Gly-13, Ala-15, and Gly-24, whereas a second peptide is labeled at positions Gly-7, Gly-13, Ala-14, and Val-16. Samples for both peptides were prepared in 10% D2O/90% H2O at pH 1.9 with concentrations of 5.3-5.6 mM.
NMR experiments were performed on a Varian INOVA 500 MHz spectrometer. 1H-15N heteronuclear single quantum coherence (HSQC) experiments at 0°C and three-dimensional 1H-15N HSQC-TOCSY experiments at 25°C with a mixing time of 45 ms were carried out for assignment of NMR resonances. All data were processed on a Silicon Graphics workstation using the FELIX 2004 software package (MSI, San Diego, CA). HSQC spectra of GAAVM peptides 1 and 2 were overlaid using NMRView (31).
Hydrogen exchange experiments were carried out at 10°C, pD 2.5-2.6, where the pD includes the correction for the glass electrode solvent isotope artifact (32). The sample was equilibrated in H2O at 10°C for a minimum of 72 h to ensure that the monomer-trimer interconversion had reached equilibrium. The sample was then lyophilized and dissolved in D2O. The HSQC spectra were acquired with a 5-min acquisition time for the first 14 spectra and a 44-min acquisition time for the following spectra to allow the measurement of both fast- and slow-exchanging amide protons. Peak volumes were measured as a function of time. The volumes were normalized by the integration of the nonexchangeable resonance of DSS in one-dimensional spectra collected before and after the HSQC experiments. The hydrogen/deuterium exchange rates were single exponential with a non-zero asymptote I∞ in the equation

$$I(t) = I_\infty + (I_0 - I_\infty)\,e^{-k_{ex} t}$$

where I(t) is the normalized peak volume at time t and I0 is its initial value. The error in kex was the fitting error from the variance-covariance matrix. The protection factors (P) were calculated by the equation

$$P = k_{int}/k_{ex}$$

where kint is the theoretical monomer exchange rate for the amide proton at a specific pH and temperature (33). The error in the intrinsic rate was estimated by assuming an experimental uncertainty of ±0.2°C and ±0.1 pH units. The error in the protection factor P was then calculated using the equation

$$\sigma_P = P\,\sqrt{\left(\frac{\sigma_{k_{ex}}}{k_{ex}}\right)^2 + \left(\frac{\sigma_{k_{int}}}{k_{int}}\right)^2}$$

where σkex is the error in the measured exchange rate and σkint is the error in the intrinsic rate.
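The protection factor calculation described above can be sketched as follows. The ratio P = kint/kex follows from the definition of the protection factor; the error formula is an assumption, namely standard first-order propagation of independent relative errors, and may differ in detail from the expression actually used. The rate values are hypothetical.

```python
import math


def protection_factor(k_int, k_ex):
    """P = k_int / k_ex: intrinsic (monomer) rate over measured trimer rate."""
    return k_int / k_ex


def protection_factor_error(k_int, k_ex, sigma_int, sigma_ex):
    """First-order propagated error in P, assuming independent errors (an assumption)."""
    p = protection_factor(k_int, k_ex)
    relative = math.sqrt((sigma_int / k_int) ** 2 + (sigma_ex / k_ex) ** 2)
    return p * relative


# Hypothetical rates in min^-1 (not values from this study).
k_int, sigma_int = 1.0e-2, 1.0e-3
k_ex, sigma_ex = 1.0e-4, 1.0e-5
p = protection_factor(k_int, k_ex)
dp = protection_factor_error(k_int, k_ex, sigma_int, sigma_ex)
print(f"P = {p:.0f} +/- {dp:.0f}")
```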
Statistical Analyses of Amino Acids within Breaks-The χ² test of statistical significance at a level of p < 0.001 was applied to test whether amino acids within the Gly-AA1-AA2-AA3-AA4-Gly breaks differ from amino acids in the X and Y positions of the triple helical region. Since the number of Gly-AA1-AA2-AA3-AA4-Gly breaks in all type IV chains is only 28, the amino acids were grouped in categories as follows: hydrophobic residues, Ile, Leu, Met, Val; small residues, Ala, Cys, Gly, Ser; charged residues, Asp, Glu, His, Lys, Arg; imino acid, Pro; aromatic residues, Phe, Trp, Tyr; and polar residues, Asn, Gln, Thr.
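The χ² comparison of grouped residue counts can be sketched as below, assuming a goodness-of-fit test of the observed counts in the breaks against frequencies expected from the X and Y positions. The counts and expected fractions are hypothetical placeholders, chosen only to total the 112 residues of 28 four-residue breaks; they are not the data of this study.

```python
def chi_square(observed, expected_fractions):
    """Goodness-of-fit chi-square statistic and degrees of freedom.

    observed: category -> count; expected_fractions: category -> fraction
    (fractions should sum to 1 over the same categories).
    """
    total = sum(observed.values())
    stat = 0.0
    for category, obs in observed.items():
        exp = total * expected_fractions[category]
        stat += (obs - exp) ** 2 / exp
    return stat, len(observed) - 1


# Hypothetical counts for the 4 x 28 = 112 residues within G4G breaks.
observed = {"hydrophobic": 30, "small": 35, "charged": 20,
            "imino": 7, "aromatic": 5, "polar": 15}
# Hypothetical X/Y-position frequencies (placeholders, not measured values).
expected = {"hydrophobic": 0.12, "small": 0.10, "charged": 0.25,
            "imino": 0.35, "aromatic": 0.03, "polar": 0.15}

stat, dof = chi_square(observed, expected)
CRITICAL_P001_DF5 = 20.515  # tabulated chi-square critical value, p = 0.001, 5 d.f.
print(f"chi2 = {stat:.1f}, d.f. = {dof}, significant: {stat > CRITICAL_P001_DF5}")
```

With six categories there are 5 degrees of freedom, so the statistic is compared with the tabulated critical value for p = 0.001 at 5 degrees of freedom.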

RESULTS
Peptide Design-In the α5 chain of human type IV collagen, there are 22 breaks in the Gly-X-Y repeating sequence distributed all along the triple helix (14) (Fig. 1). A set of peptides was synthesized to investigate one of these naturally occurring breaks, by incorporating the sequence GPOGAAVMGPOGPO (residues 386-399) into a peptide model. Additional Gly-Pro-Hyp sequences were added on each side to ensure stability and to create a 30-mer peptide that could be compared with earlier peptides of this length (3, 34-38). All peptides in this study are designated by the single-letter code for the central sequence, which is flanked by Gly-Pro-Hyp units, so this peptide is labeled the GAAVM peptide (Table 1). Two peptides were synthesized with this sequence, with 15N-Gly, 15N-Ala, or 15N-Val at various positions, so that residue-specific NMR assignments could be made.
CD Spectroscopy and Differential Scanning Calorimetry of the GAAVM and GAAGVM Peptides-The peptide including the native GPOGAAVMGPOGPO sequence was found to form a structure with a substantial amount of triple helix. The GAAVM peptide has a CD spectrum with a maximum at 225 nm (MRE225 nm = 2360 deg cm²/dmol) and a minimum at 198 nm (MRE198 nm = -33866 deg cm²/dmol). This spectrum is characteristic of the collagen triple helix. Monitoring the ellipticity at 225 nm as a function of temperature showed a cooperative thermal transition with a Tm value of 29.1°C (Fig. 2). As shown below, NMR studies support a trimeric nature throughout this peptide at low temperature, suggesting that this is a trimer-to-monomer transition (see Fig. 5).
To investigate the consequences of the interruption, comparison was made with a homologous peptide in which a Gly was inserted to recreate the Gly-X-Y pattern: GPOGAAGVMGPOGPO. The magnitudes of the 225 nm maximum and the 198 nm minimum are both greater for the GAAGVM peptide (MRE225 nm = 4950 deg cm²/dmol, MRE198 nm = -41127 deg cm²/dmol) when compared with the GAAVM peptide. The Tm of GAAGVM is 39.7°C, 11°C higher than that of GAAVM, indicating that the absence of the Gly also caused a substantial decrease in stability (Fig. 2). The observed Tm value of 39.7°C is high and close to that predicted (Tpred = 36°C) on the basis of host-guest peptides using the collagen stability calculator (37). Temperature-induced refolding profiles were very similar for both GAAVM and GAAGVM peptides at a concentration of 1 mg/ml (data not shown), which may reflect a similar nucleation step.
Differential scanning calorimetry indicates that the calorimetric enthalpy for GAAGVM is 354 kJ/mol in phosphate-buffered saline (pH 7), whereas the value for GAAVM is 188 kJ/mol (Fig. 3, Table 1), suggesting a substantial loss of hydrogen bonding. The decrease in thermal stability is again observed, although the DSC Tm values are higher than the CD Tm values because of the faster heating rate. The ΔHcal values for both peptides are considerably higher at lower ionic strength (50 mM sodium phosphate), at both neutral and acidic pH (428 kJ/mol for GAAGVM; 259 kJ/mol for GAAVM). Such large ionic strength effects were not observed for other peptides (30).
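As a quick arithmetic check on the size of the enthalpy loss, the percentage decrease in going from GAAGVM to GAAVM can be computed directly from the four values quoted above. The helper name is hypothetical; the numbers are those given in the text.

```python
def percent_decrease(reference, value):
    """Percentage by which value falls below reference."""
    return 100.0 * (reference - value) / reference


# Calorimetric enthalpies in kJ/mol, as quoted in the text.
pbs = percent_decrease(354.0, 188.0)        # phosphate-buffered saline, pH 7
low_salt = percent_decrease(428.0, 259.0)   # 50 mM sodium phosphate
print(f"PBS: {pbs:.0f}% lower; 50 mM phosphate: {low_salt:.0f}% lower")
```

The loss is about 47% in phosphate-buffered saline and about 39% at the lower ionic strength, so the description of a roughly 50% decrease applies most closely to the phosphate-buffered saline condition.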
Consequence of Increasing Imino Acid Content in the Break-The effect of an artificial break with 4 adjacent non-Gly residues was previously studied in an all Gly-Pro-Hyp environment (25). The stable (Pro-Hyp-Gly)10 peptide was compared with (Pro-Hyp-Gly)4-Pro-Hyp-Pro-Hyp-Gly-(Pro-Hyp-Gly)4, which is denoted here as the GPOPO peptide (previously designated as the Gly deletion peptide) (25). The peptide (Pro-Hyp-Gly)10 has a Tm of 60°C with a calorimetric enthalpy of 390 kJ/mol (30). The GPOPO peptide does not form a stable triple helix at c = 1 mg/ml (Table 2), in contrast to the GAAVM peptide containing a natural break (25). Given the striking observation that a GPOPO sequence cannot be accommodated in a triple helix, whereas GAAVM is allowed, further peptides were designed to investigate the effect of imino acids within the interruption. The sequence GAA was replaced by GPO to form a GPOVM peptide, and the sequence VM was replaced by PO to form the GAAPO peptide. Although the two resultant peptides precipitate at neutral pH, both GAAPO and GPOVM peptides form stable triple helices at pH 2.2. The ellipticity of the GPOVM peptide at 225 nm is similar to that of the GAAVM peptide at both low and high temperatures (Fig. 4), but its stability is decreased (from Tm = 23.6°C to 18.5°C, for unblocked forms of both peptides, pH 2.2) (Table 2). The GAAPO peptide has a similar ellipticity value at low temperature, but at high temperatures, the unfolded state appears to have a higher ellipticity value than seen for GAAVM and GPOVM. The thermal stability of GAAPO is about 12.8°C, significantly lower than seen for GAAVM and GPOVM (Fig. 4). The calorimetric enthalpies of GPOVM and GAAPO did not appear to differ significantly from that obtained for GAAVM, but the low stability of these peptides made it problematic to obtain accurate baseline values (Table 2).

FIGURE 2. Temperature-induced denaturation monitored by circular dichroism spectroscopy at 225 nm of the GAAVM peptide (■) and GAAGVM peptide (○). The inset shows a wavelength scan monitored by CD at 0°C. The concentration of the peptides is 1 mg/ml, and the buffer is 20 mM sodium phosphate, 150 mM NaCl, pH 7. All samples were run under the same standard conditions (average 0.1°C/min) and can be compared, although these are not equilibrium conditions (30).

TABLE 1. Thermal stability, mean residue ellipticity, and calorimetric enthalpy of a peptide modeling a natural break in the α5 chain of type IV collagen and a control peptide with an artificial Gly insertion
Peptides were both blocked at the termini and were characterized in phosphate-buffered saline buffer, pH 7, at a concentration of 1 mg/ml.

The decrease in stability in going from a Gly-X-Y repeating sequence to a sequence containing a break was calculated. To estimate the loss of thermal stability resulting from the absence of a Gly residue in a break, the observed Tm for each peptide with a break was subtracted from the Tm value predicted for the same sequence with Gly as every 3rd residue using the collagen stability calculator (GAAVM versus GAAGVM; GPOVM versus GPOGVM; GAAPO versus GAAGPO; and GPOPO versus GPOGPO). As seen in Table 2, the extent of triple helix destabilization depends strongly on the imino acid content of the break. The decrease in stability (ΔTm) ranges from ~8°C for the GAAVM peptide, which has no imino acids in the break, to ~30°C for the peptides with two imino acids in the break (GAAPO and GPOVM), to more than 60°C for the GPOPO peptide, where all 4 residues in the break are imino acids.
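The subtraction described above can be sketched as follows. The observed Tm values are those quoted in the text for the unblocked peptides at pH 2.2; the predicted Tm values are hypothetical placeholders, since the calculator outputs for the Gly-restored sequences are not given in the text, and they were chosen only so that the differences have the approximate magnitudes stated.

```python
def delta_tm(t_pred, t_obs):
    """Loss of stability: predicted Tm with Gly restored minus observed Tm with the break."""
    return t_pred - t_obs


# Observed Tm values (degrees C) quoted in the text, unblocked peptides, pH 2.2.
observed = {"GAAVM": 23.6, "GPOVM": 18.5, "GAAPO": 12.8}
# HYPOTHETICAL predicted Tm values for the Gly-restored sequences (placeholders).
predicted = {"GAAVM": 32.0, "GPOVM": 48.0, "GAAPO": 43.0}

losses = {name: delta_tm(predicted[name], observed[name]) for name in observed}
for name, loss in losses.items():
    print(f"{name}: dTm = {loss:.1f} C")
```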
NMR Spectroscopy of GAAVM Peptide-The NMR spectrum of peptide GAAVM contains the typical features previously found in triple helical peptides (Fig. 5A). The 15N-labeled resonances of the GAAVM peptide each show distinct monomer and trimer peaks, which can be differentiated by their R2 and nuclear Overhauser effect values as well as their chemical shifts. The appearance of three trimer peaks as well as a single monomer resonance for each labeled residue supports a trimer form of the peptide in equilibrium with monomer. Residues Gly-24 and Gly-7 show only a single trimer resonance due to their repetitive Gly-Pro-Hyp environment. The proton and nitrogen chemical shifts for the trimer resonances of Gly-7 and Gly-24 are similar to those reported previously for Gly-Pro-Hyp triplets in other 30-mer triple helical peptides (35). Residues Gly-13, Ala-14, Ala-15, and Val-16 each have three trimer resonances, indicating that the three chains differ in their environment within the break region (36). The greatest perturbation in chemical shift arises for Val-16, where the chemical shift difference between the three chains is particularly large. Most significantly, one of the Val-16 resonances is shifted far downfield (~9.7 ppm) relative to the others (Fig. 5A).

FIGURE 3. Temperature dependence of the excess partial molar heat capacity (heating rate 1°C/min) for the GAAVM peptide (■) and GAAGVM peptide (○). Although these are not equilibrium conditions, the enthalpy has been shown for a similar peptide to be heating rate-independent (30).

TABLE 2. Effect of imino acid content within a break on thermal stability and calorimetric enthalpy
All peptides were unblocked at their termini and were characterized in 50 mM sodium phosphate, pH 2.2, since there were solubility problems at neutral pH.
Hydrogen exchange experiments were performed on the GAAVM peptide to explore the protection of labile amide protons from solvent (Fig. 5B). Rates of hydrogen/deuterium exchange for the individual amide protons were determined at 10°C, and protection factors were calculated from these exchange rates by taking the ratio of the theoretical monomer rate to the trimer rate for the same residue. A very high degree of protection from exchange (protection factors of 346 and 461) was seen at the C- and N-terminal ends of the peptides that contain the stable and repeating Gly-Pro-Hyp regions. The Gly residue at the beginning of the break, Gly-13, shows a 2-3-fold reduction in its protection factor (average 155) relative to the terminal Gly-7 and Gly-24, which could relate to decreased strength of hydrogen bonding or shielding from solvent. The three resonances assigned to the Val-16 residue, with their atypical chemical shifts, show a protection factor of ~102, indicative of significant shielding from solvent or hydrogen bonding but somewhat lower than the Gly-13 value. The 2 Ala residues in the break, Ala-14 and Ala-15, have protection factors in the range of 2-18, indicating that these 2 residues are fully exposed to solvent and are not hydrogen bonded.
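The fold reductions quoted above can be checked directly from the reported protection factors. The short sketch below uses only the values given in the text; the variable names are illustrative.

```python
# Protection factors quoted in the text.
terminal_pf = [346, 461]   # Gly residues in the terminal Gly-Pro-Hyp regions
gly13_pf = 155             # average for Gly-13 at the start of the break
val16_pf = 102             # approximate value for Val-16

# Fold reduction of Gly-13 relative to each terminal Gly.
folds = [pf / gly13_pf for pf in terminal_pf]
for pf, fold in zip(terminal_pf, folds):
    print(f"terminal P = {pf}: Gly-13 is reduced {fold:.1f}-fold")

# Val-16 relative to Gly-13.
print(f"Val-16 / Gly-13 = {val16_pf / gly13_pf:.2f}")
```

Both ratios fall between 2 and 3, consistent with the stated 2-3-fold reduction, and the Val-16 value is roughly two-thirds of the Gly-13 value.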
Analysis of Residues within Breaks-The break studied in the α5(IV) chain of type IV collagen, GAAVMG, is one of a category of breaks that has 4 non-Gly residues between two glycines, Gly-AA1-AA2-AA3-AA4-Gly, and is denoted as a G4G break. This category of breaks could also be considered as having a Gly missing in the periodic Gly-X-Y-Gly-X-Y pattern or a Gly deletion type (24). In the six chains of the type IV collagen family, there are 28 G4G breaks: five in α1(IV); two in α2(IV); seven in α3(IV); five in α4(IV); seven in α5(IV); and two in α6(IV). To test whether residues within the G4G breaks differ significantly from the residues found within the X and Y positions of the standard Gly-X-Y segments of these chains, a statistical analysis was carried out. Comparison of hydrophobic (Ile, Leu, Met, Val), small (Ala, Cys, Gly, Ser), imino acid (Pro), charged (Asp, Glu, His, Lys, Arg), aromatic (Phe, Tyr, Trp), and polar (Asn, Gln, Thr) residues indicated that the residues in the breaks were significantly different from X and Y residues in (Gly-X-Y)n regions (p < 0.001). The major contributors to these differences are the high number of small amino acids and of hydrophobic residues in the break; the proline content is also well below that seen in the X and Y positions (Fig. 6). Examination of the 28 G4G breaks suggests that the AA2 position is frequently occupied by Ala, Ser, or Asp, whereas the AA3 position is predominantly (~70%) occupied by a hydrophobic residue. In addition, positions AA1, AA2, and AA3 are particularly low in Pro residues. In terms of composition and the identity of residues in the AA2 and AA3 positions, the GAAVMG break studied here represents a typical G4G break in type IV collagen.

DISCUSSION
The characterization reported here of a triple helical peptide incorporating a type IV collagen break provides an opportunity to investigate conformational, energetic, and dynamic features at and surrounding an interruption in the Gly-X-Y repeating sequence. The GAAVM sequence chosen is hydrophobic and is surrounded in the α5(IV) chain by highly stabilizing Gly-Pro-Hyp units. It is not sterically possible to continue the triple helix through this break, where Gly is not present as every 3rd residue. This sequence with a break is compared with the artificially created control peptide, GAAGVM, which forms a stable triple helix, with a Tm of 40°C. Enthalpy is clearly an important stabilizing factor for the GAAGVM control triple helical peptide, which has a calorimetric enthalpy even greater than seen for (Pro-Hyp-Gly)10 at low ionic strength. Previous work suggests that exposed hydrophobic Val and Met residues are likely to be coated with a backbone-anchored hydration network, which would contribute to its high enthalpy value (38).

FIGURE 6. Observed (hatched) versus expected (no fill) frequencies of groups of amino acids in GX1X2X3X4G breaks of all (α1-α6) chains of type IV collagen. The expected frequencies were calculated based on the identity of amino acids in the X and Y positions of (Gly-X-Y)n sequences of all collagen type IV chains. The hypothesis that the composition of amino acids within the break differs from the composition of amino acids in the normal (Gly-X-Y)n sequence was tested statistically by a χ² analysis. At a significance level of p < 0.001, the hypothesis is supported. The higher frequency of hydrophobic and small residues and the lower frequency of proline in the break contribute the largest χ² terms.
Evidence for perturbation of the triple helix at the GAAVM break includes loss of the 225 nm CD signal, lower stability, and loss of calorimetric enthalpy when compared with the GAAGVM peptide. The loss of almost half the enthalpy, going from GAAGVM to GAAVM, is striking and is consistent with an interruption in the hydration network now that the backbone is not in a regular triple helix. It is likely that there is now some hydrophobic burying of the Val side chain and/or hydrogen bonding of its amide group, consistent with the NMR data showing slower exchange of Val NH groups and unusual Val chemical shifts. It is possible that non-associating monomer species, e.g., a GAAVM peptide missing a single residue, could make some contribution to the decreased ellipticity at 225 nm and lower calorimetric enthalpy since the GAAVM peptide appears to have a small but significant amount of monomer present even at low temperature.
The NMR data clearly support the presence of some ordered, nonrandom structure within the GAAVM break. The presence of three trimer resonances, distinct from the monomer resonances, for each residue in the GAAVM sequence indicates that these residues can adopt a specific conformation and that each chain is distinct in the break region. The formation of some ordered structure within the break is also consistent with NMR studies showing exchange rates that are slower than a random chain for the Gly and Val within the GAAVM sequence.
The presence of multiple imino acids in the break is found to be unfavorable for a stable triple helix. The decrease in T m of the triple helix peptide caused by the introduction of a break is smallest for GAAVM, followed by GAAPO, GPOVM, and finally the highly destabilizing GPOPO, which cannot be incorporated into a triple helix. This contrasts with the well known stabilization of the triple helix itself by imino acids. Although the polyproline II-like conformation of the individual chains in the triple helix is highly favored by a high content of imino acids, proline residues appear to be too rigid to be accommodated within a break. Such rigidity might prevent hydrophobic interactions between Val side chains or some hydrogen-bonding interaction. Their destabilizing nature is consistent with the observation of low contents of Pro residues within G4G breaks.
The conformation adopted for GAAVM is likely to be typical of other breaks of the form Gly-AA1-AA2-AA3-AA4-Gly (G4G) since it has the consensus Ala/Ser/Asp residue at position 2 and a hydrophobic residue at position 3. It is likely that breaks in type IV collagen fall into categories based on their specific amino acid composition and that the conformation is related to these features. In this sense, the GAAVM peptide is a prototype of the features expected for the G4G category.
Information on the sequence dependence of the conformation of breaks in these peptide models will provide a basis for exploring the consequences of interruptions in the Gly-X-Y repeating sequences in collagen domains and investigating the effects of the breaks on function. The peptides studied here form homotrimers, whereas type IV collagens are present as heterotrimers consisting of two or three distinct types of chains (27). The locations of interruption sites in the Gly-X-Y sequence tend to be conserved among type IV chains, whereas little conservation is seen for the number of amino acids and their sequence within corresponding breaks of different chains (14). Further studies are needed to investigate whether the features observed for homotrimers also hold for heterotrimers.