J. Biol. Chem., Vol. 280, Issue 19, 19343-19349, May 13, 2005
From the Department of Biochemistry, University of Medicine and Dentistry of New Jersey-Robert Wood Johnson Medical School, Piscataway, New Jersey 08854 and ¶Commonwealth Scientific and Industrial Research Organisation, 343 Royal Parade, Parkville, Victoria 3052, Australia
Received for publication, February 14, 2005
| INTRODUCTION |
|---|
The collagen triple helix motif is found widely in structural proteins of the extracellular matrix and in an increasing set of non-collagenous proteins, many of which are involved in host-defense functions (4, 5). The close packing of three supercoiled polyproline II-like polypeptide chains in the collagen triple helix generates a requirement for Gly as every third residue (6-8). The observation of such a repeating (Gly-X-Y)n sequence pattern over a stretch of residues signifies a triple helix conformation. However, the collagen triple helix is not uniform in structure or stability. Crystal structures of collagen peptides show that variation in amino acid content leads to small but significant variations in the super-helix twist (9-11). Calorimetric results suggest the presence of multiple independent folding domains along a collagen molecule (12), and the presence of regions of different stability was confirmed by recent studies on recombinant collagen constructs (13). There are multiple binding domains in collagens (14), and regions of decreased triple helix stability have been implicated in binding in some cases (15-17). Self-association of type I collagen into fibrils is preceded by microunfolding of specific triple helix regions (18, 19). Thus, specific residues along the (Gly-X-Y)n sequence determine functionally important modulation of structure and stability.
Experimental thermal stability data obtained from host-guest peptides are integrated here to produce an algorithm for predicting global melting temperatures of collagen triple helical peptides and short fragments and for detecting modulations in relative stability along a collagen chain. Good agreement is observed between predicted and observed stabilities of a number of collagen peptides. In cases in which the predicted Tm differs significantly from the observed value, longer-range electrostatic interactions or unraveling of the ends are suggested. The variations in stability along the collagen chain appear related to known functional sites, and high stability is achieved through a combination of stabilizing imino acid and KGE/D sequences.
| MATERIALS AND METHODS |
|---|
The (Pro-Hyp-Gly)n peptides for n = 6, 7, 8, and 12 were synthesized by Tufts Core Facility (Boston, MA) and purified using high pressure liquid chromatography; their identity was confirmed by matrix-assisted laser desorption ionization.
To extrapolate the dependence of the Tm of the host peptides on peptide length, the experimental values for (Pro-Hyp-Gly)n and (Pro-Pro-Gly)n versus n, where n is the number of tripeptide units, were fit to the exponential decay function
[Equation 1: image not preserved in the source text] (Eq. 1)
In Eq. 1, the dependent variable is defined as the base thermal stability of the repeating polytripeptide standard, a second parameter is the maximum melting temperature, and the constant n0 represents the length of the repeating peptide with Tm = 0.
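The exact form of Eq. 1 is not preserved here, so the following is only a sketch of one saturating-exponential form that is consistent with the description: it equals zero at n = n0 and approaches the maximum melting temperature for long peptides. The function name `base_tm`, the rate constant `k`, and all numerical parameter values are assumptions introduced for illustration; they are not the fitted values for (Pro-Hyp-Gly)n or (Pro-Pro-Gly)n.

```python
import math

def base_tm(n, tm_max, k, n0):
    """Assumed form: Tm = 0 at n = n0, approaching tm_max as n grows."""
    return tm_max * (1.0 - math.exp(-k * (n - n0)))

# Hypothetical parameters chosen only to show the shape of the curve;
# they are NOT fitted values from the experimental data.
TM_MAX, K, N0 = 100.0, 0.15, 4.0

for n in (6, 7, 8, 10, 12):
    # Base Tm rises steeply for short peptides and levels off for longer ones.
    print(n, round(base_tm(n, TM_MAX, K, N0), 1))
```

With any such form, the increment in base Tm per added tripeptide shrinks as n increases, which is why the length of the peptide has to be fixed before guest triplets are compared.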
| RESULTS AND DISCUSSION |
|---|
Individual Residue Propensities for X and Y Positions
The propensity measurements for all 20 residues in the X position in a Gly-X-Hyp context and all 20 residues in the Y position in a Gly-Pro-Y context were determined by measuring the thermal stability of host-guest peptides (21). The most stable tripeptide unit is Gly-Pro-Hyp (Tm = 47.3 °C). Replacing Pro in the X position leads to a decrease in stability ranging from 4 °C for Gly-Glu-Hyp (Tm = 42.9 °C) to 15 °C for Gly-Trp-Hyp (Tm = 31.9 °C). Replacing Hyp in the Y position leads to a decrease in stability ranging from almost 0 °C for Gly-Pro-Arg (Tm = 47.2 °C) to 21 °C for Gly-Pro-Trp (Tm = 26.1 °C).
Gly-X-Y Tripeptide Sequences
Direct intrachain interactions are not sterically possible between adjacent X and Y residues in the Gly-X-Y unit of a chain, but interchain interactions can take place between the Y residue in one chain and the X residue in an adjacent chain staggered by 1 residue (Fig. 1). Peptides with Gly-X-Y guest triplets were designed to model these interchain interactions. Only a restricted set of possible Gly-X-Y tripeptides is significantly populated in collagens (24), reflecting in part strong preferences for basic residues to be in the Y position and for Glu and hydrophobic residues to be in the X position and very low occurrence of Cys, Trp, and Tyr. A limited set of 41 guest Gly-X-Y sequences was selected to include the most common tripeptide sequences and to model a range of typical electrostatic and hydrophobic interactions. Because of the strong bias in collagen compositions, the selected 41 Gly-X-Y, 19 Gly-X-Hyp, 19 Gly-Pro-Y, and Gly-Pro-Hyp tripeptides cover about 80% of human fibrillar collagen sequences (22). Although Pro residues in the Y position are post-translationally modified to Hyp in multicellular animals, collagenous domains have recently been found in bacteria and viruses where there is no hydroxylation of Pro (25, 26). To model these sequences, Gly-Pro-Pro and Gly-Ala-Pro guest triplets were also included.
A complete table of the stability for all Gly-X-Y triplets was constructed using the experimental values for all frequent sequences and the predicted values for all others (Table I; experimental values are in bold). Predicted values were calculated on the basis of additivity of residues in the X and Y position (22).
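$$T_m(\text{Gly-X-Y}) = T_m(\text{Gly-X-Hyp}) + T_m(\text{Gly-Pro-Y}) - T_m(\text{Gly-Pro-Hyp}) \qquad \text{(Eq. 2)}$$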
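The additivity of X and Y contributions described above can be illustrated numerically. The sketch below assumes that the predicted Tm of a Gly-X-Y triplet is the Gly-X-Hyp value plus the Gly-Pro-Y value minus the Gly-Pro-Hyp reference; the function name `predict_triplet_tm` is hypothetical, and the three input values are the ones quoted in the text for Gly-Glu-Hyp, Gly-Pro-Arg, and Gly-Pro-Hyp. Whether Gly-Glu-Arg itself was measured or predicted in Table I is not stated here.

```python
TM_GPO = 47.3  # Gly-Pro-Hyp host value quoted in the text (°C)

def predict_triplet_tm(tm_gxo, tm_gpy, tm_gpo=TM_GPO):
    """Additive estimate: each substitution lowers Tm independently of the other."""
    return tm_gxo + tm_gpy - tm_gpo

# Gly-Glu-Arg estimated from Gly-Glu-Hyp (42.9 °C) and Gly-Pro-Arg (47.2 °C).
tm_ger = predict_triplet_tm(42.9, 47.2)
print(round(tm_ger, 1))
```

Equivalently, the destabilization of the X substitution (4.4 °C) and that of the Y substitution (0.1 °C) are summed and subtracted from the Gly-Pro-Hyp value.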
Interactions between Adjacent Gly-X-Y Units
Interactions between adjacent Gly-X-Y tripeptides were included in the calculations. A recent study reported stabilities of a selection of host-guest peptides including residues in two adjacent tripeptide units, Gly-X-Y-Gly-X'-Y', covering possible direct interchain or intrachain interactions between residues that are separated by ≤3 residues in sequence (23) (Fig. 1). Significant deviations from predicted stability were seen for six hexapeptides, which suggested favorable interchain and intrachain electrostatic and hydrophobic interactions (Table II). The most dramatic difference was the electrostatic and hydrogen bonding stabilization observed when Lys is in the Y position and a negatively charged residue is in the X' position (KGD or KGE), with observed Tm values 15.4 to 17.5 °C higher than expected. The large magnitude of KGD/E interactions is comparable with the Tm spread of all X (14 °C) and Y residues (21 °C) (23) (Table II).
The effect of blocking groups on peptide stability was also taken into consideration. Studies at different pH values and on peptides with and without blocked termini are consistent with a reduction of stability by about 2 °C when unblocked charged N termini are present and by about 3 °C when unblocked charged C termini are present, for a peptide length of n = 10 (29). This destabilization is presumed to be due to repulsion when three charged termini are in close proximity, consistent with the unraveling of the termini observed in high resolution structures of collagen peptides (8, 30). End effects are more pronounced for short peptides than for longer ones, as seen for (Pro-Hyp-Gly)7 and for (Pro-Hyp-Gly)8 (Fig. 2).
Algorithm Relating Amino Acid Sequence to Triple Helix Stability
The relative stability of each Gly-X-Y tripeptide compared with Gly-Pro-Hyp and the interaction between adjacent Gly-X-Y tripeptides were used to derive an algorithm for predicting triple helix stability. The Tm values, rather than Gibbs free energy values, were used for calculating peptide stability. The extremely long times needed to reach equilibrium and the lack of agreement of the equilibrium curve with a two-state model presented practical and theoretical limitations to thermodynamic characterization (20). Fortunately, the use of Tm values obtained under standardized conditions has proved to be useful as an empirical measure of triple helix stability (20). Additivity of Tm values was observed for peptides with residues that cannot interact. Thus, Tm values are seen to be a good measure of relative stability, as long as standard conditions of buffer, pH, and rate of heating are maintained (20). The algorithm predicts a global Tm value for collagen model peptides between 6 and 20 tripeptides in length and predicts a relative stability for collagen sequences.
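The Tm of a peptide with a length of 6 ≤ n ≤ 20 tripeptide units is predicted by an algorithm consisting of the following steps.
1) For the total number of triplets n in a given peptide, the base Tm for (Pro-Hyp-Gly)n or (Pro-Pro-Gly)n is calculated from the length dependence (Eq. 1), including any effect of blocking groups.
2) The melting temperature value is decreased for every triplet in the sequence that is not Gly-Pro-Hyp, subtracting the destabilization value of that triplet relative to Gly-Pro-Hyp (Table I). The N-terminal and C-terminal tripeptide units are excluded from the calculation due to the staggering of the chains and the reported disorder for the peptide ends (8, 10, 30, 31).
3) The final value for the peptide melting temperature is adjusted using the correction values for interactions between neighboring tripeptides (Table II).
The algorithm can be formulated as follows.
$$T_m = T_m^{\mathrm{base}}(n) - \sum_{i=2}^{n-1} \Delta T_m^{\mathrm{GXY}}(i) + \sum_{i} \Delta T_m^{\mathrm{int}}(i, i+1) \qquad \text{(Eq. 3)}$$
Here T_m^base(n) is the base value from step 1, ΔT_m^GXY(i) is the decrease for triplet i relative to Gly-Pro-Hyp from step 2, and ΔT_m^int(i, i+1) is the correction for the interaction between neighboring triplets from step 3.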
The collagen stability algorithm is available to all users for calculation of global stability of peptides and local stability variations in collagens and collagen-like domains (rwjms.umdnj.edu/biochemistry/collagen).
Prediction of Tm Values for Collagen-like Peptides
The stability algorithm was applied to 40 synthetic collagen-like peptides whose Tm values have been experimentally determined under the same defined standard conditions (Table III). Most of the peptides are n = 10 tripeptide units in length, and some have unblocked ends, whereas others have terminal blocking groups. Excellent agreement was found between the calculated and observed Tm values for peptides with GPO tripeptide units on both ends. For instance, for the unblocked peptide T3785, the predicted Tm value is 17.1 °C (58.8 °C - [(47.3 °C - 30.8 °C) + (47.3 °C - 38.2 °C) + (47.3 °C - 31.2 °C)]), in close agreement with the observed Tm of 18.0 °C. When KGE or KGD sequences are present, the good agreement depends on the inclusion of correction values for interactions between adjacent triplets. For instance, peptide T1655, which has GPO caps on both ends, has an observed Tm value of 42.8 °C. If each independent triplet is considered, one would subtract 16.5 °C for GAK, 15.7 °C for GDA, and 6.4 °C for GPA, yielding 58.8 °C - 38.6 °C = 20.2 °C. However, there is a KGD sequence, which gives +17.5 °C, and an increase of 5 °C because the ends are blocked, giving a net predicted value of 42.7 °C, which is very close to the observed value of 42.8 °C. The set of peptides related to T1892 with GPA sequences on the N-terminal ends also shows excellent agreement with predictions. It is notable that the "reverse" peptide, T1892r, which has the same tripeptide composition but in a different order, has the same Tm as T1892, supporting the dependence of thermal stability on tripeptide unit composition when there are no interactions present (Table III) (32).
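The three steps of the algorithm and the two worked examples above can be sketched as a short calculation. This is an illustration under stated assumptions, not the implementation behind the published web tool: the function name `predict_tm`, the dictionary names, and the arrangement of triplets in the T1655-like peptide are hypothetical, and only the numerical values quoted in the text are used (the guest-triplet Tm values are back-calculated from the quoted subtractions). Pair corrections are applied here to every adjacent pair of triplets, which is also an assumption.

```python
TM_GPO = 47.3             # Gly-Pro-Hyp reference Tm from the text (°C)
BASE_TM_N10 = 58.8        # base value used in the worked examples, n = 10 (°C)
BLOCKED_ENDS_BONUS = 5.0  # increase quoted in the text for blocked termini (°C)

# Guest-triplet Tm values back-calculated from the subtractions quoted for T1655.
TRIPLET_TM = {"GPO": 47.3, "GAK": 30.8, "GDA": 31.6, "GPA": 40.9}

# Corrections keyed by (Y of triplet i, X of triplet i+1); only KGD is quoted.
PAIR_CORRECTION = {("K", "D"): 17.5}

def predict_tm(triplets, base_tm, blocked=False):
    # Step 1: base Tm for the peptide length, plus any blocking-group effect.
    tm = base_tm + (BLOCKED_ENDS_BONUS if blocked else 0.0)
    # Step 2: subtract the destabilization of each interior triplet.
    for t in triplets[1:-1]:
        tm -= TM_GPO - TRIPLET_TM[t]
    # Step 3: add corrections for interactions between neighboring triplets.
    for a, b in zip(triplets, triplets[1:]):
        tm += PAIR_CORRECTION.get((a[2], b[1]), 0.0)
    return tm

# Hypothetical layout: only the composition (GAK, GDA, GPA, GPO caps) and the
# KGD junction are taken from the text; the exact order is assumed.
t1655_like = ["GPO"] * 3 + ["GAK", "GDA", "GPA"] + ["GPO"] * 4
tm_t1655 = predict_tm(t1655_like, BASE_TM_N10, blocked=True)

# T3785 (unblocked): the three guest Tm values are quoted without triplet names.
tm_t3785 = BASE_TM_N10 - sum(TM_GPO - t for t in (30.8, 38.2, 31.2))

print(round(tm_t1655, 1), round(tm_t3785, 1))
```

Keeping the triplet table and the pair corrections as separate lookups mirrors the separation between Table I and Table II, so that either can be extended without touching the other.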
Predicted and observed Tm values show excellent agreement (within ±2 °C) for 14 peptides, with an overall correlation coefficient of r = 0.83 (Table III, Fig. 3). However, in a number of cases, the predicted values differed from experimental Tm values by >4 °C. Predicted values are consistently higher than observed ones for peptides containing consecutive strings of positively and negatively charged residues: T72058 (GER-GEK-GER-GEQ), T1904 (GPR-GDK-GET), MBL (GKD-GRD-GTK-GEK-GEO), and MSR-1 (GPK-GQK-GEK). This suggests that there are long-range effects in strings of residues of opposite charge such that simple additivity of individual triplets plus KGE/D effects no longer applies. Examination of peptides including highly basic sequences from the heparin binding region of the collagenous tail of the asymmetric form of acetylcholinesterase points to a potential destabilizing effect of charge repulsion at the uncapped N terminus. When these highly basic sequences are included in a host-guest context, with GPO caps at both ends, there is very good agreement between predicted and observed Tm values. However, when there is an uncapped N terminus, the experimental Tm values are often lower than those predicted. It is likely that charge repulsion leads to unraveling at the N terminus and a lower than expected stability. The remaining discrepancies likely point to other effects that have not been taken into consideration in the stability algorithm.
Calculation of Relative Stability for Collagen-like Domains and Full-length Collagens
The thermal stability algorithm must be modified for collagens because of their length and the presence of multiple cooperative units during unfolding. Following the averaging approach set forth previously (33, 34), the algorithm can be applied to identify thermally stable and labile domains along the triple helix. A stability coefficient is assigned to every GXY triplet (Table I) and corrected for the interaction between triplets (Table II). The stability is averaged over a window of 5 tripeptide units, with the average relative stability value for triplet i equal to the average of the stability coefficients in the interval [i - 2, i + 2], inclusive. The averaged relative stability values are plotted against the tripeptide number in the collagen sequence. For heterotrimer sequences, the values of the three individual collagen chain sequences are averaged.
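The window averaging described above can be sketched as follows. The function name `windowed_stability` is hypothetical, the input coefficients are illustrative values rather than a real collagen sequence, and the handling of the chain ends (averaging over whatever part of the window exists) is an assumption, since the treatment of the first and last two triplets is not specified in the text.

```python
def windowed_stability(coefficients, half_width=2):
    """Average each triplet's coefficient over the interval [i - 2, i + 2]."""
    averaged = []
    for i in range(len(coefficients)):
        lo = max(0, i - half_width)                       # truncate at the start
        hi = min(len(coefficients), i + half_width + 1)   # truncate at the end
        window = coefficients[lo:hi]
        averaged.append(sum(window) / len(window))
    return averaged

# Illustrative per-triplet stability coefficients (°C), not a real sequence.
coefficients = [47.3, 47.3, 30.8, 31.6, 40.9, 47.3, 47.3]
profile = windowed_stability(coefficients)
print([round(v, 2) for v in profile])

# For a heterotrimer, the three per-chain profiles would then be averaged
# position by position, as described in the text.
```

Plotting `profile` against the triplet index gives the kind of relative stability curve used to locate stable and labile regions along a chain.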
Previously, it has been assumed that regions of high imino acid content will be the most stable, whereas regions deficient in imino acids will be less stable. Examination of the stability profiles shows that regions lacking imino acids often contain KGE/D sequences and thus are quite stable. A wide range of KGE/D contents is observed in different collagens, ranging from 3% in type I and II collagens to 10% in type IV collagen in basement membranes (Table IV). The high KGE/D content may provide stability to compensate for the numerous destabilizing interruptions present in type IV collagen. The very high KGE/D content of some bacterial proteins, such as 20% for Scl1 (26), suggests the importance of electrostatic stabilization when the imino acid content is low and Hyp is absent (Table IV).
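A KGE/D content of the kind compared above can be estimated directly from a sequence. The sketch below assumes that the content is expressed as the number of KGE or KGD junctions per 100 Gly-X-Y triplets; the definition actually used for Table IV is not given in the text, so this choice, the function name `kge_d_content`, and the example sequence are all assumptions. The sequence is taken to be in one-letter code, in register starting with Gly, and free of interruptions in the Gly-X-Y repeat.

```python
def kge_d_content(sequence):
    """Percentage of triplets whose Y is Lys and whose next triplet has Asp/Glu in X."""
    usable = len(sequence) - len(sequence) % 3            # ignore a trailing partial triplet
    triplets = [sequence[i:i + 3] for i in range(0, usable, 3)]
    hits = sum(
        1
        for current, following in zip(triplets, triplets[1:])
        if current[2] == "K" and following[1] in ("D", "E")
    )
    return 100.0 * hits / len(triplets)

# Hypothetical 10-triplet sequence with one KGD and one KGE junction.
example = "GPOGAKGDAGPOGPOGPKGEOGPOGPOGPO"
print(kge_d_content(example))
```

Sequences with breaks in the Gly-X-Y repeat, such as type IV collagen, would need to be split at the interruptions before a count of this kind is meaningful.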
| CONCLUSIONS |
|---|
Sequence-based stability predictions have been explored for α-helices, β-sheets, and coiled coil α-helices (38-41), but, to the best of our knowledge, this report represents the first case in which it is feasible to use amino acid sequence to predict Tm values of peptides and to predict stability variations along proteins in a quantitative manner. Calculation of the relationship between amino acid sequence and stability is possible for the collagen triple helix because of 1) its linear nature, which limits interactions to be local, involving residues close in sequence; 2) the small size of the repeating unit Gly-X-Y; and 3) the strong preferential occurrence of a limited number of possible sequence combinations. The algorithm derived from peptide studies gives good predictions for the Tm values of many collagen-like peptides, suggesting that the important propensities and interactions are well described. The cases in which the agreement is not good point to the need for a better understanding of interactions. These include multiple like and unlike charges and charge repulsion that may unravel ends, as well as factors that were not explored in this study, such as the stabilizing effects of glycosylation of threonine in the Y positions (42, 43). Application of this algorithm to collagens suggests that regions of unusually high or low stability are likely to be of biological importance. The establishment of the rules of collagen stability highlights the different strategies used for stabilization of the triple helix in bacteria. About 80% of mutations in disease have been found to lead to protein destabilization in globular proteins (1), and destabilization appears to correlate well with collagen disease and severity as well (3, 44). The prediction of the effect of single amino acid replacements in the X or Y positions on collagen stability is now possible, as well as evaluation of the stability of the region in which the mutation occurs.
This approach may also be useful in the design of novel triple helical constructs for production in recombinant systems and applications in biomaterial and tissue engineering structures (45).
| FOOTNOTES |
|---|
Supported by the Michael Geisman Fellowship from the Osteogenesis Imperfecta Foundation.
|| To whom correspondence should be addressed: Dept. of Biochemistry, University of Medicine and Dentistry of New Jersey-Robert Wood Johnson Medical School, 675 Hoes Lane, Piscataway, NJ 08854. Tel.: 732-235-4048; Fax: 732-235-4783; E-mail: brodsky{at}umdnj.edu.