Integrin activation state determines selectivity for novel recognition sites in fibrillar collagens.

Only three recognition motifs, GFOGER, GLOGER, and GASGER, all present in type I collagen, have been identified to date for collagen-binding integrins, such as alpha(2)beta(1). Sequence alignment was used to investigate the occurrence of related motifs in other human fibrillar collagens, and located a conserved array of novel GER motifs within their triple helical domains. We compared the integrin binding properties of synthetic triple helical peptides containing examples of such sequences (GLSGER, GMOGER, GAOGER, and GQRGER) or the previously identified motifs. Recombinant inserted (I) domains of integrin subunits alpha(1), alpha(2) and alpha(11) all bound poorly to all motifs other than GFOGER and GLOGER. Similarly, alpha(2)beta(1)-containing resting platelets adhered well only to GFOGER and GLOGER, while ADP-activated platelets, HT1080 cells and two active alpha(2)I domain mutants (E318W, locked open) bound all motifs well, indicating that affinity modulation determines the sequence selectivity of integrins. GxO/SGER peptides inhibited platelet adhesion to collagen monomers with order of potency F >/= L >/= M > A. These results establish GFOGER as a high affinity sequence, which can interact with the alpha(2)I domain in the absence of activation, and suggest that integrin reactivity of collagens may be predicted from their GER content.

Collagen, the most abundant structural protein of the vertebrate organism, currently has 27 reported family members (1). Either as a mechanical support or as a bioactive surface, collagen plays a crucial role in processes as diverse as morphogenesis, wound repair, inflammation, tumor metastasis, hemostasis, and thrombosis. The tensile strength of the fibrillar collagens, types I-III, V, XI, and XXVII, is crucial to the function of connective tissues including the blood vessel wall. The non-fibrillar collagens, such as types IV and VI, provide a flexible support for endothelial and epithelial cell attachment and development. Integrins, heterodimeric adhesion molecules, form an important subgroup of receptors that mediate interaction between collagen and cells. Four α-subunits, α1, α2, α10, and α11, which associate non-covalently with β1, constitute the native collagen-binding integrin family (2).
Integrin α2β1 is a well characterized and widespread receptor for collagen, laminin, and other non-matrix ligands among nucleated cells including epithelial and endothelial cells, smooth muscle cells, fibroblasts, leukocytes, and mast cells (3,4), mediating a wide range of cellular activities. α2β1 is the only collagen-binding integrin in platelets, and is crucial for deposition on collagens exposed in damaged arterial walls (5,6). Accordingly, the expression levels of α2β1 have been associated with myocardial infarction and stroke (7). Recently, knockout studies suggested that the platelet-activating collagen receptor glycoprotein VI (GPVI) is mandatory for the initiation of integrin-mediated adhesion (8), but this concept was challenged by further mouse studies (9). Moreover, in vitro studies suggest that soluble collagen binding to α2β1 requires a change in the conformation of the integrin (10). Therefore, the mechanism by which α2β1 acts as a primary adhesive receptor and yet is subject to affinity modulation remains to be reconciled.
The native collagen-binding integrins contain within their α-subunit an inserted (I) domain (αI), which binds collagen through its metal ion-dependent adhesion site (MIDAS), perhaps the only site of interaction, although in integrins lacking an αI domain, the I-like domain of the β-subunit constitutes the ligand binding site (11). Use of collagen-derived triple helical peptides identified the sequence GFOGER (where O is hydroxyproline) as the minimal recognition motif for α1I, α2I, and α11I (12-14), and yielded the first integrin-ligand co-crystal (15), which revealed the crucial interaction of the MIDAS and the GER sequence. The peptide glutamate residue is directly coordinated to the divalent cation bound to the MIDAS, whereas the arginine forms a salt bridge to aspartate 219 of α2I. Changing Glu to Asp abolishes binding, while changing Arg to Lys reduces binding by 50% (13). In contrast, the phenylalanine may be less important, since two other collagen type I α1 chain sequences, GLOGER and GASGER, were found to bind α1I and α2I when I-domain-interacting areas were mapped within collagen by rotary shadowing (16). No other recognition sequences have been unequivocally identified hitherto, although collagens I and III contain 11 and 14 GER triple helical motifs, respectively, some of which occur within cyanogen bromide-cleaved peptides that support integrin binding (17-21). One conserved sequence, GMOGER, is cleaved at methionine by cyanogen bromide, and so is not present in such peptides.
We used bioinformatics to locate and identify novel GXXGER sequences within the collagen triple helices, and examined the affinity of synthetic peptides representing examples from them for α2β1 in different cell types, and for recombinant α1, α2, and α11 I domains. All the double triplets were synthesized in two different flanking hosts (22,23), (GPP)5 and (GPO)3, which provide the minimal structure to maintain triple helix stability at 20°C (24). We identified new sequences analogous to GFOGER at conserved loci within the collagen helix that can bind integrin, and confirmed the reactivity of GLOGER and GASGER. Competition experiments established an affinity series for the recognition motifs: GFOGER > GLOGER > GLSGER > GMOGER > GAOGER ≈ GASGER ≈ GQRGER. The cellular context was found to be crucial in determining the binding to these sequences, since, in contrast to adhesion of α2β1-containing HT1080 cells (a constitutively adherent cell line), α2β1-mediated platelet adhesion was poor to GER sequences other than GFOGER unless the peptide also contained GPVI recognition motifs ((GPO)n) (25,26), or the platelets were otherwise activated (27), in line with their hemostatic surveillance function. This concept was confirmed by non-selective binding of active α2I mutants to all sequences.

EXPERIMENTAL PROCEDURES
Materials-Human platelets were isolated from citrate-anticoagulated whole blood, provided by the National Blood Service (Cambridge, UK). Pepsin-digested monomeric collagen type I from bovine skin has been previously described (19). Monoclonal anti-α2 antibody 6F1 was a kind gift from Dr. Barry Coller (Mount Sinai Hospital, New York). GR144305F was a gift from Glaxo Wellcome (Stevenage, UK). Anti-GPVI Ab 10B12 was generated as previously described (28). Horseradish peroxidase-conjugated anti-GST antibody was from Amersham Biosciences UK Limited (Buckinghamshire, UK). Unless otherwise stated, other reagents were from Sigma.
Sequence Alignment-Fibrillar collagen sequences CA11, CA21, CA12, CA13, CA15, CA25, CA1B, CA2B, and CA1R were downloaded from the Swiss-Prot database (ca.expasy.org/sprot/), and non-helical parts of the sequences were removed before alignment using ClustalX. Small modifications to the alignment were made manually before assigning loci of GER sequences. Residue 1 of the type I collagen helix was denoted residue 1 of the alignment and of the first D-period of 234 residues. Subsequently, all 90 non-cuticle collagen sequences in the Swiss-Prot database with the "CA" prefix, encompassing all collagen types across a number of species, were downloaded using the sequence retrieval system (srs.embl-heidelberg.de:8000/srs5/). These were analyzed for the frequency of GXXGER sequences (see Supplementary Data).
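The frequency analysis of GXXGER sequences described above can be illustrated with a minimal Python sketch. It is not the procedure used in this work (the alignment was made with ClustalX and adjusted manually); the function name, the in-frame filter, and the toy sequence are assumptions made for illustration. Note that database sequences record proline (P) where the synthetic peptides carry hydroxyproline (O).

```python
import re
from collections import Counter

# Lookahead pattern so that overlapping candidate motifs are not skipped.
GXXGER = re.compile(r"(?=(G[A-Z]{2}GER))")

def find_gxxger(helix, in_frame_only=True):
    """Return (1-based position, motif) for each GXXGER hit.

    Assumption: `helix` starts at the first Gly of the Gly-X-Y repeat, so
    in-frame motifs begin at 0-based positions divisible by 3. Deletions in
    real helices can shift the frame; set in_frame_only=False in that case.
    """
    hits = []
    for match in GXXGER.finditer(helix):
        start = match.start()
        if in_frame_only and start % 3 != 0:
            continue
        hits.append((start + 1, match.group(1)))
    return hits

# Hypothetical toy helix (O = hydroxyproline); not a real collagen chain.
toy = "GPOGPO" + "GFOGER" + "GPOGPP" + "GLOGER" + "GPO" + "GASGER"
hits = find_gxxger(toy)
counts = Counter(motif for _, motif in hits)
print(hits)
print(counts)
```

Tallying `counts` over a set of helical sequences gives motif frequencies of the kind reported in the Supplementary Data.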
Cell Adhesion and Spreading-Human fibrosarcoma cells, HT1080, obtained from the European Collection of Animal Cell Cultures (Porton Down, UK) were maintained in Dulbecco's modified Eagle's medium containing 10% fetal bovine serum, 2 mM glutamine, 100 international units/ml penicillin, 100 µg/ml streptomycin, and 2.5 µg/ml amphotericin. Cells were harvested with trypsin/EDTA, washed, and suspended at 0.3 × 10⁶/ml in platelet adhesion buffer, supplemented with glucose at 1 g/liter and 2 mM EDTA or MgCl2. 100 µl of cells were allowed to adhere at 20°C for 30 min for standard adhesions and 20 min in the presence of 6F1. Adhesion (% of total number of added cells) was determined as for platelets. Cells were allowed to spread on GER substrates for 90 min, then fixed and stained with 2% Crystal Violet in PBS. Bright-field images (×200) from a Nikon TMS microscope were analyzed using a Leica Q550C image analyser and QWIN software (Leica, Cambridge, UK). The area of at least 100 cells was measured for each peptide.
Platelet Adhesion-Platelet (1.25 × 10⁸/ml) adhesion was determined colorimetrically (absorbance 405 nm), as described (31). Peptides or collagen were coated at 10 µg/ml on Immulon-2 HB 96-well plates (Thermo Life Sciences, Basingstoke, UK), at which concentration platelet binding reached a maximum. When peptides were used to inhibit platelet adhesion to collagen, they were added to the platelets prior to plating with 2 µM GR144305F, which prevents platelet aggregation. Stimulation by 30 µM ADP was carried out in the presence of 2 µM GR144305F 5 min prior to the addition of platelets to peptide-coated wells.
Data Analysis-Data from each donor platelet batch used for competitive inhibition of platelet binding to collagen (shown in Figs. 4 and 8) were fitted to the ligand binding Equation 1 (32), where P = response from platelet binding, P(min) = response at zero peptide concentration, P(max) = response at infinite peptide concentration, [A] = peptide concentration, pA50 = −log [peptide concentration] that gives a response of (P(min) − P(max))/2, and nH = Hill coefficient. The maximal inhibition of platelet binding, I(max), for each peptide was expressed as a percentage defined in Equation 2.
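Equations 1 and 2 are not reproduced in this text. The following Python sketch therefore assumes a common Hill-type form that is consistent with the symbol definitions above; it should not be read as the exact equations of Ref. 32. The function names are hypothetical, I(max) is assumed to be the percentage fall from P(min) to P(max), and the division by 3 follows the conversion to triple-helix concentration described in the next paragraph.

```python
def hill_response(conc_molar, p_min, p_max, pA50, n_h):
    """Assumed form of Equation 1: response P at peptide concentration [A].

    p_min is the response at zero peptide, p_max the response at infinite
    peptide, pA50 = -log10 of the half-effect concentration (molar), and
    n_h is the Hill coefficient.
    """
    a50 = 10.0 ** (-pA50)
    fraction = conc_molar ** n_h / (conc_molar ** n_h + a50 ** n_h)
    return p_min + (p_max - p_min) * fraction

def max_inhibition_percent(p_min, p_max):
    """Assumed form of Equation 2: I(max) as a percentage of P(min)."""
    return 100.0 * (p_min - p_max) / p_min

def a50_triple_helix_micromolar(pA50):
    """Convert pA50 (molar, single chains) to A50 in micromolar triple helix."""
    return 10.0 ** (-pA50) * 1e6 / 3.0

# Hypothetical parameter values, for illustration only.
print(hill_response(1e-5, p_min=1.0, p_max=0.2, pA50=5.0, n_h=1.0))
print(max_inhibition_percent(1.0, 0.2))
print(a50_triple_helix_micromolar(5.0))
```

With these definitions the response falls from P(min) toward P(max) as peptide concentration rises, which matches the naming used in the text for an inhibition assay.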
As P(min) varied between donors, datasets for any one peptide were normalized such that variance was evenly distributed between peptide concentration points and that the average P(min) value across them all was 1. Mean and S.E. values for pA50, nH, and I(max) were determined. Values for pA50 were converted to A50 values in micromolar units, and divided by 3 to give the concentration of triple helix.
Plasmids-The recombinant I-domain encoding plasmids of α1 (33), α2 (34), α2 mutant (E318W) (35), and αM (36) were a generous gift from Danny Tuckwell (F2G Ltd, Manchester, UK). For the integrin α11I expression plasmid, a DNA fragment corresponding to the integrin α11I was generated by PCR using the truncated human α11-fos HMT vector (a kind gift from D. Gullberg, Uppsala, Sweden) as template. The forward and reverse primers, with additional BamHI and EcoRI sites, were (5′-ATCATCAATTGGATCCCTGGATGGCTCCAACAGCAT-3′) and (5′-TCGATATTGAATTCCAGGGCATCGACAATGTCCT-3′), respectively. The fragment was digested with the above mentioned enzymes, ligated into pGEX-2T plasmid (Amersham Biosciences), and used to transform Escherichia coli strain BL21. The I-domain sequence from transformants was sequenced and compared with the published sequence (37).
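As a quick consistency check on the α11I primers listed above, the short Python sketch below locates the BamHI (GGATCC) and EcoRI (GAATTC) recognition sequences in the two primers. The helper name `site_positions` is hypothetical; the primer sequences are copied from the text.

```python
# Recognition sequences of the two restriction enzymes named in the text.
SITES = {"BamHI": "GGATCC", "EcoRI": "GAATTC"}

forward = "ATCATCAATTGGATCCCTGGATGGCTCCAACAGCAT"
reverse = "TCGATATTGAATTCCAGGGCATCGACAATGTCCT"

def site_positions(primer, site):
    """Return the 1-based start positions of `site` within `primer`."""
    width = len(site)
    return [i + 1 for i in range(len(primer) - width + 1)
            if primer[i:i + width] == site]

for name, primer in (("forward", forward), ("reverse", reverse)):
    for enzyme, site in SITES.items():
        print(name, enzyme, site_positions(primer, site))
```

The forward primer carries the BamHI site and the reverse primer the EcoRI site, in agreement with the order in which the enzymes are named.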
Mutagenesis of α2I-pGEX-2T containing the α2I DNA was used to generate the I-domain mutant "locked open" (LO), containing cysteine at positions 172 and 322 (G172C and L322C), by site-directed mutagenesis essentially as described previously for αLI (38). The underlying rationale was to secure helix 7 in the position defined in the I-domain:peptide co-crystal, so that the MIDAS remained open. The primers used were: GGAAAAATTTGTACAATGCCTTGATATAGGCCCCACAAAGACACAGG, CCTGTGTCTTTGTGGGGCCTATATCAAGGCATTGTACAAATTTTTCC, GTCTGATGAAGCAGCTCTATGCGAAAAGGCTGGGACATTAGGAG, and CTCCTAATGTCCCAGCCTTTTCGCATAGAGCTGCTTCATCAGAC.
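The four mutagenesis primers form two complementary pairs, as is usual for this style of site-directed mutagenesis. The Python sketch below checks this; the sequences are taken from the text with the line-break hyphens of the printed version removed, and the function and variable names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

# Primer pairs as listed in the text (first with second, third with fourth).
pair_1 = ("GGAAAAATTTGTACAATGCCTTGATATAGGCCCCACAAAGACACAGG",
          "CCTGTGTCTTTGTGGGGCCTATATCAAGGCATTGTACAAATTTTTCC")
pair_2 = ("GTCTGATGAAGCAGCTCTATGCGAAAAGGCTGGGACATTAGGAG",
          "CTCCTAATGTCCCAGCCTTTTCGCATAGAGCTGCTTCATCAGAC")

for sense, antisense in (pair_1, pair_2):
    print(len(sense), reverse_complement(sense) == antisense)
```

Both comparisons hold, which also supports reading the hyphens in the printed primers as typesetting breaks rather than part of the sequence.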
Expression of I Domains-For protein expression and purification of α1, α2, α11, αM, LO, and E318W I domains, a 40-ml overnight culture of transformants was used to inoculate 400 ml of Luria Broth, 50 mg/ml ampicillin. The culture was grown for 1 h at 37°C and then induced for 4 h with 0.1 mM isopropyl-β-D-thiogalactoside. Cells were harvested by centrifugation (4,500 × g, 10 min), and pellets resuspended in PBS without divalent cations (PBS−). Suspensions were sonicated and centrifuged (2,500 × g, 10 min), and the supernatants were adjusted to 1% Triton X-100. Pellets were resuspended in PBS−, sonicated, and centrifuged twice more, and the supernatants pooled. The lysate was passed down a glutathione-agarose column equilibrated in 150 mM NaCl, 20 mM Tris-HCl, pH 7.5 (TBS), the column was washed with 10 volumes of TBS, and the glutathione S-transferase-I domain fusion proteins were eluted with 10 mM glutathione in 50 mM Tris-HCl, pH 8.0. The proteins were then dialyzed against TBS and concentrated using a Microcon 3 microconcentrator (Amicon, Stonehouse, Gloucester, UK). The I domains were checked for purity and degradation by 10% SDS-gel and by Western blotting. Nitrocellulose blots were probed with HRP-conjugated polyclonal anti-GST antibody.

I-domain Binding-Immulon-2 96-well plates were coated as described for platelet adhesion, and blocked for 2 h with 200 µl of TBS with 50 mg/ml bovine serum albumin. Wells were washed four times with 200 µl of the adhesion buffer (TBS with 1 mg/ml bovine serum albumin) before adding 100 µl of adhesion buffer containing 5 µg/ml of recombinant GST I domains in the presence of either 2 mM MgCl2 or EDTA for 1 h at room temperature. This concentration of I domains provides optimal detection at a submaximal level of binding (39). Wells were washed five times with 200 µl of adhesion buffer containing MgCl2 or EDTA, before adding 100 µl of adhesion buffer containing the anti-GST horseradish peroxidase-conjugate (1:5000) for 1 h at room temperature. After washing, color was developed using an ImmunoPure TMB Substrate Kit (Pierce) according to the manufacturer's instructions.
Statistics-The absorbance values for the adhesion of cells or I domains to peptide substrates were compared using either 1- or 2-way ANOVA. Specific comparisons were performed using the Newman-Keuls (1-way) or Bonferroni (2-way) post-tests. Student's t tests were used where single conditions were compared in adhesion assays and to compare parameters of peptide inhibition (Figs. 4 and 8).

RESULTS
Collagen primary amino acid sequences GFOGER, GLOGER, and GASGER were previously shown to bind α2β1-containing cells, the purified integrin, and recombinant α1, α2, and α11 I domains (13,14,16). A bioinformatic approach was taken to analyze the presence of GER sequences in human fibrillar collagens. An alignment of individual α-chains from collagens I, II, III, V, and XI suggested that GXXGER sequences occur at conserved loci within triple helical collagen domains, and these sites contain novel GXXGER sequences (Fig. 1). Three conserved loci for GFOGER and GLOGER in collagen types I, II, and XI start at residues 127, 502, and 550 in the alignment. When other sequences occur in these loci, they are usually GMOGER or GAOGER, a previously proposed recognition sequence (25). Because the C-terminal GASGER, at residue 811, occupied a less conserved site and was shown to be of poor affinity at best (16), we also identified another C-terminal GXXGER locus nearby at 787, which contains several GXXGER sequences each having a polar rather than a hydrophobic residue within the first triplet. From this locus we chose to test GQRGER. Peptides containing these novel sequences were synthesized in addition to the other previously identified integrin recognition sequences using two different hosts, (GPP)5 and (GPO)3 (Table I). As controls, (GPP)10, or the GPPGPP sequence in a (GPO)3 host, were used. All the peptides had stable triple helices at room temperature, with GER peptides lacking hydroxyproline as their third amino acid exhibiting lower melting points, demonstrating its important role in stabilizing the peptide helix.
To assess cellular function of these recognition sequences, two different cell types, human fibrosarcoma cell line HT1080 and human platelets, each of which expresses no collagen-binding integrin other than α2β1, were assayed for their capacity to adhere to the peptides in the (GPP)5 host. HT1080 cells bound well to all GER peptides, whereas the binding was poor to (GPP)10 (Fig. 2). HT1080 cells showed a distinct binding profile: while GFO did not differ from GLO, GLS, or GMO, nor GAS from GAO, GMO, and GLS, significantly greater binding occurred to GFO or GLO than to GAO and GAS (Fig. 2A). Adhesion to (GPP)10 was significantly lower than to each GXXGER sequence. In separate experiments, the binding of HT1080 to the polar GQRGER motif was found to be within the same range as to GAO/GAS (data not shown). Integrin specificity of the adhesion to these sequences was demonstrated by their Mg2+ dependence (Fig. 2B), and by blocking with an α2 inhibitory mAb 6F1 (Fig. 2C). To detail the differential binding of HT1080 cells, we also measured their spreading. HT1080 cells showed a peptide preference for spreading, which paralleled their adhesion levels and which was clearly observed as a reduction in the number of fully spread cells on GAO, GAS, and GQR and the (GPP)10 control (Fig. 2D). GFOGER supported greater spreading, cell area being 150% of that obtained on GQR (p < 0.001), with cell area on other substrates decreasing in the order GFO, GLO > GLS, GMO > GAO, GAS, GQR (data not shown). Finally, we assayed adhesion of another cell type, platelets, to the GXXGER peptide set. In striking contrast to the HT1080 cells, divalent cation-dependent binding of platelets to all sequences except GFOGER was reduced, and platelet binding to GAS, GAO, and GQR was almost negligible, paralleling the lower affinity observed with HT1080 cells (Fig. 3).
To test whether the binding capacity of the GXXGER sequences for platelets could be improved by allowing their interaction in solution rather than as an immobilized ligand, the peptides were tested as inhibitors of platelet adhesion to bovine type I collagen monomers. Inhibition experiments indicated three affinity classes of peptides: GFO had a 45-fold higher affinity over the next best sequence, GLO, which had medium affinity together with GLS and GMO (Fig. 4A), whereas GAO/GAS/GQR exhibited no additional inhibition when compared with the (GPP)10 control (Fig. 4B and data not shown). This qualitative order of affinity was always preserved irrespective of the absolute binding levels, which varied between platelets from different donors. A50 values and statistical data derived from the curves modeled for Figs. 4 and 8 as described under "Experimental Procedures" are given for the different ligands in Table II. Inhibition of platelet adhesion to human collagens types I and III displayed the same affinity order as bovine monomers, as expected (data not shown).
FIG. 1. Loci of GXXGER peptide sequences in human fibrillar collagens. Collagen helices were aligned using ClustalX, and a schematic diagram is shown, where the lower bar shows the D-period helical overlap (dark gray), telopeptide overlap (light gray), and gap regions (white) of a fibril to give D-periods of 234 residues. Some helices start or finish differently in the alignment and/or have three-residue deletions. For example, type Vα1 starts 3 residues late at the N terminus (−3 helix length), has a deletion around residue 650 (vertical bar), and finishes six residues later at the C terminus (+6 helix length). An asterisk denotes that the collagen has no telopeptide at that end. GXXGER sequences are listed at their consensus sites numbered along the collagen bar, where sequences studied in this work are shown in bold. All GER sequences where X is hydrophobic or hydroxyproline/serine are included. A second class of GER sequence, represented by the boxed 787 consensus site, is characterized by X being a polar residue, represented by GQRGER in this study. There are eight other sites in fibrillar collagens, including e.g. GPR, GRO, and GKA.
To analyze the direct capacity of isolated αI to bind the GER peptides, we purified recombinant I domains as glutathione S-transferase (GST) fusion proteins (Fig. 5A). In an anti-GST ELISA, α2I binding to GXXGER was compared with that of α1I and α11I (Fig. 5B). Binding to GFOGER was used as an interassay reference to normalize the results with the three I domains. Mg2+-dependent adhesion of all I domains was significant (p < 0.05) to all peptides except (GPP)10 (all αI) and GQR (to which α11I bound significantly). The binding profile of wild-type α2I was more similar to that of α2β1 in situ in platelets than in HT1080 cells. Effective binding was observed to GFO, GLO, and GLS. α2I bound less to GMO than platelets did, and less well to GMO and GLS than HT1080 cells. Comparison of the I domains demonstrated both qualitative and quantitative differences (Fig. 5). α1I strongly preferred GLO (~90% of the binding to GFOGER) in comparison to α2I (~40%, p < 0.001) and α11I (~70%, p < 0.05). In contrast to α2I and α11I, α1I also had better affinity to GLS than GMO, the two former having GLS and GMO binding comparable to GAS and GAO. As a control, no αMI binding to the sequences was observed (data not shown).
The difference between the adhesion profiles of HT1080 cells and platelets to GXXGER peptides suggested that α2β1 was displayed in a different conformation in the two cell types. Platelet binding to collagen monomers in suspension has previously been shown to require affinity modulation of the integrin α2β1 through activation by agonists such as ADP or via collagen's main signaling receptor in platelets, GPVI (10). GPVI can be specifically activated with a cross-linked (GPO)10 polymer, collagen-related peptide (CRP), and peptides containing (GPO)3 can bind GPVI when immobilized to a surface (26). To allow platelet interaction or activation by GPVI, we synthesized the same set of GXXGER peptides in (GPO)3 hosts (Table I). In contrast to data shown above, platelets bound to all immobilized GXXGER sequences in which the (GPP)5 host was replaced by (GPO)3 (Fig. 6A). An α2β1-dependent element of adhesion could be resolved for GPO peptides containing GFO, GLO, GMO, and GLS as well as for the controls, GFOGER in (GPP)5 and monomeric collagen, since a significant decrease in binding was observed in the presence of either EDTA or 6F1 (p < 0.001, ANOVA). Adhesion to GAS and GAO within the (GPO)3 host, in contrast, was almost completely dependent on GPVI, judged from the marked attenuation of adhesion in the presence of an anti-GPVI antibody, 10B12 (28). The effect of 10B12 on adhesion to all these peptides was uniformly significant (p < 0.001, ANOVA). Thus, for most GER peptides, GAO and GAS being exceptions, platelet binding was dependent on both α2β1 and GPVI. Platelets from 30 donors bound GFOGER in the (GPO)3 host about 40% better than in the (GPP)5 host, revealing that α2β1 interaction with GFOGER predominates over the GPVI involvement (Fig. 6B), whereas HT1080 cells, which lack GPVI, bound to peptides in (GPP)5 and (GPO)3 hosts equally well (Fig. 6C).
Together, these data suggested that GFOGER is a high affinity sequence not requiring affinity modulation of the integrin for its interaction.
To determine whether platelet activation improves the capacity of α2β1 to interact with low affinity sequences through affinity modulation, we stimulated platelets prior to adhesion to the peptides in the (GPP)5 host with 30 µM ADP, providing a GPVI-independent activation. ADP treatment markedly increased platelet adhesion to all substrates except for GFOGER and collagen (Fig. 7), and now GAO, GAS, and GQR demonstrated clear binding. (GPP)10 controls showed little binding activity even after preactivation of the integrin. Although ADP stimulation improved platelet adhesion to the weaker affinity peptides more than introduction of the (GPO)3 host (likely to provide only weak activation of GPVI, Ref. 40), replacement of (GPP)5 with the (GPO)3 host dramatically improved the capacity of all peptides to competitively inhibit platelet adhesion to monomeric collagen. The most improvement was achieved for the medium affinity sequences GMO/GLS, which became almost as effective as GLO (Fig. 8A). Also, the lower affinity sequences GAS and GAO differed from their respective GPP-GPO control (Fig. 8B), but GQRGER (not shown) remained similar to control; A50 was >200 µM; nH and I(max) were indeterminate (n = 2). Relative A50 values were determined where possible from pooled experiments (Figs. 4 and 8) for the set of peptides, which demonstrates an overall decrease in A50 with all the peptides in a (GPO)3 host in comparison to the (GPP)5 host, particularly for GLS and GMO (Table II). However, GFOGER was a marked exception and conversely, it actually had a significantly lower A50 in the (GPP)5 host (p = 0.008). These data confirm that GFOGER supports I-domain interaction without integrin affinity modulation.
Final proof that the activation state of the integrin underlies the observed adhesion profiles was obtained using two different active mutants of α2I, E318W (35) and a newly generated disulfide LO form, which was produced by substituting cysteines at residues 172 and 322, and in which helix 7 is predicted from the I-domain crystal structure to be constrained at the activated down position (15). Both active mutants were found to adhere effectively to all peptides (Fig. 9). The overall absorbance levels of both LO and E318W binding to GFOGER were very similar to that of wild-type α2I. Most interestingly, the binding pattern of both E318W and LO mutants to lower affinity sequences resembled that of HT1080 cells, except that E318W binding to GQR was poor.
In summary, our data show that integrin recognition sequences of different affinities exist in conserved loci within fibrillar collagens, and that the activation state of the integrin depends on the cellular milieu and determines its affinity for the collagen species. The first identified integrin recognition sequence, GFOGER, is a high affinity motif, which can bind the integrin without its prior activation. In contrast, binding to most other GER sequences, including the newly found GMOGER, GLSGER, GAOGER, and GQRGER, is sequence-dependent, exhibiting a spectrum of affinity, both in the resting state and after integrin activation.
(FIG. 2 legend fragment) ... expressed as a mean ± S.E. of three separate determinations or n is shown below the bars. EDTA and 6F1 treatments differed significantly from their respective peptide only controls, two-way ANOVA, p < 0.0001. D, HT1080 spreading on the GXXGER peptide panel. Bar indicates 50 µm.

DISCUSSION
Human fibrillar collagen α-chains contain multiple two-triplet integrin binding motifs, which have GER as their second triplet. The conserved spacing of these various sequences within collagens I, II, III, V, and XI reflects their close evolutionary origins. The final collagen XXVII has a large fibrillar domain, which contains only three GXXGER sequences of unknown affinity. When 90 available collagen sequences were investigated, a total of 452 GXXGER sequences were found, of which just 70 are predicted according to our results to have significant binding to integrins based on their first triplet (GFO, GLO, GMO, GLS, GIO, GVO, GYO, and GMS; see also Supplementary Data). Of the sequences studied here, there exist 21 GFOGER, 23 GLOGER, 15 GMOGER, 3 GASGER, and 19 GAOGER sequences, whereas GLSGER is unique to human type III collagen. The first three have very high occurrence given the relatively low amino acid frequency of F, L, and M in collagen and thus are likely to have important function. GFOGER was found in collagen chains of Iα1, II, IVα1, IVα3, IVα4, IVα5, VII, XIα1, and XIα2, but not in any other collagens. In eel and Drosophila, GYOGER might substitute for the absence of GFOGER based on amino acid similarity. Moreover, a multitude of GXXGEK sequences in collagens coincide with the proposed GXXGER loci, suggesting that they too are integrin binding sites with a lower affinity than GER, as the Arg to Lys change attenuates affinity by 50% (13). Nine GFOGEK sequences that may bind integrin were located, including in human collagens Xα1 and VIα3, followed by a further 26 GLOGEK sequences in collagen types III, IV, V, VIII, and XI. This remarkable sequence conservation suggests that these motifs serve an important role, such as the ability to support receptor avidity and subsequent signaling as receptors align with the higher order structure of the fibril. The GXXGER sites may also have other functions: it has been proposed that dimerization of collagen monomers depends on the juxtaposition of GER with the A-domain of collagen VI (41).
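The prediction rule used for the database survey, in which a GXXGER sequence is counted as a likely binder according to its first triplet, can be written as a small Python sketch. The set of triplets is the one listed above; the function name and the input checks are assumptions made for illustration, and the rule itself is a prediction from the peptide data rather than a measured property of each sequence.

```python
# First triplets predicted in the text to confer significant integrin binding.
PREDICTED_BINDING_TRIPLETS = {"GFO", "GLO", "GMO", "GLS",
                              "GIO", "GVO", "GYO", "GMS"}

def predicted_to_bind(motif):
    """Return True if a GXXGER motif has one of the listed first triplets."""
    if len(motif) != 6 or motif[0] != "G" or not motif.endswith("GER"):
        raise ValueError("expected a six-residue GXXGER motif: " + motif)
    return motif[:3] in PREDICTED_BINDING_TRIPLETS

# Motifs tested as peptides in this study.
for motif in ("GFOGER", "GLOGER", "GLSGER", "GMOGER",
              "GAOGER", "GASGER", "GQRGER"):
    print(motif, predicted_to_bind(motif))
```

Under this rule GAOGER, GASGER, and GQRGER fall outside the predicted set, in line with their position at the low end of the affinity series.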
The affinity series of α2β1 for GER sequences was remarkably well conserved between HT1080 cells, platelets, and recombinant I domains, whether direct adhesion to peptides or displacement from collagen was measured: GFO > GLO > GLS > GMO > GAO ≈ GAS ≈ GQR. Minor exceptions to this rule were observed for the constitutively active mutants, where GLO, GLS, and GMO did not differ significantly from GFOGER. The elucidation of different preferences of binding requires structural characterization of these mutants. Relatively low binding was found for α2I and resting platelets to GLS and GMO in comparison to HT1080 cells and activated
(Figure legend fragment) ... Table II for A50 values). GQRGER data were omitted for clarity, as it was indistinguishable from GAO/GAS and (GPP)10 data.
(TABLE II footer)
GPP order: GFO *** GLO > > *** GLS > > ** GMO *** GAS = GAO = GPP
GPO order: GFO > * GLO ~ GLS > * GMO > > ** GAS > GAO > GPP
a Concentrations are given for the triple helix.
b NS, no significant difference.
c ND, effectively no or very little inhibition (<10%) detected at maximal concentration.
d Inhibition curve was incomplete. As a result, I(max) was defaulted to 100% in order to get a converging solution for the equation.
e p values for the difference between the series of GXXGER guest peptides with either (GPP)5 or (GPO)3 hosts are summarized in "GPP order" and "GPO order" at the foot of the table, where *, p < 0.05; **, p < 0.01; and ***, p < 0.005.
The affinity series illustrates how sequence variation influences I-domain:ligand interaction. Although the amino acids E and R are crucial for the GFOGER:I-domain complex, the aromatic ring of the F residue contributes to the binding by associating with the hydrophobic pocket on the surface of the I-domain (15). The first triplet may diminish α2 I-domain binding if it lacks a sufficiently long hydrophobic chain in the second position. Here we show that L, and to a lesser extent M and A, may substitute for F in this context, but the efficacy of the binding depends on the activation state of the integrin. GAOGER has previously been proposed, but not proven, to mediate α2β1-dependent binding of collagen III, which does not contain GFOGER (25), while the GMOGER and GLSGER sequences identified here are completely novel. The final hydrophobic amino acid-containing sequences, GIOGER, which occurs in collagen types I (chicken, frog) and III (chicken), and GVOGER, which occurs in collagen types VIII and X (human), would also be predicted to be higher affinity integrin binding sequences, as isoleucine and valine also have large non-polar side chains.
No side chain contacts were observed between the hydroxyproline in the third position of the first triplet and the α2I in the GFOGER co-crystal. Previously, GFPGER has been shown to bind platelet α2β1 almost as well as GFOGER (see Footnote 2). Thus, other residues (Ser, Pro) may be substituted without wholly compromising binding to α2I. However, while a recombinant type I collagen devoid of hydroxyproline bound α2β1 well, it had a poor interaction with both α1I and α1-expressing cells (43), suggesting selectivity among I domains. GLS was a substantially weaker binder than GLO for all wild-type αI domains, but GAS was similar to GAO, the latter presumably being a poor substrate compared with GFO by virtue of the short alanine side chain. We also identified 8 GXXGER loci in the collagens containing several alternative polar residue combinations within the first triplet. GQRGER, present at locus 787 in the consensus alignment in Fig. 1 of Iα1, IIα, and Vα2 chains, was tested as an example of them. Integrin and the I domains had similar affinity to GQR as to GAS and GAO, and GQR also required integrin affinity modulation for its binding. Thus, the GXXGER sequences provide a versatile means of interaction with the collagen-binding integrins.
Several lines of evidence presented here demonstrate that GFOGER is a unique high affinity recognition motif. First, of all the sequences tested, GFO had the highest affinity for I domains, and no difference was observed in the level of binding between the mutants and the wild-type α2I. Second, resting platelets were fully capable of binding GFO: inclusion of a GPVI interacting/activating host sequence did not change the binding or inhibitory capacity of GFO, nor did prior activation of platelets with ADP. As HT1080 cells had no significantly higher affinity for GFO than for GLO/GLS/GMO, and their adhesion profile resembled that of the gain-of-function mutant E318W, these cells are likely to express the integrin in an active conformation.
In platelets, α2β1 is thought to act as a primary adhesive receptor upon vascular trauma, as detailed in the "two-site, two-step" process of platelet-collagen interaction (44). Paradoxically, however, in the circulation α2β1 is in a resting state regulated by the divalent cation concentration in the blood. How, then, can α2β1 operate? Platelet α2β1 has been suggested to be a shear-dependent mechanoreceptor (45), a concept that has been strengthened by studies of I-domain-containing integrins in leukocytes (46). It also becomes activated through inside-out signaling by soluble agonists and by GPVI (27). The concept of a need for prior activation of α2β1 for hemostatic function was furthered by murine studies, in which platelet adhesion was markedly attenuated by the absence of GPVI (8). Here, we show that although platelet binding to the lower affinity sequences does indeed require prior activation, binding to GFOGER (and to a lesser extent GLOGER) does not, which may allow platelets to interact with collagens containing such sequences without a preceding switch in the integrin conformation. Thus, the collagen-integrin interaction represents a continuum, defined by the spectrum of affinities of the different GER sequences and the activation state of the integrin itself.

² P. Siljander, unpublished data.

[Figure legend, partial: ... (Table II). GQRGER data were omitted for clarity.]
These results are supported by previous observations that, under flow, primary platelet adhesion is impaired by blocking the function of human α2 (47,48), while the case is less clear for thrombus formation in mice (49-51). Comparison of the structures of the α2 I-domain alone and in complex with GFOGER reveals the existence of two distinct conformations, open and closed. Freed from the constraints imposed by the remainder of the integrin, the I-domain most likely exists in equilibrium between the two conformations, with the GFOGER peptide most readily able to select and stabilize the open state, i.e. activate the integrin. Whether the GFOGER-liganded and unliganded structures represent the extremes of movement within the I-domain, and whether they are sufficient to fully describe the inside-out activation process as well, is currently under investigation. The concept of ligand binding-induced activation is supported by intracellular platelet signaling by GFOGER (52).³ The GXXGER sequences of various affinities in conserved loci within the collagen fibril may underpin differential cellular fine-tuning by collagens, a topic that needs further study. A simple example is that of collagen types I and III: while the type I monomer has 2 GFO, 4 GLO, 2 GMO, 2 GAS, and 2 GQRGER among several other GXXGER sequences, type III has only 3 GMO, 3 GLS, and 6 GAOGER, all of relatively low affinity based on this study. This correlates well with the observed differences in studies using these collagens (5,28,53) and with the different α2β1 reactivity of different collagen types (54). Thus, the requirement for integrin activation in thrombus deposition, by GPVI or other agonists such as ADP, may be determined by the relative expression of collagen types in the subendothelium, which changes upon, e.g., development of an atheromatous plaque (55).
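The type I versus type III comparison can be made concrete with a small tally. The Python sketch below groups the motif counts quoted above by whether the first triplet bound resting α2β1 in this study (GFO and, to a lesser extent, GLO) or required prior activation. The names `ACTIVATION_INDEPENDENT` and `split_by_activation` are hypothetical, the two-class split is a simplification of what is really a continuum of affinities, and the type I counts are incomplete because the text lists only some of its GXXGER sequences.

```python
from collections import Counter

# First triplets that supported adhesion of resting platelets in this study.
# Treating GLO as fully activation-independent is a simplifying assumption.
ACTIVATION_INDEPENDENT = {"GFO", "GLO"}

# Motif counts per collagen monomer as quoted in the text.
# The type I list is partial ("among several other GXXGER sequences").
TYPE_I = Counter({"GFO": 2, "GLO": 4, "GMO": 2, "GAS": 2, "GQR": 2})
TYPE_III = Counter({"GMO": 3, "GLS": 3, "GAO": 6})


def split_by_activation(motif_counts):
    """Return (resting_capable, activation_dependent) motif totals."""
    resting = sum(
        count
        for triplet, count in motif_counts.items()
        if triplet in ACTIVATION_INDEPENDENT
    )
    return resting, sum(motif_counts.values()) - resting


type_i_split = split_by_activation(TYPE_I)      # (6, 6)
type_iii_split = split_by_activation(TYPE_III)  # (0, 12)
```

On these counts, half of the listed type I motifs fall in the activation-independent group, whereas none of the type III motifs do, in line with the lower affinity attributed to type III above.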
Further, the influence of the quaternary fiber structure on platelet function has been a long-standing issue, and these results may begin to offer insights into the differential α2β1 dependence of platelet or cell function on collagen monomers compared with fibrils (48,56,57). The possible modulation of integrin binding provided by the different chains in heterotrimeric collagens has not yet been fully elucidated, and it is essential to identify whether all these sequences are accessible for cell interaction within an intact fibril (58), or whether they only become exposed upon degradation and remodeling of the collagen matrix. Finally, we consider that additional integrin recognition sequences remain to be discovered.