The Cation-binding Domain from the α Subunit of Integrin α5β1 Is a Minimal Domain for Fibronectin Recognition*

The cation-binding domain from the α subunit of human integrin α5β1 was produced as a recombinant protein, α5-(229–448). This protein displays a well defined fold with a content of 30–35% α-helix and 20–25% β-strand, based on circular dichroism. The binding of Ca2+ or Mg2+ to α5-(229–448) results in a biphasic conformational rearrangement consistent with the occurrence of two classes of cation-binding sites differing by their affinities. The two classes of sites are located in two conformationally independent lobes, as established by a parallel study of two recombinant half-domains (N- and C-terminal) that also adopt stable folds. Upon saturation with divalent cations, α5-(229–448) binds an Arg-Gly-Asp (RGD)-containing fibronectin ligand to form a 1:1 complex. Complex formation is associated with a specific conformational adaptation of the ligand, suggesting an induced fit mechanism. In contrast, neither of the half-domains is competent for ligand binding. The α5-(229–448)-fibronectin complex is dissociated in the presence of an RGD peptide, as well as of a simple carboxylic acid, suggesting that the RGD aspartyl carboxylate is an essential element that directly interacts with the α5 cation-binding domain.

[...] heterodimers of non-covalently associated α and β subunits. In general, ligand binding occurs through recognition by the integrin of a short amino acid sequence from the ligand (3). The prototype for these integrin-binding sites is the Arg-Gly-Asp (RGD) sequence (4) that is present in fibronectin, fibrinogen, vitronectin, and other adhesive proteins (5). Divalent cations are essential for integrin function (6). Both integrin α and β subunits have been implicated in cation and ligand binding (7), although the precise nature of the ligand recognition site on integrins remains elusive. The β subunits are characterized by a conserved domain in their N-terminal regions that contributes to cation as well as ligand binding (8–10). The integrin α subunits are characterized by the presence of seven N-terminal repeats that encompass nearly half of the subunit residues (5). Three to four of these repeats (IV or V to VII) display sequences that resemble the EF-hand consensus sequence found in various divalent cation-binding proteins (11). However, the integrin EF-hands systematically lack the acidic residue at relative position 12, a highly invariant Glu residue in the typical EF-hands, which is replaced by a non-polar residue. It has been hypothesized that the integrin EF-hands are responsible for the cation binding properties of the integrin α subunits (11). In agreement with this hypothesis, a soluble recombinant fragment that encompasses the four EF-hand type sequences from the α subunit of integrin αIIbβ3 displays four Ca2+-binding sites distributed in two classes differing by their affinities (12). Moreover, this isolated, fully Ca2+-loaded αIIb EF-hand domain binds fibrinogen in an RGD-dependent manner, establishing the functional importance of an α-integrin EF-hand type domain in ligand-integrin interactions.
Knowledge of the tertiary organization of these macromolecular components, in their free states and in the integrin-ligand complexes, is of paramount importance to our understanding of the molecular bases of cell adhesion. It is now established that the RGD adhesion motif occurs in extended flexible loops between β-strands, based on atomic resolution structures of ligand proteins (13–17). Additional sequences besides the RGD motif are required for full adhesion activity with some of the ligand proteins. A region in the fibronectin 9th type III repeat (or III9) that contains the PHSRN sequence appears to synergize with RGD in the 10th type III repeat (or III10) for recognition by α5β1 (18). The x-ray structure of a recombinant protein spanning type III repeats 7–10 of fibronectin has revealed (17) that both sites, RGD and PHSRN, are located on the same face of the ligand molecule, in agreement with the synergistic properties of the latter. By contrast, no tertiary structure information is available so far for the integrins, with the exception of the I domains in αM, αL, and α2 (19–21). Hypothetical models based on molecular modeling have been proposed to account for the folding of the N-terminal regions of the integrin α subunits, including the cation-binding EF-hand type repeats.
A first model predicted a 4-fold helix-loop-helix (HLH) arrangement, as in calmodulin (CaM), for the cation-binding domain in the αL subunit (22). Another model envisages a β-propeller fold that encompasses all N-terminal repeats I–VII (23). This fold consists of seven modules, each a 4-stranded β-sheet, in a toroidal arrangement. The two models, CaM and β-propeller, are mutually incompatible with regard to the structure of the EF-hand domain. The first one predicts a large contribution of α-helices, whereas the second one predicts only β-strands and coil.
We report here on the cloning, expression, and purification of three protein fragments representing different sequences from the cation-binding domain of the α subunit in human integrin α5β1. Two different types of proteins were produced as follows: (i) α5-(229–448) representing the entire cation-binding domain, with four EF-hand loop sequences; (ii) α5-(229–316) and α5-(350–448) representing the N- and C-terminal half-domains, respectively, with two EF-hand loop sequences each. The search for a minimal ligand-competent region from the α5 cation-binding domain was guided by the results of Gulino et al. (12) with αIIb, as well as by the CaM model (22). The production of such recombinant α5 proteins in soluble form opens for the first time the possibility of investigating the structural organization of the α5 divalent cation-binding domain as a prototype of the homologous EF-hand domains in the different integrin α subunits. By using CD spectroscopy, we established that these α5 proteins display a well defined structural organization in solution. We then investigated their capacity to bind divalent cations, as well as their conformational variability upon cation binding, and their capacity to bind RGD-containing fibronectin (FN) ligands.
Purification of Recombinant Proteins from the Soluble Fraction-DNA was precipitated with protamine sulfate (26). The protein fraction obtained was dialyzed against buffer A plus 5 mM MgCl2 and loaded onto a Ni-NTA agarose column (Qiagen) previously equilibrated in the same buffer. His tag-α5-(229–448) was eluted with a linear imidazole gradient (0–200 mM). The recombinant protein was eluted at about 150 mM imidazole. The pooled fractions were dialyzed against buffer A plus 5 mM MgCl2 and loaded onto a DEAE-Sephacryl column (IBF, France) previously equilibrated with the same buffer. Elution was carried out with a linear KCl gradient (0.15–1.5 M; protein eluted at about 1.2 M KCl). Desalting was carried out on a Sephadex G-50 column (Pharmacia, Sweden) equilibrated in buffer A plus 5 mM MgCl2.
Purification of Denatured Proteins-The pellet obtained after bacterial lysis was washed with buffer B plus 1 M urea (see "Results") and resuspended in buffer B plus 6 M urea. The suspension was loaded onto a Ni-NTA agarose column previously equilibrated in buffer B plus 6 M urea. After washing with the same buffer, the protein was eluted using a linear imidazole gradient (0 -200 mM) in buffer B plus 6 M urea.
Refolding of the Denatured Proteins-The denatured protein was dialyzed against buffer B plus 6 M urea to remove all excess imidazole and was progressively added to a slurry of Ni-NTA agarose, previously equilibrated in buffer B plus 6 M urea, to achieve a protein-to-resin ratio of about 0.5 mg of protein/ml of packed wet resin. The suspension was incubated for 1 h at 4°C under gentle rocking. The resin was then loaded onto a column and washed with buffer B plus 6 M urea. Refolding was carried out using a linear urea gradient (6 to 0 M in buffer B). The refolded protein was then eluted from the column with buffer B plus 0.2 M imidazole and centrifuged (150,000 × g for 1 h) to discard insoluble protein. The protein concentration in the supernatant was determined from its UV absorbance at 276 nm (see below). The recovery yield corresponds to the ratio of the amount of solubilized protein to the amount of protein initially bound to the Ni-NTA agarose matrix.
Thrombin Cleavage-The His-tagged proteins were first dialyzed against 20 mM Tris-HCl at pH 8 containing 50 mM KCl, 2.5 mM CaCl2, and 1 M urea (see "Results"). Digestion was carried out at 37°C at an enzyme-to-substrate ratio of 10 units of thrombin (Sigma) per mg of cleavable protein. The reaction was stopped by the addition of PMSF to a final concentration of 2 mM. The time needed for an efficient cleavage reaction was previously determined by a time course SDS-PAGE analysis. The digestion mixture was dialyzed against 20 mM Tris-HCl, 400 mM KCl, 5 mM MgCl2, 2 mM PMSF, 10% glycerol (v/v) at pH 8 and loaded onto a Ni-NTA column. The unbound protein was dialyzed against buffer A plus 5 mM MgCl2 and loaded onto a DEAE-Trisacryl column. Elution was carried out with a linear KCl gradient (0.15–1.5 M KCl). The proteins obtained include the tetrapeptide GSHM resulting from the cleavage by thrombin (see "Results").
Protein Concentrations-UV absorbance measurements at 276 nm (Cary 118 UV spectrometer) were used to determine protein concentrations. The molar absorbances of the different recombinant proteins used throughout this work were determined by the procedure of Gill and von Hippel (27). The contribution of scattering to the UV spectra was usually avoided by ultracentrifugation (150,000 × g for 1 h). The values obtained (expressed in liters·M−1·cm−1) are 24,390 for α5-(229–448), 18,640 for α5-(229–448)W406L, 10,150 for α5-(229–316), 10,530 for α5-(350–448), 32,150 for FNIII8–10, and 19,500 for FNIII10–11. In the absence of any additional aromatic residue absorbing at 276 nm in the 20-residue N-terminal His tag appendix, identical values were adopted for the His-tagged proteins.
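The conversion from absorbance to concentration implied here is the Beer-Lambert law, c = A/(ε·l). A minimal sketch follows, using the molar absorbances quoted above; the absorbance reading, the 1-cm path length, and the function name are illustrative assumptions, not values from this work.

```python
# Molar absorbances at 276 nm as quoted in the text (liters·M^-1·cm^-1).
EPSILON_276 = {
    "a5-(229-448)": 24390,
    "a5-(229-448)W406L": 18640,
    "a5-(229-316)": 10150,
    "a5-(350-448)": 10530,
    "FNIII8-10": 32150,
    "FNIII10-11": 19500,
}


def molar_concentration(absorbance, protein, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l), returned in mol/liter."""
    return absorbance / (EPSILON_276[protein] * path_cm)


# Hypothetical reading (assumption): A276 = 0.2439 in a 1-cm cell.
c = molar_concentration(0.2439, "a5-(229-448)")
print(f"{c * 1e6:.1f} uM")
```

With these assumed inputs the result falls in the 10 μM range used for the CD measurements.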
Circular Dichroism Measurements-CD spectra were recorded at 25°C with a dichrograph CD6 (Jobin Yvon, Paris, France) under a stream of pure nitrogen. The spectra are the average of five scans recorded using a bandwidth of 2 nm, a step width of 0.1 nm, and a 0.5-s averaging time per point. Cell path lengths were 1.00 ± 0.01 mm (far UV region, <250 nm) or 50.00 ± 0.01 mm (near UV region, >250 nm). Protein concentrations in the 10 μM range were used. The protein concentration was similar for both the far and the near UV measurements to avoid concentration-dependent effects. Absolute ellipticity values were inferred from standardization with D-10-camphosulfonic acid (Fluka, Switzerland) according to the procedure of Venyaminov and Yang (28). Molar ellipticities are given in degrees·cm2 per dmol of mean amino acid residue, with Mr = 115 (28). Deconvolution of the far UV CD profiles was carried out using standard computer procedures including ridge regression (29), variable selection procedure (30), and neural network (31). The dispersion in the values given in Table I corresponds to the extrema obtained with these different methods.
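For readers converting raw instrument output, the standard relation between observed ellipticity and mean residue ellipticity can be sketched as below. The formula is the usual textbook conversion; the mean residue weight of 115 is the value stated above, whereas the example reading, concentration, and function name are assumptions chosen for illustration.

```python
MEAN_RESIDUE_WEIGHT = 115.0  # Mr per residue, as stated in the text


def mean_residue_ellipticity(theta_mdeg, conc_mg_per_ml, path_cm,
                             mrw=MEAN_RESIDUE_WEIGHT):
    """Return [theta] in degrees·cm^2·dmol^-1 per residue.

    [theta] = theta_obs(mdeg) * MRW / (10 * path(cm) * conc(mg/ml))
    """
    return theta_mdeg * mrw / (10.0 * path_cm * conc_mg_per_ml)


# Hypothetical far-UV reading (assumption): -10 mdeg, 0.23 mg/ml, 1-mm cell.
value = mean_residue_ellipticity(-10.0, 0.23, 0.1)
print(f"{value:.0f} deg cm2 dmol-1")
```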
Molecular Weight Determinations by Mass Spectrometry-The purified recombinant proteins were desalted by reverse-phase high pressure liquid chromatography on a Nucleosil C4 column (SFCC, France) using a linear acetonitrile gradient (35–65%) in deionized water containing 0.1% trifluoroacetic acid. The pooled protein fractions were analyzed by electrospray mass spectrometry using a Fisons VG Trio 2000 mass spectrometer equipped with an ion spray source. The protein molecular masses were determined using the ESI deconvolution software from Fisons.
Amino Acid Sequencing-N-terminal amino acid sequencing was carried out by automated step-by-step Edman degradation using a gas phase amino acid sequencer with an on-line phenylthiohydantoin analyzer (Applied Biosystems Inc., model 476A), as described by Speicher (32). Ten steps were performed.
Divalent Cation Binding Measurements-Ca2+ blotting experiments were carried out by using either radioactive 45Ca2+ (33) or the fluorescent quinoline Ca2+ indicator quin2 (34). Quantitation of the amount of Ca2+ ions bound to the α5 recombinant fragments was carried out by densitometry (NIH Image software) of the electrophoretic band labeled with either 45Ca2+ or quin2. Parvalbumin with a known stoichiometry of 2 Ca2+/protein (35) was used as a standard to obtain a plot representing the intensity of the labeled electrophoretic band as a function of Ca2+ content. The linear plots obtained (not shown) were used to infer the binding stoichiometry of the α5 recombinant fragments. CD-monitored titrations with Ca2+ or Mg2+ were carried out using the apoforms of our different integrin recombinant fragments that were obtained as described previously for parvalbumin (36). Titrations were carried out by adding aliquots of either CaCl2 or MgCl2 in buffer A at concentrations ranging from 10 to 500 mM to an apoprotein solution at 10 μM concentration to afford regularly spaced points in the titration plot (see "Results"). Under the conditions used, the protein concentration was kept nearly constant over the entire range of cation concentrations used (the maximum dilution effect did not exceed 2% of the initial protein concentration).
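The calibration against parvalbumin can be illustrated with a short sketch: a straight line is fitted to band intensity versus Ca2+ content of the standard, and the line is then inverted for an unknown band. Forcing the line through the origin is an assumption made here for simplicity, and all intensities and amounts below are invented placeholders, not data from this work.

```python
CA_PER_PARVALBUMIN = 2  # known stoichiometry of the standard (35)


def slope_through_origin(x, y):
    """Least-squares slope of y = k * x (line forced through the origin)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


# Hypothetical standard lane loads (nmol parvalbumin) and band intensities.
standard_protein_nmol = [0.5, 1.0, 2.0]
standard_intensity = [100.0, 200.0, 400.0]
standard_ca_nmol = [CA_PER_PARVALBUMIN * p for p in standard_protein_nmol]

k = slope_through_origin(standard_ca_nmol, standard_intensity)


def ca_per_protein(band_intensity, protein_nmol):
    """Ca2+ ions bound per protein molecule, read off the calibration line."""
    return band_intensity / k / protein_nmol


# Hypothetical sample band (assumption): intensity 400 for 1.0 nmol of protein.
stoich = ca_per_protein(400.0, 1.0)
print(f"{stoich:.1f} Ca2+ per protein")
```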
Fibronectin Binding Measurements-A protein affinity chromatography method similar to that described by Roberts and Green (37) was used to evaluate the binding of the α5 recombinant fragments to the immobilized fibronectin ligand. The α5 recombinant fragments were incubated with the GST fusion fibronectin fragment of human origin, GST-FNIII8–10 (38), in buffer A at different CaCl2 or MgCl2 concentrations. The protein mixture was then loaded on a glutathione-agarose matrix (Sigma). Washing of the column was carried out with the incubation buffer until all unbound material was eluted. The proteins bound to the matrix were identified by SDS-PAGE analysis. We verified that GST itself did not interfere with the integrin-ligand association (not shown).
Chemical Cross-linking-Before complex formation, the integrin (IN) and FN components were dialyzed against 20 mM Hepes, 150 mM KCl, 5 mM CaCl2 at pH 6.0. After centrifugation (80,000 × g for 30 min), the recombinant integrin and fibronectin molecules were mixed at a 1:1 molar ratio (in the 10 μM range), incubated at room temperature for 2 h, and then subjected to chemical cross-linking by the addition of dithiobis(succinimidyl propionate) (Pierce) in N,N-dimethylformamide to a final concentration of 0.5 mM (39). The reaction was stopped by the addition of glycine to a final concentration of 50 mM. The time necessary for a complete cross-linking reaction was previously determined by a time course SDS-PAGE analysis (not shown). The cross-linked molecular species were separated by size-exclusion chromatography using a Sephacryl S-100 HR column (2.6 × 100 cm; Pharmacia, Sweden) under non-denaturing conditions (40).

RESULTS
Cloning and Expression of α5-(229–448)-Two different clones were obtained as revealed by nucleotide sequencing. One clone codes for the parent amino acid sequence α5-(229–448) and the other for α5-(229–448)W406L, a single point mutant with the unique Trp-406 (see Fig. 9B) substituted by Leu. Since similar results were obtained for the expression, purification, and refolding of the wild-type protein and the W406L mutant, no further distinction will be made between the two proteins throughout this paper unless otherwise directly stated. The pET15b-α5-(229–448) vector was transformed into the E. coli strain BL21(DE3), and protein expression was induced by the addition of IPTG (Fig. 1A, lanes 1 and 2). Independently of temperature and IPTG concentration, only a minor part of the recombinant protein was found in the soluble fraction of bacterial lysate, whereas the major part remained in the insoluble fraction. The protein was purified from the soluble fraction by affinity chromatography using a Ni-NTA agarose column followed by two additional chromatographic steps (see "Experimental Procedures") to yield the purified His-tagged protein that was subsequently cleaved with thrombin (Fig. 1A, lanes 3 and 4, respectively). Cleavage yields by thrombin were optimized using variable urea concentrations in the 0.5–2 M range (see "Experimental Procedures"). Approximately 2 mg of protein/1.5 liters of bacterial culture were obtained from the soluble fraction after purification and thrombin cleavage. As shown in Fig. [...] 45Ca2+ overlay blotting experiments. The stoichiometry of the binding was determined from the intensity of the 45Ca2+-labeled band using parvalbumin as a standard (see "Experimental Procedures"). Two independent measurements gave 3.9 and 4.1 Ca2+ ions per molecule of α5-(229–448). Similarly, the fluorescence quin2 procedure adapted for quantitative measurements (see "Experimental Procedures") afforded 3.91 and 3.88 Ca2+ ions per α5-(229–448) molecule. This leads to the conclusion that the isolated α5 EF-hand domain displays a total binding stoichiometry of four Ca2+ per protein molecule.
FIG. 1 (legend, partial). [...] 45Ca2+ binding after detection by autoradiography. Markers are as follows: carbonic anhydrase A (31.0 kDa) and soybean trypsin inhibitor (21.5 kDa). The influence of salt and protein concentration on the recovery of His tag-α5-(229–448) as a soluble protein is illustrated in B and C. The Ni-NTA immobilized protein in its denatured form was refolded as described under "Experimental Procedures": B, at constant protein concentration of 1 mg/ml hydrated resin in buffer A at different KCl concentrations; C, at different protein concentrations (in mg of protein/ml of hydrated resin) at 150 mM KCl in buffer B. The recovery yield was calculated as described under "Experimental Procedures."
Refolding of Insoluble α5-(229–448)-After a first washing of the pellet in 1 M urea, which significantly reduced the amount of contaminant bacterial proteins (SDS-PAGE, not shown), the recombinant protein from the insoluble fraction was solubilized in 6 M urea. Based on previous refolding assays using matrix-bound proteins (41, 42), the refolding of His tag-α5-(229–448) was carried out with the denatured protein immobilized on the Ni-NTA affinity matrix (see "Experimental Procedures"). The efficiency of protein refolding was dependent on different parameters that were optimized as follows: (i) the recovery yield is strongly dependent on KCl concentration, as shown in Fig. 1B; a maximum yield (~65%) was observed at about 150 mM KCl, whereas at lower and higher KCl concentrations the recovery yield rapidly decreases. (ii) As shown in Fig. 1C, renaturation of the matrix-bound protein can be performed at a gel loading up to 0.5–0.6 mg of protein/ml of hydrated resin without affecting the recovery yield; above this value, the recovery yield rapidly decreases. (iii) The recovery yield was slightly increased by adding divalent cations, glycerol, and a non-ionic detergent (Triton X-100) to the refolding buffer. A final set of refolding conditions, combining the optimal parameters defined above, was thus adopted to yield a recovery of about 80% of the protein initially solubilized in buffer B plus 6 M urea. Under these conditions, about 10 mg of properly refolded protein were obtained from 1.5 liters of bacterial culture. The refolded protein displays both Ca2+ binding and fibronectin binding properties similar to those of the protein recovered from the soluble fraction. Moreover, the refolded protein displays a CD spectrum superimposable on that of the protein recovered from the soluble fraction (not shown), indicating that a correct refolding is achieved under the conditions used.
To test for both correct synthesis in bacteria and correct thrombin cleavage, we determined the exact mass of the recombinant proteins (wild-type and mutated) by electrospray ionization mass spectrometry (see "Experimental Procedures"). Relative molecular masses of 23,485.6 ± 0.7 and 23,411.6 ± 0.4 were obtained for α5-(229–448) and α5-(229–448)W406L, respectively, in good agreement with the calculated values, i.e. 23,485.0 and 23,411.9, respectively. Partial N-terminal amino acid sequencing (see "Experimental Procedures") was also performed with α5-(229–448) and gave the expected sequence, GSHMSYLGYS (see Fig. 9B), with the four N-terminal residues representing the appendix left after thrombin cleavage of His tag-α5-(229–448).
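As a quick consistency check on these masses, the difference between the wild-type and W406L proteins should equal the difference between the average residue masses of Trp and Leu. The sketch below uses standard average residue masses (an input not taken from this work) together with the calculated and measured masses quoted above; variable names are illustrative.

```python
# Standard average residue masses in Da (not from the paper).
TRP_RESIDUE = 186.2132
LEU_RESIDUE = 113.1594

expected_shift = TRP_RESIDUE - LEU_RESIDUE  # mass lost on Trp -> Leu

# Masses quoted in the text for wild type and W406L.
calculated = {"wild_type": 23485.0, "W406L": 23411.9}
measured = {"wild_type": 23485.6, "W406L": 23411.6}

calc_diff = calculated["wild_type"] - calculated["W406L"]
meas_diff = measured["wild_type"] - measured["W406L"]

print(f"expected Trp->Leu shift: {expected_shift:.2f} Da")
print(f"calculated difference:   {calc_diff:.1f} Da")
print(f"measured difference:     {meas_diff:.1f} Da")
```

The calculated difference matches the expected shift to within rounding, and the measured difference lies within about 1 Da of it, which is comparable to the combined uncertainties quoted for the two measurements.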
Cation-dependent Conformation of the α5 Cation-binding Domain-As shown in Fig. 2, both the apoform and the Ca2+- or Mg2+-loaded form of α5-(229–448) display superimposable far UV CD profiles, indicating that the secondary structure is independent of the binding of divalent cations. The secondary structure content of α5-(229–448) was therefore inferred from the far UV region of the CD spectrum (Fig. 2A). The results are summarized in Table I. Caution is needed regarding the absolute numbers of α-helical and β-stranded residues inferred, since these values could be biased by, among other factors, the deconvolution methods, as well as by effects originating from the amino acid composition (occurrence of aromatics; see Ref. 43). We used the typical EF-hand protein parvalbumin to assess the validity of our deconvolution approach with our integrin EF-hand fragments. The secondary structure of pike parvalbumin (isoform pI 5.0, fully Ca2+-loaded) was inferred from its far UV CD spectrum (not shown), yielding 45–50% α-helix and 5% β-stranded residues, in close agreement with the crystal structure (62 helical residues among 109 in total, i.e. 57% α-helix). Such an agreement between the CD data and the parvalbumin structure makes us confident that the values in Table I closely represent the numbers of α-helical and β-stranded residues in our integrin proteins.
In contrast to what is observed in the far UV region, a marked difference was observed in the near UV region of the CD spectrum between the apoprotein and the Ca2+- or Mg2+-loaded form (Fig. 2). The Phe- and Trp-associated CD bands undergo large intensity variations with no significant wavelength shift upon cation binding, whereas the Tyr-associated bands remain essentially unaltered. The increase in rotational strength (for a definition see Ref. 28) of these bands is likely to be associated with a compact state of the protein hydrophobic core upon cation binding, as previously reported in the case of the typical EF-hand proteins (44, 45). Comparison of Fig. 2, B and D, indicates that both ions, Ca2+ and Mg2+, induce practically identical effects on the near UV CD spectrum of α5-(229–448).
We then carried out a titration of the apoform of α5-(229–448) by Ca2+ and Mg2+ using the Phe- and Trp-associated CD bands. As shown in Fig. 3, A and C, the intensity variations of the Phe band at 268 nm resulted in a biphasic profile upon progressive addition of either Ca2+ or Mg2+ to the α5-(229–448) apoform. Practically identical results are observed using Ca2+ or Mg2+, indicating that the affinity of α5-(229–448) is the same for both cations. The first transition in the biphasic profile corresponds to the occupancy of a class of high affinity sites, whereas the second transition is associated with a class of low affinity sites. The midpoints of the first (I) and second (II) transitions occur at 0.03 and 0.16 mM, respectively. A KdI value of 30 μM was determined for class I, and a KdII value of 120 μM was determined for class II for Ca2+ and Mg2+ binding. These Kd values are closely related to those previously reported for Ca2+ binding to the recombinant fragment αIIb-(171–464) from the αIIb subunit (12). Although the ratio KdII/KdI does not exceed a value of 4, both transitions I and II in the Ca2+ or Mg2+ titration profiles are well resolved, and this is likely to be due to cooperativity within each class of sites with two cation-binding sites each. In contrast, when titration of the α5-(229–448) apoform by Ca2+ or Mg2+ was monitored using the Trp CD band at 290 nm (Fig. 3, B and D), only a monophasic profile was observed with a midpoint at 0.03 mM that closely coincides with the midpoint of transition I in the biphasic profile. α5-(229–448) contains a single Trp residue at position 406, such that the monophasic CD variations are likely to reflect a local conformational rearrangement of the protein, in contrast to Phe with 12 residues distributed all along the primary structure (see Fig. 9B).
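The argument that a Kd ratio of only 4 gives resolved transitions when binding within each class is cooperative can be explored with a simple two-class Hill model. This is a sketch under stated assumptions and not the fitting procedure used in this work: each class is described by a Hill term whose half-saturation concentration is set to the quoted Kd, the two classes contribute equally to the signal, free and total cation concentrations are treated as equal, and the Hill coefficient n is a free illustrative parameter.

```python
KD_I_UM = 30.0    # class I (high affinity), from the text
KD_II_UM = 120.0  # class II (low affinity), from the text


def hill(ligand_um, kd_um, n):
    """Fractional occupancy of one class of sites (Hill equation)."""
    x = ligand_um ** n
    return x / (kd_um ** n + x)


def signal(ligand_um, n, weight_i=0.5):
    """Normalized CD change: weighted sum of the two classes (assumed weights)."""
    return (weight_i * hill(ligand_um, KD_I_UM, n)
            + (1.0 - weight_i) * hill(ligand_um, KD_II_UM, n))


# Compare a non-cooperative (n = 1) and a cooperative (n = 2) scenario.
for n in (1, 2):
    row = ", ".join(f"{c:g} uM: {signal(c, n):.2f}"
                    for c in (10, 30, 60, 120, 500))
    print(f"n = {n}: {row}")
```

With n = 1 the two steps overlap heavily into one broad curve, whereas larger n makes each step steeper and therefore easier to distinguish, which is the qualitative behavior invoked above.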
At this point, we analyzed whether a similar conformational adaptation occurred upon cation binding to the half-domains, α5-(229–316) and α5-(350–448). The far UV region of the CD spectra of both fragments indicates that both proteins display a well defined secondary structure that is independent of their divalent cation contents (see Fig. 4, A and C). Both half-domains are characterized by very similar numbers of α-helical residues and β-strand residues (Table I). As in the case of α5-(229–448), the Phe- and Trp-associated bands in the CD spectra of α5-(229–316) and α5-(350–448) appear dependent on the presence of cations (Fig. 4, B and D). In agreement with their amino acid compositions, α5-(229–316) and α5-(350–448) markedly differ by their near UV spectra. The presence of the Trp band at 290 nm in the CD spectrum of α5-(350–448) that is absent in the α5-(229–316) spectrum is consistent with the occurrence of a unique Trp residue at position 406 in the α5 cation-binding domain (see Fig. 9B).
The apoforms of α5-(229–316) and α5-(350–448) were then titrated with divalent cations. As shown in Fig. 5 [...]
Fibronectin Binding Properties-We used the binding assay depicted in Fig. 6A (see also "Experimental Procedures") to assess the ligand binding properties of our α5 recombinant fragments using the fibronectin recombinant fragment FNIII8–10 expressed as a GST fusion protein (38). This fibronectin fragment contains the RGD adhesive motif, located in III10 (4), as well as the synergistic, topologically homologous site located in III9 (17, 18). As shown in Fig. 6B, α5-(229–448) formed a complex with GST-FNIII8–10 at Ca2+ concentrations above 0.2 mM. Very similar results, if not identical, were obtained with Mg2+ ions (not shown). A quantitative comparison between the immobilization assays and the CD-monitored cation titrations was carried out since similar protein concentrations were used in both experiments (see "Experimental Procedures"). Interestingly, transition II in Fig. 3 displays a midpoint of 0.16 mM, close to the divalent cation concentration of 0.2–0.4 mM corresponding to the onset of immobilization of α5-(229–448). In contrast, neither α5-(229–316) nor α5-(350–448) formed a complex with fibronectin, even under saturating concentrations of divalent cations (not shown). As shown in Fig. 6C (lane 4), the complex is dissociated in the presence of the cyclic RGD-containing peptide, G(Pen)*ELRGDGWC* (the symbol * denotes cyclization), which is known to compete with the intact FN ligand for binding to integrin α5β1 (46). This indicates that the RGD motif participates in the recognition of the FN ligand by α5-(229–448). In an attempt to delineate the respective roles of the Arg and Asp residues in the RGD motif in the formation of the α5-(229–448)·FNIII8–10 complex, we investigated the competitiveness of compounds that mimic the guanidinium and carboxylate groups in Arg and Asp, respectively. As shown in Fig. 6C, the recombinant IN·FN complex is dissociated in the presence of n-butyrate, indicating that a simple carboxylic compound is able to dissociate the IN·FN complex. In contrast, no dissociation of the complex was observed in the presence of equivalent or higher concentrations of guanidinium chloride.
The stoichiometry of the α5-(229–448)·GST-FNIII8–10 complex was determined through covalent cross-linking (see "Experimental Procedures"). Under saturating concentrations of divalent cations, a major peak appears by gel filtration close to the exclusion volume with an estimated molecular mass of 80 kDa (Fig. 7A), in agreement with the formation of a 1:1 complex. The high molecular weight species is no longer observed in the presence of an excess of the G(Pen)*ELRGDGWC* peptide. We then used the cross-linking procedure to determine if the synergistic motif found in III9 was also necessary for the recognition of the fibronectin ligand by α5-(229–448) to occur. As shown in Fig. 7B, no complex was observed even in the presence of excess divalent cations using the recombinant GST-FNIII10–11 fragment that lacks III9 as a ligand. Similar results (not shown) were observed with FNIII10–11, in the absence of GST, indicating that the proximity between GST and the RGD-containing III10 [...] binding properties unaffected in comparison to the wild-type protein, the mutated protein was used to monitor structural variations of the ligand upon formation of the IN·FN complex using the FN Trp residues as specific conformational probes. As shown in Fig. 8A, the difference CD spectrum obtained after subtracting the sum of the spectra of α5-(229–448) and FNIII8–10, taken separately (1:1 molar ratio), from the spectrum of the complex itself (input molar ratio of 1, in the presence of 5 mM CaCl2) shows three well defined peaks at the level of the Phe and Trp CD bands. No effect was observed at the level of Tyr. In the absence of Ca2+, the difference spectrum strictly corresponds to a base-line profile (Fig. 8B), in agreement with the fact that no complex is formed under such conditions (see above and Fig. 6B). We note that the difference spectrum under the conditions of Fig. 8A corresponds to a null profile in the far UV region (not shown), indicating that no changes in secondary structure occur upon complex formation for either of the two molecular components. The CD spectrum of FNIII8–10, which is not presented in this work, is typical of a β-strand protein, in agreement with the three-dimensional structure of the type III modules in FN (15, 17). A secondary structure content of 55–60% of β-strand residues, with no α-helix contribution, was inferred from the far UV CD spectrum of FNIII8–10, in agreement with the tertiary structure data. Finally, we checked that addition of Ca2+ to FNIII8–10 in solution did not have any influence on the CD spectrum of this fibronectin fragment.

DISCUSSION
The search for a minimal ligand-competent region from the α5 EF-hand domain was guided by the results of Gulino et al. (12), who initially demonstrated that the recombinant fragment αIIb-(171–464) encompassing the four EF-hand type sequences of αIIb mimics the essential features of fibrinogen recognition by integrin αIIbβ3. Both proteins, αIIb-(171–464) and α5-(229–448), are shown in Fig. 9A after alignment of their 4 EF-hand type sequences. Since the canonical EF-hand corresponds to a 30-residue HLH motif with each helix of about 10 residues in length (47), the rationale of our minimal construct was to design the N-terminal EF-1 loop preceded by a 10-residue segment, as a putative E-1 helix, and the C-terminal EF-4 loop elongated by a peptide segment in order to accommodate a putative F-4 helix (see Fig. 9B). Similarly, the rationale in designing the two half-domains was to isolate two pairs of EF-hands by terminating the N-terminal half-domain, α5-(229–316), after the putative F-2 helix and starting the C-terminal half-domain, α5-(350–448), at the level of Arg-350, thus including the putative E-3 helix (see Fig. 9B).
Our CD data with α5-(229–448) demonstrate for the first time that this domain displays a well defined secondary structure. As much as 30–35% of the residues in α5-(229–448) belong to helices, corresponding to 66–77 helical residues among 220 in total (Table I). This is compatible with the occurrence of four EF-hand type HLH motifs, as predicted by the CaM model (22). The occurrence of two separate lobes in α5-(229–448) displaying a similar secondary structure organization is also consistent with this model. The two lobes appear to be conformationally independent, as indicated by the fact that the CD signal of the unique Trp-406 in α5-(229–448) remains unaltered when the low affinity sites are filled by divalent cations after the high affinity sites (see Fig. 3). Although some of the features reported under "Results" point to a resemblance between α5-(229–448) and a four EF-hand protein, such as CaM, a marked difference appears at the level of the number of residues in β-strands. The α5 cation-binding domain displays a significant amount of β-strand residues (20–25%), unlike the typical EF-hand proteins, which have a very low β-strand content. Moreover, as apparent in Fig. 9B, the linker segments between two consecutive HLH EF-hand motifs in α5-(229–448), with lengths of 20–35 residues, are longer than those observed in the typical EF-hand proteins (less than 10 residues; see Ref. 47). All these features lead to the conclusion that the tertiary fold of the α5 cation-binding domain must deviate markedly from the typical organization of CaM. Finally, the amount of α-helix determined in α5-(229–448) is inconsistent with the β-propeller model recently proposed to account for the tertiary fold of the N-terminal half sequence of an integrin α subunit, including the cation-binding EF-hand domain, which only predicts β-strands and coils (23).
The comparison of the amino acid sequences of the E and F segments flanking each central EF-hand loop offers some clues as to the occurrence of four putative HLH motifs in the cation-binding domain of α5. The E segments display two highly conserved hydrophobic loci at positions i and i + 4, defined relative to the EF-hand loop sequence, in all integrin α subunit sequences (α5, αv, and αIIb were selected as representative sequences in Fig. 9C). Hydrophobic contacts between the side chains of residues i and i + 4 could stabilize, among others, the putative E helices. Interestingly, the E segments are systematically phased differently in the N-terminal ("1 + 2") and the C-terminal ("3 + 4") lobes. The difference involves a shift by one position in the amino acid sequences of the i/i + 4 hydrophobic cluster relative to the adjacent EF-hand loop. The prediction of the F helices appears to be more problematic. A Pro residue is systematically present in the central region of each segment F-1 to F-3 and occupies the relative position 15 in most integrin α subunits (Fig. 9C). Interestingly, the F helix in the first EF-hand of parvalbumin (labeled the D helix) is kinked at the relative position 15 in all parvalbumin crystal structures (Ref. 48 and references therein). Such a kinked conformation is fully compatible with the occurrence of Pro-15, as suggested by our results from molecular modeling with parvalbumin, 3 so that the Pro-15-containing F segments in the integrin α subunits could adopt a helical, although distorted, conformation. The exact significance of all these features at the primary structure level will only be revealed through the determination of the tertiary fold of an integrin EF-hand domain. 4
As shown under "Results" (Fig. 7), α5-(229–448) interacts with an RGD-containing ligand to form a well defined 1:1 complex. The determination of this stoichiometry is an important result if one considers that several possibilities can be envisaged to account for an IN·FN complex. According to a current model (49), an RGD-containing ligand is assumed to coordinate directly a divalent cation (bound to the α-integrin EF-hand domain) through its aspartate group in place of the missing oxygenated residue at the relative position 12 in the integrin EF-hands (see Introduction). A given RGD-containing ligand could then interact with each integrin EF-hand so that more than one ligand molecule could interact with the isolated EF-hand domain. The 1:1 stoichiometry observed in our work is likely to reflect specific protein-protein interactions giving rise to a unique complex, as indicated by the fact that the stability of the FNIII8–10·α5-(229–448) complex involves both the RGD sequence and the synergistic site, in FNIII10 and FNIII9, respectively. 5 Finally, we demonstrate that complex formation is associated with a conformational rearrangement of one (or several) of the fibronectin modules, as indicated by the CD variations at the level of the FN Trp chromophores using our Trp-free α5-(229–448)W406L mutant (Fig. 8). This suggests that the formation of the ligand-integrin complex is characterized by an induced fit process.
The binding of FNIII8–10 to α5-(229–448) occurs above a given Ca2+ concentration that coincides with the occupation of the low affinity sites by divalent cations (Fig. 6). The fully cation-loaded α5-(229–448) protein thus corresponds to the only ligand-competent state. Such a conclusion is in agreement with a previous study with the α subunit of integrin αIIbβ3, which established that fibrinogen binding to recombinant αIIb-(171–464) coincides with the occupation of the low affinity binding sites by Ca2+ subsequent to that of the high affinity ones (12).
FIG. 9. Amino acid sequences of the integrin EF-hand type domains. A, comparison of the αIIb (12) and α5 recombinant fragments. The EF-hand type loops are given in open boxes. B, sequence of the EF-hand type domain from the human α5 integrin subunit comprised between residues 229 and 448 (numbers below the sequence correspond to residues according to their position in the mature protein; numbers above the sequence correspond to positions relative to the first putative cation-liganding residue in each EF-hand type loop, EF-1 through EF-4). The four putative divalent cation-binding sites (EF-hand type sequences labeled EF-1 to EF-4) are underlined. The cation-liganding residues are numbered according to their relative position taking the first liganding residue at the relative position 1. Putative α-helical segments are underlined with a dashed line. C, amino acid sequence alignments of the E segments and the F segments of the human integrin subunits α5 (SwissProt accession number, P08648), αv (SwissProt accession number, P06756), and αIIb (SwissProt accession number, P08514). The hydrophobic clusters i/i + 4 are framed in open boxes, as are the FG and GAP consensus sequences (23).
Based on our CD results, three different conformational states of α5-(229–448) are defined according to the degree of occupancy of the cation-binding sites by Ca2+ or Mg2+ as follows: (i) the apoform, characterized by a loose packing of all its hydrophobic residues, which is unable to recognize any ligand; (ii) an intermediate form with its high affinity sites occupied by divalent cations, characterized by the formation of a first hydrophobic core within lobe "3 + 4", which displays no ligand competence; (iii) a fully cation-loaded form, which is ligand-competent and is characterized by the formation of a second hydrophobic core in lobe 1 + 2 upon filling of the low affinity sites by divalent cations. X-ray crystallographic and NMR studies are in progress using different recombinant proteins that encompass all four EF-hands in α5 or separate pairs of EF-hands (lobe 1 + 2 and lobe 3 + 4, respectively). The determination of these tertiary structures will help decipher the structural organization of this ligand-competent state that appears as a minimal domain contributing to the binding of fibronectin by integrin α5β1.
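The three-state scheme described above can be summarized as a small lookup. The sketch below is only an illustration: the class and function names are hypothetical, and treating site occupancy as two yes/no flags is a simplifying assumption, since the text describes a graded titration. The case in which the low affinity sites are filled before the high affinity ones is not described in the text and is therefore rejected here.

```python
from enum import Enum


class CationState(Enum):
    """Conformational states of alpha5-(229-448) as described in the text."""
    APO = "apoform, loosely packed hydrophobic residues"
    INTERMEDIATE = "high affinity sites filled, hydrophobic core in lobe 3 + 4"
    FULLY_LOADED = "all sites filled, second hydrophobic core in lobe 1 + 2"


def classify_state(high_affinity_filled: bool, low_affinity_filled: bool) -> CationState:
    """Map site occupancy (simplified to booleans) onto the three states."""
    if low_affinity_filled and not high_affinity_filled:
        # Assumption: the text only describes low affinity sites filling
        # after the high affinity ones, so this combination is not modeled.
        raise ValueError("occupancy pattern not described in the text")
    if high_affinity_filled and low_affinity_filled:
        return CationState.FULLY_LOADED
    if high_affinity_filled:
        return CationState.INTERMEDIATE
    return CationState.APO


def is_ligand_competent(state: CationState) -> bool:
    """Only the fully cation-loaded form is reported to bind the ligand."""
    return state is CationState.FULLY_LOADED


for high, low in [(False, False), (True, False), (True, True)]:
    state = classify_state(high, low)
    print(state.name, "ligand-competent:", is_ligand_competent(state))
```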
Our results with the α5 cation-binding domain lead us to consider a situation that could be judged paradoxical with regard to the respective roles of the different ligand-binding elements identified so far on both α and β integrin subunits. Several regions in the αβ heterodimer have been identified as potential ligand-binding loci. Using integrin chimeras, Loftus et al. (50) noted that the ligand recognition specificity of both β3 integrins, αvβ3 and αIIbβ3, is regulated by the N-terminal portion of the α subunit including the first five N-terminal repeats (W1 to W5, as defined in Ref. 23), a region that also appears to be involved in the recognition of the fibronectin synergistic motifs (51). As stated by Loftus et al. (50), the EF-1 and EF-2 sites (fourth and fifth W repeats, respectively) appear to play a key role, as also established in this work. Point mutations in the third repeat in α5 or αIIb result in a blocking of the binding to the ligand (52,53). However, caution needs to be exercised since the heterodimeric structural integrity could be perturbed by introducing such mutations. The specificity is not restricted to the α subunit N-terminal domain, and the β subunit is also involved in the process of ligand recognition. In apparent contrast with the results of Loftus et al. (50), Takagi et al. (54) showed that it is a short sequence in the β subunit N-terminal I-type domain that confers ligand specificity to the integrin receptor, and mutations in this domain alter the ligand binding properties of the whole integrin receptor (8,9). Moreover, Alemany et al. (55) have provided evidence that an isolated fragment from the N-terminal region of β3 encompassing residues 56–231 binds soluble fibrinogen in the presence of Ca2+. Finally, McKay et al. (56) have produced a recombinant minireceptor encompassing segments αIIb-(1–233) and β3-(111–318) that binds RGD-containing ligands, although it does not contain any element from the αIIb region that had been previously shown to be involved in ligand binding, i.e. the EF-hand type domain (12). A more precise delineation of the putative RGD-binding elements in β3 comes from a previous study by D'Souza et al. (57), who used the short fragment β3-(118–131) that forms a complex with an RGD-containing peptide in the presence of Mn2+. This region of β3 includes the totally conserved DDL sequence that is directly involved in the interaction with RGD peptides (58). All these results, taken together, suggest that the integrin ligand-binding pocket is a structure assembled from different elements from both α and β subunits, thus explaining why multiple integrin regions contribute to ligand recognition. A general caveat to be considered in most of the ligand binding experiments reported so far is related to the energetic aspects of the individual molecular contacts that are at play between the ligand and its receptor. Binding energies need to be measured so that each individual contact can be compared with the free energy of ligand binding with the more integrated systems (e.g. native heterodimeric assemblies). Indeed, the ligand binding site of a heterodimeric integrin is likely to involve a subtle synergy of elementary interactions that will contribute to the global free energy of ligand binding.
As illustrated in Fig. 10, one possibility to account for the dual binding of an RGD-containing ligand to both α and β integrin subunits would be that, on the one hand, the RGD aspartyl residue directly interacts with a divalent cation bound to the α subunit EF-hand domain, whereas on the other hand, the arginyl residue interacts with the negatively charged site DDL in the I-like domain of the β subunit (see above), possibly through a "carboxylate-guanidinium" bridge. The fact that the EF-hand domains isolated from αIIb (12), from α5 (this work), and from αv 6 recognize RGD-containing ligands suggests that a common mechanism underlies the interactions between the ligand molecule and the integrin α subunit cation-binding domain. Such a mechanism could be the one proposed in Fig. 10 where the aspartyl carboxylate in the ligand RGD motif directly coordinates one of the divalent cations bound to the integrin EF-hand type domain. In agreement with this model, the complex between α5-(229–448) and fibronectin ligands is destabilized in the presence of a carboxylate-containing molecule, n-butyrate, whereas no dissociation is observed with guanidinium chloride (see Fig. 6). These simple compounds were used to mimic the side chains of Asp and Arg, respectively, in the RGD motif. Moreover, measurements of paramagnetically induced water NMR relaxation effects with α5-(229–448) substituted by Mn2+ have shown that upon addition of an RGD-containing peptide to the binary integrin·Mn2+ complex, the cation-coordinating water is lost at the level of the EF-1 and/or EF-2 sites, 6 in agreement with the view that the RGD carboxylate acts as the missing coordinating residue (49).
6 J. L. Banères, M. Green, M. LeCalvez, and J. Parello, manuscript in preparation.
FIG. 10. Hypothetical model of the interaction between the RGD adhesion motif in the ligand protein and its integrin αβ heterodimeric receptor. A, the RGD sequence is schematically presented as a hairpin conformation (see B) within an interstitial domain between both integrin α and β subunits, with its negatively charged aspartyl carboxylate group (⊖) coordinating one divalent cation (++) bound to an EF-hand-type site of the α subunit and its positively charged arginyl guanidinium group (⊕) interacting with a negatively charged site in the I-like domain of the β subunit (see "Discussion"). The drawing includes the heterodimeric globular head of the extracellular domain of the integrin, which is made up of both half N-terminal regions from the α and β subunits. B, unbound conformation of the RGD tripeptide motif of fibronectin observed in the crystal structure of the FNIII7–10 fragment (Ref. 17; PDB accession number, 1FNF).
All these results strongly support the view that the missing Glu-12 residue in the integrin EF-hands (see Introduction) results in the possibility of forming a stable contact between the RGD carboxylate and the integrin-bound cation. As expected, a typical EF-hand protein, parvalbumin, with two Glu-12-containing EF-hand sites, is not able to recognize the fibronectin ligand FNIII8–10 (assay as in Fig. 6, not shown). Finally, the introduction of Glu-12 in the EF-1 sequence of α5-(229–448) through site-directed mutagenesis leads to the loss of FN ligand recognition while reinforcing the affinity of EF-1 for divalent cations. 6 The model presented in Fig. 10 would then reconcile the observations known so far concerning the binding of an RGD-containing ligand to both the α and β integrin subunits and could be tested, in principle, by using a recombinant heterodimer, including the N-terminal regions from both α and β subunits. 4