Crystal Structure of the von Willebrand Factor A1 Domain and Implications for the Binding of Platelet Glycoprotein Ib*

von Willebrand Factor (vWF) is a multimeric protein that mediates platelet adhesion to exposed subendothelium at sites of vascular injury under conditions of high flow/shear. The A1 domain of vWF (vWF-A1) forms the principal binding site for platelet glycoprotein Ib (GpIb), an interaction that is tightly regulated. We report here the crystal structure of the vWF-A1 domain at 2.3-Å resolution. As expected, the overall fold is similar to that of the vWF-A3 and integrin I domains. However, the structure also contains N- and C-terminal arms that wrap across the lower surface of the domain. Unlike the integrin I domains, vWF-A1 does not contain a metal ion-dependent adhesion site motif. Analysis of the available mutagenesis data suggests that the activator botrocetin binds to the right-hand face of the domain containing helices α5 and α6. Possible binding sites for GpIb are the front and upper surfaces of the domain. Natural mutations that lead to constitutive GpIb binding (von Willebrand type IIb disease) cluster in a different site, at the interface between the lower surface and the terminal arms, suggesting that they disrupt a regulatory region rather than forming part of the primary GpIb binding site. A possible pathway for propagating structural changes from the regulatory region to the ligand-binding surface is discussed.

von Willebrand Factor (vWF) is a multimeric protein that mediates platelet adhesion to exposed subendothelium at sites of vascular injury (1). The adhesive properties of vWF are tightly regulated so that plasma vWF does not normally interact with circulating platelets. vWF, however, will bind to platelets after it is "activated" by poorly understood conformational changes that occur after it binds to the vessel wall. A reduction in the plasma concentration of vWF or mutations that impair binding, activation, or assembly of vWF multimers cause von Willebrand disease (vWD), a common bleeding disorder characterized by decreased platelet adhesion and mucocutaneous bleeding (2).
vWF-dependent adhesion of platelets to the vessel wall, under the high flow/shear conditions present in circulating blood, is mediated by sequences within the first (A1 domain) and third (A3 domain) A type repeats of vWF. The A1 domain (residues 479-717) binds to platelet glycoprotein Ib·IX·V complex (GpIb), subendothelial heparans, cell surface sulfatides (reviewed in Ref. 3), and the non-fibrillar collagen type VI (4). The vWF-A3 domain contains the principal site for binding the fibrillar collagens types I and III (5,6).
Although initially noted in the primary sequence of vWF, the A domain has been subsequently discovered in a large number of cell matrix-associated or adhesive proteins and receptors (7). For example, varying numbers of A domains are found in several of the atypical, short chain collagens. A single A domain is inserted into the sequence of several integrin receptors, where it is generally referred to as the I domain. A/I domains are frequently involved in either cell adhesion or cell ligand interactions. In 1995, we reported the crystal structure of the first family member, the I domain of the leukocyte integrin αMβ2 (8), and the crystal structures of several A domains have now been solved (9-12). This work has provided new insights into how A domains mediate cellular adhesion and facilitates detailed structure-function studies. All A domains have a very similar structure comprising a variant of the dinucleotide-binding fold. The integrin I domains contain a metal ion-dependent adhesion site (MIDAS) on the upper face of the domain that is an important element of ligand binding (8, 13-15). In contrast, the vWF-A3 domain does not bind metal and does not require metal for binding to collagen (10).
To advance studies of vWF binding to platelet GpIb and to gain more understanding of the molecular switches that activate vWF, we have solved the crystal structure of the vWF-A1 domain and in this paper correlate its structure with existing biochemical and mutational data.

EXPERIMENTAL PROCEDURES
Purification and Crystallization-Recombinant vWF-A1 containing residues 475-709 of mature vWF and 12 residues at the N terminus from the expression vector (MRGSHHHHHHGS) was expressed in Escherichia coli, refolded, and purified as follows. Our previously published technique (16) was used to transform cells, induce protein, and harvest inclusion bodies. Next, the washed pellet was solubilized by the addition of 6.5 M guanidine hydrochloride in 50 mM Tris-HCl, pH 7.5. The solubilized protein was diluted 40-fold in 50 mM Tris-HCl, 500 mM NaCl, 0.2% Tween 20, pH 7.8. It was passed over an Ni2+-chelated Sepharose (Pharmacia) column equilibrated with 25 mM Tris-HCl, 200 mM NaCl (pH 7.8) buffer. vWF-A1 protein eluted from the column with 350 mM imidazole. The isolated protein was adsorbed to and eluted from a Heparin-Sepharose column (Amersham Pharmacia Biotech). The highly purified protein was dialyzed against 25 mM Tris-HCl, 150 mM NaCl, 0.05% Tween 20, pH 7.8. This protein failed to produce crystals suitable for x-ray analysis.
The protein (0.4 mg/ml) was next treated with immobilized α-chymotrypsin (Sigma) in 0.1 M Tris, pH 8.0, 0.15 M NaCl, 0.1% Triton X-100 for 24 h at 4°C with constant agitation, loaded onto a Heparin-Sepharose column (Pharmacia) equilibrated with 0.1 M Tris, pH 8.0, 0.15 M NaCl, and then eluted with 0.1 M Tris, pH 8.0, 0.6 M NaCl. Finally, the protein was diluted 3-fold with water and concentrated to 4 mg/ml. The molecular mass estimated by SDS-polyacrylamide gel electrophoresis decreased from 27 to 24 kDa after chymotrypsin digestion. In the crystal structure (see below), the C terminus is ordered to within 4 residues of the C terminus of the expressed domain, and at the N terminus the first residue visible in the electron density map is Asp498. Chymotrypsin cleaves specifically after aromatic residues, and Tyr495 at the N terminus is the only aromatic residue that is not visible in the final electron density map. Cleavage after Tyr495 and the loss of 33 residues from the N terminus gives a predicted size of 24.5 kDa, consistent with the size estimated by SDS-polyacrylamide gel electrophoresis. Other aromatic residues are presumably protected from cleavage by the folded conformation of the domain. Cleavage did not detectably perturb vWF-A1 binding to GpIb or its ability to inhibit ristocetin-induced platelet aggregation by full-length vWF (data not shown).
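The residue bookkeeping behind the proposed cleavage site can be checked with a short sketch. The construct boundaries, the vector-derived tag, and the Tyr495 cleavage position are taken from the text; the average residue mass of 110 Da is a generic assumption introduced here for illustration, so the resulting masses are only rough estimates and are not expected to reproduce the sequence-based 24.5 kDa value exactly.

```python
# Residue bookkeeping for the chymotrypsin cleavage described in the text.
TAG = "MRGSHHHHHHGS"          # vector-derived N-terminal residues (from the text)
FIRST, LAST = 475, 709        # vWF residues present in the expressed construct
CLEAVAGE_AFTER = 495          # proposed chymotrypsin cut after Tyr495
AVG_RESIDUE_MASS_DA = 110.0   # ASSUMPTION: generic average residue mass


def n_residues(first, last):
    """Number of residues in an inclusive numbering range."""
    return last - first + 1


def approx_mass_kda(count):
    """Rough mass estimate from a residue count (assumed average mass)."""
    return count * AVG_RESIDUE_MASS_DA / 1000.0


full_length = len(TAG) + n_residues(FIRST, LAST)        # tag + residues 475-709
removed = len(TAG) + n_residues(FIRST, CLEAVAGE_AFTER)  # tag + residues 475-495
remaining = full_length - removed                       # residues 496-709

print("residues before / removed / after:", full_length, removed, remaining)
print("approx. mass before (kDa):", round(approx_mass_kda(full_length), 1))
print("approx. mass after (kDa):", round(approx_mass_kda(remaining), 1))
```

The count of removed residues (12 from the tag plus residues 475-495) matches the 33 residues stated above, and the rough masses fall close to the 27 and 24 kDa gel estimates.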
Data Collection, Structure Determination, and Refinement-A single crystal was transferred into a cryoprotectant buffer consisting of 0.1 M Tris, pH 8.5, 5% polyethylene glycol 8000, 30% glycerol, and frozen by immersion in a stream of nitrogen at 100 K. Data were collected with a Rigaku RU-200 x-ray generator, focusing mirrors, and an R-AXIS IV imaging plate. Data were reduced with DENZO (17) and scaled with SCALEPACK (17) with an Rmerge of 8.3% and 91% completeness to 2.3 Å resolution. Subsequent calculations were performed using the CCP4 suite of programs (18) unless otherwise noted. A mercury derivative was prepared by soaking crystals in the cryoprotectant buffer plus 5 mM HgCl2 for 48 h. A 2.8-Å data set was collected from this derivative with an Rmerge of 13.4%, a mean isomorphous difference with the native data set of 26.7%, and 99.8% completeness.
Molecular replacement was performed using the crystal structure of the integrin α2-I domain (11), stripped of loops and side chains, as the search model. The highest peak in the cross-rotation function gave the correct orientation of the monomer. The translation function gave a stronger top peak in space group P6₁ than in P6₅, establishing the correct enantiomorph. Rigid body refinement using XPLOR (19) led to an R factor of 52% and a correlation coefficient of 0.45. Using the mercury derivative data and model-derived phases, a difference Fourier revealed the position of two mercury atoms. These sites were refined using HEAVY (20), and phases were calculated using MLPHARE (phasing power = 0.41), leading to an experimental map that was solvent flattened using DM. The heavy atom-derived and molecular replacement-derived maps were then averaged using the program RAVE (21). This averaged map was of higher quality than either of the component maps and was readily interpretable. An initial round of model building using program O (22) allowed insertion of 173 residues including some side chains. This model was subjected to positional refinement using XPLOR, resulting in an R factor (Rwork) of 42.3% for the working set and an Rfree (calculated on 10% of reflections omitted from refinement) of 47.2% at 2.8 Å resolution. Further rounds of positional and B-factor refinement, model building, and extension of the data to 2.3 Å resolution led to an Rwork of 24.3% and Rfree of 31.2%. An anomalous difference Fourier calculated at this stage revealed the presence of an anomalous scatterer (peak height, 4.6σ), presumed to be Cd2+, coordinating histidines His559 and His563 in the βB-βC loop. Water molecules were next added at the positions of Fo − Fc peaks greater than 3σ, where reasonable hydrogen-bonding partners existed.
After applying a bulk solvent correction, the final R-factor was 19.2% for all data between 10 and 2.3 Å (Rfree = 23.8%); the model includes residues 498-705, 93 water molecules, and one Cd2+ ion. All main chain torsion angles lie within the most favored (88.8% of residues) or additional allowed regions (11.3% of residues) of the Ramachandran plot as defined in PROCHECK. There are no cis-prolines. The RMS deviations from ideal bond lengths and angles are 0.011 Å and 1.43°, respectively.
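For reference, the R factors quoted in this section are assumed to follow the conventional crystallographic definition, which the text does not spell out. In the sketch below, F_o and F_c denote observed and model-calculated structure factor amplitudes, and any overall scale factor applied to F_c is omitted for brevity.

```latex
R \;=\; \frac{\sum_{hkl} \bigl|\, |F_o(hkl)| - |F_c(hkl)| \,\bigr|}{\sum_{hkl} |F_o(hkl)|}
```

Under this convention, the working R factor sums over the reflections used in refinement, whereas the free R factor uses the same expression summed only over the 10% of reflections omitted from refinement.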

RESULTS
Structure of the vWF-A1 Domain-We crystallized a recombinant vWF-A1 domain and solved its structure at 2.3 Å resolution (Table I). The overall fold is, as expected, very similar to the von Willebrand Factor A3 domain and the integrin I domain, with a central hydrophobic parallel β-sheet flanked on two sides by amphipathic helices (Figs. 2 and 3). Helices are labeled 1-7 and strands A-F by homology with the αMβ2 I domain (8). There is no equivalent of the second helix, α2, found in the αM and αL I domains. The disulfide bridge linking the N- and C-proximal sequences (Cys509-Cys695) is well ordered. In addition, ordered structure extends 11 residues N-terminal to, and 10 residues C-terminal to, the disulfide bridge.
The domain is 40 Å high, 30 Å wide across the β-sheet, and 40 Å broad. The overall shape is cuboid, with six fairly flat faces; four of these are shown in Fig. 4. (a) The "front" face lies at one edge of the β-sheet and includes side chains from helices α3 and α4, strand βC, and their connecting loops. This face also includes part of the salt-bridge network (see below). (b) The "upper" face lies at the C-terminal end of the β-sheet and is composed of loops connecting the sheet to the flanking helices. This face contains two positively charged residues, Arg524 and His656, packed together at the center of an otherwise polar but uncharged surface ~20 Å in diameter, with charged groups at the periphery. In integrin I domains, this upper face contains the MIDAS motif, consisting of three closely apposed loops that together coordinate a magnesium ion. However, the homologous loops of vWF-A1 are not suitable for metal coordination. (c) The "lower" face includes the N- and C-terminal arms and the loops connecting the α-helices to the N-terminal end of the β-sheet. (d) The "right-hand" face includes helices α4, α5, and α6 flanking one side of the β-sheet, and is highly basic. (e) The "left-hand" face (not shown) includes helices α1, α3, and α7 flanking the opposite side of the β-sheet and is highly acidic. (f) The "back" face (not shown) includes helices α6, α7, and strand βF.
Salt Bridges and Buried Charges-The vWF-A1 domain contains a large number of charged amino acids. Some of these form an elaborate salt-bridge network that wraps around the lower rim of the domain (Fig. 4a). These stabilizing salt bridges may explain the sensitivity of this region to alanine mutagenesis (23).
There are four buried acidic residues: (a) Glu557 from strand βB forms stabilizing salt bridges with two histidine side chains (His559 and His563) from the βB-βC hairpin; (b) Glu626 is buried but sits at the N-terminal end of helix α4, where it may be stabilized by the helix dipole; (c) Asp514 at the bottom of strand βA is buried and is stabilized by salt bridges to Arg552 and Arg611, which form part of the salt-bridge network. A buried salt bridge between an aspartic acid from the bottom of strand βA and an arginine from the α4-βD turn is seen in all of the A/I domain structures that have been studied, suggesting that this is an important element of folding; (d) Asp520 is buried below the upper face of the domain without a stabilizing salt-bridge partner. This residue is homologous in sequence to the aspartic acid of the DxSxS integrin MIDAS motif; it has been suggested that the side chain of Arg524 might form a salt bridge with the buried aspartic acid, substituting for the metal ion, but our crystal structure shows that instead Arg524 points outward into solution, packing against the side chain of His656 from the βE-α6 loop.
N- and C-terminal Arms and von Willebrand Disease Type IIb Mutations-Upstream from the N-terminal strand, βA, the chain makes a 90° turn, with Cys509 three residues from the turning point. Further upstream, the side chains of Phe507, Tyr508, and Ile499 pack against hydrophobic elements on the surface of the domain, and His505 and Glu501 make stabilizing salt bridges with helices α1 and α3. At the C terminus, the α7 helix extends a turn beyond the disulfide bridge, followed by an extended structure that packs against the hydrophobic side of the domain as far as Ala701; in addition, Glu700 makes a salt bridge with Arg511 near the N terminus. Beyond this, a sequence of three prolines in an extended conformation protrudes downward from the body of the domain.
Ordered electron density exists for all residues that have been identified as natural mutation sites leading to the type IIb phenotype ("gain of function," constitutive binding to GpIb). All of these sites lie on the lower surface of the domain at the interface between the body of the domain and the N- and C-terminal arms (see Table II and Fig. 3). Our crystal structure shows that in the wild-type protein, these residues are all involved in salt bridges or hydrophobic packing. The most likely effect of these mutations is that they disrupt the interface between the N- and C-terminal arms and the body of the domain. Scanning mutagenesis has led to the identification of further mutants with a similar phenotype (23). These lie in the same region and include a triple-alanine mutation in the middle of the C-terminal helix (RDE687-689), which breaks salt-bridge contacts with helix α1 and the salt-bridge network. Mutations of Cys509 to Gly or Arg, which break the disulfide bridge, also lead to constitutive GpIb binding (24).
Comparison with the vWF-A3 Domain-The central β-sheets of the A1 and A3 domains overlap very closely, with an RMS deviation on main chain atoms of 0.55 Å for 40 residues (similar comparisons with the integrin I domains also give values in the range 0.5-0.6 Å). Extending the overlap to the entire domain gives an RMS deviation of 1.4 Å for 165 residues in equivalent structural environments, as defined in MULTIFIT (25). The resulting alignment has 21% amino acid identity and is shown in Figs. 1 and 2. The A3 crystal structure lacks the N- and C-terminal extensions found in A1, but the disulfide bridge is similarly located. The α helices are generally similar in length and orientation, with the major differences restricted to the loops connecting strands and helices on the upper and lower surfaces of the domain. vWF-A1 and vWF-A3 both lack the α2 helix found in the integrin αMβ2 and αLβ2 I domains. In A1, helix α7 is preceded by a turn of 3₁₀ helix, whereas in A3, the 3₁₀ helix is replaced by an α helix that is longer by 3 residues. The surface charge distribution is less asymmetric in A3 than in A1, lacking the basic patch on the right-hand face of A1.
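The RMS deviations quoted above summarize how far equivalent atoms lie from one another after two structures have been superimposed. As a minimal sketch, the function below computes that quantity for two coordinate lists that are assumed to be already superimposed and listed in matching order; it does not perform the superposition or the residue matching carried out by MULTIFIT, and the coordinates shown are hypothetical toy values, not data from the A1 or A3 structures.

```python
import math


def rmsd(coords_a, coords_b):
    """RMS deviation between two equal-length lists of (x, y, z) tuples.

    Assumes the two sets are already superimposed and paired in order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical toy coordinates (in Å): set_b is set_a shifted by 0.5 Å along z.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
set_b = [(0.0, 0.0, 0.5), (1.0, 0.0, 0.5), (0.0, 1.0, 0.5)]

print(rmsd(set_a, set_b))  # 0.5
```

Because every atom in the toy example is displaced by the same 0.5 Å, the RMS deviation equals that displacement; in a real comparison the value averages over atoms with different displacements, which is why it rises when the overlap is extended from the β-sheet core to the whole domain.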
The shape of the upper surface of the domain is affected by three changes in the surface loops. (a) At the top of the βA strand, Arg524 in A1 points out into solution, whereas the equivalent A3 residue, Ser938, points inward. The side chain of the next residue, Leu525, points into the interior of the protein; homologous residues in the integrin I domains are similarly oriented. This contrasts with A3, in which Phe939 points out into solution. These two differences give the βA-α1 turn a quite different conformation in A1 than in A3. (b) There is a 1-residue insertion in the α3-α4 loop of A1, which wraps over the top of the βB-βC hairpin, allowing space for the side chain of His559 that replaces a glycine in A3. (c) A1 has a 4-residue insertion in the βD-α5 loop, which includes a turn of 3₁₀ helix. The lower surface is affected by the following changes: (i) in the βC-α3 loop, Leu568 packs more closely into the hydrophobic core than the larger tryptophan in A3, causing a shift of the entire loop; (ii) the end of helix α4 has a different conformation, and the α4-βD loop is two residues shorter in A1; (iii) the α6-βF turn adopts a different conformation.

DISCUSSION
Using our crystal structure of the vWF-A1 domain, coupled with the analysis of naturally occurring and experimentally introduced mutations, we can begin to map the binding sites for GpIb and activators like botrocetin and ristocetin. In addition, with an understanding of the type IIb vWD "gain-of-function" mutations in relation to the ligand-binding surfaces, we can begin to formulate models of the activation process.
[Figure legend fragment] ... (23) or black (26), and a mutant with reduced GpIb binding but normal botrocetin binding is in blue (23). The mutation of KKKK642-645 in the α5-βE loop also reduces binding to heparin (26). For multiple site mutants, spheres are placed near the midpoint of the mutation.
Botrocetin and Ristocetin Binding Sites-Two groups have performed scanning mutagenesis on the A1 domain (23,26), which can be used to help localize the binding sites for these activators. Matsushita and Sadler (23) measured the direct binding of botrocetin to vWF. Ignoring those mutations that involve buried charges and probably cause misfolding of the A1 domain, the remaining mutation sites leading to reduced (<50% normal) binding are located in helices α5 and α6 and adjoining structures: Arg636 in helix α5, Arg629 and Arg632 in the 3₁₀ helix immediately preceding α5, and RLIEK663-667 in the neighboring α6 helix. These data strongly suggest that botrocetin binds to the right-hand face of the domain (Figs. 3 and 4). Mutation of the lysine cluster KKKK642-645 also reduced botrocetin binding (23), but Kroner and Frey (26) report normal botrocetin-induced function for this mutant. Matsushita and Sadler (23) reported four mutants with a selective loss-of-function: loss of ristocetin-induced GpIb binding but normal botrocetin-induced binding. These map near to the upper face of the domain (Fig. 3) and consist mainly of buried (Glu626, Asp520) or partly buried charges (Lys534 salt bridges to Glu531). These mutations, which lead to a loss of stabilizing salt bridges, may cause local instability in the upper surface of the domain that disrupts GpIb binding. The natural mutant (Gly561 → Ser) has the same phenotype (27). Gly561 is part of the tight βB-βC turn at the upper front/edge of the domain. Its main chain torsion angles are not unusual, and modeling suggests that the mutant side chain can point out into solution without severe distortion of the main chain. The ability of botrocetin to overcome the binding deficiency in these mutants may arise from the stabilizing effect of its tight binding to an adjacent surface of the A1 domain.
Kroner and Frey (26) reported three further mutations with the same phenotype, in the region of the type IIb mutations and the lysine cluster KKKK642-645, adjacent to the N- and C-terminal arms. The arms contain proline-rich segments that have been implicated in ristocetin binding (28). It is therefore possible that selective loss-of-function in these mutants arises from defective ristocetin binding.
The GpIb Binding Site-Data from various sources, when linked together, provide some clues to the location and extent of the GpIb interaction site within the A1 domain. First, previously reported studies from our laboratory on the functional properties of a vWF-A1/A3 chimera that contains the N-terminal half of vWF-A1 (as far as Leu598 in the middle of helix α4), but still binds GpIb normally (6), strongly suggest a role for the front half of the domain. Second, the majority of mutations that lead to a selective loss-of-function (described above) cluster at or near the upper/front surface of the domain, in contrast to the gain-of-function mutants, which cluster on the lower surface (see below). Third, Matsushita and Sadler reported that the double alanine mutation at Glu596 and Lys599 on the front surface (helix α4) showed reduced GpIb binding (without affecting botrocetin binding) and suggested that it might form part of the GpIb binding site. Two further natural mutations (Type IIm) with impaired GpIb binding lie in helix α4 and the following loop, but both residues (Phe606 and Arg611) are buried in our crystal structure and the mutations probably lead to destabilization of the folded structure. Overall, the data point to a role for the upper/front surfaces of the domain in GpIb binding, but further work is clearly required to confirm this location.
FIG. 4. Four faces of the vWF-A1 domain, defined from the orientation in Fig. 3. Atoms are shown as spheres, drawn with BALLS, RASTER3D, and RENDER (33,34). Arginines and lysines are in blue, aspartic acids and glutamic acids are in red, and histidines are in green; all other residues are in gray. The proposed binding site for botrocetin includes the residues marked "B," and a possible binding site for heparin is circled. a, front face (same view as Fig. 3). Sites of mutations with impaired GpIb binding (Gly561, Glu596, and Lys599) are labeled. b, upper face at the C-terminal end of the β-sheet. c, lower face including the N-terminal arm. d, right-hand face, including helices α5 and α6.
Type IIb Mutations and Regulation of GpIb Binding-The vWD type IIb mutations, which lead to constitutive binding of vWF to GpIb, are all located at or near the interface between the lower surface of the domain and the N- and C-terminal arms (Fig. 3), and the mutations are expected to break salt bridges and hydrophobic contacts that stabilize this interface. Because these are so numerous, are "gain-of-function," and map to both sides of the interface, it is very unlikely that this region forms a binding surface for GpIb; rather, it is more likely to be involved in the regulation of binding affinity. The separation between this region and the most likely GpIb binding surface raises the question of how structural changes are communicated between the two sites.
A possible clue comes from studies on the homologous integrin I domain. Whereas ligand binding sites have been shown to lie on the upper surface of the domain, Zhang and Plow (29) have made αM/αL I domain chimeras, in which the swapping of sequences in the lower surface of the domain (at sites homologous to the type IIb mutations in vWF-A1) leads to constitutive high affinity ligand binding. A plausible pathway for communicating structural changes from one face of the I domain to the opposite face is provided by the two crystal structures of the integrin αM I domain observed by Lee et al. (30). In these structures, a change in the shape and charge distribution of the upper ligand-binding face that would influence its affinity for ligand is propagated via a large (10 Å) downward shift of the C-terminal helix, α7, to the lower face of the domain. The more extended conformation of the domain was identified with the high affinity state.
Although we have no evidence for such a conformational change in the vWF-A1 domain, we note that several type IIb mutations map either to the C-terminal helix or to a residue that salt-bridges to the helix (Table II). We expect that these mutations disrupt salt bridges and hydrophobic interactions that lock the helix against the body of the domain. Although the N and C termini remain linked by a disulfide bridge in type IIb vWD, a substantial downward shift of the C-terminal helix is theoretically possible, requiring a concerted movement of the N-terminal arm. Thus, by destabilizing the folded conformation of the terminal arms, the type IIb mutations could shift the equilibrium toward the extended conformation. A conformational switch of this kind, in which the high affinity state is identified with a more extended conformation of the A1 domain, could be triggered when high flow/shear unfolds immobilized vWF (31).