The Crystal Structure and Mutational Binding Analysis of the Extracellular Domain of the Platelet-activating Receptor CLEC-2*

The human C-type lectin-like molecule CLEC-2 is expressed on the surface of platelets and signaling through CLEC-2 causes platelet activation and aggregation. CLEC-2 is a receptor for the platelet-aggregating snake venom protein rhodocytin. It is also a newly identified co-receptor for human immunodeficiency virus type 1 (HIV-1). An endogenous ligand has not yet been identified. We have solved the crystal structure of the extracellular domain of CLEC-2 to 1.6-Å resolution, and identified the key structural features involved in ligand binding. A semi-helical loop region and flanking residues dominate the surface that is available for ligand binding. The precise distribution of hydrophobic and electrostatic features in this loop will determine the nature of any endogenous ligand with which it can interact. Major ligand-induced conformational change in CLEC-2 is unlikely as its overall fold is compact and robust. However, ligand binding could induce a tilt of a 3–10 helical portion of the long loop region. Mutational analysis and surface plasmon resonance binding studies support these observations. This study provides a framework for understanding the effects of rhodocytin venom binding on CLEC-2 and for understanding the nature of likely endogenous ligands and will provide a basis for rational design of drugs to block ligand binding.

Platelet activation at sites of vascular injury is critical for primary hemostasis, but can also trigger arterial thrombosis in vascular disease. The C-type lectin-like protein CLEC-2 was recently shown to be expressed on platelets and signaling through CLEC-2 is sufficient to mediate platelet aggregation (1). CLEC-2 was identified by sequence similarity to C-type lectin-like molecules with immune functions, such as the immunoreceptor NKG2D (2). The gene encoding CLEC-2 is located in the human natural killer (NK) complex on chromosome 12, along with the C-type lectin-like receptors NKG2D, LOX-1, and Dectin-1 (3). CLEC-2 is a 32-kDa type II transmembrane receptor, and transcripts have been identified in immune cells of myeloid origin, including monocytes, dendritic cells, and granulocytes, and in liver (2). CLEC-2 has been shown to be a receptor on the surface of platelets for the snake venom toxin rhodocytin, which is produced by the Malayan pit viper Calloselasma rhodostoma (1,4). An endogenous CLEC-2 ligand has not yet been identified. Binding of rhodocytin to CLEC-2 leads to phosphorylation of a single tyrosine residue in a YXXL motif in the intracellular domain of CLEC-2. This phosphorylation promotes the binding of spleen tyrosine kinase (Syk), further downstream tyrosine phosphorylation events, and the activation of PLCγ2 (1). These signaling events result in platelet aggregation.
The ability of rhodocytin or CLEC-2-specific antibodies to trigger platelet aggregation in the absence of other stimuli indicates the potency with which CLEC-2 can modulate platelet activity (1). Therefore, CLEC-2 is a potential therapeutic target in thrombotic cardiovascular disease. In addition, snake bites remain a serious health issue in many countries. Envenomation from the viper family has a mortality rate of up to 15% and in Vietnam there are an estimated 30,000 snake bite victims per year, with the Malayan pit viper responsible for the majority (5). A molecular understanding of the effects of rhodocytin could lead to more effective therapy of snake envenomation. The mechanism of action of venoms is of great biological interest. CLEC-2 was also recently identified as a novel human immunodeficiency virus type 1 (HIV-1) attachment factor, which promotes virus capture by cells and platelets (6). To better understand the ligand binding properties of CLEC-2 and to provide a framework for the rational design of inhibitory drugs, we have crystallized the extracellular domain of CLEC-2 and used synchrotron x-ray diffraction to solve its atomic structure to 1.6-Å resolution. We used this structural information to design alanine mutations of surface residues and have studied the effects of these mutations on ligand binding using surface plasmon resonance.

CLEC-2 Crystallization, Data Collection, and Structure Determination-The extracellular domain of CLEC-2 from residues 96 to 221 was cloned, expressed, purified, and crystallized as described previously (7,8). Preliminary diffraction data were collected to 2.0 Å as described (8). Subsequent molecular replacement was unsuccessful with the programs Molrep, AMoRe, Resolve, and EPMR. Therefore, a composite search model was constructed for molecular replacement with CaspR from the following related proteins (PDB accession codes are in parentheses): LOX-1 (1YPQ), NKG2D (1KCG), DC-Sign (1K9I), Ly49C (1P1Z), Ly49A (1QO3), and DC-SignR (1XPH) (9). The resulting solution was incomplete in two loops. After further crystal optimization, a 1.6-Å dataset was collected from a dissected crystal at the Synchrotron Radiation Source, Daresbury Laboratory, UK. This crystal was grown in 0.1 M bis-Tris pH 6.5, 20% PEG monomethyl ether 5000, and cryoprotected in 20% glycerol. The synchrotron beam was collimated to 50 × 50 μm to focus it on the central high quality portion of the crystal. Data were collected using 1° oscillations and auto-indexed, integrated, and scaled using the HKL2000 programs (10). The volume per unit weight (V_M) of this crystal was 1.81 Å³ Da⁻¹, which is typical of crystals displaying diffraction to this resolution (11). The structure was determined from this dataset using the incomplete CLEC-2 structure from the earlier 2.0-Å dataset as a new CaspR search model. Initial refinement was performed using rigid body refinement and energy minimization algorithms implemented by CNS solve and Refmac5 (12,13). The ARP/wARP program was used to trace and build a model into the refined electron density map (14). Subsequent manual rebuilding and refinement cycles used Coot and Refmac5 (15).
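The packing value quoted above can be related to solvent content through the standard Matthews approximation. The following is a minimal sketch rather than the authors' procedure: the function names are illustrative, the constant 1.23 Å³/Da assumes a typical protein partial specific volume of about 0.74 cm³/g, and any unit-cell volume or copy number passed to the first function would have to come from the deposited data, since cell parameters are not given in this section.

```python
def matthews_coefficient(cell_volume_a3: float, n_copies: int, mol_weight_da: float) -> float:
    """Return V_M (Å^3 per Da): unit-cell volume divided by the protein mass it contains.

    All three arguments are caller-supplied; none of them is taken from the paper.
    """
    if cell_volume_a3 <= 0 or n_copies <= 0 or mol_weight_da <= 0:
        raise ValueError("cell volume, copy number and molecular weight must be positive")
    return cell_volume_a3 / (n_copies * mol_weight_da)


def solvent_fraction(v_m: float) -> float:
    """Estimate the fractional solvent content from V_M.

    Uses the common approximation 1 - 1.23 / V_M, which assumes a protein
    partial specific volume of roughly 0.74 cm^3/g (an assumption, not a
    value reported in the text).
    """
    if v_m <= 0:
        raise ValueError("V_M must be positive")
    return 1.0 - 1.23 / v_m


if __name__ == "__main__":
    # V_M = 1.81 Å^3/Da is the value reported in the text.
    print(f"estimated solvent fraction: {solvent_fraction(1.81):.2f}")
```

Under these assumptions, a V_M of 1.81 Å³ Da⁻¹ corresponds to a solvent fraction of roughly one third, that is, a tightly packed crystal.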
Extensive attempts were made to reduce the discrepancy between R_cryst and R_free (Table 1), including the use of translation/libration/screw refinement parameters, but we were unable to model the three N-terminal residues in the crystallized protein because of poor definition of the electron density in this region. An R_free-R_cryst separation of this magnitude is expected under these circumstances because of poor least-squares minimization (16,17). The difference between our R_cryst and R_free values is within the observed range for structures determined at this resolution (17). Additionally, it was not possible to model the equivalent N-terminal residues of the C-type lectin-like domain in the crystal structures of the related proteins NKG2D, LOX-1, and Ly49C (18-20).
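For readers less familiar with the two residuals compared above, the sketch below shows the conventional definition, R = Σ||F_obs| − |F_calc|| / Σ|F_obs|, evaluated separately over the working reflections (R_cryst) and over a test set excluded from refinement (R_free). It is an illustration only: the function name and the amplitudes in the usage section are hypothetical, and the amplitudes are assumed to be on a common scale already.

```python
from typing import Sequence


def r_factor(f_obs: Sequence[float], f_calc: Sequence[float]) -> float:
    """Crystallographic residual over one set of reflections.

    f_obs and f_calc are structure-factor amplitudes for the same reflections,
    in the same order, and assumed to be scaled to each other beforehand.
    """
    if len(f_obs) != len(f_calc) or len(f_obs) == 0:
        raise ValueError("need two non-empty amplitude lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    if denominator == 0:
        raise ValueError("observed amplitudes sum to zero")
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical amplitudes, invented purely to show the call pattern.
    work_obs, work_calc = [120.0, 85.0, 40.0, 210.0], [115.0, 90.0, 37.0, 205.0]
    free_obs, free_calc = [95.0, 60.0], [88.0, 68.0]
    r_cryst = r_factor(work_obs, work_calc)
    r_free = r_factor(free_obs, free_calc)
    print(f"R_cryst = {r_cryst:.3f}, R_free = {r_free:.3f}, gap = {r_free - r_cryst:.3f}")
```

Because the test reflections never enter refinement, R_free is normally the larger of the two, and the gap between them is the quantity discussed in the paragraph above.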
The structure was deposited in the PDB with accession number 2C6U after Procheck and Whatif validation (21,22). Residues are numbered according to their position in the full-length protein. Electrostatic surface potentials were calculated using Grasp, and images were created using VMD, Molscript, and Bobscript and then rendered with Raster3D (23-27).
Computational Dynamic Analyses-The Dynamite package was used to infer, analyze, and graphically represent the likely modes of motion of CLEC-2 (28). The Concoord method was applied to the crystal structure of CLEC-2 to generate an ensemble of protein structures, which each fulfill a set of inter-atomic distance constraints (28). Tools from the Gromacs software package were used to perform principal component analysis on this ensemble (28). The graphical representations of the dynamic analyses were visualized with VMD (24,28).
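The principal component step described above can be illustrated with a small, dependency-free sketch. This is not the Gromacs implementation used in the study: it assumes the ensemble members have already been superimposed, represents each structure as a flat list of coordinates, and finds only the dominant mode by power iteration. All function names are illustrative.

```python
import math
from typing import List, Sequence, Tuple


def covariance_matrix(ensemble: Sequence[Sequence[float]]) -> List[List[float]]:
    """Covariance of coordinate fluctuations across an ensemble.

    Each member is a flattened coordinate vector (x1, y1, z1, x2, ...),
    assumed to be pre-aligned to a common reference frame.
    """
    n = len(ensemble)
    if n < 2:
        raise ValueError("need at least two structures")
    dim = len(ensemble[0])
    mean = [sum(s[i] for s in ensemble) / n for i in range(dim)]
    centered = [[s[i] - mean[i] for i in range(dim)] for s in ensemble]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dim)]
            for i in range(dim)]


def dominant_mode(cov: List[List[float]], iterations: int = 1000,
                  tol: float = 1e-10) -> Tuple[float, List[float]]:
    """Largest eigenvalue and its eigenvector, by power iteration.

    The start vector (1, 2, 3, ...) is arbitrary; in the unlikely case that it
    is orthogonal to the dominant mode, a different start would be needed.
    """
    dim = len(cov)
    vec = [float(i + 1) for i in range(dim)]
    norm = math.sqrt(sum(v * v for v in vec))
    vec = [v / norm for v in vec]
    for _ in range(iterations):
        new = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(v * v for v in new))
        if norm == 0.0:
            return 0.0, vec  # no variance along the current direction
        new = [v / norm for v in new]
        change = max(abs(a - b) for a, b in zip(new, vec))
        vec = new
        if change < tol:
            break
    # Rayleigh quotient gives the eigenvalue for the converged unit vector.
    cv = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
    return sum(a * b for a, b in zip(vec, cv)), vec
```

Splitting the returned eigenvector into consecutive (x, y, z) triplets gives one displacement vector per atom, which is the kind of per-atom direction and magnitude that a porcupine plot draws as cones.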
Rhodocytin Model Generation and Docking-Models of the α- and β-chains of rhodocytin were constructed using the Swiss-Model Comparative Protein Modeling program (see Supplementary Materials for more details) (29). The αβ-rhodocytin dimer was manually assembled by superimposing the α- and β-subunits of rhodocytin onto a template of the related C-type lectin snake venom protein Aa-X-Bp-I using Coot. This model of the αβ-rhodocytin dimer was regularized using energy minimization algorithms implemented by CNS solve (12).
Models of the rhodocytin-CLEC-2 complex were generated using a fast Fourier transform correlation approach implemented in the docking program Dot (30). The Cluspro fast algorithm was used to filter and rank docked conformations with good surface complementarity and low desolvation and electrostatic energies (30). Ranked complexes were subjected to van der Waals minimization using CHARMM to remove potential side chain clashes (see Supplementary Materials for more details of the docking method) (30). To these ClusPro-validated complexes, a set of biological selection criteria was applied to select the most plausible model. Interacting faces were matched on the basis of shape complementarity, surface hydrophobicity, electrostatic compatibility and the location of the CLEC-2 membrane insertion point. From this model of the interaction, and from our previous knowledge of the binding properties of C-type lectin-like molecules, CLEC-2 residues were selected for mutational studies to determine their involvement in ligand binding (31). Surface areas were calculated with GetArea (32).
CLEC-2 Mutagenesis and Surface Plasmon Resonance Binding Studies-Site-directed PCR mutagenesis was used to introduce single alanine mutations in CLEC-2 at residues 132, 150, 168, 171, 184, 187, 188, 190, 192, 200, and 211. These mutated recombinant proteins were expressed, refolded, and purified in the same way as wild-type CLEC-2 and had similar gel filtration profiles and stability (8).
Surface plasmon resonance experiments were performed using a Biacore 3000 machine and were conducted as described previously for NKG2D and its ligands (7). Rhodocytin was purified from the venom of C. rhodostoma as described previously (33,34). Briefly, rhodocytin was covalently attached to CM5 research grade sensor chips (Biacore AB) using amine coupling. The multimeric nature of rhodocytin precludes the acquisition of useful quantitative data with CLEC-2 on the chip. All biosensor experiments were conducted at 25 °C in degassed HBS-EP buffer (150 mM NaCl, 1 mM CaCl₂, 1 mM MgCl₂, 10 mM Hepes, pH 7.4, 0.005% Surfactant P-20). Different concentrations of CLEC-2 mutant and wild-type proteins were injected over all flow cell surfaces. Equilibrium dissociation constants (K_D values) were derived from fitting to the predicted hyperbolic function for a Langmuir 1:1 interaction at each site without cooperativity. Fitting was done using the Levenberg-Marquardt algorithm as implemented in the program Origin (OriginLab). In all biosensor experiments, the signal from flow cells coated with rhodocytin was compared with the signal from control flow cells coated with an irrelevant protein.

CLEC-2 Structure and Function
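The equilibrium analysis described in the binding methods fits the response at each analyte concentration to the 1:1 Langmuir hyperbola, R_eq = R_max · C / (K_D + C). The sketch below is a small hand-written Levenberg-Marquardt fit of that two-parameter model. It is not the Origin routine used in the study; the function names, starting guesses, and the synthetic data in the usage section are assumptions made for illustration, and the responses are taken to be reference-subtracted already.

```python
from typing import Sequence, Tuple


def langmuir(conc: float, r_max: float, k_d: float) -> float:
    """Equilibrium response for a 1:1 interaction without cooperativity."""
    return r_max * conc / (k_d + conc)


def fit_langmuir(concs: Sequence[float], responses: Sequence[float],
                 r_max0: float, k_d0: float, max_iter: int = 200) -> Tuple[float, float]:
    """Least-squares estimate of (R_max, K_D) by damped Gauss-Newton steps.

    K_D is returned in the same units as the concentrations supplied.
    """
    def sse(rm: float, kd: float) -> float:
        return sum((y - langmuir(c, rm, kd)) ** 2 for c, y in zip(concs, responses))

    r_max, k_d, damping = r_max0, k_d0, 1e-3
    current = sse(r_max, k_d)
    for _ in range(max_iter):
        a11 = a12 = a22 = g1 = g2 = 0.0
        for c, y in zip(concs, responses):
            denom = k_d + c
            j1 = c / denom                    # d(model)/d(R_max)
            j2 = -r_max * c / denom ** 2      # d(model)/d(K_D)
            resid = y - r_max * c / denom
            a11 += j1 * j1
            a12 += j1 * j2
            a22 += j2 * j2
            g1 += j1 * resid
            g2 += j2 * resid
        d11, d22 = a11 * (1.0 + damping), a22 * (1.0 + damping)
        det = d11 * d22 - a12 * a12
        if det == 0.0:
            break
        step_rm = (d22 * g1 - a12 * g2) / det
        step_kd = (d11 * g2 - a12 * g1) / det
        trial_rm, trial_kd = r_max + step_rm, k_d + step_kd
        # Reject steps that would make K_D non-positive.
        trial = sse(trial_rm, trial_kd) if trial_kd > 0 else float("inf")
        if trial < current:
            improvement = current - trial
            r_max, k_d, current = trial_rm, trial_kd, trial
            damping = max(damping / 10.0, 1e-12)
            if improvement < 1e-15 * (1.0 + current):
                break
        else:
            damping *= 10.0
            if damping > 1e12:
                break
    return r_max, k_d


if __name__ == "__main__":
    # Synthetic, noise-free responses (hypothetical; concentrations in µM).
    concs = [0.1, 0.3, 1.0, 3.0, 10.0, 30.0]
    data = [langmuir(c, 100.0, 1.0) for c in concs]
    print(fit_langmuir(concs, data, r_max0=max(data), k_d0=2.0))
```

A reasonable starting guess is the largest observed response for R_max and a mid-range concentration for K_D. With real sensorgrams, the control flow cell signal would be subtracted before the fit.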

RESULTS
Structural Overview-High resolution x-ray diffraction data were obtained from crystals of recombinant protein (Table 1). High quality continuous electron density was seen for the entire protein from residue 100 onwards, allowing unambiguous determination of the structure (Fig. 1). The basic architecture of CLEC-2 preserves key features of the C-type lectin-like domain, with two antiparallel β-sheets flanked by two α-helices (Fig. 2A and supplementary Fig. S1). The structure was compared with related receptors that bind protein ligands and for which detailed ligand binding data are available: the oxidized low density lipoprotein (LDL) receptor LOX-1 and the immune receptors NKG2D and Ly49C, which bind major histocompatibility complex (MHC) class I-like molecules (18-20). These structures share the same overall conformation as CLEC-2 (r.m.s.d. values are 1.36 Å over 1049 atoms, 1.74 Å over 996 atoms, and 1.97 Å over 983 atoms, respectively). However, while these related proteins are dimeric, CLEC-2 appears monomeric, as recombinant CLEC-2 was stable in solution as a monomer, and there is no suggestion of dimerization in the crystal structure and lattice packing. Consistent with this, immunoprecipitation of CLEC-2 yields only a monomer (1). Furthermore, both human and mouse CLEC-2 bind rhodocytin as monomers (data not shown). Despite its overall similarity to other C-type lectin-like molecules, distinct features of CLEC-2 are likely to account for its function. Of particular note is an additional tightly packed 3-10 α-like helix in the long loop region (Fig. 2A).
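The r.m.s.d. values quoted above summarize the distance between equivalent atoms after superposition. A minimal sketch of the final step of that calculation is given below. It assumes the two coordinate sets are already superimposed and listed in matching atom order; the superposition itself and the choice of equivalent atoms are not shown, and the function name is illustrative.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """Root mean square deviation (same units as the input, e.g. Å).

    Both lists must hold the same number of atoms in the same order and
    must already share a common frame of reference.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

Because the value depends on how many atoms are paired, each r.m.s.d. is reported together with its atom count, as in the comparison above.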

The Long Loop Region Dominates the Surface Available for Ligand Binding-The long loop region is a flexible, highly variable segment, which is the key determinant of ligand-binding specificity in all structures of related molecules solved to date (31). The solvent accessible surface area of CLEC-2 is 6535 Å² and the semi-helical long loop region dominates ~50% of the upper surface on the face of the molecule opposite to that of the membrane insertion point. CLEC-2 is the first C-type lectin-like structure where this loop contains a formal helix (Fig. 2B) (31). The long loop region of the murine immunoreceptor Ly49C contains a short, three-residue turn with 3-10 helical features and forms a key part of the binding interface with its MHC class I-like ligand H-2Kᵇ (20). In LOX-1, the long loop region forms a basic spine, which interacts directly with apoB100 in oxidized LDL (19). The equivalent loop in NKG2D forms key binding interactions with its multiple protein ligands and is flanked by several additional contact residues which strengthen the interaction (18). The residues in both the loop and flanking regions of NKG2D and CLEC-2 are similar, suggesting that they are likely to play a role in rhodocytin binding by CLEC-2.
CLEC-2 does not bind carbohydrate on rhodocytin, as the venom protein is not glycosylated. This was suggested by computational bioinformatics with N- and O-glycosylation site prediction programs. The absence of glycosylation was verified by SDS-PAGE and confirmed by mass spectroscopy (supplementary Figs. S4 and S5). CLEC-2 contains only one of the five residues which are conserved in the long loop region in related carbohydrate binding molecules (a glutamate at position 187). In the carbohydrate binding molecules, these residues typically coordinate a calcium ion and stabilize the interaction with the sugar (35). There are no associated calcium ions in the CLEC-2 crystal structure. Carbohydrate on CLEC-2 itself is unlikely to play a role in ligand binding, as neither of the two potential N-linked glycosylation sites (at positions 120 and 134) in the structure is sufficiently close to the predicted binding surface. Furthermore, the recombinant protein is expressed in bacteria and is unglycosylated, but can bind to rhodocytin.

FEBRUARY 2, 2007 • VOLUME 282 • NUMBER 5

Conformational Change Is Only Possible in the Long Loop Region-There is considerable interest in the role of conformational change in the binding interactions of related molecules, such as NKG2D, which undergoes minor structural changes upon ligand binding (18,36,37). Major ligand-induced conformational changes in CLEC-2 are unlikely as considerable energy would be required to disrupt the covalent and hydrophobic interactions which maintain its stability (Fig. 3A). Three disulfide bonds, Cys¹⁰²-Cys¹¹³, Cys¹³⁰-Cys²¹⁶, and Cys¹⁹⁵-Cys²⁰⁸, provide covalent stability. Furthermore, three core hydrophobic regions maintain tight, compact folding and stability through adhesive interactions between the delocalized electrons of aromatic side chains (Fig. 3A). The first hydrophobic region contains two stacked tyrosine rings (Tyr¹⁰⁹ and Tyr¹⁴⁸), stabilized by a neighboring tyrosine (Tyr¹¹⁴), a robust arrangement seen in diverse globular protein cores (38). The second hydrophobic core draws the two β-sheets of CLEC-2 together and involves two perpendicular tryptophan rings (Trp¹²³, Trp¹⁵⁸). The third hydrophobic region lies beneath the long loop region and contains a conserved Trp-X-Trp motif in which the tryptophan indole rings interact with a leucine at position 161. The relative orientation of these three hydrophobic cores is stabilized by a further hydrophobic element, the WIGL motif, in addition to the three disulfide bonds (31).
Ligand binding is highly unlikely to disrupt these hydrophobic regions, but could certainly change the long loop region. The low B-factors of the internal hydrophobic cores described above are consistent with them being relatively static zones. The B-factors of the atoms within the long loop region indicate that, while the helical portion is unlikely to become disordered, stretches of the loop residues on either side of it may display more flexibility. Ligand binding to CLEC-2 could alter the conformation of these more flexible segments, causing the helix to tilt on its long axis. The absence of any hydrogen bonds or salt bridges tethering the helix is consistent with this flexibility.
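The link between B-factors and flexibility used in this argument rests on the isotropic relation B = 8π²⟨u²⟩, where ⟨u²⟩ is the mean square atomic displacement. The short sketch below converts a B-factor into a root mean square displacement. The function name and the B-factor values in the usage section are hypothetical and are not taken from the deposited structure.

```python
import math


def rms_displacement(b_factor: float) -> float:
    """Root mean square displacement (Å) implied by an isotropic B-factor (Å^2).

    Uses B = 8 * pi^2 * <u^2>. Real B-factors also absorb static disorder and
    model error, so this is an upper-level guide rather than a measured motion.
    """
    if b_factor < 0:
        raise ValueError("B-factor must be non-negative")
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))


if __name__ == "__main__":
    # Hypothetical values chosen only to contrast a rigid core with a mobile loop.
    for label, b in (("core-like atom", 12.0), ("loop-like atom", 35.0)):
        print(f"{label}: B = {b:.1f} Å^2 -> rms displacement {rms_displacement(b):.2f} Å")
```

Because the displacement scales with the square root of B, a threefold difference in B-factor corresponds to less than a twofold difference in apparent atomic motion.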
This description of the relative flexibilities of distinct regions within CLEC-2 is consistent with computational dynamic analyses of the crystal structure. A covariance web for CLEC-2 illustrates the likely flexibility of the long loop region of CLEC-2 and reveals that its movements are relatively independent from the rest of the molecule (Fig. 3B). This plot also demonstrates that the 3-10 helix is likely to move as an individual unit whose motions are greatly influenced by flexible regions in the loop. In the central and right-hand panels in Fig. 3B, the dominant motion of each Cα atom is represented by a cone whose height and orientation represent the magnitude and direction of the motion, respectively. This porcupine plot highlights a distinct

JOURNAL OF BIOLOGICAL CHEMISTRY 3169
CLEC-2 and rhodocytin (Fig. 4, A and B). A binding site containing the long loop region of CLEC-2 may potentially dock onto a concave surface of rhodocytin with a total buried surface area of 1466 Å². There are 65 ordered water molecules seen at this surface in the CLEC-2 crystal structure and their potential displacement would make a significant favorable entropic contribution to the free energy of binding and so to the affinity of the interaction. The predicted binding surface on CLEC-2 is dominated by residues in the long loop region and its flanking region. Charged residues within and around the long loop region surround two nonpolar patches, which may potentially make hydrophobic interactions with the ligand (Fig. 4, A and B).
To investigate the effect of these charged residues upon ligand binding and to explore the binding surface of CLEC-2, we produced soluble recombinant CLEC-2 proteins with alanine mutations of surface residues (Table 2, Fig. 4C, and supplementary Fig. S6). The molecular masses of all mutants were confirmed by mass spectroscopy as described previously (8). We used surface plasmon resonance to compare their binding affinities for rhodocytin with that of wild-type CLEC-2. We first confirmed that wild-type CLEC-2 does not bind to itself in a homo-interaction. Wild-type CLEC-2 bound rhodocytin with an equilibrium dissociation constant (K_D) of 1.01 μM. This is within the range of affinities for the protein-protein binding interactions of other C-type lectin-like molecules (7,40,41). Proteins with the loop residue mutations K171A, E184A, E187A, D188A, K190A, and N192A displayed a statistically significant reduction in binding affinity for rhodocytin as characterized by their higher K_D values (Table 2). Mutating the flanking residue K150 to an alanine substantially weakened the interaction of CLEC-2 with rhodocytin, with over a 19-fold increase in K_D relative to wild-type.
FIGURE 4. A and B, molecular surface representations of the predicted binding faces of CLEC-2 and rhodocytin are shown in the upper and lower panels, respectively. The orientation of CLEC-2 is identical to that given in the left and central panels of Fig. 3B. Potentially paired interacting regions are numbered accordingly. A, in the left panels, patches of hydrophobic surface residues are colored green. B, in the right panels, negatively and positively charged regions are colored red and blue, respectively. The surface electrostatic potential scale is from −10 to 10 kT/e. C, Cα trace of CLEC-2 is shown in white with the same orientation as the left and central panels of Fig. 3B. The long loop region is colored cyan. Residues mutated to alanine for surface plasmon resonance analysis are represented as balls and sticks. Residues which affect ligand binding when mutated are colored red, and those which have no effect are colored blue.

These observations are consistent with our model of the rhodocytin-CLEC-2 interaction. The long loop region significantly influences binding, with stronger effects from the flexible portions of the loop than from the short helical segment, and dominant contributions from flanking residues. Residues on the opposite face of CLEC-2 or those whose side chains are unavailable for binding by rhodocytin in our model of the complex do not significantly affect the interaction. The differences between the free energy changes (ΔG values) determined for several mutant reactions relative to the ΔG for the wild-type reaction (−8.18 kcal/mol) are consistent with the loss of multiple van der Waals interactions (Table 2). Residue 150 may make a hydrogen bond with rhodocytin, as the ΔG of this mutant reaction differs from that of the wild-type reaction by 1.76 kcal/mol.
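The free energies quoted here follow from the dissociation constants through ΔG = RT ln(K_D), with K_D expressed relative to a 1 M standard state. The sketch below is a worked check rather than the authors' calculation. It assumes a temperature of 25 °C (the temperature of the biosensor experiments) and a wild-type K_D in the micromolar range (1.01 μM), which together reproduce the wild-type value of −8.18 kcal/mol. Function names are illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def delta_g_from_kd(kd_molar: float, temp_k: float = 298.15) -> float:
    """Binding free energy (kcal/mol) from a dissociation constant in mol/L."""
    if kd_molar <= 0:
        raise ValueError("K_D must be positive")
    return R_KCAL * temp_k * math.log(kd_molar)


def delta_delta_g(fold_change_in_kd: float, temp_k: float = 298.15) -> float:
    """Loss of binding free energy (kcal/mol) for a given fold increase in K_D."""
    if fold_change_in_kd <= 0:
        raise ValueError("fold change must be positive")
    return R_KCAL * temp_k * math.log(fold_change_in_kd)


def fold_change_from_ddg(ddg_kcal: float, temp_k: float = 298.15) -> float:
    """Fold increase in K_D implied by a free energy difference (kcal/mol)."""
    return math.exp(ddg_kcal / (R_KCAL * temp_k))


if __name__ == "__main__":
    print(f"wild type: {delta_g_from_kd(1.01e-6):.2f} kcal/mol")
    print(f"1.76 kcal/mol corresponds to a {fold_change_from_ddg(1.76):.1f}-fold change in K_D")
```

Under these assumptions, a difference of 1.76 kcal/mol corresponds to a K_D increase of about 19.5-fold, in line with the more than 19-fold weakening reported for the K150A mutant.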

DISCUSSION
This study presents the crystal structure of human CLEC-2 and a mutational analysis of the interaction with rhodocytin. It provides a firm basis for future studies of the function of CLEC-2. The potential binding surface is dominated by the long loop region and unusually this loop contains a 3-10 helix. Mutational analysis confirms the importance of this loop region in ligand binding.
A key question is how binding of a ligand to the extracellular domain of CLEC-2 triggers intracellular phosphorylation events and ultimately platelet aggregation. Cell surface receptors such as the β-adrenergic receptor transmit signals by undergoing a major conformational change upon ligand binding (42). Ligand binding to CLEC-2 is unlikely to cause a major conformational change in the overall CLEC-2 fold, but could cause changes in the semi-helical long loop region. Molecular dynamics analyses of the long loop region highlight this segment as flexible with motions that are relatively independent from those of the rest of the molecule. These analyses are in line with changes in the binding affinity to rhodocytin, resulting from mutations in these regions. The flexible regions appear to have the capacity to undergo subtle movement resulting in a shift or tilt of the 3-10 helix toward the ligand. The flanking regions are also capable of undergoing some movement, which will open out the binding surface. Thus, ligand interaction may be associated with broadening of the surface available for binding and a tilt in the orientation of the 3-10 helix.
The pattern of charged and hydrophobic residues in the human CLEC-2 long loop and flanking region will impart selective binding properties to CLEC-2 and is conserved in mouse, chimpanzee, cow, dog, and rat CLEC-2. Mutating specific charged residues within and surrounding the loop to alanine significantly reduced the affinity of human CLEC-2 for rhodocytin by up to 19-fold. The free energy changes of the binding reactions of mutated CLEC-2 as compared with those of wild-type CLEC-2 are consistent with the loss of hydrogen bonds and multiple van der Waals interactions with rhodocytin. This suggests a major role for the loop and flanking region in ligand binding. The semi-helical loop region may provide initial recognition of and attachment to its ligand, with binding affinity strengthened by flanking residues. However, as this region is distant from the membrane insertion point, any change in the loop is unlikely to be transmitted through the compact extracellular domain of CLEC-2 to the cytoplasmic domain.
Ligand binding may bring the cytoplasmic signaling domains of several CLEC-2 molecules into closer proximity, thereby facilitating signal transmission. Ligand-induced dimerization is observed in signaling receptors such as the receptor tyrosine kinase EGFR, and rhodocytin may similarly promote dimerization of CLEC-2 (43). The related molecules, NKG2D, CD69, and Ly49C have hydrophobic patches that form a dimer interface (18,20,44). However, in CLEC-2 there is only an alanine and a threonine residue in this region, which do not drive dimerization in solution and would be unlikely to do so at the cell surface. It is possible that this site may play a small role in ligand-induced dimerization. As rhodocytin is multimeric and each subunit will present an equivalent binding surface for CLEC-2, ligand binding may initiate higher order multimerization of CLEC-2 on the platelet surface (45). Such surface clustering of CLEC-2 receptors on platelets would localize their cytoplasmic signaling domains for phosphorylation and recognition by Syk, thereby augmenting signal transduction. As rhodocytin is likely to promote CLEC-2 clustering on the platelet surface, this raises the possibility that an endogenous ligand may also cluster CLEC-2.
This study demonstrates the structure of CLEC-2 and identifies the binding surface, highlighting the role of the long loop region with its 3-10 helix. Rhodocytin appears to have evolved with the capacity to interact with the semi-helical loop and flanking region of CLEC-2, which are highly conserved across species. The structure and mutational binding analysis of CLEC-2 indicate that an endogenous ligand is likely to be a protein with a predominantly negatively charged binding surface and discrete hydrophobic patches. The ligand is likely to be recognized by the CLEC-2 long loop region, with additional contributions from flanking residues. C-type lectin-like domains share the distinctive fold of the carbohydrate recognition domains of C-type lectins, but are not necessarily sugar binding themselves. CLEC-2 binds unglycosylated rhodocytin and is unlikely to bind carbohydrate on its natural ligand because it displays none of the features associated with sugar recognition or binding. CLEC-2 also lacks the WIH β-glucan binding motif found in the related receptor, Dectin-1, and is therefore unlikely to share the same binding capabilities (46).
In addition to offering insights into the binding of both exogenous and endogenous ligands by CLEC-2 on platelets, the structure and mutational binding analysis of CLEC-2 provide a secure foundation for the rational design of inhibitory drugs of great potential benefit in cardiovascular disease and perhaps snake envenomation and HIV-1 transmission. Drugs which target the long loop region would be likely to influence CLEC-2-mediated activation of platelets.