Structural Basis of the Interaction between Chemokine Stromal Cell-derived Factor-1/CXCL12 and Its G-protein-coupled Receptor CXCR4*

The chemokine stromal cell-derived factor-1 (SDF-1/CXCL12) and its G-protein-coupled receptor (GPCR) CXCR4 play fundamental roles in many physiological processes, and CXCR4 is a drug target for various diseases such as cancer metastasis and human immunodeficiency virus, type 1, infection. However, almost no structural information about the SDF-1-CXCR4 interaction is available, mainly because of the difficulties in expression, purification, and crystallization of CXCR4. In this study, an extensive investigation of the preparation of CXCR4 and optimization of the experimental conditions enabled NMR analyses of the interaction between the full-length CXCR4 and SDF-1. Using methyl-utilizing transferred cross-saturation experiments together with the CXCR4-selective antagonist AMD3100, we demonstrated that two binding events take place independently: the binding of an extended surface on the SDF-1 β-sheet, 50-s loop, and N-loop to the CXCR4 extracellular region, and the binding of the SDF-1 N terminus to the CXCR4 transmembrane region, which is critical for G-protein signaling. Furthermore, based upon the data, we conclude that the highly dynamic SDF-1 N terminus in the 1st step bound state plays a crucial role in efficiently searching the deeply buried binding pocket in the CXCR4 transmembrane region by the "fly-casting" mechanism. This is the first structural analysis of the interaction between a full-length GPCR and its chemokine, and our methodology should be applicable to other GPCR-ligand systems for which structural studies are still challenging.

The most abundant splice variant of SDF-1 (SDF-1α) is composed of 68 amino acids, and its NMR (15,16) and crystal structures (17,18) demonstrated that SDF-1α assumes a typical chemokine fold as follows: an unstructured N terminus (Lys1-Tyr7) followed by a long flexible loop (N-loop), a three-stranded anti-parallel β-sheet, and an α-helix. The mutational analyses revealed that although the SDF-1α N terminus is critical for the CXCR4-mediated signaling (15), both the N terminus and the N-loop residues are implicated in the receptor binding (15,18,19). In addition, recent mutational analysis suggested that the residues on the SDF-1α β-sheet are also important for receptor binding (20).
CXCR4, composed of 352 amino acids, belongs to the class A G-protein-coupled receptor (GPCR) family, with the seven transmembrane (TM) helices. Whereas GPCR activation is mediated by the conformational changes in its TM region (21,22), the mutational analyses revealed that the CXCR4 N terminus and the extracellular loops (ECLs) are also involved in the ligand binding (23-25). The recently solved NMR structure of a disulfide-bridged dimeric mutant of SDF-1α, complexed with a CXCR4 N-terminal peptide, revealed the N-terminal peptide-binding modes of SDF-1α (20).
Based on these previous studies, a two-step/two-site binding model has been proposed for the SDF-1α-CXCR4 interaction (15). In this model, two independent interactions are hypothesized as follows: the SDF-1α N-loop interacts with the CXCR4 N terminus, and subsequently the SDF-1α N terminus interacts with the CXCR4 TM region to trigger receptor activation.
However, little is known about the structural basis of the recognition mode of SDF-1α with the full-length CXCR4, and there is no direct evidence that the SDF-1α N terminus and the N-loop interact with structurally independent sites on CXCR4. This is mainly because of the difficulties in overexpression, purification, and crystallization of CXCR4. Although the preparative scale heterologous expression of CXCR4, using a baculovirus expression system, has recently been reported (26), the expression level is not sufficient for standard structural analyses, such as crystallography and NMR. In addition, the solubilization of CXCR4 by detergents, which is required for solution NMR measurements, results in a huge molecular mass of over 100 kDa, which also hampers NMR analyses.
We recently established a novel NMR method, transferred cross-saturation (TCS), to identify the molecular interface, the region involved in the molecular interaction, in a large protein complex (27-29) through the cross-saturation phenomenon, which identifies the proximal residues within 5 Å of their binding partner (30). In this method, the molecular weight limitation in NMR measurements is overcome by properly transferring the cross-saturation effects from the receptor-bound ligands to the free ligands. The advantage of the TCS methods is that only substoichiometric amounts of receptors relative to ligands are required. Moreover, the combined usage of TCS and methyl-utilizing cross-saturation (31), which provides much higher sensitivity for the detection of the cross-saturation phenomena, decreases the amount of receptor required.
Here, we report NMR analyses of the interaction between SDF-1α and full-length CXCR4. We applied the methyl-utilizing TCS method to elucidate the CXCR4-binding site of SDF-1α, and we demonstrated that SDF-1α utilizes an extended surface formed by multiple regions dispersed throughout the protein, consisting of the β-sheet, 50-s loop, and N-loop, in addition to the N terminus, for the receptor binding. We further investigated the effect of AMD3100, which is a selective CXCR4 antagonist that binds to the CXCR4 TM region (32,33), on the SDF-1α-CXCR4 interaction. Consequently, we found that even in the state with SDF-1α bound to CXCR4, the SDF-1α N terminus is released from CXCR4 upon the addition of AMD3100, whereas the other SDF-1α regions still bind to CXCR4. Our data provide the first structural evidence for the existence of two independent interactions between SDF-1α and CXCR4.

EXPERIMENTAL PROCEDURES
Expression and Purification of CXCR4-The cDNA fragment encoding human CXCR4 with the C-terminal 1D4 epitope tag (TTVSKTETSQVAPA) was amplified by PCR and cloned into the pVL1392 vector (Pharmingen) via the XbaI-BamHI sites. Recombinant baculoviruses were generated using a BaculoGold transfection kit (Pharmingen), according to the manufacturer's instructions.
For the large scale expression of CXCR4, 2.8 liters of the expresSF+ cells (Protein Sciences Corp.) in Sf900-II serum-free media (Invitrogen) were grown at 27°C, in a 3-liter spinner flask (Bellco) equipped with a dissolved oxygen controller (Wakenyaku). The cells, at a density of 1.8-2 × 10^6 cells/ml, were inoculated with the high titer virus stock (120 ml per 2.8 liters of cells) and were harvested 48 h post infection.
All of the following procedures were carried out at 4°C. Cells were disrupted by nitrogen cavitation (Parr Bomb) under 600 p.s.i. for 30 min in 250 ml of buffer A (50 mM Tris, pH 8.0, 100 mM NaCl, 1 mM EDTA) with a protease inhibitor mixture (Nacalai Tesque, Inc.), and the lysate was centrifuged at 800 × g for 10 min. The supernatant was centrifuged at 100,000 × g for 60 min, and the resulting membrane pellet was solubilized in 120 ml of buffer B (20 mM Hepes, pH 7.2, 150 mM NaCl, 15% (v/v) glycerol) + 1% n-dodecyl β-D-maltoside (DDM, Dojindo) for 4 h. The solubilized membrane was centrifuged at 75,000 × g for 30 min, and the supernatant was batch-incubated overnight with 2 ml of 1D4-Sepharose beads, in which 3.5-5 mg/ml 1D4 antibodies (University of British Columbia) were coupled to CNBr-activated Sepharose 4B (GE Healthcare). The beads were washed with 60 ml of buffer C (buffer B + 0.1% DDM), and the protein was eluted with 7.5 ml of buffer D (buffer C + 200 μM "TETSQVAPA" peptide (synthesized by Toray Research Center)). The simultaneous concentration and buffer exchange of the eluate were accomplished by a centrifugal filter device (AmiconUltra-15, 30-kDa molecular mass cutoff, Millipore).
Expression and Purification of SDF-1-The cDNA fragment encoding human SDF-1α was amplified by PCR and cloned into the pET-11a vector (Novagen) via the NdeI-BamHI sites to express SDF-1α with an N-terminal methionine extension (MetSDF-1α). All mutants were generated by QuikChange site-directed mutagenesis (Stratagene).
MetSDF-1α was expressed in the Escherichia coli BL21-CodonPlus (DE3) RP strain (Stratagene). Cells were grown at 37°C in M9 minimal medium to an A600 of 0.8-1.0, induced with 1 mM isopropyl 1-thio-β-D-galactopyranoside, and incubated for an additional 6 h. Selective incorporations of 1H and 13C labels into the methyl groups were achieved as described previously (34,35).
Cells were disrupted by sonication in buffer A. Inclusion bodies were isolated by centrifugation and were solubilized in buffer E (50 mM Tris, pH 8.0, 6 M guanidinium hydrochloride, 10 mM dithiothreitol). The protein was refolded by dialysis against buffer F (buffer A + 0.4 M arginine hydrochloride, 1 mM reduced glutathione, 1 mM oxidized glutathione) and was purified by two cycles of reverse phase high pressure liquid chromatography. The N-terminal methionine extension of SDF-1 did not affect the chemotactic activity (data not shown) as reported previously (36).
Pulldown Analysis-The purified CXCR4 was combined with an excess amount of either Cy3-12G5, in which 12G5 antibodies (R & D Systems) were coupled to Cy3 mono-reactive dye (GE Healthcare), or SDF-1α, and with 100 μl of 1D4-Sepharose in 500 μl of buffer C. The mixture was incubated at 4°C for 4 h, and the beads were washed twice with 1 ml of buffer C. The protein was eluted with 500 μl of buffer D. Nonspecific binding of Cy3-12G5 or SDF-1α to 1D4-Sepharose was not detected.
Surface Plasmon Resonance (SPR) Analysis-The FLAG-M5 (Sigma) and 12G5 binding activities of CXCR4 were analyzed by SPR measurements using a BIAcore 2000 instrument (Biacore), as described in the literature (37).
NMR Experiments-NMR spectral assignments were performed using the standard triple resonance experiments (38) with uniformly 13C- and 15N-labeled MetSDF-1α. Assignments of the two diastereotopic methyl groups of leucine and valine residues were achieved as described previously (39). The pulse scheme was as described previously (31) but with the heteronuclear single quantum coherence-type (HSQC-type) sequence with echo/anti-echo gradient coherence selections (40). The irradiation frequency was set at 5.0 ppm, and the maximum radiofrequency amplitude was 0.21 kHz for WURST-20 (the adiabatic factor Q0 = 1) (41). The irradiation time and the additional relaxation times were set to 0.5 and 1.5 s, respectively.
To observe the effect of AMD3100 on the spectra of SDF-1α with an excess amount of CXCR4, 1H-13C heteronuclear multiple quantum coherence (HMQC) spectra (40) were recorded for the following samples: 10 μM (final concentration) of lyophilized [[U-2H]Leu-, Val-(13C1H3, 12C2H3)]MetSDF-1α combined with the buffer only, with 20 μM CXCR4 in the same buffer, and with 20 μM CXCR4 and 1 mM AMD3100 in the same buffer. All of the recorded spectra were processed by Topspin 2.0 (Bruker) and were analyzed by Sparky (69).

RESULTS
Characterization of the Prepared CXCR4-CXCR4 was expressed in insect cells using a baculovirus expression system and was solubilized in 1% DDM. The solubilized CXCR4 with the C-terminal 1D4 epitope tag was purified by 1D4 antibody affinity chromatography. In both SDS-PAGE analysis (Fig. 1A) and Western blotting analysis with 1D4 antibody (supplemental Fig. 1), one major band at ~42 kDa is observed. Considering the molecular weight of CXCR4 (41-43 kDa, including the glycosylation), we conclude that the major band is intact CXCR4. Purity of the intact CXCR4 is >80% as judged from SDS-PAGE (Fig. 1A).
To examine the conformational integrity of the obtained CXCR4, the binding activity with the anti-CXCR4 antibody 12G5, which recognizes ECL2 of CXCR4 in its native conformation (42), was analyzed by SPR experiments (37). The normalized response of 12G5 for CXCR4 immobilized on a sensor chip was about half that of a FLAG-M5 antibody for N-terminally FLAG-tagged CXCR4 (Fig. 1B), suggesting that ~50% of the obtained CXCR4 was correctly folded. A pulldown assay using fluorescently labeled 12G5 revealed that 50-100 μg of the correctly folded CXCR4 was obtained from 1 liter of insect cell culture. We also examined the SDF-1α binding activity of the obtained CXCR4 by pulldown assays. Almost stoichiometric amounts of SDF-1α, relative to the correctly folded CXCR4 (~50% of the ~42-kDa band), were co-precipitated (Fig. 1C), suggesting that the obtained CXCR4 is able to bind to SDF-1α.
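The folded-fraction estimate and the yield figures above reduce to simple arithmetic, which can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names and all numerical inputs are hypothetical, and the sketch assumes that the capture-normalized FLAG-M5 response reports on the total receptor while the capture-normalized 12G5 response reports only on correctly folded receptor. The ~42-kDa mass is taken from the SDS-PAGE band described in the text.

```python
def folded_fraction(resp_12g5, capture_12g5, resp_flag, capture_flag):
    """Estimate the correctly folded fraction of CXCR4 from SPR data.

    Each response is divided by the receptor capture level of its own
    sensor surface, and the 12G5 value is then expressed relative to
    the FLAG-M5 value (assumed here to represent 100% of the receptor).
    """
    norm_12g5 = resp_12g5 / capture_12g5
    norm_flag = resp_flag / capture_flag
    return norm_12g5 / norm_flag


def folded_yield_nmol(mass_ug_per_liter, culture_liters, mw_kda=42.0):
    """Convert a folded-receptor yield (micrograms per liter) to nmol.

    Micrograms divided by kDa gives nmol directly, because
    1e-6 g / (1e3 g/mol) = 1e-9 mol.
    """
    total_ug = mass_ug_per_liter * culture_liters
    return total_ug / mw_kda


if __name__ == "__main__":
    # Hypothetical resonance-unit values chosen only for illustration.
    print(folded_fraction(50.0, 1000.0, 100.0, 1000.0))
    # Yield range for one 2.8-liter spinner culture at 50-100 ug/liter.
    print(folded_yield_nmol(50.0, 2.8), folded_yield_nmol(100.0, 2.8))
```

Under these assumptions, a single 2.8-liter culture would supply only a few nanomoles of folded receptor, which is consistent with the emphasis on methods that need substoichiometric receptor.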
It is well known that detergent-solubilized GPCRs are extraordinarily labile. Because we also found that all of the DDM-solubilized CXCR4 aggregated after 48 h of incubation at room temperature (data not shown), we examined the stability of the solubilized CXCR4 under various conditions. Although an incubation of the sample solution at a low temperature prevented the aggregation of CXCR4 (data not shown), the normalized response of 12G5 binding to CXCR4 by the SPR analyses was reduced to ~60% after 48 h of incubation, even at 4°C (Fig. 1D). Further addition of glycerol to the sample solution markedly prevented the reduction of the response to 12G5 (Fig. 1D). Therefore, we performed the subsequent NMR analyses at a low temperature, in the presence of glycerol, and within 48 h after purification.
FIGURE 1. Characterization of the purified CXCR4. A, purified CXCR4 was loaded on a 12% SDS-polyacrylamide gel. The gel was stained with Coomassie Brilliant Blue. B, normalized SPR responses for FLAG-M5 binding to N-terminally FLAG-tagged CXCR4 (dotted line) and 12G5 binding to CXCR4 (solid line). C, SDF-1 binding activity of the purified CXCR4 analyzed by a pulldown assay. Each fraction was loaded on a 15% SDS-polyacrylamide gel, which was silver-stained. D, normalized SPR responses for 12G5 binding to CXCR4, incubated under various conditions. B and D, the entire sensorgrams were divided by the CXCR4 capture levels (RU_CXCR4). RU, resonance units; MAb, monoclonal antibody.
Methyl-utilizing TCS Experiments-To identify the SDF-1α residues in close proximity to CXCR4, methyl-TCS experiments were carried out. Fig. 2A shows an outline of the methyl-TCS experiment performed in this study. We prepared highly deuterated SDF-1α, with protons selectively incorporated into the 13C-labeled valine, leucine, and isoleucine (δ1 only) methyl groups (34). The 1H NMR spectrum of the labeled SDF-1α showed that the deuteration level was sufficiently high (>98%, data not shown) for the methyl-TCS experiments (31). The methyl-TCS experiments were carried out under conditions with an excess amount (10-fold) of the labeled SDF-1α relative to the CXCR4. In the methyl-TCS experiments, irradiation with a frequency corresponding to the methylene, methine, and aromatic protons (2.5-7.5 ppm) was applied to the mixture of the nonlabeled CXCR4 and the labeled SDF-1α. The saturation caused by the irradiation is not kept within the CXCR4 molecule but is transferred to the SDF-1α residues in close proximity to the CXCR4, through the cross-saturation phenomena (30). If the complex has an exchange rate between the free and bound states that is faster than the longitudinal relaxation rates of the methyl protons of SDF-1α (1-2 s⁻¹), then the saturation in the proximal residues is sufficiently transferred to the free state of SDF-1α. As a result, the proximal residues can be identified by the selective intensity reductions of the methyl resonances in the 1H-13C HSQC spectra of unbound SDF-1α.
FIGURE 2. … Fig. 2). In the upper panels, the black lines show the 1H one-dimensional slices through the Leu55δ1 and Leu62δ2 signals (the corresponding 13C frequencies are indicated with cyan lines in the two-dimensional spectra). The red lines show the same 1H one-dimensional slices with selective irradiations of CXCR4.
DECEMBER 11, 2009 • VOLUME 284 • NUMBER 50

JOURNAL OF BIOLOGICAL CHEMISTRY 35243
To achieve efficient exchange between the free and bound states of SDF-1α, we utilized the SDF-1α mutant R8A/R12A for the subsequent TCS experiments, because the introduction of mutations into Arg8 and Arg12 reportedly reduces the receptor affinity but does not affect the chemotactic efficacy (18,19).
Despite the substantial improvement of the sample preparation procedure, the obtained CXCR4 samples contained significant amounts of denatured CXCR4 and impurities. In addition, DDM micelles were also included in the samples. Therefore, to selectively observe the specific interaction between correctly folded CXCR4 and SDF-1α, we subtracted the nonspecific binding effects from the TCS results by the control experiments, in which the specific binding surface on the correctly folded CXCR4 was selectively blocked by the addition of the stoichiometric amount of perdeuterated wild-type SDF-1α (Fig. 2B). The difference in reduction ratio (ΔRR), which represents the specific interaction between SDF-1α and CXCR4, was calculated for each resonance by subtracting the intensity reduction ratio in the control experiment from that in the TCS experiment. Fig. 2, C and D, show the 1H-13C HSQC spectra of SDF-1α R8A/R12A observed in the TCS and control experiments, respectively. In the TCS experiment, several resonances, including Leu55δ2, showed significantly higher intensity reductions upon irradiation than in the control experiment, whereas other resonances, including Leu62δ1, exhibited similar intensity reductions in the two experiments. The ΔRR was calculated for each resonance (Fig. 3, A and B), and the residues were colored according to their ΔRR values on the SDF-1α structure in the free state (Fig. 3C). … Fig. 3), these resonances were excluded from the following considerations.
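The ΔRR calculation described above can be sketched in a few lines. This is only an illustrative sketch: the function names and peak intensities are hypothetical, and the reduction ratio is assumed here to be 1 - I(irradiated)/I(non-irradiated), a definition the text does not state explicitly. The error treatment follows the rule given in the legend of Fig. 3 (root sum square of noise level/signal intensity in the two spectra, then root sum square of the errors of the two experiments).

```python
import math


def reduction_ratio(i_off, i_on):
    """Signal reduction ratio for one methyl resonance.

    i_off: peak intensity without irradiation.
    i_on:  peak intensity with irradiation.
    Assumed definition: 1 - i_on / i_off.
    """
    return 1.0 - i_on / i_off


def rr_error(i_off, i_on, noise_off, noise_on):
    """Root sum square of (noise level / signal intensity) in the two spectra."""
    return math.hypot(noise_off / i_off, noise_on / i_on)


def delta_rr(tcs, control):
    """Difference in reduction ratio between TCS and control experiments.

    Each argument is a tuple (i_off, i_on, noise_off, noise_on).
    Returns (delta_rr_value, delta_rr_error).
    """
    value = reduction_ratio(tcs[0], tcs[1]) - reduction_ratio(control[0], control[1])
    error = math.hypot(rr_error(*tcs), rr_error(*control))
    return value, error


if __name__ == "__main__":
    # Hypothetical intensities (arbitrary units) for a single resonance.
    value, error = delta_rr((100.0, 70.0, 1.0, 1.0), (100.0, 90.0, 1.0, 1.0))
    print(f"delta RR = {value:.3f} +/- {error:.3f}")
```

Subtracting the control in this way removes any reduction that persists when the specific binding surface is blocked, so only the receptor-specific contribution remains in ΔRR.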
Our relaxation matrix calculations (31) suggested that the ligand methyl protons within 5 Å of the receptor protons exhibited the intensity reductions of more than 0.1. In addition, when the ligand methyl protons are close to each other, the ligand methyl protons more than 5 Å away from the receptor protons also exhibit some intensity reductions, because of the spin diffusion effects within the ligand molecule. However, because of the fast internal motions of methyl protons, the spin diffusion effects should be efficiently suppressed (31). Therefore, the spin diffusion effect should provide only minor effects relative to the cross-saturation between CXCR4 and SDF-1α.
FIGURE 3. … (Fig. 2, A and C), and the green bars represent those from the negative control experiment (Fig. 2, B and D). The ratios for the Val18γ1 and Val49γ2 signals were not determined (N.D.), because these two signals were severely overlapped. The error bars represent the experimental errors, calculated from the root sum square of (noise level/signal intensity) in the two spectra, with and without irradiation. B, differences in the signal reduction ratios (ΔRR values) between the methyl-TCS experiment (purple bars in A) and the control experiment (green bars in A). The error bars represent the experimental errors, calculated from the root sum square of errors in two experiments. Red, light red, orange, and cyan bars represent the signals with ΔRR > 0.150, 0.125, 0.100, and <0.100, respectively. Methyl protons completely buried in the structures are shaded in gray. C, mapping of the affected residues on the SDF-1α structure (Protein Data Bank code 1VMC). The SDF-1α structure is shown in a CPK representation. Isoleucine, leucine, and valine residues are colored according to the ΔRR values of their methyl resonances, as in B. For the leucine and valine residues, the larger ΔRR of their two methyl resonances was utilized.
The deuteration levels of SDF-1α were sufficiently high, and thus most of the effects of the residual protons in SDF-1α have been eliminated from the TCS results. Although the residual proton effect is enhanced in the high molecular weight systems, the enhanced effect was usually <0.1, even in the case of the ~150-kDa complex between fragment B of protein A and IgG (31). Consequently, the high ΔRR values (>0.1) should represent the cross-saturation effect originating from CXCR4.
The cross-saturation effect is relatively insensitive to the irradiation bandwidth, due to the effective spin diffusion within the CXCR4 molecule, whereas the residual proton effect is sensitive to the irradiation bandwidth. Therefore, we carried out the TCS experiments with reduced irradiation bandwidth (4.5-7.5 ppm). Although the residues with low ΔRR values are remarkably affected by the irradiation bandwidth, the residues with high ΔRR values (>0.1), such as Leu29δ2, showed little dependence of ΔRR on the irradiation bandwidth (data not shown), suggesting that the cross-saturation is dominant in residues with high ΔRR values (>0.1).
The methyl resonances with high ΔRR values (>0.125) formed a contiguous surface on the SDF-1α structure as follows: starting from the β1-strand (Val23, Leu26, and Ile28), extending through one side of the β-sheet (Leu29, Val39, and Val49), and ending at the 50-s loop (Leu55). In addition, the resonances from the SDF-1α N terminus (Val3 and Leu5) also exhibited moderate ΔRR values, which should represent the close proximity of these protons to CXCR4, because the effect of spin diffusion would be negligible for these isolated spins. Consequently, we conclude that Val3, Leu5, Val23, Leu26, Ile28, Leu29, Val39, Val49, and Leu55 of SDF-1α are in close proximity to CXCR4 in the SDF-1α-CXCR4 complex.
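The way resonances are binned by ΔRR for mapping onto the structure can be sketched as below. The thresholds (0.150, 0.125, and 0.100) and the rule of taking the larger ΔRR of the two methyl groups of leucine and valine come from the legend of Fig. 3; the function names and the example values are hypothetical and are not measured data.

```python
def classify_delta_rr(value):
    """Map a delta-RR value to the color class used in the Fig. 3 legend."""
    if value > 0.150:
        return "red"
    if value > 0.125:
        return "light red"
    if value > 0.100:
        return "orange"
    return "cyan"


def classify_residue(methyl_values):
    """Classify a residue from the delta-RR values of its methyl groups.

    For leucine and valine, which have two methyl groups, the larger
    of the two values is used; isoleucine (delta-1 only) has a single value.
    """
    return classify_delta_rr(max(methyl_values))


if __name__ == "__main__":
    # Hypothetical delta-RR values, for illustration only.
    example = {"Leu-A": [0.05, 0.16], "Val-B": [0.11, 0.09], "Ile-C": [0.02]}
    for name, values in example.items():
        print(name, classify_residue(values))
```

Taking the maximum per residue means a residue is flagged as proximal to the receptor if either of its methyl groups is affected, which suits a surface-mapping purpose.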
Effect of AMD3100 on the SDF-1α-CXCR4 Interaction-To elucidate the CXCR4-binding mode of SDF-1α in more detail, the SDF-1α-CXCR4 interaction was further investigated under conditions where the CXCR4 TM region was blocked by the CXCR4-selective antagonist AMD3100 (32,33).
We first carried out TCS experiments in the presence of AMD3100 to examine its effect on the interaction between SDF-1α and CXCR4. The calculated ΔRR values are shown in Fig. 4A, and the residues are colored according to their ΔRR values on the SDF-1α structure in the free state (Fig. 4B). As a result, the resonances from the N-terminal residues (Val3 and Leu5) showed almost no ΔRR values in the presence of AMD3100, whereas significant ΔRR values were observed for these residues in the absence of AMD3100 (Fig. 4, B and C). Except for the N terminus, the residues of SDF-1α with significant ΔRR values (>0.10) in the presence of AMD3100 included Val23γ1, Leu26δ1, Ile28, Leu29δ2, Val49γ2, and Leu55 and are almost identical to those in the absence of AMD3100.
We next observed the effect of AMD3100 on the spectra of wild-type SDF-1α with an excess amount of CXCR4. Fig. 5A shows the 1H-13C HMQC spectrum of wild-type SDF-1α, with protons selectively incorporated into the 13C-labeled leucine and valine methyl groups (35). All 26 methyl resonances from the eight leucine and five valine residues were observed. Upon the addition of an excess amount of CXCR4, the signal intensities were remarkably decreased (Fig. 5B), indicating a significant increase in the molecular weight of SDF-1α. Upon the further addition of AMD3100, the intensities of several signals, with chemical shifts almost identical to those of the methyl resonances in the N-terminal residues (Val3 and Leu5) in the free state, were recovered (Fig. 5C).

DISCUSSION
CXCR4-binding Site on SDF-1α-The methyl-TCS experiments utilize isoleucine, leucine, and valine methyl groups as probes to identify the molecular interface. Our previous database analyses demonstrate that the probability for the methyl protons to be located within 3 Å of their binding partner in a protein-protein complex is comparable to or even larger than that of amide protons (31). In addition, the TCS experiments identify the residues located within 5 Å of the protons of their binding partners (30). Whereas methyl groups with small solvent exposure (accessible surface area ~0.1) cannot make van der Waals contact, they can be within 5 Å of the protons of their binding partners, as in the case of the complex between fragment B of protein A and IgG (31). Therefore, the methyl groups detected in the methyl-TCS experiments would demonstrate the regions constituting the binding interfaces.
Our TCS experiments revealed that the SDF-1α N terminus (Val3 and Leu5), the β1-strand (Val23, Leu26, and Ile28), one side of the β-sheet (Leu29, Val39, and Val49), and the 50-s loop (Leu55) are in close proximity to CXCR4 in the SDF-1α-CXCR4 complex (Fig. 3). It should be noted that a remarkable intensity reduction was observed for Val49γ2, which is close to the N-loop. Therefore, we conclude that SDF-1α utilizes an extended surface formed by multiple regions dispersed throughout the protein, consisting of the β-sheet, 50-s loop, and N-loop, in addition to the N terminus, for the receptor binding.
These results are in good agreement with the previous mutational analyses (15, 18-20), where the deletion of the N terminus (Lys1-Val3), and the substitutions of Arg8, Arg12, and Arg47, which are close to the Leu29, Val39, and Val49 methyl groups, significantly affected the receptor binding affinities (supplemental Fig. 4). Val18, which is in the N-loop together with Arg12, is not considered in our TCS experiments, because Val18γ1 is completely buried in the molecule. In addition, the CXCR4-binding site determined in the TCS experiment contains a more extensive surface than that proposed in previous mutational studies, suggesting that a broad area of CXCR4 is responsible for the SDF-1α binding.
The extensive surface may be composed of the region interacting with the CXCR4 N terminus and that interacting with the CXCR4 extracellular loop. The CXCR4 N terminus was suggested to bind to the SDF-1α N-loop (15). This is consistent with the fact that the membrane-proximal region of the CXCR4 N terminus and the latter half of the SDF-1α N-loop contain acidic and basic residues, respectively. The SDF-1α β-sheet and 50-s loop would bind to the CXCR4 region, including the extracellular loops. Further investigations are necessary to determine the precise configuration between SDF-1α and CXCR4.
In contrast to most of the chemokine-chemokine receptor interactions (2), the SDF-1α-CXCR4 interaction exhibits relatively stringent specificity; SDF-1α binds to only CXCR4 and CXCR7, and CXCR4 exclusively binds to SDF-1α. Our TCS and mutational results, together with the previous mutational results, provide detailed information about the specific SDF-1α recognition mechanism of CXCR4. CXCR4 recognizes an extensive surface on the SDF-1α β-sheet, 50-s loop, and N-loop, which contains a variety of basic (such as Arg12 and Arg47), acidic (such as Glu15 and Asp52), and hydrophobic residues (such as Leu29, Val49, and Leu55), with their side chains exposed on the molecular surface. This combination of different types of residues in the extensive binding site identified in this study would substantially contribute to the specific interaction.
Most chemokines, including SDF-1α, tend to dimerize at high concentrations (16,36,43-45). The interactions between chemokines and glycosaminoglycans promote the chemokine dimerizations (43,46), and the recent NMR analyses revealed that the interaction between SDF-1α and CXCR4 N-terminal peptides also promotes the SDF-1α dimerization (20,44). On the other hand, the monomeric SDF-1α is relevant to the signal transduction, because the disulfide-bridged dimeric mutant of SDF-1α exhibits slightly larger Ca2+ influx EC50 values and the chemotactic antagonism (20). In addition, it has recently been reported that the SDF-1α monomer and not the dimeric mutant of SDF-1α is relevant in its cardioprotective effect in vivo (47). In this study, almost stoichiometric amounts of SDF-1α, relative to the correctly folded CXCR4, were co-precipitated in the pulldown assay, suggesting that the stoichiometry between CXCR4 and SDF-1α is 1:1 (Fig. 1C). Therefore, we performed the TCS experiments under the conditions of 100 μM SDF-1α, where SDF-1α assumes the monomeric form.
In our TCS experiments, such a 1:1 binding mode should be selectively observed by using the negative control experiments, in which the 1:1 binding mode was blocked by the addition of the stoichiometric amount of wild-type SDF-1α relative to the folded CXCR4. Interestingly, the CXCR4-binding site of SDF-1α determined in this study partially overlapped the SDF-1α dimer interface (Leu26δ1 and Ile28δ1) (Fig. 3), suggesting that the SDF-1α monomer would serve as a more effective ligand for CXCR4. Our results did not provide any information about the function of the SDF-1α dimer, which might be more effective in the presence of glycosaminoglycans, and further investigations are necessary to address the physiological relevance of the SDF-1α monomer-dimer equilibrium.
There is increasing evidence that the GPCRs, including chemokine receptors, exist as homodimers or heterodimers on the cell surface (48-51). However, the functional consequences of the receptor dimerizations are still under debate, because of the difficulties in the detection and the control of the receptor oligomerization states (52-54). In this study, we did not control the oligomerization of CXCR4, and further studies are necessary to investigate whether the CXCR4 dimerizations affect the ligand binding. For elucidating the relevance of GPCR dimerization, TCS experiments using monomeric or dimeric GPCR trapped by mutations (55) or cross-linking (56) would be helpful.
It is well established that CXCR4 is post-translationally modified by sulfation of its tyrosines at the N terminus. Although it has been reported that CXCR4 expressed in Hi5 insect cells was tyrosine-sulfated (26), the sulfated residues and the extent of sulfation have not yet been determined. Although the tyrosine sulfation is reportedly important for the high affinity ligand binding of both CXCR4 (57) and CXCR4 N-terminal peptides (20,44), CXCR4 lacking the tyrosine sulfation site can also bind to SDF-1α with substantial affinity (57) and has signal transduction activity (24,25). In addition, the complex structures of SDF-1α and CXCR4 N-terminal peptides have recently been solved, and the structures with and without tyrosine sulfations are almost identical (20). Therefore, we conclude that the tyrosine sulfation does not change the binding mode of the SDF-1α-CXCR4 interaction and that our data represent the SDF-1α-CXCR4 interactions in their native forms.

Two-step Binding Model for the SDF-1α-CXCR4 Interaction-In the methyl-TCS experiment with AMD3100 (Fig. 4), the loss of the ΔRR values on the N-terminal residues revealed that the SDF-1α N terminus is not responsible for CXCR4 binding in the presence of AMD3100, whereas the similar ΔRR values on the SDF-1α β-sheet and 50-s loop demonstrated that the binding of the SDF-1α β-sheet and 50-s loop to CXCR4 is unaffected by AMD3100. In the NMR observations of SDF-1α with an excess amount of CXCR4 (Fig. 5), the reappearance of the N-terminal signals upon the addition of AMD3100 indicated the highly dynamic nature of the SDF-1α N terminus, whereas the absence of the NMR signals from the other SDF-1α region, even in the presence of AMD3100, suggested that the stable interaction between the SDF-1α region without the N terminus and the CXCR4-AMD3100 complex still exists. Considering that AMD3100 binds to the TM region of CXCR4 (32), these NMR experiments demonstrated that the SDF-1α N terminus should bind to the CXCR4 TM region, whereas the other SDF-1α region should bind to the CXCR4 extracellular regions. These two interactions should occur independently, because AMD3100 could block only the interaction between the SDF-1α N terminus and the CXCR4 TM region. The SDF-1α N-loop residues are suggested to bind to the CXCR4 N terminus (15). Considering that Val49γ2, which is close to the N-loop, showed significant ΔRR even in the presence of AMD3100, the interaction between the SDF-1α N-loop and the CXCR4 N terminus would also be independent of the interaction between the SDF-1α N terminus and the CXCR4 TM region. Consequently, we can provide novel structural evidence that strongly supports the two-step binding model, as illustrated in Fig. 6A, with the characteristic features of the two independent interactions. The interaction between SDF-1α and CXCR4 in the presence of AMD3100 should mimic the 1st step of this two-step binding model.
Although the direct observation of this 1st step intermediate state is difficult because of its low population, we successfully trapped this state by adding AMD3100, which inhibits the formation of the final 2nd step complex.
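The argument that the 1st step intermediate is sparsely populated unless the 2nd step is blocked can be illustrated with a minimal equilibrium sketch. The scheme below is an assumption made for illustration only: a sequential model (free receptor, 1st step complex, 2nd step complex) in which a saturating antagonist is represented simply by setting the 2nd step equilibrium constant to zero. The function name, the parameter names, and all numerical values are hypothetical and are not measured quantities from this study.

```python
def two_step_populations(ligand_conc, kd_step1, k_step2):
    """Equilibrium receptor fractions for a sequential two-step binding scheme.

    R + L <-> RL1 (1st step, dissociation constant kd_step1)
    RL1   <-> RL2 (2nd step, equilibrium constant k_step2 = [RL2]/[RL1])

    ligand_conc and kd_step1 must share the same concentration unit.
    The free ligand concentration is treated as fixed (ligand in excess),
    which is a simplifying assumption of this sketch.
    """
    x = ligand_conc / kd_step1          # [RL1]/[R]
    denom = 1.0 + x * (1.0 + k_step2)   # sum of relative populations
    return {
        "free": 1.0 / denom,
        "step1": x / denom,
        "step2": x * k_step2 / denom,
    }


if __name__ == "__main__":
    # Hypothetical values: ligand at 10x the 1st step Kd, and a 2nd step
    # that strongly favors the final complex when it is not blocked.
    unblocked = two_step_populations(10.0, 1.0, 99.0)
    # Antagonist modeled (as an assumption) by abolishing the 2nd step.
    blocked = two_step_populations(10.0, 1.0, 0.0)
    print("without antagonist:", unblocked)
    print("with antagonist:   ", blocked)
```

With a strongly favorable 2nd step, almost all bound receptor sits in the final complex and the intermediate is a minor species; once the 2nd step is removed, the same 1st step affinity places most of the receptor in the intermediate state. This is only a qualitative picture of why blocking the TM pocket makes the 1st step state observable.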
The recently solved structures of the β2-adrenergic receptor (58,59) and the adenosine A2A receptor (60) indicated that the binding pocket within the TM helices is partially covered by ECL2 (Fig. 6B). Considering that CXCR4 also possesses an ECL2 of similar length and the conserved disulfide bond between ECL2 and TM3, it would be difficult for SDF-1α to bind the buried pocket, because SDF-1α is much larger than most other GPCR ligands. Therefore, it is likely that the rate-limiting step for the receptor binding of SDF-1α is the opening of ECL2 to provide the space for inserting the critical residues into the TM region.
The two-step interaction, described above, would be important for the efficient binding between SDF-1α and CXCR4 (Fig. 6A). The independent interaction between the SDF-1α β-sheet, 50-s loop, and N-loop and the CXCR4 extracellular region should facilitate the rapid and efficient anchoring of SDF-1α near the CXCR4 TM region. Even in this 1st step bound state, the SDF-1α N terminus should exhibit a highly dynamic nature within the limited space, as shown in the spectrum of SDF-1α with CXCR4 and AMD3100 (Fig. 5C). Because an unstructured protein has a large capture radius, the dynamics of the SDF-1α N terminus would increase the efficiency for searching the space for its binding site, through the well known "fly-casting" mechanism of intrinsically disordered proteins (61). This fly-casting effect, together with the trapping mechanism of SDF-1α to glycosaminoglycans (46), which elevates the local concentration of SDF-1α, would enhance the rate of the complex formation. Consequently, the SDF-1α N terminus can bind to the buried cavities within the CXCR4 TM helices to trigger the conformational changes in the CXCR4 TM region, which lead to the G-protein signaling.
Because all known chemokines have flexible N-terminal regions that are critical for signaling through chemokine receptors, the two-step mechanism is likely to generalize across the chemokine families. Such a mechanism may also be present in other GPCRs with peptide ligands, including the class B GPCRs, which have relatively large extracellular domains for anchoring part of their ligands.
Applicability of Methyl-TCS to GPCRs-
GPCRs constitute the largest family of cell surface signaling receptors, with >800 members. GPCRs play crucial roles in many physiological processes, through signal transduction across cell membranes in response to a variety of stimuli (62). Although GPCRs account for more than 30% of modern drug targets (63), the tertiary structures of only four GPCRs have been solved to date (58-60, 64, 65). Despite the recent advances in protein chemistry and x-ray crystallography, structural determinations of GPCRs are still challenging, and laborious optimizations of crystallization conditions for each GPCR are necessary, which hampers the analyses of multiple GPCR-ligand interactions. Although solution NMR provides powerful tools for investigations of protein-protein interactions, NMR analyses with full-length GPCRs have been very limited to date (66-68).
One of the problems in solving GPCR structures and investigating GPCR-ligand interactions is the low expression levels of GPCRs in heterologous expression systems, and another is their extreme instability in detergent micelles, which often yields nonfunctional receptors during solubilization and purification. Therefore, the experimental conditions of expression, solubilization, and purification should be extensively optimized. In addition, the percentage of correctly folded receptors should be estimated, and the effect of denatured receptors on the structural analyses should be considered.
In this study, 100-200 μg of CXCR4, about half of which was correctly folded, was obtained from 1 liter of insect cell culture, and the receptor retained its native conformation for 48 h (Fig. 1). This yield was far below the practical amounts needed for conventional NMR and crystallographic studies. In addition, the low temperature and the addition of glycerol, which were required for the stabilization of CXCR4, induced broadening of the NMR signals. However, the methyl-TCS method, which is highly sensitive, was applicable even in such cases, because only ~10 μM of receptor is sufficient for the determination of the receptor-binding sites in ligand molecules. In addition, the effects of denatured CXCR4 could be properly subtracted by the negative control experiment (Fig. 2B).
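To put the quoted yield in perspective, the following back-of-the-envelope sketch converts a receptor mass into the sample volume that could be prepared at a given receptor concentration. It takes the figures from the paragraph above (100-200 μg per liter of culture, about half correctly folded, roughly 10 μM receptor needed) and, as an assumption not stated in the text, a molecular mass of about 40 kDa for the CXCR4 polypeptide; the actual construct, with any tags or modifications, may differ. The function name and its parameters are hypothetical.

```python
def sample_volume_ul(mass_ug, folded_fraction, mw_g_per_mol, target_conc_um):
    """Volume (in microliters) of sample obtainable at target_conc_um.

    mass_ug         : total purified receptor mass in micrograms
    folded_fraction : fraction of receptor that is correctly folded (0 to 1)
    mw_g_per_mol    : molecular mass in g/mol (assumed value, see lead-in)
    target_conc_um  : desired concentration of folded receptor in micromolar
    """
    total_nmol = mass_ug / mw_g_per_mol * 1e3   # ug / (g/mol) = umol; x1e3 = nmol
    folded_nmol = total_nmol * folded_fraction
    volume_ml = folded_nmol / target_conc_um    # nmol / (umol/L) = mL
    return volume_ml * 1e3                      # mL -> uL


if __name__ == "__main__":
    ASSUMED_MW = 40_000.0  # g/mol; illustrative assumption for CXCR4
    for mass in (100.0, 200.0):
        vol = sample_volume_ul(mass, 0.5, ASSUMED_MW, 10.0)
        print(f"{mass:.0f} ug per liter of culture -> about {vol:.0f} uL at 10 uM")
```

Under these assumptions, one liter of culture supplies on the order of a few hundred microliters of sample at most, which is consistent with the statement that the yield is marginal for conventional NMR and crystallography yet workable for a method that needs only ~10 μM receptor.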
In this study, we have provided the first structural view of the two-step interaction between a chemokine and its full-length G-protein-coupled receptor. Our strategy with the methyl-TCS experiments used in this study would be generally applicable to GPCR-ligand interactions and hence would provide a favorable alternative for the elucidation of the ligand recognition mechanisms of GPCRs. In addition, we have shown that the drug-induced modulation of the GPCR-ligand interaction could be clearly detected by NMR experiments. Such experiments should be useful to elucidate whether and how the drug functions and thus would support drug development targeted to GPCRs.

FIGURE 6. Physiological role of the two-step mechanism for the SDF-1α-CXCR4 interaction. A, schematic diagram of the two-step mechanism for the SDF-1α-CXCR4 interaction. The 1st step interaction between the SDF-1α β-sheet, 50-s loop, and N-loop and the CXCR4 extracellular region facilitates the rapid binding and efficient anchoring of SDF-1α on the extracellular side of CXCR4. The SDF-1α N terminus is highly dynamic even in this state, which is used to search for the binding cavities buried within the TM helices. Consequently, the 2nd step interaction between the SDF-1α N terminus and the CXCR4 TM region is formed, and the SDF-1α N terminus triggers the conformational changes in the CXCR4 TM to induce G-protein signaling. B, crystal structures of two GPCRs, β2-adrenergic receptor (Protein Data Bank code 2RH1) and adenosine A2A receptor (Protein Data Bank code 3EML). Small molecule antagonists (shown as sticks) are buried in the deep cavities within the TM helices (red ribbons). The entrances of these cavities are restricted by the long ECL2s (green ribbons) and the conserved disulfide bonds between ECL2 and TM3 (yellow sticks). Solvent-accessible surfaces of the receptors are also shown (transparent gray surfaces).