The chemokine X-factor: Structure-function analysis of the CXC motif at CXCR4 and ACKR3

The human chemokine family consists of 46 protein ligands that induce chemotactic cell migration by activating a family of 23 G protein-coupled receptors. The two major chemokine subfamilies, CC and CXC, bind distinct receptor subsets. A sequence motif defining these families, the X position in the CXC motif, is not predicted to make significant contacts with the receptor but instead links structural elements associated with binding and activation. Here, comparative analysis of chemokine NMR structures, structural modeling, and molecular dynamics simulations suggested that the X position reorients the chemokine N terminus. In CXCL12, used here as a model CXC chemokine, deletion of the X residue (Pro-10) had little to no impact on the folded chemokine structure but diminished CXCR4 agonist activity as measured by ERK phosphorylation, chemotaxis, and Gi/o-mediated cAMP inhibition. Functional impairment was attributed to over 100-fold loss of CXCR4 binding affinity. Binding to the other CXCL12 receptor, ACKR3, was diminished nearly 500-fold. Deletion of Pro-10 had little effect on CXCL12 binding to the CXCR4 N terminus, a major component of the chemokine-GPCR interface. Replacement of the X residue with the most frequent amino acid at this position (P10Q) had an intermediate effect between WT and P10del in each assay, with ACKR3 having a higher tolerance for this mutation. This work shows that the X residue helps to position the CXCL12 N terminus for optimal docking into the orthosteric pocket of CXCR4 and suggests that the CC/CXC motif contributes directly to receptor selectivity by orienting the chemokine N terminus in a subfamily-specific direction.

An improved description of the chemokine interaction network is selectively promiscuous, neither fully specific nor fully promiscuous, where overlapping high-affinity chemokine-receptor pairs exist as a subset of all possible combinations (e.g. CCR3 binds many but not all chemokines, and these ligands bind several but not all receptors).
Unraveling the principles underlying the selectively promiscuous chemokine network has proven challenging despite determined structures of chemokines, chemokine receptors (CKRs), and chemokine-receptor co-crystal complexes (9–11). These structural studies have revealed a common tertiary fold for chemokines and validated that CKRs adopt the GPCR seven-transmembrane architecture. The highly conserved structures of chemokines and CKRs yield a conserved interface, with the chemokine N terminus binding the orthosteric pocket of the receptor. This binding pose was predicted by an initial model of the interaction that separates the chemokine interface into two distinct sites (12). In this two-site model, the CKR N terminus binds the globular chemokine core but does not lead to receptor activation (site-1). Activation is achieved through interactions between the chemokine N terminus and the receptor extracellular loop and transmembrane residues (site-2). Determination of the co-crystal structures added to this model by revealing a large protein-protein interface that included regions, such as the chemokine loop linking the β1- and β2-strands near residue 30 (i.e. the 30s loop), not previously predicted to significantly interact (13–16). Thus, although the simplistic two-site model fails to capture the full complexity of chemokine-CKR recognition, it remains a useful framework for structure-function analyses.
In contrast to the conserved chemokine structure, chemokine sequence is highly variable. The most conserved residues are four characteristic cysteines that form two disulfide bonds necessary for function in some chemokines (17). The first two cysteine residues separate the N terminus from the N-loop and classify chemokines into two major families where the cysteines are adjacent (CC family) or separated by a single residue (CXC family) (18).
CKRs lack a CC/CXC motif, and thus each receptor is assigned to the family of its chemokine ligands. In a truly promiscuous system, this nomenclature would break down. However, receptors rarely bind chemokines from multiple families. Thus, CXC receptors are activated by CXC chemokines, and CC receptors are activated by CC chemokines. There has been little evidence to support a family-wide explanation for this familial selectivity. As the deterministic residue for family classification is the X residue in the CXC motif, this X-factor likely plays a functional role in chemokine discrimination.
Remarkably, many chemokine genes segregate into two major clusters corresponding to their family (19). In humans, chromosome 4 houses a majority of CXC chemokines, and chromosome 17 contains most CC chemokines. This implies that selectivity has been maintained through the successive gene duplications that rapidly expanded the chemokine family and that this selectivity may be as old as the molecules themselves. Phylogenetic analysis suggests that CXCL12 was a founding member of the CXC family. Duplication of a primordial chemokine gene followed by a permissive mutation that removed the X residue yielded the CC family progenitor (20)(21)(22). Subsequent duplication events and transmutation occurred throughout chemokine evolution, thus driving recognition complexity and allowing the expansion of the chemokine system to orchestrate the inducible migration and constitutive homing of many cell types.
The CC/CXC motif is not contained in the binding (site-1) or activation (site-2) regions of the chemokine; however, it is poised directly between these regions, allowing for structural modulation that could alter receptor binding or activation. In co-crystal structures of chemokine-bound GPCRs, the CC motif is above the pocket near the top of TM1 and the receptor N terminus. In the absence of a structure of a CXC chemokine bound to its receptor, it is unknown how the X residue interacts with the GPCR. Thus, the chemokine X-factor could determine selectivity between CC and CXC chemokines, and this effect could be achieved through structural perturbation of the chemokine N-loop (site-1), the N terminus (site-2), or a more subtle allosteric effect.
To shed light on selectivity determinants between CC and CXC chemokines, we interrogated the structural impact and functional consequences of mutation and deletion of the X residue in the prototypical homeostatic chemokine CXCL12. Using NMR, cell-based functional assays, and radioligand binding, we observed structural perturbation as well as reduced receptor binding and activation upon mutation or deletion of the X residue. Interactions with the CKR N terminus (site-1) were largely unchanged, localizing the binding defect to the chemokine interface with the CKR extracellular loops or transmembrane regions (site-2). Analysis of every chemokine with an NMR solution-state structure suggested preferred N-terminal states between CC and CXC chemokines. Homology modeling and subsequent molecular dynamics (MD) simulations of CXCL12-P10del support the hypothesis that the CC/CXC motif directly impacts chemokine N-terminal orientation important for receptor recognition.

Results
The chemokine network segregates CC and CXC chemokines
Analysis of the chemokine interaction network revealed that each chemokine subfamily activates a distinct subset of receptors (Fig. 1A). In other words, aside from the atypical (e.g. ACKR1) or viral (e.g. US28) receptors, chemokine receptors exhibit subfamily (e.g. CC versus CXC) selectivity. The presence or absence of the X position is the only absolutely conserved sequence difference that distinguishes CC and CXC chemokines. Although there is no consensus X residue among the 17 CXC chemokines (Fig. 1B), this amino acid may contribute to receptor preference within the CXC family. Of the 11 different amino acids found at the X position, polar and aliphatic side chains are found in the ligands for the more promiscuous CXCR1, CXCR2, and CXCR3 receptors, whereas charged (Arg and Glu), aromatic (Tyr), or constrained (Pro) amino acids correspond to selective or monogamous chemokine-receptor interactions. We hypothesized that deletion of the intervening X residue would alter chemokine structure in a way that significantly impairs recognition by CXC family receptors, opening the door to interactions with other GPCRs.

Chemokine structural characteristics
To investigate the structural role of the X position, we tabulated all contacts between that side chain and other amino acids in the NMR structures of 11 different CXC chemokines (Fig. 1C). Most X interactions involved residues in the β1- or β2-strands, with fewer contacts with the end of the 30s loop or the beginning of the β3-strand. Although there were contacts with the N terminus, this region is unstructured and flexible in solution, and they likely represent transient interactions (23). Inspection of the NMR structures of 16 different CC chemokines revealed that the β1-β2 hairpin is longer in CXC chemokines, with an average of two additional residues that participate in β-strand hydrogen bonding (Fig. 1D). This is most apparent in CXCL12, where the proline at the X position packs tightly with residues flanking the 30s loop (Leu-29, Thr-31, and Gln-37), resulting in both β1- and β2-strands extending an additional two residues into the 30s loop. In chemokines with bulkier X residues, such as glutamine in CXCL8, this region protrudes away from the N terminus into solution.
In addition to secondary structure characteristics, several N-terminal preferences became apparent after alignment of CC and CXC chemokines by the conserved core. To quantify this, the conformers in each NMR structure were separated (520 total models), and an average vector was calculated for each N terminus (Fig. 1E). These vectors were averaged by family, yielding a difference of 45° between the mean orientations of CC and CXC ligands. This preference in solution is influenced by the CC/CXC motif that bridges the N terminus to the chemokine body. Upon overlaying the CXCR4 co-crystal structure (4RWS) with the aligned coordinate system, the mean axis for CXC chemokine N termini was oriented in an appropriate direction for receptor docking and activation (Fig. 1F). In contrast, the mean axis of the CC ligand family extended past TM1 in a direction incompatible with binding the orthosteric pocket. This analysis extended our hypothesis that deletion of the X residue would impair binding to CXC receptors by altering the preferred orientation of the receptor-activating N terminus.
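The family-averaged axis comparison described above can be illustrated with a minimal sketch. It assumes each N terminus has already been reduced to a single 3D direction vector in a shared coordinate frame; the vectors below are hypothetical toy values rather than data from the study, and the function names are illustrative only.

```python
import math

def unit(v):
    """Scale a 3D vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)

def mean_axis(vectors):
    """Average direction of a set of vectors (each normalized first)."""
    units = [unit(v) for v in vectors]
    summed = tuple(sum(u[i] for u in units) for i in range(3))
    return unit(summed)

def angle_deg(a, b):
    """Angle between two vectors, in degrees."""
    a, b = unit(a), unit(b)
    dot = sum(x * y for x, y in zip(a, b))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(dot))

# Hypothetical N-terminal vectors, one per conformer (assumed, not measured)
cxc_vectors = [(1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
cc_vectors = [(1.0, 1.0, 0.0), (2.0, 2.0, 0.0)]

separation = angle_deg(mean_axis(cxc_vectors), mean_axis(cc_vectors))
print(f"Angle between family mean axes: {separation:.1f} degrees")
```

Normalizing each vector before averaging means that a conformer with a longer N terminus does not outweigh the others; whether the original analysis weighted vectors this way is not stated in the text.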

Figure 1. Selectivity in the chemokine network between CC and CXC chemokines. A, the chemokine network is illustrated with chemokine receptors labeled inside the membrane and the chemokines that bind that receptor labeled outside. The inner phylogenetic tree describes the evolutionary relationships between chemokine receptors. Colors represent each chemokine family: CC (blue), CXC (green), CX3C (red), XC (orange), and atypical (purple). B, the CXC receptors are shown linked with the X residues of their cognate ligands. Each one-letter code is colored by amino acid property: hydrophilic (green), hydrophobic (pink), positive charge (blue), conformationally special (purple), aromatic (yellow), or negative charge (red). The 17 CXC chemokines are represented below the corresponding X residue. C, unique solution-state NMR structures of CXC chemokines were analyzed to determine intramolecular contacts with the side chain of the X residue. These data are illustrated on the structure of CXCL12 (PDB entry 2KED). D, the average lengths of the β1- and β2-strands are shorter in CC chemokines than CXC chemokines, as calculated from 27 solution-state NMR structures. This is demonstrated by structures of CXCL12 (PDB entry 2KED) and CCL20 (PDB entry 2JYO). The X residue in CXCL12 (Pro-10) makes contacts with Leu-29, Thr-31, and Gln-37 to extend secondary structure into the 30s loop. E, each conformer in CC and CXC NMR structures was separated and aligned via the conserved chemokine core. Vectors describing the N terminus of each structure were averaged by family to construct the CC or CXC mean axes, which are separated by 45°. F, the structure of vMIP-II bound to CXCR4 (PDB entry 4RWS) is overlaid with the coordinate system of D. The CXC mean axis is near the N terminus of vMIP-II in the orthosteric pocket, whereas the CC mean axis is outside of the receptor.

Chemokine sequence characteristics
Although the CC/CXC motif is used to separate chemokines into easily classified families, it is not necessarily the sole factor contributing to familial differences. Chemokine N-terminal sequences were analyzed in an attempt to identify consistent differences in the activation domain of CC and CXC chemokines (Fig. 2A, left). Sequences are displayed in a structural format due to the high sequence variance with the conserved N-terminal cysteines aligned. This format best represents the N-terminal length and depth of contact in the orthosteric pocket. The ELR motif is found directly preceding the first cysteine in 7 of 17 CXC chemokines. This region has the highest conservation yet is found in less than half of the family. However, the arginine in this motif is the most conserved position, with a basic residue found in 13 of 17 CXC chemokines. Among the other four CXC chemokines, CXCL4 and CXCL14 have not been shown to bind or activate one of the CXC receptors, and CXCL17 is a newly discovered ligand with an unconventional sequence that may adopt a different structure. Only one of the 26 CC chemokines has a basic residue directly preceding the cysteine. The CC family N termini have low conservation and no apparent family-spanning trends.
Familial selectivity could also be encoded at the level of binding rather than activation. CKR N-terminal sequences were analyzed to search for conserved binding epitopes between CC and CXC receptors (Fig. 2A, right). Like the chemokine, the unbound CKR N terminus is unstructured with high sequence variability and is displayed continuously with the conserved N-terminal cysteine aligned. The most conserved position directly precedes the aligned cysteine and is a proline in half of the receptors with no preference between families.
The presence of the X position unequivocally divides CC and CXC chemokines, implying a direct functional role. In the absence of compelling sequence differences in binding or activation domains, the CC/CXC motif that bridges these two sites was the most likely factor to determine familial selectivity. To interrogate the role of the X residue in structure and function, a CC version of CXCL12 was engineered by deletion of the X residue (Fig. 2B, CXCL12-P10del). We selected CXCL12 for this study due to its high conservation across species and its specific binding to CXCR4 (one known chemokine ligand) and ACKR3 (two known chemokine ligands). As a candidate for the primordial chemokine, it is plausible that a CXC-to-CC transition occurred near the beginning of chemokine evolution (20,21). Another construct was designed to probe the role of the X residue identity by mutating the X in CXCL12 to glutamine, the most common X residue (Fig. 2B, CXCL12-P10Q).

Molecular dynamics analysis of CXCL12-P10del
To determine whether the X deletion of CXCL12 would be tolerated or if additional mutations would be needed to stabilize the CC disulfide arrangement, models of CXCL12-P10del were constructed and were similar to CXCL12-WT with a core Cα root mean square deviation of 0.6 Å (Fig. 2C). Deviations were noted at the 30s loop and N terminus, but the cysteines adopted normal disulfide stereochemistry, suggesting that tertiary structure would be maintained. This modeling did not sample alternative conformations. To assess the impact of the X-factor on conformational dynamics, five 100-ns MD simulations were performed with CXCL12-WT or CXCL12-P10del in explicit solvent. Over the course of each run, the N terminus fluctuated through many states; however, residues proximal to the CC/CXC motif, particularly Arg-8, consistently differed between CXCL12-WT and CXCL12-P10del. With the CXCR4 co-crystal structure overlaid, Arg-8 in CXCL12-WT was near its expected binding partner, CXCR4:Asp-262 (13) (Fig. 2D). Recent modeling and mutations support this contact as important for receptor activation (24,25). In CXCL12-P10del, Arg-8 is facing the opposing side of the pocket and unlikely to establish this interaction. This reorientation of the proximal N terminus mimicked the differences between the mean N-terminal states found in NMR solution structures and may be a defining characteristic between CC and CXC chemokines.
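As a reminder of what a core Cα root mean square deviation measures, the following sketch computes it for two matched lists of coordinates. It assumes the two structures have already been superposed and that atoms are paired one to one; the coordinates are hypothetical toy values, not taken from the CXCL12 models, and the superposition step itself is not shown.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between paired, pre-superposed coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

# Hypothetical Calpha positions in angstroms (assumed for illustration only)
wt_core = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_core = [(0.0, 0.6, 0.0), (3.8, -0.6, 0.0), (7.6, 0.0, 0.6)]

core_rmsd = rmsd(wt_core, model_core)
print(f"Core Calpha RMSD: {core_rmsd:.2f} angstroms")
```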

Structure and stability of X position mutants
We expressed and purified [U-15N]CXCL12-WT, -P10Q, and -P10del proteins for experimental comparisons of folding, stability, and receptor binding. Protein identity and folding were confirmed by MS and reverse-phase HPLC. NMR heteronuclear single quantum coherence (HSQC) spectra verified that each protein formed a stable, folded tertiary structure similar to CXCL12-WT (Fig. 2E). Larger peak shifts in the CXCL12-P10del spectrum relative to CXCL12-P10Q and variable peak intensities suggest that the deletion of the X residue caused greater structural perturbation and altered the internal dynamics of the protein. This increase in dynamics did not destabilize the protein as assessed by thermal denaturation, with Tm > 85°C for each of the three constructs (Fig. 2F). We attempted to acquire three-dimensional NMR data to determine the structure of CXCL12-P10del but were prevented by extensive line broadening that reduced the quality of the spectra.

Receptor binding and activation at CXCR4
To further examine the structural effects of X residue deletion suggested by NMR HSQC spectra and MD simulations, receptor binding was assessed by radioligand displacement of [125I]CXCL12 from CXCR4 (Fig. 3A and Table 1). After incubation with HEK-293T membranes overexpressing CXCR4, CXCL12-WT displaced the radioligand (Table 2; Ki = 0.14 nM). CXCL12-P10del showed displacement, but with a 100-fold decrease in affinity (Ki = 14.8 nM). Intermediate between WT and X deletion, CXCL12-P10Q had a loss of affinity of ~10-fold (Ki = 1.5 nM). Although these data suggested that X residue deletion disrupts chemokine-GPCR binding, it was unclear whether this disruption prevented chemokine contacts with the receptor N terminus (site-1) or the receptor body (site-2) and whether biological function was maintained.
In preliminary studies on receptor activation, CXCL12-P10del failed to activate CXCR4 as assessed by ERK phosphorylation at 10 nM (Fig. 3C). Conversely, CXCL12-P10Q maintained robust ERK phosphorylation, demonstrating that a change in X identity is less impactful than familial exchange through X deletion. A result of balanced chemokine signaling is cell migration; thus, transwell chemotaxis assays were used to characterize the ability of the X mutants to accomplish this main role. The expected biphasic response to increasing chemokine concentration was observed (Fig. 3D, left). Analysis of the rising slope of this curve showed trends similar to ERK phosphorylation, with CXCL12-P10Q having an intermediate impact compared with CXCL12-P10del (Fig. 3D, right).
As signaling through endogenous receptors was consistently perturbed, direct Gi/o function was monitored via cAMP inhibition to gain detailed signaling insight (Table 3). Chemokine receptors are primarily coupled to Gαi/o proteins, which act

Figure 2. Sequence and structure variation between CC and CXC chemokines. A, the sequences of the activation (site-2) and binding (site-1) motifs were compared between CC and CXC chemokines. Both the chemokine and receptor N termini are unstructured in the unbound state and thus are aligned to conserved cysteine residues. The X residue in the CXC motif unequivocally divides the CC and CXC chemokines. The next deterministic position directly precedes the first cysteine and is positively charged in 76% of CXC chemokines and only 4% of CC chemokines. No difference in average N-terminal length was observed. B, constructs of CXCL12-WT, -P10Q, and -P10del are shown. Glutamine was chosen for substitution as it is the most common X residue. C, structural models of CXCL12-P10del were constructed using the solution-state NMR structure 2KED. Deviation from the WT structure was observed at the N terminus and 30s loop. D, to understand the dynamic impact, 100-ns MD simulations were performed with CXCL12-WT and CXCL12-P10del in explicit solvent. Orientation of the proximal N terminus consistently differs between the CC and CXC final states, shown overlaid with the CXCR4 co-crystal structure (PDB entry 4RWS). The position of Arg-8 in CXCL12-WT is near the expected binding partner CXCR4:Asp-262, whereas Arg-8 in CXCL12-P10del is facing the opposite side of the pocket. E, each construct was examined by NMR HSQC. Uniform peak distribution supports tertiary folding. Unequal peak intensities in CXCL12-P10del result from an increase in dynamics. Overlays with CXCL12-WT demonstrate altered peak position. F, the intrinsic fluorescence of each chemokine was measured throughout thermal denaturation. The plotted first derivative illustrates the Tm of CXCL12-WT (91.1°C), CXCL12-P10Q (88.3°C), or CXCL12-P10del (86.3°C) performed in triplicate.
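The legend for Fig. 2F describes reading Tm from the first derivative of the intrinsic fluorescence trace. A minimal sketch of that idea follows, assuming that Tm is taken as the temperature at which the derivative is largest in magnitude (the text does not state the exact criterion). The melting curve is a synthetic sigmoid with an assumed midpoint of 88 °C, not measured data.

```python
import math

def first_derivative(temps, signal):
    """Central-difference dF/dT at interior points; returns (temps, derivatives)."""
    points, derivs = [], []
    for i in range(1, len(temps) - 1):
        slope = (signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1])
        points.append(temps[i])
        derivs.append(slope)
    return points, derivs

def melting_temperature(temps, signal):
    """Temperature at which |dF/dT| is maximal (assumed definition of Tm)."""
    points, derivs = first_derivative(temps, signal)
    best = max(range(len(derivs)), key=lambda i: abs(derivs[i]))
    return points[best]

# Synthetic two-state unfolding curve (hypothetical): 60-100 degrees C in 0.5 degree steps
temps = [60.0 + 0.5 * i for i in range(81)]
signal = [1.0 / (1.0 + math.exp(-(t - 88.0) / 1.5)) for t in temps]

tm = melting_temperature(temps, signal)
print(f"Estimated Tm: {tm:.1f} degrees C")
```

With real, noisy traces the derivative would normally be smoothed before locating its extremum; that step is omitted here to keep the sketch short.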

Receptor binding and activation at ACKR3
As CXC receptors appeared to be sensitive to mutation or deletion of the X position, we sought to test binding and function at another family of chemokine receptors. Atypical chemokine receptor 3 (ACKR3) is a scavenger receptor for CXCL12 and CXCL11 and is an example of a GPCR with inherent arrestin bias (26). Compared with CXCR4, CXCL12-WT displaced the radioligand bound to ACKR3 at a lower concentration (Ki = 0.09 nM) (Fig. 4A). Whereas mutation of the X residue moderately reduced affinity (Ki = 0.69 nM), deletion at this position reduced affinity over 450-fold (Ki = 42.7 nM). ACKR3 activation was assessed by β-arrestin-2 recruitment measured using the Tango assay (Fig. 4B and Table 3), as ACKR3 has no known G protein activity. Signaling trended with radioligand binding, with CXCL12-WT being the most potent (EC50 = 3.0 nM), followed by CXCL12-P10Q (EC50 = 5.4 nM) and then CXCL12-P10del (EC50 = 38.1 nM). This trend with effective concentration was consistent among all assays (Fig. 4C).
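The fold changes quoted for the two receptors follow directly from the reported Ki values, as the short calculation below shows. The numbers are those given in the text; only the dictionary layout and function name are introduced here for illustration.

```python
# Ki values in nM as reported in the text for radioligand displacement
ki_nM = {
    "CXCR4": {"WT": 0.14, "P10Q": 1.5, "P10del": 14.8},
    "ACKR3": {"WT": 0.09, "P10Q": 0.69, "P10del": 42.7},
}

def fold_loss(ki_variant, ki_wt):
    """Fold loss of affinity relative to WT (a larger Ki means weaker binding)."""
    return ki_variant / ki_wt

fold = {
    receptor: {
        variant: fold_loss(ki, values["WT"])
        for variant, ki in values.items()
        if variant != "WT"
    }
    for receptor, values in ki_nM.items()
}

for receptor, variants in fold.items():
    for variant, value in variants.items():
        print(f"{receptor} {variant}: {value:.0f}-fold loss of affinity")
```

The P10del ratio is about 106 at CXCR4 and about 474 at ACKR3, consistent with the "100-fold" and "over 450-fold" descriptions.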

Binding affinity to the CXCR4 N-terminal domain
To determine the source of the 100-fold reduction in CXCR4 binding affinity caused by X residue deletion, a peptide consisting of the first 38 residues of CXCR4 (P38) was titrated into each chemokine and observed by NMR HSQC (Fig. 5A). The chemokine-P38 interaction corresponds to site-1 in the two-site model and is thought to contribute mainly to chemokine binding (Fig. 5C). Throughout the 16-point titration, peak shifts were consistent between each chemokine, supporting a common binding mode. Nonlinear fitting of total chemical shift perturbations for each titration point yielded an apparent binding affinity of 2 μM for CXCL12-WT and CXCL12-P10Q (Fig. 5B). Several peaks broadened beyond detection in the CXCL12-P10del/P38 titration, but the peaks that shifted consistently yielded an apparent affinity of 7 μM. To obtain more precise estimates of the P38 binding affinity, we used microscale thermophoresis (MST) to measure binding of each chemokine to P38 peptide that had been conjugated to a C-terminal Cy5 dye. Increasing concentrations of each CXCL12 variant were mixed with 25 nM P38-Cy5, and the resulting fluorescence after excitation indicated that no aggregation or adsorption occurred in CXCL12-WT (Fig. 5D). Similar traces were observed for CXCL12-P10Q and CXCL12-P10del. Dissociation constants were calculated for CXCL12-WT (2.0 ± 0.5 μM), CXCL12-P10Q (5.0 ± 1.0 μM), and CXCL12-P10del (0.9 ± 0.4 μM) and broadly support the low micromolar range calculated from the NMR titrations (Fig. 5E). However, the MST data suggest a slight increase in affinity for CXCL12-P10del compared with WT. Thus, rather than distorting the site-1 binding site on the CXCL12 surface, deletion of the X residue has no negative impact on binding to the CXCR4 N-terminal domain P38 peptide.
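The nonlinear fitting of chemical shift perturbations mentioned above can be sketched with a standard 1:1 binding model that accounts for ligand depletion. The text does not state which model or software was used, so the equation, the grid search, the chemokine concentration, and the titration points below are all assumptions chosen for illustration; the data are synthetic and noise-free, with concentrations in micromolar to match the low micromolar range described in the text.

```python
import math

def fraction_bound(p_total, l_total, kd):
    """Fraction of protein bound for 1:1 binding with ligand depletion."""
    s = p_total + l_total + kd
    return (s - math.sqrt(s * s - 4.0 * p_total * l_total)) / (2.0 * p_total)

def fit_kd(p_total, ligand_concs, shifts, kd_grid):
    """Grid search over Kd; the maximal shift is solved by linear least squares."""
    best = None
    for kd in kd_grid:
        f = [fraction_bound(p_total, l, kd) for l in ligand_concs]
        dmax = sum(fi * yi for fi, yi in zip(f, shifts)) / sum(fi * fi for fi in f)
        sse = sum((dmax * fi - yi) ** 2 for fi, yi in zip(f, shifts))
        if best is None or sse < best[0]:
            best = (sse, kd, dmax)
    return best[1], best[2]

# Hypothetical 16-point titration (all concentrations in micromolar)
protein_uM = 50.0
ligand_uM = [0, 5, 10, 20, 40, 60, 80, 100, 150, 200, 300, 400, 500, 600, 800, 1000]
true_kd, true_dmax = 2.0, 0.30  # assumed values used to generate the toy data
shifts = [true_dmax * fraction_bound(protein_uM, l, true_kd) for l in ligand_uM]

kd_grid = [0.1 * i for i in range(1, 201)]  # 0.1 to 20 micromolar
kd_fit, dmax_fit = fit_kd(protein_uM, ligand_uM, shifts, kd_grid)
print(f"Fitted Kd: {kd_fit:.1f} uM, maximal shift: {dmax_fit:.2f} ppm")
```

A grid search is used only to keep the example dependency-free; a gradient-based least-squares routine would be the more usual choice for real titration data.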

Discussion
Human CXCL12, one of the best-characterized chemokines, is necessary for development and has ancient origins indicating that it may be the primordial chemokine (21,22,27,28). A CXC-to-CC transition was simulated in this protein by deletion of the X residue to create CXCL12-P10del. We predicted that this mutation would significantly impair receptor recognition via a structural change in the chemokine.
Modeling and NMR data suggested that P10del was a structurally tolerated mutation, and thermal shift assay confirmed stability (Fig. 2F), suggesting that reduced receptor binding was not due to a change in the bulk tertiary structure. However, NMR HSQC peak position differed from WT with variable peak intensity, suggesting an increase in protein dynamics. These altered spectra were not seen in X deletion of CXCL8, possibly due to CXCL12 having a unique proline at the X position (29). Most NMR structures of CXC chemokines have the X position packed tightly with β1- and β2-strand residues to extend these secondary structures into the 30s loop, and X deletion likely disrupts this stability and leads to increased motion in this region and the adjacent N terminus.
MD simulations demonstrate that the CC/CXC motif may also constrain the preferred states of the N terminus in solution. This was most consistently seen in the residues proximal to the CC/CXC motif, particularly in the arginine immediately preceding the first cysteine in most CXC chemokines. In MD simulations of CXCL12, Arg-8 is reoriented upon deletion of the X residue, as shown in Fig. 2D. This residue has previously been identified as a potential determinant of subfamily selectivity, and disrupting its interaction with the receptor via reorientation of the N terminus may be deleterious to receptor binding (30). Comparing the solution-state NMR structures of chemokines supports this hypothesis, with the average CC N-terminal axis oriented 45° away from the average CXC axis. These preferred N-terminal ranges would limit the ability of CC chemokines to bind the orthosteric pocket and activate a CXC receptor.
CXCL12-P10del retains signaling ability through CXCR4, albeit reduced up to 100-fold in cAMP inhibition. With endogenous receptors, no ERK phosphorylation or chemotaxis was detected until 10 nM. At this concentration, CXCL12-WT reached a maximal effect. From an evolutionary perspective, this newly established CC chemokine would retain minimalistic function to preserve gene existence but be mostly free of selective pressure to undergo neofunctionalization (31). CXCL12-P10Q was designed to investigate the role of the identity of the X residue. In every aspect tested, this construct was intermediate between WT and P10del, suggesting that the familial difference in the CC/CXC motif is a larger factor in selectivity than X identity. However, the X identity change did reduce signaling ability and thus likely plays a role in discrimination between CXC chemokines. At high concentrations, a peptide corresponding to the N terminus of CXCL12 can activate CXCR4 (32). Similar studies were the basis for the two-site model of chemokine-receptor interaction (Fig. 5C), which separates regions involved in binding (CKR N terminus) and receptor activation (chemokine N terminus). This model oversimplifies the protein-protein interaction yet remains useful as a broad framework for understanding CKR activation. Thus, it is unlikely that modifying the X residue would eliminate the activation ability of CXCL12 as the N terminus is intact. The more probable defect in these mutants is the ability to bind the receptor. This is supported by radioligand displacement as the affinity compared with CXCL12-WT is decreased by 10- or 100-fold for CXCL12-P10Q and CXCL12-P10del, respectively. Interestingly, ACKR3 binds CXCL12-P10del relatively worse than CXCR4, with a decreased affinity of over 450-fold. As this receptor encodes selectivity across two chemokines, it may be more reliant on the CXC motif to discriminate binding partners than the monogamous CXCL12-CXCR4 relationship.
To localize the defect in binding, a peptide consisting of the first 38 residues of CXCR4 was titrated and observed through NMR HSQC experiments. The affinities of CXCL12-WT and CXCL12-P10Q with P38 were comparable, whereas CXCL12-P10del showed a 3-fold loss of affinity. In the NMR structure of CXCL12 bound to this peptide, the X residue makes minimal contacts with P38. Thus, it was expected that CXCL12-P10Q would not disturb peptide-binding affinity. The MST experiments more precisely measured the binding affinity and confirmed the low micromolar range of binding. However, these experiments calculate CXCL12-P10del as binding tighter than WT to P38.
The relatively small change in affinity for CXCL12-P10del binding to P38 cannot account for the larger affinity loss when binding the intact receptor. This implicates the chemokine N terminus or 30s loop as the main contributor to reduced binding affinity, as these regions directly contact the CKR body. As the chemical matter of these regions is unchanged, a conformational explanation is required. The co-crystal structures of CXC and CC receptors have unique binding modes with differing chemokine tilts and binding depths (Fig. 6A). Particularly, the 30s loop of the CC chemokine penetrates into the orthosteric pocket of the receptor and may be facilitated by an increased chemokine tilt in this direction. Compared with the CXCR4-bound chemokine, this tilt is 36° and may act to counter the 45° difference in average N-terminal position in solution.
On the basis of these data and previous observations of N-terminal orientation and 30s loop dynamics (7,13), we propose a model of chemokine familial selectivity based on the X-factor in the CC/CXC motif (Fig. 6B). In this model, the X-factor is best described as a fulcrum that orients the N terminus into preferential binding positions, predominantly the proximal residues, including the arginine in the ELR motif found in most CXC chemokines. Chemokines with improperly aligned N termini have reduced binding affinity, as this region would be unable to establish ideal contacts with the receptor and may be unable to reach the proper orientation in the pocket to achieve efficient receptor activation. Additionally, the X residue is a stabilizing factor in CXC chemokines and adds rigidity by extending the β1- and β2-strands through direct interactions. This sterically applies a pushing force on the 30s loop and results in a wide profile. In CC chemokines that lack this extended secondary structure and the steric burden of the X residue, the 30s loop is pulled into a narrow position. This would facilitate the unique binding mode found in CC chemokines while sterically inhibiting the 30s loop of CXC chemokines from penetrating the orthosteric pocket.
The role of the CC/CXC motif in receptor binding and activation has now been investigated in CXCL8 and CXCL12; however, knowledge of this motif in CC chemokines has not been adequately pursued. Based on the results of this study, it is plausible that insertion of an intervening X residue in a CC chemokine would substantially reduce binding and activation to CC receptors. Furthermore, addition, deletion, or modification of the X residue could be a step toward engineered chemokines with novel receptor-binding profiles. Chemokines interact with their receptors via two divergent, unstructured N-terminal domains (i.e. site-1 and site-2), suggesting that family-wide specificity is encoded in the structural scaffold. The CC/CXC motif couples the chemokine body and N terminus and plays an important role in determining binding affinity. We speculate that the orientation of the chemokine N terminus is influenced by the presence or absence of the X residue and is a defining factor for familial selectivity.

Experimental procedures
Chemokine purification
CXCL12-WT, -P10Q, and -P10del were expressed and purified as described previously (33). Briefly, chemokines (construct 6xHIS-SUMO-chemokine) were expressed recombinantly in Escherichia coli at 37°C for 5 h before harvest by centrifugation. Pellets were resuspended and lysed by French press at 4°C and centrifuged. The remaining insoluble pellet was resuspended with 6 M guanidine and partially purified by nickel column. Elution fractions were refolded by infinite dilution (100 mM Tris, 10 mM cysteine, 0.5 mM cystine, pH 8) with agitation overnight. After concentration, the fusion protein was cleaved by the addition of the protease ULP1 for at least 4 h. Additional purification was accomplished by cation-exchange chromatography and reverse-phase HPLC. Final elution fractions were pooled and lyophilized, and identity was confirmed by MS.

Figure 6. Impact of the chemokine X-factor. A, the two determined co-crystal structures of CC and CXC receptors display unique chemokine binding orientations. The CC chemokine body interacts closely with the receptor core and includes penetration of the 30s loop into the orthosteric pocket. The CXCR4 co-crystal complex has vMIP-II above the extracellular vestibule with few interactions between the receptor and the 30s loop. B, a proposed model describing the potential influence of the X-factor between CC and CXC chemokines. The X residue directly interacts with the β1-β2 hairpin to stabilize this region in an extended conformation. The loss of the X position pulls the 30s loop into a narrow conformation that favors the increased CC binding depth. Additionally, the presence of the X position constrains the orientation of the chemokine N terminus in solution. The more linear N terminus found in CXC structures matches with the chemokine position above the receptor pocket in the co-crystal structure.

Sequence alignments
Sequences for the 46 human chemokines were taken from the UniProt database with the signal sequences removed. These sequences were visualized in Jalview and sent for alignment via MUSCLE with default settings. Jalview was used to separate CXC and CC chemokines and produce figures with continuous sequences aligned by conserved cysteines. Conformer structures were separated, and all structures were aligned in PyMOL by the conserved three-stranded β-sheet. All but the N termini were removed to determine the N-terminal axis in UCSF Chimera. Resulting vectors were averaged by family to construct a mean axis for CC, CXC, XC, and CX3C chemokines. Figures overlaying co-crystal complexes were generated in UCSF Chimera. PyMOL was used to determine residues contacting the X position with a cutoff of 4 Å. Positions were separated based on contact with the main chain or side chain of the X residue. The interdisulfide distance was measured in PyMOL between the second and third cysteines.
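The family-wise averaging of N-terminal axes can be illustrated with a minimal sketch. It assumes each structure's axis has already been exported as a three-component vector (for example, from UCSF Chimera); the function names and the example vectors are hypothetical and are not taken from the study.

```python
import math
from collections import defaultdict


def unit(v):
    """Scale a 3D vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0:
        raise ValueError("zero-length vector has no direction")
    return tuple(c / norm for c in v)


def mean_axis_by_family(axes):
    """Average unit axis vectors per family and renormalize.

    axes: iterable of (family_name, (x, y, z)) pairs.
    Returns {family_name: unit mean vector}.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0.0])
    for family, vec in axes:
        u = unit(vec)
        for i in range(3):
            sums[family][i] += u[i]
    return {family: unit(total) for family, total in sums.items()}


def angle_deg(a, b):
    """Angle in degrees between two vectors."""
    dot = sum(x * y for x, y in zip(unit(a), unit(b)))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))


# Made-up vectors for illustration only (not measured data).
example_axes = [
    ("CXC", (1.0, 0.0, 0.0)),
    ("CXC", (0.0, 1.0, 0.0)),
    ("CC", (0.0, 0.0, 2.0)),
]
mean_axes = mean_axis_by_family(example_axes)
family_angle = angle_deg(mean_axes["CXC"], mean_axes["CC"])
```

Normalizing each vector before averaging keeps long axes from dominating the mean; whether the original analysis weighted vectors this way is not stated, so this is a design assumption of the sketch.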

Molecular dynamics simulation
A model of CXCL12-P10del was constructed using RosettaCM as described previously (34,35). The input template used was CXCL12-WT (PDB entry 2KED). After relaxation, the final models were compared with the NMR structure via Cα deviation using PyMOL. MD simulations were performed following previous protocols (36). Briefly, CXCL12-WT and CXCL12-P10del were simulated for 100 ns in Gromacs 2018 using the OPLS-AA/L force field and the SPC/E water model (n = 5). Simulations were overlaid with the CXCR4 co-crystal structure (PDB entry 4RWS), and differential positions for arginine 8 were illustrated in PyMOL.
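As a rough illustration of a Cα deviation measurement, the sketch below computes the root-mean-square deviation over matched Cα coordinates. It assumes the two structures are already superposed and that residues have been paired (the deleted Pro-10 has no partner in the P10del model); PyMOL performs the superposition itself, which this sketch does not attempt. The function name and coordinates are hypothetical.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD in the input units between paired, pre-superposed Calpha coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


# Two made-up atoms; the second is displaced by 2 along z.
example_rmsd = ca_rmsd(
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)],
)
```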

Chemotaxis transwells
Chemotaxis assays were performed as described previously (37). THP1 cells were washed twice with RPMI 1640 with HEPES and 0.2% BSA. The lower chamber of Corning HTS transwell plates was loaded with varying chemokine dilutions. The upper chamber was seeded with 75,000 cells. Plates were incubated at 37°C and 5% CO2 for 2 h. The migrated cells in the lower well were quantified via flow cytometry.

Western blotting analysis
HEK-293 cells (Microbix, Toronto, Canada) were cultured in 6-well plates and transfected with 5 μg of FLAG-CXCR4 48 h before the assay, following transfection protocols as described previously (38). Cells were serum-starved in Dulbecco's modified Eagle's medium with 20 mM HEPES for 5 h and then stimulated for 5 min with 10 nM chemokine and immediately harvested for Western blotting analysis for pERK-1/2 (Sigma, catalog no. M8159) and total ERK-1/2 (Sigma, catalog no. M5670). Blots were subjected to densitometric analysis, and pERK-1/2 levels were normalized to total ERK-1/2 and quantified relative to vehicle.
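The normalization described above amounts to a two-step ratio, sketched here under the assumption that each lane yields one pERK-1/2 and one total ERK-1/2 band density. The function name, the condition labels, and the densities are hypothetical.

```python
def perk_fold_over_vehicle(lanes, vehicle="vehicle"):
    """Normalize pERK to total ERK per lane, then express relative to vehicle.

    lanes: {condition: (perk_density, total_erk_density)}.
    Returns {condition: fold change over the vehicle lane}.
    """
    ratios = {}
    for name, (perk, total) in lanes.items():
        if total <= 0:
            raise ValueError(f"total ERK density must be positive for {name!r}")
        ratios[name] = perk / total
    baseline = ratios[vehicle]
    if baseline <= 0:
        raise ValueError("vehicle pERK/ERK ratio must be positive")
    return {name: ratio / baseline for name, ratio in ratios.items()}


# Made-up densitometry values for illustration only.
example_fold = perk_fold_over_vehicle({
    "vehicle": (100.0, 1000.0),
    "CXCL12-WT": (600.0, 1200.0),
    "CXCL12-P10del": (150.0, 1000.0),
})
```

Dividing by total ERK first corrects for loading differences between lanes, so the fold change reflects phosphorylation rather than the amount of protein loaded.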

NMR spectroscopy
For NMR studies, recombinant chemokines were grown in M9 minimal medium supplemented with 15NH4Cl and purified as described previously (39). Samples were prepared in a 3-mm tube at 50 μM chemokine concentration in 25 mM MES, pH 7.6, containing 10% D2O and 0.02% sodium azide. NMR HSQC experiments were carried out on a 600-MHz Bruker spectrometer at 25°C. Titration series added buffer-matched peptide (CXCR4 residues 1-38, C28A) at specified concentrations.

Radioligand binding
HEK-293T cells were transfected with CXCR4 or ACKR3 for membrane purification. After 48 h, cells were resuspended in hypotonic lysis buffer (1 mM HEPES, 2 mM EDTA, pH 7.4) on ice and centrifuged at 30,000 × g for 30 min. The membrane pellet was resuspended in binding buffer (20 mM HEPES, 10 mM MgCl2, 0.1 mM EDTA, pH 7.4) and homogenized via Polytron. 1-ml membrane aliquots were frozen at −80°C. For the competition binding assay, 15-50 pM [125I]CXCL12 (PerkinElmer Life Sciences, specific activity = 2200 Ci/mmol) was incubated with diluted membrane and chemokine for 4 h at room temperature with 1 mM CaCl2, 0.1% BSA, and 100 mM NaCl added to the binding buffer. Membranes were harvested via vacuum filtration onto a 96-well filtermat-A presoaked in 0.3% polyethyleneimine and washed three times with binding buffer. After drying, MeltiLex was applied to each filter, and radioactivity was assessed by luminescence via a MicroBeta plate counter. Results were analyzed in GraphPad Prism 8. Saturation experiments were performed similarly to those above, with the radioligand titrated from 4 to 130 pM in the presence or absence of 320 nM unlabeled CXCL12-WT. Protein concentrations of membrane preparations were measured via Bradford assay.
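Two calculations commonly applied to data of this kind are sketched below. The first subtracts nonspecific counts (measured with excess unlabeled ligand) from total counts to give specific binding. The second evaluates a one-site competition curve. The specific model fitted in Prism is not stated in the text, so the one-site form, the function names, and all numbers here are assumptions for illustration.

```python
def specific_binding(total_counts, nonspecific_counts):
    """Specific binding = total minus nonspecific, paired by radioligand concentration."""
    if len(total_counts) != len(nonspecific_counts):
        raise ValueError("total and nonspecific series must have equal length")
    return [t - n for t, n in zip(total_counts, nonspecific_counts)]


def one_site_competition(log_conc, log_ic50, top=100.0, bottom=0.0):
    """Radioligand bound (as % of control) at a given log10 competitor concentration.

    Assumes a single class of sites and a Hill slope of -1.
    """
    return bottom + (top - bottom) / (1.0 + 10.0 ** (log_conc - log_ic50))


# Made-up counts for illustration only.
example_specific = specific_binding([1200.0, 2000.0], [200.0, 500.0])

# At the IC50, half of the displaceable signal remains.
example_midpoint = one_site_competition(-9.0, -9.0)
```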

GloSensor Gi/o-mediated cAMP inhibition assay
HEK-293T cells were transfected 48 h before assay with CXCR4 and cAMP GloSensor-22F (Promega) plasmids overnight. Cells were transferred to a poly-L-lysine-coated 384-well white clear-bottom plate 24 h before the assay. The assay was started upon the addition of chemokine and GloSensor reagent in assay buffer (HBSS, 20 mM HEPES, 0.1% BSA, pH 7.4) using a FLIPR. After 15 min at room temperature, cAMP accumulation was initiated by forskolin (1 μM). After an additional 15 min, luminescence was recorded via a MicroBeta plate counter. Results were analyzed in GraphPad Prism 8.

Tango β-arrestin recruitment assay
Tango assays were performed as described previously with minor modifications (40). Briefly, HEK-293 HTLA cells were transfected with ACKR3 48 h before assay using the polyethyleneimine transfection method. Cells were transferred to a poly-L-lysine-coated 384-well white clear-bottom plate 24 h before the assay. Cells were stimulated with varying concentrations of chemokine prepared in drug buffer (HBSS, 20 mM HEPES, 0.1% BSA, pH 7.4) using a FLIPR (Molecular Devices). Assay plates were incubated overnight at room temperature and monitored using a MicroBeta (PerkinElmer Life Sciences) luminescence counter. Results were analyzed in GraphPad Prism 8.

Microscale thermophoresis
MST assays were performed using a Monolith® NT.115 (NanoTemper) system. Binding affinity was evaluated between CXCL12-WT, -P10Q, and -P10del chemokines and C-terminally Cy5-labeled P38 CXCR4 peptide. The P38-Cy5 peptide was purified by reverse-phase HPLC with identity confirmed by linear trap quadrupole MS. For MST measurement, CXCL12-WT, -P10Q, and -P10del were dialyzed with P38-Cy5 into assay buffer (20 mM MES, 0.05% Tween 20, pH 6.5). Samples were optimized for sufficient fluorescent signal and checked for aggregation and capillary adsorption. Monolith® NT.115 premium capillaries were loaded from low-binding 200-μl fast reaction tubes containing chemokine at a maximum concentration of 200 μM for CXCL12-WT and -P10del or 800 μM for CXCL12-P10Q. The instrument was set to an LED excitation power of 20% and an MST power of 40%. The resulting dose response (ΔFnorm) obtained from normalized fluorescence was analyzed by least-squares curve fit according to a 1:1 binding model (NanoTemper) to calculate Kd values. Figures were created using GraphPad Prism 8.
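A 1:1 binding fit of this type can be sketched as follows. The sketch uses the standard quadratic expression for fraction bound, which accounts for depletion of the titrant by a fixed concentration of labeled peptide, and finds Kd by a simple grid search over candidate values. The labeled peptide concentration, the synthetic data, and the grid are hypothetical, and the vendor software uses its own fitting routine rather than a grid search.

```python
import math


def fraction_bound(titrant, target, kd):
    """Fraction of labeled target bound under a 1:1 model (all in the same units)."""
    s = titrant + target + kd
    return (s - math.sqrt(s * s - 4.0 * titrant * target)) / (2.0 * target)


def fit_kd(titrant_concs, fractions, target, kd_grid):
    """Return the Kd from kd_grid that minimizes the sum of squared residuals."""
    best_kd, best_sse = None, None
    for kd in kd_grid:
        sse = sum(
            (fraction_bound(c, target, kd) - f) ** 2
            for c, f in zip(titrant_concs, fractions)
        )
        if best_sse is None or sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd


# Synthetic, noise-free example: 16-point twofold dilution from 200 uM,
# hypothetical 0.02 uM labeled peptide, true Kd of 5 uM.
TARGET_UM = 0.02
concs = [200.0 / (2 ** i) for i in range(16)]
observed = [fraction_bound(c, TARGET_UM, 5.0) for c in concs]
kd_candidates = [0.5 * i for i in range(1, 201)]  # 0.5 to 100 uM
fitted_kd = fit_kd(concs, observed, TARGET_UM, kd_candidates)
```

In practice the measured signal is ΔFnorm rather than fraction bound, so a real fit would also estimate the unbound and bound signal plateaus; they are omitted here to keep the example short.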

Differential scanning fluorimetry
Thermal protein denaturation was monitored via differential scanning fluorimetry in a Prometheus NT.48 (NanoTemper) using the intrinsic fluorescence of native tryptophan residues. Each chemokine was prepared at 0.5 mg/ml in assay buffer (20 mM MES, pH 6.5) and loaded into capillary tubes. Thermal unfolding was initiated by a linear thermal ramp from 20 to 95°C (1°C/min) with an excitation power of 50%. Unfolding transition points were determined from tryptophan fluorescence emission at 330 and 350 nm, and the first derivative of the 330/350 ratio was plotted as a function of temperature.
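The derivative-based readout can be illustrated with a short sketch: compute the 330/350 ratio at each temperature, take a central finite difference, and report the temperature where the slope is steepest. The function names and the synthetic sigmoidal trace (midpoint set to 60°C) are hypothetical; instrument software typically smooths the data before differentiation, which this sketch omits.

```python
import math


def ratio_derivative(temps, f330, f350):
    """Central-difference derivative of the F330/F350 ratio with respect to temperature.

    Returns a list of (temperature, d(ratio)/dT) for interior points.
    """
    ratio = [a / b for a, b in zip(f330, f350)]
    derivative = []
    for i in range(1, len(temps) - 1):
        slope = (ratio[i + 1] - ratio[i - 1]) / (temps[i + 1] - temps[i - 1])
        derivative.append((temps[i], slope))
    return derivative


def transition_temperature(temps, f330, f350):
    """Temperature at which the ratio changes fastest (largest absolute slope)."""
    derivative = ratio_derivative(temps, f330, f350)
    return max(derivative, key=lambda point: abs(point[1]))[0]


# Synthetic trace for illustration: 20 to 95 degrees C in 1-degree steps.
temps = list(range(20, 96))
f350 = [1000.0 for _ in temps]
f330 = [1000.0 * (1.0 - 0.3 / (1.0 + math.exp(-(t - 60) / 2.0))) for t in temps]
example_tm = transition_temperature(temps, f330, f350)
```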

Data availability
The data supporting the findings of this study are contained within the article. The data sets generated during this study are available from the corresponding author (Brian F. Volkman, bvolkman@mcw.edu) upon request.