From G Protein-coupled Receptor Structure Resolution to Rational Drug Design*

A number of recent technical solutions have led to significant advances in G protein-coupled receptor (GPCR) structural biology. Apart from a detailed mechanistic view of receptor activation, the new structures have revealed novel ligand binding sites. Together, these insights provide avenues for rational drug design to modulate the activities of these important drug targets. The application of structural data to GPCR drug discovery ushers in an exciting era with the potential to improve existing drugs and discover new ones. In this review, we focus on technical solutions that have accelerated GPCR crystallography as well as some of the salient findings from structures that are relevant to drug discovery. Finally, we outline some of the approaches used in GPCR structure-based drug design.

The process of drug discovery from bench to market is a high-risk, long-term investment that has been estimated to cost about $1.8 billion per drug (1). Such high costs, coupled with the pressure of patent expiry dates and increased regulatory constraints, have propelled the pharmaceutical industry to increase the efficiency of research and development and so reduce the attrition rate.
A key area of focus has been improvements in the quality of compounds that are discovered in the early stages of the drug discovery pipeline. The main driver for this effort has been the general observation that the hits identified from cell-based high throughput screening (HTS) strategies are usually large lipophilic molecules that are difficult to optimize and carry a number of liabilities that significantly increase their failure rate (2). One of the main advances to tackle these issues has been the utilization of fragment-based drug discovery (FBDD) approaches that rely on screening small chemical fragments (100-250 Da). Because of their small size, a significantly larger portion of chemical space can be explored with fewer compounds when compared with HTS. In addition, initial hits from a fragment screen bind more efficiently to their target and represent excellent starting points for medicinal chemists to grow and optimize these into lead and candidate molecules (3). The initial fragment hits exhibit low affinity, so they need to be screened at high concentrations that make them incompatible with biological assays. Instead, biophysical assays are used in FBDD cascades, and in addition, these approaches are combined with structural information derived primarily from x-ray crystallography. The application of structure-based drug design (SBDD) allows medicinal chemists to rationally convert fragment hits into larger compounds with higher affinity while maintaining the efficiency of binding and drug-like properties. Historically, SBDD methods gained traction with soluble proteins such as enzymes as routine generation of structural data is facilitated by their high stability in purified form (4-6). This is in sharp contrast to membrane proteins such as G protein-coupled receptors (GPCRs) that have been refractory to SBDD due to the challenges associated with obtaining high quality crystals.
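The statement that fragment hits bind more efficiently is commonly quantified with the ligand efficiency metric, the binding free energy divided by the number of non-hydrogen atoms. The following is a minimal sketch of that calculation; the function name, the temperature of 298.15 K, and the two example compounds are illustrative assumptions and are not taken from the studies cited in this review.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ligand_efficiency(kd_molar: float, heavy_atoms: int, temp_k: float = 298.15) -> float:
    """Ligand efficiency in kcal/mol per heavy atom.

    Uses LE = -dG / N_heavy with dG = RT * ln(Kd), Kd in mol/L.
    """
    if kd_molar <= 0 or heavy_atoms <= 0:
        raise ValueError("Kd and heavy atom count must be positive")
    delta_g = R_KCAL * temp_k * math.log(kd_molar)  # negative for Kd < 1 M
    return -delta_g / heavy_atoms


# Hypothetical compounds, chosen only to illustrate the comparison.
fragment_le = ligand_efficiency(kd_molar=1e-4, heavy_atoms=13)  # weak, small fragment
hts_hit_le = ligand_efficiency(kd_molar=1e-7, heavy_atoms=38)   # potent, large HTS hit

print(f"fragment LE: {fragment_le:.2f} kcal/mol per heavy atom")
print(f"HTS hit LE:  {hts_hit_le:.2f} kcal/mol per heavy atom")
```

In this made-up comparison the fragment binds a thousand-fold more weakly yet contributes more binding energy per atom, which is the property that makes it an attractive starting point for optimization.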
The main problems in the field of GPCR structural biology stem from their low stability in purified form. This intrinsic instability is primarily due to their conformational flexibility as well as their hydrophobic nature. To overcome these issues, researchers have developed a number of tools and technologies that have collectively led to a significantly increased success rate in GPCR structure resolution. As a result, the application of SBDD to this class of medically important drug targets has become a real possibility.

Overview of Technology Developments Enabling GPCR Structures
Conventional protein crystallography requires highly purified and homogeneous protein samples with accessible protein-protein interaction surfaces to allow formation of well ordered crystals. The hydrophobic nature of membrane proteins necessitates the addition of detergents for purification. Inclusion of detergent in purification has two consequences. Firstly, detergents remove the stabilizing effect of the membrane bilayer, leading to loss of structural integrity. The extent of this loss depends on the type of detergent used, and as a simple rule of thumb, there is an inverse correlation with detergent chain length. Detergents with shorter chain length result in increased loss of activity; conversely, longer chain detergents are more protective. Secondly and more importantly, detergent molecules reduce the hydrophilic surfaces required for crystallization. The extent of surface occlusion is also a function of detergent chain length, with shorter chain detergents forming smaller micelles when compared with detergents with longer chains. However, as outlined above, the application of the short chain detergents in protein purification is not generally conducive to purification of correctly folded and homogeneous membrane proteins. One of the main advances made to overcome this challenge has been the advent of lipidic cubic phase (LCP) (7). The key feature of LCP is that the crystallogenesis step is carried out in a lipidic environment that offers a more protective surrounding for the membrane proteins. This method of crystallography has gained significant popularity, and the majority of non-rhodopsin structures have been solved using this method. Non-rhodopsin receptors that have been subjected successfully to the conventional method of vapor diffusion have invariably been significantly engineered to exhibit increased thermal stability and can therefore be readily purified in harsher detergents.
A technological advance with the potential to significantly increase the success rate of LCP crystallography is the application of the x-ray free electron laser to crystals grown in LCP (8-10). Using this technique, it is possible to use small crystals for data collection by serial femtosecond crystallography. This method applies very intense and ultrashort x-ray pulses to thousands of microcrystals that are delivered to the intersection point with the beam using an injector. This method not only circumvents the need to grow sufficiently large crystals capable of withstanding radiation damage, it also reduces the total amount of protein required. Successful application of this method to GPCRs has been demonstrated by the recent structure resolution of the 5-hydroxytryptamine receptor 2B (5-HT2B), the smoothened receptor, and the angiotensin II type 1 receptor.
Recent developments in protein engineering have added another set of techniques that has greatly facilitated GPCR structural resolution. A routine protein engineering approach is to remove flexible extreme termini of receptors to reduce heterogeneity and increase chances of crystallogenesis. In addition, researchers routinely add fusion partners such as T4 lysozyme or apocytochrome b562RIL to either the N terminus or the second or third intracellular loops (11). Other fusion partners used include the catalytic domain of Pyrococcus abyssi glycogen synthase in the orexin 2 receptor (12) and rubredoxin in CCR5 (13). These fusion partners are selected because they are stable domains that crystallize readily, but more importantly, their N and C termini are in close proximity (less than 15 Å), thus allowing them to be inserted in the receptor loops without gross alteration of the receptor structure. However, in a number of cases, it appears that the addition of a fusion partner impacts the overall structure. In the cases of β2-adrenergic and adenosine A2A receptors, the addition of T4 lysozyme has been observed to increase the affinity of receptors for agonists when compared with wild type, indicating that the receptor conformation has been shifted toward the active form (14,15). Comparison of the A2A structure in the absence of any fusion with the T4 fused structure provided structural explanation for this observation; the addition of T4 lysozyme appears to increase the outward movement of TM6 consistent with shifting the receptor conformation toward the active form (16). However, the structure of A2A with apocytochrome b562RIL was closer to the non-fused structure and did not exhibit the shift toward agonist conformation (17). These observations indicate that it is critical to understand the pharmacology of receptors following generation of fusion constructs.
The addition of the fusion partner to the N terminus is one way to minimize the effect of fusion on receptor conformation while maintaining the beneficial effects of mediating improved crystal contacts (18,19). In addition to fusion proteins, researchers have used antibody fragments to the same effect. Initially, mouse monoclonal antibody fragments raised against the cytoplasmic face of the receptor were used to increase the hydrophilic interaction surface and reduce flexibility (20). More recently, single chain camelid antibodies (nanobodies) have been used to great success in aiding GPCR crystallography. These antibodies are small (15 kDa), rigid, and easy to clone, express, and purify. As GPCR co-crystallization reagents, nanobodies have been primarily used to drive and stabilize the active receptor conformation because this conformation is generally more unstable, especially in the absence of G protein (21-23). In addition, a nanobody has been used to stabilize the ternary complex of an agonist-bound β2-adrenoreceptor in complex with the full heterotrimeric G protein (24).
Conformational thermostabilization has become another established protein engineering solution to aid GPCR crystallography. This approach relies on the identification of single point mutations that increase the thermal stability of detergent-solubilized receptors in a particular conformation. To generate a conformationally thermostabilized receptor, a detergent-compatible thermal stability assay needs to be set up. In its simplest form, a labeled ligand (usually a radio-labeled one) is used to measure receptor thermal stability in a detergent-compatible ligand binding assay. The application of ligand binding to measure stability not only allows screening of mutants in a high throughput manner, it also results in the biasing of the receptor population to a particular conformation. Initially, systematic scanning mutagenesis was used to generate the mutant library (25); however, molecular evolution approaches have now been developed that apply random mutagenesis screening strategies (26). Regardless of the approach used, these methods result in stabilization of receptors in a particular conformation, thus reducing the conformational heterogeneity, which increases the success of crystallization. As with receptors with fusions, it is critically important to thoroughly evaluate the pharmacology of the stabilized receptors to ensure that the effects of stabilization on conformation are consistent with the ligand pharmacology used in the process of stabilization (27). A key advantage of receptor conformational stabilization is that stabilized receptors do not rely on ligand-induced stability to maintain the conformation and structural integrity. Consequently and in contrast to wild-type receptors, stabilized receptors allow successful structural resolution in complex with weak binding ligands (16). 
This is a critical advantage, especially when structure determination is an integral part of an SBDD campaign where many early hits may not exhibit high affinity and structural information would be critical in their development.
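The read-out of the thermal stability assay described above can be summarized as an apparent melting temperature, taken here as the temperature at which half of the ligand binding measured for the unheated sample is retained. The sketch below estimates that value by linear interpolation between measured points; the function name and all data values are hypothetical, and published protocols may instead fit a sigmoidal curve.

```python
def apparent_tm(temps_c, fraction_bound):
    """Temperature at which retained ligand binding crosses 0.5.

    temps_c: heating temperatures in ascending order (degrees C).
    fraction_bound: binding retained at each temperature, normalized to
    the unheated control. Linear interpolation is an assumed simplification.
    """
    points = list(zip(temps_c, fraction_bound))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 >= 0.5 > f1:
            return t0 + (f0 - 0.5) * (t1 - t0) / (f0 - f1)
    raise ValueError("binding does not cross 50% within the measured range")


# Hypothetical melting data for a wild-type receptor and a point mutant.
temps = [20, 25, 30, 35, 40, 45]
wild_type = [1.00, 0.90, 0.60, 0.30, 0.10, 0.02]
mutant = [1.00, 0.98, 0.92, 0.70, 0.40, 0.15]

tm_wt = apparent_tm(temps, wild_type)
tm_mut = apparent_tm(temps, mutant)
print(f"wild type: {tm_wt:.1f} C, mutant: {tm_mut:.1f} C, shift: {tm_mut - tm_wt:+.1f} C")
```

Ranking mutants by the shift relative to wild type is one simple way to pick candidates for combination, although the choice of ligand used in the assay determines which conformation is being stabilized.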

Key Features of GPCR Structures Relevant to Drug Discovery
The application of the technical advances described above has led to a plethora of GPCR structures that have paved the way for the application of SBDD approaches to this class of proteins. One of the key insights derived from the recent GPCR structural biology revolution is the understanding of receptor activation gained from analysis of structures in different conformations. In addition, a number of recent structures have revealed novel ligand binding sites outside of the main orthosteric binding pocket, which in combination with our structurally refined understanding of the activation mechanism have opened up exciting possibilities for discovering drugs to modulate GPCRs. In this section, we focus on aspects of structural biology that are of relevance to drug discovery. As most of the structures resolved to date belong to receptors in class A, this section focuses primarily on this class of receptors. The residues are referred to in the Ballesteros-Weinstein numbering system (28).
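In the Ballesteros-Weinstein scheme, the most conserved residue in each transmembrane helix is assigned the number 50, and the other residues in that helix are numbered relative to it. The sketch below shows the arithmetic; the function name is hypothetical, and the β2-adrenergic receptor sequence positions are used as illustrative inputs that should be checked against the receptor sequence of interest.

```python
def bw_number(helix: int, residue_index: int, reference_index: int) -> str:
    """Ballesteros-Weinstein label for a residue in a transmembrane helix.

    helix: transmembrane helix number (1 to 7).
    residue_index: sequence position of the residue of interest.
    reference_index: sequence position of the most conserved residue in
    that helix, which is assigned the number 50.
    """
    if not 1 <= helix <= 7:
        raise ValueError("helix must be between 1 and 7")
    return f"{helix}.{50 + (residue_index - reference_index)}"


# Illustrative inputs: TM5 of the beta2-adrenergic receptor, taking Pro211
# as the reference residue (5.50).
print(bw_number(5, 203, 211))  # Ser203
print(bw_number(5, 207, 211))  # Ser207
```

Because the label depends only on the offset from the reference residue, equivalent positions in different class A receptors share the same number even when their sequence positions differ.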
The first sight of structural changes underpinning receptor activation came from the resolution of opsin in complex with the C-terminal fragment of the α-subunit of transducin (the light transduction heterotrimeric G protein) (29). When compared with the dark-state structure of rhodopsin (the inactive state), the opsin-Gαt peptide structure revealed that on activation, there is a large outward movement of TM6 that allows a large binding site to be created for the Gα subunit. This significant rearrangement upon activation was further corroborated by the structures of the agonist-bound β2-adrenergic receptor stabilized in the active conformation by a G protein-mimicking nanobody as well as the full ternary complex of this receptor with agonist and the G protein (21,22,24). These structures also revealed the outward movement of TM6, although the extent of this movement was larger than that observed in the opsin structure. Drawing on the information from these receptor systems and the wealth of receptor mutagenesis data, it is possible to propose a common activation mechanism (30). Central to this mechanism is a set of conserved residues in the core of the receptors consisting of Leu3.43, Phe6.44, and X6.40 where X is a bulky hydrophobic residue (e.g. valine, isoleucine, leucine, or methionine). Prior to activation, interactions of these residues maintain TM3 and TM6 in the inactive state. In this state, Leu3.43 is stabilized by the X6.40 residue and Phe6.44, and following agonist binding, TM3 moves upwards, which results in stabilization of Leu3.43 against the conserved residue Leu2.46, which in turn pushes up Asn7.49 of the highly conserved NPXXY motif. This upward movement of Asn7.49 allows Tyr7.53 to hydrogen-bond with Tyr5.58 via a water molecule, which facilitates its interaction with Arg3.50 of the (D/E)RY motif, the so-called ionic lock.
The ionic lock has been proposed to maintain TM3 and TM6 in the inactive conformation through a salt bridge between Arg3.50 and Glu6.30. Collectively, these structural changes result in the stabilization of the active conformation. Consistent with their proposed role in keeping the receptor in the inactive conformation, mutations that disrupt this network often result in constitutively active receptors (examples are discussed in Ref. 30).
Understanding how ligand binding will lead to receptor activation will be very important for utilizing structural information to support SBDD. However, given the varying nature of the agonist ligands as well as their different binding sites, it is difficult to provide a unifying route for activation originating from ligand binding. Each receptor system with its unique agonist molecule and binding site will effect the changes required for receptor activation differently. For example, in the case of the β2-adrenergic receptor, agonists form hydrogen bonds with two serine residues on TM5 (Ser5.42 and Ser5.46), which results in an inward movement of TM5 causing the conserved Pro5.50 to interact with and induce a different rotameric state at Ile3.40, which in effect results in the upward movement of TM3 and the key residue Leu3.43 in the hydrophobic core. In addition, the rotameric change of Ile3.40 forces rotation of Phe6.44, which along with the upward shift of Leu3.43 completely destabilizes the hydrophobic core, leading to receptor activation (30). The resolution of the related turkey β1-adrenergic receptor in complex with a range of full and partial agonists revealed how differential ligand efficacies might be rationalized from structural insight. Similar to the β2-adrenergic receptor, a full agonist in complex with the β1-adrenergic receptor forms hydrogen bonds with the two TM5 serines outlined above. In contrast, partial agonists such as salbutamol and dobutamine do not engage Ser5.46, which presumably results in weaker stabilization of the active conformation and thus leads to reduced levels of activity (31).
Although the collective knowledge gleaned from different structures is incredibly valuable and has advanced our knowledge of receptor biology significantly, it is wise to remember that crystal structures are in essence frozen snapshots. In addition, although different structures in different conformations can be used to generate a high tech version of a zoopraxiscope movie, the full spectrum of receptor activation is undoubtedly more complex. To compensate, researchers have used complementary approaches such as spectroscopic techniques and molecular dynamics simulations to get a more complete picture of the receptor activation mechanism. Using these approaches, it has been shown that there is a spectrum of conformations more complex than the simple active and inactive conformations. Specifically, computational approaches have provided evidence for metastable states that will be very difficult if not impossible to capture experimentally (32). Atomic-level simulations based on the available crystal structures of the β2-adrenergic receptor reveal that different sections of the receptor (the ligand binding site, G protein binding site, and connector region) exhibit weak allosteric coupling and can occupy different conformations independently (32). The consequence of this loose structural relationship is that ligand efficacy only needs modulation of the equilibrium between different conformations to achieve distinct pharmacological outcomes. These computational predictions have been experimentally validated using NMR and pulsed electron paramagnetic resonance spectroscopy, where an agonist-alone bound receptor exhibits substantial conformational heterogeneity and full activation is achieved in the presence of G protein.
It appears that as agonist binding shifts the equilibrium toward active conformation, there is a concomitant increase in receptor conformation heterogeneity, which presumably allows the receptor to engage alternative signaling or regulatory proteins depending on the context and environment. Binding of G protein (or G protein mimetic) to the agonist-occupied receptor results in a reduction in receptor conformation heterogeneity and stabilization of the fully active conformation (33,34).
In addition to the information regarding receptor activation, recent x-ray structures have revealed a wide diversity of unexpected binding sites not previously predicted by mutagenesis studies or pharmacology. It appears possible to activate or block activity of GPCRs by a number of different mechanisms other than mimicking or blocking the binding of the natural agonist ligand. The first example of this was the CRF1 structure (35) where the antagonist CP-376395 was found to bind deep within the transmembrane domain close to the intracellular side of the receptor. This binding site has very restricted access, and it is possible that the ligand may enter the receptor through the membrane rather than through the entrance to the orthosteric peptide binding site even though this is an open cavity in the CRF1 receptor protein. CP-376395 has been noted to have a very slow on-rate (36), which may be due to the ligand entry route.
Recent structures have revealed that multiple binding sites can be present on some of the receptors that offer further opportunities for drug design. The x-ray structures of the protease-activated PAR1 receptor and the related class A purinergic receptor P2Y12 both indicated the presence of multiple binding pockets within the transmembrane domain (37,38). In P2Y12, pocket 1 is formed between TM3 and TM7, whereas pocket 2 is formed between TM1-3 and TM7 (Fig. 1). The antagonist AZD1283 is bound in pocket 1. This is similar to the position of vorapaxar binding to the PAR1 receptor. Modeling and mutagenesis studies suggest that some P2Y12 antagonists including those that bind covalently to the receptor may bind to pocket 2 rather than pocket 1 (38).
Structures of P2Y12 have also been solved in complex with the full agonist 2MeSADP and the related 2MeSATP (39). These agonists bind to the same overall pocket as AZD1283 (i.e. pocket 1); however, they bind in a very different orientation, which is only partially overlapping. Of more significance and interest is that the agonist-bound structures show dramatic conformational rearrangements in the extracellular region of the receptor. In the agonist-bound form, the binding pocket appears to be occluded by a lid formed by the positively charged extracellular loops and the N terminus (Fig. 1). This closed conformation is facilitated by the negatively charged phosphate group of the nucleotide agonist. In the absence of a negatively charged ligand, the binding pocket will remain open through charge repulsion of the arginine and lysine residues. Consistently, the P2Y12 structure in complex with the non-nucleotide antagonist AZD1283 shows an open pocket (Fig. 1). Such arrangement of positively charged residues and the rearrangement of the extracellular side in response to the binding of a negatively charged agonist molecule indicate the evolution of a specific mechanism for ligand recognition. The full functional implications of this charged network and structural rearrangement remain to be explored.
The structure of the P2Y1 receptor, another platelet receptor involved in platelet aggregation, has also revealed multiple binding sites (40). The nucleotide antagonist MRS2500 binds to a site at the top of the transmembrane domain between TM6 and TM7 but also involving the N terminus and extracellular loop 2. This site is distinct from the nucleotide binding site found in P2Y12, which sits deeper in the receptor. Interestingly, previous mutation studies on P2Y1 have indicated that some antagonists may bind deeper in the receptor at a site analogous to the P2Y12 receptor (Ref. 40 and references cited therein). The most unexpected finding of the P2Y1 structure is the binding site of the non-nucleotide antagonist BPTU, which was found to bind on the outside of the receptor on the lipid interface between TM1, TM2, and TM3 (Fig. 2). This binding site explains the unusual structure-activity relationship of related compounds where binding affinity appears to correlate with lipophilicity. BPTU acts as a negative allosteric modulator accelerating the dissociation of nucleotide agonists (40).
The lipophilic agonist for the GPR40 (FFA1) receptor TAK875 also has an unusual binding site that is partly outside the helical bundle (41). This compound was found to bind partly in the perceived orthosteric binding site but extends through TM3 and TM4 to the lipid membrane. Similar to the CRF1 antagonist, this ligand is also considered to enter the receptor via the lipid bilayer rather than directly from the extracellular domain. An entry route from the lipid membrane has also been proposed based on the S1P1 structure between TM1 and TM7 because normal access to the pocket is excluded by the N terminus and extracellular loops (42) (Fig. 2). It seems likely that other lipid ligands may enter via similar routes, and indeed that has also been suggested for the cannabinoid receptor family, although this has not yet been confirmed by a structure (43).

Structure-based Design Techniques Applied to GPCRs
FIGURE 1. Receptors are depicted in rainbow spectrum starting with TM1 in blue and ending with TM7 in red. In the P2Y12 receptor, pocket 1 is formed between TM3 and TM7, whereas pocket 2 is formed between TM1-3 and TM7. The P2Y12 structure with 2MeSADP shows the closed lid conformation. For clarity, portions of the extracellular loops have been removed in the PAR1 structure.
The availability of high resolution x-ray structures of GPCRs has provided the opportunity to apply structure-based methods to the design of drugs. Such methods are now well established for soluble enzyme targets but are only now being applied to membrane proteins such as GPCRs. Starting points for GPCR chemistry projects have until recently been cell-based HTS campaigns. However, for many targets of current interest, such as lipid and peptide receptors, these tend to have a poor hit rate. In addition, HTS methods bias ligand selection toward higher potency compounds, which often have high molecular weight and undesirable physicochemical properties. In the absence of structural information, lead optimization can be difficult. An alternative method is to use the detailed knowledge of the protein-ligand binding pocket to run a virtual screen whereby vast libraries (e.g. the ZINC database is a free database of ~80 million commercially available compounds) are screened by docking compounds into models of the receptor and then scoring the fit using computational programs such as GLIDE (Schrödinger, LLC). The highest scoring hits are then selected for biochemical screening. Virtual screen hit rates of 3-10% have been reported for a range of GPCR targets including the adenosine A2A receptor (44), histamine H1 (45), and the chemokine receptor CXCR7 (46).
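The triage step of a virtual screen, in which docked compounds are ranked by score and the best are sent for experimental testing, can be sketched as follows. This is an illustration only: it does not call any docking program, the compound names and scores are invented, and the convention that a more negative score indicates a better predicted fit is an assumption that depends on the scoring function in use.

```python
def select_top_scoring(scores: dict[str, float], n: int) -> list[str]:
    """Return the n compounds with the best (most negative) docking scores."""
    return sorted(scores, key=scores.get)[:n]


def hit_rate(tested: list[str], confirmed: set[str]) -> float:
    """Fraction of experimentally tested compounds that were confirmed as hits."""
    if not tested:
        raise ValueError("no compounds were tested")
    return sum(1 for compound in tested if compound in confirmed) / len(tested)


# Hypothetical docking scores for a five-compound library.
docking_scores = {
    "cmpd_a": -9.1,
    "cmpd_b": -6.3,
    "cmpd_c": -8.4,
    "cmpd_d": -7.7,
    "cmpd_e": -5.0,
}

selected = select_top_scoring(docking_scores, n=3)
confirmed_hits = {"cmpd_c"}  # hypothetical outcome of the biochemical screen
print(selected, f"hit rate: {hit_rate(selected, confirmed_hits):.0%}")
```

In practice the library would contain millions of entries and the selected set would typically be filtered further, for example by chemical diversity or visual inspection, before purchase and testing.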
The characteristics of the binding site (in terms of size, shape, and physicochemical properties, such as lipophilicity and hydrogen bonding) are analyzed to design ligands that are optimized to achieve high affinity binding. It is now clear that water molecules within the receptor are an important consideration in drug design. The deep pockets of GPCRs are filled with water molecules. In the highest resolution structures, crystallographic waters can be resolved, and computational software programs such as WaterMap (Schrödinger, LLC) can be used to predict the position of water molecules and determine their energetics. Water molecules that are trapped in lipophilic pockets have a high relative energy and have been called "unhappy waters" in comparison with "happy waters" present in bulk solvent. GPCR binding sites usually include a number of unhappy waters, and displacement of these by small molecule drugs is energetically favorable, contributing to potent and ligand-efficient binding. The location of water molecules within GPCR structures is important in understanding ligand binding, selectivity, and the design of new compounds (47).
Although the optimization of antagonist ligands is driven primarily by affinity, the application of structure-based approaches to the design of agonist ligands is more challenging. In this case, specific interactions must be made between the ligand and the receptor to trigger the conformational changes associated with receptor activation as described above. Compounds that bind preferentially to the agonist conformation will stabilize this form of the receptor and alter the equilibrium between agonist and antagonist forms. The availability of x-ray structures in the active conformation allows this to be modeled, although a deeper understanding of the structural basis of efficacy will require many more structures in complex with agonists of different levels of efficacy. For now, it is important to use structural information in conjunction with data obtained from cell-based functional assays to guide compound selection in agonist projects.
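The equilibrium argument above can be illustrated with the classical two-state receptor model, which is a deliberate simplification of the multi-state behavior described earlier in this review. In the sketch below, the receptor interconverts between an inactive and an active form with equilibrium constant L, the ligand binds the inactive form with dissociation constant K_A, and alpha is the ratio of its affinity for the active form over the inactive form. The function name and all parameter values are assumptions chosen for illustration.

```python
def fraction_active(ligand_conc: float, k_a: float, l_const: float, alpha: float) -> float:
    """Fraction of receptors in the active state under a two-state model.

    ligand_conc and k_a share the same concentration units.
    l_const: ratio of active to inactive receptor with no ligand present.
    alpha: > 1 for agonists, 1 for neutral ligands, < 1 for inverse agonists.
    """
    x = ligand_conc / k_a                # occupancy term for the inactive state
    active = l_const * (1 + alpha * x)   # free plus ligand-bound active receptor
    return active / (1 + x + active)     # divided by all receptor species


basal = fraction_active(0.0, k_a=1e-7, l_const=0.01, alpha=1.0)
agonist = fraction_active(1e-6, k_a=1e-7, l_const=0.01, alpha=100.0)
neutral = fraction_active(1e-6, k_a=1e-7, l_const=0.01, alpha=1.0)
inverse = fraction_active(1e-6, k_a=1e-7, l_const=0.01, alpha=0.01)
print(f"basal {basal:.4f}, agonist {agonist:.4f}, neutral {neutral:.4f}, inverse {inverse:.4f}")
```

A ligand with no preference for either state binds without changing the active fraction, which is why affinity alone does not predict efficacy in this model.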
There is an increasing interest in the development of allosteric modulators as drugs directed at GPCRs. These may have improved selectivity and therapeutic index when compared with orthosteric ligands (48). Novel x-ray structures with allosteric modulators bound have now been solved for class A muscarinic M3 receptor (23), class B CRF1 receptor (35), and class C mGlu1 and mGlu5 receptors (49,50). Of particular interest was the x-ray structure of the M3 muscarinic receptor simultaneously bound to the orthosteric agonist iperoxo and the positive allosteric modulator LY2119620 found in the extracellular vestibule of the receptor (23). The extracellular vestibule has been suggested as an entry point on the receptor for the binding of the endogenous ligand acetylcholine as well as drugs targeted at these receptors (51). For example, the orthosteric muscarinic antagonist tiotropium is predicted by molecular dynamics simulations to bind to an allosteric site in a metastable binding form on its way to bind to the orthosteric site in the receptor. Interestingly, differences in binding at the site between M2 and M3 receptors may contribute to the different kinetics that tiotropium shows at these receptor subtypes (51).
FIGURE 2. Surface representation of GPR40, β2-adrenergic receptor, and S1P1 receptor in complex with the indicated ligands. The top panel shows the extracellular faces of the receptors; in contrast to the β2-adrenergic receptor, the route to the ligand binding site in GPR40 and S1P1 is occluded from the top. The bottom panel shows the side view of the GPR40 and S1P1 receptors where the ligand binding site is clearly visible. It is thought that in these receptors with lipophilic ligands, the route to ligand binding is through the membrane bilayer.
Biased agonists are another highly active area in the field of GPCR drug discovery (52). As yet, the structural basis of bias remains to be elucidated and will rely on solving structures in the presence of other signaling molecules such as β-arrestin as well as getting multiple co-structures with ligands showing different biases.
Drugs that have been identified using structure-based methods are now progressing to clinical trials (53). It is anticipated that as has been shown for soluble targets, the success rate of these compounds progressing through the different stages of development should be higher than those obtained by more empirical methods.
Author Contributions-All authors contributed to the planning, discussions, and writing of the manuscript.