Integrated Approaches for Genome-wide Interrogation of the Druggable Non-olfactory G Protein-coupled Receptor Superfamily

G-protein-coupled receptors (GPCRs) are frequent and fruitful targets for drug discovery and development, as well as being off-targets for the side effects of a variety of medications. Much of the druggable non-olfactory human GPCR-ome remains under-interrogated, and we present here various approaches that we and others have used to shine light into these previously dark corners of the human genome.

G-protein-coupled receptors (GPCRs; 7-transmembrane domain receptors) historically have represented both the most abundant and the most popular gene superfamily for therapeutic drug discovery and development, with perhaps 30-40% of approved drugs targeting the non-olfactory GPCRs (1,2). Typical estimates are that, at any given time, between 20 and 40% of candidate medications target non-olfactory GPCRs as their canonical or principal molecular targets (2-4). Indeed, in 2014, of the 41 new molecular entities approved by the Food and Drug Administration (FDA) (http://www.fda.gov/Drugs/DevelopmentApprovalProcess/DrugInnovation/ucm20025676.htm), 9 had GPCRs as their canonical sites of action (Table 1). Of the GPCRs targeted by known drugs, the Family A aminergic GPCRs are by far the most popular (2-4), with aripiprazole, which has multiple biogenic amine GPCR targets and a complex mode of action (5), being the bestseller in 2014. As the olfactory GPCRs do not yet represent therapeutic targets, we will focus on the non-olfactory GPCRs here.
Not only do GPCRs represent the principal therapeutic site of action of many approved and candidate medications, but GPCRs also represent prominent "off-targets" for severe and potentially life-threatening side effects. Of these, drugs with 5-HT2B serotonin receptor agonism have long been documented to induce severe, life-threatening valvular heart disease (6-8). Indeed, based on the potent 5-HT2B agonist activity of certain ergot derivatives used in treating Parkinson disease and migraine headaches (e.g. pergolide, cabergoline, and dihydroergotamine), we correctly predicted that these medications would also induce valvular heart disease (7,8). Two of these drugs (pergolide and cabergoline) were withdrawn from the international market following large-scale trials demonstrating their life-threatening side effects (8,9). In follow-up studies, we surveyed 2200 FDA-approved and investigational medications, finding that 27 had potentially significant 5-HT2B agonism, of which 6 are currently FDA-approved (guanfacine, quinidine, xylometazoline, oxymetazoline, fenoldopam, and ropinirole) (10). Interestingly, of the 2200 drugs screened, around 30% displayed significant 5-HT2B antagonist activity (10), indicating that 5-HT2B receptors represent a "promiscuous target" for approved and candidate medications. Our discovery that ergotamine and other ergots displayed functional selectivity for β-arrestin over G-protein signaling at 5-HT2B receptors (10) led to the first structure-based explication of GPCR β-arrestin-biased signaling (11). The discovery that the 5-HT2B receptor was responsible for the side effects of the appetite-suppressing medications fenfluramine and dexfenfluramine (6-8) was thus a seminal finding of immense public health importance, which ensures that drugs under development will now be counterscreened against the 5-HT2B receptor for significant agonist activity before being advanced to clinical trials.
Simultaneously with the discovery that the side effects of fenfluramine were due to the 5-HT2B agonist activity of its main metabolite norfenfluramine (6-8), it became clear that its therapeutic (anorectic) actions were due to norfenfluramine's agonist activity at the closely related 5-HT2C receptor (12). This led to the prediction that 5-HT2C-selective agonists devoid of 5-HT2B agonist activity would represent safe and effective appetite suppressants (13) and to the discovery of the 5-HT2C-preferring agonist lorcaserin, which in 2012 was approved by the FDA as the first new obesity medication in nearly 20 years (14-16). Taken together, this vignette underscores how an understanding of both on-target and off-target actions of drugs at a single subfamily of GPCRs, in this case the 5-HT2 serotonin receptor family, can be crucial for successful drug discovery efforts.

Chemical Informatics-based Approaches for Genome-wide GPCR-based Discovery
The discovery of small molecule drug-like compounds that interact with GPCRs in a number of ways (e.g. as orthosteric, allosteric, or biased ligands) is now relatively straightforward and will not be reviewed in any detail here, as there are a number of excellent and recent review articles (17-19). As these are important concepts for GPCR drug discovery, however, they will be briefly defined. Orthosteric ligands are those that occupy the site(s) of the native or natural ligand, whereas allosteric ligands occupy a site distinct from the orthosteric site (18,19). Additionally, it is now appreciated that GPCRs signal via β-arrestin and that this signaling is frequently independent of canonical G-protein modes of signaling (17). Indeed, drugs that preferentially signal via β-arrestin are considered to be β-arrestin-biased (17). For the remainder of the review, we will focus on genome-wide approaches for GPCR-based discovery, highlighting both in silico and physical screening approaches for the discovery of novel drug-like small molecules acting at GPCRs.
In silico approaches for discovering GPCR modulators typically take advantage of large chemical databases that annotate the biological properties of small molecules. Table 2 lists a few of the more popular and widely used databases. Essentially, these databases have large lists of chemical compound names and, typically, their chemical descriptors along with the biological activity associated with these compounds. Most commonly, as in the ChEMBL database and PDSP Ki database (KiDB), which rely mainly on published data, the activity is encoded as a Ki or EC50 value, whereas other databases (e.g. ChemBank and PubChem) provide the raw data as well as fitted data parameters. Utilizing the information from such databases, we and our collaborators have successfully predicted novel GPCR targets for known drugs (3,20,21) and have designed novel drugs targeting GPCRs entirely in silico (22). Importantly, in these exemplars of this overall approach, the GPCR-centric predictions were extensively validated both in vitro and in vivo in model organisms such as worms (23), zebrafish (24), mice (3,22), and most remarkably, in humans (21).
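To make the structure of such activity records concrete, the following is a minimal sketch under stated assumptions. The record layout, the field names, the compound identifiers, and the affinity values are all hypothetical and do not reproduce the actual ChEMBL or KiDB schemas; the only fixed ingredient is the standard conversion pKi = -log10(Ki in molar).

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ActivityRecord:
    """One curated measurement (hypothetical layout, not a real schema)."""
    compound: str     # compound name or identifier
    target: str       # receptor name
    measure: str      # "Ki" or "EC50"
    value_nM: float   # reported value in nanomolar


def p_value(value_nM: float) -> float:
    """Negative log10 of a concentration given in nM (e.g. pKi from Ki)."""
    return -math.log10(value_nM * 1e-9)


def best_affinity_by_target(records):
    """For each compound, keep the highest pKi seen at each target.

    Only Ki records are used; EC50 values are functional potencies and
    are deliberately not mixed with binding affinities here.
    """
    best = defaultdict(dict)
    for rec in records:
        if rec.measure != "Ki":
            continue
        pki = p_value(rec.value_nM)
        if pki > best[rec.compound].get(rec.target, float("-inf")):
            best[rec.compound][rec.target] = pki
    return {compound: dict(targets) for compound, targets in best.items()}


# Made-up values, for illustration only.
records = [
    ActivityRecord("compound_A", "5-HT2B", "Ki", 10.0),
    ActivityRecord("compound_A", "5-HT2B", "Ki", 100.0),
    ActivityRecord("compound_A", "5-HT2C", "Ki", 1000.0),
    ActivityRecord("compound_B", "5-HT2B", "EC50", 50.0),
]
profile = best_affinity_by_target(records)
```

Collapsing replicate measurements to the single highest affinity is just one possible design choice; a curated analysis might instead use the median, or retain every value together with its source publication.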
All of these resources rely upon accurately curated, precise data and, of the cited resources, ChEMBL and KiDB would appear to be the most useful, as their data come mainly from peer-reviewed publications. ChEMBL historically has drawn its data from medicinal chemistry publications, although the most recent version of ChEMBL also incorporates large amounts of data from PubChem. KiDB obtains its data mainly from non-medicinal chemistry publications (e.g. biochemistry, cell biology, pharmacology, neuroscience, and so on). Examining ChEMBL, which is the largest of these resources, we find that a large number of GPCR targets are under-annotated with respect to both their biological function and the chemical matter with which they may interact (Fig. 1, A and B). As can be seen, at least 50% of the non-olfactory GPCRs in the human genome have had few publications associated with them, based on a search of PubMed conducted in mid-2013. Additionally, more than 50% of the non-olfactory GPCRs in the human genome had few annotated small molecules (Table 2; GPCR SARfari ChEMBL release 3.0). Indeed, of the 159 "orphan" GPCRs in the ChEMBL database, only 5 had annotated small molecules with documented bioactivity. Significantly, although ChEMBL is a curated database, it misidentifies the synthetic ligand 3-{4-[4-(2-cyanophenyl)-1-piperazinyl]butyl}-1H-indole-5-carboxamide as the natural ligand for GPR35 (https://www.ebi.ac.uk/chembl/sarfari/gpcrsarfari/report/protein/266), even though kynurenic acid has been proposed as a naturally occurring ligand for GPR35 (25,26). This misannotation of GPR35 illustrates three important points: first, the need for careful expert curation; second, the fact that all of these databases contain a significant number of errors that could lead investigators astray; and third, the value of orthogonal assays (i.e. assays for which the readouts are independent) to validate "hits" and presumed active compounds.
In Fig. 2, we show that most of the non-olfactory human GPCR-ome is un-interrogated with respect to the chemical matter as annotated in ChEMBL. The practical impact of this is that, when using a database such as ChEMBL for predicting on- and off-target actions of small molecules, most of the GPCR-ome is hidden from a cheminformatics perspective. GPCRs are not unique in that most of them are understudied, as a similar conclusion was reached for kinases a few years ago (27). Indeed, Isserlin et al. (28) have described what they have dubbed the "Harlow-Knapp (H-K) effect," which they define as: "the propensity of the biomedical and pharmaceutical research communities to focus their activities, as quantified by the number of publications and patents, on a small fraction of the proteome." Isserlin and colleagues (27,28) noted that this was true for the targets they studied (kinases, nuclear hormone receptors, and ion channels) irrespective of whether they confined their bibliographic analysis to the "pre-genomic era" (i.e. prior to the publication of the draft human genome in 2000) or later dates (i.e. 2009). We performed a different type of analysis and re-interrogated the publication records for the druggable, non-olfactory GPCRs in 2014, and compared this with all publications predating 2013. As shown in Fig. 3, there was a similar although not identical trend, with most of the understudied GPCRs still being understudied and the more popular GPCRs continuing this trend. Resources available to interrogate GPCRs from a chemical standpoint, such as PubChem and ChemBank, essentially supply raw screening data with (in many but not all instances) confirmatory concentration-response curves from which estimates of potency and efficacy are derived. For example, PubChem lists screens for a large number of GPCRs and, from these screens, results for a handful of orphan GPCRs have been published in peer-reviewed journals (29-31).
These published findings have led to the discovery that pamoic acid is a potent agonist for GPR35 via β-arrestin signaling (31), as well as the discovery of novel agonists and antagonists for GPR55 (29,30).
As should be clear from the foregoing, cheminformatics-based approaches can be quite useful for predicting GPCR targets for both known drugs and other small, perhaps drug-like, molecules. However, because the bulk of the GPCR-ome is relatively uncharted territory, i.e. because very few drug-like small molecules have been identified for a large number of human GPCRs, such studies are necessarily and unavoidably underpowered.
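The coverage problem described above can be illustrated with a short sketch, assuming one already has a count of annotated small molecules per receptor. The receptor names, the counts, and the cutoff of 5 ligands are made up for illustration and are not taken from ChEMBL or from the figures in this review.

```python
def annotation_coverage(ligand_counts, min_ligands=1):
    """Return the under-annotated targets and their share of all targets.

    ligand_counts maps a target name to its number of annotated small
    molecules; min_ligands is an arbitrary, illustrative cutoff.
    """
    under = sorted(t for t, n in ligand_counts.items() if n < min_ligands)
    fraction = len(under) / len(ligand_counts) if ligand_counts else 0.0
    return under, fraction


# Hypothetical counts, for illustration only.
counts = {
    "receptor_1": 1200,
    "receptor_2": 35,
    "receptor_3": 0,
    "receptor_4": 2,
    "receptor_5": 0,
}
under, fraction = annotation_coverage(counts, min_ligands=5)
print(f"{len(under)} of {len(counts)} targets fall below the cutoff ({fraction:.0%})")
```

Any similarity-based target prediction built on such a table can only nominate receptors that already have annotated ligands, which is one way to see why the under-annotated fraction limits the power of these studies.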

Physical Approaches for Interrogating the GPCR-ome
In the past, we and others have used both radioligand binding and functional assays to elucidate the ligand-based pharmacology of non-orphan GPCRs. This approach, which we dubbed "receptorome screening," and which has been extensively described in prior reviews (17,32-34), has led us to a number of important discoveries including: the identification of the κ-opioid receptor as the site of action of the widely abused hallucinogen salvinorin A (35); the discovery that the 5-HT2B serotonin receptor is the valvulopathy receptor (6); identification of the remarkably complex pharmacology of antipsychotic drugs (36); large-scale validation of cheminformatics predictions (3,22); identification of GPCRs as high-affinity off-targets of kinase inhibitors (37-39); and large-scale validation of computationally docked and crystallography-confirmed binding poses (11,40-48). As radioligand-based approaches require radioligands with high specific activity and high affinity for their targets, they are not useful for the vast majority of GPCRs, for which such radioligands are unavailable. Additionally, the physical, informatics, and infrastructure requirements needed to routinely screen more than a few GPCRs simultaneously using radioligand binding assays are beyond the resources of most academic and industrial laboratories. Fortunately, the National Institute of Mental Health's Psychoactive Drug Screening Program (NIMH-PDSP), which is housed in the authors' laboratory, provides screening as a free service to not-for-profit investigators, thereby making this resource available to a large part of the scientific community. Indeed, in the past 5 years, more than 500 investigators world-wide took advantage of the NIMH-PDSP for GPCR profiling of novel and candidate drug-like small molecules. Functional screening methods are an alternative to radioligand binding-based approaches.
Unfortunately, there are currently no published approaches suitable for interrogating the entire olfactory and non-olfactory GPCR-ome. Indeed, screening the entire druggable GPCR-ome is technically challenging due to the diverse G-protein-mediated signaling cascades used by GPCRs (e.g. Gs, Gi, Gq, or G12/13). In the past, forced coupling of Gs, Gi, and G12/13 G-proteins to a Gq-like Ca2+ readout has been frequently used (49,50) to identify ligands for orphan and/or sparsely annotated GPCRs (17,51). Approaches that rely on native coupling to known G-proteins have been successful in identifying novel and selective ligands for orphan GPCRs (52). Additionally, many GPCRs couple to G12 and G13. Interestingly, the G12/13-dependent shedding of a membrane-bound reporter protein (53) has been reported as a potential "universal" approach for both orphan and non-orphan GPCRs.
Other approaches have relied on platforms that take advantage of G-protein-independent β-arrestin recruitment because nearly all GPCRs induce arrestin translocation (54). Many methods have emerged to quantify GPCR-β-arrestin interactions, including high content screening (HCS) (55), bioluminescence resonance energy transfer (BRET) (56), and transcriptional activation following arrestin translocation (TANGO) (57). We have found the TANGO-based approach to be quite useful for chemical interrogation of GPCRs (11,40,43,58,59). Indeed, we have recently devised a genome-wide approach using a TANGO-based readout to screen nearly all of the druggable GPCR-ome in a facile, simultaneous, and parallel manner (47).

Conclusions and Recommendations
As we have shown, although GPCRs represent a useful and important target class for therapeutic drug discovery and biochemical study, most are under-interrogated. In part, this stems from the lack of robust and scalable ways to assess their activities. New technological platforms are becoming available that allow for unbiased interrogation of the druggable GPCR-ome (47), and when these are made freely available, they will likely begin to have a transformative effect on the study of GPCRs. Nevertheless, because of the "Harlow-Knapp effect," many GPCRs will likely remain understudied despite their potential importance from both a basic science and a translational perspective.
Author Contributions-B. L. R. and W. K. K. conceived and wrote the paper. B. L. R. and W. K. K. performed the bibliographic analysis. Both authors approved the results and the final version of the manuscript.