Rearrangement of the Extracellular Domain/Extracellular Loop 1 Interface Is Critical for Thyrotropin Receptor Activation*

The thyroid stimulating hormone receptor (TSHR) is a G protein-coupled receptor (GPCR) with a characteristic large extracellular domain (ECD). TSHR activation is initiated by binding of the hormone ligand TSH to the ECD. How the extracellular binding event triggers the conformational changes in the transmembrane domain (TMD) necessary for intracellular G protein activation is poorly understood. To gain insight in this process, the knowledge on the relative positioning of ECD and TMD and the conformation of the linker region at the interface of ECD and TMD are of particular importance. To generate a structural model for the TSHR we applied an integrated structural biology approach combining computational techniques with experimental data. Chemical cross-linking followed by mass spectrometry yielded 17 unique distance restraints within the ECD of the TSHR, its ligand TSH, and the hormone-receptor complex. These structural restraints generally confirm the expected binding mode of TSH to the ECD as well as the general fold of the domains and were used to guide homology modeling of the ECD. Functional characterization of TSHR mutants confirms the previously suggested close proximity of Ser-281 and Ile-486 within the TSHR. Rigidifying this contact permanently with a disulfide bridge disrupts ligand-induced receptor activation and indicates that rearrangement of the ECD/extracellular loop 1 (ECL1) interface is a critical step in receptor activation. The experimentally verified contact of Ser-281 (ECD) and Ile-486 (TMD) was subsequently utilized in docking homology models of the ECD and the TMD to create a full-length model of a glycoprotein hormone receptor.


Glycoprotein hormones (GPHs) regulate crucial processes in metabolism and reproduction by activating their cognate glycoprotein hormone receptors (GPHRs). Malfunction of the TSHR in particular causes several clinically relevant conditions such as hypo- and hyperthyroidism. Yet the mechanism by which extracellular ligand binding induces the structural changes required for intracellular G protein activation is unknown. We pursued an integrated structural biology approach using modeling guided by experimental data to generate experimentally supported full-length TSHR models. Some insights gleaned from a TSHR model are expected to generalize to other GPHRs. These models in turn create testable hypotheses on the mechanism of GPHR activation and can promote drug development to treat GPHR-associated diseases.
GPHs bind to the ECD of their respective receptors (Fig. 1A) and consequently initiate activation, which is presumably propagated by induction of conformational changes within the ECD's hinge region (HR) (1-4). Interestingly, GPHRs still possess a binding site within the TMD not associated with physiological receptor activation but accessible to low molecular weight agonists and allosteric modulators (5-7). Another important aspect of GPHR function and physiology is posttranslational modification, including disulfide bond formation, glycosylation, tyrosine sulfation, and proteolytic cleavage, with the latter only occurring during maturation of the TSHR (for review, see Kursawe et al. (8)). However, there is no evident physiological requirement for proteolytic excision of the ~50-amino acid C-peptide, with a deletion variant showing similar characteristics to the wild type (WT) receptor (9). In contrast, glycosylation and sulfation are obligatory, with the latter being an indispensable feature of specific hormone binding (8,10). The structure of the ECD of the follicle stimulating hormone receptor (FSHR, a member of the GPHR subfamily) in complex with FSH (11) showed that the ECD forms a continuous hand-shaped structure. In this structure the C-terminal HR does not form a separate structural entity as previously anticipated but, rather, comprises the last two β-sheets of the LRR-fold, an α-helix, as well as the "thumb" region including the sulfation located at the interface to the hormone. Despite these invaluable insights on ligand binding and specificity, many details about GPHR activation are still elusive. These include the potential role of the HR residues with unresolved electron density, the significance of receptor oligomerization, and negative cooperativity in hormone binding (4).
A major obstacle in understanding GPHR activation has been the lack of an atomic detail model, particularly one that defines the relative orientation of ECD and TMD, identifies interacting residues at the interface, and illustrates the structural changes upon ligand binding within the HR and the interface.
In pursuit of a full-length structural GPHR model we implement an integrated computational/experimental approach. Chemical cross-linkers (XL) of a defined maximal length react intra- or intermolecularly with two functional groups on the protein surface. After enzymatic digestion, the resulting fragments are identified by mass spectrometry (MS). Based on the spacer lengths, an approximate upper boundary for the distance is derived and employed as a restraint for the structural models (12). This approach is limited to the soluble ECD because of difficulties purifying a functional, full-length TSHR, even in the very low quantities needed for cross-linking experiments (11,13). Therefore, we additionally use double-mutant cycle analysis and disulfide cross-linking to assess the direct contact between amino acids at the ECD/TMD interface (14,15). Even though the resulting structural restraints are sparse, they are sufficient to build structural models for the full-length TSHR with the Rosetta software suite (16). These models provide insights into TSHR activation. Specifically, we predict the relative orientation of ECD and TMD, potentially important contact points at the ECD/TMD interface, and the conformational changes necessary for receptor activation. The high sequence conservation of the investigated region within the GPHR subfamily as well as studies on chimeric receptors (17,18) suggest a shared activation mechanism with receptor-specific interactions. The reported approach can, therefore, be expanded to the remaining GPHRs and provide new insights into similarities as well as receptor-specific features of GPHR activation.

FIGURE 1. Schematic representation and identified cross-links of the TSHR·TSH complex. A, schematic representation of the TSHR·TSH complex including disulfides, cross-linked residues identified by mass spectrometry and significant residues of the TSHR, including residues with reported constitutively activating mutations (Ser-281, Ile-486, Ile-568), the sulfation site (Tyr-385), and boundaries of the model within the HR (Phe-381, Ser-304). The respective spacer-length of the cross-linking reagents is specified in the figure legend. B, boxplot of Cβ distance distribution between residues connected by chemical cross-linking within the homology models of the TSHR-ECD·TSH complex. The employed cross-link-specific cutoff distance (Table 1) is indicated by a dashed line in gray. Cross-links 8 and 9 include one residue located in the part of the HR not included within the models. For these, the distance to the closest residue included in the models is reported, and the missing residues are considered in the cutoff distance. C, cross-links (green lines) between the hinge region of the TSHR (blue) and the hormone (red, α-chain; yellow, β-chain) suggest that the HR, including the part not resolved in the FSHR-ECD/FSH template and, therefore, not included in the homology models, is oriented toward the hormone and most likely also contributes to ligand binding.

Experimental Procedures
Purification of the Soluble TSHR-ECD-A soluble TSHR-ECD with a 10-histidine tag and a glycosylphosphatidylinositol (GPI) anchor (TSHR_ECD10HisGPI) was expressed and purified as previously described (13). Briefly, the gene was stably transfected into CHO Flp-In™ cells (Thermo Fisher Scientific, Waltham, MA) according to the manufacturer's instructions. Purification was performed by liquid chromatography at 4°C with a nickel-Sepharose high performance affinity column (HisTrap HP 5 ml; GE Healthcare). After column equilibration the sample was applied (flow rate: 0.5 ml/min), and the collected fractions were tested for presence and purity of the soluble ECD by SDS-PAGE followed by Coomassie staining or Western blotting with anti-TSH receptor antibody (A9, Abcam). Fractions containing the ECD in sufficient quantity and purity (>70%) were combined, concentrated, and buffer-exchanged with PBS with a centrifugal concentrator (Corning Spin-X UF 20 ml, molecular weight cutoff 10).
Cell Culture, Transient Expression, and Characterization of Wild Type and Mutant Full-length TSHR-Mutations were introduced into the human TSHR gene and tagged with an N-terminal hemagglutinin tag in a pcDNA3.1(-)/hygromycin vector via site-directed mutagenesis as described previously (6). COS-7 cells were then transiently transfected with the WT and mutated vectors using the GeneJammer transfection reagent (Stratagene, Amsterdam, The Netherlands). Functionality of expressed TSHR variants was evaluated as described previously (19,20) by determining cell surface expression, specific binding of bovine TSH (bTSH, National Hormone and Pituitary Program of the NIDDK, National Institutes of Health), basal and bTSH (30 milliunits)-induced cAMP accumulation, and linear regression analysis (LRA) of basal cAMP accumulation versus cell surface expression.
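The linear regression analysis (LRA) named above can be illustrated as an ordinary least-squares fit of basal cAMP accumulation against cell surface expression, with the slope read as a measure of constitutive activity. The following is a minimal sketch under that assumption and not the authors' implementation; the function name `lra_slope`, the normalization to the WT slope, and all data values are hypothetical.

```python
from statistics import mean

def lra_slope(expression, basal_camp):
    """Least-squares slope of basal cAMP versus cell surface expression."""
    if len(expression) != len(basal_camp) or len(expression) < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mx, my = mean(expression), mean(basal_camp)
    sxx = sum((x - mx) ** 2 for x in expression)
    if sxx == 0:
        raise ValueError("expression values must not all be identical")
    sxy = sum((x - mx) * (y - my) for x, y in zip(expression, basal_camp))
    return sxy / sxx

# Hypothetical titration data: surface expression (% of WT maximum)
# and basal cAMP accumulation (arbitrary units).
wt_expr, wt_camp = [20, 40, 60, 80, 100], [1.0, 2.0, 3.0, 4.0, 5.0]
mut_expr, mut_camp = [10, 20, 30, 40, 50], [2.5, 4.5, 6.5, 8.5, 10.5]

wt = lra_slope(wt_expr, wt_camp)
mut = lra_slope(mut_expr, mut_camp)
relative_lra = mut / wt  # mutant slope expressed relative to the WT slope
```

Expressing the slope relative to WT makes variants with different expression levels comparable, which is the point of regressing against surface expression in the first place.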
The Gq/11 activation was determined in HEK GT cells by cotransfection of the vectors with a reporter vector harboring the firefly luciferase gene under the control of the nuclear factor of activated T-cells (NFAT) transcription factor (pNFAT-Luc, Agilent Technologies, Santa Clara, CA). 48 h after transfection cells were stimulated for 4 h with bTSH (30 milliunits) and lysed with Luciferase Cell Culture Lysis Reagent (Promega, Madison, WI). Luciferase activity was determined as described previously by Hampf and Gossen (21).
The cross-linked proteins were deglycosylated with 250 units of peptide N-glycosidase F according to the manufacturer's instructions. The samples were subsequently separated by gradient SDS-PAGE (4-12%). Bands at the positions corresponding to the molecular weights of TSHR-ECD, TSH, and the complex were excised, and samples were reduced, alkylated, and digested in-gel using trypsin.
Surface Plasmon Resonance-Surface plasmon resonance was performed on a T100 (Biacore, Uppsala, UC, Sweden). Recombinant TSHR-ECD was amine-coupled on a CM3-Chip following standard procedures. The final protein loaded amounted to 210 relative units. Experiments were conducted for eight different ligand concentrations (1500, 500, 166.67, 55.56, 18.52, 6.17, 2.06, and 0 nM) at a flow rate of 30 µl/min and 25°C. Contact time of the ligand was 300 s followed by 800-s dissociation time. The regeneration was performed using 2.5 M NaCl in HBS-EP (GE Healthcare) for 30 s followed by 200 s for stabilization. Data analysis was performed using Sigma Plot 12.0 (Systat Software Inc, Bangalore, Karnataka, India) and Biacore T100 evaluation Software 2.03.
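The listed ligand concentrations appear to follow a threefold serial dilution from 1500 nM plus a buffer-only blank. The sketch below reproduces that series and evaluates a steady-state response for two independent binding sites, the kind of model referred to in the steady-state analysis. The functional form (a sum of two Langmuir terms) and every parameter value are assumptions made for illustration; they are not taken from the paper or from the Biacore software.

```python
def serial_dilution(top, factor, steps):
    """Return `steps` concentrations starting at `top`, each divided by `factor`."""
    return [top / factor**i for i in range(steps)]

def two_site_response(conc, rmax1, kd1, rmax2, kd2):
    """Steady-state response for two independent sites (sum of two Langmuir terms)."""
    return rmax1 * conc / (kd1 + conc) + rmax2 * conc / (kd2 + conc)

# Seven non-zero concentrations (nM) plus the buffer-only blank.
concentrations = [round(c, 2) for c in serial_dilution(1500.0, 3.0, 7)] + [0.0]

# Hypothetical parameters (response units and nM), for illustration only.
responses = [two_site_response(c, 100.0, 5.0, 50.0, 500.0) for c in concentrations]
```

With real data, the four parameters would be fitted to the plateau responses rather than fixed in advance.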
Nano-HPLC/NanoESI-LTQ Orbitrap XL ETD MS-Samples were prepared in 0.1% formic acid, injected in a nanoACQUITY UPLC, trapped, and desalted for 10 min on a C18 trapping column (nanoACQUITY symmetry trapping column, Waters) with a constant flow of 15 µl/min and 2% acetonitrile. After 8 min the peptides were eluted and separated on a C18 reversed-phase column (ACQUITY UPLC Peptide BEH C18 nanoACQUITY, Waters) using a linear acetonitrile gradient (8-45%) over 85 min or 140 min (Waters Corp., Milford, MA) at a flow rate of 300 nl/min. The HPLC system was coupled online to a mass spectrometer via a chip-based nanoESI source (TriVersa NanoMate, Advion, Ithaca, NY). The spray voltage was set to 1.6-1.8 kV, and the capillary was heated to 250°C. MS/MS scans were triggered automatically after each full scan (m/z range of 400-2000, resolution of 60,000, 1 microscan, and 5 × 10^5 ions accumulated) for the 6 or 10 most abundant precursor ions exceeding an intensity of 10^3 and with a charge state of ≥2. The employed lock mass for online recalibration was 445.1200 m/z. Furthermore, the instrument was set to exclude ions from a dynamic exclusion list (500 entries) with a maximal retention period of 60 s and a relative mass window of ±3 Da for MS/MS scans. Fragmentation of selected precursor ions ±4 Da was caused by collision-induced dissociation with ramped normalized collision energy of 37 ± 15 (three steps). Activation Energy (Q) was set to 0.250 with an activation time of 30 ms. The automatic gain control target was set to 8000 ions, and the fragment analysis took place in the ion trap.
Nano-HPLC/NanoESI-Orbitrap Fusion Tribrid MS-Samples were prepared in 0.1% formic acid, injected in an UltiMate 300 HPLC, trapped, and desalted for 8 min on a C18 column (Acclaim PepMap100) with a constant flow of 5 µl/min and 2% acetonitrile. Afterward peptides were eluted and separated on a C18 separation column (Acclaim PepMap RSLC column) using a linear acetonitrile gradient (8-45%) over 80 or 130 min (Dionex Corp., Sunnyvale, CA) at a flow rate of 300 nl/min. The HPLC system was coupled online to a mass spectrometer via a chip-based nanoESI source (TriVersa NanoMate, Advion). The spray voltage was set to 1.7-1.8 kV, and the capillary was heated to 275°C. MS/MS scans were triggered automatically after each full scan (m/z range of 350-2000, a resolution of 60,000, 1 microscan, and 5 × 10^5 ions accumulated) using a top speed decision tree (5-s cycle time) setting the highest priority for the highest charge state followed by the highest abundance. Precursor ion intensity was required to exceed 2 × 10^3, and the charge state was restricted to a range of 2-7 m/z. The employed lock mass was 445.1200 m/z. The instrument was set to exclude ions from a dynamic exclusion list with a maximal retention period of 15 s and a relative mass window of ±20 ppm for MS/MS scans. Fragmentation of selected precursor ions ±4 Da was caused by higher energy collision dissociation with stepped normalized collision energy of 35 ± 10. The automatic gain control target was set to 5,000 ions. Fragment ions were detected in the Orbitrap at a resolution of 15,000.
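A relative mass window given in ppm scales with the m/z value, unlike the fixed ±3 Da window used on the LTQ Orbitrap XL. As a small worked example, the sketch below converts ±20 ppm into absolute m/z bounds at the stated lock mass. The helper names `ppm_window` and `within_ppm` are hypothetical and do not correspond to any instrument software setting.

```python
def ppm_window(mz, ppm):
    """Return (low, high) m/z bounds for a symmetric relative tolerance in ppm."""
    delta = mz * ppm * 1e-6
    return mz - delta, mz + delta

def within_ppm(observed, reference, ppm):
    """True if `observed` lies inside the ppm window around `reference`."""
    low, high = ppm_window(reference, ppm)
    return low <= observed <= high

LOCK_MASS = 445.1200  # m/z, as stated in the text
low, high = ppm_window(LOCK_MASS, 20.0)
half_width = (high - low) / 2  # about 0.0089 m/z at this mass
```

At m/z 445.12 the ±20 ppm window is therefore several hundred times narrower than a ±3 Da window.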
Molecular Modeling of the Full-length TSHR-The homology model of the TSHR-ECD in complex with bovine TSH was generated using Rosetta 3 (16,24). Briefly, homology modeling was based on the structure of the FSHR-ECD in complex with FSH (Ref. 11; PDB ID 4ay9). In addition, sections of the LRR domain were replaced by the coordinates of the TSHR-LRR domain (Ref. 25; PDB ID 2xwt) after superimposing the residues at the junctions (cut after Leu-57 or Ser-234 of the TSHR-LRR domain). The protein sequences of the TSHR-ECD and TSH were subsequently aligned to the structural coordinates of the template structures. For each template a set of 2000 models (150 for the FSHR template) was built, reconstructing backbone coordinates in gapped regions of the alignment using the cyclic coordinate descent (CCD) protocol followed by a relaxation of the structures after side-chain coordinates were added from a rotamer library. The structures were clustered using Calibur (26). In addition, Cβ distances for each model were determined with Rosetta's contactMap protocol (20).
Homology models of the TSHR-TMD were generated with the RosettaCM protocol as described by Song et al. (27). Homology modeling was performed for 20 templates of class A GPCRs considered to be in an inactive conformation (PDB codes 4n6h, 2rh1, 3uon, 4ej4, 4eiy, 1u19, 3rze, 4bvn, 4dkl, 2z73, 4u15, 4djh, 4ea3, 3v2y, 4mbs, 4ib4, 3pbl, 4ntj, and 3odu) and seven templates considered to be in an active conformation (4lde, 4mqs, 2ydv, 4j4q, 2y00, 3ayn, 4iar). A sequence and structure-based alignment of the templates was performed with the Molecular Operating Environment (MOE, 2012.10; Chemical Computing Group Inc., Montreal, QC, Canada) with manual adjustment of the alignment removing gaps within the core TM regions. For each template set 5000 models were generated, and the resulting models were clustered with Calibur both as separate sets and combined.
All homology models were evaluated based on energy and cluster size. The best scoring model from each of the 5 largest ECD model clusters was subsequently docked to the three best scoring models from each of the 10 largest TM domain clusters. Before the docking run the C-terminal loop segment of the ECD, which most likely adopts an unrepresentative conformation in the models due to the missing TMD, and the ligand bTSH were removed. In the initial placement of the two partners the ECD was placed arbitrarily in an upright position above the interface with the TMD. The initial perturbation included a random spin between 0 and 360° around and a random tilt between 0 and 90° along the sliding axis (roughly parallel to the membrane normal). For this purpose the tilt option was implemented and incorporated into the Rosetta Software suite, allowing a random tilt within a predefined limit during the initial perturbation step of the docking protocol (28). During docking a cross-interface disulfide between Cys-284 and Cys-408 was enforced. Furthermore, the low resolution step of the docking protocol was repeated until the Cβ distance between Cys-284 and Cys-408 was <15 Å. For each ECD/TMD combination 1000 models were built. The Ser-281/Ile-486 Cβ distance for each model was determined with Rosetta's contactMap protocol and the interface energies with the InterfaceAnalyzerMover (29). For remodeling of the linker region (Lys-401-Ile-411) two sets were selected: (i) all models of the best 100 by dG_separated with a Ser-281-Ile-486 Cβ distance of <15 Å (57 models) and (ii) all models with dG_separated < −6 and a Ser-281-Ile-486 Cβ distance of <10 Å (41 models). For each of these, 25 loop models were generated with a subsequent relaxation step. The resulting models were again clustered with Calibur. The best scoring models of the 10 largest clusters have been deposited at the model archive (Model Archive database; 10.5452/ma-aptif).
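The two selection rules for linker remodeling can be written down as simple filters over per-model records. The sketch below is an illustration only, assuming each docked model has already been annotated with its interface energy and Ser-281/Ile-486 Cβ distance; the `DockedModel` record, the function names, and the four example models are hypothetical and are not Rosetta output.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DockedModel:
    name: str
    dg_separated: float  # interface energy; lower is better
    cb_distance: float   # Ser-281/Ile-486 C-beta distance in Angstrom

def select_set_i(models, top_n=100, max_dist=15.0):
    """Set (i): best `top_n` models by interface energy, then the distance filter."""
    best = sorted(models, key=lambda m: m.dg_separated)[:top_n]
    return [m for m in best if m.cb_distance < max_dist]

def select_set_ii(models, max_dg=-6.0, max_dist=10.0):
    """Set (ii): all models below both the energy and the stricter distance cutoff."""
    return [m for m in models if m.dg_separated < max_dg and m.cb_distance < max_dist]

# Ensemble size: 5 ECD models x (3 models x 10 TMD clusters) x 1000 docking runs.
n_docked = 5 * (3 * 10) * 1000

# Tiny hypothetical ensemble for illustration.
models = [
    DockedModel("m1", -9.0, 8.0),
    DockedModel("m2", -7.5, 14.0),
    DockedModel("m3", -6.5, 22.0),
    DockedModel("m4", -3.0, 6.0),
]
set_i = select_set_i(models, top_n=3)
set_ii = select_set_ii(models)
```

Set (i) ranks first and filters second, so its size depends on how many of the top-ranked models also meet the distance criterion; set (ii) applies both thresholds to the whole ensemble.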
Contact maps were generated for each cluster as well as for all models with a Cβ distance cutoff of 8 Å. For the best scoring structure of each cluster the position of the hormone and the position and environment of the sulfated tyrosine residue recapitulated those of the initial homology model of the extracellular domain. The junctions were remodeled (25 decoys) followed by relaxation of the entire structure. Visualization and image generation were done using the PyMOL Molecular Graphics System (Version 1.5.0.4 Schrödinger, LLC).
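A contact map over a set of models can be summarized as the fraction of models in which each residue pair lies within the distance cutoff. The following is a minimal sketch of that idea using the 8 Å Cβ cutoff from the text; it is not Rosetta's contactMap protocol, and the function names, the data layout (residue number mapped to Cβ coordinates), and the coordinates are hypothetical.

```python
from itertools import combinations
from math import dist

def contacts(coords, cutoff=8.0):
    """Residue pairs whose C-beta atoms lie within `cutoff` Angstrom in one model."""
    return {
        (a, b)
        for (a, pa), (b, pb) in combinations(sorted(coords.items()), 2)
        if dist(pa, pb) <= cutoff
    }

def contact_frequency(models, cutoff=8.0):
    """Fraction of models in which each residue pair is in contact."""
    counts = {}
    for coords in models:
        for pair in contacts(coords, cutoff):
            counts[pair] = counts.get(pair, 0) + 1
    return {pair: n / len(models) for pair, n in counts.items()}

# Two hypothetical models; keys are residue numbers, values are C-beta coordinates.
models = [
    {281: (0.0, 0.0, 0.0), 486: (5.0, 0.0, 0.0), 568: (20.0, 0.0, 0.0)},
    {281: (0.0, 0.0, 0.0), 486: (9.0, 0.0, 0.0), 568: (15.0, 0.0, 0.0)},
]
freq = contact_frequency(models)
```

Pairs that recur across most models of a cluster are the candidates for persistent interface contacts; pairs that never fall within the cutoff simply do not appear in the result.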

Strategy for TSHR Structure Prediction Based on Chemical Cross-linking and Mutation Data
Structural modeling of the full-length TSHR was based on structural templates resolved by x-ray crystallography of the GPHR-ECD (11,25) as well as the TMD of class A GPCRs (Fig. 2). Experimental data from chemical cross-linking of the soluble TSHR-ECD with bTSH was incorporated to guide and evaluate the homology modeling of the TSHR-ECD·TSH complex. A number of class A GPCR experimental structures were incorporated into homology modeling of the TSHR-TMD by utilizing the multiple template approach of RosettaCM (27). The models of the ECD and the TMD were combined by docking with subsequent remodeling of the linker region. In this step structural flexibility of the interfaces was incorporated by combining various homology models of the ECD and TMD during docking. The putative contact of the ECD residue Ser-281 with the TMD was identified and verified by double mutant cycle analysis. This contact was used to guide and evaluate placement of the ECD in relation to the TMD during docking along with the cross-interface disulfide (11,30). To gain information on frequently occurring ECD/TMD orientations and specific interface contacts, the final ensemble of models was analyzed by clustering and contact maps. Plausibility of the most frequent ECD/TMD orientations was verified by reintroduction of the ligand into the models.

Full-length TSHR Models

FIGURE 2. [...] (E1 and E2). Homology models were constructed using Rosetta 3 for the TSHR-ECD (A1 and A2) and the multitemplate approach of RosettaCM for the TSHR-TMD (B1). Chemical cross-linking of the soluble TSHR-ECD yielded 17 cross-links that were used to guide template selection and evaluate the models of the TSHR-ECD (E1). The model sets were further analyzed by clustering analysis using Calibur (A3 and B2). Models were selected based on energy and cluster size. The combination of 30 TSHR-TMD models with 5 TSHR-ECD models by docking yielded 150,000 docked models (C1). During docking a cross-interface disulfide between Cys-284 and Cys-408 was enforced. From the docked poses ~100 models were selected based on interface score and agreement with the experimentally verified contact of Ser-281 with Ile-486 (E2) for reconstruction of the linker region (Lys-401-Ile-411, C2). The model set of the full-length TSHR was further analyzed by contact maps (D1) and clustering (D2). Feasibility of the full-length models was verified by reintroduction of the ligand and remodeling of the thumb region (D2).

Mass Spectrometry Analysis Confirms Glycosylation, Sulfation, and Proteolytic Cleavage of the Extracellular Domain
Glycosylation-Mass spectrometric analysis of the soluble TSHR-ECD after tryptic digestion identified fragments covering 80% of the protein sequence of the utilized construct. The analysis of glycosylation sites revealed complete glycosylation of three of the six putative sites within the ECD, at Asn-77/-113/-177 and a partial glycosylation of Asn-302 (31). Due to the absence of detected proteolytic peptides covering the remaining two sites (Asn-99/-198), glycosylation of these sites could not be determined by MS.
Sulfation-Sulfation of the TSHR-ECD was identified at position Tyr-385, as suggested by Costagliola et al. (10), as well as at position Tyr-387. Sulfation was typically identified at a single site or at both sites simultaneously; the peptide representing the non-sulfated form of TSHR was rarely observed. However, the mutagenesis data clearly show that the functional importance of tyrosine sulfation is exclusively attributed to Tyr-385, with no functional compensation by Tyr-387. Notably, whereas sulfation of Tyr-385 and Tyr-387 was determined in a truncated ECD, functional data were gathered from the full-length receptor. Given that secondary structure supposedly has a major influence on sulfation (32) and with the structural influence of the TMD on the HR (17), there might be a discrepancy between sulfation of the truncated and the full-length TSHR, with sulfation of Tyr-387 occurring only in the truncated receptor.
Proteolytic Cleavage-Wadsworth et al. (33) suggested that residues Ala-317-Phe-366 are posttranslationally removed with no apparent effect on TSHR function. Analysis of proteolytic cleavage of the TSHR by MS confirmed C-terminal cleavage between position Phe-366 and Gly-367 by detection of a proteolytic peptide (Gly-367-Lys-371). Yet no peptide confirming the N-terminal cleavage site between Asn-316 and Ala-317 was detected. Recent studies suggest that excision occurs by successive removal of small fragments, resulting in ragged boundaries (34-36).

Homology Models of the TSHR-ECD·TSH Complex Consistent with Chemical Cross-linking Data
The TSHR-ECD·TSH complex was studied by chemical cross-linking and MS using four different amino-reactive cross-linking reagents with differing spacer lengths. Before cross-linking the tight binding of bovine TSH to the TSHR-ECD was verified by surface plasmon resonance spectroscopy. Steady-state analysis indicates a two-site binding model as previously reported (13). Seventeen unique distance restraints could be determined within the TSHR-ECD·TSH complex (Fig. 1 and Table 1). These included nine receptor-hormone cross-links, two cross-links between the subunits of the hormone, and three within the receptor and the α-chain of the hormone, respectively. A comparative model of the TSHR-ECD·TSH complex was constructed using the experimentally determined structure of the FSHR-ECD·FSH complex (PDB 4ay9; Ref. 11) as template. The N-terminal residues for this model up to Leu-57 (FSHR) were taken from the TSHR-LRR domain structure (PDB 2xwt; Ref. 25). The majority of the XL-MS restraints are consistent with this comparative model (Fig. 1B). Specifically, of the 1800 models in the ensemble (top 90% by score), 99% fulfill 12 or more of the 17 cross-links. Two cross-links are violated in all models (Fig. 1, IDs 3 and 4), an effect that we attribute to a conformational change induced by the cross-linker (see below). If a protein exists in multiple conformations, it is sufficient if one conformation has the amino acids in close proximity to observe the cross-link. In turn, not all conformations need [...] which are assumed to be present in multiple conformations. Accordingly, we expect these to be violated in a higher fraction of the models. Cross-links within the hormone (IDs 6, 7, and 10) or the receptor (IDs 1, 11, and 15) confirm the general fold of the hormone and the ECD (Fig. 3A). Cross-links between the hormone and receptor close to loop 1 and 3 (α-L1/3) of the hormone's α-subunit (IDs 8 and 13) as well as to α-L2 at the opposite side of the hormone (IDs 2, 12, and 16) confirm a similar binding mode of bTSH as reported for the FSHR-ECD·FSH experimental structure (11,37) (Fig. 3B).

TABLE 1. Cross-linked peptides of the TSHR-ECD·TSH complex identified by mass spectrometry and resulting estimated cutoff distances for structural modeling. The suffixes after residue numbers indicate the component of the complex to which the residue belongs (a, α-chain of TSH; b, β-chain of TSH; r, receptor). The covalently cross-linked residue of the peptide is highlighted in bold. If the cross-linking reaction occurred with the protein N-terminal amino group, a bracket is added. In cases where the data are ambiguous, the less likely cross-linking sites are underlined. Modified residues are annotated in the sequence (D*, deamidated asparagine; m, oxidized methionine; B, cysteine acetamide). u, unified atomic mass.

FIGURE 5. Structural variability at the ECD/TMD interface in homology models of the TSHR. Shown is the superposition of the best scoring homology models of the largest clusters for the TSHR-ECD (A) and the TSHR-TMD (B). The ECD models are structurally similar at the putative TMD interface located at the terminal α-helix excluding the connecting loop (depicted in orange), which was removed before docking. The models of the TMD, in contrast, show greater variations in the putative interface at the extracellular loops (light orange, ECL1; yellow, ECL2; white, ECL3).
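The ensemble statistic reported for the comparative models (the share of models fulfilling at least a given number of cross-links) amounts to counting, per model, how many Cβ distances fall within their cross-link-specific cutoffs. The sketch below illustrates that bookkeeping under stated assumptions: the cutoffs and distances are invented placeholders, since the Table 1 values are not reproduced here, and the function names are hypothetical.

```python
def satisfied_count(distances, cutoffs):
    """Number of cross-links whose C-beta distance is within its cutoff in one model."""
    return sum(1 for xl, cutoff in cutoffs.items() if distances[xl] <= cutoff)

def fraction_fulfilling(ensemble, cutoffs, min_satisfied):
    """Fraction of models that satisfy at least `min_satisfied` cross-links."""
    hits = sum(1 for d in ensemble if satisfied_count(d, cutoffs) >= min_satisfied)
    return hits / len(ensemble)

# Hypothetical cutoffs (Angstrom) for three cross-link IDs.
cutoffs = {1: 20.0, 2: 25.0, 3: 15.0}

# Hypothetical per-model C-beta distances (Angstrom), keyed by cross-link ID.
ensemble = [
    {1: 12.0, 2: 24.0, 3: 30.0},  # satisfies 2 of 3
    {1: 18.0, 2: 26.0, 3: 31.0},  # satisfies 1 of 3
    {1: 10.0, 2: 20.0, 3: 14.0},  # satisfies 3 of 3
    {1: 19.0, 2: 22.0, 3: 28.0},  # satisfies 2 of 3
]
frac = fraction_fulfilling(ensemble, cutoffs, min_satisfied=2)
```

Applied to the real data, the same call with 17 cutoffs and `min_satisfied=12` would yield the kind of figure quoted in the text; a cross-link violated in every model shows up as a restraint that never contributes to the count.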

Structural Plasticity in the Curvature of the LRR Domain
Interestingly, two cross-links (Fig. 1, IDs 2 and 17) exceeded the expected maximal Cβ distance of the cross-linking reagent based on an initial model of the TSHR-ECD·TSH complex from the structure of the TSHR-LRR domain (PDB 2xwt; Ref. 25) up to Ser-234 (Ser-226 FSHR) and the HR of the FSHR-ECD·FSH structure. Superimposition of the two employed templates revealed a reduced curvature of the TSHR-LRR domain at the transition region of the templates, which results in an increased distance between the hormone and the N-terminal section of the receptor (Fig. 3F). These models also display less favorable Cβ distances for three other cross-links (IDs 3, 12, and 14). A steeper curvature was also observed in the structure of the FSHR-LRR domain (PDB 1xwd; Ref. 37). Therefore, the differences in curvature are most likely sequence specific (38) and not due to the inclusion of the HR.

Conformation of the TSHR-HR
Cross-links between the receptor's HR and the TSH hormone (IDs 8, 9, and 13) confirm a significant interface between the HR and the hormone that could be important for signal transduction (Fig. 1C). It has previously been suggested that a part of the HR, including the region that is subjected to proteolytic cleavage within the TSHR, is intrinsically disordered (39). This hypothesis is supported by the FSHR-ECD crystal structure (PDB 4ay9; Ref. 11), where no density was observed for the respective region.

A TSHR-ECD·TSH-specific Interaction between TSHR Glu-34 and TSH β-Chain Lys-101
Visual inspection of the best scoring models also suggests a potential TSHR-specific interaction at the N-terminal end of the LRR domain due to spatial proximity of the side chains of Glu-34 of the TSHR with Lys-101 of the TSH β-chain observed in two models (Fig. 3E). Interestingly, a TSHR mutation of Glu-34 (E34K) has been detected in patients with hypothyroidism (40). However, with no detailed binding data and only a slight impairment of Gs signaling reported, the putative contribution of a Glu-34/Lys-101 interaction to binding affinity and specificity is most likely only marginal.

Spatial Proximity between the TSH α-Chain N Terminus and TSH β-Chain Lys-101
Next to this interaction with the receptor, a cross-link between Lys-101 and the N terminus of the α-chain was detected (ID 5), implying a close proximity of both termini (Fig. 3C). Yet the residues are within the expected distance in only 2% of the models. Because the N-terminal amino acids of the α-chain are not ordered in any of the crystallographic structures of the human GPHs (11,37,41-44), this region is expected to be flexible. The bovine GPH α-chain features four additional amino acids, possibly increasing flexibility in the region (45).

Cross-links ID 3 and 4 Are Violated in All Comparative Models
DST yielded two further cross-links to the α-L2 (IDs 3 and 4) that are incompatible with all models (Fig. 3D). In contrast to cross-link 5, the connected residues are in structurally well defined regions. However, conformational changes in α-L2, including a disintegration of the helical fragment potentially induced by the coupling of DST, could be sufficient for the cross-link to be established. Alternatively, binding of the hormone to a second, low affinity binding site as suggested previously (13,46,47) could also be associated with a closer proximity of the cross-linked residues. A third explanation for the controversial cross-links is the possibility that the cross-link is not established between the hormone and the ECD it is bound to but, rather, with the ECD of the adjacent ECD-hormone complex in the putative trimer structure (48). However, analysis of this scenario reveals that the Cβ distance to the HR within the ECD-hormone complex does not differ much from the distance to the HR in the adjacent complex. Yet, side-chain orientation and surface distance are more favorable for a cross-link to the HR of the adjacent ECD·hormone complex (Fig. 3G).

Identification and Verification of an ECD/ECL1 Contact between Ser-281 and Ile-486 by Double Mutant Cycle Analysis
It has been demonstrated that distant mutations result in synergistic receptor activation (49), and mutations in close proximity result in a more complex pattern dependent on the side-chain substitutions (20). (Table 2). The absence of an additive effect suggests a shared leverage point of constitutive receptor activation and close spatial proximity. Targeted mutation of both residues to cysteines resulted in a receptor devoid of hormone-induced activation of the Gs and Gq signaling pathways (Fig. 4 and Table 2). Even though the ligand binding properties and cell surface expression of the double mutant are within the range of the single mutants, only the latter show ligand-induced receptor activation. Based on these observations, the lack of a ligand-induced change in activity of the double mutant is putatively caused by the presence of a disulfide bond between the introduced cysteines. The presence of a disulfide bond in this region critical to receptor activation most likely locks the receptor in a partially activated conformation and thus prevents signal propagation. For this bond to form, the two residues must be in close spatial proximity in the receptor. Exchange of both residues to aspartate yields a receptor with retained ligand binding but no ligand-induced activation of G protein signaling. This observation is consistent with the notion that repulsive forces between the negatively charged aspartate side chains prevent signal propagation. Conformational changes at the ECD/ECL1 interface, including a relative repositioning of Ser-281 and Ile-486, are therefore a requirement for receptor activation (Table 2).

Thr-490 and Ile-568 Are Not in Direct Contact with Ser-281
With the confirmation of the Ser-281/ECL1 contact, we tested whether Ser-281 is proximal to Thr-490 in ECL1 and Ile-568 in ECL2 (49). However, even though combining the two cysteine mutations has a detrimental effect on cell surface expression and ligand-induced Gq signaling, no loss of Gs activation was observed (Table 2). Therefore, Thr-490 is most likely not in close proximity to Ser-281. Combination of the CAMs S281I (LRA 37) and I568T (LRA 30) is synergistic, yielding a receptor with increased constitutive activity (LRA 78) and high levels of basal cAMP production at 92% of the activated WT receptor level despite a cell surface expression of only 27% compared with the WT. We conclude that S281I and I568T are unlikely to be in direct contact. Notably, combination of the CAMs S281I and I568T is associated with no apparent ligand-induced receptor activation despite retained binding affinity. This suggests that the S281I/I568T double mutant already adopts the conformation of the fully activated receptor and therefore shows no further ligand-induced activation. A similar phenotype has been previously associated with full receptor activation (49). In that study mutations in all three extracellular loops were necessary, whereas in our case substitutions at the ECD/ECL1 and ECL2/TM6 (55) interface were sufficient to enforce the activated conformation.
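The distinction between synergistic and non-additive double mutants can be made explicit with a small sketch. It assumes that the additive expectation is simply the sum of the two single-mutant LRA values and that a relative tolerance separates the classes. Both the summation rule and the 10% tolerance are illustrative assumptions, not the criteria used in this work. Only the three LRA values are taken from the text above.

```python
def classify_combination(single_a, single_b, double, tolerance=0.10):
    """Compare a double-mutant activity with the sum of the single-mutant activities.

    Assumes positive activity values. The sum as additive expectation and the
    relative `tolerance` are hypothetical choices made for illustration only.
    """
    additive = single_a + single_b
    if double > additive * (1 + tolerance):
        return "synergistic"
    if double < additive * (1 - tolerance):
        return "less than additive"
    return "additive"

# LRA values quoted above: S281I = 37, I568T = 30, S281I/I568T = 78.
print(classify_combination(37, 30, 78))
```

Under these assumptions the S281I/I568T combination exceeds the additive expectation of 67 and is classified as synergistic, in line with the interpretation that the two residues are not in direct contact.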

Docking of ECD and TMD to Generate a Full-length Receptor Model
Whereas homology models of the ECD showed little structural variations in the presumed interface to the TMD surrounding Ser-281, models of the TMD showed greater flexibility especially in ECL1 (Fig. 5), which is, with 8–10 additional amino acids, significantly longer than in most class A GPCRs. Accordingly, the five best-scoring, representative ECD models were docked with the 30 best-scoring, representative models of the TMD. Best-scoring representative models were chosen by clustering to mimic a conformational selection process. Cluster centers for the TMD included models derived both from TMD templates in an "active" conformation and from those in an "inactive" conformation. Analysis of the interaction energy of docked models compared with the Ser-281/Ile-486 Cβ distance reveals an energy funnel at a distance of 12.5 Å (Fig. 6A) with a very similar orientation of the ECD toward the TMD (Fig. 6C) in an upright position with Ser-281 facing toward ECL1 (Fig. 7). The best scoring models with a Ser-281/Ile-486 Cβ distance below 10 Å show greater diversity in the relative orientation of ECD and TMD (Fig. 6B). With a comparatively large Ser-281/Ile-486 Cβ distance, the cluster at 17.5 Å (Fig. 6A) was not considered for further analysis. The addition of the hormone to the full receptor models does not result in an overlap of the hormone with the membrane in any of the structures and thus confirms the plausibility of the observed ECD orientations. To allow a free exploration of ECD/TMD orientations, the flexible linker region between ECD and TMD was constructed after completion of the docking simulation. The superior energy of the model is preserved when compared with models with a Ser-281/Ile-486 Cβ distance below 10 Å (Fig. 8A). Strikingly, in the final model Ile-486 is part of an extended transmembrane helix 3 with the side chain facing away from the interface with the ECD (Fig. 7). In contrast the models with a shorter Ser-281/Ile-486 Cβ distance lack an extended TM3, enabling a loop conformation with the Ile-486 side chain facing toward the ECD. These models also show a small helical segment within ECL1 similar to the smoothened receptor (Ref. 56; PDB 4o9r) and a conformation of the C-terminal part similar to the WXFG motif present in most class A GPCR structures (WQTG in all three GPHRs) to which a pivotal role in ligand-mediated receptor activation is attributed (57) (Fig. 8B). Mutations of Trp-488 in the TSHR result in a drastically reduced cell surface expression, suggesting a similar importance (49). The comparison of contact maps of the largest and best scoring cluster to the largest cluster with an average Ser-281/Ile-486 distance below 10 Å shows that the first cluster is consistent with placing Tyr-279 and Tyr-481 in the environment of Ser-281 as has been suggested previously (50), whereas the latter displays the experimentally determined close proximity of Ser-281 and Ile-486 (Fig. 8C).

Figure legend: Comparison of the score versus Cβ distance of Ser-281 and Ile-486 after clustering (A) shows that the best models based on score and cluster size display a Ser-281/Ile-486 Cβ distance >10 Å. Differences in contact maps (B) of cluster one and four (blue, contact in every model of cluster one and in none of cluster four; red, contact only in cluster four) for the Ser-281/ECL1 interface showing that the Ser-281/Ile-486 contact is only observed in cluster four (upper black rectangle) and the aromatic environment of Ser-281 including Tyr-481 and Tyr-279 is only observed in cluster one (lower black rectangle). Superposition of the best scoring structure of cluster four (green) and the β2-adrenergic receptor (white, PDB code 2rh1) with the side chains of the WXFG motif depicted (C) shows that ECL1 of the homology model adopts a similar loop conformation in this region.
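The search for an energy funnel along the Ser-281/Ile-486 Cβ distance can be sketched as a simple binning of docked models. The sketch assumes that every model is described by a distance and a score, that lower scores are better, and that distances are grouped into fixed-width bins. The function names, the bin width, and the example values are hypothetical and do not reproduce the scoring or clustering protocol used here.

```python
def best_score_per_bin(models, bin_width=2.5):
    """Return {lower bin edge: best (lowest) score} for (distance, score) pairs.

    The bin width is an illustrative assumption; distances are in Angstrom.
    """
    best = {}
    for distance, score in models:
        lower_edge = (distance // bin_width) * bin_width
        if lower_edge not in best or score < best[lower_edge]:
            best[lower_edge] = score
    return dict(sorted(best.items()))

def funnel_bin(models, bin_width=2.5):
    """Return the lower edge of the distance bin that contains the best-scoring model."""
    best = best_score_per_bin(models, bin_width)
    return min(best, key=best.get)

# Hypothetical (distance, score) pairs, not results from this study.
docked = [(12.4, -50.0), (12.6, -48.0), (8.0, -40.0), (17.5, -45.0)]
print(best_score_per_bin(docked))
print(funnel_bin(docked))
```

Reporting the best score per bin rather than only the overall minimum keeps alternative low-energy populations visible, such as the models with a distance below 10 Å that are compared with the funnel above.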

Multiple Conformations Involved in TSHR Activation
It is possible that the two observed ECL1 conformations represent different stages during GPCR activation. The extended, low-energy TM3 conformation is similar to the activated state. The loop conformation observed in the fourth cluster would represent the basal state of the receptor. In this scenario Thr-490 is part of the small fragment that changes its conformation between the fold of the WXFG motif and an extended TM3 during activation (Figs. 7 and 8B). This is supported by the observation that substitution at this position to alanine, which has a higher helix probability than threonine (58), can facilitate the transition toward the activated conformation as observed in the CAM T490A. The high conservation of the region surrounding Ser-281 and ECL1 within GPHRs as well as the shared propensity for constitutive receptor activation by mutations of Ser-281 suggests an identical mechanism of activation and a shared ECD/ECL1 interface within GPHRs. The presented modeling approach can, therefore, be easily extended to the remaining two GPHRs. The final ensemble of models offers important insights into the likely mechanism of GPHR activation. By incorporating experimental data from chemical cross-linking coupled with MS fragment analysis and targeted receptor mutation, the quality and relevance of the final model set were significantly increased, which enabled the generation of the first experimentally supported full-length models of a GPHR.