Solution Structure and Characterization of the LGR8 Receptor Binding Surface of Insulin-like Peptide 3*

Insulin-like peptide 3 (INSL3), a member of the relaxin peptide family, is produced in testicular Leydig cells and ovarian thecal cells. Gene knock-out experiments have identified a key biological role in initiating testes descent during fetal development. Additionally, INSL3 has an important function in mediating male and female germ cell function. These actions are elicited via its recently identified receptor, LGR8, a member of the leucine-rich repeat-containing G-protein-coupled receptor family. To identify the structural features that are responsible for the interaction of INSL3 with its receptor, its solution structure was determined by NMR spectroscopy together with in vitro assays of a series of B-chain alanine-substituted analogs. Synthetic human INSL3 was found to adopt a characteristic relaxin/insulin-like fold in solution but is a highly dynamic molecule. The four termini of this two-chain peptide are disordered, and additional conformational exchange is evident in the molecular core. Alanine-substituted analogs were used to identify the key residues of INSL3 that are responsible for the interaction with the ectodomain of LGR8. These include ArgB16 and ValB19, with HisB12 and ArgB20 playing a secondary role, as evident from the synergistic effect on the activity in double and triple mutants involving these residues. Together, these amino acids combine with the previously identified critical residue, TrpB27, to form the receptor binding surface. The current results provide clear direction for the design of novel specific agonists and antagonists of this receptor.

ered and isolated as a cDNA clone from a boar testis cDNA library (1). It is primarily produced by prenatal and postnatal mature Leydig cells of the testis and the thecal cells of the ovary. Its primary structure showed it to be a bona fide member of the insulin-insulin-like growth factor-relaxin superfamily, which is now known to have a total of ten members in the human. Like insulin, INSL3 is first synthesized as a pre-prohormone precursor containing a signal peptide linked to B-C-A domains. It undergoes processing by hitherto unidentified proprotein convertases, resulting in the production of mature INSL3 consisting of an A-B heterodimer that is covalently linked by two interchain disulfide bonds and one intrachain disulfide bond (Fig. 1) (2). These disulfide bonds are essential for maintaining its expected characteristic insulin-like conformation and its unique biological activities (3). The receptor for INSL3, LGR8, is a member of the leucine-rich repeat-containing G-protein-coupled receptor family (4). It is closely related to the relaxin receptor, LGR7, and has been recently classified as relaxin family peptide (RXFP) receptor, RXFP2 (5). Importantly, relaxin peptides from some species can also bind to and activate LGR8, albeit with a lower affinity than INSL3 (5).
Male mice homozygous for targeted disruption of either the Insl3 or LGR8 gene exhibit bilateral cryptorchidism (6,7), a defect in testicular descent due to an abnormality in gubernacular development. In humans this is the most common congenital disorder during sexual differentiation, affecting ~3-5% of newborn male infants (8). It results in defective spermatogenesis, infertility, and an associated high risk for testicular malignancy. Female homozygous mice show impaired fertility associated with abnormal estrus cycle length (6). Overexpression of the INSL3 protein in female mice causes the ovaries to descend into the inguinal region due to an overdeveloped gubernaculum (9). Synthetic INSL3 will induce the growth of the gubernaculum in whole organ cultures (10). Hence the peptide is essential for the transabdominal phase of testicular descent where the gubernaculum proliferates and results in the migration of the testis to the inguinal region. More recent studies have uncovered prominent new roles for INSL3 in modulating both male and female germ cell function (11). In mammals it is well recognized that germ cell maturation is controlled by luteinizing hormone (LH). Oocyte maturation is induced by the pre-ovulatory surge of LH, and, in the testis, the survival of male germ cells is also controlled by LH. However, because LH acts on the follicular cells in the female and the Leydig cells of the male, and not on the germ cells directly, local paracrine factors are likely involved in direct regulation of germ cell maturation. It has now been demonstrated that LH acts on the ovarian theca and testicular Leydig cells to produce INSL3 (11). This INSL3 can then bind to the LGR8 receptor, which is present on the germ cells, to activate the inhibitory G-protein (Gi), leading to decreased cAMP production.
Hence, treatment with INSL3 initiated meiotic progression (as measured by germinal vesicle breakdown) in arrested oocytes in vitro and reversed the decrease in testis weight (caused by increased germ cell apoptosis), which is induced by a gonadotrophin-releasing hormone antagonist in vivo (11). This significant finding demonstrates that INSL3 has a potential clinical application as a highly specific regulator of fertility. Furthermore, LGR8 antagonists have the potential to be highly specific contraceptives. Indeed, we recently demonstrated that LGR8 antagonists based on cyclic peptide mimetics of the INSL3 B-chain can mimic the effect of gonadotrophin-releasing hormone antagonist treatment in prepubertal rats, suggesting that direct antagonism of LGR8 function may be an effective contraceptive treatment (12).
The results of recent structure-function studies of INSL3 have shown that the primary receptor binding region resides within its B-chain. Analogs of INSL3 with modifications in the B-chain at positions 25 and 27 have shown that TrpB27 is essential for characteristic INSL3 activity (13). More recently, our demonstration that B-chain-only INSL3 peptides can bind to the primary ligand binding site in the ectodomain of LGR8 and act as antagonists (12) has highlighted that the B-chain contains the essential residues for primary binding. It has also enabled a detailed structure-activity relationship study of analogs of the linear peptide to be undertaken. The results showed that the minimum length required for binding was residues 11-27 and confirmed the critical importance of the TrpB27 residue (12). On the basis of this information, and to map the primary LGR8 receptor binding domain of human INSL3 more precisely, we undertook its solution structure determination and correlated this with the LGR8 binding and activation of INSL3 analogs with selected alanine substitutions in the B-chain.
Peptide Characterization-The purity of each synthetic peptide was assessed by analytical reversed-phase high-performance liquid chromatography and MALDI-TOF mass spectrometry using a Bruker Autoflex II instrument (Bremen, Germany) in the linear mode at 19.5 kV. Peptides were quantitated by amino acid analysis of a 24-h acid hydrolyzate using a GBC Ltd. analyzer (Melbourne, Australia).
Ligand Binding Assays-The method to study the binding of 125I-INSL3 to HEK-293T cells stably transfected with human LGR8 has been described previously (15). 125I-INSL3 was kindly provided by Prof. Pierre DeMeyts (Hagedorn Research Institute, Denmark). Data are expressed as mean ± S.E. of percent specific binding of triplicate determinations made from at least three independent experiments. Data were analyzed using GraphPad Prism (GraphPad Inc., San Diego, CA), and a nonlinear regression one-site binding model was used to plot curves and calculate pKi values. Final pooled pKi data were analyzed using one-way analysis of variance coupled with Bonferroni's multiple comparison test for multiple group comparison.
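As a rough illustration of the curve analysis described above, the following Python sketch evaluates a one-site competition curve and converts an IC50 to a pKi. The Cheng-Prusoff correction, the function names, and the radioligand concentration and Kd in the example are assumptions introduced here for illustration; they are not taken from the assay protocol or from the GraphPad Prism implementation.

```python
import math


def one_site_competition(log_conc, log_ic50, top=100.0, bottom=0.0):
    """Percent specific binding at a log10 molar competitor concentration."""
    return bottom + (top - bottom) / (1.0 + 10.0 ** (log_conc - log_ic50))


def pki_from_ic50(ic50_molar, radioligand_molar, kd_molar):
    """Cheng-Prusoff correction (assumed here): Ki = IC50 / (1 + [L]/Kd).

    Returns pKi = -log10(Ki).
    """
    ki = ic50_molar / (1.0 + radioligand_molar / kd_molar)
    return -math.log10(ki)


# Hypothetical numbers for illustration only (not data from this study).
half = one_site_competition(log_conc=-9.0, log_ic50=-9.0)  # curve midpoint
pki = pki_from_ic50(ic50_molar=1e-9, radioligand_molar=1e-10, kd_molar=1e-9)
print(f"binding at the IC50: {half:.1f}%, pKi: {pki:.2f}")
```

In practice the curve parameters would be obtained by nonlinear regression over the full competition data set; the sketch only shows the model and the unit conversion.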
Functional cAMP Assays-Receptor cAMP signaling was assessed using a cAMP reporter gene assay (16). 293T cells in 96-well plates were co-transfected with LGR8 and a pCRE-β-galactosidase reporter plasmid (16) (courtesy of Dr. R. Cone) to assess the cAMP signaling response to INSL3 and INSL3 analogs. Co-transfected cells were incubated with increasing concentrations of INSL3 analogs for 6 h, after which the medium was aspirated and the cells were frozen at −80 °C overnight. The amount of cAMP-driven β-galactosidase expression in each well was determined by incubating the cells in 25 μl of lysis buffer (10 mM sodium phosphate buffer, pH 8.0, 0.2 mM MgSO4, 0.01 mM MnCl2) for 10 min, 100 μl of assay buffer (100 mM sodium phosphate buffer, pH 8.0, 2 mM MgSO4, 0.1 mM MnCl2, 0.5% Triton X-100, 40 mM β-mercaptoethanol) for a further 10 min before 25 μl of enzyme substrate solution (1 mg/ml chlorophenol red β-D-galactopyranoside (Roche Applied Science) in 100 mM sodium phosphate buffer, pH 8.0, 2 mM MgSO4, 0.1 mM MnCl2) was added to each well, and the plate was incubated for 30 min. The absorbance of each well was determined at 570 nm using a Ceres UV900C plate reader (Bio-Tek Instruments). All experiments were repeated at least three times with triplicate determinations within each assay. Results are plotted as mean ± S.E. of percent normalized response compared with 1 nM INSL3. Data were analyzed using one-way analysis of variance coupled with Bonferroni's multiple comparison test for multiple group comparison.
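The normalization step mentioned above can be sketched in a few lines of Python. This is only an illustration: the baseline subtraction using untreated wells, the function names, and all absorbance readings are assumptions, since the text states only that responses are expressed as a percentage of the response to 1 nM INSL3.

```python
import statistics


def percent_of_reference(values, reference_mean, baseline_mean=0.0):
    """Express raw readings as percent of the reference response above baseline."""
    span = reference_mean - baseline_mean
    return [100.0 * (v - baseline_mean) / span for v in values]


def mean_and_se(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    n = len(values)
    return statistics.mean(values), statistics.stdev(values) / n ** 0.5


# Hypothetical absorbance readings at 570 nm (not data from this study).
reference = [1.20, 1.25, 1.15]  # wells treated with 1 nM INSL3
vehicle = [0.20, 0.22, 0.18]    # untreated wells (assumed baseline)
analog = [0.70, 0.72, 0.68]     # wells treated with an analog

normalized = percent_of_reference(
    analog, statistics.mean(reference), statistics.mean(vehicle)
)
mean_pct, se_pct = mean_and_se(normalized)
print(f"{mean_pct:.1f} ± {se_pct:.1f} % of the 1 nM INSL3 response")
```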
NMR Spectroscopy-Pulsed-field gradient NMR diffusion experiments were performed with a two-dimensional sequence using stimulated echo longitudinal encode-decode (17) as described previously (18). Dioxane (0.2 mM) was used as the internal standard (19). Samples prepared for structure determination contained ~1 mM of peptide dissolved in either 90% H2O and 10% D2O (v/v) or 100% (v/v) D2O at pH 4.0. Spectra were recorded at 290, 298, and 303 K on a Bruker Avance 600-MHz spectrometer or on a Bruker DMX 750-MHz spectrometer. Two-dimensional experiments recorded included double quantum filtered correlation spectroscopy (DQF-COSY), TOCSY (using an MLEV-17 spin lock sequence with a mixing time of 80 ms), and NOESY with mixing times of 100, 150, or 200 ms. Slowly exchanging NH protons were detected by acquiring a series of one-dimensional and TOCSY spectra immediately after dissolution in D2O. As most amides disappeared within the first 1 h, resonances still visible after 1 h were considered to be protected from the solvent by hydrogen bonding. These included LeuA12, SerA13, CysA15, LeuA20, LeuA21, ThrA22, LeuA23, CysA24, GluB7, LeuB9, ArgB16, AlaB17, LeuB18, ValB19, ArgB20, ValB21, and CysB24. Spectra were processed using XWINNMR (Bruker).
Structure Calculations-Distance restraints were derived primarily from 150-ms NOESY spectra recorded at 298 K and 600 MHz in H2O or D2O. Cross-peaks were assigned and integrated in XEASY and converted to distance restraints using CYANA. Due to the generally broad lines of many amide protons, the 3J(NH-Hα) coupling constants were ambiguous. Thus, in most cases, broad angle restraints of −100 ± 80° were included, and then only for residues where a positive angle could be excluded based on a strong sequential Hα(i−1)-HN(i) NOE compared with the intra-residual Hα(i)-HN(i) NOE. Side-chain χ1 angles and stereospecific assignments were determined on the basis of observed NOE and 3J(Hα-Hβ) coupling patterns. Hydrogen bonds were included in the structure calculations for all amide protons concluded to be slow exchanging once a suitable acceptor could be identified in the preliminary structures. Three-dimensional structures were calculated using simulated annealing and energy minimization protocols from ARIA (20) in the program CNS (21) as described previously (18). Briefly, the protocol involves a high temperature phase comprising 4000 steps of 0.015 ps of torsion angle dynamics, a cooling phase with 4000 steps of 0.015 ps of torsion angle dynamics during which the temperature is lowered to 0 K, and finally an energy minimization phase comprising 5000 steps of Powell minimization. The subsequent refinement in explicit water involves heating to 500 K via steps of 100 K, each comprising 50 steps of 0.005 ps of Cartesian dynamics followed by 2500 steps of 0.005 ps of Cartesian dynamics at 500 K, before a cooling phase where the temperature is lowered in steps of 100 K, each comprising 2500 steps of 0.005 ps of Cartesian dynamics. Finally, the structures were minimized with 2000 steps of Powell minimization. The coordinates representing the solution structure of INSL3 have been submitted to the Protein Data Bank and given the accession number 2H8B.
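To make the scale of the dynamics stages described above easier to see, the short Python sketch below tabulates the simulated time of each stage from the step counts and time steps quoted in the text. The stage labels and the helper name are illustrative assumptions; this is not CNS or ARIA input, and the heating and cooling ramps of the water refinement are listed per 100 K step because the number of such steps is not stated.

```python
# (label, number of steps, time step in ps) for the stages quoted in the text.
STAGES = [
    ("high-temperature torsion angle dynamics", 4000, 0.015),
    ("cooling torsion angle dynamics (to 0 K)", 4000, 0.015),
    ("water refinement: heating, per 100 K step", 50, 0.005),
    ("water refinement: dynamics at 500 K", 2500, 0.005),
    ("water refinement: cooling, per 100 K step", 2500, 0.005),
]


def simulated_time_ps(n_steps, dt_ps):
    """Simulated time in picoseconds for n_steps of length dt_ps."""
    return n_steps * dt_ps


durations = {label: simulated_time_ps(n, dt) for label, n, dt in STAGES}
for label, ps in durations.items():
    print(f"{label}: {ps:.2f} ps")
```

The Powell minimization phases (5000 and 2000 steps) are omitted because they are not time-stepped dynamics.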

Synthesis of Human INSL3 and Analogs-Human INSL3 has previously been successfully prepared in our laboratories by random combination of each of the A- and B-chains at high pH (22). Although overall yields were good, equivalent yields but with an easier scale-up were obtained by a regioselective disulfide bond formation approach. The nine B-chain Ala-substituted INSL3 analogs were also made by the same method and in good yield. Each peptide was comprehensively chemically characterized, including by MALDI-TOF mass spectrometry (see supplementary material), and its high purity confirmed.
NMR Diffusion Measurements-Translational diffusion measurements using pulsed-field gradient experiments with dioxane (hydrodynamic radius 2.12 Å) as an internal standard were used to assess the aggregation state of INSL3 in solution. On the basis of the equation derived by Wilkins et al. (19) ($R_h = (4.75 \pm 1.11)N^{0.29 \pm 0.02}$, where $N$ is the number of residues in the protein and $R_h$ is the hydrodynamic radius in Angstroms) the expected hydrodynamic radii for a monomer (57 amino acids) and a dimer (114 amino acids) of INSL3 are 15.3 and 18.8 Å, respectively. The experimental hydrodynamic radius of INSL3 was found to be 16.7 Å at both 0.1 and 1 mM, confirming that there is no concentration-dependent aggregation. Although this value is a little higher than expected for a monomer, given the identical diffusion coefficients at the two concentrations this likely reflects a less compact structure rather than any aggregation.
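The expected radii quoted above follow directly from the Wilkins relation, as the minimal Python sketch below shows. The function names are illustrative. The second helper assumes the usual internal-standard relation in which the hydrodynamic radius scales inversely with the measured diffusion coefficient; that relation is stated here as an assumption and is not spelled out in the text.

```python
def wilkins_rh_folded(n_residues, prefactor=4.75, exponent=0.29):
    """Predicted hydrodynamic radius (in Å) of a folded protein of n residues."""
    return prefactor * n_residues ** exponent


def rh_from_reference(d_reference, d_protein, rh_reference=2.12):
    """Hydrodynamic radius (in Å) from two diffusion coefficients.

    Assumes R_h(protein) = R_h(reference) * D(reference) / D(protein),
    with dioxane (2.12 Å) as the reference measured in the same sample.
    """
    return rh_reference * d_reference / d_protein


monomer = wilkins_rh_folded(57)    # A-chain plus B-chain of INSL3
dimer = wilkins_rh_folded(114)
print(f"monomer: {monomer:.1f} Å, dimer: {dimer:.1f} Å")
```

Only the central values of the prefactor and exponent are used; propagating their quoted uncertainties would widen both predictions.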
NMR Assignments and Structure Determination of INSL3-For the structure determination of INSL3, extensive two-dimensional NMR data were recorded at 600 and 750 MHz. Although the signals were well dispersed, indicative of a folded structure, a large number of residues had significantly broadened resonances. Despite this broadening it was possible to achieve resonance assignments using two-dimensional sequential assignment strategies. For the amide protons of residues AlaA7, ArgA8, CysA11, SerA13, GlyA14, and PheB14, the broadening of the amide signals was so severe that they could not be detected. However, the side-chain resonances from these residues were assigned based on their patterns in the aliphatic region of the TOCSY spectrum, and the assignments were confirmed by sequential as well as medium and long range NOEs. A number of other residues had broadened signals, including CysA10, CysA15, ThrA16, GluB7, LysB8, LeuB9, CysB10, GlyB11, and HisB12, confirming that INSL3 undergoes considerable conformational exchange in solution. Furthermore, minor conformations appeared to be present for several residues, as evident from "brothering" of signals. This is in agreement with what has previously been seen for H3 relaxin in solution, confirming that peptides of the relaxin family are quite flexible. To assess whether the structure or dynamic processes in INSL3 were affected by solution conditions, spectra were recorded in the pH range 2.7-7.4 (supplementary data). Although in general resonance broadening was increased at higher pH as a result of faster solvent exchange, no significant changes in Hα chemical shifts or changes in the prevalence of minor conformations were observed as a function of pH. INSL3 contains five Pro residues of which one is the N-terminal residue of the B-chain. The other four are all in the trans conformation, as evident from strong sequential Hα(i−1)-Hδ(i) NOEs.
Structural restraints in the form of distance restraints based on NOE cross-peak intensities, backbone and side-chain dihedral angle restraints derived from coupling constants and NOE patterns, and hydrogen bonds derived from amide-exchange data were used to calculate the structure of INSL3. Structures were calculated by simulated annealing followed by refinement and energy minimization in a water shell. Ambiguous cross-peaks were subsequently assigned based on preliminary structures. Although most amides exchanged within minutes, a number were still visible after 1 h. Because these were all found to be consistent with the hydrogen bond patterns of the elements of secondary structure in preliminary structures, they were included as restraints in the calculations.
Description of the Three-dimensional Structure of Human INSL3-Fig. 2 shows a superimposition of the family of 20 structures representing the solution structure of INSL3. It is clear that INSL3 adopts a typical relaxin/insulin fold with a well defined core but with several distorted regions. The structural statistics (summarized in Table 1) show that the structures are in good agreement with the experimental data and have favorable covalent geometry. The A-chain comprises two parallel α-helical regions (residues A5-A12 and A17-A24) separated by an extended segment (residues A14-A16). The latter forms a small β-sheet by hydrogen bonding to the B-chain segment B6-B8. The B-chain contains a longer helical segment, which spans residues B12-B22 and is oriented across the face of the two A-chain helices. The fold is cross-braced by the A10-A15, A11-B10, and A24-B22 disulfide bonds, which hold the peptide chains together around a hydrophobic core comprising the side chains of ProA6, CysA10, CysA15, LeuA20, LeuA23, CysA24, LeuB9, CysB10, PheB14, IleB15, AlaB17, LeuB18, and CysB22.
As evident from the overlay of the family of INSL3 structures, both N termini (A1-A4 and B1-B5) are disordered and appear to be flexible in solution, consistent with a lack of medium and long range NOEs as well as chemical shifts close to random coil values. The orientations of the C-terminal tails of both chains, which extend away from the helical segments, are also poorly defined. Interestingly, although in the B-chain C-terminal tail there are few NOEs associated with the peptide backbone and only small deviations from random coil chemical shifts, a number of NOEs can be seen between the TrpB27 side chain and LeuB18, ValB19, and the CysA24-CysB22 disulfide bond, suggesting that the tail can wrap around and interact with this part of the molecule. We recently reported such an interaction for H3 relaxin, but in that case the tail appeared to have a well defined conformation. In INSL3 these interactions appear to be transient, with the Trp and the rest of the tail being mobile in solution. Further evidence for this proposal is the presence of a minor conformation observed for TyrA26 whose aromatic side chain displays similar NOEs to LeuB18 and ValB19. This interaction would not be possible if the TrpB27 side chain permanently interacted with and thus prevented the interaction of LeuB18 and ValB19 with other residues.
Effects of B-chain Alanine Substitutions on LGR8 Binding and Activation-Alanine substituted INSL3 analogs were tested for their ability to compete with 125I-INSL3 binding to HEK-293T cells stably transfected with LGR8. Additionally, selected INSL3 analogs were tested for their ability to stimulate cAMP accumulation in LGR8-transfected cells. INSL3 bound to LGR8 with high affinity (pKi = 9.34 ± 0.02) and activated cAMP accumulation with high efficacy (pEC50 = 10.24 ± 0.12) (Table 2) in a similar manner to that previously described (12,15). INSL3 analogs with single Ala substitutions at HisB12, HisB13, or ArgB20 did not change LGR8 binding affinity. In contrast, analogs with single Ala substitutions at either ArgB16 (pKi = 8.49 ± 0.09; p < 0.001) or ValB19 (pKi = 8.36 ± 0.11; p < 0.001) did exhibit decreased binding affinity relative to INSL3 (Fig. 3 and Table 2). Interestingly, analogs with pairs of Ala substitutions (HisB12 plus ArgB16, pKi = 7.81 ± 0.04; ArgB16 plus ArgB20, pKi = 7.82 ± 0.14) resulted in a decreased affinity compared with the single ArgB16 Ala-substituted analog (p < 0.001). An analog with three Ala substitutions (HisB12, ArgB16, and ArgB20) resulted in a further decrease in affinity (pKi = 7.33 ± 0.07) (p < 0.001) compared with the double Ala-substituted analog (Fig. 3B and Table 2). Hence, although substitution of either HisB12 or ArgB20 alone for Ala does not affect LGR8 binding, it appears that they somehow cooperate with ArgB16 for optimal binding of INSL3 to LGR8. We also examined a B1-26 truncated INSL3 analog in our binding assays and found that it exhibited a dramatically reduced binding affinity (pKi = 6.86 ± 0.06) compared with INSL3. This result is consistent with previous data, which demonstrated that this truncation (leading to a loss of TrpB27) results in a decrease, but not complete loss, of binding affinity (13).
Importantly, the mutation of all the proposed binding residues to create AlaB12/16/19/20/27 INSL3 resulted in a peptide with no binding affinity for LGR8. Binding affinities were reflected in efficacies in stimulating cAMP accumulation from LGR8-expressing cells (Table 2). Hence, the single ArgB16 → Ala point-substituted analog demonstrated lower activity than INSL3 (pEC50 = 8.93 ± 0.15; p < 0.001 versus INSL3) and the triple (HisB12, ArgB16, and ArgB20) Ala-substituted analog resulted in a further decrease in activity (pEC50 = 8.06 ± 0.17; p < 0.001 versus INSL3; p < 0.05 versus ArgB16 → Ala) (Fig. 3C).
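Because pKi and pEC50 are negative base-10 logarithms, differences between them translate into fold changes in affinity or potency. The following Python sketch performs that conversion for the mean values quoted above; the dictionary labels and function name are illustrative, and the reported standard errors are not propagated.

```python
def fold_change(p_reference, p_analog):
    """Fold loss relative to the reference for values on a -log10 scale."""
    return 10.0 ** (p_reference - p_analog)


# Mean pKi values as quoted in the text (Table 2).
PKI = {
    "INSL3": 9.34,
    "ArgB16 Ala": 8.49,
    "ValB19 Ala": 8.36,
    "HisB12/ArgB16 Ala": 7.81,
    "ArgB16/ArgB20 Ala": 7.82,
    "HisB12/ArgB16/ArgB20 Ala": 7.33,
    "B1-26 truncated": 6.86,
}

folds = {name: fold_change(PKI["INSL3"], value) for name, value in PKI.items()}
for name, fold in folds.items():
    print(f"{name}: {fold:.1f}-fold lower affinity than INSL3")
```

On these numbers the single ArgB16 substitution corresponds to roughly a 7-fold loss of affinity, whereas the triple substitution corresponds to roughly a 100-fold loss.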

DISCUSSION
Given its importance in the control of testis descent and modulation of germ cell function, the precise molecular mech-
H3 and H2 Relaxin-Fig. 4 shows a comparison of the structures of INSL3, H3 relaxin (18), and H2 relaxin (23) from which it is clear that the peptides have a very similar core, with the key differences being mainly around the termini. In both H2 and H3 relaxin the A-chains are highly structured. In contrast, in INSL3 residues A1-4 are disordered, with the first structurally ordered residue being AsnA5, as evident from a number of medium range helical-type NOEs, in particular to ArgA8. In both H2 and H3 relaxins the A-chains finish with a C-terminal cysteine residue that, due to the covalent link to its disulfide bond partner, locks the chain into its helical conformation. Although the INSL3 A-chain has a similar helical conformation, it contains two additional C-terminal residues, ProA25 and TyrA26, which are structurally disordered in the main conformer of INSL3. However, there are also peaks in the spectra originating from a minor conformation of TyrA26. This conformation displays NOE contacts to a minor conformation of LeuB18, confirming that in some conformations TyrA26 interacts with the rest of the molecule. This interaction is also consistent with the methyl signals of the minor conformation of LeuB18 being shifted upfield by ring-current effects from TyrA26. It is possible that this conformation involves a cis peptide bond preceding ProA25, but this could not be confirmed due to resonance overlap.
The B-chain N terminus appears to be flexible and lacks a preferred orientation in both INSL3 and H3 relaxin. Of more interest is the conformation of the longer C-terminal tail. In the crystal structure of H2 relaxin the C-terminal helix extends one turn further than in H3 relaxin and INSL3. However, the tail following the helix, including the conserved Trp, is flexible and lacks significant electron density in the crystal structure. In contrast, in H3 relaxin we observed a large number of NOEs between the Trp side chain and resonances in the molecular core, including B18, B19, and the A24-B22 disulfide bond (18). These NOEs were enough to define a conformation in which the tail turns around and the Trp side chain packs against the core and forms favorable hydrophobic interactions. Interestingly, similar NOEs are observed for INSL3, although in this case the tail does not seem to have a well defined position, raising the possibility that the tails in all relaxins are flexible to differing degrees. Given the involvement of residues in the tail and the B-chain helix in receptor binding and biological activity, this flexibility may be important for the selectivity of the peptides for their respective G-protein-coupled receptors.
Aside from the flexible terminal regions, the presence of broad resonances for a number of other residues in INSL3 reflects internal dynamical processes in the molecular core. We recently reported similar broadening for certain resonances in H3 relaxin (18). Interestingly, the same residues are broadened in both peptides, although to a more severe degree in INSL3, with several amide protons being undetectable. The broadening appears to be mainly centered around the CysA10-CysA15 disulfide bond, making it likely that dynamic flipping of the disulfide bond conformation is the primary cause of the broadening. Evidence for dynamic processes in the form of resonance broadening and minor conformations such as is described here is not unusual in solution-state NMR and reflects the dynamic nature of protein structures.
Structure-Activity Relationships of INSL3-Recent studies have revealed that the INSL3 B-chain residues 11-27 are needed for binding to LGR8, with TrpB27, which we now know from the current study lies in the B-chain tail beyond the helix, playing a crucial role (13). However, the key residues in the binding of H2 relaxin to LGR7 are correlated to a RXXXRXXI motif within the B-chain helix (24). Thus it was likely that the binding of INSL3 to LGR8 also relied on additional residues in its B-chain helix. The residues at equivalent positions in INSL3 are HisB12, ArgB16, and ValB19, and thus these were considered primary candidates for further determinants of INSL3 bioactivity. In the solution structure presented here, the side chains of these residues are indeed presented on one side of the B-chain helix. Fig. 5 shows a comparison of the active sites of INSL3, H2 relaxin, and H3 relaxin and illustrates the projections of the different side chains. In addition to the residues equivalent to the relaxin tri-peptide binding motif, other residues in this region of INSL3 were found to have a significant degree of surface exposure, including HisB13 and ArgB20, and these therefore might also be in a position to influence receptor binding. Based on this observation our alanine substitution strategy also involved these residues. Strikingly, of the single Ala substituted analogs, only ArgB16 → Ala and ValB19 → Ala showed significantly reduced affinity for LGR8. However, in multi-Ala-substituted analogs, the absence of either HisB12 or ArgB20 had a strong synergistic effect in combination with ArgB16. This suggests that HisB12 and ArgB20 may form additional interactions with the receptor or at the very least assist in the initial recognition step that is most likely governed by long-range electrostatic interactions between the basic residues of the hormone and acidic residues on the receptor.
H3 (B), and H2 (C) relaxin. The positioning of the side chains of the residues known to be crucial for activity in INSL3 and H2 relaxin are shown in stick representation. It is important to note that, although the projections of these side chains can be predicted from the structures, their exact conformations when bound to the receptor are not known. In the NMR structures these residues are free in solution and disordered while in the crystal structure of H2 relaxin both ArgB12 (R12) and ArgB16 (R16) are involved in salt bridge interactions across the crystallographic dimer interface and thus likely adopt a conformation different from the active monomer.
For completeness, an analog of INSL3, lacking the C-terminal tail, including TrpB27, that has decreased activity (13) was synthesized and analyzed. This (des-B27-31)-INSL3 analog demonstrated a similarly reduced affinity for LGR8 as the HisB12-, ArgB16-, and ArgB20-substituted INSL3 peptide. Importantly, the combined mutation of TrpB27 together with HisB12, ArgB16, ValB19, and ArgB20 to Ala resulted in a peptide with no binding affinity for LGR8, highlighting the fact that all of these residues are required for high affinity binding. All of the Ala-substituted analogs were analyzed by circular dichroism spectroscopy and found to possess either insignificant or very small secondary structural changes relative to the native human INSL3. NMR analysis of a representative set of Ala mutants, including those with one, two, three, and five substitutions, showed that they uniformly had well dispersed spectra (supplementary data) consistent with well folded conformations. Thus the substitutions did not introduce perturbations to the overall structure that might have altered the activity through secondary conformational effects.
Interestingly, the positioning of TrpB27 in the current structure remains uncertain. Most of the B23-B31 tail of INSL3 has chemical shifts differing very little from random coil. Furthermore, apart from the TrpB27 NOEs to LeuB18 and ValB19, no non-sequential NOEs are seen in this region, strongly suggesting a dynamic tail without a preferred orientation. It is therefore likely that the tail does not adopt its active conformation until it binds to the receptor. Fig. 6 shows the INSL3 receptor binding surface both with the tail in the conformation suggested by the NOEs to the INSL3 core and with the tail removed. In the latter case a continuous primary binding interface is formed in which HisB12, ArgB16, ValB19, and ArgB20 are all fully exposed. As a substantial part of this surface is covered by the tail, as shown in Fig. 5A, it is likely that the tail moves away from this position upon binding and interacts with the receptor at a position that may be either distant from or within close proximity to the B-chain surface, depending on whether the tail extends out or coils up. In a recent report, INSL3 analogs with chemical restraints introduced between the A-chain C terminus and position B26 were synthesized to keep TrpB27 in a fixed position relative to the terminal region (25). Analogs with linkers that were too short had significantly reduced activity, whereas those analogs where the linker maintained the distance between the α-carbons of A26 and B26 to 10-11 Å were highly active (25). This strongly supports the concept of a more extended conformation in the B-chain tail in which the TrpB27 interacts with LGR8 in a region separate to that of the B-chain helix binding site. It is interesting to note that, even though INSL3 has similar residues to those of the H2 relaxin receptor binding tripeptide motif (HisB12, ArgB16, and ValB19), INSL3 has a very poor affinity for the relaxin receptor LGR7. A possible reason for this is the presence of an Arg (in INSL3) instead of an Ala (in H2 relaxin) at position B20. Although we have shown that the Arg can assist in coordinating the interaction with LGR8, its positive charge and considerable bulkiness relative to Ala may be unfavorable for interacting with the H2 relaxin receptor LGR7.
FIGURE 6. The receptor binding surface of human INSL3. A, surface of the lowest energy structure of INSL3 with the residues determining binding activity indicated. As seen here parts of the important residues HisB12 (pink), ArgB16 (red), ValB19 (red), and ArgB20 (pink) are covered by the C-terminal tail, including the crucial TrpB27 (green). B, view of the same surface of the B-chain helix with the C-terminal B23-B31 residues removed. This picture represents a possible binding surface if in the active conformation TrpB27 moves away from the molecule and interacts with the receptor at a point separate from the B-chain helix binding site.
The flexibility of the N-terminal A-chain helix of INSL3 is interesting in the light of a recent result showing that, although a truncated analog lacking the first six amino acids retains full activity, a second analog in which residues A1-A8 have been removed is a potent antagonist of INSL3 at LGR8 (26). These results indicate that, although the regions important for the primary binding to the receptor are located in the B-chain, a secondary interaction involving the A-chain is needed to activate the receptor signal. Here we have shown that residues A1-A4 are unstructured in solution, consistent with their lack of functional roles, whereas residues A5-A11 adopt the helical conformation that is typical for the relaxin/insulin-like fold. The dynamic character of this helix is crucial for the function of insulin (27) and may also be needed for INSL3 activity.
Based on the results presented here a mechanism for LGR8 activation by INSL3 may be proposed. This model involves an initial, primarily electrostatic interaction involving residues HisB12, ArgB16, and ArgB20, followed by a repositioning of the B-chain tail and the hormone "locking in" to the correct position by anchoring the crucial TrpB27 residue. A second recognition event would follow in which parts of the A-chain interact with a second site on LGR8 to induce a conformational change allowing the generation of signaling events.
In summary, we have used NMR techniques to solve the structure of human INSL3 in solution, and through analysis of a series of synthetic Ala-substituted analogs, used the structure to help propose a molecular mechanism for the binding of INSL3 to its receptor, LGR8. A thorough understanding of the features controlling the biological activities of relaxin family peptides will be of importance for the future design of antagonists/agonists for use as pharmacological probes as well as for pharmaceutical applications.