Contribution of TyrB26 to the Function and Stability of Insulin

Crystallographic studies of insulin bound to receptor domains have defined the primary hormone-receptor interface. We investigated the role of TyrB26, a conserved aromatic residue at this interface. To probe the evolutionary basis for such conservation, we constructed 18 variants at B26. Surprisingly, non-aromatic polar or charged side chains (such as Glu, Ser, or ornithine (Orn)) conferred high activity, whereas the weakest-binding analogs contained Val, Ile, and Leu substitutions. Modeling of variant complexes suggested that the B26 side chains pack within a shallow depression at the solvent-exposed periphery of the interface. This interface would disfavor large aliphatic side chains. The analogs with highest activity exhibited reduced thermodynamic stability and heightened susceptibility to fibrillation. Perturbed self-assembly was also demonstrated in studies of the charged variants (Orn and Glu); indeed, the GluB26 analog exhibited aberrant aggregation in either the presence or absence of zinc ions. Thus, although TyrB26 is part of insulin's receptor-binding surface, our results suggest that its conservation has been enjoined by the aromatic ring's contributions to native stability and self-assembly. We envisage that such classical structural relationships reflect the implicit threat of toxic misfolding (rather than hormonal function at the receptor level) as a general evolutionary determinant of extant protein sequences.

(FnIII-1, FnIII-2, FnIII-3). An insert domain within FnIII-2, including the insulin-binding element (αCT) within the C-terminal region of the α subunit, was incompletely traced in the electron-density map. The overall conformation of the (αβΔ)2 ectodomain resembles an inverted V in which the presumed high-affinity site of insulin binding lies within the crux of the dimer (12). An improved crystallographic model of the ectodomain has recently been described (12).
A recent structural advance exploited domain-minimized models of the α subunit containing the primary insulin-binding elements L1 and αCT. Notably, a co-crystal structure has been determined at 3.5 Å resolution of a ternary complex between insulin, an L1-CR fragment, and a synthetic αCT peptide spanning residues 704-719 of IR-A (4). In this structure (designated the micro-receptor (μIR) complex), the C-terminal segment of the insulin B chain is detached from the hormone's α-helical core; such detachment is incompatible with classical structures of insulin (Fig. 1C) but enables insertion of this segment (including Tyr B26; red and asterisk in Fig. 2A) between the conserved surfaces of L1 and the αCT peptide (Fig. 2, B and C). This mode of binding, long anticipated based on studies of anomalous insulin analogs (13,14) and residue-specific photo-cross-linking (15), has defined the binding surfaces in the IR for insulin's conserved triplet of aromatic residues, Phe B24, Phe B25, and Tyr B26 (Fig. 2A). Surprisingly, whereas the side chain of Phe B24 inserts within a classical nonpolar pocket (anchoring the displaced B-chain β-strand; Ref. 16), the B25 and B26 side chains appear less closely packed (Fig. 2, B-D). The latter side chains, despite their broad conservation (17) and efficient IR photo-cross-linking as photo-activatable derivatives (15,18,19), appear to contact the IR surface only loosely.
Our interest in Tyr B26 was motivated by its broad conservation among vertebrate insulins (17) and insulin-like growth factors (IGF-I and IGF-II) (20,21). Classical structure-activity relationships at B26 are complex, presumably because of a subtle interplay between direct effects of modifications at the hormone-receptor interface and indirect effects of conformational changes at this interface. Whereas substitution of Tyr B26 by Phe impairs IR affinity by 2-fold (22), a finding consistent with a specific (although modest) contribution by the para-OH group, studies of an Ala substitution have given rise to conflicting results (23-25). Although Tyr B26 is apparently dispensable in a truncated yet active insulin analog lacking residues B26-B30 (despentapeptide[B26-B30]-insulin-amide; Refs. 26 and 27), mutations or non-standard modifications at B26 in full-length or truncated insulin analogs may impair or enhance activity (28,29). Thus, no coherent pharmacophore has been obtained.
FIGURE 1. A, insulin forms zinc-free dimers via anti-parallel association of B-chain α-helices and C-terminal β-strands (brown); two zinc ions then mediate assembly of three dimers to form a classical hexamer (T6). The A chain is shown in yellow (ribbon representation), and the B chain is in beige (B1-B19) or brown (B20-B30). The conserved aromatic residues of Phe B24 and Phe B25 are shown as black sticks, whereas Tyr B26 is red. B, inverted V-shaped assembly of IR ectodomain homodimer. One monomer is in ribbon representation (labeled); the second is in surface representation. Domains are labeled as follows: L1, first Leu-rich repeat domain; CR, Cys-rich domain; L2, second Leu-rich repeat domain; αCT, α-chain C-terminal segment. C, model of WT insulin in its receptor-free conformation overlaid onto the structure of the insulin-bound μIR (4). L1 and part of CR are shown in powder blue; αCT is shown in purple. Residues Phe B24, Phe B25, and Tyr B26 are as in panel A. The B chain of μIR-bound insulin is shown in black (B6-B19); the brown tube indicates the classical location within the overlay of residues B20-B30 of insulin in its receptor-free conformation, highlighting the steric clash of B26-B30 with αCT. Coordinates were obtained from PDB entries 4INS, 4ZXB, and 3W11.
To clarify the contribution of Tyr B26 to receptor binding, we undertook a systematic mutational survey based on the semisynthetic preparation of 18 insulin analogs. In accordance with the peripheral B26-related interface in the IR complex (4), this survey revealed that high IR affinity can be conferred by nonaromatic polar or charged side chains (either acidic or basic).
Although well tolerated at the receptor interface, these substitutions were observed to impair the hormone's thermodynamic stability and self-assembly and to heighten its susceptibility to fibrillation. Our results thus suggest that, among the 20 natural amino acids, the wild type Tyr B26 best meets the simultaneous challenges of biological activity, protein stability, self-assembly, and protection from toxic misfolding (30). Such co-optimization highlights the multidimensional biophysical and biological landscape of protein evolution.

Experimental Procedures
FIGURE 2. Insulin sequence and μIR complex. A, sequence of WT insulin and sites of modification. A and B chains are shown in white and gray, respectively. Conserved aromatic residues Phe B24 and Phe B25 are highlighted as black circles. The present study focused on substitutions of Tyr B26 (red circle); additional substitutions were made at position B29 (Orn; encircled X) to facilitate semi-synthesis. B, stick representation of residues B20-B27 (carbon atoms (green), nitrogen atoms (blue), and oxygen atoms (red)) packed between αCT and the L1-β2 strand. B-chain residues B8-B19 are shown as a black ribbon, and the A chain is shown as a yellow ribbon; residues A1-A3 are concealed behind the surface of αCT. Key contact surfaces of αCT with B24-B26 are highlighted in magenta and those of L1 with B24-B26 are highlighted in cyan; L1 and αCT surfaces not in interaction with B24-B26 are shown in lighter shades. Insertion of the B20-B27 segment between L1 and αCT is associated with a small rotation of the B20-B23 β-turn and changes in main-chain dihedral angles flanking Phe B24 (4). C, orthogonal view to B, showing interaction of the side chain of Phe B24 with the nonpolar surface of the L1-β2 sheet. Tyr B26 is hidden below the surface of αCT. Engagement of conserved residues A1-A3 against the nonpolar surface of αCT is shown at the top. D, environment of Tyr B26 within Site 1 complex (stereo). Neighboring side chains in L1 and αCT are as labeled. Coordinates were obtained from PDB entry 4OGA.
Preparation of Insulin Analogs-Analogs were made by trypsin-catalyzed semi-synthesis using an insulin fragment, des-octapeptide[B23-B30]-insulin, and modified octapeptides as described (23). The des-octapeptide[B23-B30]-insulin was generated via cleavage of human insulin with trypsin and purified by reverse-phase high performance liquid chromatography (HPLC); octapeptides were synthesized by solid-phase synthesis (31).
The formation of a peptide bond between Arg B22 and a synthetic octapeptide was mediated by trypsin (in a mixed solvent system containing 1,4-butanediol and dimethylacetamide) as previously described (32). Insulin analogs were purified by preparative reverse-phase C4 HPLC (Higgins Analytical Inc., Proto 300 C4 10 μm, 250 × 20 mm), and their purity was assessed by analytical reverse-phase C4 HPLC (Higgins Analytical Inc., Proto 300 C4 5 μm, 250 × 4.6 mm). Molecular masses of purified analogs were verified using an Applied Biosystems 4700 proteomics analyzer (matrix-assisted laser-desorption/ionization time-of-flight mass spectrometry; MALDI-TOF MS).
Receptor-binding Assays-Affinities for IR-B were measured by a competitive-displacement scintillation proximity assay. This assay employed detergent-solubilized holo receptor with C-terminal streptavidin-binding protein tags purified by sequential wheat germ agglutinin and Strep-Tactin-affinity chromatography from detergent lysates of polyclonal stably transfected 293PEAK cell lines expressing each receptor. A dilution series of human insulin (a generous gift from Novo-Nordisk A/S, Bagsvaerd, Denmark) or analog (11 dilutions, 5-fold each with a maximum initial concentration of 2 μM) in 100 μl of binding buffer (100 mM HEPES (pH 7.8), 100 mM NaCl, 10 mM MgSO4, 0.025% (v/v) Tween 20, and 0.5% (w/v) bovine serum albumin) was made in a 96-well plate (Costar). The assay was initiated by the addition to the wells of a premixed solution containing (i) wheat germ agglutinin scintillation proximity assay (SPA) beads (PerkinElmer Life Sciences), (ii) solubilized receptor, and (iii) 125I-Tyr A14-insulin in binding buffer. The final concentration of 125I-labeled ligand was 7.5 pM, and the amount of receptor added was adjusted so that the extent of labeled ligand binding in the absence of competitor was <15% of the total added counts in order to avoid ligand-depletion artifacts. Plates were incubated with gentle shaking for 24 h at room temperature, centrifuged, and counted for 5 min/well in a 12-detector Trilux scintillation counter (PerkinElmer Life Sciences/Wallac). To obtain analog dissociation constants, competitive binding data were analyzed by non-linear regression by the method of Wang (33), a model that provides an analytical solution for the binding of two ligands to a single receptor.
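The exact two-ligand competition model named above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it solves the mass-balance cubic for free receptor in closed (trigonometric) form and returns the receptor-bound tracer. The function name, argument names, and the example concentrations in the usage note are assumptions introduced here; all concentrations must share one unit (e.g. nM).

```python
import math

def wang_bound_tracer(p_total, a_total, b_total, k_a, k_b):
    """Exact equilibrium for two ligands competing for one receptor site.

    p_total: total receptor; a_total: total labeled tracer;
    b_total: total unlabeled competitor; k_a, k_b: their dissociation constants.
    Returns (free receptor, receptor-bound tracer).
    """
    # Coefficients of the cubic P^3 + a*P^2 + b*P + c = 0 for free receptor P,
    # obtained from mass balance on receptor, tracer, and competitor.
    a = k_a + k_b + a_total + b_total - p_total
    b = k_b * (a_total - p_total) + k_a * (b_total - p_total) + k_a * k_b
    c = -k_a * k_b * p_total

    root = math.sqrt(a * a - 3.0 * b)
    cos_theta = (-2.0 * a**3 + 9.0 * a * b - 27.0 * c) / (2.0 * root**3)
    # Clamp against floating-point overshoot before taking the arccosine.
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))

    # Largest real root, which is the only positive (physical) one.
    p_free = (2.0 * root * math.cos(theta / 3.0) - a) / 3.0
    bound_tracer = p_free * a_total / (k_a + p_free)
    return p_free, bound_tracer
```

In a regression, `k_b` would be the fitted parameter and `wang_bound_tracer(...)[1]` the predicted signal at each competitor concentration. For example, `wang_bound_tracer(0.005, 0.0075, 0.04, 0.04, 0.04)` uses the stated 7.5 pM tracer together with a hypothetical receptor concentration of 5 pM. The closed form can lose precision when the competitor is in very large excess, so a numerical root finder may be preferable at the top of a dilution series.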
Receptor-binding Screening Protocol-The ability of insulin analogs to displace bound 125I-Tyr A14-insulin from antibody-immobilized wheat germ agglutinin-purified receptor was tested at an analog concentration of 0.75 nM. This concentration corresponded to displacement of 95% of receptor-bound 125I-Tyr A14-insulin by the control analog, Orn B29-insulin. The fraction of 125I-Tyr A14-insulin displaced by a given analog permitted assignment to the following three categories: <60% (low affinity), 60-80% (intermediate affinity), or >80% (high affinity).
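The three-way assignment described above amounts to a simple threshold rule, sketched here for illustration. The function name is hypothetical, and the handling of values exactly at 60% or 80% (placed in the intermediate class) is an assumption, as the text does not specify boundary behavior.

```python
def affinity_class(percent_displaced):
    """Assign a screening class from percent tracer displaced at 0.75 nM analog."""
    if percent_displaced > 80.0:
        return "high"
    if percent_displaced >= 60.0:
        return "intermediate"
    return "low"
```

For example, an analog matching the 95% displacement of the control would be classed as high affinity.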
Insulin Self-assembly-Oligomeric states of the insulin analogs were monitored by size-exclusion chromatography (SEC) using HPLC. The exclusion void volume (Vo) was established with apoferritin (443 kDa). Insulin analogs were made 0.6 mM in a buffer consisting of 25 mM Tris-HCl (pH 7.4), 0.65 mg/ml phenol, 1.6 mg/ml meta-cresol, 16 mg/ml glycerol, and ZnCl2 at a ratio of 2 zinc ions per insulin hexamer. For zinc-free conditions, analogs were also made 0.6 mM in phosphate-buffered saline (pH 7.4). Protein samples (volume 10 μl) were applied through a Waters 717 autosampler onto a Zenix-C SEC-150 column (Sepax Technologies, Southborough, MA) with a nominal fractionation range of 0.5-150 kDa. Proteins were fractionated at a flow rate of 1 ml/min using a Waters Binary HPLC system. Protein elution was monitored at 215 and 280 nm using a dual-lambda Waters 2487 absorbance detector. The mobile phase consisted of 10 mM Tris-HCl (pH 7.4) and 140 mM NaCl with or without 0.3 mM ZnCl2 and 50 mM cyclohexanol; the latter provided a non-aromatic replacement for the phenolic compounds often employed as R6 hexamer-stabilizing and anti-microbial agents in pharmaceutical formulations (34-36). Data acquisition and processing utilized Waters HPLC Empower software. The column was calibrated for apparent molecular mass determination by fractionating standard proteins individually on the column.
TR Transition and R6 Co²⁺ Kinetic Assay-Visible absorption spectroscopy was used to probe the formation and disassembly of phenol-stabilized R6 Co²⁺-substituted insulin hexamers. Insulin analogs were made 0.6 mM in a buffer containing 50 mM Tris-HCl (pH 7.4), 50 mM phenol, 0.2 mM CoCl2, and 2 mM NaSCN. Sample pH was readjusted in each case to 7.4, and samples were incubated overnight at room temperature before the studies to ensure that a conformational equilibrium was reached. Spectra (400-750 nm) were obtained to monitor tetrahedral Co²⁺ coordination with its signature peak absorption band at 574 nm (37). To assess variation in the amplitude and λmax of the d-d transition band, the wild-type (WT) spectrum was obtained in four separate preparations containing three replicates each from solutions that were independently prepared; respective S.D. were ±3.7% and 0.5 nm. To determine the rate of Co²⁺ release from the hexamers, metal ion sequestration was initiated at 25°C by the addition of an aliquot of EDTA (50 mM at pH 7.4) to a final concentration of 2 mM; this yielded a molar ratio of EDTA per formal insulin trimer of 6.7. Attenuation of the 574-nm absorption band was monitored on a timescale of seconds to hours. Post-dissociation absorption spectra (400-750 nm) were observed to confirm complete attenuation of the 574-nm absorption band (38). Kinetics data were fit to mono-exponential decay functions to determine dissociation rate and half-life of Co²⁺-coordinated R6 hexamers. Independent replicates of the EDTA assay of WT insulin (using freshly prepared stock solutions in each case) yielded a S.D. of 15% (±69 s) in the lifetime.
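As an illustration of how a lifetime and half-life follow from a mono-exponential decay, the sketch below fits A(t) = A0 exp(-t/τ) to absorbance readings. It is not the authors' fitting procedure: it substitutes a log-linear least-squares fit for a nonlinear one and assumes the 574-nm band decays to a zero baseline. Function and variable names are hypothetical.

```python
import math

def fit_monoexponential(times_s, absorbances):
    """Log-linear least-squares fit of A(t) = A0 * exp(-t / tau).

    Returns (A0, tau, half_life), with tau and half_life in the units of times_s.
    Assumes all absorbances are positive and decay toward zero.
    """
    ys = [math.log(a) for a in absorbances]
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_s, ys))
    slope = sxy / sxx                    # slope of ln A versus t equals -1/tau
    tau = -1.0 / slope
    a0 = math.exp(mean_y - slope * mean_t)
    return a0, tau, tau * math.log(2.0)  # half-life = tau * ln 2
```

The lifetime τ and the half-life differ by the factor ln 2, so it matters which of the two a reported value refers to.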
Circular Dichroism-Far-ultraviolet (UV) CD spectra were obtained using an AVIV spectropolarimeter equipped with an automated syringe-driven titration unit. Wild type insulin or insulin analogs were made 50 μM in 10 mM potassium phosphate (pH 7.4) and 50 mM KCl. Spectra were obtained from 190-250 nm as described (39). Thermodynamic stabilities were probed by guanidine hydrochloride-induced denaturation monitored by CD at the helix-sensitive wavelength of 222 nm. Data were fit by nonlinear least squares to a two-state model (40),
$$\theta(x) = \frac{\theta_A + \theta_B\, e^{-(\Delta G_u - m x)/RT}}{1 + e^{-(\Delta G_u - m x)/RT}}$$
where x is the concentration of guanidine hydrochloride, ΔGu is the free energy of unfolding extrapolated to zero denaturant, m is the slope relating the free energy of unfolding to denaturant concentration, and θA and θB represent respective estimates of the baseline ellipticities of the protein in its native and unfolded states as extrapolated to a guanidine concentration of 0 M. Baseline values were approximated via pre- and post-transition lines represented by equations of the form θA(x) = θA(0) + mA x and θB(x) = θB(0) + mB x. Such simultaneous fitting avoids artifacts of linear plots of ΔG versus concentration of denaturant (41).
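The two-state model can be made concrete with a short sketch that evaluates the predicted ellipticity at a given denaturant concentration and the transition midpoint (Cmid = ΔGu/m). For brevity the baselines are treated as constants rather than sloped lines; this simplification, together with the function names and the example m-value in the usage note, is an assumption made for illustration only.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1

def two_state_ellipticity(x, dg_u, m, theta_a, theta_b, temp_k=298.15):
    """Ellipticity at denaturant concentration x (M) under a two-state model.

    dg_u: free energy of unfolding at 0 M denaturant (kcal/mol)
    m: dependence of the unfolding free energy on denaturant (kcal/mol/M)
    theta_a, theta_b: native and unfolded baselines (taken as constant here)
    """
    k_unfold = math.exp(-(dg_u - m * x) / (R_KCAL * temp_k))
    return (theta_a + theta_b * k_unfold) / (1.0 + k_unfold)

def c_mid(dg_u, m):
    """Denaturant concentration at which half of the protein is unfolded."""
    return dg_u / m
```

With a hypothetical m of 0.7 kcal/mol/M, `two_state_ellipticity(c_mid(3.2, 0.7), 3.2, 0.7, -10.0, -2.0)` returns the mean of the two baselines, as expected at the midpoint of the transition.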
Assessment of Fibril Formation-Insulin or insulin analogs were made 60 μM in phosphate-buffered saline (pH 7.4) containing 0.1% sodium azide and gently rocked at 37°C in glass vials in the presence of a liquid/air interface. Aliquots were taken at regular intervals and frozen for later analysis of thioflavin T fluorescence. The assay was terminated on visual appearance of cloudiness (42). Statistical significance of differences in lag times among analogs was evaluated using Student's t test.

Receptor-binding Studies Defined Three Classes of Analogs-Eighteen insulin analogs containing substitutions at B26 were prepared at small scale (Table 1). To eliminate the B29 tryptic site (and so facilitate semi-synthesis; Ref. 43), the analogs each contained Orn B29 (in place of Lys B29). Similarly, to provide a basic side chain at B26, an analog was prepared containing both substitutions Orn B26 and Orn B29. Cys B26-Orn B29-insulin was not prepared to avoid possible disulfide interchange and/or formation of covalent dimers.
A coarse receptor-binding assay (using IR-B) was first undertaken that enabled subgroups of the insulin analogs to be distinguished based on displacement of pre-bound 125I-labeled insulin at a uniform analog concentration of 0.75 nM (Fig. 3A). At this concentration WT insulin displaced 95% of the pre-bound tracer (125I-Tyr A14-insulin; see "Experimental Procedures"). The results defined three classes (Fig. 3B): (i) high affinity (tracer displacement >80%; i.e. similar to or greater than that observed on binding of WT insulin); (ii) intermediate affinity (tracer displacement 60-80%); and (iii) low affinity (tracer displacement <60%). The low affinity class contained two aliphatic residues (Ile and Leu). The intermediate affinity class comprised a diverse set of residues, including Phe, Met, Pro, Thr, and Val; the remaining analogs (representing 10 of the 18 analogs tested) were placed in the high-affinity group.
Definitive IR-B-binding assays were then undertaken of selected high, intermediate, and low affinity analogs (Table 1) using a scintillation proximity assay performed with purified detergent-solubilized receptor isoform expressed in the same cell line. Affinities greater than WT insulin were conferred only by Ser B26 and Glu B26 (Kd ~0.02 nM), whereas the affinities conferred by Tyr B26, Ala B26, and Orn B26 were indistinguishable (Kd ~0.04 nM). The high affinities of Ala B26 and Glu B26 insulin analogs have previously been reported (3,25). Also in accordance with past studies (22), substitution of Tyr B26 by Phe reduced affinity (between 2- and 3-fold). Whereas a similar reduction was conferred by Val B26, substitution of Tyr B26 by Ile or Leu led to more severe impairments (by ~10-fold and 30-fold, respectively).
TABLE 1 (footnotes). d, H represents the high affinity group in the initial coarse screening (Fig. 3B). e, Not determined as the Cys B26 analog was not prepared. f, Not determined as substitution of Tyr B26 by Lys or Arg would have complicated semi-synthesis (see "Experimental Procedures"). g, Orn B26 provided a model of a basic side chain.
Two sets of observations were particularly striking: (a) the high affinity of analogs containing charged side chains (of either sign) or a short polar side chain (Ser) at B26 and (b) the functional incompatibility of aliphatic substitutions larger than Ala at B26. Although this pattern would in general be unexpected on systematic mutagenesis of a conserved aromatic residue in a globular protein (and indeed stands in contrast to results of substitutions at insulin positions Tyr B16 and Phe B24; Refs. 13,16,22), the B26-contacting surface within the IR complex (4) comprises a shallow solvent-exposed depression at the αCT/L1 junction.
Although its structural analysis was limited by low resolution (3.5 Å), this surface contains multiple potential sites for hydrogen bonding and favorable electrostatic interactions, potentially extended by a network of bound water molecules. We speculate that these features of the B26-related surface underlie the enhanced affinities of Ser B26 and Glu B26 insulin analogs and general exclusion of aliphatic substitutions.

SEC Studies Provided Evidence of Decreased or Aberrant Self-assembly-Competence of the insulin analogs for zinc-free or zinc-stabilized self-assembly was assessed using a calibrated SEC method (Fig. 4). In the absence of zinc ions, WT insulin was predominantly dimeric as expected at a protein concentration of 0.6 mM under these conditions; the parent Orn B29-insulin analog exhibited similar elution behavior with a slight decrease in extent of dimerization (Fig. 4, A-C). A control for a monomeric insulin analog was provided by insulin lispro (KP-insulin; Pro B28→Lys and Lys B29→Pro) (38,44). The elution time of the Orn B26 analog is similar to that of KP-insulin. The Glu B26 analog exhibits two modes of self-association: ~40% as dimeric (elution time similar to that of the parent analog) and 60% as a higher molecular mass aggregate (asterisk in Fig. 4A). Although the aberrant elution peak is broad, the mean apparent mass is ~80 kDa, corresponding to 12-14 monomeric units. Perturbed overall dimerization of Glu B26-insulin is in accordance with a previous study by equilibrium ultracentrifugation (3).
SEC studies were extended to conditions of WT R6 assembly through the addition of zinc ions and phenol (Fig. 4, D-F). To facilitate UV detection of the eluted proteins, cyclohexanol (which is transparent at 215 and 280 nm) was used instead of phenol in the running buffer (see "Experimental Procedures"). Under these conditions WT insulin and the parent Orn B29-insulin each exhibited an apparent mass slightly larger than a hexamer (43 and 37 kDa, respectively), whereas KP-insulin exhibited an apparent mass of one hexamer (formal molecular mass 35 kDa inclusive of two zinc ions; Fig. 4F). The Orn B26 analog was predominantly hexameric with a slight reduction in apparent mass (33 kDa), whereas the Glu B26 analog predominantly exhibited a broad distribution of lower molecular mass entities (asterisk in Fig. 4D). A small percentage of the Glu B26 analog (ca. 5%) eluted as a small bump (41 kDa) on the leading edge of the diffuse profile.
Visible Absorption Co²⁺ Spectroscopy Provided Evidence of Altered R6 Assembly with Accelerated Disassembly-The structure and stability of the phenol-stabilized R6 insulin hexamer may readily be probed by visible absorption spectroscopy on substitution of the zinc ions by Co²⁺ (37). This spectrum exhibits a prominent absorption band due to the tetrahedral coordination of Co²⁺ in each R3 trimer of the R6 hexamer (or in the R3f trimer of a T3R3f hexamer); no such band exists in the spectrum of an octahedral Co²⁺ complex as in the T3 trimer of an insulin hexamer (or in an EDTA complex) (38). The magnitude of the d-d band in KP-insulin (gray line in Fig. 5A) was slightly attenuated relative to the WT hexamer (black line), presumed to represent a small shift in the conformational equilibrium from R6 to T3R3f (37,38,45). The spectrum of the parent Orn B29-insulin (red line in Fig. 5A) was similar to that of WT insulin, whereas the peak signal of the Orn B26 analog (violet) was attenuated by 7 (±1)%. The spectrum of the Glu B26 analog (turquoise) was attenuated by 39 (±2)%, suggesting that the substitution partially impairs the TR transition.
The rate of disassembly of the insulin analog hexamers may likewise be probed through EDTA sequestration of Co²⁺. This assay exploits transient release of the metal ion when insulin hexamers dissociate and re-assemble within their conformational equilibrium. Because the affinity of EDTA for Co²⁺ is >10⁸-fold greater than that of insulin, such transient release results in essentially irreversible sequestration of the metal ion in a colorless complex (38). Whereas the lifetime of the parent hexamer was similar to that of WT insulin (red and black lines in Fig. 5B), the lifetime of the Glu B26 hexamer (turquoise) was reduced by 2.6 (±0.2)-fold. Kinetic studies of the Orn B26 analog were limited by its progressive precipitation on the addition of EDTA. Data collected until the appearance of a visible precipitate displayed an even shorter lifetime (reduced by 15 (±3)-fold), but interpretation of this value is unclear given the competing aggregation of unknown structure.
CD Studies of High Affinity Analogs Provided Evidence of Native-like Structure with Decreased Dimerization-Although well tolerated in relation to receptor binding, substitution of Tyr B26 by Orn, Glu, or Ser was in each case associated with an altered far-UV CD spectrum at a protein concentration of 50 μM (Fig. 6A). At this concentration WT insulin and Orn B29-insulin are partially dimeric (46) as assessed by size exclusion chromatography (data not shown). The variant spectra exhibited attenuated ellipticity at 222 nm and deepening of the minimum near 208 nm, spectroscopic features associated with partial loss or dynamic destabilization of α-helices (47). In accordance with the above SEC findings, we ascribe these CD changes to decreased dimerization; analogous CD changes were previously found to accompany dilution of WT insulin in the range 5-100 μM (48), presumably due to enhanced flexibility of the monomer (49-51).
FIGURE 5. Visible absorption spectra of cobalt-stabilized hexamers and kinetics of metal ion release. A, Co²⁺ d-d bands of Orn B29-insulin (red), Glu B26-Orn B29-insulin (turquoise), and Orn B26-Orn B29-insulin (violet) near 550 nm provide a signature of the R (or Rf) hexameric state. Amplitudes of both Orn B26 and Glu B26 variants were attenuated in relation to Orn B29-insulin. Control spectra were provided by WT insulin (black) and KP-insulin (gray). Whereas attenuation of amplitude of the 550-nm band of the Orn B26 variant may be explained by the decreased hexamer formation of the analog, the marked differences in the Glu B26 spectrum may be the result of nonspecific aggregates of the analog forming in aqueous solution, as suggested by gel filtration experiments. Au, absorption units. B, sequestration of divalent cobalt ions from insulin analogs by EDTA; Orn B29-insulin (red), Glu B26-Orn B29-insulin (turquoise), and Orn B26-Orn B29-insulin (violet) are shown in relation to WT insulin (black) and KP-insulin (gray). Orn B26-Orn B29-insulin formed aggregates that precipitated from solution at intermediate stages of hexamer dissociation (marked by black asterisk); the initial region of the curve was fitted to a monoexponential equation.
FIGURE 6. Studies of structure, stability, and fibrillation. A, far-UV CD spectra of Orn B29-insulin (red), Glu B26-Orn B29-insulin (cyan), Orn B26-Orn B29-insulin (magenta), and Ser B26-Orn B29-insulin (dark blue) relative to WT insulin (open circles; black) at pH 7.4 and 25°C. Ellipticity was normalized per residue. B, corresponding guanidine-unfolding transitions as monitored at 222 nm. Thermodynamic stabilities were derived using a two-state model (see Table 2). Color code is as in A. C, histogram of ΔGu values in kcal mol⁻¹. Marked changes in stability were evident depending on the identity of the substitution. D, dot plot of lag time (days) to fibril formation of insulin analogs. Onset of fibrillation was defined by a 2-fold enhancement of thioflavin T fluorescence; see the section entitled "Functional Substitutions Led To Accelerated Fibrillation" for statistical analysis and p values.
Functional Substitutions Impair Thermodynamic Stability-Substitutions Orn B26, Glu B26, and Ser B26 impaired global stability. Estimates of respective free energies of unfolding (ΔGu) were obtained at 25°C based on CD studies of fractional unfolding on chemical denaturation (Fig. 6B and histogram in Fig. 6C). A trend was observed wherein lower concentrations of denaturant (guanidine hydrochloride) were required for 50% unfolding of analogs containing the above substitutions (Fig. 6B and Table 2, column heading Cmid). Application of a two-state model (native and unfolded; Ref. 40) yielded ΔGu values that were in each case at least 0.5 kcal mol⁻¹ lower than that of Orn B29-insulin (baseline stability 3.2 (±0.1) kcal mol⁻¹). Respective decrements in stability (ΔΔGu) for the Orn B26, Glu B26, and Ser B26 analogs were 0.7 (±0.2), 0.9 (±0.2), and 0.9 (±0.2) kcal mol⁻¹. Such perturbations were thus similar to that reported in studies of an insulin analog containing substitution of Phe B24 by Ala (ΔΔGu 0.8 (±0.2) kcal mol⁻¹; Ref. 52).
The m-values obtained in the fitting (which correlate with extent of solvation of nonpolar surfaces on protein denaturation; Ref. 40) were significantly attenuated relative to Orn B29-insulin or WT insulin (Table 2, column heading m). Such attenuation suggests that in their respective native states, the analogs exhibited less efficient desolvation of nonpolar surfaces. This trend may be due to a direct perturbation of the (A2, A3)-related interchain crevice (2) by Orn B26, Glu B26, and Ser B26; transmitted structural perturbations cannot be excluded.
Functional Substitutions Led To Accelerated Fibrillation-The reduced stabilities, impaired dimerization, and perturbed m-values of the Orn B26, Glu B26, and Ser B26 analogs motivated assessment of lag times before onset of fibrillation relative to Orn B29-insulin (Fig. 6D and Table 2). These assays were performed in the absence of zinc ions. Although Orn B29-insulin exhibited a broad range of lag times >5 days (with a mean of 13 days in n = 14 trials), the analogs consistently exhibited lag times <4 days (n = 3). Despite the small sample size, p values were <0.01 for each analog. These trends are, therefore, unrelated to the charge of the B26 side chain and net charge of the protein (53). We speculate that perturbation of the (A2, A3)-related interchain crevice favors local unfolding (42) and nonnative conformational excursions, in turn favoring formation of an amyloidogenic nucleus (Fig. 7C) (54).

Discussion
Vertebrate insulin sequences exhibit broad conservation (17), including several invariant residues recently shown to pack at the primary hormone-receptor interface (4). Examples are provided by Val A3, Val B12, and Phe B24, non-polar side chains of distinctive shape and size whose respective binding sites in the IR are also invariant (7). Such framework contacts are also conserved among IGFs and the cognate type 1 IGF receptor (20,21). Diverse amino acid substitutions at A3, B12, B24, or B25 impair receptor binding, including mutations as subtle as Val A3→Leu, Val B12→Leu, Phe B24→Tyr, or Phe B24→Leu (13,55-57). Co-evolution of the hormone-receptor interface has presumably led to the strict conservation of such contact sites. Although current co-crystal structures of model insulin-IR complexes were obtained at low resolution (3.5 Å; Ref. 4), we anticipate that future improvements in resolution may enable quantitative rationalization of such classical structure-activity relationships.
Systematic Mutagenesis of the Insulin Surface-Mutational surveys of key residues in insulin through systematic synthesis of all possible substitutions have provided broad insight into structure-activity relationships (58,59). Such surveys may complement crystallographic studies of "micro-receptor" complexes (4), especially as to date such fragments lack the FnIII domains in the IR α-subunit and so provide an incomplete description of the hormone-receptor interface. A mutational survey of Glu B13, for example, has suggested that this side chain contacts the IR (59) despite its lack of contact to αCT or L1 (4); this inference is in accordance with a proposed Site 2 contact within a FnIII domain (6,60). Mutational surveys may also provide insight into conserved contacts visualized within the IR complex. We recently utilized this approach to survey mutations at position B24 (16). The results provided a probe of a deep nonpolar pocket within the hormone-IR complex (4). At this site only the native side chain of Phe B24 among natural amino acids confers high activity. The non-standard aliphatic residue cyclohexanylalanine (Cha B24) was found to be compatible with high activity, demonstrating that aromaticity per se is not required within the B24-related pocket (16).
Structure-Activity Relationships at B26-Because of the low resolution of the present IR co-crystal structure (4) and enigmatic structure-activity relationships in prior studies (23-29), we undertook a systematic mutagenesis of the B26 position with all standard amino acids, with the exceptions of Cys (to avoid a disulfide exchange) and Lys and Arg (to avoid tryptic cleavage of the octapeptide in the course of semi-synthesis; see "Experimental Procedures"). The latter two residues were represented by the basic amino acid Orn, an analog resistant to tryptic cleavage. The analogs also contained Orn B29 in place of the native Lys for the same reason. Semi-synthesis thus enabled rapid and efficient introduction of diverse substitutions at B26 without encountering barriers to recombinant protein expression (24).
TABLE 2 (footnotes). a, Thermodynamic parameters were inferred from CD-detected guanidine denaturation data by application of a two-state model. b, Fibrillation lag times pertain to zinc-free wild type insulin (in a monomer-dimer equilibrium) and analogs (monomeric); each protein was made 60 μM in phosphate-buffered saline (pH 7.4). A 2-fold increase over baseline in thioflavin T fluorescence provided a criterion for onset of fibrillation. c, All individual samples in this set exhibited the same lag time of 2 days. As the method employed in this study could not distinguish fibril lag times with a resolution given in hours, some variance was added to the data by adding ±0.1 to individual data points to obtain an estimate of the S.D.
The present B26 survey has demonstrated, in striking contrast to our prior B24 survey (16), that chemically diverse side chains are functional, whereas side chains associated with impaired IR binding were hydrophobic (IleB26, LeuB26, and ProB26). TrpB26 and PheB26 were compatible with near-WT affinities, suggesting that their "weakly polar" electronic characteristics or planarity (each feature a consequence of aromaticity (61)) are favorable at the B26-related IR surface. The displaced conformation of residues B24-B27 in the μIR complex leaves TyrB26 partially exposed at a solvated edge of the L1 domain (Fig. 2B). The reduced activity of ProB26-OrnB29-insulin may reflect not only its non-polarity but also its constrained and distinctive conformation.
All of the tested polar or charged amino acids conferred substantial activity (Asp, Glu, His, Asn, Orn, Gln, Ser, and the native Tyr; with Thr in the intermediate class). Of these, the highest affinities were conferred by Glu, Ser, Orn, and the native Tyr. Indeed, the GluB26 and SerB26 analogs of OrnB29-insulin exhibited IR affinities 2-fold greater than those of WT insulin or OrnB29-insulin itself; the affinity of OrnB26-OrnB29-insulin was indistinguishable from WT. Yet Glu, Ser, and Lys (by analogy to Orn) are rarely found in nature as alternatives for TyrB26. To investigate this seeming evolutionary paradox, we measured free energies of unfolding (ΔG_u) by chemical denaturation (40). The thermodynamic stabilities of these active variants were each markedly reduced relative to WT insulin or its parent control (ΔΔG_u 0.9 (±0.2), 0.9 (±0.2), and 0.7 (±0.2) kcal mol⁻¹, respectively). In SEC studies the GluB26 and OrnB26 analogs exhibited impaired zinc-free dimerization and zinc-stabilized hexamer formation in accordance with the classical packing of TyrB26 at an aromatic-rich dimer interface (2). Whereas decreased dimerization of GluB26-insulin has previously been reported (3), the SEC data also demonstrated aberrant higher order zinc-independent aggregation together with instability of the R₆ zinc hexamer. Furthermore, this substitution and OrnB26 perturbed the kinetic stability of the R₆ Co²⁺-substituted insulin hexamer. Finally, we observed that the high-affinity analogs each exhibited (under zinc-free conditions) markedly reduced lag times before fibril formation. Susceptibility to such aggregation-coupled misfolding would be expected to be further magnified by impaired native self-assembly.
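The stability losses quoted above can be put on a more intuitive scale under a two-state model, in which the unfolding equilibrium constant is K_u = exp(-ΔG_u/RT). The following is a minimal sketch rather than the fitting procedure used in the study. It assumes a temperature of 25 °C (not stated in this section) and treats ΔΔG_u as the loss of stability relative to the parent protein; the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def fraction_unfolded(delta_g_u: float, temp_k: float = 298.15) -> float:
    """Fraction of unfolded molecules at zero denaturant (two-state model)."""
    k_u = math.exp(-delta_g_u / (R_KCAL * temp_k))  # K_u = [U]/[N]
    return k_u / (1.0 + k_u)


def unfolding_constant_ratio(ddg_u: float, temp_k: float = 298.15) -> float:
    """Factor by which K_u rises when stability drops by ddg_u (kcal/mol)."""
    return math.exp(ddg_u / (R_KCAL * temp_k))


# Stability losses quoted in the text; 25 degrees C is an assumed temperature.
for ddg in (0.9, 0.9, 0.7):
    ratio = unfolding_constant_ratio(ddg)
    print(f"ddG_u = {ddg:.1f} kcal/mol -> K_u increases about {ratio:.1f}-fold")
```

Under these assumptions, a loss of 0.9 kcal mol⁻¹ corresponds to a roughly 4.6-fold larger unfolding equilibrium constant, and a loss of 0.7 kcal mol⁻¹ to a roughly 3.3-fold increase.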
The present set of analogs contained the platform substitution LysB29 → Orn (introduced to simplify the protocol of trypsin-mediated semi-synthesis). This non-standard residue (related to Lys by removal of a single methylene moiety from a linear side chain; i.e. [2,5]-diaminopentanoic acid versus [2,6]-diaminohexanoic acid) does not perturb the activity or stability of insulin. Its use as a template was unlikely to have influenced the pattern of activities among B26 analogs. Indeed, to the extent that our survey of substitutions recapitulated prior studies of particular analogs in a native context (such as AlaB26-insulin and GluB26-insulin), similar results were obtained with respect to activity (23) and self-assembly (3). Such template independence is not surprising in this case as residue B29 is peripheral to the hormone's receptor-binding surface (62,63), does not contribute to classical self-association surfaces (2,64), and is not conserved among vertebrate insulin sequences (17). Furthermore, in several otherwise high-resolution crystal structures of insulin, the side chain of LysB29 exhibits high thermal B factors (2) and/or lacks continuous electron density (in particular toward its distal end), suggesting dynamic disorder (as exemplified by Protein Data Bank entries 1BEN, 1EV3, 1TRZ; also see PDB entries 1TYL, 1TYM, 1G7A, 1EV3, 1ZNJ, 1EV6, 1BPH, and 1DPH for further examples) (65-67); such flexibility is in accordance with motional narrowing of ¹H NMR resonances observed in a variety of NMR studies (49,68). In pharmacologic applications these considerations have favored B29 as a site of substitution or chemical modification (e.g. ProB29 in Humalog, GluB29 in Apidra, and acylation of LysB29 in Levemir and Tresiba; Refs. 69-72). Surprisingly, the platform substitution LysB29 → Orn (introduced only to simplify the protocol of trypsin-mediated semi-synthesis) was itself associated with enhanced resistance to fibrillation at neutral pH, a finding that suggests that the mechanism of amyloid formation can be influenced by subtle changes despite its seeming universality among polypeptide sequences (30). It is possible that long-range electrostatic interactions by the δ-amino group of OrnB29 (made more distant and so weakened as compared with the ε-amino group of LysB29) may impose kinetic barriers to partial transient unfolding as envisioned in a putative amyloidogenic intermediate (73). Interpretation of relative fibrillation lag times is limited by the lack of structural information pertaining to such non-native intermediates.
FIGURE 7. Evolutionary constraints and insulin fibrillation. A, Venn diagram showing intersection of multiple constraints: function, foldability, misfolding, and assembly. We envisage that TyrB26 is conserved due to its explicit roles in folding and assembly and implicit role in avoiding misfolding. B, surface representation of a T-state monomer (PDB entry 4INS) with residues B23-30 (stick model) within a groove between the A and B chains. The aromatic side chains of PheB24 and PheB25 (both dark gray) and TyrB26 (red) are shown. C, general scheme of insulin fibrillation via a partially unfolded monomer intermediate. The native state is protected by classic self-assembly. Disassembly leads to an equilibrium between native and partially folded monomers. The receptor-bound conformation of insulin (top) may also participate in this equilibrium. This partial fold may unfold completely (bottom) as an off-pathway event or aggregate to form a nucleus en route to a proto-filament (right).
Interest in position B26 has recently been renewed by evidence that substitution of TyrB26 by Asn (and possibly other amino acids) can alter the ratio of affinities to isoform IR-A versus IR-B in cellular assays, possibly as a consequence of a B26-related conformational change (74). Although the degree of selectivity was small, such studies motivated elegant use of non-standard protein engineering to create a novel class of constrained analogs (75). The latter studies employed cell lines that each expressed one isoform or the other but in different lineage-specific contexts. Because such membranes may differ in composition or proteome, it would be of future interest to investigate whether such isoform selectivity might be observed in studies of purified receptor isoforms (as distinct from intact cells) and, if so, might extend to receptor fragments (such as the μIR model).
It would be of future interest to investigate the solution structures of selected B26 analogs by NMR spectroscopy. Although such studies would ordinarily require use of a monomeric insulin template (such as insulin lispro; Ref. 76) to circumvent insulin self-association, a confounding issue at the high protein concentrations required for NMR, our SEC studies of OrnB26-OrnB29-insulin and GluB26-OrnB29-insulin suggest that these analogs may be amenable to high-resolution study without further modification. Of particular interest would be the extent to which these B26 substitutions may lead to nonlocal changes in the dynamics of the hormone's α-helical domain (such as in its pattern of conformational broadening; Ref. 77) due to perturbation of the IleA2/ValA3-associated interchain crevice. In light of the displaced conformation of the B23-B27 segment in the μIR complex (4), it would be of further interest to probe whether such B26 substitutions weaken the attachment of this segment to the insulin core. A recent study of the insulin monomer by molecular-dynamics simulations has predicted that long-range interactions by TyrB26 in WT insulin regulate transient detachment and re-attachment of this segment through a series of conformational substates (78). Use of heteronuclear NMR methods to uncover such "excited-state" conformations represents a promising frontier of protein science (79).
Toxic Misfolding as an Evolutionary Constraint-Although in vivo formation of insulin fibrils is rare in humans, a hystricomorph rodent in South America (Octodon degus) develops senile diabetes mellitus mostly in association with deposition of an amyloid strictly composed of insulin in its islets of Langerhans (80,81). The insulin sequence in this rodent contains the unusual substitution HisB10 → Asn, which removes the central Zn²⁺ coordination site in the classical insulin hexamer. In addition, the variant insulin of O. degus contains the substitution TyrB26 → Arg, which would further be expected to impair or block hexamer assembly. Because the self-assembly of Zn²⁺-insulin hexamers is thought to protect insulin from partial unfolding and non-native aggregation within the storage granules of pancreatic β-cells (1), the combined adverse effects of AsnB10 and ArgB26 on self-assembly may provide a molecular mechanism for the formation of an insulin-specific amyloid in this species. Such an experiment of nature highlights in the breach the implicit constraint that we propose underlies the broad conservation of Tyr at B26: avoidance of toxic misfolding. We thus envisage that a combination of inefficient or unstable disulfide pairing, perturbed hexamer assembly, and heightened susceptibility to fibrillation might be associated with a risk of toxic protein deposition as an amyloidogenic disease and so impose independent evolutionary constraints (Fig. 7A) (68). The displacement of the C-terminal segment of the B chain from the hormone's α-helical core (Fig. 7B) would otherwise favor a mechanism of fibrillation based on distortion of the monomeric structure with aberrant exposure of nonpolar surfaces (Fig. 7C) (53).
Although residue B26 is broadly conserved as Tyr among vertebrate insulins and as Phe among IGFs (17), such conservation is not strict. In addition to the amyloidogenic O. degus sequence, for example, several hystricomorph insulins contain Arg or Ser at B26 (17). Furthermore, the genome of Poecilia formosa (the Amazon molly fish) encodes a proinsulin/IGF-like protein wherein the B24-B26 segment contains a non-aromatic residue at B26 (PheB24-TyrB25-AsnB26; SwissProt entry A0A096M678_POEFO). Such anomalies are in accordance with the dispensability of TyrB26 in truncated insulin analogs with native activity, as exemplified by des-pentapeptide[B26-B30]-insulin-amide (26). It is possible that the life spans of such animals are shorter than the time scale of toxic protein deposition, and so their reproductive success is unaffected by the variant insulin.
Concluding Remarks-Our results suggest that conservation of TyrB26 is unlikely to be enjoined by the topography of the IR. The reduced stability of alternative high-affinity analogs and their enhanced susceptibility to fibrillation instead suggest that its conservation reflects co-optimization of several factors, including not only activity but also stability and avoidance of toxic misfolding (30). The latter's evolutionary importance is highlighted by the monogenic proinsulin syndrome (82). Indeed, the broad conservation of TyrB26 may reflect the multiple roles played by specific side chains in the course of a complex "conformational life cycle" from nascent folding to receptor binding. It would be of future interest to extend our survey from insulin to proinsulin in relation to foldability in the endoplasmic reticulum and corresponding folding efficiency in vitro.
Author Contributions-V. P. prepared the set of insulin analogs, performed the fibrillation- and guanidine-titration studies, and coordinated the receptor-binding studies. N. B. P. undertook SEC-HPLC studies and contributed to these sections of the manuscript. N. R. and V. P. performed spectroscopic studies of cobalt insulin hexamers. M. C. L. contributed to in silico analysis of the B26-related binding surface of the insulin-microreceptor complex. J. W. oversaw the receptor-binding experiments. M. A. W. coordinated the study and preparation of the manuscript.