Solution structure of an ultra-stable single-chain insulin analog connects protein dynamics to a novel mechanism of receptor binding

Domain-minimized insulin receptors (IRs) have enabled crystallographic analysis of insulin-bound “micro-receptors.” In such structures, the C-terminal segment of the insulin B chain inserts between conserved IR domains, unmasking an invariant receptor-binding surface that spans both insulin A and B chains. This “open” conformation not only rationalizes the inactivity of single-chain insulin (SCI) analogs (in which the A and B chains are directly linked), but also suggests that connecting (C) domains of sufficient length will bind the IR. Here, we report the high-resolution solution structure and dynamics of such an active SCI. The hormone's closed-to-open transition is foreshadowed by segmental flexibility in the native state as probed by heteronuclear NMR spectroscopy and multiple conformer simulations of crystallographic protomers as described in the companion article. We propose a model of the SCI's IR-bound state based on molecular-dynamics simulations of a micro-receptor complex. In this model, a loop defined by the SCI's B and C domains encircles the C-terminal segment of the IR α-subunit. This binding mode predicts a conformational transition between an ultra-stable closed state (in the free hormone) and an active open state (on receptor binding). Optimization of this switch within an ultra-stable SCI promises to circumvent insulin's complex global cold chain. The analog's biphasic activity, which serendipitously resembles current premixed formulations of soluble insulin and microcrystalline suspension, may be of particular utility in the developing world.

Insulin is a small globular protein that regulates metabolic homeostasis in vertebrates. A classical model for studies of protein structure and long a mainstay of therapy for diabetes mellitus (DM), insulin was the first recombinant protein in clinical use and a pioneering target of protein engineering (1). Whereas design of first-generation analogs sought to optimize pharmacokinetic (PK) properties (i.e. rapid- or long-acting therapeutic formulations), recent efforts have focused on protein stability (2-4). Ultra-stable insulin analogs promise to circumvent a costly "cold chain" underlying its global distribution (2). The salience of this issue has been sharpened by an emerging pandemic of DM in the developing world (5).
Insulin contains two chains: an A chain (21 residues) and B chain (30 residues) (Fig. 1A) (6). The mature hormone nonetheless belongs to a metazoan superfamily of single-chain proteins (7) and itself derives from a single-chain precursor (proinsulin (8)). In the accompanying article (9), we described an ultra-stable single-chain insulin analog (labeled SCI-a; Fig. 1C) with appropriate biological activity (including duration of signaling) in a rat model (9). The crystal structure of SCI-a, determined as a non-canonical zinc-free hexamer, revealed native-like component dimers with apparent disorder in the engineered C domains and adjoining B domain residues. Because of this structure's limited resolution (2.8 Å) and incomplete electron density, we sought a high-resolution solution structure and its dynamic assessment. This study thus focused on a monomeric variant of SCI-a, viz. SCI-b, that is amenable to heteronuclear 3D/4D-NMR spectroscopy (10,11). SCI-b differed from its parent at residues B28 and B29, recapitulating features of rapid-acting insulin analogs (Fig. 1B) (12,13). Intravenous (i.v.) bolus injection of SCI-b led, for unknown reasons, to protracted insulin action (9), similar to that of an ultra-stable two-chain insulin analog cross-linked by an additional disulfide bridge (3). The monophasic pharmacodynamics (PD) profile of SCI-a by contrast resembled that of wild-type (WT) insulin (9).
Insulin binds as a monomer to a receptor tyrosine kinase (TK), the insulin receptor (IR). The product of a single gene, the IR precursor is processed in the trans-Golgi network to form a disulfide-linked (αβ)₂ homodimer (14). The extracellular α subunits bind insulin, and the transmembrane β subunits contain the intracellular TK domains (15). Whereas the structure of the holoreceptor has been visualized only at low resolution (>20 Å (16)), its domain dissection has enabled stepwise crystallographic analysis. (i) Structures of the intracellular TK domains and their mode of interaction have been determined at a resolution of 1.9 Å (17,18). (ii) The structure of the N-terminal three domains of the α subunit (Leu-rich domain 1, Cys-rich domain, and Leu-rich domain 2; L1-CR-L2) has been determined at 2.3 Å (19). (iii) The structure of the dimeric ectodomain ((αβΔ)₂, containing the entire α subunit and the extracellular portion of the β subunit, βΔ) has been determined at 3.3 Å (20), wherein its conformation resembles an inverted V. Although co-crystals of an insulin-ectodomain complex have not been obtained, a variety of evidence indicates that insulin binds within the crux of the dimer (15,21,22).
Construction of domain-minimized models of the IR α subunit has enabled crystallization of "micro-receptor" (μIR) complexes (15,21). Such complexes contain, at a minimum, the primary insulin-binding elements L1 and αCT stabilized by an Fab antibody fragment. One such co-crystal structure, determined at 3.5 Å resolution, depicted a ternary complex between insulin, an L1-CR fragment, and a synthetic αCT peptide spanning residues 704-719 of IR isoform A (15). In this structure the C-terminal segment of the insulin B chain pivots from the hormone's α-helical core to enable its insertion between the conserved surfaces of L1 and the αCT peptide. This mode of binding, anticipated based on studies of anomalous insulin analogs (23,24) and residue-specific photo-crosslinking (25), defined the binding surfaces in the IR for insulin's conserved triplet of aromatic residues: PheB24, PheB25, and TyrB26. The side chain of TyrB26, although important for insulin self-assembly (6), packs at a solvated edge of the IR (15) and can be substituted by Ala, Ser, or Glu without loss of affinity (26).
The solution structure of SCI-b presented here contains a well-ordered α-helical domain whose overall structure is similar to corresponding portions of the crystallographic protomers of SCI-a. The latter resemble T-state protomers as observed in T₆ crystal structures of WT insulin (9). The B chain β-strand (residues B24-B28) exhibits local order in the absence of a dimer-related β-sheet, but its precise positioning relative to the A and B domain α-helices differs from that in crystallographic dimers (9). The C domain is less well-ordered than the globular domain, but its ensemble of conformations is not random; non-polar and electrostatic contacts with residues in the N-terminal α-helix of the A domain are on average maintained. Order parameters derived from ¹³C chemical shifts and ¹H-¹⁵N relaxation studies are in accordance with these trends.
NMR-derived structures of proteins are ordinarily based on the assumption that all restraints (as derived from nuclear Overhauser effects (NOEs), J-coupling constants, and patterns of chemical shifts (27)) are enforced simultaneously (28). This assumption may not be valid in the presence of local or segmental mobility (29). Accordingly, we have extended such modeling through use of time-averaged molecular-dynamics (MD) simulations in which all or part of the protein satisfies the NMR-derived restraints only as time-averaged in an MD trajectory (30,31). This dynamic view was extended by simulated annealing (SA) and multiconformer simulation within the SCI-a crystal lattice to obtain an ensemble of possible states. A third and more distantly related ensemble was obtained through MD simulations of SCI-b as docked within a μIR complex (15,21). Comparison of these ensembles highlights the role of segmental mobility in the mechanism of receptor binding.
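The difference between simultaneous and time-averaged enforcement of restraints can be illustrated with a minimal sketch. It assumes r⁻³ averaging of an interproton distance over trajectory frames, a common convention for time-averaged NOE restraints; the averaging scheme used in this study is not stated in this section. The two-state trajectory, the distances, and the 5.0 Å upper bound are hypothetical.

```python
def r3_averaged_distance(distances):
    """Return the <r^-3>^(-1/3) average of per-frame distances (in Å)."""
    mean_inv_cube = sum(d ** -3 for d in distances) / len(distances)
    return mean_inv_cube ** (-1.0 / 3.0)

# Hypothetical NOE upper bound (Å) applied to two proton pairs.
UPPER_BOUND = 5.0

# Hypothetical trajectory alternating between two conformers:
# in one, pair A is close and pair B is far; in the other, the reverse.
pair_a = [2.5, 8.0] * 50
pair_b = [8.0, 2.5] * 50

# Simultaneous enforcement: count frames in which both bounds hold at once.
frames_satisfying_both = sum(
    1 for a, b in zip(pair_a, pair_b) if a <= UPPER_BOUND and b <= UPPER_BOUND
)

# Time-averaged enforcement: compare the r^-3 averaged distances to the bound.
avg_a = r3_averaged_distance(pair_a)
avg_b = r3_averaged_distance(pair_b)

print(f"frames satisfying both restraints: {frames_satisfying_both}")
print(f"time-averaged distances: {avg_a:.2f} Å and {avg_b:.2f} Å")
```

In this toy case no single frame satisfies both restraints, yet both are satisfied on average because the r⁻³ weighting is dominated by the short-distance frames. A mobile segment can therefore be consistent with the NOE data without any one conformer being so.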
Together, our results imply a functional transition between an ultra-stable closed state in the free hormone and an active open state on receptor binding. A model of an SCI-IR complex is proposed in which the critical IR αCT element "threads" within a loop opened by displacement of the B domain C-terminal segment and bounded by a disordered C domain. Beyond these mechanistic implications, the biphasic PD features of SCI-b in vivo closely matched the dual-action profile of premixed soluble and microcrystalline insulin formulations (32): regimens with potential benefits among minority communities in the West (33) and of broad current use in Africa (34) and Asia (35). The biphasic action profile of the present SCI together with its extraordinary resistance to thermal degradation (9) may enhance access to insulin among underprivileged patients in the developing world (2,5).

Results
SCI-a and SCI-b are described in the accompanying article (9). Designed based on prior studies of two-chain insulin analogs (12,13), chemically cross-linked analogs (36), and an SCI prototype (4), these proteins contain sequence elements intended to co-optimize specific biochemical and biophysical properties (Fig. 1C). Whereas SCI-a (amenable to protective self-assembly) was envisaged as a therapeutic candidate, SCI-b

Heat-stable insulin dynamics
provides a monomeric model amenable to high-resolution NMR study. Paired substitutions AspB28 and ProB29 combine features of two rapid-acting analogs in current clinical use (insulins lispro and aspart) (Fig. 1B) (12,13). Dimerization and in turn higher-order assembly are impaired by (i) the absence of proline at B28, whose pyrrolidine ring engages GlyB23′ across the dimer interface (37), and (ii) the negative charge of AspB28 itself (1). The tractability of the one-dimensional ¹H NMR spectrum of SCI-b at high protein concentration (9) motivated its present uniform ¹⁵N/¹³C-labeling in the yeast Pichia pastoris; heteronuclear resonance assignments were obtained by standard methods (10,27).
Figure 1. [...] (6)) from two perspectives showing A (green) and B (blue) chains and disulfide bridges (yellow sticks): A20-B19, A7-B7, and A6-A11 (black boxed labels). Side chains of ThrA8 and TyrA14 are highlighted as red sticks. The C-terminal B chain segment is labeled. B, protein sequences of WT insulin, insulin aspart, and insulin lispro. C, protein sequences of novel single chains SCI-a and SCI-b as well as progenitor SCI-c (4). SCIs contain a six-residue C domain; peptide bonds connecting the A and B domains to the C domain are shown as red lines. Substitutions relative to WT insulin in B and C are shown in red. Gold lines in sequences indicate disulfide bonds. Black arrows at bottom of C highlight residues B10, B28, B29, and A14.

Solution structure resembles crystallographic protomers
The solution structure of SCI-b was calculated on the basis of 1370 NOE distance restraints (average of 24 per residue; Table S1) and 75 dihedral angle restraints. An ensemble, based on simultaneous enforcement of these restraints, contains a native-like insulin fold (A and B domains; green and blue in Fig. 2, A and B) connected by the 6-residue C domain (red in Fig. 2, A and B). Statistics are given in Table S1. The SCI-b core is similar to that of WT insulin (gray in Fig. 2C) and our previous SCI prototype (gray ensemble in Fig. 2D). In the SCI-b monomer, residues B23-B28 appear less well-ordered and displaced relative to the corresponding segments at a classical dimer interface in SCI-a (9), wherein B24-B28 and its dimer-related partner form an anti-parallel β-sheet (6). Following helical alignment of the SCI-b ensemble to all protomers of SCI-a (Fig. 3A), the mean root-mean-square deviations (r.m.s.d.) of residues B24, B25, and B26 are 1.2 (±0.3), 1.3 (±0.3), and 2.0 (±0.5) Å, respectively. SCI-b also differs from crystallographic protomers of SCI-a in the presumed orientation of the B27-C6 segment, although well-defined electron density in the latter structure was not observed (9).
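As an illustration of how a per-residue r.m.s.d. with its spread across an ensemble can be tabulated, the following is a minimal sketch. It assumes that the models have already been superposed on the reference (for example on the helical segments), so no fitting is performed here. The coordinates are hypothetical and are not taken from the SCI-b or SCI-a structures.

```python
import math
from statistics import mean, stdev

def rmsd(atoms_a, atoms_b):
    """Root-mean-square deviation (Å) between two equally sized, pre-aligned atom lists."""
    if len(atoms_a) != len(atoms_b):
        raise ValueError("atom lists must have the same length")
    squared = [
        sum((p - q) ** 2 for p, q in zip(a, b))
        for a, b in zip(atoms_a, atoms_b)
    ]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical backbone atoms (x, y, z in Å) of one residue in a reference protomer.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]

# Hypothetical pre-aligned ensemble models of the same residue.
ensemble = [
    [(0.0, 1.0, 0.0), (1.5, 1.0, 0.0), (3.0, 1.0, 0.0)],  # rigid 1 Å shift
    [(0.0, 0.0, 2.0), (1.5, 0.0, 2.0), (3.0, 0.0, 2.0)],  # rigid 2 Å shift
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 3.0, 0.0)],  # one atom moved 3 Å
]

per_model = [rmsd(model, reference) for model in ensemble]
mean_rmsd = mean(per_model)
sd_rmsd = stdev(per_model)

print(f"r.m.s.d. = {mean_rmsd:.2f} (±{sd_rmsd:.2f}) Å over {len(per_model)} models")
```

Reporting the mean with its standard deviation across models, as in the values quoted above, separates a systematic displacement of a segment from imprecision within the ensemble.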
The C domain is neither well-ordered nor fully disordered. Its ensemble of trajectories spans a crevice between A and B domains. Segmental imprecision reflected a lower density of NOE restraints; evidence of underlying dynamic disorder is presented below. Stabilizing interactions were apparently introduced by the arginine guanidinium groups as follows (Fig. 2, E-G). (i) The N-terminal portion of the A1-A8 α-helix extends to include residues C4-C6 (sequence PRR) as demonstrated by the proximity of ProC4 and ValA3 (Fig. 2E; corresponding 4D NOESY cross-peaks in Fig. S1). (ii) A salt bridge is formed between ArgC6 and GluA4 (Fig. 2F). (iii) Favorable π-cation interactions (38) are inferred between TyrA19-ArgC5 and possibly PheB25-ArgC5 (Fig. 2G). The latter is observed in some but not all members of the ensemble due to imprecision at residue B25. All residues within the insulin moiety mutated relative to WT (B28, B29, A8 and A14; Fig. S2) are solvent-exposed. Whereas the A8 and A14 side chain positions are well-defined within the ensemble, AspB28 and ProB29 side chains are imprecise, residing proximal to the flexible BC junction.
Figure 2. [...] and published SCI-c NMR ensemble (gray; PDB code 2JZQ) relative to SCI-b (red, green, blue) (D). Superpositions were aligned according to the main-chain atoms of the A1-A8, A13-A19, and B9-B19 segments. A representative model in E-G was selected to demonstrate close proximity of ProC4-Hγ2 and ValA3-Hγ2 extending the A1-A8 helix (E), the salt bridge between ArgC6 guanidinium nitrogen and GluA4 carboxylate (F), and putative π-cation interactions between ArgC5 and TyrA19 or PheB25 (G). Color code: A domain (green); B domain (blue); C domain linker (red); disulfide bridges (yellow sticks); oxygen (red); nitrogen (cyan); carbon (black); and hydrogen (white). Distance measurements are indicated by red dashed lines; π-cation distances were measured from centers of the aromatic rings (small black spheres) to the terminal nitrogen atoms of ArgC5.

Segmental imprecision in the NMR-derived ensemble mirrors pattern of order parameters
To probe protein dynamics, ¹⁵N relaxation studies were undertaken to measure residue-specific R₁ and R₂ relaxation rates and heteronuclear NOE intensities (Fig. 4A). These data, probing motions on the picosecond to nanosecond time scale, enabled calculation of generalized order parameters (S²) (Fig. 4B, top). Such parameters distinguish between rigid (S² ≥ 0.65) and flexible (S² < 0.65) elements (39). The latter correlated with sites of fast ¹H-²H amide-proton exchange (red squares) as described below. These residues were fit best by an extended model-free analysis where S²_fast = S² − S²_slow (see "Experimental procedures").
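The rigid/flexible distinction above amounts to a threshold on S². The following minimal sketch applies the 0.65 cutoff quoted in the text to a small table of per-residue order parameters. The residue labels are real positions in SCI-b, but the S² values are hypothetical placeholders rather than measured data.

```python
RIGID_THRESHOLD = 0.65  # cutoff quoted in the text (S² >= 0.65 is treated as rigid)

def classify_order_parameters(order_parameters, threshold=RIGID_THRESHOLD):
    """Label each residue 'rigid' or 'flexible' from its generalized order parameter."""
    return {
        residue: ("rigid" if s2 >= threshold else "flexible")
        for residue, s2 in order_parameters.items()
    }

# Hypothetical S² values keyed by residue; not measured data.
s2_values = {"B12": 0.85, "B27": 0.55, "C3": 0.40, "A16": 0.65}

labels = classify_order_parameters(s2_values)
flexible_residues = [res for res, label in labels.items() if label == "flexible"]

for residue, label in labels.items():
    print(f"{residue}: S² = {s2_values[residue]:.2f} -> {label}")
```

A value exactly at the cutoff is counted as rigid, matching the "≥" in the stated criterion.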
Regions of flexibility or order, as distinguished by the above ¹H-¹⁵N relaxation studies, correlated with patterns of ¹³C NMR chemical shifts. Shown in Fig. 4B are secondary-structure probabilities (SSP; bottom) and predicted order parameters (S²′; middle). Ordered regions (S² and S²′ > 0.65 and SSP > 0) correspond to the three canonical α-helices. SSP scores also suggest β-strand features within the B23-B30 segment. The latter segment exhibited increasing flexibility (0.45 < S² < 0.65) from B23 to B30 with maximal flexibility within the C domain (residues 31-36 or C1-C6). Values of S² and S²′ were consistently <0.6 in the C domain. Although relaxation parameters could not be obtained at ProC4 (which lacks an amide proton), its chemical shift-derived S²′ suggested less disorder than at flanking sites. These dynamic probes may be taken into account to generate alternative ensembles of protein structures based on time-averaged MD simulations (30,31) (see "Discussion").

SCI-b exhibits enhanced thermodynamic stability
As expected, based on its native-like solution structure, far-UV circular dichroism (CD) spectra of SCI-b and insulin lispro implied similar helix contents at 4°C (Fig. 5, A and B). Thermal unfolding was monitored by mean residue ellipticity at 222 nm in the range 4-88°C (Fig. 5C). Whereas insulin lispro exhibited a marked and irreversible loss of helix content with increasing temperature, SCI-b displayed only a small and limited attenuation of ellipticity on heating. On cooling from 88 to 4°C, the far-UV CD spectrum of SCI-b was indistinguishable from the original spectrum, whereas lispro did not refold. Dynamic light scattering (DLS) revealed that lispro forms large aggregates following 10 min of incubation at 88°C, whereas these conditions do not affect the size distribution of SCI-b (Fig. S3). Thermodynamic stability was likewise assessed by CD-monitored guanidine denaturation with application of a two-state model (Fig. 5D and Table 1). The SCI's free energy of unfolding (ΔG_CD 4.7 (±0.1) kcal/mol) was larger than that of insulin lispro (3.1 (±0.1) kcal/mol). Respective titration midpoints (C_mid) were 5.7 (±0.1) M (SCI-b) and 4.7 (±0.1) M (insulin lispro). In such modeling the m-value was higher for SCI-b (0.82 (±0.01) kcal/(mol·M)) than for insulin lispro (0.67 (±0.01) kcal/(mol·M)), suggesting that hydrophobic surfaces are more effectively desolvated in SCI-b. These estimates depend on the assumption of two-state unfolding and on fitting of post-transition baselines (40).
Figure 3. [...] and disulfide bridges in yellow) with all six monomers of SCI-a (black, crystal refinement from accompanying article (9)); the structures were aligned as in Fig. 2. Two views are given at relative angle 90° to illustrate differences among B24-B30 segments. For clarity, SCI-b structures are at low opacity. B, average pairwise backbone r.m.s.d. comparing all monomers of SCI-a to 18 SCI-b structures. Absent r.m.s.d. values reflect residues with indeterminate electron densities. C-E, stereo models illustrating side-chain packing of residues B15 and B24 (C), B12 and B26 (D), and A2, A16, and A19 (E) within the mean SCI-b structure overlaid with the side-chain electron densities of SCI-a monomer D; structures were aligned as in Fig. 2 (SCI-a structure is hidden for clarity). Disulfide bridges are shown in yellow, and atoms otherwise colored by domain: A domain, green; B domain, blue; C domain, red; and atom colors: oxygen, light red; and nitrogen, light blue.
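The two-state guanidine parameters reported in this section can be cross-checked with a short calculation. The sketch below assumes the linear extrapolation form of the two-state model, in which the free energy of unfolding falls linearly with denaturant concentration and reaches zero at the midpoint, so that ΔG = m × C_mid. The fitting procedure itself is not reproduced, and the function name is illustrative.

```python
def unfolding_free_energy(m_value, c_mid):
    """Two-state estimate of the unfolding free energy (kcal/mol) as m x C_mid.

    Assumes linear extrapolation: dG(c) = dG_u - m*c, with dG(C_mid) = 0.
    """
    return m_value * c_mid

# Values quoted in the text: m in kcal/(mol*M), C_mid in M, dG in kcal/mol.
reported = {
    "SCI-b": {"m": 0.82, "c_mid": 5.7, "dg": 4.7},
    "insulin lispro": {"m": 0.67, "c_mid": 4.7, "dg": 3.1},
}

for name, values in reported.items():
    estimate = unfolding_free_energy(values["m"], values["c_mid"])
    print(f"{name}: m x C_mid = {estimate:.2f} kcal/mol (reported {values['dg']})")
```

Under this assumption the products are about 4.67 and 3.15 kcal/mol, which agree with the reported 4.7 (±0.1) and 3.1 (±0.1) kcal/mol within the quoted uncertainties.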

To obtain estimates of thermodynamic stability without reference to two-state modeling, we investigated amide-proton exchange kinetics in D₂O at pD 7.0 and 25°C. These studies employed protein samples uniformly enriched in ¹⁵N. The first ¹⁵N-HSQC spectrum (Fig. 5E; obtained ~10 min following dissolution in D₂O) exhibited only four ¹H-¹⁵N cross-peaks: B15, B18, A16, and A19. Their progressive ¹H-²H exchange in subsequent HSQCs yielded monoexponential kinetics (Fig. 5F). Comparison of observed (k_obs) and intrinsic exchange rates (k_int; calculated using SPHERE (41)) yielded protection factors (PF = k_int/k_obs) and in turn provided an estimate of thermodynamic stability (ΔG_pH7.4): 4.63 (±0.08) kcal/mol (averaged over sites B15, B18, and A16; Table 2). This estimate, in striking accord with the results of guanidine denaturation studies, assumes global-exchange kinetics, i.e. sites where ¹H-²H exchange requires complete unfolding (42,43).
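The chain of calculations described above (monoexponential decay, protection factor, free energy) can be sketched as follows. The sketch assumes that exchange is in the EX2 limit and reports on global unfolding, so that ΔG = RT ln(PF). The log-linear least-squares fit is one simple way to extract k_obs and is not necessarily the fitting method used here. All rates and intensities are hypothetical, chosen so that PF is about 2500.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def fit_k_obs(times, intensities):
    """Estimate a monoexponential decay rate from a log-linear least-squares fit."""
    logs = [math.log(i) for i in intensities]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    slope = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs)) / sum(
        (t - t_mean) ** 2 for t in times
    )
    return -slope

def protection_factor(k_int, k_obs):
    """PF = k_int / k_obs, as defined in the text."""
    return k_int / k_obs

def free_energy_from_pf(pf, temperature_k=298.15):
    """dG = RT ln(PF) in kcal/mol; valid only under the EX2, global-exchange assumption."""
    return R_KCAL * temperature_k * math.log(pf)

# Hypothetical cross-peak intensities decaying at 2.0e-4 per second (no noise).
times = [600, 1800, 3600, 7200, 14400]  # seconds after dissolution in D2O
intensities = [100.0 * math.exp(-2.0e-4 * t) for t in times]

k_obs = fit_k_obs(times, intensities)
k_int = 0.5  # hypothetical intrinsic rate (per second); SPHERE is not called here
pf = protection_factor(k_int, k_obs)
dg = free_energy_from_pf(pf)

print(f"k_obs = {k_obs:.2e} s^-1, PF = {pf:.0f}, dG = {dg:.2f} kcal/mol")
```

With these placeholder inputs a protection factor near 2500 corresponds to roughly 4.6 kcal/mol at 25°C, which indicates the magnitude of protection implied by a stability in the range reported above.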

Subglobal conformational fluctuations are similar to those in a two-chain analog
Analysis of presumed global ¹H-²H exchange at pD 2.4 (Fig. 6A; residues B18, A16, and A19) yielded estimates of ΔG_pH2.8 (Table S2) similar to those at neutral pH. Amide-proton exchange is markedly slower under these conditions as intrinsic exchange rates are slower than at neutral pD (41). Despite such global stabilization, the residue-specific pattern of subglobal and local PFs (Fig. 6B) was in accordance with a previous analysis of insulin lispro at pD 3.0 (43). Such similarities provide evidence that the C domain does not significantly damp subglobal motions on this prolonged time scale (>1 min).

Modeling of receptor-bound SCI-b predicted a novel open conformation
In models of the complex between SCI-b and IR (Fig. 7), αCT was threaded through the loop formed by the C domain and B24-B30 segment. The proximity of αCT to the C terminus of the A domain precluded all attempts to form an un-threaded model. Removing αCT residue Phe-714 from the alignment with the template allowed αCT to lose all contact with SCI-b, resulting in models with αCT no longer threaded through the loop; in these models Phe-714 also no longer occupied its usual site bounded by IleA2, ValB12, LeuB15, and IR residues Leu-37 and Phe-64.
Interactions between SCI-b and IR over the 300-ns simulation (Fig. 7A) mirrored those observed in the co-crystal structure of the WT insulin complex, with key interactions conserved. In particular, PheB24 occupied the hydrophobic pocket formed by B domain residues ValB12, LeuB15, TyrB16, and CysB19 and IR residues Asn-15, Leu-37, Phe-39, and Phe-714. PheB25 projected away from L1-β2 and inserted between αCT residues Val-715 and Pro-718 in the same way as observed in the WT complex. Numerous transient intramolecular electrostatic interactions were observed: between TyrB26 and AspB28, between C domain residues GluC1/GluC2 and ArgC5/ArgC6, and between GluA4 and ArgC5. The C domain of SCI-b contacted the flexible loop on the CR domain of the IR between residues 264 and 277, predicting numerous nonspecific interactions. Superposition of bound SCI-b from this simulation to free SCI-b obtained by NMR ( [...] overlay (Fig. 7C) and from a superposition of bound structures to all monomers from the SCI-a crystal structure (Fig. 7D; structure from Ref. 9) highlight that the B23-C6 segment primarily contributes to the closed → open conformational transition.
Figure 4. [...] ¹⁵N spin-lattice (R₁, black) and spin-spin (R₂, blue) relaxation rates and heteronuclear NOEs (red). Data were obtained using a ¹³C,¹⁵N-SCI-b sample. Vertical error bars in each plot result from input spectral noise in each T₁ or T₂ experiment and subsequent error analysis by the Relax NMR software package (90). B, top, generalized (S², black squares) and fast (S²_fast, red squares) order parameters calculated from experimental ¹H-¹⁵N T₁, T₂, and heteronuclear NOE data with the DYNAMICS software package (91). Predicted generalized order parameter (S²′; middle plot) and secondary structure prediction scores (bottom plot) were calculated by the TALOS+ software program (60) based on chemical shifts. In all plots, blue-, red-, and green-shaded areas, respectively, highlight the B9-B19 helix, the C1-C6 segment, A1-A8 helix, and the A13-A19 helix. For SSP probabilities (bottom), positive values are associated with α-helix, whereas negative values suggest β-strand; an SSP value of zero implies disorder.

SCI-b resembles a premixed soluble-microcrystalline insulin formulation
The biological activity of SCI-b was tested on subcutaneous (SQ) injection in Lewis rats rendered diabetic by streptozotocin (Fig. 8, A and B) (44). Because its intrinsic PD profile (as defined on i.v. bolus injection; see our accompanying article (9)) exhibits fast and delayed phases, our SQ studies were conducted in relation to a premixed clinical formulation containing soluble and microcrystalline components (Humalog Mix 75/25; Lilly) (32). Soluble insulin lispro provided a rapid-acting control (red squares in Fig. 8). Whereas standard premixed formulations were developed to provide effective post-prandial bolus (rapid) and basal (delayed) insulin activity via differential absorption rates in the SQ depot (black and aquamarine in Fig. 8), a monocomponent solution of SCI-b provided a similar action profile (green triangles in Fig. 8). The potential implications of this finding for global health are discussed below.

Discussion
Protein dynamics is integral to a molecular understanding of structure and function. Segmental flexibility, for example, may underlie conformational change and in turn facilitate protein-protein recognition (45). Furthermore, such dynamic features may accelerate evolution of novel functions (46). In this and our accompanying article (9), we have sought to apply such biophysical principles to the engineering of ultra-stable protein therapeutics. We first discuss our biophysical findings and then their translational implications.

SCIs provide an ultra-stable platform for protein engineering
This study focused on an ancestral folding motif: the insulin-related superfamily (7). The high-resolution NMR structure of an active SCI, designed as an engineered monomer in solution, provided a foundation for analysis of protein dynamics. Our findings were extended by use of SA/MD simulations to gain insight into the dynamics of the parent protein in a crystal lattice and in a model receptor complex.
In the accompanying article in this issue (9), we presented the design rationale for SCI-a and SCI-b (Fig. 1C). The crystal structure of SCI-a was determined at moderate resolution (2.8 Å) as a zinc-free hexamer. Because an entire hexamer lies within the lattice's asymmetric unit, each protomer was crystallographically independent. Despite the limited resolution of the diffraction data, the six insulin moieties (i.e. the A and B domains) exhibited canonical structures with native dimerization. Root mean square deviation (r.m.s.d.) values among the six crystallographic protomers were consistent with the structural variability of WT insulin in different crystalline environments (47). Incomplete electron density was observed (to a varying extent among the six protomers) in the respective C domains and preceding residues of the B domain (B27-B30). Precise analysis of side-chain conformations and interactions in the hydrophobic core was limited by the crystallographic resolution. The crystal structure of SCI-a exhibited a novel overall feature: in the absence of axial zinc ions, the trimer interfaces were displaced relative to classical zinc hexamers (9). We speculated that this novel zinc-free SCI hexamer pertains in β-cells to self-assembly of proinsulin in the endoplasmic reticulum and/or Golgi apparatus (in which zinc ion concentrations are typically less than 1 pM (48)). Such native assembly might mitigate aggregation-coupled misfolding en route to prohormone processing and storage in glucose-regulated secretory granules. Subsequent formation of zinc-insulin hexamers and their microcrystallization within these granules are made possible by a β-cell-specific zinc transporter ZnT8 (49).
Figure 5. [...] Table 1). E, ¹H-¹⁵N-HSQC of ¹⁵N-SCI-b ~1500 s after placement of dried protein sample in 100% D₂O potassium phosphate buffer. F, exponential decay profiles generated by plotting HSQC cross-peak intensities as a function of time. Only B15, B18, A16, and A19 resonances were present in first and subsequent HSQCs due to rapid baseline base-catalyzed ¹H-²H exchange at pH 7.4. Solid lines show single-exponential fits; parameters are given in Table 2.
[...] for A19 is ascribed to its proximity to cystine A20-B19, which may confound calculation of k_int, possibly due to residual structure in the unfolded state under these conditions.

Interrelation of stability, dynamics, and function
Interest in SCIs was stimulated by classical studies in which bifunctional chemical reagents were employed to tether the C terminus of the B chain to the N terminus of the A chain (50,51). Such non-standard linkers most often connected the ε-amino group of LysB29 to the α-amino group of GlyA1, thereby mimicking a connecting peptide. Because D-amino acid substitutions at A1 are well-tolerated (52), an alternative approach employed a D-LysA1 ε-amino group, extending the effective linker length (51). The relative receptor-binding affinities of the resulting B29-A1 tethered insulin analogs reflected the number of atoms within the cross-link: the longer the tether, the stronger the insulin analog-IR binding interaction.
The relationship between linker length and IR affinity suggested that conformational "play" between LysB29 and GlyA1 is required for high-affinity hormone binding. Biophysical studies of NαA1,NεB29-ethylene glycol-bis-succinoyl-insulin, whose tether contains 22 connecting atoms inclusive of the B29 side chain, demonstrated increased thermodynamic stability (ΔΔG_u 1.9 kcal/mol at 23°C) and decreased protein flexibility, as indicated by an ~10³-fold retardation in overall rate of amide-proton exchange in D₂O (36). To our knowledge, the latter study provided the first evidence for a biophysical linkage between inter-chain dynamics and protein stability in the insulin molecule. The present amide ¹H-²H exchange studies of SCI-b similarly revealed increased global thermodynamic stability through significantly delayed global and subglobal exchange in D₂O at pD 2.4 (pH 2.8) relative to analogous studies of two-chain WT insulin (53).

SCI-b exhibits a combination of structural order and dynamic disorder
The structure of SCI-b was determined as an engineered monomer in solution by heteronuclear multidimensional NMR methods (10,11). Whereas SCI-a contained ProB28-GluB29, a feature of clinical analog insulin glulisine compatible with a native mode of dimerization (54), paired substitutions AspB28 and ProB29 in SCI-b impaired dimerization; this design combined features of insulin aspart and lispro (12,13). The ¹H NMR spectrum of SCI-b was found to exhibit sharp resonances even at a protein concentration of 1 mM (9). The pattern of secondary ¹H NMR chemical shifts was consistent with stable folding of a globular domain. Ring-current shifts associated with formation of classical dimer and trimer interfaces (55) were not observed.
The mean NMR structure of SCI-b is similar but not identical to crystallographic protomers of SCI-a (Fig. 3A) or WT insulin (6,56). Although similarities in core side-chain packing were observed between the average SCI-b solution structure and the crystallographic electron density of SCI-a (Fig. 3, C-E), minor differences occur in segment B24-B28 (backbone r.m.s.d. in Fig. 3B). The latter is a "naked" β-strand in the monomer (57) but stabilized as an anti-parallel β-sheet at the dimer interface of the hexamer (6,56). In accordance with past NMR studies of two-chain analogs (24,57,58), the aromatic side chains of PheB24 and TyrB26 pack against the α-helical core even in the absence of its partner strand.
Also congruent between solution and crystal structures was the partial disorder of the C domain. Four types of NMR observations provide evidence for its reduced structural organization. First, the C domain exhibited more limited ¹H, ¹³C, and ¹⁵N chemical-shift dispersion than do the A and B domains. Such attenuated dispersion predicted lower order parameters relative to those of the insulin moiety (with the exception of frayed N-terminal residues B1-B3) (59). Second, C domain resonances exhibited motional narrowing (57,58) as evidenced by greater carbonyl carbon peak intensities (relative to rigid core residues) in 3D HNCO spectra (Fig. S4). Third, the density of inter-residue NOEs in the C domain (i.e. contacts either (a) between residues in the C domain or (b) from the C domain to the insulin moiety) was lower than the density of NOEs among the α-helical segments; reduced NOE density led to greater imprecision in a canonical distance-restrained SA ensemble. Finally, explicit interrogation of main-chain order parameters (based on ¹H-¹⁵N T₁, T₂, and hetNOE measurements at 700 MHz) verified that nanosecond-scale fluctuations are more marked in the C domain than in the A or B domains. Notably, regions of structural order (S² ≥ 0.65) correlated well with the three helices, whereas segmental flexibilities in the B1-B4 and B27-C6 segments were identified by S² < 0.65 (Fig. 4).
Figure (number not recovered). [...] Disulfides are yellow and labeled by black boxes. N and C termini are as labeled. C, per-residue average Cα-Cα distances (⟨R_Cα-Cα⟩) between all possible pairwise Cα comparisons across models in an 18-structure free SCI-b selectively time-averaged ensemble (from A, right) and 18 structures from the bound SCI-b simulation, and D, among all models in an 18-structure SCI-a monomer "D" ensemble (from B, right) and the bound SCI-b ensemble. Error bars in C and D represent standard deviations; error in D is systematically larger than in C due to averaging over the six independent SCI-a crystallographic protomers. The B9-B19, A1-A8, and A13-A19 helices and C1-C6 segment are shown as shaded boxes.
Dynamical information may also be inferred from patterns of ¹³C and ¹H chemical shifts (60). Comparison of SCI-b and a reference two-chain analog (Asp B10, Lys B28, Pro B29-insulin (61)) (Fig. S5) suggested that the B25-B30 segment is better ordered in SCI-b. Predicted S²′ parameters (where the prime indicates chemical shift-derived values) provide evidence that conformational fluctuations in the C-terminal segment of the B domain are damped by the presence of the C domain, despite the latter's partial disorder (Fig. S5, top). Furthermore, the stabilizing Thr A8 → His substitution in SCI-b was associated with enhanced secondary-structure propensity (SSP) scores within the A1-A8 helix (Fig. S5, bottom). These conclusions were strengthened by analysis of the methyl proton secondary shifts of Ile, Leu, Val, and Ala residues (probes of aromatic ring currents related to tertiary structure; Table S3) (55).
Complementary evidence for C domain disorder was provided by qualitative features of the electron-density maps of the variant Pro B28-Glu B29 SCI-a hexamer (9). In this crystal form the asymmetric unit was the hexamer itself, and so three independent views of the SCI-a dimer were obtained, in turn providing six views of the C domain. Each dimer exhibited a canonical B domain interface with discontinuous electron density between B27 and A1. In most monomers, density in this region was not interpretable; the most complete (monomer D) included limited density for residues B28-B29 and C6. Although static disorder in the lattice might in principle have contributed to such attenuated or discontinuous electron density, the consistency of these findings and their coherence with the NMR data strongly suggest that these incomplete features of the crystal structure represent dynamic disorder.

Ensemble perspective as a thought experiment
Despite evidence of segmental flexibility, both the NMR (Fig. 2) and X-ray-based (see accompanying article (9)) structural techniques are limited in their ability to visualize internal protein dynamics. To provide a depiction of SCI-b that a "Maxwell's Demon" might observe on the nanoscale, we recalculated the SCI-b NMR ensemble (Fig. 9A, left) using the Torda-Scheek-van Gunsteren time-averaged (TA) distance restraint protocol (30,31) applied either to all restraints (Fig. 9A, middle) or, in a modified protocol, only to regions known to be dynamic (Fig. 9A, right). Whereas time averaging for all restraints increases imprecision globally, selective time-averaging essentially reproduced the ensemble calculated using conventional static distance restraints, thereby raising confidence in the latter's physical accuracy. Notably, segmental flexibility is readily apparent via increased r.m.s.d. in the B28-C5 segment (Fig. 9, C and D).

As a corollary to these alternative ensembles of the solution structure, an ensemble model of hexameric SCI-a was generated. First, restraints derived from the carbon-carbon distances in the SCI-a crystallographic refinement (see accompanying article (9)) were utilized in a conventional SA protocol to generate an ensemble of 60 structures (Fig. 9B, left). Each of these structures was then subjected to multiconformer simulation wherein the B-factors of all atoms were fixed (B = 2 Å²) either without (Fig. 9B, middle) or with (Fig. 9B, right) constraining motion of the B5-B27 and A1-A21 segments, which had well-defined electron density in the original refinement. The protocol for these multiconformer simulations, distinct from standard ensemble X-ray refinement,⁷ is described in Fig. S6A. Ensemble-averaged R-factor calculations (R_Ens, see Equations 2 and 3) yielded R_Ens = 0.37 ("B = 2 Å² crystal simulation") and R_Ens = 0.56 ("B = 2 Å² core constrained"); the former was slightly higher than R_Free = 0.31 for the single-structure SCI-a refinement (9). Notably, the R_Ens of the most imprecise ensemble (Fig. 9B, middle) was lower than that of the core-constrained ensemble (Fig. 9B, right). This was a consequence of more efficient sampling of atom positions relative to the electron density by the ensemble as a whole (Fig. S6B). Although this would in principle also result in a lower R_Ens than R_Free of the single-structure refinement, the ensemble models of SCI-a contain C domain residues with no corresponding electron density, resulting in R_Ens > R_Free. We emphasize that over-parametrization in our protocol precludes use of R_Ens as a comparative refinement parameter; its value instead corroborates the physical plausibility of the ensemble.

(Figure caption, continued.) The glucose transporter (brown) regulates glucose entry, whereupon its metabolism generates ATP as an intracellular ligand for the ligand-gated K⁺ channel (blue). Binding of ATP closes K⁺ channels and thereby depolarizes the cell membrane, which in turn opens voltage-gated Ca²⁺ channels (green). Entry of Ca²⁺ triggers first-phase exocytosis of secretory vesicles followed by mobilization of storage granules (orange with red borders). D, plasma insulin concentration curve after subcutaneous injection of stated clinical analogs. The isolated soluble insulin and microcrystalline insulin are indicated as green and blue curves, respectively. Premixed soluble and microcrystalline insulin is shown in red. D is adapted from http://watcut.uwaterloo.ca/webnotes/Metabolism/Diabetes.html.

(Figure caption, continued.) Each structure was run through a TA distance-restrained MD simulation wherein MD time-averaging was activated for all restraints (All Residues TA; middle). A separate simulation was then performed that enforced time-averaged restraints only for residues residing within flexible regions (predicted S² < 0.65; right). B, each of the 35 hexameric structures generated using carbon-carbon distance restraints derived from the single-structure SCI-a crystallographic refinement (Hexamer Rebuild SA; left) was subjected to multiconformer simulation (see Fig. S6A for schematic description of the simulation procedure) with all thermal B-factors set to 2 Å² and either no residues constrained (B = 2 Å² crystal simulation; middle) or with the positions of B5-B26 and A1-A21 constrained (B = 2 Å² with core constrained; right). The main-chain (C) and heavy-atom side-chain (D) r.m.s.d. per residue for SCI-b (black) or SCI-a (gray) were calculated from ensembles in A (right) and B (right), respectively. The r.m.s.d. for the SCI-a hexamer ensemble are averaged over all monomers. The B9-B19, A1-A8, and A13-A19 helices and C1-C6 segment are shown as shaded boxes. Gly residues at positions B8, B20, B23, C3, and A1 were excluded from heavy-atom side-chain r.m.s.d. calculations.
The above ensemble exhibits a striking overall similarity to the NMR-derived models in Fig. 9A and provides a nanoscale perspective of the hexamer as it might exist in the crystal lattice. Here, the B1-B3 and B28-C6 segments are flexible (r.m.s.d. in Fig. 9, C and D) but are constrained to lie within specific channels defined by neighboring hexamers within the lattice. The per-residue r.m.s.d. (Fig. S7) of other alternative ensembles (from Fig. 9) highlight the same trends of structural order and flexibility in the solution and in this ensemble-based visualization of the crystal structure.
The ensemble perspectives of SCI-a and SCI-b highlight the segmental flexibility that foreshadows our MD-based model of SCI binding to the IR (Fig. 7). We envisage these as models of what a Maxwell's Demon, operating on the nanoscale, might observe. Such a Demon might play an active role in conformational selection, akin to Maxwell's mechanism by which environmental information is ostensibly extracted to change local order while still obeying the second law of thermodynamics (62,63). As the SCI samples conformational space, the Demon would be charged with selecting an open conformation to enable receptor binding, a process that likely occurs on the micro- to millisecond time scale inaccessible to our multiconformer and TA-NOE simulations. Nevertheless, the above ensemble perspective provides insight into the process of conformational selection by highlighting flexible sites (i.e. B24-C6) that must undergo closed → open conformational fluctuation before IR binding. We cannot exclude a mechanism by which an initial encounter complex by the closed state facilitates such opening as envisaged in classical notions of induced fit; here the Demon would be the receptor itself, coupled to the regulatory intelligence of the cell.

Model of an SCI-IR complex
In accordance with an extensive prior biochemical literature (23,24,64), recent crystallographic studies of insulin bound to a domain-minimized insulin micro-receptor (μIR, composed of the L1 and αCT domains of IR) demonstrated that the B24-B27 segment of insulin is displaced from its classical packing against the A1-A8 and B9-B19 α-helices (15,21). Effected by changes in Ramachandran dihedral angles flanking Phe B24, this "unhinging" (by ∼60° relative to classical crystal structures of the free hormone dimer or hexamer (6)) enables a conserved non-polar surface (spanned by the side chains of Ile A2, Val A3, Phe B24, and Phe B25) to engage the receptor (15,21). An extended B24-B27 segment lies in a groove between receptor elements L1 and αCT (15,21). Residues B28-B30, not well-conserved among vertebrate insulins and dispensable for activity, were presumed to be disordered in the IR complex. Accordingly, these three residues would contribute to the C domain of biologically active SCIs to provide a flexible tether between Thr B27 and Gly A1. Thus, the present SCI would effectively contain a 9-residue connecting peptide between the displaced B24-B27 segment and the A1-A8 α-helix. Our MD simulations of such an SCI-IR complex (Fig. 7) have demonstrated the plausibility of this connection. The model is notable for breakage of C domain interactions with the insulin moiety, including packing of its central Pro against Val A3; distances between Val A3 Cγ2 and Pro C4 Cγ are 6.1(±1.5) Å in the free NMR ensemble and 16.0(±1.9) Å in the predicted SCI-IR complex. Close packing of these side chains is precluded in the predicted complex by αCT. The bound SCI conformation thus differs from its free structure (as a monomer in solution and as a zinc-free hexamer in a crystal lattice) with respect to both the position of the B24-B27 β-strand and displacement of the C domain.
Ultimately, segmental flexibilities in SCI-b and SCI-a foreshadow the closed-to-open conformational transition necessary to bind and activate the IR. MD modeling of SCI-b and IR (Fig. 7A) illustrates that the C domain, although relatively ordered in the solution-state monomer, opens up to wrap around the αCT segment of the IR-A receptor. In these threaded models, the interactions between insulin A and B domain residues with L1 and αCT were native-like, maintaining all interactions observed in the crystal structure of native insulin in complex with IR (15,21). Significant flexibility within the C domain was observed, with glutamic acid residues (Glu C1 and Glu C2) and arginine residues (Arg C5 and Arg C6) forming transient intramolecular electrostatic interactions as well as intermolecular interactions with the N-terminal αCT helical residues, ostensibly stabilizing the bound conformation. Models in which αCT did not thread the C domain resulted in SCI disengagement from the IR during MD simulations, suggesting an inherent instability in these models.
The open state of SCI-b thus emulates that of two-chain insulin through the liberation of the B24-B30 segment from the protein core. This binding mode justifies SCI-b's retained activity in vivo while also rationalizing the biological inactivity of SCIs with shorter C domain sequences that would sterically hinder αCT-SCI "threading." The striking difference between the open and closed states is highlighted by an overlay (Fig. 7B) of 18 best-fit SCI-b NMR structures (from Fig. 2) with 18 models of the receptor-bound SCI-b from the MD simulation. The average per-residue Cα-Cα distances across a helix-aligned ensemble composed of 18 structures each of the bound SCI-b and free SCI-b quantify the magnitude of the closed → open transition (Fig. 7C), which is localized primarily from B24 to C6. A similar trend is seen when comparing the bound SCI-b ensemble to one monomer of the core-constrained SCI-a multiconformer simulation ensemble (Fig. 7D). The dynamic properties of the C domain thus appear to facilitate the conformational switch from the closed to open (receptor-bound) state and are thus crucial for SCI function.
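The per-residue Cα-Cα comparison described above can be sketched in a few lines of Python. This is a minimal illustration only: it assumes that the models have already been superimposed on the helical core and that every pair of models in the pooled list is compared. The function name and the toy coordinates are hypothetical and are not taken from the study.

```python
from itertools import combinations
from math import dist
from statistics import mean, pstdev


def per_residue_ca_distances(models):
    """Mean and SD of pairwise C-alpha distances at each residue position.

    models: list of pre-aligned models; each model is a list of (x, y, z)
    C-alpha coordinates in Angstroms, one tuple per residue, in the same order.
    """
    n_residues = len(models[0])
    summary = []
    for i in range(n_residues):
        # Distance between the same residue in every pair of models.
        pair_distances = [dist(a[i], b[i]) for a, b in combinations(models, 2)]
        summary.append((mean(pair_distances), pstdev(pair_distances)))
    return summary


# Hypothetical two-residue toy models (not real coordinates).
toy_models = [
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)],
]
toy_summary = per_residue_ca_distances(toy_models)
```

In such a calculation a large mean at a given position flags a residue whose Cα atom moves between models, as would be expected for the B24-C6 segment.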
The MD simulations of SCI-b in complex with IR recapitulate the flexibility observed experimentally in the SCI monomer and provide evidence of the requirement for these residues to remain intrinsically dynamic both when unbound and in complex with the receptor. The requirement for αCT to thread through the open C domain would be difficult to characterize experimentally; future attempts to determine the crystal structures of these SCIs in complex with IR may not resolve the positions of the dynamic C domain residues. The ability of MD to probe the time-dependent interactions made by these flexible domains will be integral to the future design of novel insulin analogs.

Relationship of SCI-b to IGFs
Solution structures of insulin-like growth factors (IGF-I and IGF-II) have previously been described (65,66). IGF-I and IGF-II are homologous to SCI-b but with respective C domain lengths of 12 and 8 residues. Although similar in overall structure, there is one striking difference; whereas free SCI-b exhibits a well-defined native state, IGFs are well-organized only as ligand complexes or on specific binding to carrier proteins (IGF-binding proteins) (67). Indeed, in the absence of a bound detergent molecule or a bound phage-display peptide (68), NMR spectra of IGFs exhibit partial aggregation and conformational broadening on the millisecond time scale (65,66). Presumably, ligand-free IGF proteins are in part molten (66).
As with SCI-b, main-chain dynamic studies of an extended variant of IGF-I (long-[Arg 3]IGF-I) provided evidence of C domain flexibility with attenuation of corresponding hetNOE values (65). As a consequence, NMR-based modeling of IGF-I and IGF-II is remarkable for the imprecision of their respective C domains (Fig. S8).
Also unlike insulin and proinsulin, free IGF-I undergoes slow disulfide rearrangement to form an equilibrium between isoenergetic native and non-native ("swapped") pairing schemes (69). By contrast, insulin has a unique ground state (70). Studies of insulin variants suggest that its native disulfide pairing is maintained, at least in part, by favorable contacts between His B5 and A chain residues (including Cys A6 , Cys A7 , and Cys A11 ); these long-range interactions have no counterpart in IGFs wherein His B5 is substituted by Thr (70,71).

Translational implications as the fruit of serendipity
In our accompanying article (9), we observed that subtle sequence variation among SCIs was associated with changes in their PD profiles as evaluated in rats on i.v. bolus injection. These findings were serendipitous and not predicted on the basis of structural modeling. SCI-a, for example, exhibited a PD profile akin to that of insulin lispro (or WT insulin), whereas SCI-b exhibited similar initial activity but a subsequent tail of activity; this tail was even more pronounced in i.v. studies of SCI-c. Because the plasma half-lives of SCI-a and SCI-c are similar (each ∼17 min (72)),⁸ we presume that these PD differences arise at the level of the responding target (73). Evidently, subtle differences in the structure or dynamics of the hormone-receptor complex can be associated with biological differences in the duration or time course of attenuation of the insulin signal.
The biphasic PD profile of SCI-b on i.v. injection (9) motivated the present SQ studies. Biphasic release of insulin by pancreatic β-cells is a general physiological feature of metabolic homeostasis (Fig. 8C and Fig. S9A) (74). The two phases of insulin release correspond to immediate and delayed continuous glucose-responsive exocytosis of insulin-containing secretory vesicles. The latter requires the ATP-dependent mobilization of a storage pool of vesicles (orange in Fig. S9A). Such biphasic secretion, recapitulated in isolated perfused islets (Fig. S9B) (75), provides a rapid bolus of insulin in response to a meal followed by prolonged basal release. Whereas both phases of insulin release are lost in type 1 DM (T1D; due to autoimmune β-cell destruction), early stages of type 2 DM (T2D) are characterized by preferential blunting of first-phase release (76). In the natural history of T2D, eventual β-cell exhaustion also leads to progressive attenuation of second-phase insulin secretion (76).
The goal of insulin replacement therapy in T1D is to recapitulate the biphasic physiological pattern of post-prandial pancreatic insulin secretion and thereby prevent either hyperglycemic or hypoglycemic excursions. This goal has stimulated the development of rapid and basal insulin analogs (77) and sophisticated insulin delivery devices, including an "artificial pancreas" in which the output of a continuous glucose monitor controls (via a predictive algorithm) a pump as a closed-loop system (77). Rigorous glycemic control in T1D has been shown to delay or prevent microvascular complications and may also reduce the risk of macrovascular disease (78). The associated technologies primarily pertain to patients in affluent societies.
To an increasing degree over the next two decades, the majority of new cases of T2D are predicted to be in the developing world (5). This reflects an intersection between genetic factors and cultural changes, including urbanization, Western diet, and level of physical exertion (5). For unknown reasons, the incidence of T1D (classically a syndrome of northern latitudes and so more prevalent among Caucasians) is also increasing toward the equator (5). The predominant form of insulin prescribed in such populations is a premixed formulation for twice-a-day injection. Such products contain (a) a short-acting soluble component (as zinc insulin analog hexamers) and (b) a long-acting microcrystalline insulin suspension; the latter consists of zinc-stabilized protamine-insulin analog complexes (neutral protamine Hagedorn; NPH (77)). An example is provided by Humalog Mix75/25 (Lilly), in which 75% of the hormone (insulin lispro) is contained within the NPH microcrystals and 25% is in the soluble phase (32); a similar product in broad clinical use contains an insulin aspart mixture of 70% NPH and 30% soluble analog (Novolog Mix 70/30 from Novo-Nordisk).
The essential idea underlying design of premixed formulations exploits the combined PK properties of the soluble and microcrystalline insulin components within the SQ depot (Fig. 8D). The PD properties of the insulin analog are identical once in the bloodstream. Biphasic absorption of the rapid and delayed components following breakfast and dinner provides reasonable glycemic control in patients for whom these are the major meals (i.e. lunches are small). Disadvantages derive from the cost and complexity of manufacture and from the susceptibility of the insulin component in NPH microcrystals to thermal degradation.
We have observed that the biphasic PD properties of SCI-b fortuitously recapitulated the biphasic PK-based profile of a standard premixed formulation but in a solution that contains a single phase. The striking resistance of SCI-b to thermal degradation (predicting an extended shelf life) could be of clinical advantage in regions of the developing world lacking access to refrigeration (2,5).

Concluding remarks
This study has described the high-resolution solution structure and dynamics of an active monomeric SCI. Ensembles generated by alternative SA protocols and multiconformer simulations extended our findings, providing alternative depictions of protein dynamics. Segmental flexibility in the foreshortened C domain of the SCI foreshadows a closed → open transition on binding to the IR: an SCI-αCT threading model was supported by MD simulation. Whereas this model rationalizes a wealth of prior biochemical data pertaining to smaller, inactive SCIs, optimization of this structural switch in a therapeutic formulation promises to extend the unrefrigerated shelf-life of insulin pharmaceuticals and enhance their global access in the face of an emerging DM pandemic (5).
It would be of future interest to test our MD-based model of an SCI-IR complex through crystallographic studies. If observed in such model complexes and relevant to holoreceptor binding, the proposed threading mechanism predicts an abrupt transition between inactive SCIs (whose C domains contain three or fewer residues) and active SCIs (four or more residues). A systematic series of SCIs might therefore provide a molecular ruler by which to probe the coupling between conformational change and SCI-αCT threading. The same SCI series might exhibit an analogous yet uncorrelated transition from fibrillation resistance to susceptibility. Because cross-β assembly entails a more marked conformational change than is predicted on receptor binding, we imagine that longer C domains would be needed in a fibril to span the gap between B30 and A1. Comparison between these independent rulers thus promises to define the sweet spot for ultra-stable insulin analog design.

Construction of SCI precursor expression vectors
A set of synthetic plasmids (pPICZα) was designed and constructed for expression in the yeast P. pastoris to enable direct secretion of SCI precursors (SCI-a, SCI-b, and SCI-c; Fig. 1C) into the media (79,80). These plasmids were constructed as described in the accompanying article (9).

Biosynthesis of SCI precursor
SCI precursor biosynthesis followed a method adapted for insulin production in P. pastoris yeast (80) as described in the accompanying article (9). This study also adapted a protocol (81) for isotopic labeling of the proteins. Cells from the most expression-optimized colony were inoculated in 10 ml of sterile YPG + Zx1 (1% yeast extract, 2% peptone, 2% glycerol, 100 μg/ml Zeocin) in a 50-ml Falcon tube and grown overnight at 30°C and a stirring speed of 225 rpm (C25KC Incubator/Shaker; New Brunswick Scientific, Edison, NJ). The 10-ml culture was then added to 100 ml of YPG + Zx1 in a sterile 500-ml Erlenmeyer flask and grown another 24 h at 30°C and 225 rpm. Cells were then pelleted and resuspended into 1 liter of sterile YPS (1% yeast extract, 2% peptone, 0.1 M sorbitol) in 2-liter flasks and incubated 24 h at 30°C and 225 rpm. Cells centrifuged in sterile 400-ml bottles were then resuspended in media containing 0.17% yeast nitrogen base without amino acids with 0.5%

Purification of yeast product
For purification of ¹³C,¹⁵N-SCI-b precursor, hydrophobic interaction chromatography (butyl-Sepharose 4 Fast Flow resin; GE Healthcare) was first used to capture protein from the filtered supernatant. All reverse-phase HPLC (rp-HPLC) purifications were performed using aqueous 0.1% trifluoroacetic acid (TFA) as Buffer A and 0.1% TFA in acetonitrile as the organic modifier (Buffer B). Preparative purifications utilized a Waters 2545 quaternary pumping system with FlexInject (Waters). The precursor was isolated by preparative C4 rp-HPLC (C4 Proto 300 Å, 10-μm, 250 × 20 mm; Higgins Analytical Inc., Mountain View, CA). For purification of all other precursors (unlabeled and ¹⁵N-only labeled), a minimized rp-HPLC-only protocol was used. To the filtered cell media, acetonitrile and TFA were added to 20 and 0.1%, respectively, followed by preparative C4 rp-HPLC (as above). In either case, eluted SCI was collected and lyophilized.

Optical spectroscopy
Far ultraviolet (255-200 nm) CD spectra were obtained at high signal-to-noise for insulin analogs using a CD spectropolarimeter (Aviv Biomedical Inc., Lakewood, NJ) equipped with temperature control and an automated titration unit. CD spectra (255-200 nm) of buffer were first obtained using degassed potassium/phosphate (KPi) buffer (10 mM KH2PO4/K2HPO4 (pH 7.4) with 50 mM KCl) at 4, 25, and 37°C using a 30-s averaging time. As these spectra did not vary with temperature, the 25°C KPi dataset was subtracted from all protein spectra. Samples were prepared at a concentration of 20-70 μM protein in KPi buffer, brought to pH 7.4 with KOH, and placed in a parafilm-sealed 1-mm path-length quartz cuvette. Full CD spectra (255-200 nm) were acquired at 4, 25, and 37°C with 0.5-nm wavelength resolution and a 15-s photocount averaging time. All CD samples were quantitated via reference-subtracted UV-visible spectra acquired in KPi buffer with a UV-visible spectrometer (Aviv Biomedical Inc.) and a 3-mm quartz cuvette. Protein concentrations in KPi buffer were calculated using absorbance at λ = 280 nm and estimated extinction coefficients predicted by the on-line ExPASy ProtParam tool.
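The concentration estimate described above follows the Beer-Lambert law, c = A / (ε·l). A minimal sketch is given below, assuming the 3-mm (0.3-cm) path length stated in the text; the function name and the extinction coefficient in the example are hypothetical placeholders rather than values from the study.

```python
def protein_concentration_uM(a280, epsilon_per_M_cm, path_cm=0.3):
    """Beer-Lambert estimate of concentration in micromolar.

    a280: reference-subtracted absorbance at 280 nm.
    epsilon_per_M_cm: molar extinction coefficient (M^-1 cm^-1), e.g. as
        predicted from sequence; the caller supplies it.
    path_cm: optical path length in cm (0.3 cm for a 3-mm cuvette).
    """
    if epsilon_per_M_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    molar = a280 / (epsilon_per_M_cm * path_cm)
    return molar * 1e6


# Hypothetical inputs: A280 = 0.09 with an assumed epsilon of 6000 M^-1 cm^-1.
example_uM = protein_concentration_uM(0.09, 6000.0)
```

With these placeholder inputs the estimate is about 50 μM, which falls within the 20-70 μM range used for the CD samples.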

CD studies of protein unfolding
To assess temperature dependence and reversibility of protein folding, a thermal scan (monitored at the helix-sensitive wavelength 222(±1) nm) was performed on all CD protein samples (see above) from 4 to 88°C followed by a reverse gradient from 88 to 4°C (all in 4°C steps). A full CD spectrum (255-200 nm) was then acquired at 4°C to verify reversible folding (or lack thereof). Thermodynamic stabilities in KPi buffer at 25°C were determined by monitoring guanidine-induced unfolding at the helix-sensitive wavelength 222 nm as described (83).

Heat-induced aggregation probed by DLS
Aqueous aliquots of insulin lispro or SCI-b necessary to obtain 0.6 mM protein solutions in 30-μl final volumes were pipetted into separate Eppendorf tubes and lyophilized. The powder was then dissolved in "Tris diluent" (25 mM Tris-HCl (pH 7.4) containing 16 mg/ml glycerin, 1.6 mg/ml meta-cresol, 0.65 mg/ml phenol, and ZnCl2 at a ratio of three Zn²⁺ ions per insulin hexamer). All buffer components (i.e. deionized water, 4× Tris-HCl diluent (pH 7.4), and 10 mM ZnCl2) were degassed via nitrogen bubbling for 15 min and then syringe-filtered (0.02-μm Whatman Anotop 10 filters; Sigma), followed by centrifugation (5 min at 16,100 × g). Gentle mixing of the filtered components with lyophilized protein was performed by repeated pipetting.
Unheated samples were probed by DLS as described (9) following 30 min of incubation at room temperature. Heated samples were placed on a heat block set to 88°C for 10 min to replicate the high temperatures encountered during the CD-monitored thermal scan (see above) before DLS measurement.

NMR spectroscopy
All spectra were acquired at a protein concentration of 0.6 mM at 25°C with a Bruker AVANCE 700 MHz spectrometer equipped with a triple-resonance cryoprobe (Bruker Biospin Corp, Billerica, MA) as described (83). All chemical shifts were calibrated in parts per million (ppm) relative to 4,4-dimethyl-4-silapentane-1-sulfonic acid as an internal standard. All raw NMR spectra (except four-dimensional data) were processed using Bruker's TopSpin software before analysis using third-party software packages, as described below.
Custom routines written in R-script and Bash Unix shell performed automated NOE distance restraint generation and structural refinement according to established protocols (28). Distance restraints were obtained from 4D NOESY data; dihedral-angle restraints for all non-dynamic residues were predicted by TALOS+ (60,86) using chemical shifts. An SA protocol (28) performed by XPLOR-NIH (87) generated 98 structures; refinement involved iterative identification of best-fit structures followed by relaxation of distance restraints (or removal, if a restraint became ≥6 Å) and ensemble recalculation until no restraint violations ≥0.4 Å were found. The final ensemble (generated using 1370 NOE and 75 dihedral angle restraints) was consistent with observed 4D NOESY data. All structural figures were generated in PyMOL (Schrödinger, LLC, New York).
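The iterative refinement loop described above (relax violated restraints, drop any that reach 6 Å, repeat until no violation of 0.4 Å or more remains) can be sketched as follows. This is a schematic under stated assumptions, not the authors' R/Bash routines: the relaxation increment of 0.5 Å is hypothetical (the text does not give one), and `violations_for` is a hypothetical stand-in for recalculating the ensemble in XPLOR-NIH and measuring violations.

```python
def refine_restraints(upper_bounds, violations_for, step=0.5,
                      violation_cutoff=0.4, removal_limit=6.0, max_rounds=100):
    """Relax or remove violated distance restraints until none is violated.

    upper_bounds: dict mapping restraint name -> upper distance bound (Angstroms).
    violations_for: callable taking the current bounds and returning a dict of
        restraint name -> violation (Angstroms); hypothetical stand-in for an
        ensemble recalculation.
    step: assumed relaxation increment per round (not stated in the text).
    """
    bounds = dict(upper_bounds)
    for _ in range(max_rounds):
        violations = violations_for(bounds)
        offenders = [name for name, v in violations.items() if v >= violation_cutoff]
        if not offenders:
            return bounds
        for name in offenders:
            relaxed = bounds[name] + step
            if relaxed >= removal_limit:
                del bounds[name]        # restraint became >= 6 A: remove it
            else:
                bounds[name] = relaxed  # otherwise keep the relaxed bound
    raise RuntimeError("restraints did not converge within max_rounds")


# Toy stand-in: pretend each restrained pair sits at a fixed distance.
toy_distances = {"a": 4.2, "b": 7.0}


def toy_violations(bounds):
    return {name: max(0.0, toy_distances[name] - ub) for name, ub in bounds.items()}


refined = refine_restraints({"a": 3.0, "b": 5.0}, toy_violations)
```

In the toy run, restraint "a" is relaxed to 4.0 Å and retained, whereas restraint "b" reaches the 6 Å limit and is removed.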

¹H-²H amide-proton exchange
To avoid potential artifacts in the SCI-b CD-monitored guanidine titration, 1  The observed rate of amide exchange, k_obs, for each resonance was determined using the Sparky "rh" extension. The intrinsic rate of exchange for each residue, k_int, was calculated using the SCI-b sequence, pH 2.8 or 7.4, and 100% D2O content as input parameters in the on-line SPHERE program (41). Calculating k_obs/k_int gives the PF, which can be inserted into Equation 1, to obtain a model-free estimate of thermodynamic stability, ΔG_AmEx. Here, R is the ideal gas constant, and T is the absolute temperature. ΔG_AmEx values obtained from pH 2.8 amide exchange studies are defined as ΔG_pH 2.8, whereas ΔG_pH 7.4 indicates an amide exchange result obtained at pH 7.4. The ¹H-¹⁵N cross-peak of Leu B15 was omitted from pH 2.8 analysis due to overlap with that of Ile A10 (Fig. 6A). Very slow exchange at pH 2.8 led to significant error in ΔG_pH 2.8 from these resonances.
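The stability estimate described above can be illustrated numerically. The sketch below assumes the standard EX2 relation and takes PF as k_obs/k_int, as in the text, so that ΔG = −RT ln(k_obs/k_int) is positive for protected (slowly exchanging) amides; the exact form of Equation 1 is not reproduced in this text, so the sign convention is an assumption. The function name and example rates are hypothetical.

```python
from math import log

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal mol^-1 K^-1


def delta_g_amide_exchange(k_obs, k_int, temperature_k=298.15):
    """Free energy (kcal/mol) from amide exchange, assuming the EX2 limit.

    Uses dG = -R * T * ln(k_obs / k_int); both rates must share the same units.
    """
    if k_obs <= 0 or k_int <= 0:
        raise ValueError("exchange rates must be positive")
    return -GAS_CONSTANT_KCAL * temperature_k * log(k_obs / k_int)


# Hypothetical example: an amide exchanging 1000-fold slower than its intrinsic rate.
example_dg = delta_g_amide_exchange(1e-3, 1.0)
```

Under these assumptions a 1000-fold slowing of exchange at 25°C corresponds to roughly 4.1 kcal/mol.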

¹H-¹⁵N NMR relaxation studies
2D spin-lattice (T1), spin-spin (T2), and hetNOE ¹⁵N-HSQC spectra were acquired at 700 MHz as described (88,89). HetNOE spectra, which probe sub-nanosecond motions, used a relaxation delay of 6 s. T1 and T2 relaxation experiments yield information about internal protein dynamics; these experiments used a relaxation delay of 2 s. Spectra were processed in TopSpin and assigned in Sparky. Relaxation rates (R1 and R2) and NOE values were calculated by Relax NMR (90); errors in these values were derived from spectral noise. DYNAMICS generated by-residue generalized order parameters, S², as described (91), using a rotational correlation time of 4.8 ns (isotropic diffusion model) and fixed chemical shift anisotropy of −240 ppm. This analysis detected sites of fast chemical exchange by fitting data for some residues to an extended model-free model that also presents the fast dynamics order parameter S²(fast) = S² − S²(slow). TALOS+ was used to predict both secondary structure propensity (SSP) scores and generalized order parameters (S²′, which do not account for chemical exchange) from chemical shifts as described (60).
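Once per-residue S² values are in hand, the S² < 0.65 criterion used in this study to flag flexible regions amounts to finding contiguous runs of residues below the cutoff. A minimal sketch follows; the function name and the S² values in the example are hypothetical and serve only to show the bookkeeping.

```python
def flexible_segments(order_parameters, threshold=0.65):
    """Group consecutive residues whose S2 falls below the threshold.

    order_parameters: list of (residue_label, s2) tuples in sequence order.
    Returns a list of segments, each a list of residue labels.
    """
    segments = []
    current = []
    for residue, s2 in order_parameters:
        if s2 < threshold:
            current.append(residue)
        elif current:
            segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments


# Hypothetical values, not measured data.
toy_s2 = [("B1", 0.40), ("B2", 0.50), ("B3", 0.80), ("B4", 0.90), ("B5", 0.60)]
toy_segments = flexible_segments(toy_s2)
```

Residues lacking a fitted S² (for example, prolines or overlapped cross-peaks) would need to be handled explicitly before such grouping; the sketch assumes a complete list.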

Alternative ensemble models of SCI-a and SCI-b
Using XPLOR-NIH, the refined 98-structure ensemble of SCI-b generated by distance-restrained SA was subjected to Verlet equilibration at 300 K for 10 ps, followed by another 25-ps simulation where NOE distance restraints were applied using the Torda-Scheek-van Gunsteren TA NOE protocol (30,31) and a decay constant of 0.625 ps. Time averaging was applied to either all residues or, in a separate simulation, to only those with predicted order parameter lower than 0.65 (i.e. dynamic regions).
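The idea behind time-averaged restraints is that the restraint acts on a memory-weighted average of the interproton distance rather than on its instantaneous value. The sketch below is a simplified discrete version, assuming exponential memory decay with the 0.625-ps constant quoted above and averaging of r⁻³; the averaging exponent, the recursion, and the function name are assumptions for illustration and do not reproduce the XPLOR-NIH implementation.

```python
from math import exp


def time_averaged_distance(distances, dt_ps, tau_ps=0.625):
    """Exponentially damped running average of r^-3, reported as a distance.

    distances: instantaneous distances (Angstroms) sampled every dt_ps.
    tau_ps: memory decay constant (ps).
    Returns the averaged distance after each sample.
    """
    decay = exp(-dt_ps / tau_ps)
    avg_inv_cube = distances[0] ** -3  # initialize memory with the first sample
    averaged = []
    for r in distances:
        # Older samples are down-weighted by exp(-dt/tau) at every step.
        avg_inv_cube = decay * avg_inv_cube + (1.0 - decay) * r ** -3
        averaged.append(avg_inv_cube ** (-1.0 / 3.0))
    return averaged


# Hypothetical trajectories: a fixed distance and a fluctuating one.
constant_trace = time_averaged_distance([4.0] * 5, dt_ps=0.002)
fluctuating_trace = time_averaged_distance([3.0, 5.0, 3.0, 5.0], dt_ps=0.002)
```

Because r⁻³ weights short distances more heavily, a fluctuating pair can satisfy a restraint on average even when some instantaneous distances exceed the bound, which is why time averaging permits greater conformational spread.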
Because of incomplete electron density in the original SCI-a crystal refinement (see accompanying article in this issue (9)), a model of the "complete" structure of the hexamer was artificially generated. First, a custom MATLAB script calculated all possible carbon-carbon distances in the SCI-a structure and generated a restraint for any distance within the range 35-40 Å. These ~200 restraints were used in the XPLOR-NIH SA protocol to bring a starting structure of six independent SCI-a monomers (generated in PyMOL using an SCI-b structure as a template) together, resulting in a coarse hexameric structure. To this structure, restraints derived from all carbon-carbon distances within 1.8-30 Å (~670,000 restraints) in the SCI-a crystal refinement were applied to generate an ensemble of 60 hexameric structures (labeled Hexamer Rebuild SA in Fig. 9B). A multiconformer simulation (all B-factors fixed to B = 2 Å²) utilizing reflection data from SCI-a was performed on each individual structure from Hexamer Rebuild SA. Here, B-factors of 2 Å² were chosen so that the root-mean-square displacement (u, obtained from the equation B = 8π²u²) of carbon atoms would be just under 0.17 Å or ~10% of the van der Waals radius of carbon (1.7 Å). A separate simulation was performed wherein any residue that was well-resolved in the SCI-a crystal refinement was spatially fixed, whereas all others retained B-factors of B = 2 Å², allowing only dynamic regions to move, paralleling the TA NOE ensembles generated for SCI-b. The protocol for our multiconformer simulation, a modification of slow-cooling crystallographic SA (2000 → 300 K; 25 K reduction per 5-fs time step), is given in schematic form in Fig. S6. R-factors for each structure in the multiconformer ensemble, R_i, were calculated using the standard Equation 2, where F_obs are observed structure factors, a function of the Miller indices h.
F_calc and k are the calculated structure factors and the weighting coefficient, respectively, as obtained during the B = 2 Å² simulations. From here, the ensemble-averaged R-factor, R_Ens, is calculated using Equation 3, where n = 60, the number of structures in the ensemble. Note that for clarity only 35 structures of each hexameric ensemble are shown in Fig. 9.
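Two of the quantities above can be checked with a short Python sketch. The B-factor relation B = 8π²u² is taken from the text. The R-factor function uses the conventional form R = Σ| |F_obs| − k|F_calc| | / Σ|F_obs|, which is assumed here to correspond to Equation 2. The ensemble average is written as an arithmetic mean over the R_i, which is only one plausible reading of Equation 3. All function names and the toy amplitudes are hypothetical.

```python
import math

def rms_displacement(b_factor):
    """RMS displacement u (Angstrom) from an isotropic B-factor (Angstrom^2),
    using B = 8 * pi^2 * u^2."""
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))

def r_factor(f_obs, f_calc, k=1.0):
    """Conventional R-factor over matched lists of structure-factor amplitudes.
    k is the scale (weighting) coefficient applied to the calculated amplitudes."""
    numerator = sum(abs(fo - k * fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)

def ensemble_mean_r(r_values):
    """Arithmetic mean of per-structure R-factors (an assumed form of R_Ens)."""
    return sum(r_values) / len(r_values)

u = rms_displacement(2.0)          # B = 2 Angstrom^2, as in the text
fraction_of_vdw = u / 1.7          # relative to the carbon van der Waals radius
print(f"u = {u:.3f} Angstrom ({100 * fraction_of_vdw:.1f}% of 1.7 Angstrom)")

# Toy amplitudes for two hypothetical ensemble members (not real data).
f_obs = [10.0, 20.0]
r_values = [r_factor(f_obs, [8.0, 22.0]), r_factor(f_obs, [10.0, 20.0])]
print(f"mean R over the toy ensemble: {ensemble_mean_r(r_values):.3f}")
```

For B = 2 Å² the relation gives u = 1/(2π) ≈ 0.159 Å, consistent with the stated value of just under 0.17 Å and roughly 10% of the carbon van der Waals radius.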

Homology modeling of SCI-b bound to insulin micro-receptor
Models of SCI-b in complex with the IR isoform A (IR) L1-CR (IR), including the IR αCT domain, were generated with the MODELLER (92) program, using the crystal structure of native insulin in complex with the micro-receptor and αCT as a template. To account for glycosylation, a single N-linked N-acetyl-D-glucosamine carbohydrate was included at each of the IR residues Asn-16, Asn-25, Asn-111, Asn-215, and Asn-255. Of the 25 models created, the one with the lowest MODELLER objective function was used in subsequent MD simulations.
MD simulations were performed using the GROMACS (version 5.1.2) (93) suite of programs and the CHARMM36 (94,95) force field. Briefly, the simulation consisted of an initial steepest descent minimization, a short 50-ps positionally restrained MD holding the protein fixed, and finally unrestrained MD for 300 ns. The system was solvated using the TIP3P water model in a cubic box extending 10 Å beyond all atoms and utilized periodic boundary conditions. The system was made electrically neutral to a final ionic strength of 0.1 M with the addition of sodium and chloride ions; ionizable residues were assumed to be in their charged state at a pH of 7. The temperature of the system was maintained by coupling the protein and solvent independently to a velocity rescaling (96) thermostat at 300 K with a time constant of 0.1 ps. Pressure was controlled to 1 bar using a Berendsen barostat (97) with a coupling time of 0.5 ps. A cutoff of 12 Å was used for non-bonded interactions, and long-range electrostatics were treated with the particle-mesh Ewald method (98) using a grid width of 1.2 Å and sixth-order spline interpolation. Neighbor searching applied a Verlet grid cutoff scheme with a neighbor-list update frequency of 40 steps and a time step of 2 fs. All bond lengths were constrained with the P-LINCS algorithm. Coordinates of the model were archived every 15 ns after an initial 45-ns equilibration.
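As a rough guide, the production-run settings described above can be collected into GROMACS-style run parameters. The Python sketch below is an illustration and not the authors' input file. The option names follow GROMACS .mdp conventions, with lengths converted from Å to nm. The temperature-coupling group names, the compressibility value, and the inclusive archive schedule are assumptions that the text does not state.

```python
SIM_NS = 300                          # unrestrained MD length (from the text)
DT_FS = 2                             # time step in fs (from the text)
NSTEPS = SIM_NS * 1_000_000 // DT_FS  # 1 ns = 1,000,000 fs

MDP = {
    "integrator": "md",
    "dt": 0.002,                       # ps
    "nsteps": NSTEPS,
    "cutoff-scheme": "Verlet",
    "nstlist": 40,                     # neighbor-list update frequency
    "rvdw": 1.2,                       # nm (12 Angstrom cutoff)
    "rcoulomb": 1.2,                   # nm
    "coulombtype": "PME",
    "fourierspacing": 0.12,            # nm (1.2 Angstrom grid width)
    "pme-order": 6,                    # sixth-order spline interpolation
    "tcoupl": "V-rescale",
    "tc-grps": "Protein Non-Protein",  # assumed group names
    "tau-t": "0.1 0.1",                # ps
    "ref-t": "300 300",                # K
    "pcoupl": "Berendsen",
    "tau-p": 0.5,                      # ps
    "ref-p": 1.0,                      # bar
    "compressibility": 4.5e-5,         # bar^-1; assumed (typical for water)
    "constraints": "all-bonds",
    "constraint-algorithm": "lincs",   # P-LINCS is the parallel form of LINCS
    "pbc": "xyz",
}

def render_mdp(params):
    """Format a parameter dictionary as 'name = value' lines."""
    return "\n".join(f"{name:<22} = {value}" for name, value in params.items())

# Archive times in ns, assuming frames at 45 ns and 300 ns are both kept.
archive_times_ns = list(range(45, SIM_NS + 1, 15))

print(render_mdp(MDP))
print(f"{len(archive_times_ns)} archived frames: {archive_times_ns}")
```

With a 2-fs time step, 300 ns corresponds to 150,000,000 integration steps; under the inclusive assumption above, the archive schedule yields 18 frames.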

Biological testing in diabetic rats
Humalog Mix75/25 (Lilly) provided a biphasic standard on SQ injection. Insulin lispro or SCI-b was dissolved at concentrations for which each 100-µl volume contained the doses (in nanomoles) specified in Fig. 8. These proteins were dissolved in Lilly diluent (a product containing 3.8 mg/ml sodium phosphate (pH 7.4), 16 mg/ml glycerin, 1.6 mg/ml meta-cresol, 0.65 mg/ml phenol). The formulations each contained ZnCl₂ at a ratio of three Zn²⁺ ions per insulin hexamer. Insulin purity was verified by C4 analytical rp-HPLC as above. Protein-free Lilly diluent was used as a negative control (buffer only). Male Lewis rats (mean body mass ~300 g) were rendered diabetic by streptozotocin (44) and studied as described in our accompanying article (9). Because of day-to-day variation (between rats and in the same rat on different days), 4-8 rats were injected per analog formulation on each test day, and the experiments were repeated on one or more non-consecutive days to obtain a larger sample size.
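The formulation arithmetic implied above can be sketched in a few lines of Python. The 20-nmol dose is a hypothetical value chosen for illustration (the actual doses are those in Fig. 8), and the dose is assumed to be expressed in nanomoles of insulin monomer. The 100-µl injection volume and the ratio of three zinc ions per hexamer come from the text.

```python
def formulation_concentrations(dose_nmol, volume_ul=100.0, zn_per_hexamer=3.0):
    """Return (protein, zinc) concentrations in micromolar.

    dose_nmol      : nanomoles of insulin monomer per injection (assumed basis)
    volume_ul      : injection volume in microliters (100 in the text)
    zn_per_hexamer : zinc ions per six insulin monomers (3 in the text)
    """
    protein_um = dose_nmol * 1000.0 / volume_ul  # nmol/ul is mM; x1000 gives uM
    zinc_um = protein_um * zn_per_hexamer / 6.0  # six monomers per hexamer
    return protein_um, zinc_um

# Hypothetical 20-nmol dose in a 100-ul injection.
protein_um, zinc_um = formulation_concentrations(20.0)
print(f"protein: {protein_um:.0f} uM, zinc: {zinc_um:.0f} uM")
```

Three zinc ions per hexamer is equivalent to 0.5 zinc ions per monomer, so the zinc concentration is always half the monomer concentration regardless of dose.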