A Conserved Histidine in Insulin Is Required for the Foldability of Human Proinsulin

The insulins of eutherian mammals contain histidines at positions B5 and B10. The role of HisB10 is well defined: although not required in the mature hormone for receptor binding, in the islet β cell this side chain functions in targeting proinsulin to glucose-regulated secretory granules and provides axial zinc-binding sites in storage hexamers. In contrast, the role of HisB5 is less well understood. Here, we demonstrate that its substitution with Ala markedly impairs insulin chain combination in vitro and blocks the folding and secretion of human proinsulin in a transfected mammalian cell line. The structure and stability of an AlaB5-insulin analog were investigated in an engineered monomer (DKP-insulin). Despite its impaired foldability, the structure of the AlaB5 analog retains a native-like T-state conformation. At the site of substitution, interchain nuclear Overhauser effects are observed between the methyl resonance of AlaB5 and side chains in the A chain; these nuclear Overhauser effects resemble those characteristic of HisB5 in native insulin. Substantial receptor binding activity is retained (80 ± 10% relative to the parent monomer). Although the thermodynamic stability of the AlaB5 analog is decreased (ΔΔGu = 1.7 ± 0.1 kcal/mol), consistent with loss of HisB5-related interchain packing and hydrogen bonds, control studies suggest that this decrement cannot account for its impaired foldability. We propose that nascent long-range interactions by HisB5 facilitate alignment of CysA7 and CysB7 in protein-folding intermediates; its conservation thus reflects mechanisms of oxidative folding rather than structure-function relationships in the native state.

Insulin is a small globular protein containing two chains, A (21 residues) and B (30 residues). The mature hormone is the post-translational product of a single-chain precursor, proinsulin (1), in which a connecting domain extends from the C-terminal residue of the B domain (Thr B30) to the N terminus of the A chain (Gly A1) (Fig. 1A). In the pancreatic β cell, proinsulin folds in the endoplasmic reticulum (ER) to form three specific disulfide bridges: A6-A11, A7-B7, and A20-B19 (Fig. 1A, orange bars). Although the three-dimensional structure of proinsulin has not been determined, a variety of evidence indicates that it consists of a folded insulin moiety (Fig. 1B, red and blue ribbons) and a disordered connecting region (dashed black line). Upon transit through the Golgi apparatus and entry into immature secretory granules (2), the C-peptide (Fig. 1A, black open circles) is excised by a specific set of prohormone convertases (3). The mature hormone is stored as Zn2+-stabilized hexamers within specialized secretory granules (4). Insulin hexamers dissociate upon secretion into the portal circulation, enabling the circulating hormone to function as a Zn2+-free monomer. The stability and receptor binding activity of insulin require maintenance of its three disulfide bridges (5-10).
The complex pathway of insulin biosynthesis, trafficking, assembly, and secretion may impose additional evolutionary constraints unrelated to the structure and function of the insulin monomer. Such hidden constraints may be revealed by analysis of conserved residues at sites tolerant of substitutions in vitro. An elegant example is provided by His B10, which mediates axial zinc binding within the insulin hexamer (27). Whereas substitution of His B10 by Asp enhances the stability and activity of the insulin monomer (28,29), the corresponding mutation in the human insulin gene leads to a syndrome of hyperproinsulinemia and diabetes mellitus (30). In the β cell, Asp B10-proinsulin exhibits impaired trafficking, leading to inefficient excision of the connecting peptide and failure to be efficiently stored within glucose-regulated secretory granules. His B10 may thus contribute to subcellular targeting in the ER and Golgi apparatus. A recent study suggests that the mutation may also impair the fidelity of disulfide pairing in the ER (31).
In this study, we investigated the function of His B5, the other histidine residue in the B chain. In the crystallographic T-state (the predominant conformation of the insulin monomer in solution) (5,32,33), the B5 imidazole ring packs near the A7-B7 disulfide bridge at the edge of a solvated crevice between the A and B chains (11). Comparison of multiple independent T-state structures indicates consistent engagement of the B5 imidazole ring within a pocket lined by residues A6-A11 (Fig. 1C). Although the details of such packing differ among crystal structures, His B5 appears to brace the A7-B7 disulfide bridge on the protein surface. The B5-related crevice is remote from the classical receptor-binding surface, and indeed, mutagenesis (34,35) and deletion analysis (11,36-38) suggest that His B5 contributes only modestly to biological activity. His B5 is nonetheless broadly conserved among eutherian mammals (see Footnote 5).
FIGURE 1 (caption fragment). B, structural model of the insulin-like moiety and disordered connecting peptide (dashed black line). The A and B domains are shown in red and blue, respectively. Cystines (orange) are labeled in yellow boxes. C, structure of the B5-related crevice in multiple high resolution crystal structures. His B5 and surrounding residues A6-A11, B4, B6, and B7 are shown in 15 independent T-state protomers. Structural variability reflects differences among crystal structures; two families of His B5 side chain orientations are observed. Residues in the A chain are shown in red; residues in the B chain are shown in blue; and disulfide bridges are shown in orange. Structural coordinates were obtained from Protein Data Bank codes 1APH, 1BPH, 1CPH, 1DPH, 1G7A, 1LPH, 1MSO, 1TRZ, 1TYL, 1TYM, 1ZNI, 2INS, and 4INS. Structures were aligned with respect to the main chain atoms of residues A6-A11 and B4-B7.
To elucidate hidden biological functions of His B5 that may constrain its divergence, we have undertaken an analysis of an Ala B5-insulin variant in vitro in relation to effects of this and other B5 substitutions on the foldability of proinsulin in mammalian cell culture.
Our results are presented in two parts. We first describe the chemical synthesis and characterization of a monomeric Ala B5 analog. Because the complex self-association properties of insulin ordinarily confound biophysical studies, a well characterized monomeric template is provided by DKP-insulin (39-41). The three amino acid substitutions in the B chain of DKP-insulin (His B10 → Asp (D), Pro B28 → Lys (K), and Lys B29 → Pro (P)) disallow formation of dimers and higher order oligomers without perturbing its biological activity. Ala B5-DKP-insulin retains substantial receptor binding activity, and its structure closely resembles that of native insulin (5). Surprisingly, the pattern of long-range nuclear Overhauser effects (NOEs) indicates that the conformation of the B5-related crevice and neighboring disulfide bridges (cystines A7-B7 and A6-A11) is not significantly altered by the absence of the imidazole side chain. Because these initial studies were unrevealing in relation to evolutionary constraints, we next investigated foldability as probed by the efficiency of disulfide pairing in chain combination (25,26).
Substitution of His B5 with Ala markedly impairs insulin chain combination in vitro and the folding of proinsulin in the ER of a mammalian cell. Because the protocol of chain combination (42) is generally robust (permitting the synthesis of hundreds of insulin analogs during the past 40 years), its exquisite sensitivity to a B5 substitution provides a striking contrast to the near-native structure and activity of the Ala B5 analog. The biological importance of this synthetic block is demonstrated by transient transfection analysis of proinsulin biosynthesis within the ER of living cells (43). In this assay, a variety of amino acid substitutions and deletions in proinsulin (in each case, expected not to perturb bioactivity) have been found to perturb folding (44). The present study strongly suggests that His B5 functions in the ER to direct disulfide pairing in a kinetic pathway of protein folding. We propose that nascent packing of His B5 against the A domain in a protein-folding intermediate orients Cys A7 and Cys B7 for disulfide bond formation and that such conserved interactions are dispensable once native folding has been achieved. We thus envisage that the His B5 → Ala substitution provides an example of a pathway mutation (45): its dramatic and unexpected effects on proinsulin biosynthesis illuminate the stringent structural requirements of native oxidative protein folding in the ER.

MATERIALS AND METHODS
Synthesis of Insulin Analogs-Human insulin was kindly provided by Lilly. The tetra-S-sulfonate derivative of the human A chain was obtained by oxidative sulfitolysis (20). B chain analogs were prepared by solid-phase chemical synthesis (20). DKP-insulin and Ala B5-DKP-insulin were prepared by chain combination (20,46). Predicted molecular masses were confirmed by matrix-assisted laser desorption ionization time-of-flight mass spectrometry.
Proinsulin Expression and Mutagenesis-A human proinsulin cDNA was subcloned into the pcDNA3 vector (43). Variant proinsulin cDNAs encoding B5 mutations were created by three PCRs. Reaction 1 used forward and reverse primers to amplify a cDNA stretch encoding the signal peptide and the downstream site of the introduced mutation. Reaction 2 used another set of primers to amplify a cDNA encoding the mutation point as well as the remaining C-terminal region of proinsulin. The products from reactions 1 and 2 were designed to share a 20-bp overlap in the region of the introduced mutation. Finally, reaction 3 used the primers at the 5′- and 3′-ends of the proinsulin coding sequence along with both gel-purified products from the first two PCRs as template, thereby generating the full-length mutant proinsulin cDNAs. These were gel-purified and ligated into the pGEM-T vector (Promega Corp.). Mutations were confirmed by direct DNA sequencing, and the mutants were subcloned into pcDNA3 using KpnI and EcoRI restriction sites (43).
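The logic of the three-reaction scheme above can be sketched in a few lines of Python. This is only an illustration of how two products sharing a 20-bp overlap combine into one full-length product; the function name and the toy sequences are hypothetical and are not the primers or cDNA used in this study.

```python
def overlap_extension(upstream: str, downstream: str, overlap: int = 20) -> str:
    """Join two PCR products whose ends share `overlap` identical bases."""
    if len(upstream) < overlap or len(downstream) < overlap:
        raise ValueError("fragments are shorter than the overlap")
    if upstream[-overlap:] != downstream[:overlap]:
        raise ValueError("fragments do not share the expected overlap")
    # Keep the shared region only once, as in the full-length product.
    return upstream + downstream[overlap:]


# Hypothetical toy sequences; the 20-base junction stands in for the
# region that carries the introduced mutation.
junction = "ACGTACGTACGTACGTACGT"
product_1 = "TTTTGGGG" + junction   # stand-in for the reaction 1 product
product_2 = junction + "CCCCAAAA"   # stand-in for the reaction 2 product
full_length = overlap_extension(product_1, product_2)
print(len(full_length))
```

The check on the shared region mirrors the design requirement that both products carry the mutated segment; in the sketch, a mismatch in the junction raises an error instead of producing a joined sequence.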
Mammalian Cell Culture and Transfection-293T cells were cultured in high glucose Dulbecco's modified Eagle's medium plus 10% fetal bovine serum and 0.1% penicillin/streptomycin at 37°C with 5% CO2. Cells were transfected with plasmid DNA 1 day after seeding using Lipofectamine (Invitrogen) in Opti-MEM for 5 h at 37°C and then recovered overnight in medium containing 10% fetal bovine serum, with a change to normal growth medium at 24 h and assays of protein expression at 48 h.
Metabolic Labeling-Cells were preincubated in methionine/cysteine-deficient medium for 30 min, metabolically labeled in the same medium containing 35S-labeled methionine and cysteine for 1 h, washed once with a large volume of complete chase medium, and chased further in complete medium for 1 h. At the end of the chase, the medium was collected, and the cells were lysed in 100 mM NaCl, 1% Triton X-100, 0.2% sodium deoxycholate, 0.1% SDS, 10 mM EDTA, and 25 mM Tris (pH 7.4).
Immunoprecipitation and SDS-PAGE-Cell lysates and chase media that had been treated with a proteinase inhibitor mixture (Roche Applied Science) were preincubated with Zysorbin (Zymed Laboratories Inc.) for 60 min at 4°C and precleared by centrifugation. Supernatants were immunoprecipitated with guinea pig anti-insulin antiserum (LINCO Diagnostics) plus fresh Zysorbin. Immunocomplexes were sedimented at 12,000 × g for 1 min, and pellets were washed twice with cell lysis buffer and once with high salt buffer (0.5 M NaCl, 1% Triton X-100, 10 mM EDTA, and 25 mM Tris (pH 7.4)). Samples were then analyzed by nonreducing Tris/Tricine/urea/SDS-PAGE (43); samples were allowed to enter the gel at 50 V for 1 h and then run at 100 V overnight without cooling. Insulin gels were fixed initially in 20% trichloroacetic acid without alcohol and then in 12.5% trichloroacetic acid plus 50% methanol, incubated briefly with water, and finally either phosphorimaged or incubated with 1 M sodium salicylate for 20 min and exposed to Kodak XAR film at −70°C.
Footnote 5. Hystricomorph mammals exhibit divergent insulin sequences, including at residue B5. Such insulins exhibit low affinity and do not form zinc hexamers (70). Substitution of His B10 with Asn prevents zinc-mediated assembly. Fibrillation of the monomeric insulin in the degu rat is associated with islet amyloidosis (71,72).
Receptor Binding Assays-Relative activity is defined as the ratio of analog to human wild-type insulin required to displace 50% of specifically bound 125I-labeled human insulin. A human placental membrane preparation containing the insulin receptor was employed as described (29). In all assays, the percentage of tracer bound in the absence of competing ligand was <15% to avoid ligand depletion artifacts.
Circular Dichroism-Far-UV CD spectra were obtained as described (8). Spectra, acquired with an Aviv spectropolarimeter, were normalized by mean residue ellipticity. Estimates of secondary structure were obtained by deconvolution (47). Samples were dissolved in 10 mM phosphate and 100 mM KCl (pH 7.4) at a protein concentration of ~25 μM. For equilibrium denaturation studies, samples were diluted to 5 μM; guanidine HCl was employed as denaturant (26). Data were obtained at 4°C and fitted by nonlinear least squares to a two-state model (48).
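A minimal sketch of the two-state relationship that underlies such fits is given below, assuming the usual linear dependence of the unfolding free energy on denaturant concentration, ΔG(x) = ΔGu − m·x. The parameter values are hypothetical placeholders chosen for illustration (they are not values from Table 1), and the sketch omits the spectroscopic base lines and the least-squares fitting itself.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def fraction_unfolded(denaturant_molar, dG_u, m_value, temp_kelvin=277.15):
    """Two-state fraction unfolded at a given denaturant concentration.

    dG_u is the unfolding free energy extrapolated to zero denaturant
    (kcal/mol) and m_value is its slope (kcal/mol/M); 277.15 K is 4 C.
    """
    dG = dG_u - m_value * denaturant_molar
    return 1.0 / (1.0 + math.exp(dG / (R_KCAL * temp_kelvin)))


def midpoint(dG_u, m_value):
    """Denaturant concentration at which half of the protein is unfolded."""
    return dG_u / m_value


# Hypothetical illustrative parameters, not measured values.
example_dG_u = 4.0   # kcal/mol
example_m = 0.8      # kcal/mol/M
c_mid = midpoint(example_dG_u, example_m)
print(c_mid, fraction_unfolded(c_mid, example_dG_u, example_m))
```

In this form the midpoint follows directly from the two fitted parameters, which is why ΔGu, the m value, and the midpoint are not independent quantities in a two-state analysis.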
NMR Spectroscopy-Spectra were obtained at 700 MHz at a protein concentration of 600 μM using a cryogenic 1H NMR probe. The DKP-insulin template contains substitutions His B10 → Asp, Pro B28 → Lys, and Lys B29 → Pro (39-41). Three solution conditions were employed as described (5,22,26): in aqueous solution at pH 7.0 and 25°C and at pH 8.0 and 32°C and in 20% (v/v) deuteroacetic acid at pH 1.9 and 25°C. The latter co-solvent facilitates analysis of exchangeable amide resonances, otherwise incomplete at neutral pH because of base-catalyzed solvent exchange and conformational broadening (5). (In this co-solvent, insulin retains a native-like monomeric fold (49).) Resonance assignment was based on homonuclear two-dimensional NOE spectroscopy (mixing times of 80 and 200 ms), total correlation spectroscopy (mixing time of 55 ms), and double-quantum filtered correlated spectroscopy. Helix-related hydrogen bonds were inferred from the pattern of protected amide resonances as observed in D2O solution containing 20% deuteroacetic acid (50).
Histidine pKa Measurement-The pKa of His B5 in DKP-insulin was estimated based on the pH dependence of the chemical shift of its Hε imidazole resonance in D2O. The following pD values were employed: 2.5, 3.0, 3.5, 4.0, 6.3, 6.8, 7.2, 7.5, 8.0, 8.8, 9.3, and 9.8 (direct meter reading).
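One common way to extract a pKa from such a titration is to fit the observed chemical shift to a single-site Henderson-Hasselbalch curve. The sketch below assumes that model and uses a simple grid search; it is not necessarily the procedure used here. The titration data are synthetic, generated from hypothetical parameters (pKa 7.0 and limiting shifts of 8.6 and 7.7 ppm) at the pD values listed above, so that the fit can be checked against a known answer.

```python
def protonated_fraction(pH, pKa):
    """Fraction of the imidazole in the protonated form at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


def fit_pka(pH_values, shifts, grid_start=4.0, grid_stop=10.0, step=0.01):
    """Grid search over pKa; limiting shifts solved by linear regression."""
    n = len(pH_values)
    mean_s = sum(shifts) / n
    best = None
    for i in range(int(round((grid_stop - grid_start) / step)) + 1):
        pKa = grid_start + i * step
        f = [protonated_fraction(p, pKa) for p in pH_values]
        mean_f = sum(f) / n
        sxx = sum((x - mean_f) ** 2 for x in f)
        if sxx == 0.0:
            continue
        # The shift is linear in the protonated fraction for a fixed pKa.
        slope = sum((x - mean_f) * (y - mean_s) for x, y in zip(f, shifts)) / sxx
        intercept = mean_s - slope * mean_f
        sse = sum((intercept + slope * x - y) ** 2 for x, y in zip(f, shifts))
        if best is None or sse < best[0]:
            best = (sse, pKa, intercept, intercept + slope)
    _, pKa, shift_deprotonated, shift_protonated = best
    return pKa, shift_deprotonated, shift_protonated


# pD values from the text; the shifts are synthetic (hypothetical parameters).
pd_values = [2.5, 3.0, 3.5, 4.0, 6.3, 6.8, 7.2, 7.5, 8.0, 8.8, 9.3, 9.8]
synthetic_shifts = [7.7 + (8.6 - 7.7) * protonated_fraction(p, 7.0) for p in pd_values]
fitted_pka, low_limit, high_limit = fit_pka(pd_values, synthetic_shifts)
print(round(fitted_pka, 2))
```

Because the model is linear in the two limiting shifts once the pKa is fixed, only the pKa needs to be searched; a general nonlinear least-squares routine would serve equally well.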
Molecular Modeling-Structures were calculated by distance geometry/simulated annealing using the program DG-II (51); restrained molecular dynamics calculations were performed using X-PLOR (52).

RESULTS
We first describe the synthesis, activity, and solution structure of an Ala B5 analog of an engineered insulin monomer (DKP-insulin) (39,40). Such engineering (53) is required to avoid dimerization and higher order protein assembly. We then investigate the effects of B5 substitutions on the expression, folding, and secretion of proinsulin in a transfected mammalian cell line.
Ala B5 Impairs Insulin Chain Combination-The substitution His B5 → Ala was introduced into the DKP-insulin B chain; 40 mg of wild-type A chain and 20 mg of variant B chain (each as an S-sulfonate derivative) were used in chain combination. Whereas this protocol ordinarily yields 4.0-4.4 mg of HPLC-purified DKP-insulin, the yield of Ala B5-DKP-insulin was 0.2-0.3 mg; similar reductions were observed in three combination reactions. Although the low yields may have been accentuated by less efficient recovery from the reverse-phase C8 column because of the smaller quantity of protein loaded, the Ala B5 substitution also reduced the yield of the prior insulin fraction (following CM-cellulose chromatography of the crude reaction mixture) by at least 10-fold. The limited efficiency of chain combination in this case stands in contrast to the general robustness of this reaction (26).
Ala B5-insulin Analog Is Active and Well Folded-Despite the poor yield of chain combination, repetition of the protocol enabled us to purify sufficient quantities of Ala B5-DKP-insulin for analysis of activity and structure. The analog exhibited a receptor binding affinity of 129% relative to native human insulin; the same value was obtained in two replicates (Table 1, first column). Because the relative affinity of DKP-insulin is 161 ± 19% under these conditions (enhanced binding due to the Asp B10 substitution) (28), the specific effect of the B5 substitution on affinity relative to the parent monomer is 80 ± 10%. Retention of near-native receptor binding affinity is in accord with the native in vivo potency of Ala B5-insulin in a rat hypoglycemic assay but higher than that originally reported by Sieber and co-workers (34) in in vitro assays of insulin-stimulated glucose oxidation (21%) and lipolysis (31%) in isolated adipocytes. These lower values (obtained in 1979) may reflect incomplete purification of the analog by countercurrent distribution prior to the advent of reverse-phase HPLC methods. The far-UV CD spectrum of Ala B5-DKP-insulin (Fig. 2A, dashed line) resembles that of DKP-insulin (solid line) with similar but slightly attenuated α-helix-specific features at 222, 208, and 196 nm.
TABLE 1 footnotes. a Yield of chain combination is defined following purification of analogs by CM-cellulose chromatography. b ΔGu (kcal/mol) indicates the apparent change in free energy upon denaturation in guanidine HCl as extrapolated to zero denaturant concentration by a two-state model (48). c ΔΔGu (kcal/mol) indicates the difference in ΔGu values relative to human insulin. Uncertainties in two-state fitting parameters do not include possible systematic error due to non-two-state behavior. d Cmid is defined as the concentration of guanidine HCl at which 50% of the protein is unfolded. e The m value (kcal/mol/M) is the slope obtained by plotting the unfolding free energy (ΔGu) versus the molar concentration of denaturant; this slope may be proportional to the protein surface area exposed upon unfolding. f Activity is defined as the affinity for the human placental insulin receptor relative to human insulin (100%); under these conditions, the Kd for native insulin is 0.48 ± 0.06 nM. g NA, not applicable. h ND, not determined. Past studies indicate that the activity of Asp B10-insulin is similar to that of DKP-insulin (39,40). i DKP-insulin contains three substitutions in the B chain to prevent self-assembly: the classical dimer interface is destabilized by Pro B28 → Lys and Lys B29 → Pro, whereas the hexamer interface is destabilized by His B10 → Asp (39,40). j Lack of sigmoidicity in transitions makes uncertain the fitting of the pre-transition base line (8).
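The normalization of the analog's affinity to the parent monomer (129% versus 161 ± 19%, giving about 80 ± 10%) can be reproduced with a short calculation. The sketch below assumes that only the uncertainty in the DKP-insulin value is propagated and that the analog value is treated as exact; this is an illustrative assumption about the arithmetic, not a statement of how the reported uncertainty was derived.

```python
def relative_to_parent(analog_pct, parent_pct, parent_sd):
    """Affinity of the analog relative to the parent, with propagated error.

    All inputs are percentages relative to human insulin. Only the
    uncertainty of the parent value is propagated (an assumption).
    """
    ratio = analog_pct / parent_pct
    ratio_sd = ratio * (parent_sd / parent_pct)
    return 100.0 * ratio, 100.0 * ratio_sd


# Values quoted in the text: 129% (analog) and 161 +/- 19% (DKP-insulin).
relative_pct, relative_sd = relative_to_parent(129.0, 161.0, 19.0)
print(round(relative_pct), round(relative_sd, 1))
```

Under this assumption the ratio is close to 80% with an uncertainty of roughly 9 to 10 percentage points, in line with the value quoted in the text.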
The 1H NMR spectra of DKP-insulin and Ala B5-DKP-insulin likewise exhibit similar patterns of resonance line widths and chemical shift dispersion (Fig. 2C). The spectrum of the analog is tractable by homonuclear two-dimensional NMR methods (Fig. 3) (54), permitting resonance assignment (supplemental Figs. S1 and S2). The chemical shifts are essentially identical to those observed in DKP-insulin; changes in magnitude of >0.1 ppm are observed only at neighboring sites in the B5-related pocket. The novel Ala B5 spin system is well resolved in total correlation and NOE spectra (Fig. 3, B and D). Analysis of secondary structure based on diagnostic strings of dNN, dαN, d(i,i+3), and d(i,i+4) NOEs is essentially identical to that of DKP-insulin (supplemental Fig. S3). The pattern of protected amide protons in 20% deuteroacetic acid and 80% D2O is likewise the same as that observed in DKP-insulin (supplemental Fig. S4) (5).
Interresidue NOEs in DKP-insulin and Ala B5-DKP-insulin are similar as summarized by diagonal plots (Fig. 4, A and B). Long-range NOEs are in each case consistent with structures of T-state crystallographic protomers. Interchain long-range NOEs from the β-CH3 resonance of Ala B5 (Fig. 4B, highlighted in red) are in accord with a rigid body model derived from the wild-type structure. These include NOEs between Ala B5 and Cys A7 Hα (Fig. 3D) and Ile A10 (supplemental Figs. S1 and S2). The orientation of the N-terminal segment of the B chain (residues B1-B8) is further defined by native-like long-range NOEs involving Phe B1, Leu B6, Cys B7, and Leu A13.
The solution structure of Ala B5-DKP-insulin was calculated by distance geometry and restrained molecular dynamics according to 674 NOE-, J coupling-, and hydrogen bond-related restraints (supplemental Table S2). The ensemble is similar to the T-like conformation of DKP-insulin (Fig. 4C). The ensemble of A chains (Fig. 4C, red) and B chains (blue) closely matches the mean structure of DKP-insulin (Fig. 4C, ribbons). The root mean square deviation between the two structures is similar to the pairwise root mean square deviation within either ensemble alone. The respective orientations of the A and B chains (Fig. 4C) and the environment of the B5 side chain within an interchain crevice (Fig. 4D) are in each case similar. The imprecision of the distance geometry/simulated annealing ensemble in this region (although in part because of informational uncertainty in the restraints) resembles the range of structural variation observed among wild-type crystal structures (Fig. 1C). Distances from the β-methyl carbon of Ala B5 to neighboring residues are similar to corresponding distances involving the β-carbon of His B5 in the solution structure of DKP-insulin (supplemental Table S1). Closer distances are observed between the β-carbon of Ala B5 and Ile A10 than between the β-carbon of His B5 and Ile A10, presumably because of the smaller size of Ala compared with His. This proximity is enjoined in the distance geometry/restrained molecular dynamics calculation by an NOE between the methyl resonances of Ala B5 and Ile A10. A similar trend toward smaller B5-related interresidue distances in the analog is observed in the relationship between Ala B5/Thr A8 and Ala B5/cystine A7-B7; by contrast, the distance from Ala B5 to the α-carbon of Cys A6 is increased but consistent with the distribution of distances among crystal structures. Such local conformational adjustments are as expected in response to a loss-of-volume substitution.
Maintenance of an overall native-like structure and receptor-binding surface is in accord with the substantial receptor binding activity of the analog but in contrast with the impaired efficiency of chain combination.
Ala B5 Impairs Protein Stability-Ala B5-DKP-insulin exhibits decreased thermodynamic stability as inferred from guanidine denaturation studies (Fig. 2B). Fitting of these CD-detected denaturation curves by a two-state model (Table 1) yields a ΔΔGu value of 1.7 ± 0.1 kcal/mol. This decrement is consistent with loss of His B5-specific interchain packing interactions and hydrogen bonds in an otherwise native-like structure. (In wild-type crystal structures, the variable orientation of the B5 imidazole ring is associated with distinct hydrogen-bonding schemes involving either H-Nδ or H-Nε (but not both) and common to the carbonyl oxygens of either residue A7 or A9.) The existence of such hydrogen bonds in the solution structure of DKP-insulin and their contribution to its stability are in accord with a shift in the pKa of His B5 by 1 pH unit (from ~6.4 to 7.4) as determined by pH titration of the 1H NMR spectrum.
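To give a sense of scale for a 1.7 kcal/mol decrement, the following worked estimate converts it to a ratio of unfolding equilibrium constants. It assumes strict two-state behavior and the temperature of the denaturation studies (4°C, 277.15 K); the symbols K_u^Ala and K_u^His for the analog and the parent monomer are introduced here only for illustration.

```latex
\[
\frac{K_u^{\mathrm{Ala}}}{K_u^{\mathrm{His}}}
  = \exp\!\left(\frac{\Delta\Delta G_u}{RT}\right)
  = \exp\!\left(\frac{1.7\ \mathrm{kcal/mol}}
      {(1.987\times10^{-3}\ \mathrm{kcal\,mol^{-1}\,K^{-1}})(277.15\ \mathrm{K})}\right)
  \approx \exp(3.09) \approx 22
\]
```

On these assumptions the unfolded state of the analog is favored roughly 20-fold more than that of the parent monomer at zero denaturant, although both proteins remain predominantly folded.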
B5 Substitutions Impair Proinsulin Expression and Secretion-Transient transfection of mammalian cells with a plasmid expressing human proinsulin provides a useful model for studying proinsulin folding within the ER (Fig. 5A). Following transfection of 293T cells, we thus examined expression, disulfide isomer formation, and secretion of newly synthesized proinsulin in cells radiolabeled with 35S-labeled amino acids for 1 h and chased for an additional hour. This model enables the assessment of the effects of substitutions on disulfide pairing in the ER and subsequent trafficking but does not recapitulate β cell-specific prohormone processing or formation of microcrystalline insulin storage depots in glucose-regulated secretory vesicles. Although transfected 293T cells exhibit significant protein overexpression that may compromise the folding efficiency of proinsulin relative to that in β cells, the absence of endogenous proinsulin in these cells makes the detection of transfected proinsulin or variants straightforward.
Labeled proinsulin or variant proinsulins were immunoprecipitated with anti-insulin polyclonal antibody and subjected to nonreducing Tris/Tricine/urea/SDS-PAGE, allowing examination of distinct proinsulin disulfide isomers that form within the ER (Fig. 5C). In accord with prior studies (43), transfection of the wild-type proinsulin construct gave rise to robust expression, primarily of a fast migrating species containing native disulfide bonds (Fig. 5C, lanes 3 and 4) relative to an empty vector control (lanes 1 and 2). The most rapidly migrating species (Fig. 5C, arrow) was efficiently secreted from transfected cells (lane 3; C) to the medium (lane 4; M); secretion typically reached >95% efficiency by 4 h of chase. In addition, there were less rapidly migrating disulfide isomers generally representing a minor fraction of proinsulin; these exhibited a lower percentage of secretion (Fig. 5C, bracket). Substitution of His B5 with Ala resulted in decreased detection of the variant proinsulin with a marked decrease in the fraction of Ala B5-proinsulin molecules achieving the native disulfide-bonded form (Fig. 5C, lanes 5 and 6). Secretion of the variant proinsulin was undetectable (Fig. 5C, lane 6, asterisk). The impaired foldability of Ala B5-proinsulin is in accord with the inefficiency of disulfide pairing encountered above in the chemical synthesis of Ala B5-DKP-insulin.
FIGURE 3. Two-dimensional NMR identification of the Ala B5 spin system and B5-related NOEs. A and B, total correlation spectra of DKP-insulin and Ala B5-DKP-insulin, respectively, in the region containing A3X spin systems of Ala side chains. C and D, portions of NOE spectra indicating B5-related NOEs. C, contacts of DKP-insulin between the His B5 imidazole ring and the A chain; D, analogous contacts involving the β-CH3 resonance of Ala B5. Spectra were observed in D2O at 700 MHz, 32°C, and pD 7.6 (direct meter reading); the mixing time was 55 ms (total correlation spectra) or 200 ms (NOE spectra) in each case. Additional spectra are provided in supplemental Figs. S1-S4.
To assess the generality of this result, we examined alternative B5 substitutions: Phe, Asp, Asn, and Gln. The rationales were as follows. Phe was tested as an alternative aromatic ring that, unlike His, is without a hydrogen bond acceptor or donor; Asn and Gln were chosen as polar nitrogenous side chains that, like His, are capable of donating and receiving hydrogen bonds; and Asp was included for comparison with Asn. Three replicates of these experiments are shown in Fig. 6. In each case, the B5 substitution markedly decreased recovery of newly synthesized proinsulin and, unlike wild-type proinsulin, blocked release from cells (C) into the medium (M). The stringency of the requirement for His at position B5 for folding and secretion is surprising given the marked structural variability observed in detail among wild-type crystal structures (Fig. 1C). It is possible that the B5 substitutions accelerate protein degradation rather than inhibit initial expression of the variant proinsulins because of impaired formation of one or more disulfide bridges, resulting in disulfide mispairing. Indeed, blocked formation of the B7-A7 disulfide bridge in proinsulin is the causative feature underlying the molecular pathogenesis of diabetes mellitus in the Akita mouse (55).
Stability and in Vivo Folding Are Not Correlated-Biophysical studies of two-disulfide insulin and proinsulin analogs lacking cystines A6-A11 indicate that such species form partial folds of low stability (ΔGu = 1.9 ± 0.3 kcal/mol, implying that ΔΔGu is ~3 kcal/mol) (Table 1) (5,7). In transfected Chinese hamster ovary or 293T cells, removal of the A6-A11 disulfide bridge by mutation of these Cys residues to Ser does not block expression or secretion; furthermore, this pairwise substitution produces little detectable change in band mobility upon nonreducing Tris/Tricine/urea/SDS-PAGE (31) despite the presumed loss of structural organization and low stability. This control experiment is illustrated in Fig. 5C (lanes 7 and 8). Even more surprising, substitution of His B10 with Asp (predicted to augment stability through optimization of segmental helical stability) leads to a paradoxical increase in the production of nonnative disulfide isomers in the ER (44). Together, these studies strongly suggest that the efficiency of folding and secretion in vivo is not correlated with the extent of structure and stability in vitro. Impaired foldability of Ala B5-proinsulin in 293T cells is thus unlikely to be a consequence of the effect of the substitution on the thermodynamic stability of the protein once folded.

DISCUSSION
This study has focused on the contribution of His B5 to the folding, structure, and stability of human insulin. Conserved within eutherian mammals, the imidazole ring packs within a solvated crevice at the edge of the protein surface (11). Direct contacts in this crevice, including interchain hydrogen bonds and a network of bound water molecules, constrain the orientation of the N-terminal segment of the B chain against the A chain. These interactions define in part the environment of the solvent-exposed A7-B7 disulfide bridge but do not contribute significantly to receptor binding (24,35). In an effort to understand what biological constraints might underlie the conservation of His B5, we chose to investigate an Ala B5 analog. Because the analog exhibits near-native structure and activity, we undertook the comparative analysis of foldability in vitro and in cell culture.
Efficiency of Insulin Chain Combination-Chemical synthesis of Ala B5-DKP-insulin was limited by inefficient disulfide pairing in the course of chain combination. Although a practical impediment to structural studies, such inefficiency is remarkable in light of the wide utility of this procedure in the synthesis of many unrelated analogs since its development by Katsoyannis in 1966 (42). The robustness of this reaction is highlighted by the efficient syntheses of two-disulfide analogs lacking either cystines A6-A11 or A7-B7 (5,7,8). These analogs are less well organized than native insulin, and their partial folds exhibit negligible thermodynamic stabilities. Similar efficiencies are observed in the synthesis of partially folded analogs in which the N-terminal A chain α-helix is destabilized by multiple glycine substitutions (26). Conversely, synthetic yields are not significantly increased by substitutions that enhance stability through optimization of helical N- or C-capping residues (26,29). Lack of correlation between yield and stability provides evidence that insulin chain combination is under kinetic (rather than thermodynamic) control as originally proposed by Katsoyannis and co-workers (20).
Although synthetic failures are seldom reported, the following negative observations may relate to the reaction mechanism and hence to the associated folding pathway. (i) Substitution of invariant Gly B8 with L-amino acids markedly impairs chain combination, whereas D-substitutions enhance its efficiency (41). The conformation of Gly B8 is integral to the direction of the B7-B10 β-turn. The positive B8 φ angle, located in a region of the Ramachandran plane ordinarily "forbidden" to L-amino acids, would be maintained by D-substitutions but destabilized by L-substitutions. Perturbation of this turn in a reaction intermediate might misalign Cys B7 and so impair formation of cystine A7-B7. L-amino acid substitutions at residue B8 of a single-chain insulin precursor (mini-proinsulin) likewise impair its expression in yeast (56,57).
(ii) Interchange of Leu B11 and Val B12 to create the analog [Val B11, Leu B12]insulin results in very low yields (20). Because substitution of Val B12 with Leu alone does not impair yield (21), this low yield is presumably due to the B11 substitution. In the wild-type structure, Leu B11 packs in the core, where it appears to stabilize the conformation of Leu B6 and cystines A6-A11. It is possible that Val B11 is associated with a packing defect that impairs alignment of Cys A6 and Cys A11 in a protein-folding intermediate. These examples suggest that insulin chain combination is guided by subtle nascent interactions in the folding chains.
Application to Folding of Proinsulin-Impaired chain combination of Ala B5-DKP-insulin correlates with a block of the folding and secretion of Ala B5-proinsulin and related analogs in a transfected rodent secretory cell line. Although such a correlation would seem natural (58-60), recent studies of proinsulin variants have shown that the biological requirements of foldability extend beyond chemical requirements. Substitution of His B10 with Asp, for example, impairs the fidelity of disulfide pairing in the ER despite the stabilizing effect of this substitution in vitro (44). Foreshortening of the connecting peptide in mini-proinsulin analogs, which enhances the efficiency of oxidative refolding in vitro (61), can likewise lead to scrambling of disulfide bridges in the ER (31). We imagine that such constructions perturb the enzymatic machinery of oxidative protein folding and/or the associated ER chaperones. Substitution of His B5 with Ala (or the diverse set of other side chains tested in Fig. 6) is likely to introduce a kinetic block of native disulfide pairing, and hence, degradation supervenes.
FIGURE 5. Pathway of insulin biosynthesis and transient transfection assay for expression and secretion. A, nascent proinsulin folds as a monomer in the rough ER (rER), wherein the zinc ion concentration is low. In the Golgi apparatus, the zinc-stabilized proinsulin hexamer assembles and is processed by cleavage of the connecting peptide to yield mature insulin. Zinc insulin crystals are observed in secretory granules. B, insulin hexamers dissociate in the bloodstream to yield active monomers. C, SDS-PAGE assay of proinsulin expression in the ER (lanes 1, 3, 5, and 7) and secretion into the medium (lanes 2, 4, 6, and 8). The transfection assay was carried out in 293T cells: human wild-type proinsulin and variants Ala B5 and [Ser A6, Ser A11]. The empty vector control is shown in lanes 1 and 2. Cells were pulse-labeled with 35S-labeled amino acids for 1 h and chased for 1 h. The cells were lysed, and both lysates (C) and chase media (M) were immunoprecipitated with anti-insulin antiserum. The most rapidly migrating species contains the native disulfide-pairing scheme; less rapidly migrating species represent disulfide isomers.

Because chain combination and in vivo foldability reflect kinetic processes, small decrements in thermodynamic stability cannot in themselves account for relative yields. The absence of thermodynamic control is demonstrated through control studies of an unstable two-disulfide analog in which cystines A6-A11 are pairwise substituted with Ser. Despite its partial unfolding and loss of a sigmoidal (cooperative) unfolding transition (7), chain combination and ER foldability are robust. We nevertheless suggest that the loss of His B5-related interchain hydrogen bonds (apparent in the structure of Ala B5-DKP-insulin) and the associated decrease in the stability of the folded state (ΔΔGu = −1.7 kcal/mol, inferred from guanidine denaturation studies) are relevant to impaired foldability.
In particular, we imagine that these interactions are critical to the structure and metastability of a protein-folding intermediate and serve to guide the orientation of Cys A7 and Cys B7 for productive pairing. This intermediate is likely to contain cystines A20-B19, the first disulfide bridge to be identified in in vitro refolding studies of mini-proinsulin (62) and insulin-related growth factor I (63,64). Unlike pairwise substitution of cystines A6-A11, mutations at position A20 or B19 block productive folding and secretion in transfected cells (43).
In a reversible two-state folding reaction (a simpler process than multistep oxidative folding), the distinction between effects of mutations on the native state versus the transition state is well understood (Fig. 7). Mutations that impair interactions involved in both states (Fig. 7A) increase the barrier height from the unfolded state (activation free energy (ΔG*)) and reduce native stability (ΔGu). By contrast, mutations at sites uninvolved with nascent interactions in the transition state (Fig. 7B) impair stability without effect on the barrier height from the unfolded state. These principles are likely to apply to the discrete steps in an oxidative folding pathway. We therefore imagine that His B5 both participates in nascent interchain interactions preceding formation of cystines A7-B7 and stabilizes the subsequent disulfide-bridged species. Substitution of Ile A2 with Gly, destabilizing but without effect on chain combination yield (26), might provide an example of the other class. It would be of future interest to investigate in the present 293T cell transfection assay whether the expression and secretion of proinsulin would be perturbed by this non-conservative substitution.
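The two classes of mutation distinguished above can be illustrated with a minimal numerical sketch. It assumes a simple linear free-energy picture in which a fraction `phi` of the native-state destabilization is already felt in the transition state, and a folding rate that scales as exp(−ΔG*/RT). The function name, the `phi` parameter, and the temperature of 298.15 K are illustrative assumptions and do not come from this study; the value of 1.7 kcal/mol is used only as a sample input.

```python
import math

R_KCAL = 1.987204e-3  # molar gas constant in kcal/(mol*K)


def mutation_effect(ddg_u, phi, temp_k=298.15):
    """Return (increase in folding barrier, folding rate relative to parent).

    ddg_u  : loss of native-state stability in kcal/mol (positive = destabilized)
    phi    : hypothetical fraction (0 to 1) of that loss already present in the
             transition state; 1 mimics the Fig. 7A class, 0 the Fig. 7B class
    temp_k : assumed temperature in kelvin
    """
    if not 0.0 <= phi <= 1.0:
        raise ValueError("phi must lie between 0 and 1")
    ddg_barrier = phi * ddg_u  # increase in the barrier from the unfolded state
    rel_rate = math.exp(-ddg_barrier / (R_KCAL * temp_k))
    return ddg_barrier, rel_rate


if __name__ == "__main__":
    for phi in (0.0, 0.5, 1.0):
        barrier, rate = mutation_effect(1.7, phi)
        print(f"phi={phi:.1f}  barrier +{barrier:.2f} kcal/mol  relative rate {rate:.3f}")
```

Under these assumptions, a site fully engaged in the transition state (`phi = 1`) slows folding by more than tenfold for a 1.7 kcal/mol destabilization, whereas a site uninvolved in the transition state (`phi = 0`) leaves the folding rate unchanged despite the same loss of stability.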
Complementary Determinants of Foldability and Function-Although the structure of an insulin monomer in solution resembles the crystallographic T-state, in which His B5 packs within an interchain crevice, zinc insulin hexamers exhibit a ligand-dependent equilibrium among T₆, T₃R₃ᶠ, and R₆ conformations (11). Whereas in the T-state the B chain contains an extended N-terminal arm (residues B1-B6) and type II′ β-turn (residues B7-B10) followed by the central α-helix (residues B9-B19), in the alternative R-state, the B chain contains an extended α-helix (residues B1-B19) (65,66).⁷ This structural transition thus alters the environment of His B5. In a putative R-state monomer, His B5 and Leu B6 would extend from a novel helical surface; Gly B8 would exhibit a negative φ angle and thus escape from the forbidden right side of the Ramachandran plane. The relevance of an R-like structure to receptor binding has long been the subject of speculation (11,41,49).
Respective D- or L-substitutions at position B8 shift the equilibrium among T₆, T₃R₃ᶠ, and R₆ hexamers in one direction or the other (41). D-Amino acids at position B8 favor the T-state but block receptor binding; L-substitutions destabilize the T-state but can be highly active. Remarkably, among such chiral analogs, efficiencies of chain combination are anti-correlated with receptor binding activities. Although D-substitutions cannot be tested in vivo, we imagine that the folding-competent conformation of proinsulin resembles the classical T-state, including nascent interactions between His B5 and the A chain. Whereas this conformation is retained in the free hormone, insulin may undergo a change in conformation upon receptor binding: the bound state may exhibit R-like features (11,41,66).⁸ Because the T-to-R transition is also associated with changes in the conformation and solvent accessibility of the A7-B7 disulfide bridge, it is plausible that these T- or R-like conformations, if populated in protein-folding intermediates, could favor or retard the kinetic mechanism of disulfide pairing. We thus envisage that His B5 and D-amino acid substitutions at position B8 contribute to the foldability of insulin by an analogous mechanism: stabilization of a local T-like structure near cystines A7 and B7.

⁷ The T-to-R transition is also characterized by a change in the handedness of cystines A7-B7. The sulfur atoms of the latter are exposed in the T-state but buried in a non-polar crevice in the R-state. We speculate that coupling between the B8 φ angle and handedness of cystines A7-B7 may account for the low yield of chain combination in the synthesis of L-Ser B8-DKP-insulin.

⁸ The receptor-bound conformation of insulin may differ from both T- and R-like structures in the C-terminal portion of the B chain (residues B20-B30). In classical structures of either type, the C-terminal B chain β-strand covers Ile A2 and Val A3. Analogs containing allo-Ile A2 (in which the chirality of the β-carbon is inverted) (19) or Leu A3 exhibit native structure but low activity (22,73,74). Destabilization of the B chain β-strand by substitution of Phe B24 with D-Phe enhances activity (40,75).

FIGURE 7. ... (N and N′), increasing the barrier height from the unfolded state (U). B, mutations at sites uninvolved in the transition state may nonetheless destabilize the native state; no effect was observed on barrier height from the unfolded state. Guanidine denaturation experiments probe ΔGu and ΔΔGu (see Table 1). Although multistep oxidative folding is more complex, analogous principles are proposed to underlie intermediate events.

Conclusion-His B5 is conserved among eutherian mammals and birds but is not invariant among hystricomorph mammals. Whereas guinea pig insulin retains His B5, for example, the divergent insulins of the coypu and chinchilla rat contain Arg. Similarly, whereas fish insulins frequently retain His B5 (including that of the primitive hagfish), Arg is observed in ratfish, rabbitfish, and elephantfish. Arg B5 is also observed among reptiles. Accordingly, it would be of future interest to investigate the structure and foldability of Arg B5-insulin analogs. In contrast to insulin itself, residue B5 is not conserved as His or Arg among insulin-related growth factors (67,68) or more distant members of the insulin-related superfamily (69), including invertebrates (9).
The marked effects of B5 substitutions on foldability stand in contrast to the high biological activity of Ala B5-DKP-insulin once its folding has been achieved. We imagine that interactions between His B5 and the A chain are disrupted upon binding of insulin to the insulin receptor. Such an induced fit may be related to an R-like conformational change (41,66). The small changes in the free energy of binding observed upon deletion of residues B1-B5 (ΔΔGb ∼ 1 kcal/mol; Ref. 11) or mutation of residue B5 to Ala (ΔΔGb < 0.5 kcal/mol) may reflect modulation of this conformational equilibrium or minor adjustments at the edge of the hormone-receptor interface. Conservation of His B5 among eutherian mammals is enjoined not by requirements of receptor binding but instead by the pathway of oxidative protein folding in the ER.
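For orientation, the binding free-energy changes quoted above can be related to relative receptor affinities through ΔΔGb = −RT ln(relative affinity). The following is a minimal sketch of that conversion, assuming a temperature of 298.15 K (the assay temperature is not stated in this section) and taking the 80% relative binding activity reported in the abstract as sample input. The function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # molar gas constant in kcal/(mol*K)


def ddg_binding(relative_affinity, temp_k=298.15):
    """Binding free-energy penalty (kcal/mol) for an analog whose affinity is
    `relative_affinity` times that of the parent (1.0 = unchanged)."""
    if relative_affinity <= 0:
        raise ValueError("relative affinity must be positive")
    return -R_KCAL * temp_k * math.log(relative_affinity)


def affinity_from_ddg(ddg_b, temp_k=298.15):
    """Inverse conversion: relative affinity implied by a penalty in kcal/mol."""
    return math.exp(-ddg_b / (R_KCAL * temp_k))


if __name__ == "__main__":
    # Sample inputs: 80% relative activity and a 1 kcal/mol penalty
    print(f"80% activity  -> ddG_b = {ddg_binding(0.80):.2f} kcal/mol")
    print(f"1.0 kcal/mol  -> relative affinity = {affinity_from_ddg(1.0):.2f}")
```

At the assumed temperature, 80% relative activity corresponds to a penalty well below the 0.5 kcal/mol bound cited for the Ala B5 substitution, and a penalty of 1 kcal/mol corresponds to roughly a fivefold loss of affinity.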