Assisted protein folding.



Historical Perspectives
The pioneering work of the late Christian Anfinsen and his colleagues (1) on the reoxidation of bovine pancreatic ribonuclease (RNase) to a native, biologically active enzyme in vitro after reduction of disulfide bridges and disruption of tertiary structure demonstrated that regeneration of native conformation of a purified protein can occur spontaneously in a test tube without the addition of any other co-factors or helper enzymes. This led to the still valid conclusion that "no special genetic information, beyond that contained in the amino acid sequence, is required for the proper folding of the molecule and for the formation of 'correct' disulfide bonds" (2).
Of course, the story of protein folding goes back much further (reviewed in Ref. 3). A number of milestones can be noted. In 1911, Chick and Martin found that proteins could be denatured in vitro, and they distinguished that process from aggregation of the protein. In 1929, Wu postulated that protein denaturation was an unfolding process and that native protein structures involved regular, repeated patterns of folding into a three-dimensional network. Anson and Mirsky in 1931 and Anson (1945) showed that hemoglobin folding is reversible and that hemoglobin could be renatured in vitro to a form that had a native-like absorption spectrum, oxygen binding, and tryptic digestion pattern. Studies in the 1950s by Eisenberg and Schwert and by Schellman demonstrated that denaturation and renaturation are thermodynamic processes, involving a change in free energy and large changes in conformation between the denatured and native states.
Even the early investigators realized that the protein folding processes that occurred in test tubes, although they could reconstitute native structure, were too slow to work inside cells. For example, even under optimized conditions of protein dilution, pH, and temperature, renaturation of RNase takes about 20 min (2), and RNase is a relatively simple monomeric protein. Renaturation of some multidomain proteins may take several hours in vitro, yet it is clear that all possible conformations could not be sampled on the way to native structure. Levinthal (4) summed this up succinctly in the "Levinthal paradox" that can be stated as follows: if a given amino acid can assume approximately 10 different conformations, the total number of possible conformations in a polypeptide chain of 100 residues would be 10^100. The time that this could take would be well beyond the life span of an organism if not of the universe, depending on how many conformations could be sampled before a protein reaches native state. Thus, it was realized early on that cells must have special ways to make the process more efficient.
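The scale of the Levinthal estimate can be made concrete with a short calculation. The sketch below is illustrative only: the 10 conformations per residue and the 100-residue chain come from the text, whereas the sampling rate (10^13 conformations per second) and the approximate age of the universe are assumptions introduced here.

```python
# Back-of-the-envelope Levinthal estimate.
# From the text: 10 conformations per residue, 100 residues.
# Assumed here (not from the text): the sampling rate and the age of the universe.

CONFORMATIONS_PER_RESIDUE = 10
CHAIN_LENGTH = 100
SAMPLING_RATE_PER_S = 1e13      # assumed: conformations tried per second
AGE_OF_UNIVERSE_S = 4.35e17     # approximate, about 13.8 billion years


def exhaustive_search_time_s(states_per_residue: int, n_residues: int,
                             rate_per_s: float) -> float:
    """Seconds needed to visit every conformation once at a fixed rate."""
    total_conformations = states_per_residue ** n_residues  # exact integer
    return total_conformations / rate_per_s


search_time = exhaustive_search_time_s(
    CONFORMATIONS_PER_RESIDUE, CHAIN_LENGTH, SAMPLING_RATE_PER_S
)
print(f"exhaustive search time: {search_time:.1e} s")
print(f"in ages of the universe: {search_time / AGE_OF_UNIVERSE_S:.1e}")
```

Under these assumptions an exhaustive search would take on the order of 10^87 s, so even a sampling rate many orders of magnitude faster would not bring a random search within biological time scales.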
Experiments to examine the role of the intracellular environment in protein folding involved the renaturation of proteins such as RNase (2), bovine pancreatic trypsin inhibitor (BPTI) (5), or influenza hemagglutinin (6) in isolated microsomal fractions. The results indicated that protein folding can be facilitated by proteins contained in the endoplasmic reticulum (ER) of eukaryotic cells. In the case of disulfide bond-containing proteins such as BPTI (5) or the human chorionic gonadotropin (hCG)-β subunit (7), the key ER protein involved appears to be protein disulfide isomerase (see below).
It was soon realized that many polypeptides can reform native structure easily by themselves in vitro (usually small single domain proteins) while others (more complex, multidomain, or oligomeric proteins) fold and assemble efficiently only in the presence of additional proteins that are not constituents of the final native protein itself. These additional proteins have been called "molecular chaperones." The term molecular chaperone was first used by Laskey et al. (8) to describe the role of nucleoplasmin in the assembly of DNA and histones into nucleosomes. The name seemed appropriate because nucleoplasmin promotes histone-histone interactions to form the correct oligomeric form while preventing aggregation. It does so without itself forming part of the nucleosome and without specifying nucleosome structure. Hence nucleoplasmin assumes the role of a chaperone.
The term molecular chaperone has been applied by Ellis and Hemmingsen (9) to the expanding families of proteins of bacterial and eukaryotic compartments involved in protein folding, assembly, and translocation. The term has stuck, and it is now used to define a wide variety of factors that facilitate generation of native protein and nucleic acid structures.

Protein Folding in Vitro Versus in Vivo
There are some similarities as well as differences between intracellular protein folding and protein folding in test tubes. For instance, for the tailspike protein of Salmonella typhimurium phage P22 (10, 11) and the hCG-β subunit (7), intermediates in the folding pathway of the proteins appear to be the same in vivo and in vitro, but the rate and efficiency with which proteins achieve final native state in vivo are higher than those in vitro. It must also be kept in mind that, both in vivo and in vitro, correct folding is in competition with misfolding and aggregation. The outcome of this competition depends on the protein concentration used for in vitro folding reactions, and in general, very dilute protein concentrations (0.01-0.02 mg/ml) (2, 12) are needed to prevent aggregation. This has presented a huge problem to the biotechnology industry in attempts to produce useful amounts of recombinant proteins. The efficiency of folding in vitro can frequently be improved by appropriate adjustment of the redox potential (13-15) or the addition of factors such as protein disulfide isomerase (PDI) for eukaryotic disulfide-bonded proteins (15-17) or DnaK/DnaJ chaperones for bacterial proteins (reviewed in Refs. 18 and 19). In contrast to what happens in vitro, cells minimize or circumvent the off-pathway events by utilizing molecular chaperones that facilitate the folding process by preventing aggregation and other unfavorable interactions.
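Why dilution favors correct folding can be illustrated with a toy kinetic model in which unfolded chains either fold in a first-order step or aggregate in a second-order step. This is a minimal sketch under stated assumptions, not a model taken from the studies cited above: the two-pathway scheme, the function name folding_yield, and both rate constants are hypothetical and chosen only to show the trend.

```python
import math


def folding_yield(c0_mg_per_ml: float, k_fold_per_s: float,
                  k_agg_ml_per_mg_s: float) -> float:
    """Fraction of chains that reach the native state in a toy model.

    Unfolded chains U either fold (first order, rate k_fold * U) or
    aggregate (second order, rate k_agg * U**2). Integrating
    dN/dU = -k_fold / (k_fold + k_agg * U) from U = c0 down to U = 0
    gives yield = ln(1 + x) / x, with x = k_agg * c0 / k_fold.
    """
    if c0_mg_per_ml <= 0 or k_fold_per_s <= 0 or k_agg_ml_per_mg_s < 0:
        raise ValueError("concentration and k_fold must be positive; k_agg >= 0")
    x = k_agg_ml_per_mg_s * c0_mg_per_ml / k_fold_per_s
    if x == 0.0:
        return 1.0  # no aggregation pathway: every chain folds
    return math.log1p(x) / x


# Hypothetical rate constants, for illustration only.
K_FOLD = 1e-3   # 1/s
K_AGG = 5e-2    # ml/(mg*s)

for c0 in (0.01, 0.02, 0.1, 1.0):
    print(f"{c0:5.2f} mg/ml -> yield {folding_yield(c0, K_FOLD, K_AGG):.2f}")
```

In this model the yield depends only on the ratio of the aggregation and folding rates at the starting concentration, which is why lowering the protein concentration, or adding a factor that suppresses aggregation, raises the fraction of native protein.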
There is growing interest in what regulates the folding of mammalian proteins in vivo because of the number of human diseases now known to be related to protein folding defects (reviewed in Refs. 20 and 21). These include cystic fibrosis, α1-antitrypsin deficiency, Alzheimer's disease, Creutzfeldt-Jakob disease, neurodegenerative diseases such as Huntington's chorea, and cancer.
Many of the eukaryotic proteins whose folding and assembly have been studied in vivo are membrane or secreted proteins. They follow a similar route to the cell surface. (i) Synthesis is carried out in the rough ER. (ii) Nascent proteins are translocated into the cisternal space of the ER where the signal peptide is cleaved; initial co-translational folding involving secondary structure and some native tertiary structure occurs; addition of high mannose N-linked oligosaccharides and initial processing of N-linked oligosaccharide chains (for glycoproteins) takes place; formation of disulfide bonds occurs, and for multimeric proteins, oligomerization or subunit assembly is attained along with achievement of native structure. (iii) The proteins destined for the cell surface or secretion are translocated to the Golgi apparatus, further processed, and then either translocated to the cell surface or packaged into secretory vesicles for secretion.

Role of Disulfide Bond Formation in Protein Folding and Assembly
It has been clear for a long time that the in vitro folding of proteins targeted for secretion is facilitated by folding in the presence of microsomal extracts (2). It is now known that microsomes contain many chaperones that foster protein folding (30) as well as the systems to create a favorable redox potential for the formation of disulfide bonds (13, 31). For many secreted proteins, disulfide bonds are important for stabilization of tertiary structure and for their assembly into multimeric structures (32). PDI is a key factor facilitating disulfide bond-dependent folding. For example, the extent to which microsomal extracts stimulate the rate of BPTI folding is that expected from the PDI activity of these extracts (5). Moreover, PDI stimulates the rate of folding of kinetically trapped BPTI molecules in vitro by several thousand-fold but has little effect on disulfide bond formation of BPTI molecules possessing disulfide bonds that form efficiently in the absence of PDI (17).
The pioneering work of Creighton and his colleagues (reviewed in Ref. 34) employed the formation of intramolecular disulfide bonds as a biochemical probe to study the folding pathway of BPTI. This 58-amino acid protein has three intramolecular disulfide bonds, and when reduced and denatured, its folding pathway can be followed in vitro by the reformation of the disulfide bonds. Using alkylation with iodoacetic acid to trap free thiols, Creighton et al. (34) defined a folding pathway involving intermediates that varied in their amount and type of disulfide bonds. They also observed the existence of a significant, albeit mostly transient, population of intermediates containing disulfide bonds not present in the native protein, the formation of which is the result of disulfide bond rearrangements.
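The number of disulfide arrangements available to a reduced protein helps explain why non-native disulfides appear as intermediates. The sketch below counts and enumerates the ways a set of cysteines can be fully paired. It is an illustration rather than a reconstruction of Creighton's analysis: the BPTI cysteine positions and native pairs (5-55, 14-38, 30-51) are taken from general knowledge of BPTI rather than from the text above, and the function names are hypothetical.

```python
from typing import Iterator, List, Tuple

Pairing = List[Tuple[int, int]]


def count_full_pairings(n_cys: int) -> int:
    """Ways to pair all n cysteines into disulfides: (n - 1)!! for even n."""
    if n_cys < 0 or n_cys % 2:
        raise ValueError("need a non-negative, even number of cysteines")
    count = 1
    for k in range(n_cys - 1, 0, -2):
        count *= k
    return count


def enumerate_full_pairings(cysteines: List[int]) -> Iterator[Pairing]:
    """Yield every complete pairing of the given cysteine positions."""
    if not cysteines:
        yield []
        return
    first, rest = cysteines[0], cysteines[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for pairing in enumerate_full_pairings(remaining):
            yield [(first, partner)] + pairing


# BPTI: six cysteines, three disulfides (positions assumed, see lead-in).
BPTI_CYS = [5, 14, 30, 38, 51, 55]
BPTI_NATIVE = [(5, 55), (14, 38), (30, 51)]

bpti_pairings = list(enumerate_full_pairings(BPTI_CYS))
print("BPTI fully paired species:", len(bpti_pairings))
print("native pairing among them:", BPTI_NATIVE in bpti_pairings)
print("twelve cysteines (six disulfides):", count_full_pairings(12))
```

Only one of the fully paired arrangements is native, so as the number of cysteines grows, reaching the correct set by random pairing becomes increasingly unlikely without disulfide rearrangement.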
There is also evidence that disulfide bond rearrangement occurs during protein folding in intact cells. For example, in the intracellular folding pathway of the hCG-β subunit determined by pulse-chase kinetics (35) and by site-directed mutagenesis of cysteines involved in disulfide bonds (36, 37), two of the six disulfide bonds formed during the kinetic folding pathway of hCG-β are different from those seen in the crystal structure of the native protein (38, 39).
Another protein whose in vivo folding has been studied is influenza hemagglutinin (HA) (28). The folding of HA in the ER has also been followed by the formation of intrachain disulfide bonds. Folding of HA starts cotranslationally with some disulfide bonds beginning to form soon after both cysteines that are involved in a disulfide pair enter the ER lumen. However, most disulfide bond formation in HA occurs after termination of the nascent polypeptide chain. This has also been observed during the in vivo folding of hCG-β (35).

Role of Glycosylation in Protein Folding and Assembly
Since many membrane and secretory proteins are glycoproteins, it is important to consider the role of carbohydrates in protein folding, assembly, and secretion. N-Linked oligosaccharides of the high mannose composition are added cotranslationally to the Asn-X-Ser(Thr) consensus sequence of proteins in the ER. One function of N-linked glycans is to facilitate protein folding and conformational maturation. When N-linked chains are eliminated by site-directed mutagenesis of Asn residues in glycosylation consensus sequences or by treatment of cells with agents that block addition of N-linked glycans or their processing, many ER-synthesized glycoproteins misfold, aggregate, and get degraded within the ER (reviewed in Ref. 40). Some glycoproteins seem to fold and be translocated efficiently without their N-linked glycans. The only rule that seems to emerge here is that larger, more complex glycoproteins have more trouble folding if their N-linked glycans are missing.
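Candidate N-glycosylation sites of the Asn-X-Ser(Thr) type can be located in a sequence with a simple pattern search, for instance when planning which Asn residues to mutate. The sketch below is illustrative: it applies the commonly used restriction that X is any residue except proline, which is not stated in the text above, and the function name and the example peptide are hypothetical. A match indicates only a potential site, not that the site is actually glycosylated.

```python
import re
from typing import List, Tuple

# Asn-X-Ser/Thr with X != Pro (the proline exclusion is an added assumption).
# A lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence: str) -> List[Tuple[int, str]]:
    """Return (1-based Asn position, tripeptide) for each candidate sequon."""
    seq = sequence.upper()
    return [(m.start() + 1, seq[m.start():m.start() + 3])
            for m in SEQUON.finditer(seq)]


# Hypothetical peptide, not a real protein sequence.
example = "MKNGSAANPTQLNNTSW"
for position, tripeptide in find_sequons(example):
    print(f"Asn{position}: {tripeptide}")
```

In the example peptide the Asn-Pro-Thr stretch is skipped because of the proline rule, while the two overlapping sequons near the C terminus are both reported.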
The role of N-linked oligosaccharide chains in intracellular folding of the hCG-β subunit has been determined by examining the kinetics of folding in Chinese hamster ovary cells transfected with wild-type or mutant hCG-β genes lacking one or both of the asparagine glycosylation sites (41). Folding of hCG-β lacking both N-linked glycans was inefficient and correlated with the slow formation of the last three disulfide bonds (i.e. disulfides 23-72, 93-100, and 26-110) to form in the hCG-β folding pathway. Unglycosylated hCG-β was slowly secreted from Chinese hamster ovary cells, and β subunit folding intermediates retained in cells for more than 5 h were degraded into a smaller hCG-β fragment. However, coexpression of hCG-α, which is required for formation of the biologically active αβ heterodimer, enhanced folding and formation of disulfide bonds 23-72, 93-100, and 26-110 of hCG-β lacking N-linked glycans, suggesting that the presence of its heterodimeric companion subunit fosters β subunit folding and assembly, perhaps because the α subunit can act like a chaperone for β subunit folding. In addition, the molecular chaperones BiP, ERp72, and ERp94 were found in a stable complex with unglycosylated, unfolded hCG-β and may be involved in the folding of this β form (41). These data indicate that N-linked oligosaccharides assist hCG-β subunit folding by facilitating disulfide bond formation, perhaps by increasing the stability and solubility of the native structure that fosters disulfide formation.

Molecular Chaperones
The role of molecular chaperones in protein folding, assembly, and intracellular translocation has been the subject of a number of recent reviews (19, 42-45). ER chaperones play a key role in protein folding and quality control. Cytosolic chaperones play a key role in folding, transport, and biological activity of a number of proteins targeted for transport to specific organelles such as the nucleus and mitochondria. Examples of some eukaryotic molecular chaperones are shown in Table I. Important members of the ER family of chaperones include BiP, originally characterized as an immunoglobulin binding protein (hence the name) (46), [...] calnexin (49), and calreticulin (50). Additional chaperones continue to be discovered and characterized. The enzyme-like cofactors protein disulfide isomerase (51) and peptidyl prolyl cis/trans-isomerase (52), which catalyzes the isomerization of trans- to cis-proline, are usually considered as ER chaperones as well.
[Figure legend, beginning missing] ... 2-3 min). Along with the sequential formation of the 9-90 and 23-72 disulfide bonds, pβ1-late undergoes a major conformational shift into pβ2-free (t1/2 = 4-5 min). When the 93-100 disulfide bridge forms, pβ2-free is converted into an assembly-competent intermediate (t1/2 = 8-10 min) that, following association with the α subunit, is recognized as early pβ2-combined. After heterodimer assembly occurs, the 26-110 bond forms a "seat belt" around the α subunit. The crystallographic data (38, 39) indicate that the disulfide bonds 38-90 and 9-57 are present in the mature hCG αβ heterodimer, suggesting that there is a disulfide rearrangement in the β subunit folding pathway (reprinted with permission from Bedows et al. (36)).
Chaperones appear to act sequentially in protein folding pathways by binding to folding intermediates that are in various stages of folding and then passing them on to the next chaperone or chaperone complex in the cascade, eventually releasing a competent native protein (53-55). Binding usually involves interaction of chaperones with hydrophobic residues on the surface of unfolded proteins, and release often involves ATP hydrolysis. Binding of chaperones does not involve specific amino acid consensus sequences in the substrate protein but rather is determined by the arrangement of hydrophobic residues. Binding of BiP by folding intermediates, as an example, is favored by 7-8-residue amino acid sequences with aliphatic and aromatic amino acid residues in alternating positions (56). These are the sort of sequences that would normally be on the inside of native proteins, providing a way for chaperones to discriminate between folded and unfolded proteins.

What Happens to Misfolded Proteins?
It is generally thought that misfolded proteins remain in the ER and are sequestered and degraded there without being secreted. We now know that there are some exceptions to this, e.g. certain mis(un)folded forms of the hCG-β subunit. Some mutant forms of hCG-β that do not fold properly are degraded intracellularly while other mutant forms that remain incompletely folded are secreted (37). Nevertheless, there are a number of examples where protein misfolding leads to protein accumulation in the ER and degradation. Some molecular chaperones appear to be involved in targeting irreversibly misfolded proteins for degradation in the ER. BiP is one of these. For example, immunoglobulin light chains that are slowly folding and retained in the ER of cultured mouse cells are quantitatively bound to BiP as partially disulfide-bonded forms and then degraded, whereas light chains that are more rapidly folded and secreted only transiently interact with BiP (57).

Mechanisms of Chaperone Action
The observation that chaperones are needed to assist protein folding in living cells does not negate the findings of Anfinsen and others that proteins can fold spontaneously in solution based only on information contained in their primary amino acid sequence. Indeed, the data comparing the in vitro versus in vivo folding pathways for proteins that have been studied in this regard, for example the S. typhimurium phage P22 tailspike protein (10, 11) and the hCG-β subunit (7), indicate that proteins go through the same folding steps in vitro and intracellularly. What then do chaperones do and why do cells need them?
The best evidence for the mechanisms by which chaperones assist protein folding comes from Escherichia coli in which the chaperones DnaJ/DnaK and the chaperonins GroEL/GroES act in concert to facilitate folding of proteins. Recent biochemical evidence (reviewed in Refs. 19 and 45) and crystallographic data (58, 59) provide a fairly clear and fascinating story on this subject, although there is still some controversy on some key points. GroEL is made up of 14 identical 60-kDa subunits that form two heptameric ring structures with a pocket in the middle that can accommodate proteins up to about 60 kDa. GroES is a single heptameric ring structure of 10-kDa subunits that can form a cap over the GroEL structure and is involved in holding a folding intermediate within the GroEL pocket. Folding occurs in the GroEL pocket and involves binding and hydrolysis of ATP and release and rebinding of incompletely folded protein within the GroEL pocket until most of the protein is folded to native state (19).
The data indicate that members of the Hsp70 and Hsp40 families of chaperones in E. coli, namely DnaK and DnaJ, respectively, are involved in the initial binding of nascent polypeptides as they proceed off the ribosome. These chaperones assist in the early steps of protein folding and can facilitate, in cooperation with the nucleotide exchange factor GrpE, complete folding to a native state of some proteins. Other proteins require the additional actions of the GroEL/GroES system to complete folding (Fig. 2) (18). Analogous systems exist in eukaryotic cells and most likely act through mechanisms similar to the E. coli folding systems. The cytosolic Hsc70 family and the Hsp70 ER analogue BiP appear to act like the DnaK/DnaJ chaperones in E. coli. The TCP-1 family of eukaryotic cytosolic chaperonins, of which the chaperonin TRiC is the analogue of GroEL, appears to function like the GroEL/ES system. It should be noted that not all proteins need to proceed through the above noted Hsc70/Hsp70/Hsp40 and GroEL/GroES/TRiC systems to fold to native structure. Genetic data using temperature-sensitive mutants to shut off production of GroEL, for example, indicate that a maximum of about 30% of E. coli proteins need GroEL to fold correctly (45). Furthermore, four cellular folding compartments (the ER, mitochondrial intermembrane space, and the chloroplast lumen of eukaryotic cells as well as the periplasmic space of bacteria) do not contain GroEL-type chaperonins. However, each of these compartments contains a complement of chaperones that assist protein folding (Table I).
One key question that remains is whether proteins fold to native structure while they are bound to chaperones or whether they have to be released into solution to complete folding. Hartl and his colleagues (53) support the former hypothesis. They have data to show that when proteins are synthesized by a more in vivo-like translational system (reticulocyte lysate), sequential binding of folding intermediates by Hsc70/Hsp40 and then by TRiC occurs and that native state is achieved while a protein substrate remains bound to TRiC. Lorimer and his colleagues (60), on the other hand, believe that a polypeptide dissociates completely from GroEL during the folding process and may rebind several times before it is folded correctly to native state (so-called "iterative annealing") but that final folding events occur in solution. Whichever model turns out to be correct, and it may be substrate-dependent, it is clear that proteins folding at the high concentration and in the highly compact state of the intracellular environment require chaperones to assist their folding and prevent their spontaneous aggregation.