Cotranslational Protein Folding*

The problem of how the linear amino acid sequence of a polypeptide folds to assume its unique tertiary structure is one of the most basic and challenging conundrums of contemporary science. Many of the principles and characteristics of protein folding have been learned by studying refolding of denatured polypeptides. However, the problem of protein folding cannot be completely understood without reference to the biological context of protein folding, especially for large, multidomain, and multisubunit proteins. One of the basic differences between biosynthetic protein folding and protein renaturation is cotranslational folding, folding that occurs during synthesis. The elegant idea that the process of protein folding is concomitant with synthesis was articulated, and experimental testing was begun in the early 1960s (1, 2). Today there is substantial experimental support for the cotranslational folding hypothesis. Both cotranslational and cotranslocational folding, at least when the latter is coupled to translation, share the basic feature of vectorial appearance of the nascent polypeptide from the ribosome or the membrane and the potential initiation of the folding process by the emerging polypeptide. It is true that the same conformations are achieved by polypeptides folded in cells as a consequence of biosynthetic processes and as a result of refolding of the full-length polypeptide from the denatured state. However, identity of the final protein structures does not necessarily mean identity of the pathways leading to their formation (3). It is the kinetics of the folding process that establishes the folding pathway(s) and potential partitioning among different final forms and, ultimately, their relative yields. In fact, the biological function that is shared by all proteins is the ability to fold properly, and this function must be executed efficiently by all proteins prior to any other function. This seems to be the essence of the vectorial folding process. 
Several general patterns and principles of cotranslational folding are summarized in Figs. 1 and 2.

mic reticulum. The disulfide bond between Cys-35 and Cys-100 of the N-terminal domain starts to form when the nascent chains reach a length of 15.5 kDa (8). Formation of this bond is almost quantitative when the nascent polypeptide has reached a length of 18 kDa; formation of the disulfide thus requires ~3 s.
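The two chain lengths and the ~3 s interval quoted above together imply a rate of chain elongation. The following is a rough back-of-envelope sketch of that arithmetic, not a calculation taken from the cited work. It assumes a mean residue mass of 110 Da (a common rule of thumb that is not stated in the text), and the function name is hypothetical.

```python
# Back-of-envelope estimate relating nascent-chain mass to elongation rate.
# ASSUMPTION: mean residue mass of 110 Da (rule of thumb, not from the text).
MEAN_RESIDUE_MASS_DA = 110.0


def implied_elongation_rate(start_kda, end_kda, seconds,
                            residue_mass_da=MEAN_RESIDUE_MASS_DA):
    """Return (residues added, residues per second) for chain growth
    from start_kda to end_kda over the given number of seconds."""
    residues_added = (end_kda - start_kda) * 1000.0 / residue_mass_da
    return residues_added, residues_added / seconds


# Lengths (15.5 and 18 kDa) and the ~3 s interval are the values quoted above.
residues, rate = implied_elongation_rate(15.5, 18.0, 3.0)
print(f"{residues:.0f} residues added, about {rate:.1f} residues/s")
```

Under these assumptions the window in which the disulfide forms corresponds to the addition of a few tens of residues, so the estimate is sensitive mainly to the assumed residue mass and to the precision of the ~3 s figure.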
It has been shown with a conformation-dependent antibody that Escherichia coli tryptophan synthase β chains begin to fold during translation, even before appearance of the entire N-terminal domain (9,10). No lag was detected between synthesis of the nascent chains and appearance of immunoreactivity (11). Monoclonal antibody recognizing the structured monomer of bacteriophage P22 tailspike protein reacts with nascent chains (12). Ribosome-bound firefly luciferase and bovine rhodanese form protease-resistant N-terminal domains (13,14). Folding of ribosome-bound rhodanese and of ricin has been observed through the use of fluorescent probes (15,16).
Binding of Cofactors and Ligands-Binding of cofactors and ligands often stabilizes protein structure and can affect folding pathways. For the chloroplast reaction center protein D1, binding of several cofactors has been found to occur during synthesis and translocation into the thylakoid membrane (17). Cotranslational binding of chlorophyll is required to synthesize the full-length protein and prevent degradation of the nascent chains. Glycosylation of influenza hemagglutinin occurs in the lumen of the endoplasmic reticulum (18). Upon blockage of oligosaccharide addition, folding of the protein is perturbed, leading to the formation of aggregates. Binding of heme to rabbit α-globin begins when the emerging polypeptide achieves a length of 86 residues (19). Attachment of ligands and cofactors in all the above cases can occur immediately upon or very soon after appearance of the binding sites along the polypeptide chain, thereby stabilizing the tertiary structure of the nascent polypeptide.
Later Stages of the Folding Process and Formation of Oligomeric Structures-Rat serum albumin is a secretory protein with 17 disulfide bonds in the native structure, which are spread throughout the polypeptide chain. In the nascent polypeptides, about one-half of the cysteinyl residues exist in disulfide bonds, indicating completion of a substantial part of the overall folding process (20). Hemagglutinin-neuraminidase of Newcastle disease virus begins to assume defined structure during the process of synthesis (21). Nascent influenza hemagglutinin also forms disulfide bonds cotranslationally, including the critically important 52-277 bond (18). Two recent studies have demonstrated formation of enzymatically active forms of rhodanese and firefly luciferase still bound to the ribosomes when these polypeptides are expressed with extended C-terminal segments so that each enzyme is in the bulk solution (22,23). Polyribosomes from Chironomus salivary gland cells produce giant secretory proteins having compact domain-like structures (24).
Formation of oligomeric structures involving nascent polypeptides has been reported for several proteins. Formation of the β-galactosidase oligomer from nascent polypeptides was suggested in the pioneering studies of cotranslational folding. Ribosome-bound β-galactosidase chains can complement functionally defective subunits and produce ribosome-bound enzymatically active forms upon coexpression in heterozygous strains of E. coli or by mixing subunits in vitro (1). Formation of enzymatically active β-galactosidase on ribosomes also was observed following enzyme induction in vivo (2). The modular organization of the monomer and independent folding of each domain provide an explanation for how this large tetrameric complex could be formed with one monomer not yet completely synthesized (25). The authors suggested the possibility of formation of a dimeric complex between nascent polypeptides attached to neighboring ribosomes and then, by a similar mechanism, formation of the tetramer (2).
Cotranslational trimerization of the reovirus cell attachment protein via the N-terminal domain has been observed, possibly reflecting trimerization of nascent chains synthesized from adjacent ribosomes in the same polyribosomal complex (26). The human protein hexabrachion, a hexamer composed of 320-kDa subunits, achieves its folded form upon secretion so efficiently that no intermediate forms involving full-length subunits could be detected in vivo (27). Nascent polypeptides of several eucaryotic cytoskeletal proteins have been shown to assemble into the corresponding polymeric cytoskeletal structures (28).
Formation of the initial complex between immunoglobulin heavy and light chains involves disulfide bond formation between fully synthesized light chains and nascent heavy chains (29). The Cys residue from the heavy chain that is involved in the disulfide bond is located between two domains, each of which contains a single intradomain disulfide bridge. It appears that formation of this disulfide bridge requires prior folding of two adjacent domains.
An intriguing case of cotranslational assembly involves formation of type I procollagen trimer. Association of full-length chains is initiated by interactions between C-terminal propeptides (30). The triple helix propagates from the C-terminal propeptide to the N-terminal end. In vivo, a substantial portion of nascent collagen is full-length (31) as a consequence of a pause in translation prior to termination. These fully elongated chains potentially can associate through their propeptides to provide helix growth from the C terminus to the N terminus similar to that for the full-length procollagen. Based on the enhanced stability of the collagen trimer to proteolysis, it has been suggested that the initial stages of trimer formation can occur with the nascent collagen chains (32).

Molecular Chaperones and Folding Catalysts in Cotranslational Folding
Chaperones-Molecular chaperones are ubiquitous components of cells. In the presence of co-chaperones and ATP, substrate polypeptides form transient complexes with chaperones, cycling between free and chaperone-bound forms (33), leading to competition between folding polypeptides for binding to the chaperones (34). The binding affinity of substrate polypeptides to chaperones in the absence of ATP is substantially higher (by orders of magnitude), and dissociation of the complexes is extremely slow (33,35).
HSP 70-Members of the HSP 70 chaperone family include DnaK and DnaJ in procaryotic cells and eucaryotic HSP 70 and its co-chaperone HSP 40. When polyribosomal complexes consisting of the entire spectrum of nascent polypeptides were analyzed, it was shown that HSP 70 chaperones were associated with the nascent polypeptides (36-38). Some chaperones associated with the ribosomes may be involved in the translation process itself (39).
Two members of the HSP 70 family in yeast cells, Ssb1p and Ssb2p, have been found to interact with nascent polypeptides on translating ribosomes (37). The defective phenotype of these mutant strains, which have lower levels of polyribosomes, can be suppressed by increased expression of the HBS1 gene, which encodes a polypeptide resembling in sequence the eucaryotic translation elongation factor EF-1α and translation termination factor (37).
Interaction of nascent rhodanese with DnaK and DnaJ in the E. coli expression system has been studied in some detail. It appears that DnaJ interacts with the nascent polypeptide first, followed by DnaK, and finally, association of GrpE leads to dissociation of the complex (40). Interestingly, the nascent rhodanese polypeptide itself appears to block translation termination and release of the polypeptide from the ribosome, presumably by interference of the N-terminal segment of the polypeptide with binding of the translation termination factor RF2 (41). Binding of this N-terminal segment to the ribosome is only disrupted by DnaJ in conjunction with DnaK (40,42). Accumulation of the ribosome-bound full-length polypeptide on the ribosome due to impeded termination is not prevented by the endogenous chaperones; release requires incubation with chaperones at high concentration (41).
FIG. 2. Schematic representation of a section through a protein folding landscape in which the basic funnel concept described by others for refolding polypeptides (67, 68) has been adapted to include the processes of cotranslational folding. The energy surface on the left depicts the hypothetical case of protein biosynthesis in the absence of folding (blue arrows). The vertical axis represents conformational energy of the polypeptide, whereas the circumference of the funnel represents the conformational space available to the polypeptide. As the polypeptide emerges from the ribosome, the available conformations will increase (the funnel becomes wider) as the length of the polypeptide increases, and as the polypeptide emerges into the aqueous environment but does not fold, it will move up the surface of the funnel to higher energies. The blue surface represents processes involving covalent bond formation and hydrolysis; the overall process of biosynthesis and folding constitutes movement from left to right. The green surface represents noncovalent interactions associated with protein folding. When the full-length but still unfolded polypeptide is released from the ribosome, it will be free to fold to the native state through the pathways defined by the folding funnel on the right (green arrows). The more realistic model of cotranslational folding is viewed as a tunneling process whereby the nascent polypeptide folds through a series of intermediates as it emerges from the ribosome, thereby retaining a lower energy than would be the case for synthesis without folding. The nascent polypeptide at each stage of biosynthesis will be able to access multiple conformations, thus defining a folding funnel similar to that of the full-length polypeptide on the right; we have simplified the figure by showing the most highly populated species at each step of synthesis in the form of a tunnel. When sufficient polypeptide has emerged to begin to assume some structure, we envision the biosynthesis/folding process as leaving the blue funnel and "tunneling" to the folding (green) funnel. The intermediates I1, I2, and I3 are as defined in the legend to Fig. 1. The position on the biosynthesis funnel at which the tunnel begins reflects the length of nascent polypeptide required to stabilize a subset of conformational states; the different tunnels were included to indicate that some polypeptides may require longer N-terminal sequences before any structures become stabilized. The full-length nascent polypeptide, M*, folds to the native monomer, Mn, following release, by packing of the C-terminal segment of the polypeptide and final isomerization steps. Note that the cotranslational folding pathway maintains a lower barrier than would occur with synthesis in the absence of folding and therefore would be expected to occur faster. It also appears that cotranslational folding would allow the polypeptide to avoid kinetic traps that may be encountered during refolding of full-length polypeptides.

HSP 60-Studies of the potential involvement of HSP 60 in cotranslational folding have yielded controversial results because of the difficulty in distinguishing interactions of the chaperone with nascent polypeptides from interactions with polypeptides immediately after release into the bulk solution. Exposure of the C-terminal segment of 20-30 amino acid residues, which are sheltered within the ribosome during synthesis, can change the folding properties of the released polypeptide as well as its interactions with chaperones. GroE is required to produce rhodanese in an enzymatically active form during synthesis (41). Similar results have been obtained for TRiC, a mammalian cytoplasmic member of the HSP 60 family, for synthesis of firefly luciferase and of actin (13,43).
Clearly, these experiments, while demonstrating a requirement of HSP 60 for productive folding of these polypeptides, do not inform us of whether the interaction occurs during synthesis or after release of the polypeptide from the ribosome. The C-terminal sequences of firefly luciferase and actin are critical for binding of the polypeptides to TRiC (13,43). There is a huge excess of endogenous chaperones, 2.6 μM GroE (44) and 1 μM TRiC (13), over nascent polypeptides produced in cell-free expression systems (1-10 nM). Consequently, even minimal nonspecific contamination of the ribosomal fraction by these large particles would be sufficient to accommodate a substantial proportion of the nascent polypeptides upon release from the ribosomes. Careful examination of polyribosomes from E. coli (38) as well as those carrying nascent rhodanese chains in vitro (14) revealed no GroEL interaction with the nascent polypeptides. On the contrary, only after release from ribosomes have polypeptides been found in transient association with GroEL. In studies of eucaryotic mitochondrial proteins, their interactions with members of the HSP 60 family have been observed following completion of synthesis and/or translocation of the polypeptides (45). The suggestion that GroEL and TRiC, unlike other HSP 60 chaperones, are involved in interactions with nascent chains requires further substantiation.
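The size of the chaperone excess can be made explicit with a short calculation. The sketch below is illustrative only: it reads the chaperone concentrations quoted above as micromolar (an assumption about the units) and compares them with the 1-10 nM range given for nascent polypeptides. The function and variable names are hypothetical.

```python
# Molar excess of chaperone over nascent chains in a cell-free system.
# ASSUMPTION: chaperone concentrations are read as micromolar
# (2.6 uM GroE, 1 uM TRiC); nascent chains are 1-10 nM as quoted above.

def fold_excess(chaperone_molar, nascent_molar):
    """Ratio of chaperone concentration to nascent-chain concentration."""
    return chaperone_molar / nascent_molar


CHAPERONES_M = {"GroE": 2.6e-6, "TRiC": 1.0e-6}
NASCENT_LOW_M, NASCENT_HIGH_M = 1e-9, 10e-9

for name, conc in CHAPERONES_M.items():
    smallest = fold_excess(conc, NASCENT_HIGH_M)  # most nascent chain
    largest = fold_excess(conc, NASCENT_LOW_M)    # least nascent chain
    print(f"{name}: roughly {smallest:.0f}- to {largest:.0f}-fold excess")
```

On this reading the chaperones outnumber nascent chains by two to three orders of magnitude, which is why even slight carry-over of chaperone into the ribosomal fraction could bind much of the released polypeptide.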
An important question has been raised concerning the fraction of cellular proteins assisted by chaperones in folding. It has been estimated that no more than 5% of the polypeptides in E. coli fold with the assistance of GroE (44). In this regard, it should be remembered that rhodanese and firefly luciferase are translocated proteins that fold in an environment different from that of the cytoplasm and generally do not refold spontaneously without the aid of chaperones. It would be premature to extrapolate the requirements of chaperones found for folding of these proteins to all proteins. Some other chaperones have been implicated in binding nascent chains of particular groups of proteins. SecB can bind nascent polypeptides of E. coli secretory proteins, apparently preventing premature folding in the cytoplasm (46). Calnexin, a chaperone in the endoplasmic reticulum membrane, binds transiently to some glycoproteins and is required for their proper folding and assembly. It has been found to bind hemagglutinin nascent chains via oligosaccharides attached to the polypeptide (18). Another chaperone from the same cellular compartment, HSP 47, associates with nascent procollagen (47). Inactivation of HSP 47 leads to reduced synthesis of collagen and delay in its folding.
Folding Catalysts-Protein disulfide isomerase (PDI) has been shown to affect folding of disulfide-containing proteins, both in vivo and in vitro. PDI can be cross-linked to nascent polypeptides in vivo (48,49). Moreover, it has been demonstrated that PDI is essential for efficient cotranslational formation of disulfide bonds in a coupled translation/translocation system (50). Eucaryotic peptidylprolyl isomerase (PPI) residing in the endoplasmic reticulum forms transient complexes with translocating polypeptides (49). E. coli trigger factor, found in association with nascent polypeptides, possesses PPI activity (51,52), but there is no evidence yet regarding catalysis of prolyl bond isomerization of nascent polypeptides.
Ribosomes-Renaturation of some proteins is improved by the presence of ribosomes (53,54). The effect of ribosomes on protein refolding has been attributed to the large ribosomal subunit, specifically to its RNA, the 23 S and 28 S RNA of procaryotic and eucaryotic ribosomes, respectively (53,54). Domain V, which is involved in the peptidyltransferase center on the large ribosomal subunit, has been implicated in the effect of the RNA on protein renaturation. These observations raise the question whether ribosomes can play an active role in biosynthetic protein folding.

Kinetics and Pathway of Cotranslational Folding
An upper limit of the rate of cotranslational folding is imposed by the rate of polypeptide synthesis. For many proteins, as mentioned above, the C-terminal segment of 20-30 amino acid residues, which is sheltered by the ribosome prior to the release of the full-length polypeptide into the bulk solution, is essential for formation of the native, biologically active structure. Consequently, folding cannot be completed before release of the nascent polypeptide from the ribosome. The kinetics of folding of the polypeptide and ultimate appearance of the native form will be a function of the rates of polypeptide synthesis, folding of the full-length monomer, and for oligomeric proteins, subunit assembly. Manipulation of the conditions of protein expression can change the rate-limiting step of the folding/association process and, consequently, change the kinetics of the overall process. For oligomeric proteins, the concentration of newly synthesized monomers is critically important because the association reaction is a higher order, concentration-dependent process. Cotranslational folding of the bacterial luciferase β subunit is rate-limiting in the formation of the native αβ heterodimer when prefolded α subunit is available at a sufficiently high concentration (55). Coexpression of both subunits leads to much slower formation of the native enzyme, apparently because association becomes the rate-limiting step (56).
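The shift in the rate-limiting step described above can be illustrated with a toy kinetic model. The sketch below is a minimal illustration under stated assumptions, not the analysis used in the cited studies: subunits are taken to be released already competent for assembly at a constant rate, association is a single irreversible second-order step, and all rate constants and concentrations are invented for illustration.

```python
# Toy model: constant-rate synthesis of subunits plus second-order association.
#   d[a]/dt  = k_syn_a - k_assoc*[a]*[b]
#   d[b]/dt  = k_syn_b - k_assoc*[a]*[b]
#   d[ab]/dt = k_assoc*[a]*[b]
# ASSUMPTION: all parameter values are illustrative, not taken from the text.

def heterodimer_formed(a0, k_syn_a, k_syn_b=1e-11, k_assoc=1e5,
                       t_end=600.0, dt=0.05):
    """Forward-Euler integration; returns [ab] (molar) at time t_end (s)."""
    a, b, ab = a0, 0.0, 0.0
    for _ in range(round(t_end / dt)):
        v = k_assoc * a * b          # association flux, M/s
        a += (k_syn_a - v) * dt
        b += (k_syn_b - v) * dt
        ab += v * dt
    return ab


# Case 1: prefolded alpha supplied in large excess (1 uM), only beta synthesized.
prefolded = heterodimer_formed(a0=1e-6, k_syn_a=0.0)
# Case 2: both subunits coexpressed from zero at the same rate.
coexpressed = heterodimer_formed(a0=0.0, k_syn_a=1e-11)

print(f"prefolded alpha: {prefolded * 1e9:.2f} nM heterodimer after 600 s")
print(f"coexpression:    {coexpressed * 1e9:.2f} nM heterodimer after 600 s")
```

With these assumed values, nearly all of the newly made subunit is captured when its partner is present in excess, so production of the subunit limits the rate. With coexpression both concentrations start near zero, the second-order association step lags, and much less heterodimer has formed at the same time point.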
For many proteins for which folding events have been observed with nascent chains, cotranslational processes may contribute to the fast rate of biosynthetic folding. The rapid rates of biosynthetic folding cannot be achieved upon renaturation of denatured full-length polypeptides in the presence of chaperones and folding catalysts. The point of concern in the investigation of folding of ribosome-bound chains has been the possibility that rather slow folding events might occur during the time required for analysis of the ribosomal complexes. However, in several cases, late folding events which occur cotranslationally in vivo, or in vitro prior to analysis, have been observed. Biosynthetic folding seems to be much faster and more efficient than renaturation for several proteins (Refs. 8, 18, 20, 25, 29, 30, and references therein). Firefly luciferase (57) and hydroid obelin (footnote 2) fold much more efficiently during synthesis than during renaturation under the same conditions. Firefly luciferase also folds efficiently upon translocation into proteoliposomes depleted of chaperones (58). These observations imply a crucial role for vectorial folding of nascent chains.
The time course of biosynthetic folding relative to renaturation has been compared directly for bacterial luciferase (55), a cytoplasmic protein that contains no disulfide bonds. Isomerization of prolyl residues is not rate-limiting in its folding, at least in the refolding of full-length subunits. Association of α with β determines the overall rate of enzyme formation. The β subunit released from the ribosome associates with the α subunit much faster than does βi, which predominates in refolding experiments, suggesting that the structure of the β subunit when it is released from the ribosome is different from βi. Whereas in refolding experiments all molecules begin to refold at the same time upon dilution into native conditions, in the expression system, there is a steady-state rate of appearance of newly synthesized polypeptides, which necessitates careful analysis of the data (6,55). It was concluded that the β subunit produced by biosynthetic folding is a folding intermediate which is beyond a rate-limiting step encountered during refolding of the subunit (55).
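Why a steady-state supply of new chains calls for careful analysis can be seen by comparing the two experimental designs for one and the same first-order step. The sketch below is a simplified illustration, not the treatment used in the cited work: it assumes a single first-order conversion with rate constant k and, for the expression system, release of chains at a constant rate from time zero. The function names are hypothetical.

```python
import math

# Fraction of molecules past a first-order step (rate constant k, in 1/s).
# ASSUMPTION: one first-order step; chains released at a constant rate.

def refolding_fraction(k, t):
    """Synchronous start: every molecule begins at t = 0."""
    return 1.0 - math.exp(-k * t)


def expression_fraction(k, t):
    """Constant release from t = 0: fraction of all chains made so far
    that have completed the step, 1 - (1 - exp(-k*t)) / (k*t)."""
    if t <= 0:
        return 0.0
    return 1.0 - (1.0 - math.exp(-k * t)) / (k * t)


k = 0.1  # illustrative value only
for t in (10.0, 30.0, 100.0):
    print(f"t = {t:5.0f} s  refolding: {refolding_fraction(k, t):.3f}  "
          f"expression: {expression_fraction(k, t):.3f}")
```

For the same k the converted fraction in the expression system always trails the synchronous curve, because the youngest chains have had little time to react. Reading an apparent rate directly off the expression time course would therefore underestimate k unless the continuous appearance of new chains is accounted for.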
The significance of vectorial folding to the kinetics of polypeptide synthesis has been tested using permuted proteins. It has been established that some permuted polypeptides can fold and acquire native structure. However, as discussed above, the basic parameters that can be determined by cotranslational folding are kinetics and efficiency of the folding process. Indeed, analysis of the kinetics of refolding of permuted versions of ribonuclease T1 and α-spectrin SH3 domain revealed that the folding of these permuted sequences is significantly slower and the yield is lower than that of the wild-type (59,60). For proteins with the N and C termini distantly positioned on the surface, only slight rearrangements of N- and C-terminal secondary structural elements without disruption of the folded core of the protein result in bringing the termini into close proximity, suggesting that the final step in folding may involve binding of the termini to the surface of the folded core (61).
Statistical analysis of more than 200 protein structures has revealed the tendency that, within the length of polypeptide typical for a domain, residues tend to interact with the N-terminal portion of the polypeptide and that the N-terminal region is, on average, more compact than the C-terminal region (62). This observation is consistent with vectorial folding of nascent polypeptides beginning from the N terminus and proceeding to the C terminus.

Conclusions and Perspectives
In this review, we have tried to demonstrate that cotranslational folding is an essential component of biosynthetic folding of many proteins in cells. This stage of folding may be crucial for the overall kinetics and yield of folding. Cotranslational folding appears to be especially important for large multidomain and multisubunit proteins. Indeed, all proteins discussed in the review fall into this category. The ultimate goal of studies of cotranslational protein folding is to learn the details of the pathways involved. The kinetics of the folding process, the partitioning of polypeptides among alternative forms, and the yield of correctly folded protein are consequences of kinetic partitioning between alternative pathways. The basic differences between cotranslational folding and the refolding of the full-length polypeptide are: 1) vectorial appearance of the nascent polypeptide and subsequent vectorial folding which decreases the potential for nonproductive interactions and allows folding by consecutive parts; 2) isomerizations within the partially folded N-terminal segment of a polypeptide which occur concomitantly with the synthesis of the C-terminal segment of the polypeptide; 3) restricted diffusion and attachment of the nascent chain to the large ribosomal particle, which reduces the aggregation potential of the nascent polypeptides; 4) formation of disulfide bonds and proper prolyl isomer conformation, which may be catalyzed more efficiently prior to formation of structural intermediates in which the Cys and Pro residues are not accessible for the PDI and PPI.
Recently it has been suggested that eucaryotic proteins fold cotranslationally whereas procaryotic proteins fold posttranslationally and that some components of bacterial cells prevent folding of nascent polypeptide chains (63). Demonstration of cotranslational folding of procaryotic proteins in bacterial cells and extracts (1, 2, 9, 10-12, 55) and of folding of ribosome-bound eucaryotic proteins in bacterial systems (14-16, 22) clearly contradicts this proposal.
The involvement of chaperones in cotranslational folding of specific proteins may be the basis of the multiple effects apparently exerted by chaperones on the folding process. These include decreasing aggregation by binding aggregation-prone intermediates, potential unfolding of nonproductive folding intermediates, and assisting the polypeptide in overcoming energy barriers in the folding reaction (33,35). It remains to be ascertained which nascent polypeptides are targets for chaperones and folding catalysts and to learn the details of their action.
Both experimental and theoretical studies of protein refolding suggest that there is evolutionary pressure for proteins to fold fast (5,64,65). Folding of larger proteins generally involves smaller independent folding units (64,65). We believe that the evolutionary pressure for fast folding operates in the context of biosynthetic folding, including vectorial synthesis and concomitant folding of the nascent polypeptide chain, obviously not on refolding of the full-length polypeptide.