Reconciling the controversy regarding the functional importance of bullet- and football-shaped GroE complexes

The chaperonin GroEL and its co-chaperonin GroES form both GroEL–GroES bullet-shaped and GroEL–GroES2 football-shaped complexes. The residence time of protein substrates in the cavities of these complexes is about 10 and 1 s, respectively. There has been much controversy regarding which of these complexes is the main functional form. Here, we show using computational analysis that GroEL protein substrates have a bimodal distribution of folding times, which matches these residence times, thereby suggesting that both bullet-shaped and football-shaped complexes are functional. More generally, co-existing complexes with different stoichiometries are not mutually exclusive with respect to having a functional role and can complement each other.

The Escherichia coli GroE system assists protein folding in vivo and in vitro by an ATP-dependent mechanism (1). It comprises GroEL, a complex of two back-to-back stacked rings, each made up of seven identical subunits, and its cofactor GroES, which is a single homoheptameric ring. Binding of GroES to GroEL forms a cage in which substrate proteins can fold in isolation from bulk solution (1). Early EM and biochemical studies showed that GroEL and GroES are able to form both GroEL–GroES2 symmetric football-shaped (2–5) and GroEL–GroES asymmetric bullet-shaped (6,7) complexes. The crystal structure of the bullet complex was solved more than 2 decades ago (8), and, more recently, crystal structures of the football complex have also become available (9,10). The existence of both types of complexes is, thus, not in any dispute. Moreover, both types of complexes were found to co-exist under certain conditions (11). The main controversy has concerned the identity of the predominant functional form of the GroE machine. Some studies show that, in the presence of substrate protein, the relative populations of the asymmetric and symmetric complexes shift in favor of the latter species (12,13). By contrast, others have argued that the asymmetric complex is the functional form and that the symmetric species accumulates only in the presence of unfoldable protein substrates (14). Given that GroEL has become a paradigm for molecular machines, in general, and chaperonins, in particular, resolving the controversy regarding the identity of its main functional form is important and has been the subject of reviews (e.g. see Refs. 11 and 15). It is also of broad interest because other protein complexes with varying subunit stoichiometries are known to co-exist in vitro and in vivo (16). An example is the 20S core particle of the proteasome, which forms both 1:1 and 1:2 complexes with the 19S lid complex (17).
The debate regarding the functional roles of the bullet- and football-like complexes is of importance for understanding GroE's mechanism of action, in part, because it bears on the residence time of substrate proteins encapsulated in its cavity. In the case of the asymmetric complex, the residence time is estimated to be 10–15 s (18,19), whereas, in the case of the symmetric complex, it is only about 1 s (13). Short residence times are consistent with the iterative annealing model (20) according to which the role of encapsulation is to subject the substrate protein to forced unfolding upon ATP and GroES-promoted conformational changes. According to the iterative annealing model, GroEL assists folding by unfolding misfolded proteins, thereby providing them with further opportunity to fold correctly inside or outside the cavity. The short residence times in the symmetric complex would, therefore, result in more iterations and higher folding yields. By contrast, long residence times are more consistent with a model according to which encapsulation favors folding owing to confinement and properties of the cavity walls and the cavity-confined water (1). Given that GroEL has evolved to facilitate the folding of a range of protein substrates (21,22) with different properties (23), we consider here the possibility that symmetric and asymmetric complexes specialize in assisted folding of different protein substrate subsets and that both are, therefore, functional forms.

Results and discussion
Assuming that slow folders might benefit from longer encapsulation times, we hypothesized that GroEL substrates have a bimodal distribution of folding rates (i.e. times) in accord with the different residence times in the two GroE complexes. We first compared the distribution of folding rates of the 57 obligatory GroEL substrates (21,22) with that of all of the cytosolic proteins in E. coli. Given that many of the relevant three-dimensional structures are not available, the folding rate ($k_F$) of each protein was calculated from its length and predicted secondary structure (24) as described before (25). The distribution of folding rates of the obligatory GroEL substrates is indeed bimodal (Fig. 1A), whereas that of all of the ~1600 cytosolic proteins is unimodal (Fig. 1B). The modality of these distributions was confirmed using the Bayesian information criterion (26). Strikingly, the ratio of the folding times ($t = \ln 2/k_F$) corresponding to the two maxima of the bimodal distribution closely matches the ratio of experimentally determined residence times in the bullet and football forms ($\log(k_{\max 1}/k_{\max 2}) = \log(t_2/t_1) = -1.95 + 0.82 = -1.13$ (i.e. $t_2/t_1 = 0.07$), whereas the ratio of estimated residence times (13,18,19) is between 1/10 (i.e. 0.1) and 1/15 (i.e. 0.07)).
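The arithmetic behind this comparison can be retraced with a short sketch. It assumes only the two peak positions quoted above (log k_F of -1.95 and -0.82) and the relation t = ln2/k_F; the function and constant names are illustrative and do not come from the original analysis.

```python
import math

# Peak positions of the bimodal distribution, as quoted in the text.
LOG_K_MAX1 = -1.95  # slower-folding mode
LOG_K_MAX2 = -0.82  # faster-folding mode


def folding_time(log10_k: float) -> float:
    """Folding half-time t = ln2 / k_F for a rate given as log10(k_F)."""
    return math.log(2) / (10.0 ** log10_k)


def time_ratio(log10_k_slow: float, log10_k_fast: float) -> float:
    """Ratio t_fast / t_slow; the ln2 factors and the rate units cancel."""
    return folding_time(log10_k_fast) / folding_time(log10_k_slow)


ratio = time_ratio(LOG_K_MAX1, LOG_K_MAX2)

# Range of residence-time ratios quoted in the text (about 1 s vs. 10-15 s).
LOWER, UPPER = 1 / 15, 1 / 10

print(f"t2/t1 = {ratio:.3f}; residence-time ratio range = {LOWER:.3f} to {UPPER:.3f}")
```

Because only the ratio of the two times is compared, the result does not depend on the time unit in which k_F is expressed.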
Next, we analyzed the distributions of folding rates of homologs of the 57 obligatory GroEL substrates in mollicutes (27), a class of bacteria that includes both organisms with a chaperonin system and (the only known) organisms without a chaperonin system. The distribution of folding rates of the 155 homologs of GroEL substrates in mollicutes with a chaperonin system was also found to be bimodal (Fig. 2A). By contrast, the distribution of folding rates of the 387 homologs of GroEL substrates in mollicutes without a chaperonin system was found to be multimodal (Fig. 2B). These results support our proposal that bimodality in the distribution of folding rates is a property of GroEL substrates that is linked to the mechanism of their GroE dependence.
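The modality assignments for these distributions rest on comparing models with different numbers of modes by the Bayesian information criterion. The sketch below illustrates such a comparison under explicit assumptions: it uses the common least-squares form BIC = n ln(RSS/n) + k ln(n), which may differ from the exact expression used in the study, it assumes three parameters per mode (amplitude, position, width), and the RSS values are hypothetical numbers chosen for illustration rather than results from the paper.

```python
import math


def bic_least_squares(n: int, rss: float, k: int) -> float:
    """BIC for a least-squares fit, assumed form: n*ln(RSS/n) + k*ln(n).

    n   -- number of data points (e.g. histogram bins)
    rss -- residual sum of squares between model and data
    k   -- number of free parameters in the model
    """
    if n <= 0 or rss <= 0:
        raise ValueError("n and rss must be positive")
    return n * math.log(rss / n) + k * math.log(n)


def delta_bic(n: int, rss_i: float, k_i: int, rss_j: float, k_j: int) -> float:
    """Delta BIC(i,j) = BIC_i - BIC_j; a positive value favors model j."""
    return bic_least_squares(n, rss_i, k_i) - bic_least_squares(n, rss_j, k_j)


# Hypothetical example: 20 histogram bins, one-mode fit (3 parameters)
# versus two-mode fit (6 parameters). RSS values are invented.
N_BINS = 20
d12 = delta_bic(N_BINS, rss_i=4.0, k_i=3, rss_j=1.0, k_j=6)

verdict = "bimodal" if d12 > 0 else "unimodal"
print(f"delta BIC(1,2) = {d12:.1f} -> {verdict} model preferred")
```

The k ln(n) term penalizes the extra parameters of the model with more modes, so that model is preferred only when it reduces the residuals enough to outweigh the penalty.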
In summary, there is compelling biochemical evidence that GroE footballs are functional. It is not clear, however, why inter-ring negative cooperativity (28), which is responsible for the presence of the bullet form, has evolved if the only functional form is the football. Our data suggest that both forms are functional with respect to substrates with different folding rates. It is also possible, however, that both forms are functional under different conditions. It has been shown, for example, that formation of footballs is favored under high ATP concentrations, but the ATP concentration in E. coli cells varies widely and was reported to be 1.55 ± 1.22 mM (29). Consequently, the asymmetric species may also be the main functional form at low ATP concentrations. In general, co-existing complexes can complement each other and need not be mutually exclusive in having a functional role.

Materials and methods
The sequences of E. coli (K12 strain, organism ID 83333) proteins were extracted from UniProt. The number of cytosolic proteins, which were defined as such if they did not include the keyword "membrane" in their UniProt accession, was found to be 1579. The list of GroEL homologs from mollicutes was taken from Ref. 27. PSIPRED (24) was used to predict the secondary structure of all of the proteins in this study. Folding rates were calculated using the equation $\log_{10}(k_F) = 10.7 - 16.6\,(l_{\mathrm{eff}}^{0.1} - 1)$, where $l_{\mathrm{eff}}$ is the effective protein length (25). The experimentally determined (30) folding rate of the GroEL substrate PepQ, for example, is 0.039 min$^{-1}$ ($\log_{10}(k_F) = -3.23$), in good agreement with the calculated rate of 0.026 min$^{-1}$ ($\log_{10}(k_F) = -3.36$).

Figure 1. … where n is the number of data points, RSS is the residual sum of squares between the model and the data, and k is the number of parameters in the model. The two models were compared using $\Delta\mathrm{BIC}_{i,j} = \mathrm{BIC}_i - \mathrm{BIC}_j$, where i and j represent the number of modes in the model. Here, we found $\Delta\mathrm{BIC}_{1,2} = 16$, indicating $\mathrm{BIC}_2 < \mathrm{BIC}_1$ and, therefore, that the bimodal distribution describes the data significantly better (lower BIC score is better). B, distribution of the folding rates of all E. coli cytosolic proteins is unimodal, with a maximum value of $\log(k_F) = -0.72$. Here, $\Delta\mathrm{BIC}_{1,2} = -22$, thereby indicating that the unimodal distribution describes the data significantly better.

Figure 2. Distributions of folding rates of homologs of the GroEL substrates in mollicutes with and without a chaperonin system. A, distribution of folding rates of 155 homologs of GroEL substrates in mollicutes with a chaperonin system. For this class, $\Delta\mathrm{BIC}_{1,2} = 15$, indicating that the bimodal distribution describes the data better than a unimodal distribution, as found for the GroEL substrates in E. coli. B, distribution of folding rates of 387 homologs of GroEL substrates in mollicutes without a chaperonin system. For this class, $\Delta\mathrm{BIC}_{1,2} = 6$ and $\Delta\mathrm{BIC}_{2,3} = 28$, indicating that a trimodal distribution describes the data significantly better than unimodal or bimodal distributions.

ACCELERATED COMMUNICATION: Bullet and football GroE complexes
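The folding-rate equation given in Materials and methods can be sketched as a pair of small functions. This assumes the reading log10(k_F) = 10.7 - 16.6(l_eff^0.1 - 1) of the equation as printed; the effective length in the usage line is a hypothetical input, not a value from the study, and the function names are illustrative.

```python
def log10_folding_rate(l_eff: float) -> float:
    """log10(k_F) from effective protein length, assuming the equation
    log10(k_F) = 10.7 - 16.6 * (l_eff**0.1 - 1)."""
    if l_eff <= 0:
        raise ValueError("effective length must be positive")
    return 10.7 - 16.6 * (l_eff ** 0.1 - 1.0)


def effective_length(log10_k: float) -> float:
    """Inverse of log10_folding_rate: effective length for a given log10(k_F)."""
    return ((10.7 - log10_k) / 16.6 + 1.0) ** 10


# Hypothetical effective length, chosen only to exercise the functions.
example_l_eff = 200.0
example_log_k = log10_folding_rate(example_l_eff)
print(f"l_eff = {example_l_eff:.0f} -> log10(k_F) = {example_log_k:.2f}")
```

Under this form the predicted rate decreases monotonically with effective length, so longer effective chains map to the slow-folding end of the distributions.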