Pseudo-T-even Bacteriophage RB49 Encodes CocO, a Cochaperonin for GroEL, Which Can Substitute for Escherichia coli's GroES and Bacteriophage T4's Gp31*

Bacteriophage T4-encoded Gp31 is a functional ortholog of the Escherichia coli GroES cochaperonin protein. Both of these proteins form transient, productive complexes with the GroEL chaperonin, required for protein folding and other related functions in the cell. However, Gp31 is specifically required, in conjunction with GroEL, for the correct folding of Gp23, the major capsid protein of T4. To better understand the interaction between GroEL and its cochaperonin cognates, we determined whether the so-called “pseudo-T-even bacteriophages” are dependent on host GroEL function and whether they also encode their own cochaperonin. Here, we report the isolation of an allele-specific mutation of bacteriophage RB49, called ε22, which permits growth on the E. coli groEL44 mutant but not on the isogenic wild type host. RB49 ε22 was used in marker rescue experiments to identify the corresponding wild type gene, which we have named cocO (cochaperonin cognate). CocO has extremely limited identity to GroES but is 34% identical and 55% similar at the protein sequence level to T4 Gp31, sharing all of the structural features of Gp31 that distinguish it from GroES. CocO can substitute for Gp31 in T4 growth and also suppresses the temperature-sensitive phenotype of the E. coli groES42 mutant. CocO's predicted mobile loop is one residue longer than that of Gp31, with the ε22 mutation resulting in a Q36R substitution in this extra residue. Both the CocO wild type and ε22 proteins have been purified and shown in vitro to assist GroEL in the refolding of denatured citrate synthase.

The Escherichia coli GroEL and GroES proteins form a complex, known as the GroE chaperone machine, that is crucial for protein folding and related functions (reviewed in Refs. 1-4). The GroEL and GroES proteins are often referred to as chaperonin and cochaperonin, respectively (5). Both GroES and GroEL have been shown to be essential for E. coli viability under all conditions tested (6). Extensive structural analysis has revealed that both proteins are organized into rings with a 7-fold rotational axis (7,8). GroEL is arranged in two head-to-head stacked rings of seven subunits in whose central cavities various substrate proteins can bind, primarily through hydrophobic interactions (Ref. 9; reviewed in Refs. 2 and 4). GroES is a heptameric ring of 10.5-kDa subunits from which mobile loops extend and interact with GroEL (10,11).
A typical GroEL substrate is first captured in either of the GroEL central cavities, and the subsequent binding of ATP and GroES to the same ring (referred to as the cis ring) results in the formation of a dome-like structure over the substrate, which is now released into the GroEL cis cavity. The subsequent binding of ATP and substrate (and potentially GroES as well) in the trans GroEL ring results in the release of GroES from the cis ring (4,12). If the substrate in the cis ring is properly folded, or in a form that no longer binds GroEL, it is released into the medium, and this GroEL ring will become the new trans ring in a new GroEL folding cycle. Houry et al. (13) have shown, by using coimmunoprecipitation experiments, that ~10% of E. coli's polypeptides bind transiently to GroEL. Some of these proteins are encoded by essential genes, in agreement with the genetic observation that the groES and groEL genes are both essential for E. coli growth (6).
The bacteriophage T4, which is dependent on the host GroEL function for folding of its major capsid protein, Gp23, encodes its own GroES ortholog, termed Gp31. There is very little sequence similarity between GroES and Gp31 (14,15), although there is a limited conservation in certain regions, most notably the mobile loop that becomes immobilized upon binding to GroEL (10,16,17). The functional importance of the mobile loop was first shown by sequencing mutations in groES that blocked bacteriophage λ growth (λ requires both GroEL and GroES host functions for growth (10,18)). Similarly, T4 gene 31 mutations, which allow T4 to propagate on certain E. coli groEL mutants, also affect the Gp31 mobile loop (14,19,20).
Comparative studies of GroES and Gp31 have shown that while Gp31 can substitute for GroES, the reverse is not true; i.e. Gp31 can suppress the temperature-sensitive phenotype of the E. coli groES42 mutant and can participate in the in vitro folding of various substrates, but GroES is unable to substitute for Gp31 in T4 bacteriophage assembly, even when overproduced (21). This may not be totally surprising, since it is likely that Gp31 evolved as a specialized ortholog of GroES that specifically folds the Gp23 substrate. Gp23 is 56 kDa, near the maximum limit allowed by the "Anfinsen cage" formed under the dome of the GroEL-GroES complex (reviewed in Refs. 3 and 4). Structural studies suggest that the GroEL-Gp31 complex may result in the formation of a larger cavity under the dome of the Gp31 heptamer due to a longer mobile loop, the lack of the roof loop present in GroES, and lack of a conserved aromatic residue, which extends into the cavity of GroES (22,23).
Recently, a large number of bacteriophages with T4-like morphology have been characterized according to their sequence homology with the classical T-even bacteriophages, T2, T4, and T6 (24). The vast majority of these bacteriophages are very closely related to T-evens at the DNA sequence level. However, a few of them have DNA sequences that differ significantly, and these have been termed "pseudo-T-even bacteriophages" (25). We investigated the genomes of the pseudo-T-even bacteriophages to determine whether they possess a gene encoding a homolog of either GroES or Gp31. Here, we report our findings on the bacteriophage RB49 gene 31 homolog, which we have named cocO (cochaperonin cognate).

EXPERIMENTAL PROCEDURES
Strains-The various strains, bacteriophages, and plasmids used in this study are listed in Table I.
Media-Bacteria were grown in Luria-Bertani broth (LB; 10 g of tryptone, 5 g of NaCl, 5 g of yeast extract per liter, pH 7) or on LB-agar (10 g of agar/liter of LB). Top agar (6 g of agar/liter of LB) was used for seeding lawns with cells and/or bacteriophage. Ampicillin and tetracycline were added to the media at final concentrations of 100 or 15 μg/ml, respectively, when necessary. TSG buffer (0.01 M Tris-HCl, pH 7.4-7.5, 0.15 M NaCl, 0.03% gelatin) was used for making serial dilutions of bacteriophage and for storing concentrated bacteriophage stocks.
DNA Library Construction-One liter of an RB49 lysate (2 × 10⁹ pfu/ml) was concentrated by polyethylene glycol precipitation (34), followed by CsCl banding on a step gradient. The concentrated bacteriophage was dialyzed for 30 min in 500 ml of 3 M NaCl, 0.1 M Tris, pH 7.4, followed by two 30-min dialyses in 500 ml of 0.3 M NaCl, 0.1 M Tris, pH 7.4. The dialyzed bacteriophage was treated with proteinase K, followed by phenol/chloroform extraction (34). The DNA was partially digested using Sau3A (New England Biolabs), and fragments of 1.5-5 kb were purified from an agarose gel. The fragments were ligated to the vector pMPMA4 (a derivative of pMPMA4Ω (32) in which digestion by XbaI was used to remove the Ω fragment encoding spectinomycin resistance), which had been digested with BglII. The ligation reaction was transformed into competent DA1368 (MC1061 endA::tetR)¹ bacteria, and transformants were selected on LB-agar-ampicillin plates at 37°C.
Marker Rescue-Cells growing exponentially at a density of A595 = 0.1 were infected with the appropriate bacteriophage at a multiplicity of infection of 0.02 bacteriophage/bacterium. The cultures were aerated at 37°C until lysed, ~2 h. Chloroform was added to ensure lysis and the killing of all remaining cells. To check for wild type recombinants, 0.1 ml of the lysate was mixed with 0.3 ml of a saturated culture of groEL140 cells and 3 ml of molten top agar and then poured onto an LB-agar plate. The plates were incubated at 37°C overnight to allow plaque formation.
Suppression of the groES42 Temperature-sensitive Phenotype-E. coli groES42 cells were transformed with either the pcocO⁺34-2, p31⁺, or pMPMA4 parental vector plasmids. (p31⁺ is a plasmid that complements the T4 31amNG71 mutant, which has an amber (nonsense) mutation in gene 31. The plasmid was isolated from a T4 genomic library cloned into the vector pBAD18 (33).) Transformants were selected for ampicillin resistance at 30°C. After overnight growth, two colonies from each transformation were resuspended in 100 μl of LB broth each. Three 5-fold dilutions were made of each culture. Ten μl of each dilution and the original suspension were spotted on LB-agar-ampicillin plates. The plates were incubated at 30 or 43°C, and bacterial colony-forming ability was monitored following overnight incubation.
Cloning of the Minimal cocO Gene for Overexpression-The cocO 5′RI/NdeI and cocO 3′ XbaI primers were used to polymerase chain reaction-amplify the minimal cocO gene from wild type RB49 bacteriophage. The product was cloned into the EcoRI and XbaI sites of the pMPM201 vector under the regulation of the PBAD promoter as previously described (32). The resulting plasmid was named pFScocO⁺. The cocOε22 allele was cloned in the same manner, using RB49 cocOε22 DNA as template, resulting in plasmid pFScocOε22.
Purification of the CocO Protein-Six 1.5-liter cultures inoculated with DA1368 bacteria containing plasmid pFScocO⁺ were grown in LB plus 100 μg/ml ampicillin at 37°C. Synthesis of CocO was induced at A600 ~0.6 with 0.02% arabinose for 4 h. Cells were harvested by centrifugation and washed once in buffer containing 50 mM Tris, pH 7.5, 1 mM EDTA, 5 mM β-mercaptoethanol, and 15% (v/v) glycerol. The pellet was stored at -20°C and gently thawed before lysis. Cells were resuspended in buffer A (10 mM Tris, pH 7.7, 100 mM NaCl, 1 mM EDTA, 1 mM β-mercaptoethanol) and lysed by two passes through a French press at 1,000 p.s.i. The following steps are adapted from the purification procedure used for Gp31 (21). After removal of the debris by low speed centrifugation (10 min at 10,000 × g), followed by ultracentrifugation for 60 min at 145,000 × g, the supernatant was fractionated by a streptomycin sulfate (5% of 25% (w/v)) precipitation. The CocO-containing supernatant was further fractionated by a 35% (w/v) ammonium sulfate precipitation. Again, CocO remained in the supernatant. CocO was dialyzed into buffer B (20 mM Tris, pH 7.7, 1 mM EDTA, 1 mM β-mercaptoethanol) and applied to a Q-Sepharose column, which binds acidic proteins. Because of its neutral pI, CocO eluted in the flowthrough.
After a 70% (w/v) ammonium sulfate precipitation, CocO was resuspended and dialyzed into buffer C (20 mM sodium phosphate, pH 6.8, 5 mM β-mercaptoethanol, 1 mM EDTA) and loaded onto a hydroxyapatite column to which a 20 mM to 0.5 M sodium phosphate, pH 6.8, gradient was applied. CocO eluted at ~0.2 M sodium phosphate. After exchanging into buffer D (100 mM Tris, pH 7.5, 1 mM EDTA, 1 mM β-mercaptoethanol), final impurities were removed by gel filtration (Hiload 16/60 Superdex G200 on a Waters HPLC). CocO eluted at 16 min, suggesting an approximate native molecular mass of 80,000 Da.
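The gel filtration estimate can be compared with the monomer mass predicted from the cocO sequence (11,732 Da, reported under "Results") to see which ring size it is most consistent with. The short Python sketch below does only that arithmetic; the function name is illustrative, and the calculation assumes a homo-oligomer with no bound ligands.

```python
# Which oligomeric state best matches the gel filtration estimate?
# Assumption: CocO forms a homo-oligomer, so native mass = n * monomer mass.

MONOMER_DA = 11_732          # predicted from the cocO sequence
NATIVE_ESTIMATE_DA = 80_000  # approximate mass from gel filtration


def nearest_subunit_count(native_da: float, monomer_da: float) -> int:
    """Return the whole number of subunits closest to native_da / monomer_da."""
    return round(native_da / monomer_da)


n = nearest_subunit_count(NATIVE_ESTIMATE_DA, MONOMER_DA)
expected_da = n * MONOMER_DA
print(f"closest oligomer: {n}-mer, calculated mass {expected_da} Da")
```

A ring of seven subunits has a calculated mass of 82,124 Da, within a few percent of the gel filtration estimate.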
Purification of the mutant CocO⑀22 protein was executed following its overproduction from plasmid pFScocO⑀22, using the exact steps described for the wild type protein.
Refolding of Citrate Synthase-The procedure for chaperonin-dependent renaturation of pig heart citrate synthase has been described previously (35). Briefly, citrate synthase at 33 μM was denatured in 6 M guanidine hydrochloride, 3 mM dithiothreitol, and 2 mM EDTA for 30 min at 25°C. Denatured citrate synthase was diluted to 0.2 μM into a renaturation mix containing 20 mM potassium phosphate, pH 7.4, 10 mM MgCl2, 2 mM ATP, 1 mM oxaloacetic acid and various combinations of the chaperonins at 4.2 μM (concentration given for monomers) as indicated under "Results." The refolding reactions were performed at 27°C in a total volume of 400 μl, and citrate synthase activity was measured after 60 min. The enzyme activity was assayed by measuring the decrease in absorption at 232 nm due to the cleavage of acetyl CoA and the utilization of oxaloacetic acid.
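As a worked check of the refolding mix, the following Python sketch computes the volume of denatured citrate synthase needed per reaction and converts the chaperonin monomer concentration to oligomer concentrations. It reads the citrate synthase and chaperonin concentrations as micromolar and the reaction volume as microliters, and it assumes (not stated in this section) that GroEL is counted as a 14-subunit double ring and the cochaperonin as a 7-subunit ring.

```python
# Dilution and stoichiometry for one refolding reaction.
# Assumptions: concentrations in micromolar, volume in microliters,
# GroEL = 14 subunits per particle, cochaperonin = 7 subunits per ring.

def stock_volume_ul(stock_um: float, final_um: float, total_ul: float) -> float:
    """Volume of stock giving final_um in total_ul (C1 * V1 = C2 * V2)."""
    return final_um * total_ul / stock_um


STOCK_CS_UM = 33.0            # denatured citrate synthase
FINAL_CS_UM = 0.2             # citrate synthase in the renaturation mix
REACTION_UL = 400.0           # total reaction volume
CHAPERONIN_MONOMER_UM = 4.2   # concentration given for monomers

cs_volume_ul = stock_volume_ul(STOCK_CS_UM, FINAL_CS_UM, REACTION_UL)
fold_dilution = STOCK_CS_UM / FINAL_CS_UM
groel_14mer_um = CHAPERONIN_MONOMER_UM / 14
cochaperonin_7mer_um = CHAPERONIN_MONOMER_UM / 7

print(f"denatured stock per reaction: {cs_volume_ul:.2f} ul")
print(f"dilution factor: {fold_dilution:.0f}-fold")
print(f"GroEL 14-mer: {groel_14mer_um:.2f} uM")
print(f"cochaperonin 7-mer: {cochaperonin_7mer_um:.2f} uM")
```

Under these assumptions, about 2.4 μl of denatured enzyme goes into each 400-μl reaction (a 165-fold dilution), and the GroEL particle is in slight molar excess over citrate synthase.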

RESULTS
Dependence of RB49 on groEL-Both polymerase chain reaction amplification and DNA hybridization experiments with the pseudo-T-even bacteriophage RB49 suggested that this bacteriophage has no gene 31 homolog with significant DNA sequence similarity to the T4 gene (data not shown). Thus, we determined if RB49 growth is dependent on the host GroEL and GroES functions. We found that, like bacteriophage T4, bacteriophage RB49 is totally dependent on the E. coli GroEL function for successful completion of its life cycle but is not dependent on GroES function. Table II shows that wild type RB49, which plates with an efficiency of 1.0 on the wild type host, does not form plaques on either the E. coli groEL44 or groEL515 mutant hosts but does propagate on the groEL140 and groES42 hosts. Since bacteriophage T4 plaques normally on groEL515, it appears that RB49 is more sensitive to the in vivo biological effects of this mutation.
Isolation of RB49 Mutations in the Putative Gene 31 Homolog-In analogy with the isolation of the bacteriophage T4 31ε1 mutant (23,31), spontaneous mutants of RB49 can be isolated at a frequency of ~5 × 10⁻⁸ that grow on groEL44 mutant bacteria. One such isolate, RB49 ε22, forms wild type-size plaques on the E. coli groEL44 mutant host at 37°C (Table II). It was found that RB49 ε22 cannot form plaques on either wild type, groEL140, or groES42 hosts at 37°C; however, it does form plaques on these same hosts at 42°C. For comparison, we include the T4 mutant T4 31ε1 in Table II, which was originally isolated as a spontaneous plaque former on groEL44 (31). Previous analysis has shown that the T4 31ε1 mutation results in an L35I substitution in the middle of the mobile loop of Gp31 (14,19,23).
We obtained further evidence for a Gp31 homolog in RB49 by showing that the T4 gene 31 product complements the ε22 mutation (Table III), allowing bacteriophage RB49 ε22 growth on the restricting E. coli wild type host at 37°C. In contrast, a plasmid overexpressing both the E. coli groEL and groES wild type genes does not complement the ε22 mutation (data not shown). Earlier results had shown that overproduction of groES also does not complement the T4 31ε1 defect (21).
Cloning and Genetic Properties of cocO-To identify the RB49 gene where the ε22 mutation resides, we constructed an RB49 DNA library in the multicopy plasmid pMPMA4. Although the construction of the RB49 DNA library was initiated with ~1.5-5.0-kb-long DNA fragments, the resulting library was enriched in small DNA insert fragments, probably because of the toxicity of many of the RB49 DNA sequences in E. coli.
Each of the indicated bacterial strains was seeded in molten top agar on LB-agar plates. Stocks of the four indicated bacteriophage strains were serially diluted 10-fold, and 5 μl of each dilution was spot-tested on each lawn. The plates were incubated overnight at 37 or 42°C, and the plaques were counted. The approximate efficiency of plating of each bacteriophage strain on a particular bacterial host is shown by either a plus sign, indicating an efficiency of plating of approximately 1.0, or a minus sign, indicating an efficiency of plating of less than 10⁻⁶.

We transformed the entire library into the E. coli groEL44 mutant, which is permissive for RB49 ε22, and used this library in a marker rescue scheme. Since it was unclear whether the wild type clone corresponding to ε22 would be underrepresented in our RB49 DNA library, we started with an initial pool of ~10⁵ independent groEL44 transformants obtained with plasmid DNA purified from the original amplified library (itself consisting of ~10⁶ independent clones). This pool was divided into 45 groups each containing ~2,000 transformants and allowed to amplify. RB49 ε22 was grown on these 45 groups of transformants, and 0.1 ml of each lysate was plated on the nonpermissive groEL140 mutant strain at 37°C (Table II) to detect the presence of RB49 wild type recombinants or revertants. Whereas most of the transformant pools gave fewer than 20 plaques per plate, five of them gave significantly higher numbers, ranging from 50 to 200. The marker rescue scheme was repeated (five independent lysates each) for two of these pools, confirming that the elevated number of plaque formers (putative wild type recombinants) was not due to statistical fluctuation. One of these pools was chosen for further screening and, by repeating the marker rescue experiment with progressively smaller pools of transformants (250, 10, and finally individual transformants), we progressively obtained higher frequencies of RB49 wild type recombinants. Thus, we were able to eventually isolate a single transformant carrying a plasmid (pcocO⁺34-2) encoding the RB49 wild type allele that corresponds to the ε22 mutation and which we have named cocO (cochaperonin cognate).
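The logic of the pooled marker rescue screen (test large pools, keep one that scores above background, subdivide, and repeat) can be illustrated with a small simulation. The Python sketch below is a toy model only: the function name, the number of positive clones, and the yes/no scoring are assumptions made for illustration, whereas the actual screen scored plaque counts against a background of revertants.

```python
import random

# Toy model of narrowing a clone library to one positive clone by
# testing successively smaller pools. All names are hypothetical.

def sib_select(clones, is_positive, pool_sizes, rng):
    """Return one final pool that still contains a positive clone."""
    current = list(clones)
    for size in pool_sizes:
        rng.shuffle(current)
        pools = [current[i:i + size] for i in range(0, len(current), size)]
        # Stand-in for "this pool's lysate gives plaques well above background".
        current = next(p for p in pools if any(is_positive(c) for c in p))
    return current


rng = random.Random(0)
n_transformants = 45 * 2_000  # 45 groups of ~2,000 clones, i.e. ~10^5
# Assumed for illustration: five clones in the library carry the gene.
positives = set(rng.sample(range(n_transformants), 5))

winner = sib_select(range(n_transformants), positives.__contains__,
                    pool_sizes=(2_000, 250, 10, 1), rng=rng)
print(len(winner), winner[0] in positives)  # prints "1 True"
```

Each round reduces the number of candidates by the ratio of successive pool sizes, so four rounds of plating are enough to go from roughly 90,000 transformants to a single clone.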
Besides the ability to recombine with the ε22 mutation, the following experiments strongly suggested that the pcocO⁺34-2 clone carries a bona fide Gp31 homolog. First, the presence of pcocO⁺34-2 enables both the T4 31amNG71, which carries an amber (nonsense) mutation in gene 31, and RB49 cocOε22 mutant bacteriophages to propagate on their otherwise restrictive wild type host, CG3014 sup⁺, which lacks a nonsense suppressor. Second, we could show that plasmid pcocO⁺34-2 suppresses the temperature-sensitive phenotype of the E. coli groES42 mutant, as was previously shown with the cloned T4 gene 31 (Fig. 1; Ref. 21).
Isolation of an Amber Mutation in the cocO Gene-Following the successful cloning of the cocO⁺ gene, plasmid pcocO⁺34-2 was transformed into CG3014 sup⁺ bacteria. We tested the resulting CG3014 (pcocO⁺34-2) strain for permissiveness toward a collection of RB49 amber mutants that were isolated on the basis of propagating on CR63 supD bacteria but not on the CG3014 sup⁺ host. We found that of all the RB49 amber mutants tested, only RB49 amE45 was capable of growth on the CG3014 (pcocO⁺34-2) bacteria. Thus, it was renamed RB49 cocOamE45. Not only did the pcocO⁺34-2 plasmid enable normal growth of RB49 cocOamE45 (Table III), but 1% of the progeny were wild type RB49 recombinants. These results, taken together with those described above, genetically demonstrate that the pcocO⁺34-2 clone indeed carries the wild type gene that corresponds to the RB49 ε22 and amE45 alleles.
The Sequence and Predicted Structure of the CocO Protein-The predicted sequence of the RB49 cocO gene product is shown in Fig. 2. It encodes a 107-amino acid residue protein with a predicted molecular mass of 11,732 Da and a theoretical pI of 6.84. It is 34% identical and 55% similar at the amino acid sequence level to T4 Gp31 (36). The ε22 mutation is a transition mutation, resulting in the change of a glutamine codon (CAA) to arginine (CGA) at amino acid position 36 (Q36R). Neither Gp31 nor GroES has this particular residue in its mobile loop. In analogy with Gp31, the predicted mobile loop of CocO is longer than that of GroES and contains the highly conserved glycine and the three consecutive hydrophobic residues found at positions 32-34 in CocO (Fig. 2). Similar to Gp31, CocO encodes neither the roof loop (composed of residues 48-55 in GroES) nor the tyrosine at position 71 of GroES, both highly conserved features among the other GroES cochaperonin homologs (22,23). Crystallography of T4 Gp31 has led to the suggestion that these modifications may result in a larger Anfinsen cage in the GroEL-Gp31 complex (22). Perhaps this allows a better accommodation of the 56-kDa Gp23 capsid protein, which is at the upper limit of the size permitted by the GroEL-GroES complex.
Alignment of the GroES, Gp31, and CocO amino acid sequences reveals a stretch of 12 amino acid residues (residues 79-90 in CocO) conserved between the two bacteriophage-encoded cochaperonins but absent in GroES and the rest of its cochaperonin homologs. In the three-dimensional structure of Gp31, this region constitutes an extra loop extending down from the external face of the Gp31 dome (22).
Sequencing of the RB49 cocOamE45 mutation showed that it is localized in codon 100, being a CAG to TAG transition mutation. Thus, the corresponding CocOamE45 protein should be eight amino acids shorter than wild type CocO when expressed in the CG3014 sup⁺ (nonsuppressing) bacterial host. The lethal phenotype of this mutation clearly indicates that the last eight amino acids of CocO are important for its correct functioning and/or assembly.
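The two sequenced cocO mutations can be checked with a few lines of Python. This is a minimal sketch: the codon table is deliberately restricted to the four codons mentioned in the text, and the helper names are illustrative.

```python
# Classify the two cocO point mutations and compute the amber truncation.
# The codon table covers only the codons discussed in the text.

CODON_TABLE = {"CAA": "Q", "CGA": "R", "CAG": "Q", "TAG": "*"}  # * = stop
PURINES = {"A", "G"}


def describe_point_mutation(wild: str, mutant: str) -> str:
    """Return the amino acid change and whether it is a transition."""
    diffs = [(w, m) for w, m in zip(wild, mutant) if w != m]
    if len(wild) != 3 or len(mutant) != 3 or len(diffs) != 1:
        raise ValueError("expected two codons differing at exactly one base")
    w, m = diffs[0]
    # A transition keeps the base class (purine<->purine or pyrimidine<->pyrimidine).
    kind = "transition" if (w in PURINES) == (m in PURINES) else "transversion"
    return f"{CODON_TABLE[wild]}->{CODON_TABLE[mutant]} ({kind})"


def truncated_length(stop_codon_position: int) -> int:
    """Residues translated before a stop codon at the given codon position."""
    return stop_codon_position - 1


FULL_LENGTH = 107  # residues in wild type CocO

print(describe_point_mutation("CAA", "CGA"))  # Q->R (transition)
print(describe_point_mutation("CAG", "TAG"))  # Q->* (transition)
print(FULL_LENGTH - truncated_length(100))    # 8
```

A stop at codon 100 leaves 99 residues, which is the eight-residue shortfall relative to the 107-residue wild type protein.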
Purification of CocO Protein-The CocO wild type protein was overproduced and purified from bacteria carrying the pFScocO⁺ plasmid (see "Experimental Procedures" for details). Similarly, the CocOε22 mutant protein was overproduced and purified from bacteria carrying the pFScocOε22 plasmid. Both proteins were at least 90% pure, as judged by staining of the purified protein preparations following separation by SDS-polyacrylamide gel electrophoresis (Fig. 3A). The elution profiles of both proteins during HPLC gel filtration suggest an approximate molecular mass of 80,000 Da, consistent with their putative heptameric forms (data not shown).

FIG. 1. Suppression of the groES42 temperature-sensitive growth phenotype by the RB49 cocO gene. Strain DA1415 (groES42) was transformed with either plasmid pMPMA4 (control), p31⁺ (expressing wild type Gp31), or pcocO⁺34-2 (expressing wild type CocO). See "Experimental Procedures" for details.

FIG. 2. Comparison between Gp31, CocO, and GroES predicted amino acid sequences. The alignment between Gp31 and GroES is taken from the article by Hunt et al. (22). Alignment between CocO and Gp31 was performed using BLAST (36).

Functional Analysis of the CocO Wild Type and CocOε22 Proteins-To test whether the purified CocO wild type and CocOε22 proteins are active and capable of interacting with their GroEL partner, we carried out in vitro folding assays with denatured citrate synthase. Previously, it was established that the correct in vitro folding of pig heart citrate synthase requires both the GroEL and GroES proteins (37,38). As shown in Fig. 3B, the CocO cochaperonin is indeed capable of assisting wild type GroEL in the folding of denatured citrate synthase. Even more convincing are the observations that (a) the wild type CocO protein cannot assist the GroEL44 mutant protein in citrate synthase folding, thus reproducing the in vivo inability of bacteriophage RB49 to propagate on the groEL44 mutant host (Table II), and (b) the CocOε22 mutant protein restores the ability of the GroEL44 mutant protein to fold citrate synthase, again in agreement with the in vivo result that the bacteriophage RB49 cocOε22 mutant grows perfectly well on the groEL44 mutant host (Table II). These experiments have been performed using various chaperonin/cochaperonin ratios, with essentially the same results (data not shown). In contrast to the in vivo bacteriophage plating phenotype (Table II), the CocOε22 mutant protein was capable of assisting the wild type GroEL protein in citrate synthase folding. This result is not entirely unexpected, since bacteriophage RB49 cocOε22 can form plaques on CG3014 wild type bacteria at high temperatures, thus demonstrating significant interaction between GroEL wild type and CocOε22 proteins, at least at high temperature (Table II). Most likely, citrate synthase, although a relatively large substrate (monomer is 51,629 Da), is not as problematic to fold as the RB49 major capsid protein.

DISCUSSION
Although the x-ray crystallographic structures of the GroEL chaperonin and its GroES cochaperonin have been solved, both as a complex and as individual proteins, many details of their interaction with each other and their substrates are not understood (2,7,8,11). The primary contact that GroES makes with GroEL has been localized to a 16-amino acid residue segment of GroES. This so-called mobile loop, although relatively unstructured in free GroES, adopts a β-hairpin structure upon binding to GroEL (10). In the crystal structure, the only direct contact that GroES makes with GroEL is with a conserved, hydrophobic tripeptide located in the middle of its mobile loop (10,11). The contributions of other elements to both the initial binding of GroES to GroEL and to the subsequent stabilization of the GroEL-GroES complex are unknown. Another detail that remains to be clarified is whether GroES binding to the trans ring of GroEL can take place while GroES is still bound to the cis ring ("footballs") and whether such binding may be required for the efficient release of some substrates from the cis ring of GroEL.
Genetic and biochemical studies have identified the bacteriophage T4-encoded Gp31 protein (product of gene 31) as essential for the correct folding of Gp23 (product of gene 23), the major bacteriophage capsid protein (39-42). In the absence of a functional gene 31, the Gp23 capsid protein aggregates in lumps on the E. coli inner membrane (39). The realization that some groEL mutations (but none of the groES mutations) block bacteriophage T4 morphogenesis by interfering with the action of Gp31 led to the proposal that Gp31 may serve as a more specialized GroES-like homolog, capable of assisting the correct folding of Gp23 by GroEL (14,43) despite the fact that the GroES and Gp31 proteins share little sequence identity (~14%; Refs. 14-16). Gp31 is indeed a cochaperonin for GroEL; it substitutes for GroES in protein folding, binds GroEL, and modulates its ATPase activity as GroES does and even suppresses the temperature-sensitive phenotype of groES mutations (21). But only Gp31, and not GroES, can assist in correctly folding Gp23 in vivo (21,44).
What, then, are the features of Gp31 that distinguish it from GroES? Hunt et al. (22) emphasized four structural features that distinguish Gp31 from GroES and all of the other GroES-like cochaperonins. First, since Gp31's mobile loop is substantially longer (22 versus 16 amino acids) compared with that of GroES (10,19), this may result in the formation of a "taller" dome structure over GroEL. Such a larger space in the GroEL-Gp31 cavity could better accommodate Gp23, a relatively large substrate at 56 kDa. A large mobile loop is also a feature of the CocO cochaperonin (Fig. 2), whose loop is actually one amino acid longer than that of Gp31. The ε22 mutation (Q36R) that results in an efficient interaction with the GroEL44 mutant chaperonin actually alters this extra residue in CocO. Previous biochemical analyses have shown that Gp31 and GroEL44 do not efficiently form a complex in vitro (20). Thus, CocO and GroEL44 could suffer from a similar defect, which is overcome by the Q36R substitution of CocOε22. The inability of the GroEL44-CocO wild type pair to correctly interact in vivo could be reproduced in vitro where they are unable to refold citrate synthase. Furthermore, the GroEL44-CocOε22 pair of mutant proteins that functions normally in vivo for bacteriophage RB49 morphogenesis also correctly folds citrate synthase in vitro. In this respect, it is interesting that the only mutation of bacteriophage T4 gene 31 that is capable of restoring substantial interaction with GroEL44 is the T4 31ε1 mutation, an L35I substitution in Gp31's mobile loop (23). This substitution results in a more hydrophobic amino acid (isoleucine 35 for leucine) in the first position of the hydrophobic tripeptide of Gp31. In contrast, the Q36R substitution in CocO's mobile loop enables it to interact even better with GroEL44, as judged by the fact that T4 31ε1 (L35I) grows on groEL44 hosts at 37°C but not at 42°C, while RB49 cocOε22 (Q36R) grows on groEL44 bacteria at both 37 and 42°C (Table II). The Q36R substitution is located two amino acid residues C-terminal to the conserved hydrophobic tripeptide and may bypass the GroEL44 block by a different mechanism than the L35I substitution of Gp31.
Hunt et al. (22) also noted the presence of a universally conserved aromatic amino acid residue at position 71 in GroES and its homologs. This residue juts into the central cavity of GroES and limits the volume available within the GroEL-GroES complex for substrate folding. This aromatic amino acid is absent from the equivalent position in both Gp31 and the CocO cochaperonin (Fig. 2). Another feature that distinguishes both Gp31 and CocO from GroES and the other cochaperonins is the lack of a "roof loop" structure at the top of the GroES dome (Fig. 2). Again, this difference could allow Gp31 and CocO to fold a larger substrate, such as Gp23. Finally, both Gp31 and CocO possess an extra loop at their C-terminal region that is absent in GroES and all other GroES-like cochaperonins. In the Gp31 crystal structure, this extra loop domain is located outside of the dome. Although it is not clear what role it plays in Gp31 and CocO cochaperonin function, deletion of this loop domain abolishes Gp31 specificity for Gp23 but not for its E. coli protein substrates (45).
Despite the various structural differences between the bacteriophage-encoded Gp31(CocO) cochaperonin class and the GroES-type cochaperonins, the molecular mechanism(s) that underlies the unique capacity of the Gp31(CocO) cochaperonin to fold the Gp23 capsid protein remains enigmatic. The T-even bacteriophages assemble an icosahedral head structure composed of 960 Gp23 subunits (46). If one assumes a burst size of 200 bacteriophage particles per infected bacterium, then ~200,000 Gp23 subunits must mature during the short time (10-20 min at 37°C) available for virion morphogenesis. In addition to being relatively large, the Gp23 capsid protein also has a tendency to aggregate, as judged by the formation of lumps in the absence of Gp31 (39). Thus, Gp23 may necessitate many cycles of GroEL binding and release for its proper maturation. It could be that the Gp31(CocO)-GroEL folding cycle is substantially shorter than that of GroES-GroEL, thus favoring the fast and efficient maturation of Gp23. Another possibility is that the Gp31(CocO) cochaperonin can bind to the trans ring of GroEL more efficiently than GroES and, in doing so, accelerate the release of Gp23 from the cis ring. Finally, the size and biochemical properties of the Gp31(CocO)-GroEL cavity could be specifically customized for Gp23 folding. Of course, these possibilities are not mutually exclusive, and they could all contribute to the correct and rapid maturation of Gp23. Whatever this difference between Gp31(CocO) and GroES may be, it is clear that the Gp31(CocO) cochaperonin class is capable of folding all of E. coli's essential GroEL substrates, since either Gp31 or CocO can totally substitute for GroES in bacterial growth.² Thus, Gp31(CocO) must maintain all features necessary for the correct folding of many polypeptide substrates. Clearly, more experiments are necessary, such as domain swap experiments, to pinpoint the Gp31(CocO) features that enable preferential Gp23 folding.
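The folding load implied by these figures can be made explicit with a short calculation. The Python sketch below uses only the numbers quoted above (960 subunits per head, an assumed burst of 200 particles, and a 10-20-min window); the function name is illustrative, and the result is an average rate that ignores any lag before capsid protein synthesis begins.

```python
# Average Gp23 maturation rate implied by the burst size and time window.
# Inputs are the figures quoted in the text; the burst size is an assumption there.

SUBUNITS_PER_HEAD = 960
BURST_SIZE = 200


def folding_rate_per_second(total_subunits: int, minutes: float) -> float:
    """Average number of subunits that must mature per second."""
    return total_subunits / (minutes * 60)


total_subunits = SUBUNITS_PER_HEAD * BURST_SIZE
print(f"Gp23 subunits per infected cell: {total_subunits}")
for minutes in (10, 20):
    rate = folding_rate_per_second(total_subunits, minutes)
    print(f"{minutes} min window: {rate:.0f} subunits/s")
```

The product is 192,000 subunits per cell, which corresponds to an average of roughly 160 to 320 Gp23 subunits maturing per second over the quoted window.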
One aspect of this work deserves further comment. Here, we have studied the structure and function of the Gp31 protein of an evolutionarily distant T4-type bacteriophage and compared its properties with those of the T4 homolog and the E. coli ortholog. There is extremely little sequence homology between bacteriophage T4 Gp31 and E. coli GroES, although their cochaperonin functions are very similar. More sequence homology is apparent between GroES and RB49 CocO sequences, particularly in the N-terminal portion of these proteins. As Gp31 homologs are analyzed from additional distant T4-type bacteriophages, we anticipate that their evolutionary relationship to the GroES protein will become increasingly obvious. Such an expanded phylogenetic analysis of Gp31 could also provide valuable information about the motifs and domains that are critical for function and thus conserved.