GroEL binds artificial proteins with random sequences.

Chaperonin GroEL from Escherichia coli binds to the non-native states of many unrelated proteins, and the structural features that GroEL recognizes have been debated. As model substrate proteins of GroEL, we used seven artificial proteins (138-141 residues), each of which has a unique but randomly chosen amino acid sequence and no propensity to fold into a particular structure. Two of them were water-soluble, and the rest were soluble in 3 M urea. The soluble ones interacted with GroEL in a manner similar to that of a natural substrate; they stimulated the ATPase cycle of GroEL and GroEL/GroES and inhibited GroEL-assisted folding of another protein. All seven artificial proteins were able to bind to GroEL. The results suggest that neither secondary structure nor a specific sequence motif in the substrate protein is necessary for recognition by GroEL.

A wide range of proteins, unrelated in their primary sequences and native tertiary structures, can bind to GroEL (25-27). Because GroEL cannot interact with proteins in the native state, there should be specific local structural features that are recognized by GroEL and are commonly found in many proteins in the non-native, but not the native, state. However, it remains unclear what kinds of structural features of these proteins are recognized by GroEL. Studies using a variety of proteins as model substrates have been carried out, but they have led to several different proposals for the GroEL-recognizable structural features, ranging from a fully unfolded to a native-like conformation (28-36). Some proteins (short peptides) containing α-helices were reported to bind to GroEL (37-40), although it was also shown that all-β or mainly β-sheet proteins (Fab fragment, pepsin, and green fluorescent protein) could bind to GroEL (16, 41, 42). On the other hand, binding of GroEL to unfolded proteins without an ordered secondary structure, apocytochrome c (33) and α-casein (21), was demonstrated. In the crystal structure of the isolated apical domain of truncated GroEL with a 17-residue NH2-terminal tag, the region identified previously as a polypeptide binding site of GroEL indeed held a stretch of the tag (residues −1 to −7) in a nearly fully extended conformation, probably mimicking a natural substrate protein (17).
Generally, natural proteins have an inherent propensity to form secondary (and tertiary) structures. This may cause ambiguity in the search for a GroEL-recognizable structure because it is not easy to exclude the possibility that the folding state of the substrate protein changes upon binding to GroEL. It is even possible that a portion of the bound protein that is not directly involved in the binding is allowed to form some secondary structure. Recently Yomo and his colleagues (43) reported a plasmid library that generates numerous artificial proteins, each of which has a unique and randomly chosen amino acid sequence. These proteins are free from evolutionary constraints and, except in very rare cases, have no propensity to fold into a particular structure (44). Taking advantage of this, we used seven different artificial proteins of 138-141 amino acids as model substrate proteins of GroEL and found that all of them are able to bind to GroEL. Thus neither secondary structure nor a specific sequence motif in the substrate protein is necessary for recognition by GroEL.
Mutation-pEL237E, carrying the gene for the GroEL mutant (L237E), which is unable to bind polypeptide (18), was generated by site-directed mutagenesis using the oligonucleotide GCAACAGCTTCTTCAACCGGTAGCATTTCGCGG. Single-stranded DNA of pET-EL derived from pET21c (Novagen) was obtained by infecting E. coli CJ236 cells with helper phage M13K07 (Amersham Pharmacia Biotech).
Protein Expression and Purification-The purification of GroEL from E. coli JM109 cells bearing the expression plasmid pKY206 (45) and of GroES was described in Ref. 26. The GroEL fractions from a butyl-Toyopearl column (Tosoh) were concentrated by ultrafiltration (Amicon) with a 100-kDa cutoff filter (YM100) and were diluted with a buffer containing 20% methanol. This concentration-dilution procedure was repeated three times. GroEL fractions were dialyzed overnight against 25 mM Tris-HCl, pH 7.5, and stored at −20°C after the addition of glycerol (final concentration, 25%). GroES fractions were dialyzed overnight against 20 mM Na-Pi, pH 7.5, and stored at −80°C. The mutant GroEL (L237E) was purified from E. coli BL21(DE3) cells bearing pEL237E by the same procedures except that the ammonium sulfate concentration in the lysis buffer and the column buffer was 15% saturation. Proteins with random sequences, RP3-42 and RP3-45, were expressed as soluble proteins in E. coli and purified as follows. E. coli KP3998 expressing the protein was harvested, washed with 30 mM Tris-HCl, pH 7.5, 30 mM NaCl, and suspended in 25 mM Tris-HCl, pH 7.5, 1 mM EDTA, containing proteinase inhibitor mixture (Complete™, Roche Molecular Biochemicals). The suspension was incubated at 65°C for 25 min and centrifuged, and the proteins contained in the supernatant were precipitated with ammonium sulfate at 33% saturation. The precipitate was dissolved in buffer A (25 mM Tris-HCl, pH 7.5, 1 mM EDTA, 1 mM dithiothreitol) and dialyzed against the same buffer. This lysate was applied to a QAE-Toyopearl 550C column (Tosoh) equilibrated with buffer A and eluted with a 0-0.2 M NaCl gradient. Fractions containing random sequence proteins were pooled and applied to a butyl-Toyopearl 650 column equilibrated with buffer A. The column was washed with buffer A and then with water. Proteins were eluted with a linear gradient of ethanol. Fractions containing random sequence proteins were dialyzed against 25 mM Tris-HCl, pH 7.5, and stored at −80°C. Random sequence proteins expressed as insoluble proteins were partially purified (>80%) as inclusion bodies by washing with 25 mM Tris-HCl, pH 7.5, 1 mM EDTA, and 1% Triton X-100. These inclusion bodies were solubilized in 3 M urea and stored at −20°C.
* This work was supported by research fellowships from the Japan Society for the Promotion of Science for Young Scientists. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Protein Concentration-Concentrations of RP3-42, RP3-45, rLA, and α-casein were determined spectrophotometrically. Extinction coefficients at 280 nm used for RP3-42, RP3-45, rLA, and α-casein were as in Refs. 44, 44, 46, and 47, respectively. Concentrations of wild-type and mutant GroEL and GroES were determined with the Bradford assay (Bio-Rad) using known concentrations of wild-type GroEL and GroES determined by quantitative amino acid analysis as standards. Concentrations of other proteins were determined with the Bradford assay using a bovine serum albumin standard (Pierce).
Optical [...] The reaction was linear as a function of time for 10 min. Thus the ATPase activity was calculated from this period. The solution was treated with a malachite green reagent, and the absorbance at 630 nm was measured.
Detection of Binding of Random Sequence Proteins to GroEL with Western Blotting-1 μl of random sequence proteins (1 mg/ml) with or without 3 M urea was diluted into 400 μl of buffer B containing 40 μg of GroEL. Then the solution was concentrated with an ultrafiltration centrifuge device (Ultrafree-MC, 100-kDa cutoff filter, Millipore) to approximately 20 μl. This concentrated solution was electrophoresed on a 6% polyacrylamide gel without SDS and blotted onto a polyvinylidene difluoride membrane. Random sequence proteins were detected using alkaline phosphatase-conjugated anti-T7-tag antibody. Blots were developed with nitro blue tetrazolium and 5-bromo-4-chloro-3-indolyl phosphate (49).
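The quantities in the binding assay above can be checked with a little arithmetic. The following is a minimal sketch, not part of the original protocol: it assumes that 1 μl of sample is added to 400 μl of buffer, and it uses approximate molecular masses (about 15.5 kDa for a random sequence protein of roughly 140 residues, and about 800 kDa for the GroEL 14-mer) that are illustrative assumptions rather than values given in the text.

```python
# Back-of-the-envelope check of the binding-assay mixture.
# ASSUMPTIONS (not stated in the text): approximate molecular masses.
RP_MASS_DA = 15_500            # ~140 residues x ~110 Da per residue (assumed)
GROEL_14MER_MASS_DA = 800_000  # ~57 kDa subunit x 14 subunits (assumed)


def diluted_concentration(stock, stock_volume_ul, buffer_volume_ul):
    """Concentration after adding a stock solution to a buffer volume."""
    return stock * stock_volume_ul / (stock_volume_ul + buffer_volume_ul)


def picomoles(mass_ug, molar_mass_da):
    """Convert a mass in micrograms to picomoles."""
    return mass_ug * 1e-6 / molar_mass_da * 1e12


# 3 M urea (3000 mM) carried over in 1 ul, added to 400 ul of buffer.
urea_mM = diluted_concentration(3000.0, 1.0, 400.0)

# 1 ul of a 1 mg/ml solution contains 1 ug of random sequence protein.
rp_pmol = picomoles(1.0, RP_MASS_DA)
groel_pmol = picomoles(40.0, GROEL_14MER_MASS_DA)

print(f"residual urea: {urea_mM:.2f} mM")
print(f"RP: {rp_pmol:.1f} pmol, GroEL 14-mer: {groel_pmol:.1f} pmol")
print(f"RP per GroEL 14-mer: {rp_pmol / groel_pmol:.2f}")
```

Under these assumptions the residual urea is about 7.5 mM, far below a denaturing concentration, and the random sequence protein is present at slightly more than one molecule per GroEL 14-mer.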

FIG. 1. Amino acid sequences of RPs. The consensus sequence of the NH2-terminal region consists of a T7 epitope tag sequence (11 residues) and a linker sequence (6 residues). The amino acid sequence at the COOH-terminal region of multiple cloning sites (22 residues) was one of three common sequences arising from three different reading frames.

RESULTS AND DISCUSSION
Random Sequence Proteins-An expression plasmid library encoding artificial genes for proteins with 138-141 amino acid residues was made as described previously (43). The nucleotide sequences were nearly random, although randomness was restricted by three factors: the appearance of stop codons was avoided, a T7 epitope tag sequence (11 residues) and linker sequence (6 residues) were introduced at the NH2-terminal region for detection of the proteins, and multiple cloning restriction sites were put into the COOH-terminal region for easy genetic handling. The amino acid sequence at the region of multiple cloning sites (22 residues) of each expressed protein was one of three common sequences arising from three different reading frames. The remaining ~90-amino acid sequence contained all 20 kinds of amino acids and was nearly random. We transformed E. coli with the plasmid library, chose seven colonies expressing proteins, and deduced the amino acid sequences from the nucleotide sequences (Fig. 1). Five transformants produced proteins as inclusion bodies that we solubilized in 3 M urea, designated RP3-04, -06, -34, -54, and -64 (insoluble RPs; see footnote 2). The remaining two, RP3-42 and RP3-45, were expressed as soluble proteins and purified without exposure to urea (soluble RPs). Structures of soluble RPs have been studied previously, and it was suggested that they are rather compact, like the molten globule, but do not contain marked secondary structure (44). Indeed, CD spectra of soluble RPs (Fig. 2) were very similar to the spectrum of α-casein, a model unfolded protein with no secondary structure. The reference proteins, native holo-LA and, to a lesser degree, rLA, showed an α-helix signal, a trough around 222 nm. RP3-42 and -45 did not have a trough at 222 nm or at 214 nm, a β-sheet signal. Another indication of disordered structures of soluble RPs was a deep trough at 190 nm (50). Exposure of the hydrophobic surface to the bulk water phase was assessed by binding of ANS. The relative fluorescence of ANS at 466 nm at the same protein concentration (18 μM) was 100% (apo-LA, a typical molten globule), 57% (RP3-45), 36% (holo-LA), and 18% (RP3-42). In 6 M guanidine HCl, none of the proteins showed significant fluorescence (<5%). Together with the results from small angle x-ray scattering (44), this indicates that although soluble RPs lack secondary structure, they have a compact shape with some hydrophobic surface (RP3-45 > RP3-42).
Soluble RPs Stimulated GroEL ATPase Cycle-As reported (34, 51-53), when substrate proteins such as α-casein and rLA were present, the catalytic cycle of GroEL and GroEL/GroES was accelerated, and the rate of ATP hydrolysis was enhanced (Fig. 3). Similarly, RP3-42 and RP3-45 activated the ATPase activity of GroEL and GroEL/GroES. One of the known features of GroEL ATPase activity is its suppression by GroES, and this feature was retained for both the natural substrates and the soluble RPs. GroEL/GroES-dependent folding of rhodanese from the guanidine HCl-denatured state was inhibited when RP3-42 was present. As the amount of RP3-42 increased, the yield of recovered rhodanese activity decreased, reaching 30% of the original activity in the presence of a 27-fold molar excess of RP3-42 over rhodanese (data not shown). Similar inhibition was observed for rLA, a well characterized model substrate protein of GroEL. Thus, the soluble RPs compete with natural substrate proteins for the same binding site on GroEL. Based on these results we can conclude that GroEL recognizes random sequence proteins in the same way as natural substrate proteins.
Footnote 2: In the original paper (43), RP3-04 and RP3-54 were classified as soluble. In this report, however, they are classified as insoluble because they were purified from inclusion bodies.
Binding of Soluble RPs Probed by Fluorescence-Because GroEL has no tryptophan residue, the change in the intrinsic fluorescence emission spectrum of the substrate protein induced by binding to GroEL reflects solely the change in the local environment around the tryptophan residues of the substrate protein. The binding of RP3-42 and -45 to GroEL caused an increase in fluorescence intensity of about 1.6-fold and blue shifts of the emission maximum from 350.5 to 341.5 nm (RP3-42) and from 347 to 342 nm (RP3-45) (Fig. 4A). A similar increase in fluorescence intensity and blue shift of the emission maximum were also observed when α-casein bound to GroEL (not shown). These changes are attributable to the increased hydrophobicity around the tryptophan residues, which results from interaction with hydrophobic residues of the polypeptide binding sites of GroEL. We titrated the increase in fluorescence of RPs with varying amounts of GroEL (Fig. 4B). The dissociation constants (Kd) were estimated by fitting the data to a simple titration equation, and values of 17 nM (RP3-42) and 145 nM (RP3-45) were obtained. Interestingly, the binding affinity does not parallel the extent of hydrophobic surface probed by ANS. The reason for this apparent discrepancy is not known, but the more acidic nature of RP3-45 (pI = 4.9) relative to RP3-42 (pI = 5.5) could attenuate the binding to GroEL, which is also acidic (pI = 4.5). The binding stoichiometry was calculated to be 1.75 and 2.05 mol/mol GroEL, respectively. A binding stoichiometry greater than 2 has been reported (54). When the mixture of GroEL and dansyl-labeled RP3-42 was subjected to gel filtration high performance liquid chromatography, fluorescence was eluted at the position of GroEL, and prior incubation of the mixture with ATP resulted in a loss of 65% of the fluorescence (data not shown). This indicates that ATP released bound RP3-42 from GroEL, as it does a natural substrate protein.
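The titration analysis can be illustrated with a short sketch. The text does not give the form of the "simple titration equation", so the code below assumes a common choice: independent, equivalent binding sites described by the quadratic (tight-binding) equation, with the stoichiometry held fixed and Kd found by a log-spaced grid search. The function names, the RP concentration of 50 nM, and the GroEL concentrations are hypothetical and serve only to generate synthetic data; they are not taken from the experiment.

```python
import math


def fraction_bound(groel_nM, rp_nM, kd_nM, n_sites):
    """Fraction of RP bound, assuming n_sites equivalent sites per GroEL.

    Quadratic solution of the 1:1 site-binding equilibrium
    (assumed model; the text only says "a simple titration equation").
    """
    sites = n_sites * groel_nM
    b = rp_nM + sites + kd_nM
    return (b - math.sqrt(b * b - 4.0 * rp_nM * sites)) / (2.0 * rp_nM)


def fit_kd(groel_points, fractions, rp_nM, n_sites,
           kd_lo=0.1, kd_hi=10000.0, steps=2000):
    """Least-squares estimate of Kd (nM) by a log-spaced grid search."""
    best_kd, best_sse = kd_lo, float("inf")
    for i in range(steps + 1):
        kd = kd_lo * (kd_hi / kd_lo) ** (i / steps)
        sse = sum(
            (fraction_bound(g, rp_nM, kd, n_sites) - f) ** 2
            for g, f in zip(groel_points, fractions)
        )
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd


# Synthetic, noise-free titration generated with Kd = 17 nM and
# 1.75 sites per GroEL (the values reported for RP3-42).
RP_NM = 50.0                                   # hypothetical RP concentration
GROEL_NM = [0, 5, 10, 20, 40, 80, 160, 320]    # hypothetical titration points
observed = [fraction_bound(g, RP_NM, 17.0, 1.75) for g in GROEL_NM]

fitted_kd = fit_kd(GROEL_NM, observed, RP_NM, 1.75)
print(f"fitted Kd: {fitted_kd:.1f} nM")
```

In practice the measured quantity is the fluorescence increase, which would be scaled to a fraction bound using the signal at saturation, and the stoichiometry would normally be fitted together with Kd rather than fixed.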
Direct Detection of Binding of Random Sequence Proteins to GroEL-The mixture of soluble RPs and GroEL was analyzed with native polyacrylamide gel electrophoresis (Fig. 5A). Both RPs were detected at the position of GroEL. The possibility that GroEL recognized the T7 tag and/or multicloning sequences of RPs, rather than the randomized sequence, was excluded because we confirmed that a native protein (green fluorescent protein) with a T7 tag at the NH2 terminus and the multicloning sequence at the COOH terminus did not bind to GroEL (data not shown). A GroEL mutant (L237E), in which Leu-237 was replaced by a glutamic acid residue, was shown previously to be unable to bind a natural substrate protein (18). This mutant GroEL also failed to bind soluble RPs (Fig. 5A). This confirmed that soluble RPs bind to the normal polypeptide binding site of GroEL. Whether GroEL could bind insoluble RPs was examined by dilution of an RP solution in 3 M urea into a GroEL solution followed by native polyacrylamide gel electrophoresis (Fig. 5B). All of the RPs tested bound to GroEL and, again, none bound to the mutant GroEL (L237E).
Taken together, the results in this report show that secondary structure is not a prerequisite for the substrate protein to bind GroEL. The same conclusion can be drawn from the studies using α-casein as a substrate protein and from the crystal structure of the minichaperone, in which a peptide with a roughly extended conformation occupied the polypeptide binding site of GroEL (17). It should be added, however, that this does not exclude the possibility that some peptides that interact weakly with GroEL adopt an α-helical conformation when binding to GroEL (37-40). It is our current view that the polypeptide binding site of GroEL may be flexible enough to accept an α-helical form of peptides that expose hydrophobic residues but that an extended form of peptides with hydrophobic residues should be a more favorable substrate.