The Clostridium cellulolyticum Dockerin Displays a Dual Binding Mode for Its Cohesin Partner

The plant cell wall degrading apparatus of anaerobic bacteria includes a large multienzyme complex termed the “cellulosome.” The complex assembles through the interaction of enzyme-derived dockerin modules with the multiple cohesin modules of the noncatalytic scaffolding protein. Here we report the crystal structure of the Clostridium cellulolyticum cohesin-dockerin complex in two distinct orientations. The data show that the dockerin displays structural symmetry reflected by the presence of two essentially identical cohesin binding surfaces. In one binding mode, visualized through the A16S/L17T dockerin mutant, the C-terminal helix makes extensive interactions with its cohesin partner. In the other binding mode observed through the A47S/F48T dockerin variant, the dockerin is reoriented by 180° and interacts with the cohesin primarily through the N-terminal helix. Apolar interactions dominate cohesin-dockerin recognition that is centered around a hydrophobic pocket on the surface of the cohesin, formed by Leu-87 and Leu-89, which is occupied, in the two binding modes, by the dockerin residues Phe-19 and Leu-50, respectively. Despite the structural similarity between the C. cellulolyticum and Clostridium thermocellum cohesins and dockerins, there is no cross-specificity between the protein partners from the two organisms. The crystal structure of the C. cellulolyticum complex shows that organism-specific recognition between the protomers is dictated by apolar interactions primarily between only two residues, Leu-17 in the dockerin and the cohesin amino acid Ala-129. The biological significance of the plasticity in dockerin-cohesin recognition, observed here in C. cellulolyticum and reported previously in C. thermocellum, is discussed.

The microbial degradation of the plant cell wall is an important biological process that is central to the cycling of carbon between microorganisms, plants, and herbivores. Furthermore, the enzymes that catalyze the hydrolysis of plant structural polysaccharides are deployed in several biotechnological processes (for review see Ref. 1), and there is currently considerable interest in the application of these biocatalysts in the conversion of lignocellulose, an abundant renewable source of organic carbon, into biofuels such as ethanol and butanol (2,3). The plant cell wall is an insoluble, highly recalcitrant macromolecule consisting mainly of interlocking polysaccharides. Saprophytic microorganisms that utilize the plant cell wall as a major nutrient synthesize enzyme consortia in which the biocatalysts act in synergy to degrade the composite substrate. A common feature of the plant cell wall degrading apparatus synthesized by both eukaryotic and prokaryotic anaerobic microorganisms is that the component enzymes assemble into large complexes, which are referred to as cellulosomes (for review see Ref. 4). The cellulosome is assembled by the binding of the catalytic subunits, comprising glycoside hydrolases, esterases, and lyases, to a noncatalytic protein scaffold (hereafter referred to as Cip) (5). The integration of the plant cell wall hydrolases into the cellulosome has been proposed to potentiate the synergistic interactions between the enzymes and to contribute to substrate targeting through the cellulose binding capacity of most Cip molecules (6,7). The Cip molecules of Clostridium cellulolyticum and Clostridium thermocellum (designated CipC and CipA, respectively) can bind eight and nine enzymes, respectively (8), although the cellulosomes of other anaerobic bacteria deploy multiple Cip and adapter molecules in assembling as many as 96 catalytic subunits into a single complex (9).
In Clostridia the Cip contains multiple type I cohesin modules that bind tightly to the type I dockerins present on the catalytic subunits and thus assemble these enzymes into the cellulosome (10). In general, within a single organism, cohesin modules of the Cip display a very high level of sequence identity, and the type I dockerins appear to display little if any discrimination between their receptors in the cellulosome scaffold (11,12). Similarly, the type I dockerin modules also display extensive sequence identity consistent with the lack of specificity for the type I cohesins (4,13).
The crystal structures of type I C. thermocellum cohesin-dockerin complexes have provided insight into the mechanism of cellulosome assembly (14,15). Within the dockerin there is a tandem duplication of a 22-residue sequence that contributes an α-helix and an EF-hand calcium-binding motif and displays remarkable structural conservation. Indeed the structure of the first duplicated segment, which contains the N-terminal helix, can be superimposed precisely over the structure of the second segment, which contains the C-terminal helix (helix-3). This symmetry, coupled with several mutagenesis studies (16,17), indicates that the C. thermocellum type I dockerin contains two equivalent ligand-binding sites, which have been maintained during evolution. This view is entirely consistent with the crystal structures of the complexes. Thus in one complex helix-3 dominates cohesin recognition with Ser-45 and Thr-46 playing a central role in the polar interactions between the two protein partners (14). In the second crystal structure the dockerin is rotated 180° relative to the cohesin, and helix-1, rather than helix-3, plays a central role in complex formation (15). Thus, the equivalent residues to Ser-45 and Thr-46 in the N-terminal helix, Ser-11 and Thr-12, dominate the hydrogen-bonding interactions between the dockerin and its cohesin partner in this second binding mode.
Although the sequences of C. cellulolyticum type I cohesins and dockerins are very similar to the corresponding C. thermocellum modules, there is no cross-specificity between the proteins derived from the two organisms (8,13,16). Significantly the Ser-Thr dyad in the C. thermocellum dockerins is replaced with hydrophobic residues in the corresponding C. cellulolyticum protein modules. Mutagenesis studies have shown that replacing the Ser-Thr dyad with hydrophobic residues in the C. thermocellum dockerin, and similarly substituting the Ala-Leu and Ala-Phe motifs with hydroxyl amino acids in the C. cellulolyticum dockerin, extends ligand specificity (16). Thus the mutant C. cellulolyticum and C. thermocellum dockerins gain the capacity to recognize the C. thermocellum and C. cellulolyticum cohesins, respectively, while retaining a diminished affinity for their original protein target (16).
Currently it is unclear whether the dual binding mode displayed by the C. thermocellum type I dockerins is a generic feature of cellulosome assembly, whereas the mechanistic basis for the organism-based specificity displayed by cohesin-dockerin partners remains essentially unknown. Here we report the crystal structure of the C. cellulolyticum dockerin-cohesin complex. The data show that the dockerin displays dyad symmetry and is able to interact with its cognate cohesin through a dual binding mode. Cohesin-dockerin recognition is dominated by hydrophobic interactions, which are centered around a hydrophobic pocket on the surface of the cohesin, formed by Leu-87 and Leu-89, which is occupied in the two binding modes by the dockerin residues Phe-19 and Leu-50, respectively. Intriguingly, organism specificity is dictated primarily by apolar interactions between only two residues, Leu-17 in the dockerin and the cohesin amino acid Ala-129.

EXPERIMENTAL PROCEDURES
Cloning and Expression-DNA encoding the dockerin module of the GH5 C. cellulolyticum cellulase CcCel5A (residues 410-473; note that the proline at position 473 in the published sequence is a glutamate) was amplified by PCR from pJFAc (18), and the resulting product was ligated into NdeI/XhoI-digested pET22b (Novagen), to generate pHF1. The dockerin gene under the control of the pET22b T7 promoter and terminator was amplified by PCR from pHF1 using the forward primer T7f (5′-CACGATGCGTCCGGCGTAGAGGAT-3′) and the reverse primer T7br (5′-GGGGGGAGATCTATCCGGATATAGTTCCTCCTTTCA-3′) that incorporated 5′ and 3′ BglII sites. To express C. cellulolyticum dockerin and cohesin genes in the same plasmid, the resulting PCR product was digested with BglII and subcloned into the BglII site of plasmid pET-coh1B, which encodes the first cohesin module (residues 277-439) of C. cellulolyticum scaffoldin CipC (19). The resulting recombinant plasmid, termed pHF2, contained both genes organized in tandem and was sequenced to ensure that no mutations had occurred during PCR. In the wild type cohesin-dockerin complex, derived from the co-expression of the cohesin and dockerin genes from pHF2, only the cohesin contained a C-terminal His₆ tag. The construction of the plasmid encoding the mutated dockerins of Cel5A and the first cohesin from CipC was performed by the overlap-extension PCR method. The regions encoding the N-terminal and C-terminal parts of the dockerin were amplified by PCR from pHF2 using mutagenic primer pairs. The resulting overlapping fragments were mixed, and a combined fragment was synthesized using the external primers. The fragment was subsequently cloned into BglII-linearized pHF2, thereby generating pHF3. Positive clones were verified by DNA sequencing.
To study cohesin-dockerin binding, the dockerin was also expressed independently of its cohesin partner but fused to thioredoxin, encoded by pET32a, to ensure higher levels of protein expression. DNA encoding the dockerin of the cellulase Cel5A (residues 410-475) was amplified by PCR from pHF2 and cloned into EcoRI- and XhoI-restricted pET32a to generate pHF5.
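The subcloning step relies on the BglII recognition sequence carried by the amplification primers. The following is a minimal sketch, not part of the published protocol, of how the position of that site in the T7br primer quoted above could be checked. The helper names (`clean`, `reverse_complement`, `find_sites`) are hypothetical, and AGATCT is taken as the BglII recognition sequence.

```python
# Illustrative check only; helper names are hypothetical.
BGLII_SITE = "AGATCT"  # BglII recognition sequence (its own reverse complement)

# Reverse primer T7br as quoted in the text, written 5' to 3'.
T7BR = "GGGGGGAGATCTATCCGGATATAGTTCCTCCTTTCA"


def clean(seq):
    """Upper-case a sequence and drop any character that is not A, C, G or T."""
    return "".join(base for base in seq.upper() if base in "ACGT")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(clean(seq)))


def find_sites(seq, site=BGLII_SITE):
    """Return the 0-based start positions of every occurrence of `site`."""
    seq = clean(seq)
    width = len(site)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == site]


bglii_positions = find_sites(T7BR)
site_is_palindromic = reverse_complement(BGLII_SITE) == BGLII_SITE
```

In this sketch the site is found once in T7br, immediately after the six leading G residues, which is the usual arrangement for giving the restriction enzyme a few flanking bases to bind.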
Protein Expression and Purification-The pET22b derivatives encoding the cohesin-dockerin complexes and discrete cohesins and the pET32a plasmids encoding C. cellulolyticum dockerins were transformed into Escherichia coli strains BL21 (DE3) and Origami, respectively. To express the clostridial proteins, recombinant E. coli strains harboring the appropriate recombinant plasmids were cultured in LB containing 100 μg/ml ampicillin at 37 °C to mid-exponential phase (A₅₅₀ of 0.6). Isopropyl β-D-thiogalactopyranoside was then added to a final concentration of 1 mM, and the cultures were incubated for a further 16 h at 19 °C. The cohesin-dockerin A16S-L17T and cohesin-dockerin A47S-F48T complexes were purified by metal-ion affinity chromatography, buffer exchanged into 20 mM Tris/HCl, pH 8.0, containing 2 mM CaCl₂, and then further purified by anion exchange chromatography using a Source 30Q column (Amersham Biosciences) and a gradient elution of 0-1 M NaCl to separate the complexes from unbound cohesin. Fractions containing the protein complexes were buffer exchanged and then concentrated in 2 mM CaCl₂ to a final concentration of 20 and 10 g/liter for the A16S-L17T and A47S-F48T complexes, respectively. Discrete His-tagged cohesins and dockerins were purified by metal-ion affinity chromatography as described previously (14).
Isothermal Titration Calorimetry-Isothermal titration calorimetry (ITC) was deployed to measure the affinity of native and mutant forms of the C. cellulolyticum cohesin for its dockerin partner essentially as described previously (14). Briefly, the wild type and mutant forms of the dockerin (7-119 μM), fused to thioredoxin, were stirred at 300 rpm in the reaction cell, which was injected with 25 × 10- or 48 × 5-μl aliquots of a 70-1715 μM solution of cohesin at 300-s intervals. The buffer consisted of 50 mM NaHepes, pH 7.5, containing 2 mM CaCl₂, and titrations were carried out at 308 K unless otherwise stated. Integrated heat effects, after correction for heats of dilution, were analyzed by nonlinear regression using a single site model (Microcal ORIGIN version 7.0, Microcal Software, Northampton, MA). The fitted data yield the association constant (K_A) and the change in enthalpy associated with binding (ΔH). Other thermodynamic parameters were calculated using the standard equation −RT ln K_A = ΔG = ΔH − TΔS. The c values (product of the molar concentration of binding sites × K_A) were >2.6.
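The standard equation quoted above can be illustrated with a short calculation. The sketch below is not the authors' analysis script; the function names are hypothetical and the gas constant is taken as 1.987 cal mol⁻¹ K⁻¹. It is applied to the wild type values reported in this paper at 308 K (K_A of 6.50 × 10⁸ M⁻¹ and ΔH of −19.3 kcal mol⁻¹).

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1 (assumed value)


def itc_parameters(k_a, delta_h, temperature):
    """Derive dG and TdS (kcal/mol) from the fitted K_A (M^-1) and dH (kcal/mol).

    Uses -RT ln K_A = dG = dH - TdS, the relation stated in the text.
    """
    delta_g = -R_KCAL * temperature * math.log(k_a)
    t_delta_s = delta_h - delta_g
    return delta_g, t_delta_s


def c_value(site_concentration_molar, k_a):
    """c = molar concentration of binding sites in the cell multiplied by K_A."""
    return site_concentration_molar * k_a


# Wild type cohesin-dockerin values reported in the paper at 308 K.
delta_g, t_delta_s = itc_parameters(k_a=6.50e8, delta_h=-19.3, temperature=308.0)

# Example c value for the lowest dockerin concentration quoted (7 uM).
example_c = c_value(7e-6, 6.50e8)
```

Under these assumptions ΔG comes out at about −12.4 kcal mol⁻¹ and TΔS at about −6.9 kcal mol⁻¹, which agrees with the reported TΔS of −6.84 kcal mol⁻¹ to within rounding of the input values.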
Crystallization of the C. cellulolyticum Cohesin-Dockerin Complexes and Structure Resolution-Crystals of the cohesin-dockerin A16S-L17T complex grew over a period of 10-12 days, in 0.2 M potassium sulfate and 20% (w/v) polyethylene glycol 3350, and were cryoprotected with 20% (v/v) glycerol. Crystals of the cohesin-dockerin A47S-F48T complex grew over a period of 4-5 days, in 0.2 M lithium sulfate and 25% (w/v) polyethylene glycol monomethyl ether 2000, and were cryoprotected with 20% (v/v) glycerol. The crystals were harvested in rayon fiber loops and frozen in liquid nitrogen.
Data were collected, for both constructs, using single crystals at the European Synchrotron Radiation Facility on station ID14-2 at 100 K using an ADSC Q4 charge-coupled device detector at a wavelength of 0.9330 Å. All diffraction data were indexed and integrated in MOSFLM (CCP4) or DENZO/SCALEPACK (20). All other computing was carried out using the CCP4 suite unless otherwise stated. Crystals of the A16S-L17T complex belong to the space group P2₁2₁2₁ with cell dimensions a = 39.3 Å, b = 60.5 Å, and c = 100.7 Å and one complex in the asymmetric unit. In contrast, crystals of the cohesin-dockerin A47S-F48T complex belong to the space group P3₂21 with approximate cell dimensions of a = b = 76.42 Å and c = 111.09 Å and two independent complexes in the asymmetric unit. Both cohesin-dockerin complex structures were solved by molecular replacement using PHASER (21) with the search model being the structure of the previously reported apo-form of the type I C. cellulolyticum cohesin module (PDB accession code 1g1k (22)). Initial building of the dockerin subunits into the electron density was performed using ARP/wARP (23), and the remaining residues were built by hand using COOT (24). Refinement was carried out with REFMAC (25) with 5% of the data set aside for cross-validation purposes. A summary of the refinement statistics is shown in Table 1. Coordinates and observed structure factor amplitudes have been deposited at the Protein Data Bank (PDB codes 2vn5 and 2vn6). Figures were drawn in MOLSCRIPT (26) and BOBSCRIPT (27).
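As a rough illustration of how the two crystal forms compare, the unit cell volume available to each complex can be estimated from the cell dimensions and space groups given above. This is a sketch only and not part of the reported analysis. It assumes the standard general cell-volume formula, cell angles of 90° for P2₁2₁2₁ and α = β = 90°, γ = 120° for P3₂21, and 4 and 6 asymmetric units per cell for the two space groups, respectively. The function names are hypothetical.

```python
import math


def cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit cell volume in cubic angstroms from cell lengths (A) and angles (deg)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


def volume_per_complex(volume, asu_per_cell, complexes_per_asu):
    """Cell volume divided by the number of complexes it contains."""
    return volume / (asu_per_cell * complexes_per_asu)


# A16S-L17T complex: P2(1)2(1)2(1), one complex per asymmetric unit.
v_ortho = cell_volume(39.3, 60.5, 100.7)
per_complex_ortho = volume_per_complex(v_ortho, asu_per_cell=4, complexes_per_asu=1)

# A47S-F48T complex: P3(2)21, two complexes per asymmetric unit.
v_trig = cell_volume(76.42, 76.42, 111.09, gamma=120.0)
per_complex_trig = volume_per_complex(v_trig, asu_per_cell=6, complexes_per_asu=2)
```

Dividing the volume per complex by the molecular mass of the complex would give the Matthews coefficient; the mass is not stated in this section, so the sketch stops at the volume.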

RESULTS
Thermodynamics and Stoichiometry of Cohesin-Dockerin Recognition-The binding of a CipC-derived C. cellulolyticum type I cohesin to its dockerin partner was assessed by ITC. The data show that at 308 K the K_A is 6.50 × 10⁸ M⁻¹ with a ΔH of −19.3 kcal mol⁻¹ and a TΔS of −6.84 kcal mol⁻¹ (Table 2). Both ΔH and ΔS became more negative as the temperature increased, although ΔG was not sensitive to changes in the experimental temperature (Fig. 1). ΔH was 0 at 284 K and ΔS was 0 at 299 K. The heat capacity change (ΔCp) of cohesin-dockerin binding was −822 cal mol⁻¹ K⁻¹. The ITC data also showed that the stoichiometry of ligand binding was ~1, the significance of which is discussed below.
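The temperature dependence described above can be sketched with the usual constant heat capacity approximation, in which ΔH(T) = ΔCp (T − T_H) and ΔS(T) = ΔCp ln(T/T_S), where T_H and T_S are the temperatures at which ΔH and ΔS are zero. This is an illustrative model built from the values reported here, not the fit used in the paper. The assumption that ΔCp is independent of temperature, and the function names, are introduced for the example.

```python
import math

DELTA_CP = -822.0  # cal mol^-1 K^-1, reported heat capacity change
T_H = 284.0        # K, temperature at which dH = 0 (reported)
T_S = 299.0        # K, temperature at which dS = 0 (reported)


def delta_h(temperature):
    """Enthalpy change in kcal/mol, assuming a temperature-independent dCp."""
    return DELTA_CP * (temperature - T_H) / 1000.0


def delta_s(temperature):
    """Entropy change in cal mol^-1 K^-1, assuming a temperature-independent dCp."""
    return DELTA_CP * math.log(temperature / T_S)


def delta_g(temperature):
    """Free energy change in kcal/mol from dG = dH - TdS."""
    return delta_h(temperature) - temperature * delta_s(temperature) / 1000.0


dh_308 = delta_h(308.0)
tds_308 = 308.0 * delta_s(308.0) / 1000.0
dg_308 = delta_g(308.0)
dg_288 = delta_g(288.0)
```

With these inputs the model gives a ΔH of about −19.7 kcal mol⁻¹ and a TΔS of about −7.5 kcal mol⁻¹ at 308 K, close to the measured −19.3 and −6.84 kcal mol⁻¹. ΔG changes by less than 0.1 kcal mol⁻¹ between 288 and 308 K, which illustrates the enthalpy-entropy compensation behind the temperature-insensitive ΔG.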
Protein Expression and Crystallization Strategy-To determine the crystal structure of the C. cellulolyticum cohesin-dockerin complex, the two proteins were co-expressed in E. coli. Initial attempts to crystallize the purified C. cellulolyticum dockerin-cohesin complex were unsuccessful. As with past work on C. thermocellum, we postulated that the failure to crystallize the complex likely reflected the dynamic interaction of the two potential ligand-binding sites in the dockerin with the cohesin. It has been suggested that the C. cellulolyticum dockerin has two ligand-binding sites in which residues dAla-16 and dLeu-17 at site 1 and dAla-47 and dPhe-48 at site 2 (C. cellulolyticum dockerin residues henceforth are prefaced with d and cohesin residues with c) play a key role in cohesin recognition (8,13,16). To encourage a single binding mode between the protein partners for the equivalent C. cellulolyticum complexes, two variants of the dockerin were constructed in which the functions of site 1 and site 2 were disrupted through the introduction of the mutations A16S/L17T and A47S/F48T, respectively. Diffracting crystals of the cohesin in complex with either of the dockerin mutants were then obtained.
Structure of the Type I Cohesin-Dockerin C. cellulolyticum Complex-The structures of the cohesin-dockerin A16S/L17T (Coh-DocA16S/L17T) and the cohesin-dockerin A47S/F48T (Coh-DocA47S/F48T) complexes were solved to 1.49- and 1.9-Å resolution, respectively, by molecular replacement using the crystal structure of the apo-form of the C. cellulolyticum type I cohesin (PDB code 1g1k (22)) as the search model. The two complexes in the asymmetric unit of the Coh-DocA47S/F48T crystals overlay with an r.m.s.d. of 0.18 Å (C-α atoms) for the cohesin residues and 1.0 Å for the dockerin amino acids.
The individual components of the two protein complexes are extremely similar to each other (Fig. 2) with an r.m.s.d. of 0.5 Å for the C-α atoms of the cohesins and 1.0 Å for the C-α atoms of the dockerins (treated independently). The structure of the cohesin either unliganded or in complex with either dockerin variant was essentially identical (r.m.s.d. ~0.8 Å). Thus, similar to the type I C. thermocellum cohesin, the corresponding C. cellulolyticum protein does not undergo significant conformational changes upon binding to its dockerin ligands.
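The r.m.s.d. values quoted throughout are root mean square deviations between equivalent atoms. A minimal sketch of the calculation follows, assuming the two coordinate sets have already been superimposed and that atoms are listed in matching order; the superposition step itself is not shown. The function name and the toy coordinates are invented for illustration and are not taken from the deposited structures.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (A) between two pre-superimposed atom lists.

    Each argument is a sequence of (x, y, z) tuples for equivalent atoms,
    for example the C-alpha atoms of two cohesin modules.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: two atoms, each displaced by 1 A along z.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
toy_rmsd = rmsd(toy_a, toy_b)
```

Because every pair of atoms in the toy example is 1 Å apart, the r.m.s.d. is 1 Å. Values of 0.5 to 1.0 Å, such as those reported for the cohesin and dockerin modules, therefore correspond to very small average displacements.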
Structure of the C. cellulolyticum Type I Cohesin in Complex with Its Cognate Dockerin-The type I C. cellulolyticum cohesin in complex with its cognate dockerin has an elliptical structure comprising a 4-residue α-helix and 10 β-strands, which form 2 β-sheets aligned in an elongated β-sandwich and display a classical jelly roll fold (Fig. 2). The two sheets include β-strands 10, 1, 2, and 7 on one face (sheet A) and β-strands 5, 6, 3, and 9 on the other face (sheet B). The two sheets are connected by β-strand 8 that makes hydrogen bonds with both β-strand 9 on sheet A and β-strand 3 on sheet B. The cohesin also displays striking similarity to type I C. thermocellum cohesins (r.m.s.d. 0.8 Å).
Structure of the C. cellulolyticum Type I Dockerin-The dockerin, in both complexes, has an identical structure that consists of two parallel helices, comprising residues dAla-16 to dMet-27 and dAla-46 to dLeu-58, respectively, whereas the extended loop connecting these structural elements contains a 3-residue 3₁₀ helix (Fig. 2). The overall structure of the C. cellulolyticum dockerin is very similar to the C. thermocellum Xyn10B type I dockerin (r.m.s.d. 0.76 Å). The C. cellulolyticum dockerin, in both complexes, contains two Ca²⁺ ions coordinated by several amino acid residues in canonical EF-hand loop motifs. The coordination of the two calcium ions is identical to that of the metal ions observed in the type I dockerin of C. thermocellum Xyn10B in complex with its cognate cohesin (14). The N- and C-terminal helices of the C. cellulolyticum type I dockerin display significant sequence and structural conservation, with an r.m.s.d. for the internally repeated segments of 1.9 Å for the main chain atoms. This structural conservation includes the two EF-hand calcium-binding motifs and, significantly, results in a near-perfect 2-fold (dyad) symmetry within the dockerin module. Thus, if helix-1 is rotated 180°, it can be superimposed over helix-2 and vice versa. The functional significance of this structural symmetry is described in detail below.
C. cellulolyticum Type I Coh-DocA16S/L17T and Coh-DocA47S/F48T Interfaces-The dockerin in both complexes interacts mainly with one face (sheet B) of the cohesin barrel. Superimposition of Coh-DocA16S/L17T with Coh-DocA47S/F48T reveals that the structures of the cohesin (r.m.s.d. of ~0.5 Å for 141 C-α atoms) and dockerin (r.m.s.d. of ~1.0 Å for 57 C-α atoms) modules are very similar in the two crystal structures. Indeed, the dockerin in Coh-DocA16S/L17T presents the symmetry observed in Coh-DocA47S/F48T, with helices 1 and 2 rotated 180° and overlapping almost perfectly (Fig. 2). It is recognized that the analysis of electron density maps of proteins that display dyad symmetry can be difficult; however, the following differences in residues that are in equivalent positions in the two heterodimers enabled the orientation of the dockerin mutants A16S/L17T and A47S/F48T, respectively, to be determined: dMet-60/dHis-31, dIle-52/dGly-21, dLeu-50/dPhe-19, dAsn-43/dGly-12, dVal-39/dTyr-8, and dLeu-17/dPhe-48. The crystal structures of the complexes are entirely consistent with the view that helix-1 and helix-2 of the dockerin play equivalent roles in the recognition of its protein partner in the two heterodimers. Thus, in Coh-DocA16S/L17T helix-2 is the major region of the dockerin that interacts with its cohesin partner, whereas helix-1 dominates protomer recognition in Coh-DocA47S/F48T.
Site-directed mutagenesis data (Table 2) show that alanine substitution of the three hydrophobic cohesin residues cLeu-87, cLeu-89, and cMet-135 results in a substantial decrease in affinity (9-130-fold), whereas the cI93A mutation reduces the K_A 3-fold. Although the cN47A and cY49A mutations also cause an ~5-fold decrease in the K_A, the amino acid substitutions cT45A, cS76A, cS85A, cN91A, and cK137G had little influence on the affinity of the cohesin for its C. cellulolyticum dockerin partner. Thus, the only significant polar interactions between the protomers in Coh-DocA16S/L17T are the hydrogen bonds between the side chain carbonyl of cAsn-47 and the main chain amide of dAla-47 and between the phenolic hydroxyl of cTyr-49 and the main chain carbonyl of dIle-26. It is therefore apparent that hydrophobic interactions play a dominant role in the assembly of this complex.
The most significant hydrophobic interaction at the protomer interface involves the apolar pocket formed by cLeu-87 and cLeu-89, which is occupied by the side chain of dLeu-50 in Coh-DocA16S/L17T and by the aromatic side chain of dPhe-19 in Coh-DocA47S/F48T. Indeed, the affinity of the cL87A/cL89A cohesin mutant for its cognate dockerin is ~10⁴-fold lower than that of the wild type protein. The reciprocal dockerin mutations, dF19A/dL50A, also caused a significant reduction in K_A (~3000-fold), although this loss in affinity was less dramatic than in the corresponding cohesin mutant. This is consistent with the observation that cLeu-87 and cLeu-89, in addition to interacting with dPhe-19/dLeu-50, also make van der Waals contacts with dAla-20/dAla-51 and dAla-16/dAla-47. The modest decrease in affinity displayed by the cI93A mutant is rather surprising as C-δ1 and C-γ1 make hydrophobic contacts with dMet-27/dLeu-58, whereas C-γ2 interacts with dMet-60. The contribution of cMet-135 to dockerin recognition is through weak hydrophobic interactions with the aliphatic side chains of dLys-24/dLys-55, and strong van der Waals contacts with the C-β of dAla-20/dAla-51. The methionine may also contribute, indirectly, to dockerin recognition by making strong hydrophobic interactions with cLeu-89 and cIle-93 and thus play a role in optimizing the position of these residues for dockerin recognition. The hydrophobic interaction that appears to play a central role in the specificity of the C. cellulolyticum cohesin-dockerin interaction is between dLeu-17 and a shallow hydrophobic pocket of which cAla-129 is the major contributor (see below).
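The fold changes in K_A quoted for the mutants can be expressed as losses in binding free energy using ΔΔG = RT ln(fold decrease), which follows from ΔG = −RT ln K_A. The sketch below is illustrative and is not the authors' calculation. It assumes the titrations were run at the default 308 K and takes the gas constant as 1.987 cal mol⁻¹ K⁻¹; the function name is hypothetical.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal mol^-1 K^-1 (assumed value)
TEMPERATURE = 308.0  # K, assumed to be the titration temperature


def ddg_from_fold(fold_decrease, temperature=TEMPERATURE):
    """Loss of binding free energy (kcal/mol) for a given fold decrease in K_A."""
    return R_KCAL * temperature * math.log(fold_decrease)


# Fold decreases in K_A quoted in the text.
ddg_cohesin_double = ddg_from_fold(1e4)    # cL87A/cL89A
ddg_dockerin_double = ddg_from_fold(3000)  # dF19A/dL50A
ddg_single_low = ddg_from_fold(9)          # lower bound for single hydrophobic mutants
ddg_single_high = ddg_from_fold(130)       # upper bound for single hydrophobic mutants
```

On this basis the ~10⁴-fold loss for the cohesin double mutant corresponds to roughly 5.6 kcal mol⁻¹, a large fraction of the total binding free energy, whereas the 9-130-fold losses for the single mutants correspond to roughly 1.3 to 3.0 kcal mol⁻¹.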
In addition to the direct polar interactions described above, the protomers in both cohesin-dockerin heterodimers appear to make numerous solvent-mediated hydrogen bonds. Although the importance of these bridging water molecules in the protein complex is unclear, the mutation of the cohesin residues cSer-76, cSer-85, cAsn-91, and cLys-137, whose side chains make water-mediated hydrogen bonds with the dockerin, had little influence on affinity (Table 2). These data suggest that indirect polar interactions do not play a key role in cohesin-dockerin recognition. However, it is possible that water-mediated hydrogen bonds between main chain polar groups contribute to the formation of the heterodimer. Indeed, in the C. thermocellum type I cohesin-dockerin complex, an indirect hydrogen bond between ctAsp-39 (C. thermocellum cohesin and dockerin residues are preceded with ct and dt, respectively) and the main chain polar groups of four dockerin residues contributes 4 kcal mol⁻¹ of binding energy (8).
Comparison of the C. cellulolyticum and C. thermocellum Type I Cohesin-Dockerin Complexes-Overlaying the equivalent type I C. cellulolyticum and C. thermocellum cohesin-dockerin complexes reveals significant structural conservation in the organization of the complexes. Thus, comparison of C. cellulolyticum Coh-DocA16S/L17T with the native C. thermocellum Coh-Doc complex yields an overall r.m.s.d. of 1.36 Å for 180 equivalent C-α atoms (of both cohesin and dockerin), whereas comparison of the C. cellulolyticum Coh-DocA47S/F48T heterodimer with C. thermocellum Coh-DocS45A/T46A yields an r.m.s.d. of 1.2 Å for 183 equivalent C-α atoms. The location of the interface between the two protomers within the heterodimer is identical (Fig. 4), although there is little cross-specificity between the two complexes (8,13,16). Inspection of the cohesin-dockerin interactions in the C. thermocellum and C. cellulolyticum complexes provides insight into the molecular basis for the distinct specificities displayed by these protein pairs. Thus, the hydrophobic residues of the C. cellulolyticum cohesin that interact with the dockerin are clustered toward the C-terminal region of β-strand 6, whereas in the C. thermocellum protein the ligand binding apolar residues extend across the full length of β-strand 3 and also include amino acids in β-strands 5 and 6 (14,15). There is, however, some conservation in these key aliphatic residues. Thus, the following pairs of C. cellulolyticum and C. thermocellum cohesin residues occupy equivalent positions in their respective complexes: cLeu-87/ctLeu-83, cLeu-89/ctAla-85, and cTyr-49/ctVal-41. Several C. cellulolyticum and C. thermocellum hydrophobic dockerin residues that interact with the cognate cohesin are also highly conserved and include the following pairs: dMet-27/dtLeu-56, dIle-26/dtLeu-55, dLeu-50/dtLeu-14, dPhe-19/dtLeu-48, dLeu-58/dtLeu-22, and dMet-60/dtAla-24.
It is therefore not surprising that many of the hydrophobic interactions at the protomer interface are conserved in the two protein complexes. For example, the key hydrophobic pocket in the C. cellulolyticum cohesin, formed by cLeu-87 and cLeu-89, would be occupied by the C. thermocellum dockerin amino acid dtLeu-14.
In contrast to the hydrophobic interactions, there are significant differences in the polar interactions between the protomers in the C. cellulolyticum and C. thermocellum heterodimers. The only similar polar cohesin residues that interact with the cognate dockerins are cAsn-47/ctAsp-39 and cAsp-87/ctAsn-91, although mutagenesis studies have shown that the D39N mutation in a type I C. thermocellum cohesin reduces affinity for the C. thermocellum dockerin 1000-fold, indicating significant differences in the role of similar residues in the two complexes (8). It has been suggested that the dramatic effect of the D39N mutation reflects the capacity of both O-δ1 and O-δ2 to act as hydrogen bond acceptors in polar interactions with the dockerin. The lack of cross-species recognition between the protein partners may also reflect steric clashes, particularly when the C-terminal helix of the dockerin dominates cohesin binding; in this orientation dPhe-48 of the C. cellulolyticum dockerin will clash with the C. thermocellum cohesin residue ctAsn-127, whereas dtArg-53 of the C. thermocellum dockerin will clash with the C. cellulolyticum cohesin residue cIle-93.
The structure of the C. cellulolyticum complex provides insight into the mechanism by which the T12L mutation in the C. thermocellum dockerin leads to recognition of the C. cellulolyticum cohesin and how the A16S/L17T/A47S/F48T C. cellulolyticum dockerin mutant binds to the C. thermocellum cohesin (16). Thus, the side chain of the leucine introduced into the C. thermocellum dockerin makes several hydrophobic interactions with the C. cellulolyticum cohesin, most notably with the C-β of cAla-129 but also with the C-γ of cThr-45 (equivalent to ctGly-123 and ctAsn-37, respectively, in the C. thermocellum cohesin), while also making weak hydrophobic contacts with the aliphatic region of the cLys-137 side chain and the α-carbon of Gly-131. In contrast, the introduction of the corresponding Ser to Ala and Thr to Leu substitutions into the C-terminal binding site of the C. thermocellum dockerin did not confer specificity for the C. cellulolyticum cohesin. This may reflect the energetic penalty imposed by the predicted steric clashes discussed above. The capacity of the A16S/L17T/A47S/F48T C. cellulolyticum dockerin mutant to bind the C. thermocellum cohesin likely reflects the capacity of the introduced polar residues to make hydrogen bonds with ctAsn-37, ctAsp-39, and ctGlu-131. An intriguing feature of these "change in specificity" mutations (or substituting these amino acids with alanine) (Table 2) is that the engineered dockerins retain significant affinity for their original cohesin partners. This likely reflects the retention of the extensive hydrophobic interactions between the engineered dockerins and their native protein partners, exemplified by the functional importance of

DISCUSSION
Previous studies showed that the type I C. thermocellum dockerin domains contain two near identical type I cohesin-binding sites (14-16). The observation that the corresponding C. cellulolyticum system displays a similar binding mode, even though there is little cross-specificity between clostridial heterodimers, indicates that there is a strong evolutionary driver for the retention of two identical cohesin-binding sites in the dockerin domains. The reasons for the retention of two, apparently equivalent, binding modes are unclear. It could be argued that the two binding modes may enable the catalytic subunits to bind to multiple cohesins leading to polycellulosome assembly, as observed in C. thermocellum (28). The observed stoichiometry of one, where measured, however, indicates that both ligand-binding sites on the dockerin cannot be occupied simultaneously. Indeed, overlaying the two C. cellulolyticum complexes shows that the cohesin α-helix, extending from cAsn-70 to cAsn-74, will make steric clashes precluding the formation of a trimolecular complex.
The high degree of sequence conservation in the two cohesin- and calcium-binding sites of the C. cellulolyticum dockerins (13), leading to the retention of dyad symmetry, indicates that the observed dual binding mode is a general feature of the cellulosome of the bacterium. Indeed, the lack of sequence conservation in the regions linking the two helices/EF-hand motifs again points to how retention of both cohesin-binding sites represents a strong evolutionary driver. Furthermore, the structures of the protein complex, in combination with mutagenesis data, show a high degree of sequence conservation of the dockerin recognition residues displayed by the cohesin domains of C. cellulolyticum (8), indicating that there is little, if any, specificity between specific cohesin-dockerin partners in the cellulosome of the bacterium.
Thus, the biological significance of the dual binding mode displayed by C. cellulolyticum and C. thermocellum dockerins is an intriguing question. One possibility, which may merit further exploration, is that dual binding confers flexibility in cellulosome assembly (for review see Refs. 4 and 29). It could be argued that steric constraints imposed by the appended catalytic modules might restrict the combination of enzymes that can be assembled into a cellulosome complex. The dual binding mode displayed by dockerins could overcome these spatial limitations and thus increase the range of enzymes that can be integrated into discrete cellulosome molecules. There is a very large number of possibilities for cellulosomal organization, and the flexibility provided by dual binding may contribute to the plasticity required for correct incorporation of different enzyme combinations. Indeed the different binding modes displayed by the dockerins may allow optimal placement of the enzymes on the complex substrate, which is more easily accomplished in the plant cell wall degrading apparatus of aerobic microorganisms as the free enzymes of these organisms do not physically associate (30).
It should be emphasized, however, that the linker sequences that join the dockerin and catalytic modules may contribute significant conformational flexibility to the cellulosome quaternary structure, questioning the proposed requirement for the dual binding mode reported here for C. cellulolyticum dockerins. However, small angle x-ray scattering (SAXS) experiments have demonstrated that when Cel48F binds to its cognate cohesin, the rigidity of the linker that connects the catalytic module of the enzyme to its dockerin increases and the cellulase becomes more compact (31). The SAXS data also showed that the possible motions of the catalytic module of Cel48F with respect to the cohesin/dockerin interface are highly restricted, compared with the free enzyme. Thus, the enzyme linker is not likely to generate the flexibility, which may be required for plant cell wall degradation, once the enzyme is bound within the cellulosome (31). It should also be acknowledged that another source of flexibility may stem from the linkers that connect the cohesin modules in the scaffoldins. Similar SAXS studies have indeed shown that the linker joining the two cohesins of divergent species in the hybrid scaffoldin Scaf4 remains flexible, even when large enzymes (Cel48F) are anchored to both cohesin modules (32). It is apparent, however, that the 49-residue hybrid linker is longer than typical C. cellulolyticum and C. thermocellum scaffoldin linkers, comprising 10 and 39 residues, respectively. Indeed short linkers are a generic feature of mesophilic clostridial scaffoldins, exemplified by the corresponding proteins produced by Clostridium cellulovorans and Clostridium josui (33,34). It is unclear why nature should select a dual binding mode between cohesins and dockerins, in at least some species. Indeed if, as we propose, it is to confer conformational flexibility, it would seem facile for evolution to impose a selection for longer linker sequences.
It is possible that extended flexible linkers may be more prone to proteolytic attack than short inter-module sequences, and thus evolution may have selected a dual binding mode to introduce structural plasticity. In any event, the biological rationale for the dual binding mode remains opaque, and there is clearly an urgent need to investigate its biological significance.
This study, in conjunction with previous studies on the C. thermocellum cellulosome (14,15), shows that the flexibility in cohesin recognition appears to be a general feature of the type I dockerin modules that recruit the catalytic subunits into the clostridial cellulosome. In nonclostridial cellulosomes there are examples of sequence divergence in the duplicated regions of dockerins, exemplified by the Ruminococcus flavefaciens system, suggesting that the dual binding mode reported here may not be a universal feature of cellulosomes. This study also provides a mechanistic understanding of how the C. cellulolyticum cohesins recognize their cognate dockerins, and how subtle changes to the structure at the interface lead to significant changes in specificity, providing a template for engineering novel specificities into these highly efficient nanomachines.