Mutational Insights into the Roles of Amino Acid Residues in Ligand Binding for Two Closely Related Family 16 Carbohydrate Binding Modules*

Carbohydrate binding modules (CBMs) are specialized proteins that bind to polysaccharides and oligosaccharides. Caldanaerobius polysaccharolyticus Man5A CBM16-1 and CBM16-2 bind to glucose-, mannose-, and glucose/mannose-configured substrates. The crystal structures of the two proteins represent the only examples in CBM family 16, and studies that evaluate the roles of amino acid residues in ligand binding in this family are lacking. In this study, we probed the roles of amino acids (selected based on CBM16-1/ligand co-crystal structures) in substrate binding. Two tryptophan (Trp-20 and Trp-125) and two glutamine (Gln-81 and Gln-93) residues are shown to be critical in ligand binding. Additionally, several polar residues that flank the critical residues also contribute to ligand binding. The CBM16-1 Q121E mutation increased affinity for all substrates tested, whereas the Q21G and N97R mutants exhibited decreased substrate affinity. We solved CBM/substrate co-crystal structures to elucidate the molecular basis of the increased substrate binding by CBM16-1 Q121E. The Gln-121, Gln-21, and Asn-97 residues can be manipulated to fine-tune ligand binding by the Man5A CBMs. Surprisingly, none of the eight residues investigated was absolutely conserved in CBM family 16. Thus, the critical residues in the Man5A CBMs are either not essential for substrate binding in the other members of this family or the two CBMs are evolutionarily distinct from the members available in the current protein database. Man5A is dependent on its CBMs for robust activity, and insights from this study should serve to enhance our understanding of the interdependence of its catalytic and substrate binding modules.

Plant cell walls, composed mainly of cellulose, hemicellulose, and lignin, represent one of the most abundant sources of biomass on Earth (1), and microbial glycoside hydrolases play a critical role in their degradation into oligosaccharides and monosaccharides (2). These enzymes possess highly diversified biochemical activities, a reflection of the complexity of the chemical bonds by which the sugars are linked in plant cell walls.
The glycoside hydrolases commonly occur as modular polypeptides that contain catalytic domains and other modules, including carbohydrate binding modules (CBMs). CBMs exhibit three auxiliary effects that serve to improve the enzymatic functions of the catalytic domain (3). First, CBMs are able to increase the proximity of the catalytic domain to the substrates, thereby increasing the local concentration of the catalytic domain on the substrate (4). Second, CBMs recognize and bind to specific substrates, and this targets the catalytic domain to specific parts or regions of the complex plant cell wall (5, 6). Third, some types of CBMs function to physically disrupt the intermolecular interactions within certain polysaccharide substrates, thus facilitating subsequent hydrolysis by the associated catalytic domain (7, 8).
The carbohydrate binding modules are divided into three types based on their substrate preference as well as their binding mechanisms (3). Type A CBMs have a flat binding platform, composed mainly of aromatic amino acid residues, which enables them to bind to the surface of crystalline cellulose (9). In contrast, the binding site of type B CBMs is concave, forming a cleft that allows accommodation of up to six sugar residues in a single polysaccharide chain (10-12). Type C CBMs lack the extended binding cleft that type B CBMs possess and only bind to small sugars such as mono-, di-, or trisaccharides (13).
The large variation in substrate specificities of CBMs suggests that the structures of CBMs have plasticity that can be evolutionarily selected to accommodate different ligands. A single CBM may be able to bind to differently configured substrates, and this has been demonstrated for the Piromyces equi NCP1 CBM29-1 and CBM29-2, Caldanaerobius polysaccharolyticus Man5A CBM16-1 and CBM16-2, Clostridium thermocellum Lic26A-Cel5E CBM11 and CtCel9D-Cel44A CBM44 (5, 12, 14-16). The P. equi NCP1 CBM29-2 can bind amorphous cellulose and mannan and also glucomannan.
Similarly, the two family 16 CBMs of C. polysaccharolyticus (abbreviated as CBM16-1 and CBM16-2 below) are also able to bind to mannose- and glucose-configured substrates (16). The CBM16 polypeptides constitute a large family of 59 members that are linked to diverse catalytic domains (17-21), suggesting that these CBMs possess distinct binding specificities. Until recently, the residues that are key to ligand binding in this family were unknown. Insights were therefore gained when the crystal structures of the C. polysaccharolyticus Man5A CBM16-1 and CBM16-2 were solved by Bae et al. (16). Furthermore, the high resolution co-crystal structure of CBM16-1 in complex with mannopentaose and cellopentaose provided a sound basis for a detailed mutational analysis of these two CBMs. The Man5A CBM16-1 and CBM16-2 share high amino acid sequence homology (62% identity and 74% similarity), which forms the basis of their similar spatial structures (16). However, not all of the polar residues predicted to make direct contact with the ligand in CBM16-1 are conserved in CBM16-2. The high amino acid sequence conservation, accompanied by the minor differences in polar residues, as well as the 2-fold difference in binding affinity for these CBMs, provides an excellent model for probing the importance of key residues in binding to differently configured substrates (16).
In this study, each of the key residues of both CBM16-1 and CBM16-2, proposed to interact with ligands by either hydrophobic stacking interaction or forming direct hydrogen bonds, was mutated to alanine. Furthermore, nonconserved polar residues, predicted to be critical to binding, were mutated to their equivalent residues in the counterpart CBM. Subsequently, the relative binding affinities of the wild-type and mutant derivatives were compared by affinity gel electrophoresis (AGE) using sodium carboxymethyl cellulose (CMC-Na), locust bean gum (LBG), and konjac glucomannan (KGM), which represent glucose-configured, mannose-configured, and glucose/mannose mixed polysaccharide substrates, respectively. Furthermore, the binding parameters of the different derivatives of CBM16-1 and CBM16-2 were determined by isothermal titration calorimetry using β-1,4-cellopentaose and β-1,4-mannopentaose as substrates. Results from these studies revealed that Trp-20, Trp-125, Gln-81, and Gln-93 in the two CBMs play critical roles, although other polar residues targeted for mutation play moderately important roles in ligand binding. Interestingly, CBM16-1 Q121E increased the binding affinities for all substrates tested, suggesting that this noncritical polar residue can be used to manipulate the binding affinities of the C. polysaccharolyticus CBMs.

EXPERIMENTAL PROCEDURES
Materials-The construction of the plasmids expressing the C. polysaccharolyticus (formerly Thermoanaerobacterium polysaccharolyticum and reclassified by Lee et al. (22)) Man5A carbohydrate binding modules CBM16-1 and CBM16-2 was described by Bae et al. (16). The Escherichia coli JM-109 strain from Promega (Madison, WI) was used for maintaining and propagating all the plasmids in this study. The oligosaccharides, including β-1,4-D-cellopentaose, β-1,4-D-mannopentaose, and the polysaccharide konjac glucomannan, were purchased from Megazyme (Wicklow, Ireland). The other two polysaccharides, sodium carboxymethyl cellulose and locust bean gum, were from Sigma.
Site-directed Mutagenesis-The generation of point mutations for CBM16-1 or CBM16-2 was conducted using the QuikChange site-directed mutagenesis kit according to the manufacturer's instructions (Stratagene, La Jolla, CA). Briefly, 25 ng of plasmid DNA encoding either CBM16-1 or CBM16-2 wild-type was used as template in a PCR amplification. In addition, the reaction mixture contained 250 nM each of the primers, 0.2 mM dNTPs and 1 µl of PfuTurbo DNA polymerase. The PCR amplification steps were as follows: 95°C for 30 s for 1 cycle, then 95°C for 30 s, 55°C for 1 min, and 68°C for 6 min for 18 cycles. The PCR products were visualized by running 5-µl aliquots on a 1% agarose gel in TAE buffer. If a band appeared on the gel with the correct size, the rest of the PCR product was digested with DpnI at 37°C overnight. A 1-µl aliquot of the digested PCR product was then used to electroporate E. coli JM-109 electrocompetent cells. To select for recombinant E. coli cells, the products of the electroporation were spread on LB plates containing 100 µg/ml ampicillin and incubated at 37°C overnight. A single colony from each plate was inoculated in 10 ml of LB medium supplemented with ampicillin and cultured at 37°C with shaking overnight. The plasmids in the transformants were purified, and the DNA inserts were sequenced (W. M. Keck Center for Comparative and Functional Genomics, University of Illinois at Urbana-Champaign) to confirm the presence of the introduced mutation. To generate the plasmids encoding CBM16-1 Q21G/N97R/Q121E and CBM16-2 G21Q/R97N/E121Q triple mutants, the QuikChange multisite-directed mutagenesis kit was used with a mixture of primers containing the desired mutations.
Gene Expression and Protein Purification-The plasmids encoding either the wild-type or a mutant of CBM16-1 or CBM16-2 were transformed individually into BL21(DE3) CodonPlus RIL competent cells (Stratagene, La Jolla, CA) that were then selected on LB plates supplemented with 100 µg/ml ampicillin and 50 µg/ml chloramphenicol and cultured at 37°C. A single colony containing the desired plasmid was then cultured in LB broth supplemented with the same antibiotics at 37°C until the absorbance at 600 nm (A600) reached 0.3. The temperature for culturing the cells was decreased to 16°C, and isopropyl thiogalactopyranoside was added (0.1 mM, final concentration) to induce the expression of the particular CBM16 derivative. The culturing was continued for 16 h, and the cells were harvested by centrifugation. The cell pellets were resuspended in lysis buffer (50 mM Tris-HCl, pH 7.5, 300 mM NaCl, 20 mM imidazole), and the cell contents were released by passage through an EmulsiFlex C-3 cell homogenizer (Avestin, Ottawa, Canada). The supernatants were collected for subsequent protein purification.
All recombinant proteins were purified by immobilized metal ion affinity chromatography according to the manufacturer's instructions (Talon resin, Clontech). The eluted proteins were further purified by gel filtration using an AKTAxpress FPLC equipped with a HiLoad 16/60 Superdex 200 column (GE Healthcare) and a buffer composed of 50 mM Tris-HCl, 150 mM NaCl, pH 7.5.
AGE-AGE was used to compare the relative affinities of the various mutant proteins with those of the wild-type CBM16-1 or CBM16-2 in binding to CMC-Na, KGM, and LBG. The method employed was essentially that described by Tomme et al. (23), with minor modifications. All gels were composed of two layers. The upper layer was the stacking gel, which contained 3% acrylamide in 1.5 M Tris-HCl buffer, pH 8.3. The lower layer was the separating gel, which contained 12% acrylamide in the same buffer as the stacking gel. Varying concentrations of polysaccharides were added to the separating gels (CMC-Na, 0.25, 0.5, 1, 2 mg/ml; KGM, 3.1, 6.2, 12, 25 µg/ml; LBG, 7.8, 16, 32, 62 µg/ml). As a control, the proteins were also resolved in a gel containing no polysaccharide. One microgram each of wild-type CBM16-1 or CBM16-2 and their mutants was subjected to electrophoresis at 100 V for 7 h at a constant temperature of 4°C. After electrophoresis, the gels were stained with Coomassie Brilliant Blue G-250 for visualization.
To calculate the relative affinities for the polysaccharides, the migration distances were plotted against the substrate concentrations, using 1/(R − r) as the vertical coordinate and 1/[S] as the horizontal coordinate. R represents the relative migration distance of a specific carbohydrate binding module compared with BSA in the absence of polysaccharide, and r represents the relative migration distance of the same protein compared with BSA in the presence of a polysaccharide at a given concentration. S represents the concentration of the polysaccharide, expressed in mg/ml. The methods used in calculating the binding constants and relative affinities were essentially the same as described by Takeo et al. (24).
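The double-reciprocal analysis described above can be sketched in a few lines. This is a minimal illustration, not the authors' script: it assumes the plotted ordinate is 1/(R − r), that the CBM-polysaccharide complex is effectively immobile in the gel, and that the apparent dissociation constant is read from the ratio of slope to intercept of the fitted line. The function names and the synthetic migration values are hypothetical.

```python
import numpy as np

def age_dissociation_constant(conc_mg_per_ml, r_values, R):
    """Fit 1/(R - r) against 1/[S] and return the apparent Kd (mg/ml).

    R: relative migration without polysaccharide.
    r_values: relative migration at each polysaccharide concentration.
    For an immobile complex, 1/(R - r) = (1/R) * (1 + Kd/[S]),
    so Kd = slope / intercept (the x-intercept is -1/Kd).
    """
    conc = np.asarray(conc_mg_per_ml, dtype=float)
    r = np.asarray(r_values, dtype=float)
    x = 1.0 / conc
    y = 1.0 / (R - r)
    slope, intercept = np.polyfit(x, y, 1)  # least-squares straight line
    return slope / intercept

def relative_affinity_percent(kd_mutant, kd_wild_type):
    """Affinity of a mutant relative to wild type (a lower Kd means tighter binding)."""
    return 100.0 * kd_wild_type / kd_mutant

# Synthetic, noise-free values for illustration only (not data from this study).
conc = [0.25, 0.5, 1.0, 2.0]   # mg/ml, the CMC-Na series used in the gels
R, true_kd = 0.80, 0.50
r = [R * true_kd / (true_kd + c) for c in conc]
print(age_dissociation_constant(conc, r, R))
```

With real gels the points scatter around the line, so the quality of the linear fit should be inspected before a constant is reported.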
Circular Dichroism Spectroscopy-Far-UV CD spectroscopy was used to analyze the secondary structures of the wild type and mutants of CBM16-1 or CBM16-2. The spectroscopy was performed using a J-815 CD spectropolarimeter (Jasco, Japan) equipped with a constant temperature cell holder. All protein samples (300 µl, 0.2 mg/ml) in a 1-mm quartz cuvette were kept at 25°C for 5 min before scanning. The measurements started with an initial wavelength of 260 nm and a final wavelength of 190 nm with a wavelength step of 0.1 nm. The analysis of the secondary structure of the proteins was conducted on the Dichroweb website using the CDSSTR algorithm with reference Set 4, optimized for 190-240 nm (25, 26).
Isothermal Titration Calorimetry (ITC)-Isothermal titration calorimetry measurements were carried out at 25°C using a VP-ITC microcalorimeter from MicroCal Inc. (Northampton, MA). All proteins were extensively dialyzed against a buffer composed of 50 mM sodium phosphate, 150 mM NaCl, pH 7.0, and both oligosaccharide ligands (cellopentaose and mannopentaose) were dissolved in the same buffer. Typically, 100 µM protein was titrated with 28 successive 10-µl injections of ligand (1-2.6 mM) at 300-s intervals. The data collected were fitted by nonlinear regression using a single site binding model (MicroCal Origin software), and the thermodynamic parameters were calculated with both the Gibbs free energy equation ΔG = ΔH − TΔS and the relationship ΔG = −RT ln Ka.
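The two relationships quoted above are enough to derive the remaining thermodynamic parameters from a fitted association constant and enthalpy. The sketch below assumes Ka is expressed in M⁻¹ and ΔH in kJ/mol; the function name and the example inputs are hypothetical and are not values from the tables.

```python
import math

R_GAS = 8.314462618  # gas constant, J mol^-1 K^-1

def itc_thermodynamics(ka_per_molar, delta_h_kj_per_mol, temp_c=25.0):
    """Return dG, dH and T*dS in kJ/mol and dS in J mol^-1 K^-1.

    Uses dG = -RT ln(Ka) and dG = dH - T*dS.
    """
    t_kelvin = temp_c + 273.15
    delta_g = -R_GAS * t_kelvin * math.log(ka_per_molar) / 1000.0  # kJ/mol
    t_delta_s = delta_h_kj_per_mol - delta_g                       # kJ/mol
    return {
        "dG": delta_g,
        "dH": delta_h_kj_per_mol,
        "TdS": t_delta_s,
        "dS": 1000.0 * t_delta_s / t_kelvin,
    }

# Illustrative inputs only: Ka = 1e5 M^-1, dH = -40 kJ/mol at 25 °C.
params = itc_thermodynamics(1.0e5, -40.0)
print({name: round(value, 2) for name, value in params.items()})
```

A negative TΔS alongside a larger negative ΔH indicates enthalpy-driven binding with an entropic penalty.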
Crystallization and X-ray Data Collection-Initial co-crystallization efforts of CBM16-1 variants in complex with either mannopentaose or cellopentaose focused around the crystallization conditions originally reported for wild-type CBM16-1, with the respective oligosaccharide (16). Although the co-crystallization conditions were similar to those previously reported, crystals of the site-specific variants demonstrated markedly different morphology and occupied different space groups than the corresponding wild-type complexes. Briefly, for the variant CBM16-1 mannopentaose complex, 2 µl of complex (20 mg/ml in 100 mM KCl, 10 mM HEPES, pH 7.5, plus 5 mM mannopentaose) was added to 2 µl of precipitant (1.6 M ammonium sulfate, 100 mM MES, pH 6.5, 10% dioxane) and equilibrated over the precipitant solution at 20°C. For the variant CBM16-1 cellopentaose complex, 2 µl of complex (20 mg/ml in 100 mM KCl, 10 mM HEPES, pH 7.5, plus 5 mM cellopentaose) was added to 2 µl of precipitant (30% polyethylene glycol 3350, 100 mM Tris-HCl, pH 8.3, 200 mM MgCl2) and equilibrated over the precipitant solution at 20°C. Prior to data collection, crystals of each complex were supplemented with a cryoprotectant and vitrified by direct immersion into liquid nitrogen.
Crystals of the CBM16-1 Q121E-mannopentaose complex occupy space group C222₁ with unit cell parameters a = 56.4 Å, b = 96.5 Å, c = 72.9 Å, and contain one molecule in the crystallographic asymmetric unit. An 8-fold redundant data set was collected to a limiting resolution of 1.55 Å at LS-CAT (Advanced Photon Source, Argonne, IL) utilizing a Mar CCD detector (overall Rmerge = 7.2% and I/σ(I) = 4 in the highest resolution shell). Crystals of the CBM16-1 Q121E-cellopentaose complex occupy space group P1 with cell parameters a = 32.4 Å, b = 48.4 Å, c = 48.9 Å, α = 62.5°, β = 85.2°, γ = 86.2°, with two molecules in the asymmetric unit. A 5-fold redundant data set was collected to a limiting resolution of 1.35 Å at LS-CAT (Advanced Photon Source, Argonne, IL) utilizing a Mar CCD detector (overall Rmerge = 4.7% and I/σ(I) = 9 in the highest resolution shell). All data were indexed, merged, and scaled using the HKL2000 package (27).
Structure Determination and Refinement-Crystallographic phases for the CBM16-1 Q121E complexes were determined by molecular replacement (28, 29), utilizing the coordinates for the corresponding wild-type complex structures (minus all ligand and solvent molecules) as the search probe. Clear density for the appropriate ligand could be observed prior to any crystallographic refinement. Following rigid body refinement of the initial molecular replacement solution, each structure was subjected to automatic rebuilding using ARP/wARP (30), followed by manual rebuilding using XtalView (31). The respective ligands were manually built into difference Fourier maps after the free R factors dropped below 25%. Solvent molecules and alternate conformers were added to yield final models with good free R factors, no Ramachandran outliers, and excellent stereochemistry.
Cross-validation, using 5-7% of the data for the calculation of the free R factor, was utilized throughout the model building process to monitor building bias (32). The stereochemistry of the models was routinely monitored throughout the course of refinement using PROCHECK (33) and MolProbity.
Crystal parameters, data collection parameters, and refinement statistics for each of the structures are summarized in Table 9. The refined coordinates have been deposited in the PDB with accession 30EA (CBM16-1 Q121E-cellopentaose) and 30EB (CBM16-1 Q121E-mannopentaose).
Amino Acid Sequence Alignment-The amino acid sequences of members of CBM family 16 were retrieved from the Carbohydrate Active enZYme database and the GenBank database and aligned with the sequences of Man5A CBM16-1 and CBM16-2 by using ClustalX.
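Once two sequences have been aligned, a pairwise percent identity of the kind quoted for CBM16-1 and CBM16-2 can be computed directly from the aligned strings. The sketch below is only one possible convention: it assumes gaps are written as "-" and that columns containing a gap are excluded from the denominator, which may differ from the convention used by a given alignment program. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared

# Toy alignment for illustration; these are not the CBM16 sequences.
print(percent_identity("WQ-DKQ", "WGADRE"))
```

Percent similarity would additionally require a substitution matrix or a table of conservative replacements, which is not sketched here.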

RESULTS
Identification of Residues Involved in Ligand Binding and Construction of Mutants-In a previous study, we reported the crystal structure of CBM16-1 in complex with β-1,4-cellopentaose and β-1,4-mannopentaose (16). These structures revealed a binding groove that accommodated the mannose- and glucose-configured oligosaccharides. Within the binding cleft, two tryptophans (Trp-20 and Trp-125) and six polar residues (Gln-21, Asp-77, Gln-81, Gln-93, Asn-97, and Gln-121) were predicted to contribute to ligand binding by either hydrophobic stacking interactions or hydrogen bond formation with the ligands (Fig. 1, A and B). To explore the roles of these residues in ligand binding, each residue was individually mutated to alanine. These mutations allowed subsequent comparison of the binding affinities of the wild-type and the mutant forms of each protein by AGE and isothermal titration calorimetry.
The primary sequences of CBM16-1 and CBM16-2 each comprise 143 amino acid residues, counting the initial methionine that was included to facilitate gene expression. The two CBMs share 62 and 74% identity and similarity, respectively, at the amino acid sequence level (Fig. 1C). For CBM16-2, many of the residues predicted to be involved in binding for CBM16-1 are conserved. Thus, the two tryptophans (Trp-20 and Trp-125) and five polar residues (Asp-77, Gln-81, Gln-93, Arg-97, and Glu-121) of CBM16-2 were mutated to alanine. The Gly-21 in CBM16-2 (corresponding to Gln-21 in CBM16-1) was also mutated to alanine for a direct comparison with the CBM16-1 Q21A mutant.
Of the aromatic and polar residues, three differed between the two CBMs (Gln-21, Asn-97, and Gln-121 in CBM16-1 corresponded to Gly-21, Arg-97, and Glu-121 in CBM16-2). Each of these residues was mutated to the respective residue in the other CBM, yielding the mutants CBM16-1 Q21G, N97R, and Q121E and CBM16-2 G21Q, R97N, and E121Q. Furthermore, two mutants carrying triple mutations were constructed, and these were CBM16-1 Q21G/N97R/Q121E and CBM16-2 G21Q/R97N/E121Q. The triple mutations made the two CBMs even more similar to each other at the amino acid sequence level.
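The mutant names used here follow the usual wild-type residue, position, replacement residue pattern, with slashes joining multiple substitutions. As a bookkeeping aid, the sketch below applies such labels to a sequence string and checks that the stated wild-type residue is actually present. It assumes 1-based numbering that matches the residue numbers in the label; the function name and the short toy sequence are hypothetical and are not the CBM16 sequences.

```python
import re

_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")

def apply_mutations(sequence, label):
    """Apply a label such as 'Q21G' or 'Q21G/N97R/Q121E' to a protein sequence."""
    residues = list(sequence)
    for part in label.split("/"):
        match = _MUTATION.match(part.strip())
        if match is None:
            raise ValueError(f"cannot parse mutation {part!r}")
        wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
        index = position - 1  # labels are 1-based, Python strings are 0-based
        if index < 0 or index >= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[index] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)

# Toy five-residue sequence for illustration only.
print(apply_mutations("MAWQK", "W3A/Q4G"))
```

The wild-type check catches numbering offsets, for example when an initiator methionine or an affinity tag shifts the residue count.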
All wild-type and mutant genes were well expressed heterologously in E. coli, and the recombinant proteins were purified by affinity chromatography followed by size exclusion chromatography. On SDS-polyacrylamide gels, the wild-type and the mutant proteins appeared as single bands (data not shown).
Structural Integrity of CBM16-1 and CBM16-2 Mutants-To evaluate whether the mutations affected the structural integrity of the CBMs, the secondary structural elements of all proteins were compared using circular dichroism scans, and the spectra were analyzed using the Dichroweb on-line server (Tables 1 and 2). For all 12 mutants of CBM16-1, there were no significant changes in the secondary structure compared with the wild type. For CBM16-2, most of the mutations did not cause significant changes to the secondary structural elements, with the exception of the Q93A mutant, which showed a small but significant increase in β-sheet composition, and the G21Q/R97N/E121Q mutant, which showed a small increase in unordered structure.
Two Conserved Tryptophan Residues Are Critical for Binding to Substrate in Both CBM16-1 and CBM16-2-Sample plots used to determine the relative affinities of the wild-type and mutants of CBM16-1 and CBM16-2 from AGE are given in Fig. 2. In addition, representative ITC data are presented in Fig. 3. Mutation of Trp-20 and Trp-125 in CBM16-1 to alanine attenuated the capacity of CBM16-1 to bind to polysaccharides (Table 3), cellopentaose (Table 5), and mannopentaose (Table 7) to levels that were undetectable by the assays employed. One exception to this observation was that CBM16-1 W125A retained 13% binding affinity to CMC-Na relative to the wild-type protein (Table 3). CMC-Na is an artificial soluble cellulose analog with carboxymethyl groups linked to some hydroxyl groups of the glucopyranose chain. Therefore, the residual binding of W125A to this artificial substrate may not be biologically relevant. These results demonstrated that CBM16-1 Trp-20 and Trp-125 play critical roles in binding to polysaccharide and oligosaccharide substrates. Furthermore, the observation that binding to both cellopentaose and mannopentaose was attenuated indicates that the roles of the two amino acid residues in binding are independent of the orientation of the C2 hydroxyl.
Mutation of CBM16-2 Trp-20 and Trp-125 to alanine also abolished binding to all three polysaccharides (Table 4) as well as cellopentaose (Table 6) and mannopentaose (Table 8). These results further support the critical role of these two conserved tryptophan residues in ligand binding.
Two Conserved Polar Residues, Gln-81 and Gln-93, Are Essential for Ligand Binding in Both CBM16-1 and CBM16-2-Mutation of Gln-81 to alanine completely abolished the binding of CBM16-1 to each oligosaccharide (Tables 5 and 7) and also the polysaccharides (Table 3), indicating that, similar to Trp-20 and Trp-125, this residue plays a critical role in ligand binding. Similarly, CBM16-2 Q81A did not show binding to any of the polysaccharides (Table 4) or oligosaccharides (Tables 6 and 8), thus confirming the important role of Gln-81 in ligand binding in each of the two family 16 CBMs.
The binding of CBM16-1 Q93A was severely impaired with cellopentaose (1% residual binding, Table 5) and CMC-Na (13% residual binding, Table 3) as ligands, and for mannopentaose (Table 7), LBG, and KGM (Table 3), our methods did not detect any binding. As expected, the CBM16-2 Q93A mutant also had no detectable binding affinity for mannopentaose (Table 8), KGM, and LBG (Table 4); however, this mutant retained some affinity for cellopentaose (6% residual binding, Table 6) and CMC-Na (28% residual binding, Table 4). These results therefore suggested that Gln-93 is a critical residue for binding to mannose-configured substrates for both CBM16-1 and CBM16-2. Furthermore, the residual binding to cellopentaose indicates that this residue may be important for discriminating between glucose-and mannose-configured substrates.
Other  (Table 5) and mannopentaose (Table 7), respectively. The mutation also reduced the affinity by about 3-fold for CMC-Na, 8-fold for LBG, and 7-fold for KGM, respectively (Table 3). Not surprisingly, the binding affinities of CBM16-2 G21A remained almost the same for polysaccharides in AGE analysis (Table 4). However, the binding affinities of G21A for both oligosaccharides were decreased 2-fold in the ITC measurements (Tables 6 and 8).
The CBM16-1 D77A mutation decreased the binding affinities for all polysaccharides and oligosaccharides, with a range from 1.3- to 4.3-fold (Tables 3, 5, and 7). The CBM16-2 D77A mutation also decreased the binding affinities for both the oligosaccharides (Tables 6 and 8) and polysaccharides (Table 4).

Binding Affinities of CBM16-1 and CBM16-2 Can Be Tuned by Replacing a Nonconserved Polar Residue with Its Equivalent in the Other CBM-When CBM16-1 Gln-121 was mutated to glutamic acid, the binding affinities were increased for all of the substrates tested, ranging from 1.3-fold for LBG to 2.8-fold for KGM (Tables 3, 5, and 7). Meanwhile, the enthalpies were more favorable for oligosaccharide binding: ΔΔH = −1.02 kJ for Q121E-cellopentaose (Table 5) and ΔΔH = −0.83 kJ for the Q121E-mannopentaose reaction (Table 7). This result suggests that the amino acid residue at position 121 may be a critical determinant in manipulating the binding affinities for these two CBMs.
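To put fold changes in affinity on an energy scale, the relationship ΔG = −RT ln Ka implies ΔΔG = −RT ln(fold change). The sketch below applies this to the 1.3-fold and 2.8-fold increases quoted above. It is an illustrative conversion only: it assumes the fold change is a true ratio of association constants and uses 25°C by default, whereas the gel-based measurements were run at a different temperature. The function name is hypothetical.

```python
import math

R_GAS_KJ = 8.314462618e-3  # gas constant, kJ mol^-1 K^-1

def ddg_from_fold_change(fold_change, temp_c=25.0):
    """Change in binding free energy (kJ/mol) for a given fold change in Ka.

    Negative values mean tighter binding than the reference protein.
    """
    if fold_change <= 0:
        raise ValueError("fold change must be positive")
    return -R_GAS_KJ * (temp_c + 273.15) * math.log(fold_change)

for fold in (1.3, 2.8):
    print(f"{fold}-fold -> {ddg_from_fold_change(fold):.2f} kJ/mol")
```

Under these assumptions the two fold changes correspond to roughly −0.65 and −2.55 kJ/mol, which is small compared with the energy of a single hydrogen bond and consistent with a fine-tuning role for position 121.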
Structural Basis for Increased Oligosaccharide Affinity of CBM16-1 Q121E-Given that our calorimetric analysis suggests that Gln-121 in CBM16-1 is a critical determinant for modulating both oligosaccharide and polysaccharide binding affinity, we determined the co-crystal structures of CBM16-1 Q121E in complex with mannopentaose (1.55 Å resolution) and cellopentaose (1.35 Å resolution) to further delineate the basis for increased binding affinity observed for this variant.
Each co-crystal structure was determined by molecular replacement, and clear and continuous electron density, corresponding to each of the five β-1,4-linked sugars, could be observed in initial electron density maps, prior to any crystallographic refinement. Each structure was refined without restraints to free R factors below 22%. Detailed information on the collected data, phasing, and refinement is provided in Table 9.

FIGURE 2. Sample plots used to determine the relative affinities of CBM16-1 with KGM (A), CMC-Na (B), and LBG (C) or CBM16-2 with KGM (D), CMC-Na (E), and LBG (F) in AGE.
a The interactions are expressed as HS, residues predicted to be involved in hydrophobic stacking with ligand; HB, residues predicted to be involved in hydrogen bonding with ligand. b ND means no binding was detected.
NOVEMBER 5, 2010 • VOLUME 285 • NUMBER 45

Role of Amino Acids in Family 16 CBM
A comparison of each variant co-crystal structure against the appropriate wild-type complex suggests only modest changes with root mean square deviations of 0.2 Å for all Cα atoms for the cellopentaose complex structures and 0.3 Å for the mannopentaose complex structures. In each of these variant co-crystal structures, the oligosaccharide is buried in an extended conformation within a 25-Å cleft orthogonal to the β-sheets of the polypeptide. As observed in each of the wild-type co-crystal structures, interactions with the oligosaccharide are mediated by hydrophobic stacking (with Trp-20 and Trp-125) and extensive hydrogen bonding interactions.
In comparison with the wild-type CBM16-1-mannopentaose structure, the position of the oligosaccharide in the Q121E-mannopentaose co-crystal structure deviates significantly at sites 1 and 2. This is not unexpected as these sites are liganded by the polypeptide only along one face of the sugar-binding cleft. As a consequence, the C6 hydroxyl at site 2 now orients toward the top part of the cleft, resulting in a hydrogen bond interaction with the Oδ2 of Asp-77 (O-O distance of 2.7 Å) that is absent in the wild-type co-crystal structure.
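Root mean square deviations over Cα atoms, such as the 0.2 and 0.3 Å values quoted above, are normally computed after optimal superposition of the two coordinate sets. The sketch below shows one standard way to do this, the Kabsch algorithm, using NumPy. It is not necessarily the procedure or software the authors used, and it assumes the two arrays already list equivalent atoms in the same order. The function name and the toy coordinates are hypothetical.

```python
import numpy as np

def kabsch_rmsd(coords_a, coords_b):
    """RMSD (in the input units) after optimal rigid-body superposition of A onto B."""
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    if a.shape != b.shape or a.ndim != 2 or a.shape[1] != 3:
        raise ValueError("both inputs must be N x 3 arrays with the same N")
    a_centered = a - a.mean(axis=0)  # remove translation
    b_centered = b - b.mean(axis=0)
    covariance = a_centered.T @ b_centered
    u, _, vt = np.linalg.svd(covariance)
    # Guard against an improper rotation (reflection).
    sign = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, sign]) @ u.T
    a_rotated = a_centered @ rotation.T
    squared = np.sum((a_rotated - b_centered) ** 2, axis=1)
    return float(np.sqrt(squared.mean()))

# Toy check: a rotated and translated copy should superimpose almost exactly.
points = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
rot_z = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])  # 90 degrees about z
moved = points @ rot_z.T + np.array([5.0, -3.0, 2.0])
print(kabsch_rmsd(points, moved))
```

For real structures the Cα coordinates would first be extracted from the two coordinate files and matched by residue number before being passed to the function.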
In the variant structure, the side chain of Glu-121 is situated closer to the pyranoside in site 4 than in the wild-type structure, resulting in a favorable electrostatic interaction between Oε1 and the C6 hydroxyl (Fig. 4A). The introduction of an additional negative charge in this variant does not disrupt the hydrogen-bonding interaction with neighboring ligand residue Gln-81, which interacts with the C3 hydroxyl in site 3. However, the Q121E mutation results in the replacement of a nitrogen atom with oxygen as the hydrogen-bonding partner. The Glu-121 side chain is positioned directly in line with the side chain of Gln-81, resulting in a stereochemistry that may orient the lone pair electrons on Nε2 of Gln-81 for more favorable interactions with the C3 hydroxyl of the ligand at site 3.
The CBM16-1 Q121E-cellopentaose co-crystal structure reveals a similar set of compensatory movements, in comparison with the wild-type co-crystal structure (Fig. 4B). The side chain of Glu-121 is shifted even more toward the pyranoside at site 4, relative to the mannopentaose co-crystal structure, resulting in a long hydrogen bond (3.5 Å) between the C6 hydroxyl and Oε2 of Glu-121. This may account for the modestly more favorable enthalpic difference of −0.21 kJ (relative to the wild-type) for the interaction of this variant with cellopentaose over mannopentaose.
Amino Acid Sequence Analysis-When we used CBM16-1 to search the GenBank protein database, we found that the protein with the top hit score was the CenC domain protein from Paenibacillus sp. JDR-2 (GenBank accession number YP_003011283; supplemental Fig. 1). The result from our search was similar for CBM16-2. Despite the important roles in substrate binding of the two tryptophan residues and six polar residues of CBM16-1/-2 along the binding cleft, the amino acid sequence alignment revealed that only Trp-20 is conserved in the CenC domain protein and an endo-β-galactosidase (GenBank accession number AAD04036), and Trp-125 is replaced by a conservative residue, phenylalanine, in the CenC domain protein. Interestingly, the polar residues are not conserved. Furthermore, we detected only limited conservation of the residues examined in this study in other members of CBM family 16 (supplemental Fig. 1).

DISCUSSION
The C. polysaccharolyticus CBM16-1 and CBM16-2 are the only two proteins within the carbohydrate binding module family 16 with known three-dimensional structures (16). From the co-crystal structure of CBM16-1 with substrate, two tryptophan residues and six polar residues were predicted to contribute to ligand binding by providing hydrophobic stacking interaction or direct hydrogen bonding interactions. From the current mutational studies, we have demonstrated that the roles of these residues in CBM16-1 as well as the equivalent residues in CBM16-2 in ligand binding are similar. The two tryptophans (Trp-20 and Trp-125) in both CBMs were found to be critical for ligand binding, which is consistent with many studies emphasizing the importance of aromatic residues in the binding cleft of CBMs (5, 34, 35). Two polar residues (Gln-81 and Gln-93) are also critical for ligand binding in both CBM16-1 and CBM16-2, although Gln-81 seems indispensable. The two hydrogen bonds formed between the γ-amino of Gln-81 and the C3 hydroxyl at site 3 and between the γ-carbonyl group of Gln-81 and the C2 hydroxyl at the same site not only ensure that the two CBMs bind to glucose-configured or mannose-configured substrates rather than other substrates but also provide an essential force to form a stable CBM-substrate complex. Similar to Gln-81, the CBM16-1 Gln-93 is relatively buried at the bottom of the binding cleft, possibly excluding solvent from ligand binding. Similar critical roles for buried polar residues have been observed for C. thermocellum Xyn10B CBM22-2 and C. thermocellum Xyn11A CBM6 (11, 36). Thus, Trp-20, Trp-125, Gln-81, and Gln-93 collectively form a core binding platform, which interacts with sugar units at four subsites (sites 5, 4, 3, and 2) of the pentasaccharides.
In contrast, the other polar residues (Gln-21, Asp-77, Asn-97/Arg-97, and Gln-121/Glu-121) play moderately important roles in ligand binding. These residues are scattered around the binding core, providing auxiliary interactions with the ligand. These four positions include the three nonconserved polar residues (Gln-21/Gly-21, Asn-97/Arg-97, and Gln-121/Glu-121) among the eight positions investigated in this study, suggesting that the residues at these positions may permit manipulation of the binding affinities of the CBMs. Indeed, changing the nonconserved residue Gln-121 to its CBM16-2 equivalent, glutamate, led to a CBM16-1 mutant (CBM16-1 Q121E) with an increased binding affinity for each of the substrates tested. Although there have been many mutational studies of amino acid residues predicted to be responsible for substrate binding by type B CBMs, to our knowledge this is only the second example of a single mutation reported to increase the binding affinity of a CBM for its original substrates (5, 34-39). The P. equi CBM29-2 K85A was the first example; Lys-85 makes a steric clash with O2 of glucoside (but not mannoside) 4, explaining why its mutation to alanine increases the affinity for glucose-containing substrates.
To investigate the structural basis for its increased affinity, we determined high-resolution crystal structures of the CBM16-1 Q121E variant in complex with the oligosaccharides mannopentaose and cellopentaose. Although the overall structures of the variant complexes are nearly identical to those of the corresponding wild-type complexes, modest movements at and around the side chain of residue 121 occur as a consequence of the Q121E mutation. These changes, along with a more favorable electrostatic environment, result in the improved affinity of this variant toward oligosaccharides.
Although residues Trp-20, Trp-125, Gln-81, and Gln-93 coordinately hold the ligand tightly from its upper and lower faces, Gln-21/Gly-21, Asp-77, Asn-97/Arg-97, and Gln-121/Glu-121 are residues that can be fine-tuned for important auxiliary functions in binding the ligand. This view is further supported by the data from the CBM16-1 triple mutant. Although the Q121E mutation increased the binding affinity of CBM16-1 for all substrates, combining it with the N97R and Q21G mutations, which have negative effects on binding by CBM16-1, counteracted the positive effect of Q121E. Such fine-tuning of the key residues has likely taken place in these tandemly linked family 16 CBMs to optimize their function in Man5A. The tandem CBMs are similar to each other in both binding specificity and strength. These similarities might be functionally important for their roles in the intact Man5A.
Distinctive responses to differently configured substrates were observed for some mutants. Specifically, the alanine mutants of two critical residues, i.e. Trp-125 of CBM16-1 and Gln-93 of CBM16-1/-2, retained higher residual binding affinities for glucose-configured than for mannose-configured substrates. For the alanine mutants of the moderately important polar residues, there were still subtle differences, with a trend of slightly higher remaining affinities for glucose-configured oligosaccharides or polysaccharides for Gln-21, Asp-77, Asn-97, and Gln-121 of CBM16-1, and Asp-77 and Glu-121 of CBM16-2.
Interestingly, the mutants of CBM16-2 seemed to be more susceptible to new H-bond formation, because more mutants (CBM16-2 D77A, R97A, E121A, R97N, and E121Q) had binding enthalpies that were either nearly unchanged or increased for both cellopentaose and mannopentaose (Tables 6 and 8). In contrast, CBM16-1 had fewer mutants (Q93A and Q121A) showing this characteristic (Tables 5 and 7). This implies that, although CBM16-1 and CBM16-2 are similar to each other in both primary sequence and spatial structure, CBM16-2 may have a more flexible binding cleft than CBM16-1.
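For readers comparing binding enthalpies with binding affinities, the standard thermodynamic relations ΔG = −RT ln Ka and ΔG = ΔH − TΔS connect the two quantities. The following is a minimal sketch of these relations only; the function names are hypothetical, the default temperature of 298.15 K is an assumption rather than the experimental condition, and any Ka or ΔH values passed in are placeholders, not data from Tables 5 to 8.

```python
import math

# Molar gas constant in kJ mol^-1 K^-1
R_KJ = 8.314462618e-3


def delta_g(ka, temp_k=298.15):
    """Binding free energy (kJ/mol) from an association constant Ka (M^-1).

    temp_k defaults to 298.15 K, an assumed temperature for illustration.
    """
    return -R_KJ * temp_k * math.log(ka)


def t_delta_s(ka, delta_h, temp_k=298.15):
    """Entropic term T*dS (kJ/mol), from dG = dH - T*dS."""
    return delta_h - delta_g(ka, temp_k)


def delta_delta_g(ka_mutant, ka_wild_type, temp_k=298.15):
    """Change in binding free energy (kJ/mol) of a mutant versus wild type.

    Negative values mean that the mutant binds more tightly.
    """
    return -R_KJ * temp_k * math.log(ka_mutant / ka_wild_type)


if __name__ == "__main__":
    # Placeholder values only: a mutant with twice the wild-type affinity.
    print(f"ddG = {delta_delta_g(2.0e4, 1.0e4):.2f} kJ/mol")
```

Because ΔG depends on both ΔH and TΔS, a mutant whose binding enthalpy is unchanged or more favorable can still bind more weakly if the entropic term becomes less favorable.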
Man5A is a robust enzyme that releases oligosaccharides from polysaccharides such as locust bean gum, guar gum, β-mannan, glucomannan, lichenan, and carboxymethyl cellulose. In the hydrolysis of the first four substrates, the enzyme is highly dependent on its two CBMs for release of end products, because deletion of the two CBMs almost abolishes product detection (40,41). Our preliminary studies suggest that the two CBMs may be important in destabilization of the polysaccharide substrates, thus increasing substrate availability to the catalytic module. We look forward to investigating the effects of the mutations reported in this study on the enzymatic activity of Man5A, especially on glucose-, mannose-, and glucose/mannose-configured polysaccharides. Also significant is the observation that the four critical residues (Trp-20, Trp-125, Gln-81, and Gln-93) in the two family 16 CBMs are not highly conserved in the other assigned members of this family. We postulate that most of the members in this family either bind to different substrates or bind to ligands similar to those of the Man5A CBMs but with different strategies.