Family 46 Carbohydrate-binding Modules Contribute to the Enzymatic Hydrolysis of Xyloglucan and β-1,3–1,4-Glucans through Distinct Mechanisms

Background: CBMs are, generally, functionally and structurally autonomous from their associated catalytic domains.
Results: The structure of a novel cellulase, BhCel5B, reveals that the appended carbohydrate-binding module, CBM46, extends the enzyme catalytic cleft.
Conclusion: CBM46 targets BhCel5B to xyloglucan and is part of the catalytic cleft required for the hydrolysis of β-1,3–1,4-glucans.
Significance: CBM46 has a dual role in the hydrolysis of complex carbohydrates by BhCel5B.

Structural carbohydrates comprise an extraordinary source of energy that remains poorly utilized by the biofuel sector as enzymes have restricted access to their substrates within the intricacy of plant cell walls. Carbohydrate active enzymes (CAZYmes) that target recalcitrant polysaccharides are modular enzymes containing noncatalytic carbohydrate-binding modules (CBMs) that direct enzymes to their cognate substrate, thus potentiating catalysis. In general, CBMs are functionally and structurally autonomous from their associated catalytic domains from which they are separated through flexible linker sequences. Here, we show that a C-terminal CBM46 derived from BhCel5B, a Bacillus halodurans endoglucanase, does not interact with β-glucans independently but, uniquely, acts cooperatively with the catalytic domain of the enzyme in substrate recognition. The structure of BhCBM46 revealed a β-sandwich fold that abuts onto the region of the substrate binding cleft upstream of the active site. BhCBM46 as a discrete entity is unable to bind to β-glucans. Removal of BhCBM46 from BhCel5B, however, abrogates binding to β-1,3–1,4-glucans while substantially decreasing the affinity for decorated β-1,4-glucan homopolymers such as xyloglucan. The CBM46 was shown to contribute to xyloglucan hydrolysis only in the context of intact plant cell walls, but it potentiates enzymatic activity against purified β-1,3–1,4-glucans in solution or within the cell wall. This report reveals the mechanism by which a CBM can promote enzyme activity through direct interaction with the substrate or by targeting regions of the plant cell wall where the target glucan is abundant.

Deconstruction of plant cell wall carbohydrates is a process of considerable biological importance but is relatively inefficient due to the interlocking organization of polysaccharides within this macromolecular assembly (1, 2). Reflecting the complex organization of plant cell walls, which restricts enzyme access to their target substrates (3), hydrolysis of structural polysaccharides requires the cooperative action of a large consortium of carbohydrate active enzymes (CAZYmes), primarily glycoside hydrolases but also polysaccharide lyases, carbohydrate esterases, and polysaccharide oxidases (4-6). These enzymes have recently acquired considerable biotechnological significance in different industries, particularly in the emerging bioenergy and biorefinery sectors (7).
CAZYmes acting on recalcitrant substrates often present a modular architecture comprising a catalytic domain connected through flexible linker sequences to one or more noncatalytic carbohydrate-binding modules (CBMs). CBMs potentiate the activity of their appended catalytic modules by promoting a close interaction between the associated catalytic domains and their target substrates (8-13). Glycoside hydrolases (GHs) and CBMs are grouped in families based on primary sequence similarities in the continuously updated CAZy database (14-16). Currently (July 2014), there are 133 families of GHs and 69 families of CBMs. GH5, a large glycoside hydrolase family, contains members from a broad spectrum of organisms that display a diversity of specificities against substrates comprising β-linked oligo- and polysaccharides and glycoconjugates (17). GH5 was recently organized into more than 51 evolutionarily distinct subfamilies, which may be mono- or poly-specific. One example of a poly-specific GH5 subfamily is GH5_4, which includes endo-β-1,4-glucanases (EC 3.2.1.4), xyloglucanases (EC 3.2.1.151), β-1,3-1,4-glucanases (EC 3.2.1.73), and xylanases (EC 3.2.1.8). GH5 catalytic domains, including those of subfamily GH5_4, are generally fused to CBMs of different families, which display a ligand specificity that reflects the substrate specificity of the associated catalytic module.
Based on structure/function studies, CBMs were classified into three types. Type A CBMs recognize the surfaces of crystalline polysaccharides such as cellulose and are located in cellulases and also in noncellulosic enzymes (18). Type B CBMs bind internal regions of single carbohydrate chains (endo-type). In contrast, type C CBMs bind the termini of a large variety of polysaccharides (exo-type) (13). Thus, CBMs are generally structurally independent of the associated catalytic domain and may express a variety of ligand specificities supported by three major surface topologies. One notable exception to this general trend is CBM3 subfamily c (CBM3c) (19). Members of subfamilies a and b were shown to bind strongly to crystalline cellulose (20, 21). In contrast, CBM3cs, as discrete entities, do not bind crystalline cellulosic substrates. Instead, CBM3c members are always associated with a subgroup of GH9 catalytic domains and have been shown to alter GH9 function from the standard endo-acting mode to a processive endo-mode of action (22, 23). Structural data revealed that CBM3c extends the catalytic site of the associated GH9 catalytic domains (22). More recent examples of CBMs modulating catalytic specificity are provided by a type C fructan-binding CBM66 that directs the cognate enzyme toward highly branched glucans rather than linear fructose polymers (24) and a CBM48 that contributes to substrate binding at the active site of a glucan phosphatase (25).
Although in general CBMs function independently from their appended catalytic domains, it is possible that there are other examples, similar to CBM3c, CBM66, and CBM48, in which CBMs are structurally integrated into the associated catalytic domains and make a direct contribution to enzymatic activity and specificity. Inspection of the CAZy database revealed a structurally uncharacterized CBM family, CBM46, where all members are associated with GH5_4 catalytic domains and are part of proteins displaying identical molecular architectures. Thus, all CBM46 members are found at the C terminus of CAZYmes containing an N-terminal GH5_4 catalytic domain, followed by an internal immunoglobulin (Ig)-like module. Conservation in the molecular architectures of proteins containing CBM46 suggests a functional relevance for the association of GH5_4 and CBM46. Here, we report the biochemical, structural, and functional characterization of endo-β-1,4-glucanase B (BhCel5B) from B. halodurans (Fig. 1) (26). B. halodurans is a rod-shaped, Gram-positive soil bacterium that in comparison with Bacillus subtilis is known to colonize more alkaline environments (26). This study reveals that BhCBM46 as a discrete entity does not bind soluble or insoluble polysaccharides. The crystal structure of BhCel5B reveals that CBM46 is tightly associated with the catalytic module and, dependent on the glucan presented to the enzyme, can contribute directly to substrate binding or play a targeting role in directing the enzyme to regions of the plant cell wall rich in the polysaccharide hydrolyzed by the enzyme.

EXPERIMENTAL PROCEDURES
Carbohydrates-All carbohydrates were purchased from Megazyme International (Bray, County Wicklow, Ireland), except hydroxyethylcellulose, which was obtained from Sigma, and Avicel, which was obtained from Merck. Regenerated cellulose was prepared as described previously (27).
Cloning, Expression, and Purification-DNA encoding full-length BhCel5B (residues 30-574; accession no. BA000004) and its truncated derivatives, BhGH5-Ig (residues 30-463), BhCBM46 (residues 464-574), and BhIg-CBM46 (residues 365-574), were amplified by PCR from B. halodurans genomic DNA using the thermostable DNA polymerase NZYProof (NZYTech Ltd., Portugal) and primers described in Table 1 (see Fig. 1 for molecular architecture of the proteins). Primers contained engineered restriction sites for direct cloning into the prokaryotic expression vector. Thus, the amplified genes were digested with NheI and XhoI and cloned into pET28a. The gene encoding BhCel5B_W501A_F504A_F507A_Y509A_R531A_E296A (Table 1) was synthesized in vitro (NZYTech Ltd.) with a codon usage optimized for expression in Escherichia coli. The synthesized gene was cloned into pET28a as described above. All recombinant proteins contained N-terminal His6 tags. Recombinant plasmids encoding BhCel5B derivatives were used to transform E. coli BL21 (DE3) cells. Expression of all proteins was achieved by adding isopropyl β-D-thiogalactopyranoside (1 mM final concentration) to mid-exponential phase (A600 nm = 0.6) grown cells with incubation for a further 16 h at 19°C. The His6-tagged recombinant proteins, and their respective mutant derivatives, were purified from cell-free extracts by immobilized metal affinity chromatography as described previously (28). For crystallization, proteins were further purified by size exclusion chromatography. Following immobilized metal affinity chromatography, fractions containing the purified proteins were buffer-exchanged, using PD-10 Sephadex G-25 M gel filtration columns (GE Healthcare), into 50 mM Na-HEPES buffer, pH 7.5, containing 200 mM NaCl and 5 mM CaCl2. Recombinant proteins were subjected to gel filtration using a HiLoad 16/60 Superdex 75 column (GE Healthcare) at a flow rate of 1 ml/min. Preparation of E. coli to generate selenomethionylated BhCBM46 (SeMet-BhCBM46) was performed as described previously (29), and the protein was purified using the same procedures as employed for the native BhCBM46. Purified proteins were concentrated using an Amicon 10-kDa molecular mass centrifugal concentrator and washed three times with 5 mM DTT (for the SeMet proteins) or water (for native BhCBM46 and BhCel5B), containing 1 mM CaCl2. Protein purity was analyzed through SDS-PAGE.
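The construct boundaries quoted above can be kept in a small lookup table, which makes it easy to check that the truncated derivatives cover the expected parts of the full-length protein. The following is only an illustrative sketch: the residue ranges are those stated in the text, whereas the helper names (`construct_length`, `slice_construct`, `tiles_parent`) are hypothetical and not part of any published pipeline.

```python
# Residue boundaries as stated in the text (1-based, inclusive).
CONSTRUCTS = {
    "BhCel5B": (30, 574),
    "BhGH5-Ig": (30, 463),
    "BhCBM46": (464, 574),
    "BhIg-CBM46": (365, 574),
}


def construct_length(bounds):
    """Number of residues covered by 1-based inclusive bounds."""
    start, end = bounds
    return end - start + 1


def slice_construct(sequence, bounds):
    """Return the sub-sequence for 1-based inclusive residue bounds."""
    start, end = bounds
    return sequence[start - 1:end]


def tiles_parent(parent, left, right):
    """True if `left` and `right` are contiguous and together span `parent`."""
    return (
        left[0] == parent[0]
        and right[1] == parent[1]
        and left[1] + 1 == right[0]
    )


if __name__ == "__main__":
    for name, bounds in CONSTRUCTS.items():
        print(name, bounds, construct_length(bounds))
```

With these ranges, BhGH5-Ig and BhCBM46 are contiguous and together span the full-length construct, whereas BhIg-CBM46 overlaps the C-terminal part of BhGH5-Ig.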
Site-directed Mutagenesis-Site-directed mutagenesis was carried out using the PCR-based NZYMutagenesis site-directed mutagenesis kit (NZYTech Ltd., Portugal) deploying the primers listed in Table 1. BhCBM46, BhGH5-Ig, and BhCel5B were used as DNA templates. The generated DNA sequences were sequenced to ensure that only the engineered mutations had been incorporated into the nucleic acid.
Affinity Gel Electrophoresis-The binding to soluble polysaccharides was evaluated by affinity gel electrophoresis following the method described previously (30). Polysaccharide ligands were used at a concentration of 0.3% (w/v), unless otherwise stated. Electrophoresis was carried out at room temperature in native 10% (w/v) polyacrylamide gels. The gels were also loaded with BSA, which acts as a noninteracting negative control. After electrophoresis, proteins were visualized through staining with Coomassie Blue.
Isothermal Titration Calorimetry (ITC)-ITC experiments were carried out essentially as described previously (30), except that proteins were in 50 mM Na-HEPES buffer, pH 7.5, containing 200 mM NaCl at 25°C. The reaction cell contained protein at 50 μM, and the syringe contained the polysaccharide at 20 mg/ml, unless stated otherwise. For experiments with regenerated cellulose, the ligand was retained in the cell at 12 mg/ml, and the protein (200 μM) was injected. Titrations were carried out under the same conditions. Integrated heat effects, after correction for heats of dilution, were analyzed by nonlinear regression using a single site-binding model (Microcal ORIGIN, Version 5.0; Microcal Software). The fitted data yielded the association constant ($K_a$) and the enthalpy of binding ($\Delta H$). Other thermodynamic parameters were calculated by using the standard thermodynamic equation: $-RT \ln K_a = \Delta G = \Delta H - T\Delta S$.
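As a worked illustration of the standard thermodynamic equation, the sketch below derives the free energy and the entropic term from a fitted association constant and binding enthalpy. It is a minimal example under stated assumptions: the function name `binding_thermodynamics` and the numerical inputs in the usage lines are hypothetical and are not values reported in this study; only the relation itself and the 25°C temperature come from the text.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def binding_thermodynamics(k_a, delta_h, temperature=298.15):
    """Return (dG, T*dS) in J/mol from K_a (M^-1) and dH (J/mol).

    Uses -RT ln K_a = dG = dH - T dS; the default temperature is 25 degrees C.
    """
    delta_g = -R * temperature * math.log(k_a)
    t_delta_s = delta_h - delta_g
    return delta_g, t_delta_s


if __name__ == "__main__":
    # Hypothetical inputs chosen only to exercise the function.
    dg, tds = binding_thermodynamics(k_a=1.0e5, delta_h=-50_000.0)
    print(f"dG = {dg / 1000:.2f} kJ/mol, TdS = {tds / 1000:.2f} kJ/mol")
```

A negative T$\Delta S$ alongside a more negative $\Delta H$ corresponds to enthalpically driven binding, which is the pattern described for glycan-binding proteins in the text.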
Interaction with Insoluble Polysaccharides-The binding of BhCBM46 to insoluble polysaccharide (Avicel) was carried out as follows. 30 μg of protein in 5 mM Tris-HCl buffer, pH 8.0, containing 0.05% (v/v) Tween 20 and 5 mM CaCl2 (Buffer A) were mixed with 20 mg of Avicel in a final reaction volume of 200 μl. The reaction mixture was incubated for 2 h at 4°C with gentle shaking, after which time the insoluble ligand was precipitated by centrifugation at 13,000 × g for 5 min. The supernatant was removed, and the pellet was washed three times with 200 μl of Buffer A. Bound and unbound fractions were analyzed by SDS-PAGE using a 14% acrylamide gel. BSA (Sigma) and CBM3a from Clostridium thermocellum (NZYTech Ltd.) were included as negative and positive controls, respectively.

[Table 1: columns Residues, Primers, Direction]

Enzyme Assays-BhCel5B and BhGH5-Ig were assayed for enzyme activity using the 3,5-dinitrosalicylic acid assay, described previously (31), to detect the release of reducing sugar. To explore the pH profile of BhCel5B, 50 mM MES, pH 4.5-7, 50 mM Tris-HCl, pH 7-9.5, and 50 mM NaHCO3, pH 9-11, buffers were used in enzyme assays that employed 0.2% barley β-glucan as the substrate. The activity was determined at 55°C by measuring the amount of reducing sugar released after a 10-min incubation period, using glucose to construct the standard curve. Determination of temperature of maximal enzyme activity for BhCel5B was performed by incubating the enzyme for 10 min at temperatures ranging from 20 to 80°C and measuring reducing sugar release from barley β-glucan. For thermostability experiments, BhCel5B and BhGH5-Ig were incubated at temperatures ranging from 20 to 70°C for 20 min, and residual activity was measured as described above at 30°C. To determine kinetic parameters, assays with BhCel5B and BhGH5-Ig were carried out in 50 mM Tris-HCl buffer, pH 7, at 30°C. Kinetic parameters were determined by nonlinear regression analysis using the Michaelis-Menten equation in GraphPad Prism 5.
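The kinetic parameters were obtained in GraphPad Prism 5; the following is not that implementation but a minimal, dependency-free sketch of what a nonlinear least-squares fit of the Michaelis-Menten equation involves. It relies on the fact that, for a fixed Km, the best Vmax has a closed form, so only Km needs a one-dimensional search (done here with a golden-section search on log Km). The function names, the default search bounds, and the synthetic data are all assumptions made for illustration.

```python
import math


def _best_vmax(km, s, v):
    """Closed-form least-squares Vmax for a fixed Km."""
    x = [si / (km + si) for si in s]
    return sum(vi * xi for vi, xi in zip(v, x)) / sum(xi * xi for xi in x)


def _sse(km, s, v):
    """Sum of squared residuals at a fixed Km with its optimal Vmax."""
    vmax = _best_vmax(km, s, v)
    return sum((vi - vmax * si / (km + si)) ** 2 for si, vi in zip(s, v))


def fit_michaelis_menten(s, v, km_lo=None, km_hi=None, iterations=200):
    """Fit v = Vmax * S / (Km + S); returns (Vmax, Km).

    Substrate concentrations must be positive. The default bounds on Km
    (min(S)/100 to max(S)*100) are an arbitrary assumption.
    """
    lo = math.log(km_lo if km_lo else min(s) / 100.0)
    hi = math.log(km_hi if km_hi else max(s) * 100.0)
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iterations):
        a = hi - phi * (hi - lo)
        b = lo + phi * (hi - lo)
        if _sse(math.exp(a), s, v) < _sse(math.exp(b), s, v):
            hi = b  # minimum lies in [lo, b]
        else:
            lo = a  # minimum lies in [a, hi]
    km = math.exp((lo + hi) / 2.0)
    return _best_vmax(km, s, v), km


if __name__ == "__main__":
    # Synthetic, noise-free data generated from Vmax = 10 and Km = 2.
    substrate = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0]
    rates = [10.0 * x / (2.0 + x) for x in substrate]
    print(fit_michaelis_menten(substrate, rates))
```

Fitting the untransformed rates in this way avoids the error distortion introduced by linearized plots, which is the usual reason for preferring nonlinear regression.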
Thin Layer Chromatography (TLC)-The qualitative analysis of BhCel5B and BhGH5-Ig hydrolysis products was performed by TLC on a silica gel-coated aluminum plate for detecting the released sugars. Reactions were performed in 20 mM sodium phosphate, pH 8, 0.1 mg/ml BSA, 0.3% (w/v) of substrate at 37°C. Enzymes were incubated for 4 h, and at different time points the reactions were stopped by incubation at 100°C for 10 min. The reaction products, standards, and a negative control were loaded onto the TLC plate.
Preparation of Plant Materials, Enzymatic Treatments, and Polysaccharide Immunodetection Procedures-Excised regions of tobacco and Miscanthus stems were fixed in 50 mM Pipes buffer, pH 6.9, containing 5 mM EGTA, 5 mM MgSO4, and 4% (v/v) paraformaldehyde and were embedded in wax and sectioned as described previously for tobacco stems (8). As access to xyloglucan can be masked by pectic homogalacturonan (32), tobacco stem sections were treated with pectate lyase (CjPel10A; 10 μg/ml) in 50 mM CAPS buffer, pH 10, containing 2 mM CaCl2 for 2 h before the activity of BhCel5B and its derivatives was assessed. Enzyme treatments were carried out as described previously (8) at concentrations ranging from low nanomolar to low micromolar for 30 min in phosphate-buffered saline (PBS) at 30°C. Sections not treated with the enzymes were incubated for an equivalent time with the corresponding buffers. All sections were subsequently treated for 20 min with 5 μg/ml proteinase K (Sigma) in PBS before substrate detection to remove any enzymes still attached to the cell walls through their CBMs. Xyloglucan was detected with a rat monoclonal antibody LM15 (32), and β-1,3-1,4-glucan was detected with a mouse monoclonal antibody, deploying appropriate FITC-labeled secondary antibodies as described (8).
Immunofluorescence Microscopy and Quantification of Enzyme Impact on Cell Wall Polysaccharides-Immunofluorescence analysis was carried out with an Olympus BX-61 microscope equipped with epifluorescence irradiation, and all micrographs were captured with an ORCA 285 camera (Hamamatsu) using Volocity software (PerkinElmer Life Sciences). The relative capacities of the enzymes to degrade their substrates within cell walls were determined by quantitative assessments of the immunofluorescence intensities, which were captured in equivalent micrographs using a protocol that has been described (8). Briefly, using Volocity quantitation software, the absolute level of fluorescence contained in the micrographs was determined. For cell wall deconstruction, the modulation of signal is the disappearance of epitopes after polysaccharide degradation. Control micrographs obtained without enzymatic treatment were designated as 100% of initial fluorescence. In all cases the fluorescence quantification derives from the analysis of micrographs obtained from a minimum of four assessments for each enzyme concentration.
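The normalization described above, in which untreated control micrographs define 100% of the initial fluorescence, amounts to a simple ratio of means. The sketch below is illustrative only: the function names and the intensity values are hypothetical, and applying the minimum of four assessments to the control as well as to the treated sections is an assumption, since the text states it only for each enzyme concentration.

```python
from statistics import mean


def percent_of_control(treated, control, min_assessments=4):
    """Mean treated fluorescence as a percentage of the untreated control."""
    if len(treated) < min_assessments or len(control) < min_assessments:
        raise ValueError("too few assessments for quantification")
    return 100.0 * mean(treated) / mean(control)


def percent_removed(treated, control, min_assessments=4):
    """Percentage of the epitope signal lost after enzyme treatment."""
    return 100.0 - percent_of_control(treated, control, min_assessments)


if __name__ == "__main__":
    # Hypothetical absolute fluorescence values from equivalent micrographs.
    control = [200, 210, 190, 200]
    treated = [100, 110, 90, 100]
    print(percent_of_control(treated, control), percent_removed(treated, control))
```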
Structural Determination and Refinement-Data were collected for the native BhCBM46 to 2.46 Å resolution (Protein Data Bank code 4uzn). The data collected for the SeMet-BhCBM46 were used to solve the BhCBM46 structure. The crystal diffracted to a resolution of 2.3 Å (Protein Data Bank code 4uz8). The SeMet-BhCBM46 structure was determined by a single-wavelength anomalous dispersion experiment using AutoSol (39) from the PHENIX suite (40). The three-dimensional structure for the native BhCBM46 was solved by molecular replacement using Phaser (41) with the SeMet-derivative model as a search model, giving a TFZ of 22.0 and an LLG of 2404. Data collection and refinement statistics are presented in Table 2. The BhCel5B crystals diffracted to a resolution beyond 1.5 and to 2.75 Å. The best crystal was processed to a resolution of 1.64 Å (Protein Data Bank code 4v2x). BALBES was used to carry out molecular replacement (42). The best solution was found using the GH5 catalytic domain of endoglucanase D from Clostridium cellulovorans (PDB code 3ndz, with a sequence identity of 31.1%). An ARP/wARP (43) run after BALBES gave an almost complete model with 521 amino acid residues identified in a single chain, with an estimated correctness of 98%. Structure refinement and analysis are presented in Table 2.

RESULTS
Ligand Specificity of BhCBM46-Previous qualitative studies suggested that the CBM46 of the endoglucanase BhCel5B displayed moderate binding to Avicel (44). Here, we have confirmed that the CBM displays some binding to Avicel; however, the affinity was too low to accurately quantify. Affinity gel electrophoresis showed that BhCBM46 fused to the central Ig domain (BhIg-CBM46) did not bind to a range of polysaccharides, including β-1,3-1,4-glucans (mixed linkage glucan), β-1,4-glucans, highly decorated β-1,4-glycans such as xyloglucan, mannan, glucomannan, xylans, galactans, pectins, or α-glucans (data not shown). ITC confirmed that neither BhIg-CBM46 nor BhCBM46 bound to barley β-glucan (a β-1,3-1,4-glucan with an average sequence of Glc-β1,4-Glc-β1,4-Glc-β1,3-Glc) or to tamarind xyloglucan, which contains a core structure of a β1,4-linked Glc backbone that is decorated at O6 with β-D-Xyl units in the repeating sequence of XXXG (X is a Glc decorated with Xyl and G is an undecorated Glc); see Table 3. Taken together, the data suggest that BhCBM46 is unable to bind significantly to either soluble or insoluble carbohydrates.
To explore the function of BhCBM46 in the context of the full-length enzyme, two inactive variants of the endoglucanase were generated, BhCel5B-E296A and BhGH5-Ig-E296A, in which the catalytic nucleophile (Glu-296) had been substituted with alanine. BhCel5B-E296A includes the full-length enzyme, whereas BhGH5-Ig-E296A consists of the GH5_4 catalytic module and the central Ig domain but lacks the CBM46 module. Analysis of glycan binding of the two proteins by ITC (full dataset presented in Table 3 with example titrations in Fig. 2) showed that BhCel5B-E296A bound to barley β-glucan and xyloglucan with affinities of ~10⁵ and ~10⁷ M⁻¹, respectively. Although BhGH5-Ig-E296A displayed no affinity for the β-1,3-1,4-glucan, the truncated derivative lacking CBM46 bound to xyloglucan with an affinity (~4 × 10⁴ M⁻¹) 400-fold lower than BhCel5B-E296A. For both glycans, ligand binding was driven primarily by the change in enthalpy, a feature that is common to virtually all proteins that bind to soluble glycans (12). Overall, the data indicate that synergistic interactions between the GH5_4 catalytic domain and CBM46 of BhCel5B play a critical role in glucan binding. The contribution of the enhanced substrate binding afforded by BhCBM46 to catalytic activity is described below.

[Table 2 footnotes] a ... is the ith intensity measurement of reflection hkl, including symmetry-related reflections, and $\langle I(hkl) \rangle$ is its average. b $R_{\mathrm{p.i.m.}} = \left(\sum_{hkl} \sqrt{1/(n-1)} \sum_{j=1}^{n} |I_{hkl,j} - \langle I_{hkl} \rangle|\right) / \left(\sum_{hkl} \sum_{j} I_{hkl,j}\right)$, where $\langle I_{hkl} \rangle$ is the average of symmetry-related observations of a unique reflection. c CC1/2 is the half-data set correlation coefficient (51). d Values for the outer shell are given in parentheses.

Structure refinement was carried out using REFMAC5 (52), interspersed with manual rebuilding using COOT (53). In the penultimate round, the models were optimized using the PDB_REDO server (54). In the final round, each module (GH5, Ig-like, and CBM46) was treated as a single group for anisotropic TLS refinement.

Effect of BhCBM46 on Enzyme Activity in Solution-The activities of recombinant forms of BhGH5-Ig and BhCel5B were evaluated. Consistent with the previous data of Wamalwa et al. (44), only the full-length enzyme was functional when assayed at 55°C. A temperature stability profile of the endoglucanase, however, showed that the CBM mediated a substantial increase in thermostability, and it was this stabilization effect that had a profound influence on enzyme activity (Fig. 3). We therefore explored the influence of BhCBM46 on the activity of the endoglucanase at a permissive temperature (30°C), where the truncated enzyme, BhGH5-Ig, did not undergo thermal inactivation (Fig. 3). The data, presented in Table 4, showed that BhCel5B was ~5-fold more active against the barley β-glucan than xyloglucan. TLC analysis of the products generated by BhCel5B against the two glucans revealed a range of differently sized oligosaccharides, indicative of an endo-acting mode of action (data not shown). The removal of the CBM46 had little impact on the activity of the enzyme against xyloglucan; however, BhGH5-Ig was ~60-fold less active against barley β-glucan than the full-length enzyme (Table 4). These data show that in solution the CBM46 contributes to the degradation of β-1,3-1,4-glucans, but the module contributes little to xyloglucan hydrolysis.
Effect of BhCBM46 on Enzyme Activity against Plant Cell Walls-To assess the activity of BhGH5-Ig and BhCel5B against plant cell walls, the two enzymes at a range of concentrations were incubated with sections of tobacco (pretreated with pectate lyase to expose the xyloglucan (32)) and Miscanthus stems. After 1 h, the reactions were stopped, and the amounts of xyloglucan and β-1,3-1,4-glucan remaining in the tobacco and Miscanthus sections, respectively, were determined by immunohistochemistry using antibodies specific for the two polysaccharides. The data (Fig. 4) showed that BhCel5B at 30 nM removed 50% of the xyloglucan from tobacco cells, whereas even at 4 μM BhGH5-Ig was only able to remove ~40% of the polysaccharide. With respect to Miscanthus, BhCel5B at 200 nM removed ~75% of the mixed-linkage glucan, whereas at the same concentration BhGH5-Ig did not appear to release any of the polysaccharide. These data show that BhCBM46 enhances the enzymatic degradation of β-1,3-1,4-glucans. In contrast to the in vitro solution experiments described above, in the context of plant cell walls the CBM46 mediated a substantial increase in the xyloglucanase activity of the enzyme.
Crystal Structure of BhCBM46-To explore the structural basis for the different roles of BhCBM46 against xyloglucan and the β-1,3-1,4-glucan, the crystal structures of BhCBM46 and the full-length enzyme BhCel5B were determined. The crystal structure of BhCBM46 yielded an initial trace of the polypeptide backbone, which was further refined in this report yielding a structure determined to 2.2 Å. The CBM displays a classic β-sandwich jelly roll fold. The two β-sheets each contain four antiparallel β-strands. The order of the β-strands in β-sheet 1 and β-sheet 2 is β1, β2, β5, β4 and β3, β6, β7, β8, respectively. The β-strands are connected primarily by loops, although there is a small helix extending from residues Glu-524 to Val-530 (Fig. 5).
Inspection of the β-sheet presenting a slight concave surface revealed an absence of aromatic residues, which generally play a central role in ligand recognition by CBMs (12). In contrast, the loop connecting β-strands 3 and 4 contains four aromatic residues comprising Trp-501, Phe-504, Phe-507, and Tyr-509. This loop may participate in carbohydrate recognition; however, such interactions are likely mediated only by Trp-501 and Tyr-509, as Phe-504 and Phe-507 are predominantly buried at the interface with the catalytic domain (see below). Three-dimensional structural comparison using the SSM site revealed that the closest structural homologue of BhCBM46 is the filamin immunoglobulin-like repeat from Homo sapiens (PDB code 2rgh), with a Z score of 4.9 and an r.m.s.d. of 2.85 Å over 97 aligned residues. Several other immunoglobulin-like modules with a β-sandwich fold showed similar levels of structural identity with BhCBM46. BhCBM46 presents a lower degree of homology with functionally relevant CBMs presenting a β-sandwich fold.

Structure and Function of CBM46
APRIL 24, 2015 • VOLUME 290 • NUMBER 17
The crystal structure of BhCBM46 suggests that residues Trp-501 and Tyr-509 may constitute the CBM46 ligand-binding site. Arg-531 is located at the surface of the protein in the vicinity of the aromatic residues and thus may also play a role in ligand recognition. Substituting these three residues with alanine individually in BhCel5B-E296A revealed that only the W501A mutation resulted in a substantial reduction in affinity for both xyloglucan and barley β-glucan, suggesting that the tryptophan makes an important contribution to ligand recognition. In conclusion, it would appear that evaluating the binding of mutants of CBM46 to glucans has little relevance to the biological role of these amino acids in a catalytically competent enzyme.
Crystal Structure of BhCel5B-To visualize the contribution of BhCBM46 to substrate recognition in the full-length enzyme, the structure of the trimodular β-1,4-glucanase BhCel5B (GH5-Ig-CBM46) was solved to a resolution of 1.64 Å by molecular replacement (Fig. 6). The structure of BhCBM46 reported above was used as the search model. The polypeptide chain is visible from Lys-31 to the C-terminal residue Gln-564.
GH5_4-As expected, the N-terminal GH5_4 module displayed a (β/α)8 barrel architecture. GH5 enzymes are members of clan GH-A, in which the two catalytic residues are invariant glutamates presented at the end of β-strands 4 and 7 (45). From the structure of BhCel5B, the catalytic acid base is likely to be Glu-174 (end of β-strand 4) and the catalytic nucleophile Glu-296 (end of β-strand 7). The catalytic role of Glu-296 was confirmed by the observation that the mutant E296A is inactive (see above). A narrow and deep V-shaped cleft, ~30 Å in length, extends along the entire length of the GH5_4 module and sits over the top of the β-barrel. Cleft dimensions and the position of the catalytic apparatus suggest that the protein contains ~5 subsites extending from −3 to +2, although the presence of CBM46 might contribute to an additional subsite extending the number of positive subsites to three (see under "Discussion"). An analysis of structural homologues of GH5_4 using the SSM site identified a large number of GH5 and clan GH-A enzymes that displayed significant structural similarity to GH5_4. The C. cellulovorans endoglucanase D (PDB code 3ndz), with an r.m.s.d. of 1.46 Å over 345 Cα atoms and a Z-score of 15.2, and the fungal GH5 endoglucanase of Piromyces rhizinflata (PDB code 3ays), with an r.m.s.d. of 1.59 Å over 367 Cα atoms and a Z-score of 13.7, are the closest structural homologues. The structures of the two GH5_4 homologues were previously solved in complex with a cellotriose molecule bound to the −3, −2, and −1 subsites. An overlay of the three structures revealed that the −1 subsite (active site), where the transition state is formed, is similar in the three enzymes. Glu-174 makes a hydrogen bond with His-249, which may be important to both the position and ionization state of the catalytic acid base.
The interactions of GH5_4 of BhCel5B with substrate can be predicted from an overlay of cellotriose (derived from PDB code 3ndz, where cellotriose is bound to the negative subsites of C. cellulovorans endoglucanase D) with the GH5_4 domain of BhCel5B. The overlay (Fig. 7) predicts that Asn-173, which is highly conserved in clan GH-A, hydrogen bonds with the O2 of the sugar at the −1 subsite. It is believed that this interaction plays an important role in transition state stabilization (46). The position of the GH5_4 catalytic nucleophile, Glu-296, is stabilized through hydrogen bonds with Tyr-251 and Arg-84, whereas Trp-335 is likely to form the sugar-binding hydrophobic platform at the −1 subsite. In addition to the amino acids coordinating substrate recognition and cleavage at the catalytic center, GH5_4 contains several residues in subsites distal to the active site that likely participate in substrate recognition and are conserved in other GH5s. Thus, at the −3 subsite, Trp-62 is predicted to make hydrophobic interactions with glucose and Asn-50 could make polar contacts with the sugar. At subsite −2, Asn-50 and Asp-303 are within hydrogen bonding distance of the sugar moiety. Significantly, subsite −2 of GH5_4 contains a tryptophan (Trp-132) that is not present in the other cellulases. The aromatic residue is in close proximity to the O6 of the sugar at the −2 subsite, suggesting that the Trp-132 side chain could make hydrophobic interactions with sugar decorations of the glucose backbone, as exemplified by xyloglucan. Indeed, the −3 subsite is particularly solvent-exposed and could accommodate decorations appended to O2, O3, or O6 of the bound glucose. Several hydrophobic residues may contribute to substrate recognition at the positive subsites of the substrate binding cleft of BhCel5B, notably Trp-254, Leu-300, and Trp-181. Thus, whereas Leu-300 and Trp-181 could form a pair of hydrophobic residues that bind the α- and β-faces of the sugar at subsite +1, Trp-254 could play a major role in carbohydrate recognition at the +2 subsite.
Ig-like and CBM46 Modules within BhCel5B-The Ig-like module of BhCel5B consists of two β-sheets arranged around a hydrophobic core in a typical β-sandwich fold (Fig. 6). The structure is highly homologous to other immunoglobulin domains of prokaryotic or eukaryotic origin. The twisted pair of β-sheets contain β-strands β1, β7, β6, and β3 (β-sheet 1) and β-strands β2, β5, and β7 (β-sheet 2), respectively. The β-strands are connected primarily by loops, although there is a small helix extending from Ala-442 to Gly-432 and connecting β5 and β6. β1 and β7 of β-sheet 1 form a planar surface that establishes an extensive network of polar and apolar contacts with GH5_4 α-helices 7 and 8. The role of Ig-like domains in CAZYme function is not completely elucidated, although in the context of BhCel5B the module may act as a structural anchor by providing the correct spatial organization of the associated protein modules in the context of the full-length enzyme. A small linker sequence (Thr-456 to Thr-459) connects the immunoglobulin-like domain and CBM46. The structure of the CBM46 was essentially identical whether it was expressed individually (described above) or within BhCel5B (r.m.s.d. ~1.2 Å). Thus, CBM46 does not undergo significant conformational changes when folded in the context of the entire protein. CBM46 β4 (β-sheet 1), β3 (β-sheet 2), and the loop connecting these two β-strands make a large number of contacts with GH5_4 loops connecting α7 and β7 and α6 and β6. In particular, CBM46 Phe-504, Phe-507, and Trp-542 dominate the hydrophobic contacts with the GH5_4 surface.
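The r.m.s.d. values quoted in the structural comparisons summarize how far paired atoms lie from one another after superposition. The sketch below shows only the final distance calculation. It assumes that the two coordinate sets are already paired atom by atom and already superposed; it does not perform the alignment or superposition carried out by SSM. The function name `rmsd` and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (same units as the input, e.g. angstroms).

    Both arguments are equal-length sequences of (x, y, z) tuples for
    paired atoms, such as aligned C-alpha positions, after superposition.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Toy coordinates: every atom in the second set is shifted by 1 along z.
    first = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    second = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(rmsd(first, second))
```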

DISCUSSION
The data presented in this study show that BhCBM46 is not a BhCel5B-stabilizing domain but, in synergy with the catalytic module, binds to glucans and plays an active role in catalysis. An intriguing feature of the CBM is that its role appears to depend on the substrate. Against β-1,3-1,4-glucans, the CBM participates in productive substrate binding and thus plays a direct role in the hydrolytic activity of the enzyme. In some ways CBM46 resembles the CBM3c modules that abut onto the substrate binding cleft of GH9 cellulases and are described in detail above. By contrast, BhCBM46 was shown not to contribute to the hydrolysis of xyloglucan chains in solution. This suggests that the CBM is unable to bind to xyloglucan molecules that occupy the substrate binding cleft. BhCBM46, however, mediated a substantial increase in activity against xyloglucan contained in plant cell walls. This indicates that CBM46 potentiates activity by directing the enzyme to regions of the cell wall rich in xyloglucan.
The crystal structure of BhCel5B provides insights into the possible mechanisms by which BhCBM46 makes different contributions to the activity of the enzyme. With respect to β-1,3-1,4-glucans, a β-1,3-linkage between Glc at the +1 and +2 subsites may enable the downstream glucan chain to make productive interactions with BhCBM46, explaining why this module contributes to the activity of BhCel5B against β-1,3-1,4-glucans. In contrast, the linear trajectory of β-1,4-glucan chains, such as xyloglucan, would result in these polymers occupying only the substrate binding cleft. These homopolymers would lack the kink in the backbone, afforded by a β-1,3-linkage, which is required for a glucan chain that occupies the active site and proximal subsites to access the CBM46. Inspection of the crystal structure, however, showed that a xyloglucan chain bound to the aromatic residues presented by the putative ligand-binding site of CBM46 would be too distant from the substrate binding cleft to prevent access to the active site. Thus, the CBM46 could contribute to substrate targeting in the context of the plant cell wall, without sterically hindering access to the active site.
BhCel5B is a tri-modular protein composed of an N-terminal glycoside hydrolase family 5 catalytic module (GH5_4) followed by an immunoglobulin (Ig)-like module and a C-terminal family 46 CBM. Inspection of the CBM46 family revealed that all of its 45 members are located at the C terminus of CAZymes containing an N-terminal GH5_4 catalytic domain and an internal immunoglobulin-like module. This is unusual as generally CBMs of the same family are associated with catalytic modules of different families that display diverse substrate specificities (12), particularly in the context of plant cell wall-degrading enzymes (47). Alignment of CBM46 representatives revealed that Trp-501 is invariant in the family (Fig. 8). Thus, CBM46 may display a conserved function in polysaccharide recognition. Similarly, all the residues that are predicted to play an important role in catalysis or substrate binding in BhCel5B are conserved in five enzymes displaying a GH5-Ig-CBM46 molecular architecture with closest sequence similarity with BhCel5B (Fig. 9), suggesting that these proteins are optimized to mediate xyloglucan and β-1,3-1,4-glucan hydrolysis within plant cell walls. Significantly, the CBM46 contributes to the depolymerization of these glucans through distinct mechanisms.

FIGURE 9. Alignments of BhCel5B with four proteins. The alignment was made using ClustalW2. Residues required for substrate recognition and catalysis are conserved in the five proteins. The residues occupying the subsites are indicated.

APRIL 24, 2015 • VOLUME 290 • NUMBER 17