Spatially remote motifs cooperatively affect substrate preference of a ruminal GH26-type endo-β-1,4-mannanase

β-Mannanases from the glycoside hydrolase 26 (GH26) family are retaining hydrolases that are active on complex heteromannans and whose genes are abundant in rumen metagenomes and metatranscriptomes. These enzymes can exhibit distinct modes of substrate recognition and are often fused to carbohydrate-binding modules (CBMs), resulting in a molecular puzzle of mechanisms governing substrate preference and mode of action that has not yet been pieced together. In this study, we recovered a novel GH26 enzyme with a CBM35 module linked to its N terminus (CrMan26) from a cattle rumen metatranscriptome. CrMan26 exhibited a preference for galactomannan as substrate and the crystal structure of the full-length protein at 1.85 Å resolution revealed a unique orientation of the ancillary domain relative to the catalytic interface, strategically positioning a surface aromatic cluster of the ancillary domain as an extension of the substrate-binding cleft, contributing to galactomannan preference. Moreover, systematic investigation of nonconserved residues in the catalytic interface unveiled that residues Tyr195 (−3 subsite) and Trp234 (−5 subsite) from distal negative subsites have a key role in galactomannan preference. These results indicate a novel and complex mechanism for substrate recognition involving spatially remote motifs, distal negative subsites from the catalytic domain, and a surface-associated aromatic cluster from the ancillary domain. These findings expand our molecular understanding of the mechanisms of substrate binding and recognition in the GH26 family and shed light on how some CBMs and their respective orientation can contribute to substrate preference.

Endo-β-1,4-mannanases from the GH26 family display distinct preferences for heteromannans. For instance, YpMan26A, BoMan26B, and AnMan26A prefer galactomannans, whereas RsMan26C, BoMan26A, PaMan26A, and CjMan26A show higher activity on glucomannans (6-8). However, the molecular determinants governing substrate preference in GH26 β-mannanases are only partially understood. The diverse nature of the substrate-binding sites in this family has led to the discovery of a wide range of molecular strategies for heteromannan recognition and depolymerization, from endo-β-mannanases with distinct modes of action (8-11) to mannobiose-producing exo-β-mannanases (7,12). The presence of ancillary domains in these enzymes adds further complexity to the mechanisms of specificity and cleavage of heteromannans, which remain elusive.
In this work, we reveal key motifs for the galactomannan preference of a novel cattle rumen GH26 member, which involve both distal negative subsites and an aromatic cluster from the CBM35 ancillary domain. The ancillary domain is fused to the N-terminal region of the catalytic domain and adopts a unique orientation relative to the catalytic interface, positioning an aromatic cluster in line with the substrate-binding subsites, which has implications for the mechanism of substrate binding and preference. This study proposes a complex and cooperative effect between distal negative subsites and the ancillary domain that modulates substrate preference in the GH26 family.

A novel GH26 endo-β-1,4-mannanase retrieved from cattle rumen metatranscriptome
The transcript sequence named here as CrMan26 (cattle rumen mannanase from GH26 family; accession number MT026709) was recovered from a proprietary database derived from a cattle rumen metatranscriptome de novo assembly. This enzyme was predicted as a GH26 member with a CBM35 module based on a dbCAN2 (13) hidden Markov model-based search, which was further validated using InterProScan 5 (14), PDBSum (15), DETECT2 (16), and BLAST searches against NCBI nonredundant and CAZy databases (17) (Table S1).

Substrate recognition and preference in GH26 family
CrMan26 and its closest homologues present a CBM35 module and belong to the Firmicutes phylum, distributed across the genera Ruminococcus, Hungateiclostridium, Clostridium, Ruminiclostridium, Paenibacillus, and Lachnoclostridium. CrMan26 is the first structure to be elucidated from this group (Fig. 1B). Other members of the GH26 family with experimentally determined structures are in distinct branches, and the closest ones are from Podospora anserina (PDB code 3ZM8 and SEQID: 35%), Bacteroides ovatus (PDB code 6HF2 and SEQID: 33%), Yunnania penicillata (PDB code 6HPF and SEQID: 33%), and Reticulitermes speratus (PDB code 3WDQ and SEQID: 31%) (Figs. 1A and 2). These enzymes have distinct substrate preferences and share low sequence similarity, which is reflected in several amino acid substitutions in the catalytic interface and accessory domain, hampering the understanding of the molecular determinants for substrate selectivity and modes of action.

Figure legend. Sequence alignment of CrMan26 with endo-β-1,4-mannanases from family GH26 with structural data available at the PDB. Secondary structure elements of CrMan26 are displayed above the alignment. The η symbol refers to a 3₁₀-helix. Identical residues are shown in white on a black background. Catalytic residues are marked with a star. The typical aromatic-rich motif (WFWWG) of the GH26 family is marked with an arrow, and mutated residues of CrMan26 are marked with asterisks. The sequences were aligned using CLUSTALW, and the image was generated using the ESPript3 Webserver (37).

CrMan26 displays preference for galactomannan as substrate
The ORF corresponding to CrMan26 (CBM35_GH26) comprises 1464 bp encoding a 488-amino acid protein with a molecular mass of 56 kDa. CrMan26 fused to an N-terminal His tag was recombinantly expressed in Escherichia coli BL21(DE3) and purified to homogeneity. Activity assays with polysaccharides showed that CrMan26 is active on heteromannans, with the highest activity on carob galactomannan (CGM), followed by konjac glucomannan (KGM) and guar galactomannan (GGM) (Fig. S1A). Similar to other bovine rumen CAZymes, CrMan26 exhibits an optimum pH range of 4.5-7.5 and an optimum temperature of 45°C (Fig. S1, B and C). Despite the broad pH plateau for catalytic activity, the enzyme thermal transition temperature (Tm) is significantly affected by pH, with a reduction of ~7°C from pH 6.0 to 7.4, indicating lower enzyme stability at slightly basic pH (Fig. S1D).
Kinetic characterization of CrMan26 with CGM and KGM resulted in turnover numbers of 333 ± 6.5 and 140 ± 1.7 s⁻¹, respectively (Table 1). The enzyme showed similar affinity for these heteromannans, with Km values of 1.62 and 0.96 mg ml⁻¹ for CGM and KGM, respectively. CrMan26 was also able to cleave linear mannan, although with lower catalytic efficiency, with a turnover number of 46.58 ± 3.3 s⁻¹ and a Km of 6.09 ± 0.86 mg ml⁻¹ (Table 1 and Fig. S2). These results indicate that CrMan26 is well-adapted to cope with mannans decorated with α-1,6-linked galactopyranosyl residues. A GH26 member from B. ovatus (BoMan26B) showed a similar capacity, but it does not contain an ancillary domain. On the other hand, the enzyme from P. anserina (PaMan26A) shares a similar domain architecture with an N-terminal CBM35, but it displays a clear preference for undecorated mannans, in contrast to CrMan26. These observations indicate distinct molecular strategies among GH26 members for substrate preference.
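The comparison of the three substrates can be made explicit through the catalytic efficiency, kcat/Km. The following is a minimal sketch that uses only the turnover numbers and Km values quoted in this section; the dictionary and function names are illustrative and are not part of the original analysis.

```python
# Kinetic constants of WT CrMan26 as quoted in the text
# (kcat in s^-1, Km in mg/ml). Names are illustrative only.
KINETICS = {
    "CGM": {"kcat": 333.0, "km": 1.62},
    "KGM": {"kcat": 140.0, "km": 0.96},
    "MAN": {"kcat": 46.58, "km": 6.09},
}


def catalytic_efficiency(kcat, km):
    """Return kcat/Km in ml mg^-1 s^-1."""
    if km <= 0:
        raise ValueError("Km must be positive")
    return kcat / km


EFFICIENCY = {
    name: catalytic_efficiency(values["kcat"], values["km"])
    for name, values in KINETICS.items()
}

if __name__ == "__main__":
    # Print substrates from the most to the least efficiently hydrolyzed.
    for name, value in sorted(EFFICIENCY.items(), key=lambda kv: -kv[1]):
        print(f"{name}: kcat/Km = {value:.1f} ml mg^-1 s^-1")
```

With these inputs the ranking is CGM, then KGM, then linear mannan, in line with the statement that linear mannan is cleaved with lower catalytic efficiency.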

Modular organization and conformation
To shed light on the molecular determinants for substrate preference of CrMan26, its crystal structure was determined at 1.85 Å resolution (Table 2). The overall structure of CrMan26 comprises a classical (α/β)₈ catalytic domain, as expected for a GH26 member, with an N-terminal CBM35 (Fig. 3A). The catalytic residues (Glu331 and Glu446) were inferred by structural comparisons, and the subsites −1 and +1 are strictly conserved, as in other structures available at the PDB (1GVY, 2BVT, 2QHA, 2WHK, 3WDQ, 3ZM8, 6HF2, and 6HPF) (Table 3 and Fig. 2). According to Z-score analysis (18), CrMan26 showed the highest structural similarity to RsMan26C (PDB code 3WDQ), followed by PaMan26A (PDB code 3ZM8) (Table 4). Although it has a lower Z-score, PaMan26A presented higher sequence similarity to CrMan26, probably because of the presence of the CBM35 module, which is absent from the other GH26 members, for which only the catalytic domain was crystallized.
When compared with its closest homologue (PaMan26A), the ancillary domain CBM35 from CrMan26 (CrCBM35) is positioned closer to the catalytic domain and presents a distinct relative orientation (Fig. 4, A and B). The ancillary domain is connected to the catalytic domain by a 12-residue-long linker that starts at residue Ala151, at the end of the last β-strand of the CBM domain, and includes a short 3₁₀-helix element. Unlike PaMan26A (9), in which the linker rigidity was attributed to the high relative number of proline residues, the CrMan26 linker has only one proline. Despite that, the CrMan26 linker displayed lower relative B-factor values than the PaMan26A linker, suggesting a more rigid conformation in CrMan26. The interface between the catalytic domain and the ancillary module, beyond being favored by the short linker, is also supported by multiple hydrophobic contacts and electrostatic interactions between Asp76, Arg124, and Glu459 (Fig. 3B). The residue Phe156 from the linker is buried in a hydrophobic nest from the catalytic domain formed by the residues Phe429, Leu432, Leu433, and Ala462. Another aromatic residue of the linker, Phe153, establishes a stacking interaction with His460 from the short 3₁₀-helix protruding from the catalytic domain (Fig. 3B). In addition, two methionine residues (Met78 and Met122) form the hydrophobic core between the catalytic domain and the CBM35, strengthening the interface. The calculated area of the interface is ~550 Å², and the ΔiG P-value (a measure of interface specificity), according to the PISA webserver, is 0.168, supporting that this interface is interaction-specific. Moreover, the analysis of crystal packing did not indicate any relevant contact with neighboring molecules that would justify the conformation adopted by the ancillary domain in relation to the catalytic domain.

Table 1. Kinetic parameters of WT CrMan26 and mutants for CGM and low viscosity KGM hydrolysis
The parameters kcat and Km were calculated from the nonlinear fitting with the Michaelis-Menten function. The values are given as mean values of four replicates ± S.D. One-way analysis of variance along with Tukey post hoc test was performed (95% confidence interval). Kinetic curves are presented in Figs. S2 and S3.

The CrMan26 conformation was further validated in solution by small angle X-ray scattering (SAXS), which showed high agreement between the theoretical scattering calculated from the crystallographic structure and the experimental in-solution scattering data (Fig. 3, C-F). Kratky plot analysis (Fig. 3F) indicates low flexibility, and the ab initio molecular envelope (Fig. 3C) is fully compatible with the CrCBM35 positioning relative to the catalytic domain, ruling out the hypothesis that the association between the catalytic domain and CBM35 is an artifact of crystallization. The comparison of the PaMan26A conformation with CrMan26 SAXS data resulted in higher χ² values and worse normalized spatial discrepancy, indicating that the in-solution conformation of CrMan26 does not resemble that of PaMan26A (Table S2).
The observed conformation of the ancillary domain results in a precise structural alignment of a surface aromatic cluster with the substrate-binding subsites from the catalytic domain (Fig. 4A). Similar intermodule interactions were observed in a GH9 endo-β-1,4-glucanase in which the ancillary module is aligned with the catalytic cleft, presumably forming one functional entity (20). The CrMan26 aromatic cluster consists of the tyrosine residues Tyr46, Tyr138, and Tyr140 and forms an extension of the catalytic cleft, increasing the substrate recognition interface by ~36 Å in length (Fig. 4A). These three tyrosine residues also adopt rotameric configurations that allow all of them to make CH-π stacking interactions with the substrate simultaneously (Fig. 4C). In PaMan26A, an equivalent aromatic cluster was identified on the surface of its ancillary domain, formed by two tryptophan residues (Trp117 and Trp119) and a phenylalanine (Phe87) (Fig. 4, B and D). The ancillary domain of PaMan26A is oriented in a way that makes simultaneous binding of the substrate to the subsites of the catalytic domain and to the surface aromatic cluster of CBM35 unlikely (Fig. 4B). In addition, the residues forming the hydrophobic cluster in the CBM35 of PaMan26A are not in a favorable stereochemical configuration to make CH-π stacking interactions simultaneously (Fig. 4D). Therefore, for the CBM35 from PaMan26A to perform a similar role of extending the substrate recognition interface as observed for CrMan26, it would require a large angular reorientation of the ancillary domain of ~60° (Fig. 4E) and further conformational changes of the aromatic residues in the CBM35 to allow simultaneous CH-π stacking interactions with the substrate. This finding suggests a functional implication of the ancillary domain in CrMan26, which was further demonstrated by mutational studies.

Identification of key residues involved in substrate preference
CrMan26 mutants were designed to investigate the role in substrate recognition of specific residues that are not conserved in the homologous structure of PaMan26A (PDB code 3ZM8). For this purpose, five mutants were generated: two representing mutations in distal negative subsites, two representing mutations in positive subsites, and one triple mutant of the CBM35 domain corresponding to the surface aromatic cluster (Y46A/Y138A/Y140A mutant) (Fig. 4F). The W234A, Y195A, Y410A, and W384A mutants correspond to the subsites −5, −3, +2, and +3, respectively (Fig. 4F).
The kinetic parameters were evaluated against CGM, KGM, and 1,4-β-D-mannan (MAN) for WT CrMan26 (Fig. S2) and, for all mutants, against the two substrates on which CrMan26 presented the highest catalytic efficiency (CGM and KGM) (Table 1 and Fig. S3). The most pronounced effect was observed when the distal negative subsites were mutated, with a 4-fold increase in kcat on KGM and a modest 1.5-fold increase on CGM (Table 1). This indicates that these residues promote steric conditions that increase the performance on galactose-decorated mannans at the cost of reducing the catalytic turnover of the WT enzyme. BsMan26A (PDB code 2WHK) also presents a tyrosine (Tyr40) forming the −3 subsite, but the residue corresponding to Trp234 is not conserved (11). The opposite is observed in RsMan26C (PDB code 3WDQ), where the residue corresponding to Trp234 is present (Trp94) and the Tyr195 equivalent is not conserved (8). In both cases, the enzymes presented higher activity against glucomannan as substrate. Thus, we can propose that the presence of both residues (Tyr195 and Trp234) in CrMan26 is essential to drive the substrate preference toward galactomannans. On the other hand, mutations in the positive subsites did not result in a change in substrate preference, and the kinetic parameters were modified to a lesser extent (Table 1).
The triple mutant of the ancillary module also promoted a change toward a lack of substrate preference, with increased performance on KGM and a reduction in the kcat on CGM compared with the WT enzyme. This raised the hypothesis that the tyrosine cluster of CrCBM35 could contribute to the binding of galactose-decorated mannans. Although an equivalent surface aromatic cluster was identified in PaMan26A, neither biochemical nor mutational studies were conducted to demonstrate its functional role. Therefore, to investigate our hypothesis and gain more insight into the role of the surface aromatic cluster of CBM35 in substrate binding, affinity gel electrophoresis (AGE) with KGM (Glc:Man ratio of 1:1.5), CGM (Gal:Man ratio of 1:4), and guar galactomannan (GGM, Gal:Man ratio of 1:2) was performed. The results indicated a binding preference of CrMan26 for the highly substituted GGM compared with CGM and KGM (Fig. 5). Thus, AGE with GGM was also performed with the isolated CBM (WT and triple mutant, Fig. S4), demonstrating that the tyrosine cluster from the ancillary module contributes to the affinity for heteromannans decorated with galactosyl moieties. These results corroborate the hypothesis that the ancillary module plays a role in the substrate preference of CrMan26.

Discussion
This study focused on the molecular understanding of the catalytic properties of a cattle rumen endo-β-1,4-mannanase from the GH26 family naturally designed to favor galactomannan hydrolysis. The CrMan26 ancillary domain presents a stable attachment to the catalytic domain, which has not been observed in any other structurally characterized GH26 mannanase so far. This unique arrangement allows the placement of a surface aromatic cluster aligned to the catalytic interface, which was further demonstrated by mutagenesis to play a role in substrate recognition, especially for galactose-decorated heteromannans.
CrMan26, RsMan26C (8), and BoMan26B (21) share a similar open active-site cleft, which extends to the subsite −5. Mutation of distal negative subsites in CrMan26 resulted in a change of substrate preference from galactomannan to glucomannan. Interestingly, enzymes that lack one of the mutated residues display a preference for glucomannan (8,11). This indicates that both residues are essential for galactomannan selectivity.
As observed for CrMan26, BoMan26B from the human gut bacterium B. ovatus ATCC 8483 was more active on galactomannan than on glucomannan (7). Comparative structural analysis revealed that the two key residues from the −5 (Trp234) and −3 (Tyr195) subsites are also present in the corresponding subsites in BoMan26B (Trp112 and Tyr317 at the −5 and −3 subsites, respectively), indicating a similar role in driving substrate preference toward galactomannan (Fig. 6, A and B). Variants of the −5 subsite (Trp112) of BoMan26B showed ~20-fold lower catalytic efficiency compared with the WT enzyme, which was attributed to a reduction of CH-π stacking interactions at this position (21). The same effect was not observed for the Trp234 variant of CrMan26, suggesting that the effect of this mutation on CrMan26 activity was likely compensated by the surface aromatic cluster from the ancillary domain.
By superimposing the CrMan26, BoMan26B (7), and RsMan26C (8) structures, we can verify that the CrMan26 active-site cleft can likely accommodate a mannose/glucose-derived substrate spanning subsites −5 to +2, including galactosyl decorations (Fig. 6A). The ability to accommodate galactosyl side groups seems to be impaired only at the −2 subsite in CrMan26, as observed in BoMan26B (21), because of the presence of a tyrosine residue (Tyr470) (Fig. 6B).
In summary, our work reveals a novel mechanism for substrate preference in the GH26 family that involves the cooperative effect of two spatially remote motifs: the distal negative subsites and the CBM35 surface aromatic cluster. The latter is precisely aligned to the active-site cleft, acting as an extension of the substrate recognition interface.

Metatranscriptome analysis
A proprietary database derived from a cattle rumen metatranscriptome de novo assembly was used as a source of lignocellulose-degrading enzymes. Putative CAZymes were annotated using hidden Markov models from the dbCAN database (13), and proteins predicted as complete coding sequences were further analyzed. Among those, a transcript sequence encoding an endo-β-1,4-mannanase predicted as a member of the GH26 family with a CBM35 module (CrMan26) was selected for further biochemical and structural analyses.

Phylogenetic analysis
The sequences of the catalytic domain of GH26 endo-β-1,4-mannanases from the CAZy database (17) assigned with EC number 3.2.1.78 were retrieved. To reduce redundancy, the data set was clustered at 90% sequence identity using CD-HIT (22). A multiple sequence alignment was performed using MAFFT (23), including the clustered data set and all characterized GH26 sequences. The evolutionary relationships of CrMan26 and other endo-β-1,4-mannanases were inferred by the maximum likelihood method based on the Jones-Taylor-Thornton matrix-based model using FastTree (24). The phylogenetic tree was visualized and annotated using the iTOL web tool.

Gene cloning, protein expression, and purification
The CrMan26 sequence was synthesized by GenScript without the predicted signal peptide (18 N-terminal residues) and cloned into the pET28a(+) vector with an N-terminal His₆ tag using the restriction enzymes NheI and BamHI. The CrMan26-pET28a expression vector was transformed into E. coli BL21(DE3) cells. The cells were grown in 1 liter of LB medium at 37°C until A600 nm ~0.6; induction was then performed with 0.3 mM isopropyl β-D-thiogalactopyranoside at 37°C for 4 h. The cells were harvested by centrifugation at 8,000 × g (25 min at 4°C) and lysed with 20 ml of buffer A (20 mM sodium phosphate buffer, 150 mM NaCl, and 5 mM imidazole, pH 7.4) supplemented with 1 mM phenylmethylsulfonyl fluoride, 7 mM sodium deoxycholate, 1 mg/ml lysozyme, and 75 μg/ml DNase. The cell extract was clarified by centrifugation (20,000 × g for 30 min at 4°C), and the supernatant was loaded onto a 5-ml Hi-Trap chelating HP column (GE Healthcare) coupled to an ÄKTA system (GE Healthcare) and pre-equilibrated with buffer A. The bound proteins were eluted with a nonlinear gradient of buffer B (20 mM sodium phosphate, 150 mM NaCl, and 500 mM imidazole, pH 7.4). The eluted fractions were analyzed by denaturing PAGE (SDS-PAGE), and those containing pure proteins were pooled, concentrated by filtration, and submitted to size-exclusion chromatography using a Superdex 75 16/60 column (GE Healthcare) previously equilibrated with 20 mM sodium phosphate, 150 mM NaCl (pH 7.4), at a flow rate of 1 ml/min. Protein concentration was estimated by measuring the absorbance at 280 nm, using the molecular mass of CrMan26 (56 kDa) and its extinction coefficient (144,870 M⁻¹ cm⁻¹). Two nucleotide primers (forward 5′-TATAGCTAGCGCGGAGGAATACACCAAAT-3′; reverse 5′-TATAGGATCCTTAGGTCGCTTCCTCCA-3′) were designed to amplify the ancillary domain (WT and triple mutant). Both CBM versions were cloned, transformed, expressed, and purified as described above for CrMan26.
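The concentration estimate from the absorbance at 280 nm follows the Beer-Lambert law, A = ε·c·l. The sketch below applies it with the extinction coefficient and molecular mass quoted above; the 1 cm path length, the function name, and the example absorbance are assumptions made for illustration and are not stated in the text.

```python
# Beer-Lambert estimate of protein concentration from absorbance at 280 nm.
# The two constants are the values quoted in the text; the 1 cm default
# path length is an assumption.
MOLAR_EXTINCTION = 144_870.0  # M^-1 cm^-1
MOLECULAR_MASS = 56_000.0     # g/mol (56 kDa)


def concentration_from_a280(a280, path_cm=1.0):
    """Return (molar concentration in M, mass concentration in mg/ml)."""
    if path_cm <= 0:
        raise ValueError("path length must be positive")
    molar = a280 / (MOLAR_EXTINCTION * path_cm)
    mg_per_ml = molar * MOLECULAR_MASS  # g/l is numerically equal to mg/ml
    return molar, mg_per_ml


if __name__ == "__main__":
    # Hypothetical reading, chosen only to illustrate the call.
    molar, mg_per_ml = concentration_from_a280(1.4487)
    print(f"{molar * 1e6:.2f} uM, {mg_per_ml:.3f} mg/ml")
```

Under these assumptions, an absorbance of about 1.45 corresponds to roughly 10 μM protein, that is, about 0.56 mg/ml.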

Enzyme assays
The enzymatic activity of CrMan26 was determined from the amount of reducing sugar released from low-viscosity CGM, low-viscosity KGM, high-viscosity KGM, and MAN (all purchased from Megazyme) according to the DNS method (25). The optimum pH and temperature of CrMan26 were determined with 0.5% (w/v) CGM, 50 ng of purified enzyme, and 100 mM McIlvaine buffer (pH 3.0-8.0) at temperatures ranging from 20 to 75°C for 20 min. All experiments were carried out in triplicate.

Differential scanning fluorescence
Differential scanning fluorescence measurements were performed on a ViiA 7 real-time PCR system (Thermo Fisher). The experiment was conducted with a total volume of 25 μl of mixtures containing SYPRO Orange ×1000 (Sigma-Aldrich), 20 mM sodium phosphate buffer (pH 6.0 or 7.4), 150 mM NaCl, and 2 μM of purified CrMan26. Compounds were mixed on ice in a 96-well PCR plate, and fluorescence emission was measured at 580 nm from 20 to 95°C in steps of 1°C per min. Measurements were carried out in triplicate. Data analysis and melting temperature determination were performed using Origin 8.1 (OriginLab Corporation, Northampton, MA).
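The text does not state which procedure was used in Origin to extract the melting temperature. One common approach, shown here only as a sketch under that caveat, is to take Tm as the temperature at which the first derivative of the fluorescence curve is largest. The curve below is synthetic (a two-state sigmoid with an arbitrary midpoint of 55°C), sampled over the 20 to 95°C range in 1°C steps given above; all names and curve parameters are illustrative.

```python
import math


def boltzmann(temp, tm, slope, low=0.0, high=1.0):
    """Two-state sigmoid, used here only to generate a synthetic melting curve."""
    return low + (high - low) / (1.0 + math.exp((tm - temp) / slope))


def melting_temperature(temps, signal):
    """Return the temperature where the central-difference dF/dT is largest."""
    if len(temps) != len(signal) or len(temps) < 3:
        raise ValueError("need at least three paired points")
    best_temp, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        slope = (signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_temp, best_slope = temps[i], slope
    return best_temp


# Synthetic data: 20 to 95 degrees C in 1 degree steps, midpoint set to 55.
TEMPS = [float(t) for t in range(20, 96)]
CURVE = [boltzmann(t, tm=55.0, slope=2.0) for t in TEMPS]

if __name__ == "__main__":
    print(melting_temperature(TEMPS, CURVE))
```

On real data the derivative is usually smoothed first, or the curve is fitted to a sigmoid, because noise in the raw signal can shift the position of the maximum.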

Protein crystallization
Crystals of CrMan26 were grown using the hanging-drop vapor-diffusion method at 18°C in drops containing 1.0 μl of protein solution at 20 mg/ml and 1.0 μl of the crystallization condition, equilibrated against 200 μl of the crystallization condition in 48-well plates. The enzyme was crystallized in a condition consisting of 18% (v/v) PEG 8000, 10% (v/v) PEG 400, 0.1 M sodium acetate (pH 5.5), 500 mM NaCl, 10% (v/v) glycerol, and 0.5% (v/v) dioxane.

Data collection and structure determination
X-ray diffraction data were collected at the MX2 Beamline from the Brazilian Synchrotron Light Laboratory (Campinas, Brazil). The data were indexed, integrated, and scaled using the XDS package (26). The crystal structure was solved by molecular replacement using the PHASER software (27) with the atomic coordinates of PaMan26A (PDB code 3ZM8) as a search model. The structure was refined with PHENIX.REFINE (28), with alternating cycles of manual modeling and inspection with COOT (29). The final model was validated with the MolProbity server (30). Data collection, processing, and refinement statistics are summarized in Table 2.

Small angle X-ray scattering
SAXS measurements were performed at the SAXS1 Beamline from the Brazilian Synchrotron Light Laboratory (Campinas, Brazil). The wavelength was set to 1.488 Å, and X-ray scattering was recorded using a Pilatus 300K detector (Dectris, Baden, Switzerland). The sample-to-detector distance was adjusted to give a scattering-vector range of 0.01 < q < 0.5 Å⁻¹, where q is the magnitude of the q vector defined by q = 4π sin θ/λ (2θ is the scattering angle). Tests with exposure times of 10 s were performed to evaluate radiation damage with different protein concentrations up to 10 mg ml⁻¹. One frame of 300 s (of sample at 10 mg ml⁻¹) was recorded and used for calculations, after subtracting the sample buffer scattering. The errors from the scattering curve were calculated with the program ERREST, developed by Prof. D. Svergun. The distribution curve p(r) was calculated using the GNOM package (31) and was used to estimate the radius of gyration (Rg). Molecular envelopes were calculated from the experimental SAXS data using DAMMIN (32) and averaged using the DAMAVER package (33). The theoretical scattering curve of the CrMan26 crystallographic structure was calculated and compared with the experimental SAXS curve using CRYSOL (34). The structure fitting into the SAXS molecular envelope was performed using SUPCOMB (35). Data collection parameters and SAXS statistics are detailed in Table S2.
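The scattering-vector magnitude used for SAXS is q = 4π sin θ/λ, with 2θ the scattering angle and λ the X-ray wavelength. As a small illustration, the sketch below converts between scattering angle and q at the 1.488 Å wavelength given above; the function names are illustrative and not taken from any SAXS package.

```python
import math

WAVELENGTH_A = 1.488  # X-ray wavelength in angstrom, as stated in the text


def q_from_angle(two_theta_deg, wavelength=WAVELENGTH_A):
    """Scattering-vector magnitude q = 4*pi*sin(theta)/lambda, in A^-1."""
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength


def angle_from_q(q, wavelength=WAVELENGTH_A):
    """Inverse relation: scattering angle 2*theta in degrees for a given q."""
    return 2.0 * math.degrees(math.asin(q * wavelength / (4.0 * math.pi)))


if __name__ == "__main__":
    # Angular window corresponding to the reported q range (0.01 to 0.5 A^-1).
    for q in (0.01, 0.5):
        print(f"q = {q} A^-1 -> 2theta = {angle_from_q(q):.3f} degrees")
```

At this wavelength, the upper limit of the reported q range corresponds to a scattering angle of a little under 7°, which shows how small the measured angles are.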

Mutagenesis
Five CrMan26 mutants (Y195A, Y410A, W384A, W234A, and Y138A/Y140A/Y46A) were synthesized by the company GenScript, using the WT enzyme as template. The expression and purification of the mutants followed the same steps as described for the WT enzyme. Enzyme assays of the mutants were performed at optimum pH and temperature, and the amount of reducing sugar released was quantified by the DNS method (25).

Enzyme kinetics
The kinetic constants of WT CrMan26 were determined by the hydrolysis of MAN, CGM, and KGM (low viscosity) in the concentration range from 0.5 to 16 mg/ml using the DNS assay as described above. For the mutants, the kinetic parameters were determined using decorated mannans (CGM and KGM). The enzyme amounts used for the hydrolysis were 50 ng for WT, Y410A, W384A, and Y138A/Y140A/Y46A and 25 ng for Y195A and W234A. The v0/[E]t was plotted as a function of the substrate concentration, and the kinetic parameters were obtained by nonlinear regression analysis of the Michaelis-Menten plot using the program Origin 8.1.
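The fits in this work were done in Origin 8.1. Purely as an illustration of what a nonlinear Michaelis-Menten regression involves, the sketch below fits v0/[E]t = kcat·[S]/(Km + [S]) by least squares in plain Python. It is not the authors' procedure: the search strategy (a golden-section search over Km, with kcat solved in closed form for each trial Km), the Km bounds, the function names, and the noise-free synthetic data generated from the CGM constants are all assumptions made for this example.

```python
import math


def michaelis_menten(s, kcat, km):
    """Rate per enzyme, v0/[E]t = kcat*[S]/(Km + [S])."""
    return kcat * s / (km + s)


def best_kcat(substrate, rates, km):
    """For a fixed Km the model is linear in kcat, so solve it in closed form."""
    x = [s / (km + s) for s in substrate]
    return sum(xi * v for xi, v in zip(x, rates)) / sum(xi * xi for xi in x)


def sse(substrate, rates, km):
    """Sum of squared residuals at the best kcat for this Km."""
    kcat = best_kcat(substrate, rates, km)
    return sum((v - michaelis_menten(s, kcat, km)) ** 2
               for s, v in zip(substrate, rates))


def fit_michaelis_menten(substrate, rates, km_low=1e-3, km_high=100.0, tol=1e-9):
    """Least-squares fit: golden-section search over Km, closed-form kcat.

    Assumes the residual sum of squares has a single minimum in the bracket.
    """
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = km_low, km_high
    while b - a > tol:
        c, d = b - ratio * (b - a), a + ratio * (b - a)
        if sse(substrate, rates, c) < sse(substrate, rates, d):
            b = d
        else:
            a = c
    km = (a + b) / 2.0
    return best_kcat(substrate, rates, km), km


# Synthetic, noise-free data over the 0.5-16 mg/ml range given in the text,
# generated from the CGM constants (kcat = 333 s^-1, Km = 1.62 mg/ml).
SUBSTRATE = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
RATES = [michaelis_menten(s, 333.0, 1.62) for s in SUBSTRATE]
KCAT_FIT, KM_FIT = fit_michaelis_menten(SUBSTRATE, RATES)

if __name__ == "__main__":
    print(f"kcat = {KCAT_FIT:.2f} s^-1, Km = {KM_FIT:.3f} mg/ml")
```

Because the synthetic data contain no noise, the fit should recover the input constants closely; with experimental replicates, the same routine would be applied to each data set and the spread of the fitted values reported.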

Ligand-binding assays
The capacity of CrMan26 to bind galacto- and glucomannans was evaluated by AGE, as described by Correia et al. (36). Briefly, continuous native polyacrylamide gels consisted of 7.5% (w/v) acrylamide in 25 mM Tris, 250 mM glycine buffer (pH 8.3) with 0.1% of each polysaccharide. Samples of 10 μg of the WT enzyme, the mutants, and BSA (negative control) were loaded on the gels and subjected to electrophoresis at room temperature for 2 h at 60 mA. Proteins were visualized by Coomassie Blue stain. AGE was also performed with the isolated CBM35 (WT and triple mutant) under similar conditions.

Data availability
Nucleotide and protein sequences have been deposited in the NCBI GenBank™ under accession number MT026709. The atomic coordinates and structure factors (code 6UEH) have been deposited in the Protein Data Bank.