The Quaternary Structure of a Glycoside Hydrolase Dictates Specificity toward β-Glucans

In the Carbohydrate-Active Enzyme (CAZy) database, glycoside hydrolase family 5 (GH5) is a large family with more than 6,000 sequences. Among the 51 described GH5 subfamilies, subfamily GH5_26 contains members that display either endo-β(1,4)-glucanase or β(1,3;1,4)-glucanase activities. In this study, we focused on the GH5_26 enzyme from Saccharophagus degradans (SdGluc5_26A), a marine bacterium known for its capacity to degrade a wide diversity of complex polysaccharides. SdGluc5_26A displays lichenase activity toward β(1,3;1,4)-glucans with a side cellobiohydrolase activity toward β(1,4)-glucans. The three-dimensional structure of SdGluc5_26A adopts a stable trimeric quaternary structure also observable in solution. The N-terminal region of SdGluc5_26A protrudes into the active site of an adjacent monomer. To understand whether this occupation of the active site could influence its activity, we conducted a comprehensive enzymatic characterization of SdGluc5_26A and of a mutant truncated at the N terminus. Ligand complex structures and kinetic analyses reveal that the N terminus governs the substrate specificity of SdGluc5_26A. Its deletion opens the enzyme cleft at the −3 subsite and turns the enzyme into an endo-β(1,4)-glucanase. This study demonstrates that experimental approaches can reveal structure-function relationships out of reach of current bioinformatic predictions.

To date, with more than 6,600 available sequences in the CAZy database, family GH5 is one of the largest families. Enzymes in this family are retaining glycoside hydrolases that operate via a classical Koshland double-displacement mechanism (5). The first crystallographic structure of the GH5 family was solved in 1995 (6). It revealed a (β/α)₈ barrel, a fold also found in the other GH families that belong to the structural clan GH-A. Although enzymes from family GH5 are predicted to be mainly involved in plant cell wall degradation, assigning enzyme specificity remains difficult. Indeed, up to 20 different activities have been reported for this large family (7). Recently, a subdivision into 51 subfamilies has been implemented to improve the correspondence between specificity and sequence (7).
Herein, we show that SdGluc5_26A is a lichenase with β(1,3;1,4)-glucanase and side cellobiohydrolase activity. By using structural and enzymatic characterization as well as mutational analyses, we found that the substrate specificity of SdGluc5_26A relies on its quaternary structure.

Experimental Procedures
Cloning of SdGluc5_26A and Mutants-SdGluc5_26A was amplified by PCR using Pfx polymerase (Invitrogen, Saint-Aubin, France) and Sde2-40 genomic DNA as template. Forward and reverse primers were designed for use with the Gateway PCR cloning technology (Life Technologies). The primers used for cloning the wild-type enzyme were: forward 5′-GGGGACAAGTTTGTACAAAAAAGCAGGCTTAGAAAACCTGTACTTCCAGGGTGCAAATAACAGCGCCCCA-3′ and reverse 5′-GGGGACCACTTTGTACAAGAAAGCTGGGTCTTATTAGCGTTTTTTAGCTTCTAGCATAACC-3′ (italic letters indicate the sequence needed for the Gateway cloning protocol). The amplified PCR product was inserted into the pDEST™17 vector. The sequence encoding the native signal peptide was omitted from the construct. The same protocol with a different forward primer was applied to clone the mutant SdGluc5_26AΔS38 (5′-GGGGACAAGTTTGTACAAAAAAGCAGGCTTAGAAAACCTGTACTTCCAGGGTAGCCAATTCGATGTAAAAAGC-3′). The E291Q mutation was introduced using the QuikChange site-directed mutagenesis kit following the manufacturer's instructions (Stratagene, Santa Clara, CA). The primers used were: forward 5′-CCTGTAATAGCAACACAGCTAGGCTGGGTACAAC-3′ and reverse 5′-GTTGTACCCAGCCTAGCTGTGTTGCTATTACAGG-3′ (the mutated codon for the point mutation is underlined).
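QuikChange mutagenesis uses a pair of fully complementary primers, so the two E291Q primers listed above should be exact reverse complements of each other. The following minimal sketch checks this. The helper names are illustrative, and the hyphens that appear inside the printed primer sequences are assumed to be line-break artifacts and are stripped before comparison.

```python
# Check that the two E291Q mutagenesis primers are reverse complements.
# Assumption: hyphens inside the printed sequences are line-break artifacts.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq: str) -> str:
    """Remove hyphens and normalize to upper case."""
    return seq.replace("-", "").upper()


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return clean(seq).translate(COMPLEMENT)[::-1]


# Primer sequences as printed in the text (5' to 3').
E291Q_FWD = "CCTGTAATAGCAACACAGCTAGGCTGGGT-ACAAC"
E291Q_REV = "GTTGTACCCAGCCTAGCTGT-GTTGCTATTACAGG"

primers_pair_up = reverse_complement(E291Q_FWD) == clean(E291Q_REV)
print("E291Q primers are reverse complements:", primers_pair_up)
```

The same helper can be used to proofread any primer pair designed for this type of mutagenesis before ordering.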
SdGluc5_26A and Mutant Production and Purification-All the constructs were produced and purified following the same protocols. Recombinant proteins were produced in Escherichia coli BL21(DE3) pLysS cells. ZYP5052 medium was chosen for bacterial growth (15). After 3 h at 37°C to reach the exponential growth phase, the temperature was reduced to 17°C for overnight protein production. Cells were harvested by centrifugation at 4,000 × g for 10 min, and the pellet was stored at −80°C until purification. Cells were resuspended in lysis buffer (50 mM Tris-HCl, pH 8.0, 300 mM NaCl, 5 mM DTT, 1% (w/v) Triton X-100, 1 mM PMSF, 0.25 mg ml⁻¹ DNase, 10 mM MgCl₂, 0.25 mg ml⁻¹ lysozyme) and then sonicated at 4°C (three 30-s pulses separated by 1-min breaks). The crude extract was centrifuged (11,000 × g for 30 min at 4°C), and the supernatant was loaded onto a pre-equilibrated HisTrap™ 5-ml column (GE Healthcare Life Science, Vélizy-Villacoublay, France). The column was washed first with buffer A (50 mM Tris-HCl, pH 8.0, 300 mM NaCl, and 50 mM imidazole) and then a second time with 10% buffer B (50 mM Tris, pH 8.0, 300 mM NaCl, and 500 mM imidazole). Finally, the protein was eluted with 50% buffer B. The eluted fractions were pooled and injected onto a HiLoad™ 26/60 Superdex™ 200 column (GE Healthcare Life Science). Size-exclusion chromatography was performed with buffer containing 50 mM Tris-HCl, pH 8.0, and 300 mM NaCl. SdGluc5_26A was concentrated with centrifugal filter units (30-kDa cut-off, Millipore, Darmstadt, Germany) to reach the concentration of 13 mg ml⁻¹ used for crystallization trials. Selenomethionine-labeled protein was produced in M9 minimal medium with selenomethionine and amino acids known to inhibit the methionine biosynthesis pathway (16). After 3 h at 37°C, the culture turbidity was checked by measuring absorbance at 600 nm. When the exponential growth phase was reached (A₆₀₀ = 0.6), protein production was induced by adding 1 mM isopropyl-1-thio-β-D-galactopyranoside, and the temperature was decreased to 17°C for overnight growth. Cells were harvested by centrifugation at 4,000 × g for 10 min, and the pellet was stored at −80°C until purification. The purification protocol was the same as for SdGluc5_26A.
Biochemical Characterization-The protein concentration was determined using a Thermo Scientific NanoDrop 2000 spectrophotometer (Thermo Scientific, Illkirch, France) by measuring absorbance at 280 nm. The oligomerization state was confirmed by size-exclusion chromatography coupled with multi-angle laser light scattering (SEC-MALLS, Wyatt Technology, Santa Barbara, CA). For this purpose, 100 µl of enzyme at 5 mg ml⁻¹ were dialyzed overnight in 50 mM Tris-HCl, pH 7.0, and 300 mM NaCl. After overnight equilibration of the column (Shodex™ column KW803), 30 µl of sample were injected. The elution was carried out for 47 min at a flow rate of 0.5 ml min⁻¹. Results were analyzed with the ASTRA software (Wyatt Technology). The stability of the oligomeric state of SdGluc5_26A and of the SdGluc5_26AΔS38 mutant was verified by size-exclusion chromatography on a 26/60 S200 column. The peak corresponding to the monomer was further injected onto a 10/30 Superdex™ 200 column (GE Healthcare Life Science), eluted, and concentrated in a buffer containing 50 mM Tris-HCl, pH 8.0, and 300 mM NaCl.
Crystallization and Data Collection-Crystallization trials were performed by the sitting-drop vapor diffusion method using a mosquitoCrystal (TTP Labtech) crystallization robot and a protein concentration of 13 mg ml⁻¹. Cubic and bipyramidal crystals were obtained in several conditions from various commercial screening kits. The best diffracting crystals of native SdGluc5_26A were obtained in the presence of 0.2 M ammonium sulfate, 0.1 M sodium cacodylate buffer, pH 6.0, and 25% (w/v) PEG 8000. The crystals belong to cubic space group P4₁32 with average unit cell axes of 143 Å and one molecule per asymmetric unit. Crystals of selenomethionine-labeled protein were obtained from 25% (w/v) PEG 3350, 0.2 M MgCl₂, and 0.1 M Bis-Tris buffer, pH 6.5. These crystals belong to space group P4₁2₁2 with unit cell dimensions 143 × 143 × 136 Å and three molecules per asymmetric unit. Crystals of SdGluc5_26A-E291Q used for substrate soaking were grown in the same conditions as the native enzyme, and soaking was performed by adding small amounts of powdered G4, G4A, or G4B to the crystallization droplet, followed by incubation for approximately 1 h. For co-crystallization experiments, SdGluc5_26A-E291Q was mixed with 15 mM (final concentration) G4A, and the best diffracting crystals were obtained from 30% (w/v) PEG 4000, 0.1 M Tris-HCl buffer, pH 7.5, and 0.2 M MgCl₂. These crystals belong to space group P2₁ with unit cell parameters 72 × 60 × 130 Å, β = 104°, and three molecules per asymmetric unit. A second crystal form in space group C2 with unit cell dimensions of 188 × 132 × 131 Å, β = 134°, was obtained in the same crystallization condition. All crystals were cryo-protected with mother liquor containing 25% (v/v) glycerol prior to flash-cooling in liquid nitrogen. X-ray diffraction data for all crystals were collected on beamline Proxima1 at the Synchrotron SOLEIL (Gif-sur-Yvette, France).
Diffraction data were indexed with XDS (17) and scaled with the program SCALA (18).
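As a plausibility check on the reported asymmetric unit contents, the Matthews coefficient (unit cell volume per dalton of protein) can be estimated from the cell parameters given above. The sketch below is illustrative only: the monomer mass of 41 kDa is an assumption, taken as one third of the 123 kDa trimer mass measured by SEC-MALLS and reported in the Results, and the function names are hypothetical. The numbers of symmetry operators (24 for P4₁32, 2 for P2₁) are standard for these space groups.

```python
import math


def cell_volume(a: float, b: float, c: float, beta_deg: float = 90.0) -> float:
    """Unit cell volume in cubic angstroms for cells with alpha = gamma = 90 degrees
    (cubic, tetragonal, orthorhombic, and monoclinic cells)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(volume: float, n_symops: int,
                         n_per_asu: int, monomer_da: float) -> float:
    """Matthews coefficient: cell volume divided by total protein mass in the cell."""
    return volume / (n_symops * n_per_asu * monomer_da)


# Assumption: monomer mass is one third of the 123 kDa SEC-MALLS trimer mass.
MONOMER_DA = 41_000.0

# Native crystals: cubic P4(1)32, a = 143 angstroms, one molecule per asymmetric unit.
vm_cubic = matthews_coefficient(cell_volume(143, 143, 143), 24, 1, MONOMER_DA)

# Co-crystals: monoclinic P2(1), 72 x 60 x 130 angstroms, beta = 104 degrees, three molecules.
vm_p21 = matthews_coefficient(cell_volume(72, 60, 130, 104), 2, 3, MONOMER_DA)

print(f"P4(1)32: {vm_cubic:.2f} A^3/Da, P2(1): {vm_p21:.2f} A^3/Da")
```

Under these assumptions both values fall between 2 and 3 Å³/Da, which is within the range commonly observed for protein crystals and therefore consistent with the stated number of molecules per asymmetric unit.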
Structure Determination and Refinement-The sub-structure of selenium-labeled SdGluc5_26A was determined with the program ShelxD (19), and phase calculations and solvent flattening were carried out with ShelxE (20). The protein chain was traced automatically with the program Buccaneer (21), and refinement and model adjustment were carried out with the programs Refmac (22) and Coot (23), respectively. Random sets of ~5% of the reflections were set aside for cross-validation purposes. The composition of cross-validation data sets was systematically taken over from the parent data set of the equivalent space group. Model quality was assessed using the MolProbity server (24). (25).
Enzyme Assays-Enzyme activity on the library of substrates was monitored using the dinitrosalicylic acid (DNS) assay (26) and high performance anion exchange chromatography coupled with pulsed amperometric detection (HPAEC-PAD) on a Dionex ICS 3000 system equipped with a CarboPac PA-10 column (Thermo Scientific). Unless otherwise indicated, assay mixtures contained substrate and suitably diluted enzyme in McIlvaine's buffer, pH 7.0, at 30°C. Briefly, 20 µl of recombinant enzyme were mixed with 100 µl of polymeric substrate (5 mg ml⁻¹) or oligosaccharides (100 µM). The reaction was stopped by the addition of 120 µl of DNS followed by a 5-min incubation at 100°C. For HPAEC analysis, the reaction was stopped using 8 M urea. The reaction mixtures were then transferred to a microfiltration plate and centrifuged for 2 min at 1,450 × g, and 20 µl of the mixture were injected. The elution was carried out in 130 mM NaOH using a multi-step linear gradient program as follows. The first step was a 10-min linear gradient from 100% A (130 mM NaOH) to 95% A and 5% B (500 mM NaOAc, 130 mM NaOH). The second step was a 10-min linear gradient from 95% A and 5% B to 85% A and 15% B, and the third step was a 5-min linear gradient from 85% A and 15% B to 83.75% A and 16.25% B. The optimal pH was estimated using lichenan as substrate at a concentration of 10 mg ml⁻¹ in McIlvaine's buffer (pH from 4.0 to 8.0) and in 100 mM Tris-HCl buffer (pH from 8.0 to 10.0). The optimal temperature was estimated at temperatures ranging from 4 to 80°C. Specific activities and kinetic parameters toward the different complex polysaccharide substrates were measured using the DNS assay as described above. Michaelis-Menten constants on lichenan, barley β-glucan, and CMC were determined with substrate concentrations ranging from 1 to 20 mg ml⁻¹. All assays were carried out in triplicate.
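The three-step HPAEC gradient described above can be written down as a piecewise-linear function of time. The sketch below is only an illustration of that description, with hypothetical function names; it returns the percentage of eluent B and the corresponding sodium acetate concentration at a given time, assuming the gradient starts at t = 0 and that eluent B contains 500 mM NaOAc as stated.

```python
# Breakpoints of the gradient as (time in min, % eluent B), taken from the text:
# 0-10 min: 0 -> 5% B; 10-20 min: 5 -> 15% B; 20-25 min: 15 -> 16.25% B.
GRADIENT = [(0.0, 0.0), (10.0, 5.0), (20.0, 15.0), (25.0, 16.25)]

NAOAC_IN_B_MM = 500.0  # sodium acetate concentration of eluent B (mM)


def percent_b(t_min: float) -> float:
    """Percentage of eluent B at time t_min, by linear interpolation."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the described gradient (0-25 min)")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable for times within the gradient")


def acetate_mm(t_min: float) -> float:
    """Sodium acetate concentration (mM) delivered at time t_min."""
    return NAOAC_IN_B_MM * percent_b(t_min) / 100.0


for t in (0, 5, 10, 15, 20, 25):
    print(f"{t:>2} min: {percent_b(t):5.2f}% B, {acetate_mm(t):6.2f} mM NaOAc")
```

Because both eluents contain 130 mM NaOH, the hydroxide concentration stays constant and only the acetate concentration changes along the gradient.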
The specific activities are expressed in µmol of sugar released per min per mg of enzyme, whereas the kinetic parameters were estimated by weighted nonlinear least-squares regression analysis using the GraFit program (Erithacus Software, Horley, UK). Monosaccharides and oligosaccharides generated after hydrolysis by the recombinant enzymes of the different polymeric substrates (lichenan, barley β-glucan, and CMC), of the different cello-oligosaccharides (G3 to G6), and of the β(1,3;1,4)-gluco-oligosaccharides were analyzed by HPAEC as described above. Calibration curves were constructed using appropriate standards, from which response factors were calculated (Chromeleon program, Dionex) and used to estimate the amount of product released in test incubations. All assays were carried out in triplicate. The specificity constants were calculated using the Matsui equation for oligosaccharides (27,28). Activities toward pNP-substrates (1 mM) were determined by measuring the release of 4-nitrophenol in McIlvaine's buffer, pH 7.0, at 30°C in a 100-µl reaction volume. The reaction was stopped by the addition of 200 µl of 1 M sodium carbonate, and the release of 4-nitrophenol was quantified at 405 nm using the molar extinction coefficient of 4-nitrophenol (18,300 M⁻¹ cm⁻¹). One unit of enzyme activity was defined as the amount of protein that released 1 µmol of glucose per min.
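The conversion from an absorbance reading at 405 nm to an amount of 4-nitrophenol follows the Beer-Lambert law. The following sketch illustrates the arithmetic with the extinction coefficient and volumes given above. The 1-cm path length, the example absorbance, and the 10-min incubation time are assumptions chosen for illustration, and the function names are hypothetical.

```python
EPSILON_PNP = 18_300.0    # M^-1 cm^-1, extinction coefficient given in the text
PATH_LENGTH_CM = 1.0      # assumption: 1-cm optical path
FINAL_VOLUME_L = 300e-6   # 100 µl reaction + 200 µl of 1 M sodium carbonate


def pnp_released_umol(a405: float) -> float:
    """Amount of 4-nitrophenol (µmol) in the stopped reaction, from A405."""
    concentration_molar = a405 / (EPSILON_PNP * PATH_LENGTH_CM)  # Beer-Lambert law
    return concentration_molar * FINAL_VOLUME_L * 1e6


def activity_units(a405: float, minutes: float) -> float:
    """Activity in units, taking one unit as 1 µmol of product released per min."""
    return pnp_released_umol(a405) / minutes


# Hypothetical reading: A405 = 0.366 after a 10-min incubation.
print(f"{pnp_released_umol(0.366):.4f} µmol released")
print(f"{activity_units(0.366, 10):.5f} units")
```

Dividing the result by the mass of enzyme in the assay would then give a specific activity in units per mg.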

Results
Three-dimensional Structure of SdGluc5_26A-The sequence of SdGluc5_26A is composed of a 21-residue-long signal peptide and a catalytic GH5 module comprising 344 residues. The crystal structure of SdGluc5_26A heterologously expressed in E. coli was solved in its native form at 2.05 Å resolution. The polypeptide chain is visible from Asn29 to Lys364 (for data collection and refinement statistics, see Table 2). SdGluc5_26A has the prototypical (β/α)₈ fold characteristic of other family GH5 members and of enzymes of clan A, to which GH5 belongs. A narrow and deep cleft ~30 Å long runs across the surface of the protein near the C termini of the β-strands and hosts the two invariant catalytic glutamate residues located at the ends of β-strands 4 and 7. From similarity to other GH5 enzymes, the catalytic acid-base and the catalytic nucleophile can be assigned to Glu189 and Glu291, respectively. The carboxylate groups of Glu189 and Glu291 are separated by a distance of ~4.5 Å, in agreement with a double-displacement retaining mechanism proceeding via oxocarbenium ion-like transition states (29). The catalytic role of Glu291 was confirmed by the observation that the E291Q mutant displayed only residual activity based on the DNS assay (less than 100 milliunits mg⁻¹). One glycerol molecule and two ethylene glycol molecules were found to bind within the substrate-binding cleft, close to the catalytic machinery. On the side opposite to the substrate-binding groove, SdGluc5_26A contains an extra lid-like β-hairpin, commonly encountered in GH5 enzymes. Studies of thermostable GH5 cellulases from Thermotoga maritima (30) and Bacillus subtilis 168 (31) indicate that the presence of additional structural elements at the N terminus stabilizes the structure of the proteins. Likewise, the presence of a metal ion in the proximity of the lid-like β-hairpin, such as manganese in the structures of B. subtilis Cel5A (31) and Cel5Z from Dickeya dadantii (formerly Erwinia chrysanthemi) (32), has been attributed a stabilizing role. In SdGluc5_26A, a magnesium ion is present in all structures, including those obtained from crystals grown in the absence of a magnesium salt, in a position close to the one occupied by metal ions in other GH5 structures, and is coordinated by water molecules and two main-chain carbonyl groups originating from loop β5-α5. Beyond the lid-like β-hairpin, the N terminus of SdGluc5_26A (residues Asn29-Thr48) projects tangentially away from the TIM-barrel, and the interlacement of three N termini with three SdGluc5_26A monomers gives rise to a compact trimeric assembly (Fig. 1A). The association of SdGluc5_26A into a trimer within the crystal lattice results in a buried surface of 5,290 Å² per monomer as calculated by the PISA server (33). The existence of SdGluc5_26A as a trimer in solution was confirmed by size-exclusion chromatography coupled to multi-angle laser light scattering (molecular mass of 123 kDa; Fig. 1B). The tightest interactions between monomers are provided by the extremity of the N-terminal extension, from Asn29 to Pro37 (Fig. 1C). The side chains of Asn29, Asp30, and Tyr36 are involved in hydrogen bonds with the side chains of Lys113, Arg82, and Gln153 of an adjacent subunit, respectively, whereas the main-chain carbonyl group of Ile34 interacts with the side chain of His154 of an adjacent monomer. The interactions between subunits are completed by stacking interactions between the aromatic ring of Trp31 and the backbone of Val109-Ser110 and between the side chain of Trp32 and the side chain of Lys78. Interestingly, both tryptophan residues are well conserved in the N-terminal extension of some GH5_26 members (Fig. 1D).
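The trimeric state inferred from SEC-MALLS can be checked with simple arithmetic: the measured mass divided by the monomer mass should be close to an integer. The sketch below assumes an average residue mass of 110 Da, a common rule of thumb, and uses only the 344-residue catalytic module, ignoring any purification tag; both are simplifying assumptions, and the function names are hypothetical.

```python
AVG_RESIDUE_DA = 110.0  # assumption: rule-of-thumb average mass of an amino acid residue


def estimate_monomer_kda(n_residues: int) -> float:
    """Rough monomer mass in kDa from the number of residues."""
    return n_residues * AVG_RESIDUE_DA / 1000.0


def oligomer_number(measured_kda: float, monomer_kda: float) -> int:
    """Nearest integer number of subunits for a measured molecular mass."""
    return round(measured_kda / monomer_kda)


monomer_kda = estimate_monomer_kda(344)          # catalytic GH5 module
n_subunits = oligomer_number(123.0, monomer_kda)  # SEC-MALLS mass from the text

print(f"estimated monomer: {monomer_kda:.1f} kDa, subunits: {n_subunits}")
```

With these assumptions the ratio rounds to three subunits, in line with the trimer seen in the crystal; a tagged construct would have a slightly higher monomer mass, which brings the ratio closer to exactly three.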
A search for structural homologues of SdGluc5_26A with the DALI server identified a large number of GH5 enzymes with significant structural similarity. The highest Z-scores are associated with the subfamily GH5_26 metagenome-derived endoglucanase Cel5A (Z-score 51.8; Protein Data Bank (PDB) entry 4HTY (13)), with Thermobifida fusca TfCel5A (Z-score 32.6; PDB entry 2CKR), and with BaCel5A from Bacillus agaradhaerens (Z-score 32.0; PDB entry 1A3H (34)). The other most closely related structures are endoglucanases from subfamilies GH5_1 and GH5_2, pointing toward a strong structural relationship with these two subfamilies. The structure of metagenome-derived Cel5A has previously been solved in complex with a cellotetraose molecule bound to subsites −2 to +2 (PDB entry 4HU0 (13)), and the structure of a TfCel5A mutant with a cellopentaose molecule spanning subsites −3 to +2 is available in the PDB (accession code 2CKR). A plethora of complex structures are available for BaCel5A, allowing an exact mapping of sugar-binding subsites (34,35). A structural overlay of the native form of SdGluc5_26A with the above complexes suggests that the enzyme features four sugar-binding subsites extending from −2 to +2. The −3 subsite engaged by the β(1,4)-linked glucose moieties in the complexes of TfCel5A and BaCel5A is occupied by the N-terminal helix-turn motif of a neighboring molecule within the trimeric assembly of SdGluc5_26A (Fig. 1, A and C).
MARCH 25, 2016 • VOLUME 291 • NUMBER 13
Complex Structures-To investigate substrate recognition by the active site, a SdGluc5_26A nucleophile mutant E291Q was generated. Complex structures with compounds G4, G4A, and G4B (Table 1) were obtained by soaking and diffracted to a resolution of 2.3, 1.9, and 2.2 Å, respectively. Cellotetraose G4 binds in subsites −2 to +2 of the enzyme, bypassing the catalytic machinery in a non-productive manner (Figs. 2A and 3). Similar non-productive binding modes evading the catalytic machinery have been observed in the cellopentaose complex of TfCel5A (accession code 2CKR) and in the complex of BaCel5A with a thio-cellopentaoside (PDB entry 1H5V (36)). In the E291Q-G4 complex, the individual pyranoside units are all in the standard ⁴C₁ chair conformation and refine with average temperature factors of 28, 29, 27, and 44 Å², going from subsites −2 to +2. The glucose unit in the −2 subsite stacks against the fully conserved Trp336 and interacts with Ser337 main- and side-chain atoms via its O2 hydroxyl group, and with the side chain of His107 by means of the O6 hydroxyl. Due to the displacement of the glucose in subsite −1 from a catalytically relevant position, this subsite is characterized by a paucity of direct interactions with the protein. Only the O2 hydroxyl establishes direct hydrogen bonds with the carboxyl group of the catalytic acid/base Glu189 and the side chain of the fully conserved Tyr256. Also, as a consequence of the linearized bypass binding mode of cellotetraose, the glucose units at the aglycon side are shifted away from the true +1 and +2 binding sites, as inferred from comparison with other family GH5 enzyme-substrate complexes. However, a structural comparison with metagenome-derived Cel5A containing a cellotetraose molecule bound to subsites −2 to +2 (PDB entry 4HU0 (13)) suggests that in a catalytically relevant binding mode, pyranose units in subsites +1 and +2 could be stabilized by stacking interactions with residues Trp232 and Pro305 and by hydrogen bonds with the side chains of Lys259 and Lys261.

Structure-Activity Relationships of a GH5_26 Glucanase
In the E291Q-G4B complex, the mode of binding of the sugar polymer is identical to that observed in the cellotetraose complex, except for subsite +2, where, by virtue of the β(1,3) linkage between subsites +1 and +2, the plane of the glucose ring in subsite +2 is rotated by 180 degrees with respect to a pyranose in a β(1,4) oligomer (Figs. 2B and 3). The individual glucose units refined to average temperature factors of 38, 42, 35, and 49 Å², going from subsites −2 to +2.
The most intriguing complex was obtained by soaking of E291Q crystals with compound G4A. Excellent electron density could be observed in subsites −2 to +2 for the four glucose units (Fig. 2C), all in the ⁴C₁ conformation and refining to average temperature factors of 20, 19, 21, and 23 Å², when going from subsites −2 to +2. Although the oligosaccharide appears to bind in a productive manner, with a glucose ring well docked into the −1 subsite, the chain is reversed, with the reducing end located in subsite −2 and the non-reducing end located in subsite +2 (Fig. 3A). Interestingly, there are more direct hydrogen-bonding interactions between compound G4A and the enzyme, and bonding distances are shorter as compared with interactions between E291Q and compounds G4 and G4B.
To obtain a true complex with compound G4A, co-crystallization experiments were performed. Two different crystal forms were obtained, hereafter called G4A-C2 and G4A-P2₁, diffracting to 2.0 and 1.35 Å resolution, respectively. Crystal form G4A-C2 contains six molecules per asymmetric unit, and a trimer is present in the asymmetric unit of crystal form G4A-P2₁. All monomers of both crystal forms showed clear electron density in subsites −3 to −1 (Fig. 2, D and E), and a trisaccharide with all pyranose units in the low-energy ⁴C₁ chair conformation could be modeled, with the β(1,3) linkage between subsites −3 and −2 (Fig. 2). Average temperature factors for glucose units in the individual subsites are in the order −3 > −1 > −2. In the tetrasaccharide G4A, the β(1,3) linkage is situated at the non-reducing end, and for an intact substrate, one would expect the chain to span the substrate-binding cleft from subsites −3 to +1. Nonetheless, no residual density could be observed in subsite +1, indicating that the E291Q mutant retained sufficient residual activity to cleave the substrate during the prolonged incubation necessary for crystal growth. In both crystal forms, the glucose units in subsites −1 and −2 adopt positions almost identical to the ones observed in the Acidothermus cellulolyticus endocellulase E1 in the Michaelis complex with cellotetraose (PDB entry 1ECE (37)). In subsite −1, the O1 hydroxyl group forms a hydrogen bond with the side chain of the catalytic acid/base, and the O3 group contacts the conserved His142 (Fig. 3B). The O6 hydroxyl is found in intermediate states between syn- and anti-positions with respect to the endocyclic O5 and accordingly forms a short hydrogen bond, 2.3-2.65 Å in length, with His303. This latter residue is located on loop β7-α7, which is structurally and sequence-wise very conserved within subfamily GH5_26 but very distinct from that of members of other GH5 subfamilies. Interestingly, a spatially overlapping His residue is delivered in the T. maritima enzymes Cel5A (subfamily GH5_25) and Cel5B (subfamily GH5_36) from loop β6-α6 into the active site cleft, and His205 of T. maritima Cel5A (PDB accession code 3AZT) interacts equally with the O6 hydroxyl of a pyranose located in subsite −1. His303 points toward the solvent in the unbound form and in the unproductive complexes of SdGluc5_26A, and the conformational change upon substrate binding leading to a strong hydrogen bond indicates that this residue is crucial for substrate binding. In subsite −2, the carbohydrate-enzyme interactions are virtually the same as the ones observed in the complexes with compounds G4 and G4B. By virtue of the β(1,3) linkage between subsites −3 and −2, the pyranose in subsite −3 is projected out of the cleft toward the rim formed by loop β7-α7, thus avoiding the steric hindrance imposed by the presence of the N terminus of a neighboring SdGluc5_26A monomer. The O2 and O3 hydroxyl groups interact with the side chains of Ser337 and Tyr300, respectively, whereas the O6 hydroxyl interacts with the side chain of Trp336 and the main-chain carbonyl of Trp32, located at the N terminus of a neighboring subunit protruding into the substrate-binding cleft. In crystal form G4A-P2₁, the glucose unit in subsite −3 adopts an alternate, or double, conformation, depending on the monomer. In the alternate conformation, a change in φ/ψ angles from −92/−139 to −69/−109 tilts the glucose unit out of the plane formed with the glucose in −2 and moves it away from loop β8/α8 and closer to loop β7/α7. As a consequence, in this alternate conformation, the O2 hydroxyl establishes a new interaction with the main-chain carbonyl of Asp335, at the expense of the direct interaction between the O6 group and residues from the N terminus of an adjacent monomer, contacts that are now only water-mediated. The possibility that the alternate conformation in subsite −3 is dictated by crystal packing constraints can be ruled out on the basis that in one of the monomers within the G4A-P2₁ trimer, both conformations are present concomitantly.
SdGluc5_26A exhibits an apparent pH optimum of 7.0 and a broad range of activity from pH 4.0 to 10.0 and displays an apparent optimum temperature of 30°C using lichenan as substrate (data not shown). Interestingly, SdGluc5_26A retained its activity after a drastic treatment with 500 mM NaOH, and the only way to stop enzyme activity was treatment with 8 M urea.
The enzyme showed a better affinity and catalytic efficiency on barley β-glucan than on lichenan (Km,app = 0.90 versus 10.86 mg ml⁻¹ and kcat/Km = 1.0 × 10⁶ versus 2.3 × 10⁴ min⁻¹ mg⁻¹ ml, respectively). Catalytic efficiencies on G3B, G4A, and G4B revealed some differences, the highest value being obtained using G4B, where the β(1,4)-bond was cleaved with an endo-mode of action (Table 1). The catalytic efficiencies measured on G4A and G4B were of the same order of magnitude as those measured for cello-oligosaccharides (Table 1). Interestingly, the enzyme was also able to cleave the β(1,3)-bond in G3B, although with a lower catalytic efficiency.
Properties of the SdGluc5_26AΔS38 Mutant-Based on structural analyses and substrate specificity of SdGluc5_26A, we decided to investigate further the N-terminal sequence motif within the GH5_26 subfamily. Of the 24 GH5_26 sequences available, more than half originate from metagenomic analyses and belong to unidentified microorganisms. The others belong to Gammaproteobacteria, Bacteroidetes, and Verrucomicrobia. All of the GH5_26 sequences present the same modularity, with a signal peptide at their N terminus followed by the catalytic domain. Sequence alignment of the 24 sequences divided the GH5_26 subfamily into two groups (Fig. 1D). The first group consists of 11 unidentified sequences without N-terminal extension. In the second group, the length and the sequence of the N-terminal motif are well conserved. The tryptophan motif that is directly involved in the interaction with the active site is found in the majority of sequences (Fig. 1D). Based on these sequence alignments, we decided to delete the N-terminal sequence of SdGluc5_26A up to residue Ser38 (Fig. 1D). The deletion of the N-terminal region in the SdGluc5_26AΔS38 mutant destabilized the oligomeric state, resulting in a mixture of trimeric and monomeric forms (Fig. 6). The deletion did not induce any significant differences in the hydrolysis of barley β-glucan and lichenan in terms of products formed (i.e. G2, G3A, and G4C) and kinetic constants (i.e. Km,app = 2.68 versus 12.56 mg ml⁻¹ and kcat/Km = 7.7 × 10⁵ versus 1.8 × 10⁴ min⁻¹ mg⁻¹ ml, respectively) as compared with the wild type. However, SdGluc5_26AΔS38 displayed better activity toward CMC (kcat/Km = 1.1 × 10⁵ min⁻¹ mg⁻¹ ml) as compared with the wild-type enzyme (kcat/Km = 5.7 × 10¹ min⁻¹ mg⁻¹ ml). More strikingly, the SdGluc5_26AΔS38 mutant released a mixture of G2 to G6 products from CMC, suggesting a modification of the substrate accommodation (Fig. 5). This trend was confirmed using cello-oligosaccharides, as G2, G3, and G4 were released from G6 (Table 1). Cleavage of β(1,3;1,4)-gluco-oligosaccharides also revealed striking differences as compared with the wild-type enzyme. Indeed, G4B was cleaved into G3 and G1, suggesting the accommodation of a β(1,4)-linked glucose in subsite −3 of the SdGluc5_26AΔS38 mutant (Fig. 5).
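The effect of the N-terminal deletion on each polymeric substrate can be summarized as the ratio of the mutant to the wild-type catalytic efficiency. The short sketch below computes these ratios from the kcat/Km values reported above (in min⁻¹ mg⁻¹ ml); the dictionary layout and key names are illustrative choices, not part of the original analysis.

```python
# kcat/Km values reported in the text, in min^-1 mg^-1 ml.
KCAT_KM = {
    "barley beta-glucan": {"wild_type": 1.0e6, "delta_S38": 7.7e5},
    "lichenan": {"wild_type": 2.3e4, "delta_S38": 1.8e4},
    "CMC": {"wild_type": 5.7e1, "delta_S38": 1.1e5},
}

# Fold change of the truncated mutant relative to the wild-type enzyme.
fold_change = {
    substrate: values["delta_S38"] / values["wild_type"]
    for substrate, values in KCAT_KM.items()
}

for substrate, ratio in fold_change.items():
    print(f"{substrate:<20} mutant / wild type = {ratio:,.2f}")
```

On the mixed-linkage glucans the ratio stays close to 1, whereas on CMC the catalytic efficiency of the mutant is roughly three orders of magnitude higher than that of the wild type, which is the quantitative basis for the shift toward endo-β(1,4)-glucanase activity.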
The three-dimensional structure of SdGluc5_26A adopts a (β/α)₈ topology, classical for GH5 members and clan GH-A enzymes in general. Surprisingly, a unique N-terminal extension projects away from the main structural core and protrudes into the substrate-binding groove of a neighboring catalytic subunit, thereby conferring a trimeric quaternary architecture to SdGluc5_26A. To ascertain that this novel structural feature was not a crystallographic artifact, we confirmed the oligomeric nature of SdGluc5_26A by solution studies. Interestingly, the only other known three-dimensional structure of a GH5_26 member, metagenome-derived Cel5A (PDB code 4HTY (13)), exhibits an identical trimeric assembly, generated through the three-fold axis of the cubic space group. Nevertheless, in this case, the N-terminal extension, sequence-wise quite conserved with respect to that of SdGluc5_26A, adopts a different conformation and does not protrude into the active site of a neighboring subunit. However, the elevated thermal displacement factors of the N-terminal residues in the Cel5A structure suggest that they are not well stabilized in the observed conformation. The configuration of the N terminus in the Cel5A structure might also arise from crystal packing constraints conferring a thermodynamically more favorable conformation in the present conditions. Unfortunately, this structural feature and the oligomeric state of Cel5A are not discussed by the authors (13).
When overlaying the native structure of SdGluc5_26A with the structures of other GH5 enzymes in complex with oligosaccharides, it turned out that the −3 subsite engaged by β(1,4)-linked glucose moieties in those complex structures is occupied by the N-terminal helix-turn motif of a neighboring molecule within the trimeric assembly of SdGluc5_26A. However, a putative −3 subsite could be envisioned for a β(1,3)-linked glucose moiety, and in the light of the enzymatic data, which have shown that SdGluc5_26A is active on mixed-linkage polysaccharides, the inactive nucleophile mutant E291Q was produced to obtain a structural view of the binding of mixed-linkage oligosaccharide substrates to SdGluc5_26A. Complex structures of SdGluc5_26A with compounds G4 and G4B revealed a strict requirement for the accommodation of β(1,4)-bonds between subsites −1 and −2 and a tolerance for both β(1,4)-bonds and β(1,3)-bonds between subsites +1 and +2. The failure to degrade G4C and the inability to detect enzymatic activity on pure β(1,3)-linked compounds indicate a hydrolytic activity preferentially on β(1,4)-linkages (Fig. 5). Initial soaking experiments with compound G4A revealed an intriguing inverse binding mode of the oligosaccharide. Retrospectively, the inverse binding mode of G4A in the complex obtained by soaking can be explained by the fact that in the native crystal form, crystal packing contacts did not allow accommodation of a glucose moiety in subsite −3. Surprisingly, as judged by the number and length of hydrogen bonds established between the oligosaccharide and the enzyme, as well as by the temperature factors of the individual sugar subunits, this mode of binding appears to be very strong. Finally, the structures of E291Q-G4A complexes obtained by co-crystallization disclosed a strict requirement for the lodging of β(1,3)-linkages between subsites −3 and −2 and provided the structural rationalization of the observed enzymatic activities.
Within the family GH5, only a few enzymes have been identified as β(1,3;1,4)-glucanases, and none with structural insights supporting this specificity have been identified. The present characterization of SdGluc5_26A allows us to ascertain that its specificity finds its origin in the quaternary structure. Until this work, only a carbohydrate-binding module had been shown to be able to affect the substrate specificity of the appended catalytic domain (40). To further confirm the influence of the quaternary structural assembly on the substrate specificity of SdGluc5_26A, we produced the deletion mutant SdGluc5_26AΔS38, devoid of the helix-turn motif interacting with residues of the substrate-binding cleft at the level of subsite −3. As anticipated by our results, this mutation uncovered the enzyme cleft at the −3 subsite and turned the enzyme into an endo-β(1,4)-glucanase. However, this novel −3 subsite, able to accommodate β(1,4)-linkages, does not seem to make productive interactions with the substrate. This is supported by the lack of −3 to −1 binding of G4 and the similar or significantly reduced activity of the mutant against G6 and G5, respectively, as compared with the wild-type enzyme (Table 1).
So far, only a limited number of studies have reported the conversion of carbohydrate-active enzymes (CAZymes) from an exo-mode of action into endo-acting enzymes: xylanase activity was introduced into a GH43 arabinofuranosidase (38), and endo-mannanase activity was introduced into a GH26 exo-acting enzyme (39). Further careful biophysical characterization of other members of the GH5_26 subfamily is now required to assess whether the quaternary structure controls the substrate specificity of the entire GH5_26 subfamily.