Kinetic and molecular dynamics study of inhibition and transglycosylation in Hypocrea jecorina family 3 β-glucosidases

β-Glucosidases enhance enzymatic biomass conversion by relieving cellobiose inhibition of endoglucanases and cellobiohydrolases. However, the susceptibility of these enzymes to inhibition and transglycosylation at high glucose or cellobiose concentrations severely limits their activity and, consequently, the overall efficiency of enzyme mixtures. We determined the impact of these two processes on the hydrolytic activity of the industrially relevant family 3 β-glucosidases from Hypocrea jecorina, HjCel3A and HjCel3B, and investigated the underlying molecular mechanisms through kinetic studies, binding free energy calculations, and molecular dynamics (MD) simulations. HjCel3B had a 7-fold higher specificity for cellobiose than HjCel3A but greater tendency for glucose inhibition. Energy decomposition analysis indicated that cellobiose has relatively weak electrostatic interactions with binding site residues, allowing it to be easily displaced by glucose and free to inhibit other hydrolytic enzymes. HjCel3A is, thus, preferable as an industrial β-glucosidase despite its lower activity caused by transglycosylation. This competing pathway to hydrolysis arises from binding of glucose or cellobiose at the product site after formation of the glycosyl-enzyme intermediate. MD simulations revealed that binding is facilitated by hydrophobic interactions with Trp-37, Phe-260, and Tyr-443. Targeting these aromatic residues for mutation to reduce substrate affinity at the product site would therefore potentially mitigate transglycosidic activity. Engineering improved variants of HjCel3A and other structurally similar β-glucosidases would have a significant economic effect on enzymatic biomass conversion in terms of yield and production cost as the process can be consequently conducted at higher substrate loadings.

Plant biomass is a rich, sustainable carbon source that presents an attractive alternative to fossil reserves for the generation of liquid fuel products (1,2). Current industrial practices for converting biomass to small, fermentable sugar precursors often include an enzymatic hydrolysis process that utilizes the secreted hydrolytic enzymes of filamentous fungi, most commonly Hypocrea jecorina (teleomorph of Trichoderma reesei) (3). The fungal endoglucanases, lytic polysaccharide monooxygenases, processive cellobiohydrolases, and β-glucosidases work synergistically to decompose cellulose, the most abundant polysaccharide of plant biomass, to glucose (4). β-Glucosidases play a particularly important role in maintaining efficient enzyme mixture turnover as they are responsible for relieving endoglucanase and cellobiohydrolase product inhibition through hydrolysis of cellobiose to glucose (5).
It is generally accepted that glycoside hydrolase family 3 (GH3) enzymes are among the best cellobiose-hydrolyzing enzymes, and accordingly, they comprise the majority of β-glucosidases in industrial enzyme mixtures (6). For instance, the optimum level of Cel3A from H. jecorina (HjCel3A) in cellulase mixtures enhanced the conversion of various cellulosic substrates by nearly 10% (7). Overexpression of the HjCel3A-encoding gene in H. jecorina also resulted in a 5-fold increase in glucose production from cellobiose (8).
The solution of the first HjCel3A crystal structure revealed new insights into catalytic activity and substrate affinity (7). Namely, a network of hydrogen bonds, originating from surrounding charged residues, stabilizes a glucose molecule in the −1 site (substrate site). Aromatic residues flank the +1 site (product site) where the reducing end of cellobiose would be situated. Additionally, the ability of HjCel3A to catalyze hydrolysis of longer cello-oligomers points toward the existence of a putative +2 product site.
Despite these very important findings, many of the fundamental molecular mechanisms of HjCel3A activity, along with that of related GH3s, have yet to be defined. Decomposition of polysaccharides by GH3 β-glucosidases is proposed to follow the retaining mechanism (see Fig. 1). In the initial glycosylation step, the glycosidic oxygen is protonated by the acid/base catalyst, Glu, whereas the nucleophile, Asp, attacks the anomeric carbon of the glycone moiety to form a glycosyl-enzyme intermediate (GEI). In the subsequent hydrolysis reaction, the deprotonated Glu activates a water molecule that carries out another nucleophilic attack, yielding a product with the same stereochemistry as the substrate (12).
The activity of β-glucosidases is seriously impeded at high cellobiose or glucose concentration because of their susceptibility to product and/or substrate inhibition and transglycosylation (13). Transglycosylation is a competing catalytic pathway to hydrolysis wherein glucose or another cellobiose molecule, instead of water, attacks the GEI to form a di- or trisaccharide (Fig. 1). Accumulation of cellobiose and glucose inevitably occurs under high substrate loadings, a necessary condition to increase the yield and lower the production costs of industrial biomass conversion (13-15).
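The competition between hydrolysis and transglycosylation described above can be summarized schematically. The scheme below is a simplified sketch: E is the free enzyme, G2 is cellobiose, G is glucose, E-G is the GEI, and A is a sugar acceptor (glucose or cellobiose). The rate-constant labels k_1, k_{-1}, k_2, k_H, and k_T are notation introduced here for illustration only.

```latex
\begin{align*}
\mathrm{E} + \mathrm{G_2}
  &\underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} \mathrm{E{\cdot}G_2}
  \xrightarrow{k_2} \mathrm{E\text{-}G} + \mathrm{G}
  && \text{(glycosylation)} \\
\mathrm{E\text{-}G} + \mathrm{H_2O}
  &\xrightarrow{k_H} \mathrm{E} + \mathrm{G}
  && \text{(hydrolysis)} \\
\mathrm{E\text{-}G} + \mathrm{A}
  &\xrightarrow{k_T} \mathrm{E} + \mathrm{G\text{-}A}
  && \text{(transglycosylation)}
\end{align*}
```

Both the hydrolysis and the transglycosylation branch consume the same intermediate, which is why a high acceptor concentration diverts flux away from glucose production.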
A number of studies have investigated the effects of competitive inhibition and transglycosylation on GH3 β-glucosidase performance (13, 16-26); the more recent studies feature novel methods such as isothermal titration calorimetry (21,22) and 13C isotope labeling (23) that allow quantitative determination of the effects of these two processes on hydrolytic activity. One of the notable findings is that transglycosylation is the main cause of decay in activity at high cellobiose concentration but not at high glucose concentration (22).
The critical role of β-glucosidases in the overall biomass conversion efficiency provides the impetus to engineer variants that overcome limitations arising from environmental process conditions. Protein engineering requires knowledge of the molecular factors underlying inhibition and transglycosylation. An analysis of published Michaelis (Km) and inhibition (Ki) constants for cellobiose and glucose, respectively, suggests positive correlation between affinity for cellobiose and strength of glucose inhibition (13). Modulation of the transglycosylation activity of the A. niger β-glucosidase upon mutation of tryptophan residues was linked to affinity of glucose for the −1 and +1 sites (17,19). The results of these studies imply that reducing affinity at the −1 and +1 sites of β-glucosidases will result in lower product inhibition and decrease the incidence of transglycosylation, respectively.
We investigated this hypothesis on HjCel3A and HjCel3B, which exhibit the highest activity against cellobiose among the GH3 β-glucosidases in H. jecorina (11). Kinetic studies and binding free energy calculations were performed to determine which process, inhibition or transglycosylation, has a more significant impact on hydrolytic activity. Protein residues responsible for cellobiose and glucose affinity at the different binding sites were then identified using molecular dynamics (MD) simulations.

Physical properties and model structure of HjCel3B
SDS-PAGE analysis under nonreducing and nondenaturing conditions suggested that HjCel3B is a dimer with a molecular mass of 100 kDa. In contrast, HjCel3A is a monomer with a molecular mass of 60 kDa (7). Both enzymes have the same pH optimum of 5.0 at 37°C (Fig. S1). The HjCel3B homology model reveals an overall fold similar to HjCel3A, consisting of 1) a triose-phosphate isomerase (TIM) barrel-like domain, 2) an α/β sandwich domain, and 3) a fibronectin type III (FnIII) domain (Fig. 2A). However, HjCel3A lacks the insertion region in the FnIII domain. The active site is located at the interface of domains 1 and 2. Glu-516 (domain 2) acts as the acid/base catalyst, and Asp-287 (domain 1) acts as the nucleophile. A glucose substrate is shown bound at the −1 site (Fig. 2B). It is held in place by hydrogen bond interactions with Asp-99, Arg-163, Lys-196, and His-197 (Asp-61, Arg-125, Lys-158, and His-159 in HjCel3A). Glucose also has a hydrophobic interaction with Trp-288 (Trp-237 in HjCel3A) at this site.

Inhibition and transglycosylation in H. jecorina Cel3A/B

Aromatic residues Trp-75 and Tyr-518 (Trp-37 and Tyr-443 in HjCel3A) are central to the +1 product site. Phe-312 is also included in this site, but the corresponding residue in HjCel3A, Phe-260, is oriented away from the substrate pocket. Based on the predicted shape of the substrate pocket (Fig. 2C), the putative +2 site of HjCel3A is formed by Phe-260 and Tyr-68. Although the two aromatic residues at the entrance of the HjCel3B substrate pocket, Phe-106 and Trp-366, may form the corresponding site, the predicted +2 site instead consists of Gln-208, Glu-211, and Gln-289 (Fig. 2D). Gln-208 and Gln-289 correspond to Glu-170 and Asn-238 in HjCel3A, whereas Glu-211 is located in an α-helix in domain 1 that is missing in HjCel3A. The calculated solvent-excluded volume of the HjCel3B substrate pocket is smaller (431 Å³) than that of HjCel3A (519 Å³).

2-Chloro-4-nitrophenyl-β-glucoside (CNPG), 2-chloro-4-nitrophenyl-β-xyloside (CNPX), and cellobiose kinetics
With HjCel3A, there is a hyperbolic increase in initial velocity at low substrate concentrations followed by a decrease from about 1 mM CNPG. In contrast, no decrease in initial velocity is observed with HjCel3B within the same concentration range. The same trend was observed with cellobiose as substrate. The two β-glucosidases are also active on CNPX, which is not unexpected considering the prevalence of xylosidase activity in structurally similar GHs (27,28). The initial velocity does not decrease in this case, and enzyme activation occurs at high substrate concentrations (Fig. 3B).
Km and kcat values for the hydrolysis of CNPG by HjCel3A (Table 1) were determined at low substrate concentrations where inhibition and/or transglycosylation is minimal and the kinetics obey the Michaelis-Menten equation. As the kinetics of CNPX hydrolysis did not follow the Michaelis-Menten equation, the parameters were not calculated. HjCel3A has a higher specificity for CNPG than for cellobiose. Interestingly, HjCel3B not only has a higher specificity for cellobiose but also hydrolyzes this substrate more efficiently than HjCel3A. This is contrary to the comparative study of HjCel3A and HjCel3B expressed in A. oryzae, showing that the latter has a slightly lower kcat/Km (21.6 versus 37.3 mM⁻¹ s⁻¹) (11); we hypothesize that the difference is due to expression in a nonnative host.

Glucose inhibition
The inhibition constants (Ki) determined from the Lineweaver-Burk plot show that, for both HjCel3A and HjCel3B, glucose inhibition increases as the pH increases (Table 2). HjCel3B is more inhibited than HjCel3A at acidic pH, although the Ki values of the two β-glucosidases are similar at higher pH.
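As an illustration of how an inhibition constant can be read from double-reciprocal data, the sketch below fits a Lineweaver-Burk line (1/v against 1/[S]) with and without inhibitor and then applies the competitive-inhibition relation Km,app = Km(1 + [I]/Ki). It assumes purely competitive inhibition; the function names and the synthetic numbers are hypothetical and are not data from this study.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lineweaver_burk(substrate, velocity):
    """Return (Km, Vmax) from 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax."""
    slope, intercept = linear_fit([1.0 / s for s in substrate],
                                  [1.0 / v for v in velocity])
    vmax = 1.0 / intercept
    return slope * vmax, vmax


def competitive_ki(km, km_apparent, inhibitor_conc):
    """Ki from Km,app = Km * (1 + [I]/Ki); valid only for competitive inhibition."""
    return inhibitor_conc / (km_apparent / km - 1.0)


# Synthetic, noise-free example (hypothetical values, arbitrary consistent units).
TRUE_KM, TRUE_VMAX, TRUE_KI, GLUCOSE = 1.0, 10.0, 2.0, 4.0
substrate = [0.25, 0.5, 1.0, 2.0, 4.0]
v_free = [TRUE_VMAX * s / (TRUE_KM + s) for s in substrate]
v_inhibited = [TRUE_VMAX * s / (TRUE_KM * (1 + GLUCOSE / TRUE_KI) + s)
               for s in substrate]

km, vmax = lineweaver_burk(substrate, v_free)
km_app, _ = lineweaver_burk(substrate, v_inhibited)
ki = competitive_ki(km, km_app, GLUCOSE)
```

With real, noisy data the double-reciprocal transform gives disproportionate weight to the low-substrate points, so a direct nonlinear fit is often preferred for the final parameter estimates.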

Identification of transglycosylation products with cellobiose as substrate
A variety of transglycosylation products were obtained for both HjCel3A and HjCel3B (Fig. 4). These include disaccharides (gentiobiose, laminaribiose, and sophorose), formed with the glucose product as acceptor, and cellotriose, formed with the substrate as acceptor. Additional transglycosylation products were formed, likely longer glucosides with mixed ␤-linkages; however, we were unable to identify these products without appropriate comparative standards.

Binding free energies from free energy perturbation with Hamiltonian replica-exchange MD (FEP/λ-REMD)
The free energies for binding glucose and cellobiose in the active site models (Fig. 5) of HjCel3A and HjCel3B are summarized in Table 3. Glucose has similar affinity for both reactive forms of HjCel3A and HjCel3B (Model I). Comparison of individual free energy contributions when the substrate is in the binding site and in bulk solution shows that electrostatic interactions provide the most favorable contribution (−18 kcal/mol) in each case. Cellobiose (Model II) has a much lower affinity for HjCel3B, where binding is dominated by dispersion (−18 kcal/mol), in contrast to HjCel3A, where the binding energy is largely electrostatic (−28 kcal/mol). The energy for glucose binding in the product site (Model III) is essentially zero in both enzymes. Cellobiose binds more stably in the product site when the O4 atom of the nonreducing end is positioned for attack by Glu (Model IV) with comparable energies for HjCel3A and HjCel3B. The binding of both compounds to the GEI is mainly facilitated by dispersion interactions, which contribute between −8 and −14 kcal/mol to the binding energy.
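To give a feel for what a difference in binding free energy means, the sketch below converts a free energy difference into a ratio of equilibrium binding constants through ΔΔG = −RT ln(Ka/Kb). The default temperature of 310.15 K (37°C, the assay temperature) is an assumption made here, since the simulation temperature is not stated in this section, and the helper name is hypothetical.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal/(mol*K)


def affinity_ratio(dg_a, dg_b, temperature=310.15):
    """Ratio Ka/Kb of binding constants for free energies dg_a and dg_b (kcal/mol).

    A more negative free energy means tighter binding, so the ratio
    exceeds 1 when dg_a < dg_b.
    """
    return math.exp(-(dg_a - dg_b) / (GAS_CONSTANT_KCAL * temperature))


# At 37 degrees C, about 1.42 kcal/mol corresponds to a tenfold difference.
tenfold = affinity_ratio(-1.4191, 0.0)
```

Because the relation is exponential, differences of several kcal/mol between two ligands correspond to very large differences in relative occupancy of the site.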

Enzyme-substrate interactions from MD simulations
Hydrogen bond occupancies are listed in Tables S1-S7, and van der Waals interaction energies are listed in Tables S8-S10.

Table 1 Kinetic parameters for hydrolysis of CNPG and cellobiose by HjCel3A and HjCel3B
Experiments were conducted at 37°C and pH 5.7 for CNPG and pH 5.0 and 5.7 for cellobiose.

The active site residues of HjCel3A and HjCel3B form several hydrogen bonds with the glucose substrate in the −1 site (Table S1). In HjCel3A, the residues involved are Asp-61, Lys-158, His-159, and the nucleophile Asp-236 (Fig. 6A). There is also a hydrogen bond with Arg-67, which was not observed in the crystal structure (Protein Data Bank (PDB) code 3ZYZ (7)). Glucose interacts with the corresponding residues in HjCel3B with the exception of the arginine residue; however, an additional interaction with Arg-163 is formed (Fig. 6B). Among the hydrophobic residues, only the tryptophan adjacent to the nucleophile has significant van der Waals interaction with glucose (−5.0 kcal/mol with Trp-237 in HjCel3A and −3.5 kcal/mol with Trp-288 in HjCel3B). There are also relatively weak interactions with Leu-110 and Met-201 in HjCel3A (Table S8). Glucose is oriented almost perpendicular to Trp-288 in HjCel3B, and its O6 atom also briefly formed a hydrogen bond with the indole side chain (Table S1). The nonreducing end of cellobiose forms hydrogen bonds with many of the same residues as glucose in HjCel3A (HjCel3B) (Table S2). The reducing end, however, interacts with the side chains of Arg-67 and Asp-236, and the backbone oxygen atom of Trp-237 in HjCel3A (Fig. 6C). The glycosidic oxygen also formed a hydrogen bond with Glu-441, albeit a short-lived one (Table S2). In HjCel3B, the reducing end only forms hydrogen bonds with Arg-105 and Arg-207 (Fig. 6D). The van der Waals interaction of cellobiose with Trp-288 in HjCel3B (−6.9 kcal/mol) is slightly stronger compared with that of cellobiose with Trp-237 in HjCel3A (−4.9 kcal/mol) (Table S8). This substrate additionally interacts with Tyr-443 (Tyr-518) in HjCel3A (HjCel3B) with an energy of about −3 kcal/mol. The van der Waals interactions with Trp-37 (Trp-75) and Phe-260 (Phe-312) are negligible.
In HjCel3A- and HjCel3B-GEI, hydrogen bonding with the acid/base catalyst (Glu-441 and Glu-516) positions glucose in the +1 site slightly above the covalently bound glucose instead of next to it where the reducing end of cellobiose would be (Fig. 7). In addition to Glu-441, hydrogen bonds are also formed with Arg-67, Asp-370, and, briefly, Arg-169 and Tyr-204 in HjCel3A (Fig. 7A and Table S3). The van der Waals interactions with +1 site aromatic residues are negligible except with Tyr-443 (−3 kcal/mol; Table S9). Glucose has fewer hydrogen bond interactions in HjCel3B-GEI; the only residues forming hydrogen bonds with greater than 10% occupancy are Glu-516, Arg-207, and Tyr-255 (Fig. 7B and Table S3). Unlike in HjCel3A, glucose is located almost centrally between Trp-288 and Tyr-518, resulting in comparable van der Waals interaction energies (−3 kcal/mol; Table S9).
Only the results of the stable binding orientation of cellobiose (Model IV) in the substrate pocket of the GEI will be discussed here in detail. Results for Model V simulations are included in Tables S6, S7, and S10 and Fig. S2. Cellobiose is initially above the covalently bound glucose (similar to glucose in Model III) but becomes very mobile during the simulations (Fig. S3). The hydrogen bond between cellobiose and Glu-441 of HjCel3A (Fig. 8A) lasts for only ~55 ns as Glu-441 eventually forms hydrogen bonds with Arg-169, Tyr-443, and Ser-384 instead of with cellobiose. Consequently, the substrate moves farther from the covalently bound glucose and forms interactions with Glu-170 and Asn-238 (Fig. 8B and Table S4). The strongest van der Waals interactions are with Tyr-443 and Phe-260 (3-4 kcal/mol) (Table S9). Interaction with Trp-37 at the +1 site is rather short-lived, whereas that with Tyr-68 at the putative +2 site is negligible. In HjCel3B-GEI, cellobiose initially forms hydrogen bonds with the side chains of Glu-516, Asp-287, Arg-207, and Glu-211 and the backbone oxygen atom of Trp-288 (Fig. 8C and Table S5). Hydrogen bonds with Gln-208 and Gln-289 in the putative +2 site have occupancies less than 10%. The substrate flips, and the reducing end eventually forms hydrogen bonds with Glu-516, Asp-443, and Ser-457 after ~130 ns (Fig. 8D). Only Tyr-518 forms significant van der Waals interaction with cellobiose (−3 kcal/mol; Table S9).
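Hydrogen bond occupancy, as reported in the supplemental tables, is the fraction of trajectory frames in which a given donor-acceptor pair satisfies a geometric criterion. A minimal sketch follows; the cutoffs (donor-acceptor distance of at most 3.5 Å and donor-hydrogen-acceptor angle of at least 150°) are common conventions assumed here, not necessarily the criteria used by the authors, and the function and variable names are hypothetical.

```python
def hbond_occupancy(distances, angles, max_distance=3.5, min_angle=150.0):
    """Percentage of frames in which a hydrogen bond is present.

    distances: donor-acceptor distance per frame (angstrom)
    angles: donor-hydrogen-acceptor angle per frame (degrees)
    """
    if len(distances) != len(angles):
        raise ValueError("distances and angles must cover the same frames")
    if not distances:
        raise ValueError("at least one frame is required")
    bonded = sum(1 for d, a in zip(distances, angles)
                 if d <= max_distance and a >= min_angle)
    return 100.0 * bonded / len(distances)


# Five hypothetical frames: the bond is present in frames 1, 2, and 5.
frame_distances = [2.8, 3.1, 3.9, 3.0, 3.4]
frame_angles = [165.0, 158.0, 170.0, 120.0, 151.0]
occupancy = hbond_occupancy(frame_distances, frame_angles)  # 60.0
```

Under this definition, the 10% threshold mentioned above simply separates interactions present in more than one frame in ten from transient contacts.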

Discussion
Optimizing the hydrolytic activity of β-glucosidases is critical to enhancing enzymatic biomass conversion efficiency. At high substrate loadings, product inhibition and transglycosylation tend to be the main limitations to efficient catalytic turnover and can be linked to the affinity of cellobiose and glucose for the product site(s). Thus, kinetic data of industrially important GH3 β-glucosidases from H. jecorina, HjCel3A and HjCel3B, were interpreted with the aid of binding free energy calculations and MD simulations. Decomposition of the binding free energy provides insight into the nature of the interaction governing substrate affinity. The residues involved can then be identified from the MD simulations and targeted for mutation to reduce inhibition and transglycosylation.
HjCel3A is susceptible to substrate inhibition and/or transglycosylation as evidenced by the decrease in initial velocity at both high CNPG and cellobiose concentrations; in contrast, the activity of HjCel3B did not decay within the same concentration range. Substrate inhibition arises from binding to a nonproductive position (i.e. +1 and +2 sites) instead of across the −1 and +1 sites. Affinity at the +1 site can be estimated from the binding energy difference between HjCel3A Models I and II (~13 kcal/mol). The existence of a +2 site is implied by the reported activity of HjCel3A toward cellotriose and cellotetraose (7). MD simulations of Models IV and V suggested that this site is the pocket formed by Glu-170 and Asn-238 (instead of the predicted pocket formed by Tyr-68 and Phe-260), and its affinity is relatively low (~−3 kcal/mol). Binding of cellobiose across the −1 and +1 sites, with a free energy of −21 kcal/mol, would, therefore, be more favorable, rendering substrate inhibition less likely.
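A decline in initial velocity at high substrate concentration, such as that seen for HjCel3A, is often described phenomenologically with the classical substrate-inhibition rate law v = Vmax[S]/(Km + [S] + [S]²/Ksi). The sketch below uses this generic textbook form for illustration; it is not the model fitted by the authors and does not by itself distinguish nonproductive binding from transglycosylation. The parameter Ksi and all numerical values are hypothetical.

```python
import math


def substrate_inhibition_rate(s, vmax, km, ksi):
    """Initial velocity with inhibition by a second substrate molecule.

    v = vmax * s / (km + s + s**2 / ksi), where ksi is the dissociation
    constant of the nonproductive (inhibitory) substrate complex.
    """
    return vmax * s / (km + s + s * s / ksi)


def optimal_substrate(km, ksi):
    """Substrate concentration giving the highest velocity: sqrt(km * ksi)."""
    return math.sqrt(km * ksi)


# Hypothetical parameters in mM and arbitrary velocity units.
VMAX, KM, KSI = 1.0, 0.1, 10.0
s_peak = optimal_substrate(KM, KSI)
velocities = {s: substrate_inhibition_rate(s, VMAX, KM, KSI)
              for s in (0.1, 1.0, 10.0)}
```

The velocity rises hyperbolically below the peak concentration and falls above it, which reproduces the qualitative shape of a progress curve that decays at high substrate loading.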
Moreover, a previous study on various GH3 β-glucosidases demonstrated through quantitative determination of products that reduced glucose production at high substrate concentrations can be attributed to transglycosylation (22). Transglycosylation competes with hydrolysis and has a lower kcat (18). Cellobiose, either from solution or the original cello-oligomer substrate, binds to the product site following formation of the GEI to yield longer glucosides of various β-linkages. In the HjCel3A Model IV simulation, the O4 atom of the nonreducing end of cellobiose formed a hydrogen bond with Glu-441, positioning it for formation of the experimentally observed cellotriose product. In comparison, the main trisaccharide product obtained for the A. niger β-glucosidase was 6-O-glucosyl-cellobiose (18,22).
Free energy calculations for HjCel3A Model IV showed that binding is favorable mainly due to dispersion effects. The nonreducing end of cellobiose interacted with Tyr-443, whereas the reducing end interacted with Phe-260. Although cellobiose was still hydrogen-bonded to Glu-441 in the early stages of simulation, there was also hydrophobic interaction between the reducing end and Trp-37. A study on the A. niger β-glucosidase suggested that this residue might also be critical to transglycosylation activity. Mutation of the corresponding residue, Trp-49, to Gly, Ala, Asp, or Asn resulted in lower efficiency of the transglycosylation pathway due to higher Km. However, the authors suggested that Trp-49 only has an indirect conformational effect on cellobiose binding on the basis of the similarity of the Km values regardless of the nature of the substituted residue (17).
Mutation of Trp-262 (adjacent to the nucleophile Asp-261) of this A. niger β-glucosidase led to preferential transglycosidic activity, which was attributed to a lower hydrolysis rate constant and decreased affinity at the −1 site (whereas product sites were unaffected) (19). The latter was inferred by comparing Ki values for inhibitors of increasing length (glucose, cellobiose, gentiobiose, and cellotriose). This hypothesis is consistent with MD simulations of HjCel3A Models I and II showing that the corresponding residue, Trp-237, forms a strong van der Waals interaction with either glucose or the nonreducing end of cellobiose at the −1 site. In the case of cellobiose, the O2 atom of the reducing end even formed a hydrogen bond with the backbone oxygen atom of Trp-237 toward the end of the simulation (from ~175 ns).
Accumulation of glucose from hydrolysis can also cause inhibition as glucose competes with the substrate for binding to the −1 or +1 site. The pH dependence of Ki indicated that glucose preferentially binds at the −1 site where it forms hydrogen bonds with charged residues. Comparison of Km(CNPG) and Ki(glucose) values obtained at pH 5.7 using CNPG as substrate suggested that HjCel3A is more glucose-tolerant than HjCel3B. However, it is well known that Ki values depend on the substrate used (23,29). Although product inhibition studies with cellobiose were not performed, the relative affinities for cellobiose and glucose can be evaluated from the binding free energy calculations. Glucose bound at the −1 site of HjCel3A (as observed in the crystal structure) had a much more positive binding energy than cellobiose. The two substrates had comparable binding energies in HjCel3B; thus, glucose can compete with cellobiose for the binding site. The more positive binding energy of cellobiose in HjCel3B compared with that in HjCel3A may be attributed to the weaker affinity at the +1 site where electrostatic effects do not make a significant contribution unlike the case in HjCel3A. Thus, MD simulations provide further support that HjCel3B is more product-inhibited than HjCel3A.
The significant formation of disaccharide products suggested that transglycosylation, to some extent, relieves product inhibition in HjCel3A. Model III represents the case where glucose is retained after the glycosylation step. The binding free energy was essentially zero despite the strong affinity at the HjCel3A +1 site; this reduced affinity may be due to glucose being positioned for nucleophilic attack, no longer exactly in the +1 product site but slightly above the covalently bound glucose. This indicates that glucose is easily released following glycosidic bond cleavage, making it more likely that the acceptor for the transglycosylation reaction is another molecule that reentered from solution. Because protein-substrate interactions at or near the +1 site are primarily hydrophobic, glucose can freely adopt different orientations, which may have more favorable binding free energies than Models III-V. This would account for the formation of various β-linked disaccharides. During the simulation of the GEI-glucose complex of HjCel3A, Glu-441 eventually formed hydrogen bonds with the O2 and O3 atoms (instead of O4 as in the initial simulation coordinates), which would be consistent with formation of sophorose and laminaribiose, respectively. However, the major disaccharide product observed from previous studies of other GH3 β-glucosidases was gentiobiose. This was attributed to the higher reactivity of O6 compared with other hydroxyl groups (18,20). Although HjCel3A is more prone to transglycosylation than HjCel3B, the calculated binding free energies indicated that the GEIs of both β-glucosidases have similar affinities at the product site(s). Water availability within the first and second solvation shells of the GEI is also comparable (Table S11). The more accessible and larger substrate pocket of HjCel3A may account for the significant occurrence of transglycosylation as glucose or cellobiose from solution can easily enter and bind at the product site(s). Another explanation, suggested by the study on the A. niger β-glucosidase, is that transglycosylation activity depends on not only the product-site binding affinity but also the hydrolysis rate (17-19). Unlike the transglycosylation rate, the hydrolysis rate was reported to vary with pH (18).
A constant-pH MD study of HjCel3A-GEI (with or without glucose in the +1 site) indicated that Glu-441 has a very low pKa (~2) due to hydrogen bond networks with Arg-125 and Arg-169. The low basicity of Glu-441 would reduce its ability to deprotonate water, which has a higher deprotonation enthalpy than sugars (30). In contrast, HjCel3B homology modeling showed negatively charged residues (Glu-211, Glu-454, and Glu-525) situated within 10 Å of Glu-516 (Fig. S4). MD simulations of HjCel3B-GEI without cellobiose or glucose in the product site(s), moreover, showed that Glu-516 only forms relatively short-lived hydrogen bonds with neighboring residues (Tables S12 and S13 and Fig. S5). These conditions favor an elevated pKa (31,32), which would account for the predominance of hydrolysis over transglycosylation in HjCel3B.
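The link between the pKa of the acid/base catalyst and its protonation state at the assay pH follows from the Henderson-Hasselbalch relation. The sketch below computes the protonated fraction of a single titratable group; treating the catalytic glutamate as an isolated, non-interacting site is a simplifying assumption, and the function name is hypothetical.

```python
def protonated_fraction(ph, pka):
    """Fraction of a monoprotic acid in the protonated form.

    Henderson-Hasselbalch: fraction = 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# With a pKa of about 2 (the value reported above for Glu-441 in HjCel3A-GEI)
# and pH 5.0, roughly 0.1% of the glutamate is protonated.
fraction_low_pka = protonated_fraction(5.0, 2.0)

# A hypothetical elevated pKa of 5.0 gives an even split at the same pH.
fraction_raised_pka = protonated_fraction(5.0, 5.0)
```

A lower pKa also means that the carboxylate is a weaker base, which is the property the text connects to a reduced ability to activate water.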
Based solely on the specificity constant kcat/Km, HjCel3B is a better β-glucosidase than HjCel3A. However, a survey of various family 1 and 3 β-glucosidases showed that those with high kcat/Km (>10⁵ M⁻¹ s⁻¹) tend to have almost equal affinity for cellobiose and glucose (13) as demonstrated to be the case for HjCel3B by free energy calculations. This may prove to be detrimental to the efficiency of cellulase mixtures as cellobiose will eventually accumulate and inhibit endoglucanases and cellobiohydrolases. From this perspective, the more glucose-tolerant HjCel3A remains industrially preferable despite the lower kcat/Km. Transglycosylation is a major limiting factor for HjCel3A, but this can be mitigated by point mutations of residues responsible for cellobiose and glucose affinity at the product site, specifically Trp-37, Phe-260, and Tyr-443. This is another approach in addition to elevating the pKa of Glu-441 as we proposed in the recent constant-pH MD study (30). Finally, HjCel3A's sequence similarity with xylosidic GH3s suggests that xylosidase activity can be conferred with relatively small modifications to the active site; this is advantageous in potentially reducing the number of enzyme components required in cellulase mixtures. Protein engineering of HjCel3A variants with reduced transglycosidic activity and broad substrate specificity, therefore, presents a preferential path forward for developing highly efficient industrial biomass conversion mixtures.

Enzyme purification
HjCel3A was further purified by affinity chromatography using a p-aminobenzyl-thio-β-glucopyranoside (coupled to Sepharose) affinity column as described previously (35). The column was equilibrated and washed with 100 mM acetate buffer (pH 5.0) containing 200 mM NaCl. The bound fraction was eluted with 100 mM glucose in 100 mM acetate buffer (pH 5.0). HjCel3B did not bind to the column and was used as received. Protein concentrations were determined according to the Bradford assay (36) using BSA as standard.

Kinetics studies
Kinetic studies of HjCel3B with CNPG and CNPX as substrates were carried out in 100 mM phosphate buffer (pH 5.7) in a microtiter plate. The substrate (190 µl) was added to 10 µl of enzyme at an appropriate concentration, and the release of 2-chloro-4-nitrophenol (CNP) was followed continuously for 10 min at 37°C by spectrophotometry at 405 nm on an EL808 microplate reader (Bio-Tek Instruments, Inc.). The initial velocity ([CNP] (µM/min)) was calculated using a standard curve of CNP in the range of 0-200 µM. The kinetic parameters were calculated by fitting the data to the Michaelis-Menten equation using the software KaleidaGraph™ 3.0. The procedure for HjCel3A kinetics has been described previously by Karkehabadi et al. (7). For HjCel3A with CNPG as substrate, the kinetic parameters were calculated at low substrate concentrations (up to 800 µM) because of the inhibition and subsequent decrease of activity observed at higher substrate concentrations. In the kinetic experiments of the β-glucosidases, the substrate was incubated with the enzyme at 37°C at the optimum pH. At timed intervals (2 min), an aliquot was withdrawn and heated to 100°C for 5 min to stop the reaction. The released glucose was determined using the glucose oxidase-peroxidase assay.
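The conversion from absorbance readings to an initial velocity can be sketched as follows: a CNP standard curve gives the absorbance per unit concentration, and a least-squares line through the early, linear part of the progress curve gives the rate. The numbers, the sampling interval, and the function names are assumptions for illustration; they are not the authors' data or software.

```python
def slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx


def initial_velocity(times_min, absorbances, absorbance_per_um):
    """Initial velocity in uM/min from A405 readings over the linear phase."""
    return slope(times_min, absorbances) / absorbance_per_um


# Hypothetical standard curve: CNP concentration (uM) against A405.
standard_conc = [0.0, 50.0, 100.0, 200.0]
standard_abs = [0.0, 0.25, 0.5, 1.0]
absorbance_per_um = slope(standard_conc, standard_abs)  # about 0.005 per uM

# Hypothetical progress curve (time in min, A405).
times = [0.0, 2.0, 4.0, 6.0]
readings = [0.0, 0.05, 0.1, 0.15]
v0 = initial_velocity(times, readings, absorbance_per_um)  # about 5 uM/min
```

Restricting the fit to the linear phase matters because product accumulation and substrate depletion both flatten the later part of the curve.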

Identification of transglycosylation products
Experiments were performed using 100 and 50 mM cellobiose for HjCel3A and HjCel3B, respectively. The reaction was incubated at 37°C and optimum pH for 19 h for HjCel3A and 24 h for HjCel3B. The reaction mixture was loaded to a Carbograph column (1000 mg/15 ml; Alltech), which was subsequently washed with water to remove excess glucose. The transglycosylation products with cellobiose were eluted with 30% (v/v) acetonitrile (37). The eluent was removed on a Rotavapor, and the residue was redissolved in water before analysis by high-performance anion-exchange chromatography with pulsed amperometric detection (ICS-3000, Dionex, Sunnyvale, CA).

Enzyme-substrate complex models
The crystal structures of HjCel3A with and without a glucose molecule bound in the −1 substrate site (PDB codes 3ZYZ and 3ZZ1, respectively) were reported previously by our group (7). The structure of HjCel3B (UniProt ID Q7Z9M5) was modeled based on the crystal structure of A. aculeatus β-glucosidase 1 (AaBGL1; PDB code 4IIB; 58.6% sequence identity) (38) using the automated SWISS-MODEL homology modeling server (39,40). Residues 1-28 were not included in the final model owing to the unavailability of a structural template. Side chain conformations of residues in the active site region were manually refined using Coot (41) by comparison with HjCel3A, Cel3A from Rasamsonia emersonii (PDB code 5JU6 (42)), and AaBGL1 (38). Residues 641-655 of HjCel3B, constituting an inserted loop region when compared with the latter two enzymes, had an implausible conformation in the generated homology model. This 15-residue peptide was manually adjusted using Coot. First, an α-helix at the N-terminal end of the loop was extended such that the length agreed with both R. emersonii Cel3A and AaBGL1. Second, the 10-residue insertion was adjusted such that a relaxed loop conformation was adopted, stretching from the end of the helix to residue 656; each side chain in the manually adjusted loop region was initially placed in the most commonly occurring rotamer. The CASTp server was used to identify the substrate pocket of HjCel3B as well as HjCel3A and calculate the solvent-excluded volume (43).
Five simulation models each were constructed for HjCel3A and HjCel3B (Fig. 5). HjCel3B was modeled as a monomer to reduce computational resource requirements. This is not expected to affect the results as the active site is far from the homodimeric interface (~40 Å) (Fig. 2A). In Models I and II, the enzyme is in the reactive state during the glycosylation step wherein the acid/base catalyst Glu is protonated (Fig. 1). Glucose is bound in the −1 site (Model I), and cellobiose is bound across the −1 and +1 sites (Model II). Models III-V were used to study the binding of glucose or cellobiose as an acceptor for the transglycosylation reaction. The enzyme is in the GEI state with a glucose molecule covalently bound to the nucleophile Asp (Fig. 1). Model III represents the case in which either glucose is retained in the +1 product site after cleavage of the cellobiose glycosidic bond or glucose has reentered the active site from solution. Models IV and V represent the scenario in which either cellobiose, under high substrate conditions, enters the enzyme product site or the enzyme has acted upon a cellotriose molecule; two different orientations of cellobiose are considered, both with the nonreducing end positioned in the +1 site.

MD simulations
System preparation and equilibration are described in detail in the supporting information. Protein and substrates were modeled using the CHARMM36 force field (44-49), and the solvent was modeled using TIP3P (50,51). Production MD simulations were performed using NAMD (52) in the NVT ensemble at 300 K with a 2-fs time step. The simulation time varied between 110 and 250 ns as, in some cases, the substrate diffused from the relatively shallow binding site pockets. The Langevin thermostat was used for temperature control (53). The particle mesh Ewald method (54) was used to treat long-range electrostatic interactions. All other simulation parameters are the same as those used in equilibration. Hydrogen bond occupancies, with a distance cutoff of 3.0 Å and angle cutoff of 135° (55), were calculated from the trajectories using the cpptraj module of AMBER (56). Pairwise interaction energies were calculated using NAMD.
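The geometric hydrogen bond criterion can be made concrete with a short sketch. It is not the cpptraj implementation. It assumes that the 3.0 Å cutoff applies to the donor-acceptor heavy-atom distance and that the 135° cutoff applies to the donor-hydrogen-acceptor angle measured at the hydrogen. The function names and the coordinates in the usage note are hypothetical.

```python
import math


def _sub(a, b):
    """Component-wise difference of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def _norm(v):
    """Euclidean length of a 3D vector."""
    return math.sqrt(sum(x * x for x in v))


def angle_deg(donor, hydrogen, acceptor):
    """Donor-H...acceptor angle at the hydrogen, in degrees."""
    v1 = _sub(donor, hydrogen)
    v2 = _sub(acceptor, hydrogen)
    cos_t = sum(a * b for a, b in zip(v1, v2)) / (_norm(v1) * _norm(v2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def is_hbond(donor, hydrogen, acceptor, dist_cut=3.0, angle_cut=135.0):
    """True if the heavy-atom distance and the angle both satisfy the cutoffs."""
    return (_norm(_sub(acceptor, donor)) <= dist_cut
            and angle_deg(donor, hydrogen, acceptor) >= angle_cut)


def occupancy(frames, dist_cut=3.0, angle_cut=135.0):
    """Fraction of frames in which the hydrogen bond is present.

    frames: iterable of (donor, hydrogen, acceptor) coordinate triples in Å.
    """
    frames = list(frames)
    hits = sum(is_hbond(d, h, a, dist_cut, angle_cut) for d, h, a in frames)
    return hits / len(frames)
```

For a given donor-acceptor pair, `occupancy(frames)` takes one `(donor, hydrogen, acceptor)` triple per trajectory frame and returns a value between 0 and 1.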

FEP/λ-REMD calculations
Free energy calculations were performed by dividing the binding process according to a thermodynamic cycle wherein 1) the bound ligand (shown in blue in Fig. 5) is decoupled from the enzyme and 2) the solvated ligand is decoupled from bulk solution. The binding free energy, $\Delta G_b^0$, for the solvated enzyme-substrate complex is the difference between the free energies associated with 1) and 2). Insertion of the ligand into the binding pocket was done in three stages using the thermodynamic coupling parameters $\lambda_{rep}$, $\lambda_{disp}$, and $\lambda_{elec}$, giving the repulsive ($\Delta G_{rep}$), dispersive ($\Delta G_{disp}$), and electrostatic ($\Delta G_{elec}$) contributions, respectively. A fourth parameter, $\lambda_{rstr}$, controls the translational and orientational restraints on the ligand and gives $\Delta G_{rstr}$. Insertion of the ligand into the bulk was calculated similarly but without the restraint (57).
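The cycle can be summarized in equation form as a sketch. The superscripts "site" and "bulk" are labels introduced here for illustration. The signs assume that every term is a free energy of insertion (coupling) of the ligand, and that any standard-state correction is contained in the restraint term; neither assumption is stated explicitly in the text.

```latex
\begin{align}
\Delta G^{\mathrm{site}} &= \Delta G_{rep}^{\mathrm{site}} + \Delta G_{disp}^{\mathrm{site}}
                          + \Delta G_{elec}^{\mathrm{site}} + \Delta G_{rstr} \\
\Delta G^{\mathrm{bulk}} &= \Delta G_{rep}^{\mathrm{bulk}} + \Delta G_{disp}^{\mathrm{bulk}}
                          + \Delta G_{elec}^{\mathrm{bulk}} \\
\Delta G_b^0 &= \Delta G^{\mathrm{site}} - \Delta G^{\mathrm{bulk}}
\end{align}
```

Under this convention, a more negative $\Delta G_b^0$ corresponds to tighter binding, and each component difference (for example $\Delta G_{elec}^{\mathrm{site}} - \Delta G_{elec}^{\mathrm{bulk}}$) gives the contribution of that interaction type to binding.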
Initial structures for the FEP/λ-REMD (57, 58) calculations were taken from the 25-ns snapshot of each MD simulation of the enzyme-substrate complexes. At least 20 sequential, 0.1-ns simulations were performed using 128 replicas (72 repulsive, 24 dispersive, and 32 electrostatic) with an exchange frequency of 1/100 steps. Further details of the calculation are as described in a previous publication (59). Repulsive, dispersive, and electrostatic free energy contributions were obtained from the last 1 ns of data using the multistate Bennett acceptance ratio (60), and the standard deviation was subsequently calculated.
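The bookkeeping implied by this protocol is sketched below under stated assumptions. The replica counts and the 0.1-ns run length come from the text. How the standard deviation was computed is not specified, so the sketch assumes one free energy estimate per sequential 0.1-ns run and takes the mean and sample standard deviation over the runs that cover the last 1 ns. The function name and the numerical values in the usage note are hypothetical, and the multistate Bennett acceptance ratio step itself is not reproduced.

```python
from statistics import mean, stdev

# Replica allocation stated in the text; the total is 128.
REPLICAS = {"repulsive": 72, "dispersive": 24, "electrostatic": 32}
RUN_LENGTH_NS = 0.1  # length of each sequential simulation


def summarize_tail(per_run_estimates, analysis_ns=1.0, run_length_ns=RUN_LENGTH_NS):
    """Mean and sample standard deviation over the runs spanning the last analysis_ns.

    per_run_estimates: one free energy estimate (kcal/mol) per sequential run,
    in chronological order. This per-run averaging is an assumption.
    """
    n_runs = round(analysis_ns / run_length_ns)
    if len(per_run_estimates) < n_runs:
        raise ValueError("not enough runs to cover the analysis window")
    tail = per_run_estimates[-n_runs:]
    return mean(tail), stdev(tail)
```

With 20 sequential runs, `summarize_tail(estimates)` discards the first 10 runs (1 ns) and reports statistics over the remaining 10.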