The hydrolysis mechanism of a GH45 cellulase and its potential relation to lytic transglycosylase and expansin function

Family 45 glycoside hydrolases (GH45) are endoglucanases that are integral to cellulolytic secretomes, and their ability to break down cellulose has been successfully exploited in textile and detergent industries. In addition to their industrial relevance, understanding the molecular mechanism of GH45-catalyzed hydrolysis is of fundamental importance because of their structural similarity to cell wall–modifying enzymes such as bacterial lytic transglycosylases (LTs) and expansins present in bacteria, plants, and fungi. Our understanding of the catalytic itinerary of GH45s has been incomplete because a crystal structure with substrate spanning the −1 to +1 subsites is currently lacking. Here we constructed and validated a putative Michaelis complex in silico and used it to elucidate the hydrolytic mechanism in a GH45, Cel45A from the fungus Humicola insolens, via unbiased simulation approaches. These molecular simulations revealed that the solvent-exposed active-site architecture results in lack of coordination for the hydroxymethyl group of the substrate at the −1 subsite. This lack of coordination imparted mobility to the hydroxymethyl group and enabled a crucial hydrogen bond with the catalytic acid during and after the reaction. This suggests the possibility of a nonhydrolytic reaction mechanism when the catalytic base aspartic acid is missing, as is the case in some LTs (murein transglycosylase A) and expansins. We calculated reaction free energies and demonstrate the thermodynamic feasibility of the hydrolytic and nonhydrolytic reaction mechanisms. Our results provide molecular insights into the hydrolysis mechanism in HiCel45A, with possible implications for elucidating the elusive catalytic mechanism in LTs and expansins.

Cellulases are glycoside hydrolase (GH) enzymes that hydrolyze glycosidic bonds in cellulose and have been the mainstay of enzyme technology platforms for the production of biofuels from cellulosic feedstocks (1). Among the various GH families that have been characterized structurally and biochemically (2), family 45 GHs (GH45) have garnered substantial interest for their industrial utility in textile applications, such as denim polishing, and as an integral component of detergents (3,4).
GH45s have been classified into three subfamilies (A, B, and C) based on phylogenetics (1). Davies et al. (5,6) characterized the native and oligosaccharide-bound (cellobiose and cellohexaose) GH45 structures from Humicola insolens (HiCel45A), belonging to subfamily A. Biochemical experiments revealed that HiCel45A hydrolyzes glycosidic bonds in cellulo-oligomers via a one-step inverting mechanism. Point mutations to three candidate aspartic acid residues (D121A, D10A, and D114A) at the putative active site implicated Asp-121 (proton donor) and Asp-10 (catalytic base) as indispensable catalytic residues for hydrolytic activity in HiCel45A (5) (Fig. 1A). Cocrystallization of HiCel45A with cellohexaose yielded a structure with two cellotriose molecules occupying subsites −4 to −2 and +1 to +3. However, no sugar occupancy was observed at the −1 subsite in this structure (Fig. 1A) nor in any published structure of a member of subfamily A (5). Given that the enzyme hydrolyzes the glycosidic bond between the −1 and +1 binding sites, the absence of the substrate at the −1 subsite leaves a gap in our understanding of the GH45 catalytic itinerary. Furthermore, it has been demonstrated that cellulases facilitate hydrolysis by modifying the conformation of the sugar at this site (7-9). Beyond the static information provided by crystal structures, a more detailed understanding of the catalytic mechanism of HiCel45A is still lacking.
GH45s bear structural similarity to other cell wall-modifying enzymes found in plants and bacteria: expansins and lytic transglycosylases (LTs) (1,10,11). Expansins facilitate growth in plant cell walls (10,12), but hydrolytic products of their action have not been detected (13). LTs nonhydrolytically cleave glycosidic linkages in bacterial cell wall peptidoglycans (14), forming an anhydro glycan product (15-17). Expansins as well as one family of LTs (called family 2 in the classification scheme of Blackburn and Clarke (18), classified as GH102 (11)) share a similar double-ψ β-barrel fold with GH45s and are structurally similar at the active site. Although catalytic activity in most cellulases (including GH45s) is mediated by a proton donor and a base (usually an aspartate or glutamate dyad (19)), the catalytic domain in MltA and the putative active site in expansins are characterized by only a single aspartate. As illustrated in Fig. 1B, MltA and expansins share the conserved HFD domain (20) containing the aspartic acid that serves as the proton donor in HiCel45A but display no analog to the base aspartic acid. GH45 subfamily C also features a similar overall fold and conservation of the HFD domain (21) as well as the conspicuous absence of the base aspartic acid at the active site (22).

This work was supported by the Department of Energy, Office of Energy Efficiency and Renewable Energy, Bioenergy Technologies Office via Grant DE-AC36-08GO28308. The authors declare that they have no conflicts of interest with the contents of this article. This article contains Figs. S1-S6, Tables S1-S3, Sections S1-S4, and Videos S1 and S2. To whom correspondence may be addressed: gregg.beckham@nrel.gov and michael.crowley@nrel.gov.
Given the catalytic machinery of HiCel45A for conventional hydrolysis, its well-characterized biochemistry, and its active-site similarity (conserved HFD domain) with GH45 subfamily C, we posit that mechanistic details of GH45A hydrolysis will lead us to insights into plausible modes of catalysis where the base aspartate is mutated, which could also be widely applicable to other cell wall-modifying proteins with a similar active-site architecture.
Although crystal structures provide static pictures of long-lived states, molecular simulations provide the spatiotemporal resolution necessary for elucidating atomistic and dynamic details of reaction mechanisms. Many simulation techniques applied to chemical reactions require a priori specification of the reaction coordinate (RC), which quantifies dynamic progress from reactant to product. In contrast, transition path sampling (TPS) produces an unbiased ensemble of trajectories (23), which can be analyzed using likelihood maximization (LM) so that the optimal RC is an output of the simulations. TPS has been instrumental in gaining valuable insight into hydrolytic mechanisms in cellulases (24,25). Knott et al. (24) elucidated the two-step retaining mechanism in Trichoderma reesei Cel7A. More recently, Mayes et al. (25) applied TPS to reveal the single-step inverting mechanism in Cel6A from the same fungus. These studies enabled key insights into the RC, free energy barriers for the reactions, sugar puckering itineraries, and rate-limiting steps in these important cellobiohydrolases.
In this study, we employ TPS to elucidate the hydrolysis mechanism in the inverting endoglucanase HiCel45A. We begin by constructing and validating a Michaelis complex (MC) in silico. Subsequent quantum mechanics/molecular mechanics (QM/MM) TPS simulations result in determination of the RC for hydrolysis by WT HiCel45A. In addition, insights gained from the WT hydrolysis mechanism form the basis for an LT-like nonhydrolytic mechanism for the HiCel45A D10N mutant, where the catalytic base aspartic acid is mutated to asparagine. Computed free energy barriers demonstrate the feasibility of the nonhydrolytic mechanism, which could serve as a critical segue to understanding mechanistic action in LTs and expansins.

In silico construction of the MC
The MC in HiCel45A has not yet been structurally characterized. To construct a model MC for HiCel45A, we began with the crystal structure featuring two cellotriose molecules occupying subsites −4 to −2 and +1 to +3 (PDB code 4ENG) and connected these via the sugar occupying the −1 site from another inverting GH, Thermobifida fusca Cel6A (TfCel6A, PDB code 4AVO) (26). The sugar conformation at the −1 site in the substrate binding groove is critical for catalysis, as GHs generally stabilize the ring pucker states away from the chair conformations that are stable in solution (8,24,25). Inverting cellulases utilize a single-step nucleophilic substitution involving an oxocarbenium ion at the anomeric carbon (25,27,28). The important criteria facilitating the hydrolysis mechanism include the puckered state of the substrate ring (generally ²S_O for inverting cellulases (28)) at the −1 subsite, the presence of a water molecule between the anomeric carbon and the base (Asp-10), and the orientation of the acid (Asp-121) proton toward the glycosidic oxygen at the −1 site. Unbiased molecular dynamics (MD) simulations were analyzed for simultaneous satisfaction of all three criteria, starting from the constructed HiCel45A MC, and each frame was categorized as catalytically competent (all above criteria met) or not competent for catalysis (any one or more criteria not met). Of the five independent MD runs, one simulation demonstrated catalytically competent conformations 54.5% of the time. In that run, the catalytically competent conformations were stable over tens to hundreds of picoseconds, timescales beyond what would be required for a single hydrolytic event (Fig. S3). This result provides sufficient confidence in the stability of the putative MC for more detailed studies of the hydrolysis reaction mechanism.

Humicola insolens GH45 catalytic mechanism

Further analysis of the equilibrium MD simulations reveals that the hydroxymethyl group of the substrate ring at the −1 position is not coordinated by specific direct interactions with the protein, although a hydrogen bond with an active site water is observed (~14% occupancy over 100 ns of simulation). This differs from TfCel6A, where the oxygen atom on the hydroxymethyl group of the substrate ring bound at the −1 subsite coordinates closely with an active-site tyrosine to maintain its puckered conformation in the MC (29). This has implications for the reaction mechanism in the absence of the catalytic base, as described in subsequent sections. The important enzyme-substrate interactions observed from MD simulations involve Asp-178, Asp-10, Asn-179, Glu-48, and Trp-18 (Fig. 2 and Fig. S4).
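The frame-by-frame classification described above can be sketched as follows. This is a minimal illustration, not the analysis script used in this work: the `Frame` fields, the function names, and every numerical cutoff are hypothetical placeholders (the actual definitions are those of Fig. S3), and a full pucker test would also check the Cremer-Pople azimuthal angle rather than the polar angle alone.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Frame:
    """Per-frame observables (hypothetical names; distances in angstroms)."""
    theta_deg: float          # Cremer-Pople polar angle of the -1 ring
    water_o_to_c1: float      # water oxygen to anomeric carbon
    water_h_to_base_o: float  # water proton to an Asp-10 carboxylate oxygen
    acid_h_to_glyc_o: float   # Asp-121 proton to glycosidic oxygen


def is_competent(
    f: Frame,
    theta_window: Tuple[float, float] = (60.0, 120.0),  # assumed cutoff
    max_water_c1: float = 4.0,                           # assumed cutoff
    max_water_base: float = 2.5,                         # assumed cutoff
    max_acid_h: float = 2.5,                             # assumed cutoff
) -> bool:
    """A frame is competent only if all three criteria hold simultaneously."""
    puckered = theta_window[0] <= f.theta_deg <= theta_window[1]
    water_poised = (f.water_o_to_c1 <= max_water_c1
                    and f.water_h_to_base_o <= max_water_base)
    acid_oriented = f.acid_h_to_glyc_o <= max_acid_h
    return puckered and water_poised and acid_oriented


def competent_fraction(frames: Iterable[Frame]) -> float:
    """Fraction of frames that satisfy every criterion."""
    frames = list(frames)
    if not frames:
        return 0.0
    return sum(is_competent(f) for f in frames) / len(frames)
```

Requiring all three conditions in a single frame, rather than averaging each observable separately, mirrors the "simultaneous satisfaction" criterion in the text.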

Hydrolysis mechanism in HiCel45A
The catalytically competent configurations from the simulations described above enabled investigation of the HiCel45A hydrolysis reaction mechanism using QM/MM MD simulations. Fig. 3A depicts snapshots of the reactant, transition state, and product configurations of the system, accompanied by chemical structures depicting the bonds being broken and formed. Note the strained conformation of the substrate ring bound at the −1 subsite in the reactant and transition state conformations and the development of a strong interaction between the hydrogen atom on O6 and the deprotonating oxygen on Asp-121. Movie S1 depicts one of these trajectories.
The ensemble of reactive and unreactive trajectories was systematically analyzed to determine the optimal RC. LM analysis determined the optimal RC to include two differences in distances: the difference between the length of the forming and breaking bonds involving the acidic proton and the difference between the length of the forming and breaking bonds involving the anomeric carbon (Fig. 3A and Section S2.2). This RC was validated using the p_B histogram test (30) (Fig. S2).
In GH enzymes, ring puckering at the active site is an important feature of the catalytic itinerary. Although the 1.5-ps trajectories are long enough to capture bond-breaking phenomena, observations of pucker transitions in the reactant and product basins require longer simulation times. Hence, starting from the ends of a successful reactive trajectory, we also conducted 500-ps QM/MM simulations to evaluate this behavior for the substrate at the −1 subsite pre- and post-reaction (Fig. 3B). The substrate in the reactant state starts near the ²S_O region (Cremer-Pople (31) polar angle θ ~ 90°) and then immediately shifts to the ⁴C₁ chair state (θ ~ 10°), with the hydroxymethyl group and the 2-, 3-, and 4-hydroxyls in equatorial positions. In the product state, the substrate ring at the −1 subsite again starts in the vicinity of the ²S_O region and explores various pucker states (θ ~ 90°) before transitioning to the less common ¹C₄ state (θ ~ 180°) with axial substituents.
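For readers unfamiliar with the Cremer-Pople polar angle quoted above, the following sketch computes the total puckering amplitude Q and the polar angle θ for a six-membered ring using the standard Cremer-Pople construction (31). The function name is ours, and the mapping of θ ~ 0° to ⁴C₁ and θ ~ 180° to ¹C₄ assumes that the ring atoms are supplied in the conventional pyranose order (O5, C1, C2, C3, C4, C5); this is an illustration, not the analysis code used in this work.

```python
import math


def cremer_pople(ring):
    """Return (Q, theta_deg) for six (x, y, z) ring atoms given in ring order."""
    n = len(ring)
    if n != 6:
        raise ValueError("expected six ring atoms")
    # Shift the origin to the geometric center of the ring.
    center = [sum(p[k] for p in ring) / n for k in range(3)]
    rel = [[p[k] - center[k] for k in range(3)] for p in ring]
    # Mean-plane normal from the two Cremer-Pople reference vectors R' and R''.
    r1 = [sum(rel[j][k] * math.sin(2 * math.pi * j / n) for j in range(n))
          for k in range(3)]
    r2 = [sum(rel[j][k] * math.cos(2 * math.pi * j / n) for j in range(n))
          for k in range(3)]
    normal = [r1[1] * r2[2] - r1[2] * r2[1],
              r1[2] * r2[0] - r1[0] * r2[2],
              r1[0] * r2[1] - r1[1] * r2[0]]
    length = math.sqrt(sum(c * c for c in normal))
    # Out-of-plane displacement of each ring atom.
    z = [sum(rel[j][k] * normal[k] for k in range(3)) / length for j in range(n)]
    # Puckering amplitudes q2 (boat/skew character) and q3 (chair character).
    q2_cos = math.sqrt(2 / n) * sum(z[j] * math.cos(4 * math.pi * j / n)
                                    for j in range(n))
    q2_sin = -math.sqrt(2 / n) * sum(z[j] * math.sin(4 * math.pi * j / n)
                                     for j in range(n))
    q2 = math.hypot(q2_cos, q2_sin)
    q3 = math.sqrt(1 / n) * sum((-1) ** j * z[j] for j in range(n))
    return math.hypot(q2, q3), math.degrees(math.atan2(q2, q3))
```

An ideal chair has q2 = 0 and therefore θ of 0° or 180°, whereas boats and skew-boats have q3 = 0 and lie on the equator at θ = 90°, which is why the θ values quoted in the text separate the ⁴C₁, ²S_O, and ¹C₄ states.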
The elucidation of an optimal RC sets the stage for free energy calculations along the inverting hydrolysis mechanism.

During free energy calculations, the sugar pucker in the reactant windows (RC < −1.3) was restrained to ²S_O, by analogy with the pucker state observed in GH6, which exhibits similar catalytic machinery (25,26). Likewise, product windows (RC > 0.7) were restrained to ¹C₄, as we observed this to be the stable conformation in the product basin. The resulting free energy profile revealed a slightly endergonic reaction (3.8 ± 1.9 kcal/mol) with a barrier of 23.55 ± 0.3 kcal/mol.

Figure 3. The order parameters that constitute the best predicted RC, as evaluated with LM, are also shown. B, Cremer-Pople itineraries for the pucker state of the substrate bound at the −1 subsite in unbiased QM/MM simulations of the system in its reactant and product states; the reactant basin starts in the ²S_O pucker and ends in the ⁴C₁ state, whereas the product basin starts in the ²S_O state and ends in the ¹C₄ state. C, the free energy profile for the hydrolysis mechanism, calculated by umbrella sampling along the RC. The standard deviations in ΔG were estimated by bootstrap analysis. The inset depicts the sugar ring at the −1 position puckered in the ²S_O state in the reactant basin windows and the ¹C₄ state in the product basin windows. D, the average distances from the Asp-121 proton to the glycosidic oxygen (blue), from the glycosidic oxygen to the anomeric carbon (green), from the water oxygen to the anomeric carbon (red), and from the water oxygen to its proton (light blue) are plotted as the reaction proceeds from the reactant state (left) to the product state (right). The reaction is initiated by transfer of the proton from Asp-121 and breaking of the glycosidic bond (crossover between green and blue lines), followed by splitting of the water molecule and attack on the anomeric carbon (crossover between the light blue and red lines). The average values and the associated standard deviations (shaded colors) were calculated from 16 reactive trajectories.

Analysis of unbiased reactive trajectories reveals that the reaction occurs in two stages: transfer of the proton from Asp-121, accompanied by breakage of the glycosidic bond, and splitting of the water molecule and the nucleophilic attack on the anomeric carbon, accompanied by transfer of the proton to the base Asp-10 (Fig. 3D). In the reactant basin, the hydroxymethyl group of the −1 sugar is not directly coordinated to any active-site residues. Distance analysis between the hydroxymethyl group of the substrate bound at the −1 subsite and the Asp-121 oxygens from 16 reactive trajectories is presented in Fig. 4. As the reaction proceeds, Asp-121 becomes increasingly electronegative. This result, coupled with the proximity and lack of coordination of the hydroxymethyl group, enables close coordination of the C6 hydroxyl hydrogen with the Asp-121 oxygen, as indicated by the average distances of ~2 Å in the product state. Hydrogen bond analysis for the 500-ps unrestrained QM/MM MD of the product also reveals the presence of a hydrogen bond between the hydroxymethyl proton and Asp-121 oxygen in 58% of the frames.

Proposed catalytic mechanism in the D10N mutant
The active sites of GH45 subfamilies A, B, and C share with MltA and expansins the conserved catalytic acid, but of these, only GH45 subfamilies A and B exhibit the catalytic base at the equivalent position of Asp-10 in HiCel45A. The coordination of the hydroxymethyl proton with the catalytic acid Asp-121 in our simulations of the hydrolysis mechanism in WT HiCel45A suggests the possibility of Asp-121 acting as both acid and base for the reaction in the absence of the WT catalytic base Asp-10. Although experimental characterization of the HiCel45A D10N mutant revealed no hydrolytic activity (5), this does not preclude other catalytic activity in the HiCel45A D10N mutant, similar to MltA or expansins. The role of aspartic/glutamic acid as the sole catalytic residue mediating glycosidic bond lysis is the established mechanism in LTs (1,11,15-17,32,33). Fig. 5A illustrates the mechanism proposed in the HiCel45A D10N mutant, which involves initiation of glycolysis by proton transfer from the acid (Asp-121) to the glycosidic oxygen, followed by transfer of the hydroxymethyl hydroxyl proton to Asp-121 acting as a base. This generates a nucleophilic O6 and an oxocarbenium ion at the anomeric carbon, which react to form a 1,6-anhydro product (Fig. 5).
We sought to evaluate the relative feasibility of this proposed reaction mechanism via an RC that is the difference between two distances: anomeric carbon to glycosidic oxygen and −1 hydroxymethyl oxygen to anomeric carbon (the breaking and forming C-O bonds, respectively). This is the most significant component of the WT RC, consistent with past studies (24,25). The free energy simulations estimate that the proposed reaction mechanism is exergonic by 14.8 ± 0.1 kcal/mol with a free energy barrier of 30.56 ± 0.13 kcal/mol. This free energy barrier is higher than that of the hydrolysis mechanism by ~7 kcal/mol.
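The distance-difference RC described above is simple to evaluate from atomic coordinates. The sketch below is illustrative only: the function name and argument names are ours, and the sign convention (breaking minus forming, so that the reactant is negative and the product positive) is an assumption chosen to match the orientation of the WT RC windows, not a statement of how the modified sander routine defines it.

```python
import math


def lt_reaction_coordinate(c1, o_glycosidic, o6):
    """RC = d(C1, O_glycosidic) - d(O6, C1), in the units of the inputs.

    c1, o_glycosidic, and o6 are (x, y, z) coordinates of the anomeric carbon,
    the glycosidic oxygen, and the -1 hydroxymethyl oxygen, respectively.
    """
    breaking = math.dist(c1, o_glycosidic)  # C-O bond being cleaved
    forming = math.dist(o6, c1)             # C-O bond of the 1,6-anhydro product
    return breaking - forming


# Barrier difference between the two mechanisms, from the values in the text.
barrier_lt = 30.56          # kcal/mol, D10N nonhydrolytic mechanism
barrier_hydrolysis = 23.55  # kcal/mol, WT hydrolysis
barrier_gap = barrier_lt - barrier_hydrolysis  # about 7 kcal/mol
```

With this convention, a reactant-like geometry (short glycosidic bond, distant O6) gives a negative RC and a product-like geometry gives a positive one.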
Configurations near the barrier in Fig. 5C were used to initiate further unbiased trajectories to characterize the sequence of events that constitute the reaction. Analysis of 16 of these reactive trajectories reveals that the reaction is predicted to consist of three main stages: glycolysis initiated by proton transfer from Asp-121; transfer of the hydroxymethyl proton to Asp-121, which now acts as a base; and nucleophilic attack of the hydroxymethyl group onto the oxocarbenium ion at the anomeric carbon (Fig. S6). Movie S2 depicts one of these reactive trajectories.

Discussion
A detailed molecular understanding of the inverting hydrolysis mechanism in HiCel45A has to date been hindered by the lack of an experimentally characterized MC. Here we constructed the MC in silico and elucidated the WT hydrolysis mechanism (including a unique puckering itinerary) via TPS simulations. The insights facilitated exploration of an LT-like mechanism of glycosidic bond cleavage in the absence of the base aspartic acid (i.e. the D10N HiCel45A mutant). The predicted plausibility of this mechanism has potential implications for our understanding of the mode of action in GH45 subfamily C, LTs, and expansins.
Prior TPS studies of enzymatic reactions have focused on retaining and inverting cellobiohydrolases and glycosynthases (24,25,34), elucidating accurate RCs, free energies, puckering itineraries, and rate-limiting steps. This study extends the application of TPS to the inverting mechanism of an endoglucanase. As illustrated in Fig. 3, the key OPs that constitute the RC (OP#7 and OP#9 in Table S1) in HiCel45A-catalyzed hydrolysis involve differences in distances representing cleaved and formed bonds, akin to reaction mechanisms elucidated in TrCel6A and TrCel7A (24,25). When taken together with the present findings regarding HiCel45A, these studies overall suggest a general trend where the major components of optimized RCs in GH-catalyzed cellulose deconstruction are relatively simple, involving distances of breaking and forming bonds, especially C-O bonds.
Computational reaction barriers allow comparison with experimentally observed turnover rates. Assuming a transmission coefficient of 1, an upper estimate for the rate constant (k_cat) can be obtained using transition state (TS) theory (Section S4) (35). In HiCel45A, the experimental turnover k_cat was measured to be 17.7 s⁻¹ for the cellohexaose substrate (5). The computed barrier of 23.55 ± 0.3 kcal/mol corresponds to a rate coefficient of 4.01 × 10⁻⁵ s⁻¹, more than five orders of magnitude below the measured value. This may indicate that some TS-stabilizing interactions are absent in our MC as constructed, although these details await experimental determination of the MC structure.
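The conversion between a free energy barrier and a TS theory rate can be illustrated with the Eyring expression, k = κ(k_B T/h) exp(−ΔG‡/RT). The sketch below is not taken from Section S4; the function names are ours, and the temperature of 300 K is an assumption based on the simulation temperature, so the result should be expected to agree with the value quoted in the text only to within a small factor.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
H = 6.62607015e-34       # Planck constant, J*s
R_KCAL = 1.987204259e-3  # gas constant, kcal/(mol*K)


def eyring_rate(barrier_kcal, temperature=300.0, kappa=1.0):
    """TS theory rate (1/s) for a free energy barrier in kcal/mol."""
    prefactor = kappa * K_B * temperature / H
    return prefactor * math.exp(-barrier_kcal / (R_KCAL * temperature))


def barrier_from_rate(rate, temperature=300.0, kappa=1.0):
    """Inverse of eyring_rate: barrier (kcal/mol) implied by a rate in 1/s."""
    prefactor = kappa * K_B * temperature / H
    return R_KCAL * temperature * math.log(prefactor / rate)


computed_rate = eyring_rate(23.55)      # on the order of 1e-5 per second
implied_barrier = barrier_from_rate(17.7)  # barrier implied by the measured k_cat
```

Inverting the expression for the measured k_cat gives an implied barrier of roughly 16 kcal/mol under these assumptions, which puts the size of the discrepancy with the computed barrier in energetic terms.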
Another difference in the predicted reaction free energy profile is the presence of a much more stable hydrolysis product in the CtGH8, TrCel6A, and TrCel7A mechanisms (24,25,36), where the reaction is exergonic, whereas in this study, the reaction is mildly endergonic by 3.8 ± 1.9 kcal/mol. This difference could be attributed to a number of reasons, including, but not limited to, an imperfect MC or the inherent inaccuracies of the semiempirical force fields used here. The observed endergonicity could be related to the unique puckering itinerary observed in HiCel45A. In both inverting (GH8 (36) and GH6 (25)) and retaining (GH7 (24,37) and GH16 (38)) β-glucosidases, the puckering itinerary for the substrate ring bound at the −1 subsite is restricted to the top hemisphere of the Cremer-Pople coordinates (31). This involves the −1 subsite substrate ring transitioning from the ⁴C₁ chair (in solution) to a strained pucker conformation in the MC and transition state followed by return to the ⁴C₁ chair conformation in the product state (24,25,36-38). However, in this study, the puckering itinerary in HiCel45A involves the reactant binding in the ⁴C₁ chair conformation, the MC puckering to the ²S_O state, and the product transitioning to the ¹C₄ conformation. Although the ¹C₄ conformation has not been observed previously in products of cellulose hydrolysis, it has been characterized experimentally and computationally in the GH47 α-mannosidase (39). In HiCel45A, the ¹C₄ state is stabilized by interactions of the hydroxymethyl group with Asp-121, enabling an intramolecular hydrogen bond between the proton on the C3 hydroxyl group and the hydroxymethyl oxygen (Fig. 6). Another hydrogen bond is observed between the proton on the C2 hydroxyl group and the glycosidic oxygen connecting to the sugar bound at the −2 subsite.

Figure 5. A, proposed catalytic mechanism in the D10N mutant. Shown are snapshots from a reactive QM/MM trajectory, accompanied by chemical structures of the reactant, transition state, and product for the proposed mechanism in the absence of the base aspartic acid, Asp-10. B, the order parameters utilized to compute the free energy profile. C, the free energy profile for the LT mechanism in the D10N mutant. Note that the standard deviations for the ΔG values, estimated using bootstrap analysis, are less than 1 kcal/mol and, hence, are within the thickness of the line depicting the free energy profile.
The ability of HiCel45A to stabilize the ¹C₄ conformation via interactions of Asp-121 with hydroxymethyl groups may be significant, considering the possible similarity of reaction mechanisms in HiCel45A D10N and LTs such as MltA. In general, LTs are essential bacterial proteins that enable cell division, macromolecular transport, and growth by cleaving glycosidic bonds in peptidoglycans without hydrolysis and are characterized by a single catalytic residue (Asp/Glu) at the active site (14). In a recent QM/MM study, Byun et al. (40) described the energetics and puckering itinerary for the LT (MltE) mechanism. It was observed that the lowest-energy pucker conformation in the final 1,6-anhydro product (1,6-anhydroMurNAc) of the LT mechanism is the ¹C₄ chair. Hence, the ability of the HiCel45A active site to stabilize the ¹C₄ conformation further bolsters the hypothesis that it may facilitate formation of the 1,6-anhydro product.
The GH45 subfamily A members HiCel45A and Humicola grisea Cel45A hydrolyze xylan (41) (which differs from cellulose by the absence of the hydroxymethyl exocyclic groups), whereas no xylan activity was observed for a GH45 subfamily C member (PcCel45A) (21). Based on the rapid dissociation of xyloheptaose in MD simulations, Godoy et al. (21) pointed to the importance of the hydroxymethyl groups for substrate binding as a key differentiator between cellulose and xylan activity by PcCel45A. Our simulations on the HiCel45A D10N mutant indicate that the interaction between the hydroxymethyl group of the glucose moiety at the −1 subsite and the catalytic acid is essential for glycolysis in the absence of the catalytic base. These results suggest that, in addition to weakening binding, the absence of hydroxymethyl groups on xylan may preclude an important interaction with the catalytic acid, possibly prohibiting xylan hydrolysis at the GH45 subfamily C active site. Fig. 1B illustrates conservation of the active site domain and the catalytic aspartic acid of MltA and expansins. Although the barrier computed here for 1,6-anhydro product formation in HiCel45A D10N is relatively high, our work still predicts that the proposed mechanism is potentially feasible, as in LTs such as MltA and MltE as well as expansins, with some differences in stabilizing interactions at the active site. For example, the peptidoglycan substrate of LTs exhibits an acetyl group that stabilizes the TS, whereas HiCel45A has no analogous stabilizing interaction. Consider also that Quay et al. (42) recently characterized structures of a stationary phase survival protein in Bacillus subtilis that has significant similarities with LTs (as well as GH45s and expansins) and observed the presence of 1,6-anhydro GlcNAc (N-acetylglucosamine) at the active site when cocrystallized with an N-acetylglucosamine pentasaccharide.
A superposition of this substrate-bound structure with a snapshot from our simulations of the product in the HiCel45A D10N mutant reveals similarities at the active site as well as in the substrate binding modes (Fig. 7).
The effect of expansins on various plant cell wall polysaccharides has been well characterized, with their activity most prevalent in xyloglucan composites and mild extension activity observed in cellulose substrates (43,44). Despite close structural similarities between expansins and GH45s, the general consensus on the expansins' mode of action is that they are nonenzymatic (10,45), based on the lack of observable hydrolytic activity (10,13). The hydrolytic activity of GHs is quantified by assays that require the presence of a reducing end on a sugar (43). The 1,6-anhydro product does not form a reducing end on the sugar; hence, it is plausible that expansins employ an enzymatic mode of action despite the lack of detectable hydrolytic activity. We also note that an interesting "Newton's cradle" hydrogen transfer mechanism has been proposed to be involved in catalysis in GH45 subfamily C, which also lacks the catalytic base of GH45 subfamilies A and B. Combined neutron and X-ray crystallography studies of PcCel45A have revealed amide-imidic acid tautomerization of one strategically positioned asparagine (Asn-92) and a proton transfer chain through the protein from Asn-92 to the catalytic acid Asp-114 (22). However, PcCel45A generates hydrolysis products, i.e. reducing sugars. The lack of observed hydrolytic activity in expansins points to the presence of an alternative catalytic strategy for glycosidic bond cleavage without the formation of hydrolytic products, as we have studied here in HiCel45A D10N. Our findings support the hypothesis that the HiCel45A D10N and expansin active sites, as in LTs such as MltA and MltE, might be capable of cleaving glycosidic linkages in the absence of a catalytic base. Furthermore, Godoy et al. (21) reported recently that a peptide pattern recognition analysis revealed that the asparagine residue involved in the mechanism proposed by Nakamura et al. (22) is not universally conserved among all subfamily C GH45s, suggesting the distinct possibility of an alternative mechanism for catalytic action in the absence of the base aspartic acid.
Although this study predicts the feasibility of 1,6-anhydro glucan formation on cellulosic substrates, albeit with a high free energy barrier, other substrates with functional groups on the C2 and C3 hydroxyl groups may promote transition state stabilization. This could rationalize the fact that expansins show greater activity on xyloglucan substrates, which are decorated with acetyl/methyl functional groups, compared with the undecorated cellulose substrate (44). Hence, the insights gained in this study suggest the need for re-evaluation of potential enzymatic activity in expansins and development of an experimental assay to test the presented hypothesis.

Preparing the MC
The MC is not available from reported structures for the inverting cellulase HiCel45A. Among the other inverting GH families that act on glucosides (e.g. GH8 and GH6), the MC is observed to adopt the ²S_O pucker conformation for the substrate at the −1 subsite (25,28,36). The MC for TfCel6A presents a template for constructing the MC for HiCel45A, as its ligand spans six subsites across the active site, including a skewed sugar ring at the −1 site (PDB code 4AVO) (26). The glucan chain substrate in the TfCel6A structure spans subsites −2 to +4, whereas in the HiCel45A structure (PDB code 4ENG (5)), it is in two pieces, across subsites −4 to −2 and +1 to +3, missing the key −1 sugar. Transferring cellohexaose to HiCel45A based on substrate-based alignment of HiCel45A and TfCel6A at sites −2 to +2 enabled construction of the MC for HiCel45A.

Molecular dynamics simulations
The CHARMM36 force field was used for modeling solvated systems of native and cellohexaose-bound HiCel45A, whereas the TIP3P model was used for water (14,46,47). The seven disulfide bridges observed in the crystal structure (5) were included, and the protonation states for amino acid side chains were determined using the H++ web server for a pH of 5, the pH at which HiCel45A has been assayed (1,41,48). The crystal structure contains a mutation of the catalytic base Asp-10 to asparagine; this was converted back to the native aspartic acid, and the active-site proton donor Asp-121 (catalytic acid) was protonated. All water molecules observed in the crystal structure were retained before solvating the systems with a 12 Å buffer. The net charge on the system was neutralized with addition of sodium ions. The equilibration protocol involved initial restraints on the protein and the substrate, which were gradually released in subsequent simulations (Table S1). During these simulations, dihedral restraints were employed to maintain the pucker state of the substrate ring bound at the −1 subsite, and distance restraints were employed to maintain the acid proton orientation toward the glycosidic oxygen; details are described in Section S1. This was followed by five independent unrestrained production runs, each for 100 ns at 300 K. The DOMDEC engine was used with the CHARMM molecular simulation package for these MD simulations with a 2-fs time step and bonds to hydrogen atoms constrained using the SHAKE algorithm (49-51). The simulations were analyzed using the CHARMM analysis tools (49). Catalytically competent configurations (as defined in Fig. S3) were selected from these equilibrium simulations for subsequent QM/MM simulations.

QM/MM setup
The CHAMBER utility was used to convert the CHARMM files to the AMBER format for QM/MM simulations (52). All reactions were modeled using the QM/MM MD suite in AMBER (53). The QM region was modeled using the self-consistent charge density functional tight binding method (54,55), which has been demonstrated previously to be well-suited for studying hydrolysis reactions in GHs (24,25). Although self-consistent charge density functional tight binding is a semiempirical approach, it enables access to the many thousands of reactive and unreactive trajectories required for judicious analysis of a transition state ensemble from path sampling. The method has also been demonstrated to reliably reproduce energies and geometries for reactions involving C-O and O-H bond formation and breakage encountered in this study (56). The QM/MM boundary was set across C-C bonds, and hydrogen atoms were used for link atoms as in the default AMBER QM/MM protocol. The QM region consisted of 89 atoms that include the side chains of the acid and base (Asp-121 and Asp-10), the hydrolytic water, the substrate rings at the −1 and +1 subsites, and Thr-6, Tyr-8, and Asp-114, which have close substrate interactions. All QM/MM simulations were conducted with a 1-fs timestep, a 10 Å nonbonded cutoff, and periodic boundary conditions in the isothermal-isobaric ensemble at 300 K and 1.0 bar. Particle mesh Ewald was used for calculation of long-range electrostatics (57). Temperature control was maintained using an Andersen thermostat.

TPS and free energy calculations
The aimless shooting (AS) (30) variety of TPS (58), as described in detail in Section S2, starts with guessing a transition state configuration and initiating trajectories with randomized velocities in the "forward" direction and the opposite of these velocities in the "backward" direction. A Monte Carlo procedure is then carried out, with the acceptance criterion for a trajectory being that it connects the reactant and product basins (as defined in Fig. S1). Initial guesses for the transition states were obtained by restraining key bonds (namely, anomeric carbon to glycosidic oxygen and water oxygen to anomeric carbon) to putative transition state values, starting from catalytically competent structures observed in the MD simulations of the constructed MC. This produced 32 near-TS structures, each of which seeded an independent AS run. Each of the 32 AS runs yielded 1,000 trajectories, the first 100 of which were discarded for equilibration purposes. In this way, 28,800 production AS trajectories were obtained.
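The acceptance rule and the trajectory count above can be checked with a minimal Python sketch. The function names and basin labels are illustrative assumptions, not part of the AS implementation used in this work; basin membership would in practice be decided by the criteria in Fig. S1.

```python
# Bookkeeping sketch for aimless shooting (AS); all names are hypothetical.

REACTANT, PRODUCT, NEITHER = "reactant", "product", "neither"

def is_accepted(backward_end: str, forward_end: str) -> bool:
    """Accept a shooting move only if the two half-trajectories end in
    different basins, so that the joined path connects reactant and product."""
    return {backward_end, forward_end} == {REACTANT, PRODUCT}

def production_count(n_runs: int, per_run: int, discarded: int) -> int:
    """Trajectories kept after dropping the equilibration portion of each run."""
    return n_runs * (per_run - discarded)

if __name__ == "__main__":
    print(is_accepted(REACTANT, PRODUCT))    # True
    print(is_accepted(PRODUCT, PRODUCT))     # False
    print(production_count(32, 1000, 100))   # 28800
```

With 32 runs, 1,000 trajectories per run, and 100 discarded per run, the count reproduces the 28,800 production trajectories quoted above.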
The RC is constructed from candidate order parameters (OPs), which measure the geometric changes that characterize the transition of the system from reactant to product. A total of 99 OPs (Table S2) were analyzed in this transition state ensemble using LM (59–61), which predicts the best RC from these OPs and their combinations by fitting a model to p_B, defined as the fraction of trajectories that commit to the product basin from a given configuration (59). The best predicted RC was validated using the committor probability (p_B) test (30) (Section S2.3, Table S3, and Fig. S2).
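To illustrate how a model can be fitted to the committor, the sketch below performs likelihood maximization for a single OP on synthetic data. It assumes the commonly used model form p_B(r) = [1 + tanh(r)]/2 with r = a0 + a1·q; the model form, the single OP q, the synthetic outcomes, and the plain gradient-ascent optimizer are all assumptions made for illustration, and the actual analysis screened 99 OPs and their combinations.

```python
import math
import random

def p_b(r: float) -> float:
    """Model committor: near 0 in the reactant basin, near 1 in the product basin."""
    return 0.5 * (1.0 + math.tanh(r))

def log_likelihood(a0: float, a1: float, data) -> float:
    """data holds (q, y) pairs; y = +1 if the trajectory committed to product,
    -1 if it committed to reactant. Uses ln p_B = -ln(1 + exp(-2r))."""
    return sum(-math.log1p(math.exp(-2.0 * y * (a0 + a1 * q))) for q, y in data)

def fit(data, lr: float = 0.5, n_iter: int = 400):
    """Maximize the log-likelihood by gradient ascent on (a0, a1)."""
    a0 = a1 = 0.0
    n = len(data)
    for _ in range(n_iter):
        g0 = g1 = 0.0
        for q, y in data:
            resid = y - math.tanh(a0 + a1 * q)  # d(ln L)/dr for this point
            g0 += resid
            g1 += resid * q
        a0 += lr * g0 / n
        a1 += lr * g1 / n
    return a0, a1

def synthetic_data(n: int, a0_true: float, a1_true: float, seed: int = 0):
    """Hypothetical shooting points with outcomes drawn from the model itself."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        q = rng.uniform(-1.0, 1.0)
        y = 1 if rng.random() < p_b(a0_true + a1_true * q) else -1
        data.append((q, y))
    return data

if __name__ == "__main__":
    data = synthetic_data(2000, a0_true=0.5, a1_true=2.0)
    a0, a1 = fit(data)
    print(f"fitted a0 = {a0:.2f}, a1 = {a1:.2f}")
```

Under this model the transition state surface corresponds to r = 0, where p_B = 0.5, which is the condition probed by a committor test.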
The best predicted RC was used to set up umbrella sampling windows for the calculation of the free energy profile for the reaction. Thirty-one windows were seeded from a reactive AS trajectory spanning RC values from −3.5 to +2.5, describing the transition from reactant to product. In AMBER, in-house modifications were made to the sander utility source code to enable restraining the windows to custom-specified RCs that can be combinations of distance and angle values and differences between them. The windows were restrained to the desired RC values using this modified sander routine (rxncoor) with a force constant of 15 kcal/(mol·Å²) and run for 100 ps each. The weighted histogram analysis method was used to construct the free energy profile (62).
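As a rough illustration of the window layout, the sketch below generates 31 window centers between −3.5 and +2.5 and evaluates a harmonic bias. Even spacing of the windows and the bias form E = k(x − x0)² (without a factor of 1/2) are assumptions; the text does not state either, and the modified rxncoor routine is not reproduced here.

```python
def window_centers(start: float, stop: float, n_windows: int) -> list[float]:
    """Evenly spaced umbrella window centers, end points included (assumed layout)."""
    step = (stop - start) / (n_windows - 1)
    return [round(start + i * step, 6) for i in range(n_windows)]

def harmonic_bias(rc: float, center: float, k: float) -> float:
    """Bias energy in kcal/mol, assuming the form k * (rc - center)**2."""
    return k * (rc - center) ** 2

if __name__ == "__main__":
    centers = window_centers(-3.5, 2.5, 31)
    print(len(centers), centers[0], centers[1], centers[-1])
    # Bias felt 0.1 RC units away from a window center with k = 15
    print(harmonic_bias(0.1, 0.0, 15.0))
    # Aggregate umbrella sampling time in ps (31 windows x 100 ps)
    print(len(centers) * 100)
```

Under the even-spacing assumption, neighboring windows are 0.2 RC units apart, and the 31 windows amount to 3.1 ns of aggregate sampling.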

Mutant simulations
The D10N mutant system was created by choosing a QM/MM snapshot of the WT system in the reactant state and mutating Asp-10 to asparagine. The RC considered for this calculation was the difference between two distances: anomeric carbon to glycosidic oxygen (the breaking C-O bond) and −1 hydroxymethyl oxygen to anomeric carbon (the forming C-O bond). In analogy with past studies (24,25), the values of the RC for the reactant and product states (Fig. S5) were known to be −2.25 and 1.75, respectively. Restrained RC simulations using rxncoor were used to prepare seeds for 81 windows spanning these RC values, starting from the reactant state. Following a short equilibration of 2 ps in each window, production runs of 50 ps with a force constant of 100 kcal/(mol·Å²) were conducted, and the weighted histogram analysis method was used to construct the free energy profile (62). A more detailed description of the setup for mutant simulations is provided in Section S3.
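The distance-difference RC can be written out as a short sketch. The sign convention (breaking minus forming distance) is inferred from the stated reactant and product values, and the atom labels and coordinates are hypothetical, chosen only so that the geometry resembles a reactant-like state.

```python
import math

def reaction_coordinate(c1, o_glycosidic, o6) -> float:
    """RC = d(C1, glycosidic O) - d(hydroxymethyl O, C1), in Å.
    Negative when the glycosidic bond is intact, positive once the
    hydroxymethyl oxygen has bonded to the anomeric carbon."""
    d_break = math.dist(c1, o_glycosidic)  # breaking C-O bond
    d_form = math.dist(o6, c1)             # forming C-O bond
    return d_break - d_form

if __name__ == "__main__":
    # Hypothetical reactant-like geometry: intact bond ~1.45 Å, O6 ~3.70 Å away
    c1 = (0.0, 0.0, 0.0)
    o_glycosidic = (1.45, 0.0, 0.0)
    o6 = (0.0, 3.70, 0.0)
    print(round(reaction_coordinate(c1, o_glycosidic, o6), 2))  # -2.25

    # Window spacing if the 81 windows are evenly spaced (an assumption)
    print(round((1.75 - (-2.25)) / (81 - 1), 3))  # 0.05
```

If the 81 windows are evenly spaced between −2.25 and 1.75, neighboring windows differ by 0.05 Å in the RC.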