Substrate binding in the processive cellulase Cel7A: Transition state of complexation and roles of conserved tryptophan residues

Cellobiohydrolases effectively degrade cellulose and are of biotechnological interest because they can convert lignocellulosic biomass to fermentable sugars. Here, we implemented a fluorescence-based method for real-time measurements of complexation and decomplexation of the processive cellulase Cel7A and its insoluble substrate, cellulose. The method enabled detailed kinetic and thermodynamic analyses of ligand binding in a heterogeneous system. We studied WT Cel7A and several variants in which one or two of four highly conserved Trp residues in the binding tunnel had been replaced with Ala. WT Cel7A had on- and off-rate constants of 1 × 10⁵ M⁻¹ s⁻¹ and 5 × 10⁻³ s⁻¹, respectively, reflecting the slow dynamics of a solid, polymeric ligand. The off-rate constant in particular was many orders of magnitude lower than typical values for small, soluble ligands. Binding rate and strength were both typically lower for the Trp variants, but effects of the substitutions were moderate and sometimes negligible. Hence, we propose that lowering the activation barrier for complexation is not a major driving force for the high conservation of the Trp residues. Using so-called Φ-factor analysis, we analyzed the kinetic and thermodynamic results for the variants. The results of this analysis suggested a transition state for complexation and decomplexation in which the reducing end of the ligand is close to the tunnel entrance (near Trp-40), whereas the rest of the binding tunnel is empty. We propose that this structure defines the highest free-energy barrier of the overall catalytic cycle and hence governs the turnover rate of this industrially important enzyme.

Cellobiohydrolases (EC 3.2.1.176 and EC 3.2.1.91) are among the most effective enzymes for the breakdown of cellulose. They are dominant in the secretome of many cellulose-degrading microorganisms and hence play a vital role in the natural carbon cycle. They are also of major technological interest because they make up the principal component in enzyme cocktails used for the conversion of lignocellulosic biomass to fermentable sugars. This industrial process (so-called saccharification) is critically important for the development of efficient biorefineries that produce fuels and chemicals from lignocellulosic feedstock. Reaching the desired goals for production of renewable fuels will involve an unprecedented consumption of industrial enzymes, and improved technical and scientific understanding of the saccharification process hence appears valuable. However, many mechanistic aspects of cellobiohydrolases remain poorly understood, at least in part because these enzymes operate by a complex, processive mechanism.
In the current work, we address the kinetics and thermodynamics of substrate complexation and decomplexation of the cellobiohydrolase Cel7A from the cellulolytic fungus Trichoderma reesei. These two steps appear to be particularly important because they have alternately been proposed to be rate-limiting for the overall reaction (see Ref. 1 for a recent review). Focus on the on and off rates of Cel7A has been accentuated by the observation that covalent transitions associated with the actual bond cleavage proceed quite rapidly. Thus, the sequence of events (sometimes called the inner catalytic cycle; step 3 in Fig. 1A), which includes breaking of one β-1,4-glycosidic bond, expulsion of product (cellobiose), and one processive step forward, occurs at a rate of ~5 s⁻¹ at room temperature (2). However, the maximal turnover of Cel7A, $V_{\max}/E_0$, is much lower at ~0.1-0.3 s⁻¹ (3,4), and this clearly suggests the existence of slow (noncovalent) transitions outside the inner catalytic cycle. Slow complexation, for example, appears likely as the process involves "threading" of at least nine pyranose moieties into the tunnel-shaped binding area of the enzyme (Fig. 1). This process could well be associated with a high activation energy because the piece of cellulose strand must be pulled out of a bound state in the cellulose particle before it can engage in favorable interactions with the enzyme (5). Conversely, some studies have ascribed low turnover to a particularly slow rate of decomplexation and a concomitant accumulation of inactive enzyme on the substrate surface (4,6). These interpretations of either on- or off-rate limitation of Cel7A may appear discordant, but they could also reflect that the rate-limiting step changes with experimental conditions such as temperature or substrate load (7,8). In any event, more insights into the binding and unbinding of a cellulose strand in Cel7A will be required to understand the mechanism of this enzyme. 
In particular, it would be rewarding to gain insights into the nature of the transition state for the processes of substrate complexation and decomplexation, because these structures may govern the overall reaction rate under most conditions.
The structural element that has received most interest regarding threading and substrate binding in Cel7A is a set of four highly conserved Trp residues placed along the binding tunnel (9,10), as shown in Fig. 1B. They have all been shown to interact with the ligand (11), and two of them, Trp-40 and Trp-38, are located in "relaxed subsites" (11) (subsites −7 and −4, respectively) quite far from the catalytic residues, where the ligand is in an extended conformation. The third, Trp-367, is in a "twisting subsite" (11) (subsite −2), where it helps to flip the chain and hence expose the scissile bond for nucleophilic attack, and the last, Trp-376, is in a "product subsite" (subsite +1) in which the product is transiently located after hydrolysis of the glycosidic bond. Different functional roles of these Trp residues in Cel7A and other cellulases have been discussed extensively (reviewed in Ref. 9), and in the current work, we address this through quantitative kinetic data. Specifically, we have implemented a continuous fluorescence method by which complexation of Cel7A and insoluble cellulose can be monitored after a dead time of a few hundred ms. We have used this method to study the dynamics of complex formation for both Cel7A and a number of enzyme variants in which one or two of the Trp residues had been replaced with Ala. The results elucidated both WT ligand binding and the roles of the four Trp residues for the rate and strength of complexation. Finally, the data allowed so-called Φ-factor analysis to elucidate the nature of the transition state in the complexation/decomplexation steps.

Titration experiments
The fluorescence emission at 328 nm increased distinctly when WT Cel7A bound its insoluble substrate (Fig. S1). For the W38A/W376A variant, the change was much less pronounced (~15% of the change for Cel7A WT), and the other enzymes in Table 1 showed changes between these two values. In general, the changes were large enough to be readily measurable and probably reflect dequenching because water molecules are removed from the indole side chains of Trp residues (12) during ligand binding. It follows that the populations of free and bound enzyme can be estimated by a linear combination of the fluorescence signals from enzyme solutions with no substrate and with saturating substrate loads, respectively. Under this interpretation, the fluorescence method distinguishes between free enzyme (E) and enzyme in complex with substrate (ES), whereas it is blind to activity and (uncomplexed) adsorption (i.e. step 1 in Fig. 1A). In light of this, we will interpret the current data along the lines of a simple binding model (Scheme 1).

$$\mathrm{E} + \mathrm{S} \;\underset{k_{\mathrm{off}}}{\overset{k_{\mathrm{on}}}{\rightleftharpoons}}\; \mathrm{ES} \qquad \text{(Scheme 1)}$$
We note that Scheme 1 assumes binding equilibrium, and this criterion has previously been confirmed for short experiments with the current systems (13).
The fraction of enzyme molecules that are engaged in a complex, $\theta$, may be determined as

$$\theta = \frac{[\mathrm{ES}]}{E_0} = \frac{F_x}{F_{\max}} \qquad (1)$$

where $E_0$ is the total enzyme concentration, $F_{\max}$ is the increment in the fluorescence signal at saturation (measured at high substrate loads where $E_0 \approx [\mathrm{ES}]$), and $F_x$ is the increment in signal in a titration experiment with a given cellulose load (Fig. S1). Because the signals in the titration experiments were allowed to become constant, we assume that $\theta$ represents the equilibrium distribution. However, the equilibrium constant for Scheme 1 is not straightforward to define because the substrate load, S, is in mass units (the accessible molar concentration of sites at the surface of the insoluble substrate is unknown).
To address this, we introduce a parameter, $\Gamma$, which specifies the density of sites on the substrate surface (in mol/(g cellulose)) to which the enzyme can bind and form the ES complex (sometimes called attack sites). The total (molar) concentration of attack sites is then $S_0^{*}\Gamma$, where $S_0^{*}$ is the (known) initial cellulose load in g/liter. We may then write the following mass balance,

$$S_0^{*}\Gamma = [\mathrm{S}] + [\mathrm{ES}] \qquad (2)$$

Cellulose binding and transition state of Cel7A
where [S] is the (molar) concentration of unoccupied attack sites. The overall strength of cellulase binding can be described by the so-called partition coefficient, $K_P$ (14), which may be expressed as follows,

$$K_P = \frac{\Gamma}{K_d} \qquad (3)$$

(where $K_d = [\mathrm{E}][\mathrm{S}]/[\mathrm{ES}]$ is the dissociation constant for Scheme 1). The partition coefficient takes into account both intrinsic binding strength and the availability of sites and hence provides a weighted affinity parameter. This is an important property because distinctly different $\Gamma$ values among cellulases (15,16) tend to decouple $K_d$ and the overall binding capacity. For this reason, we will use $K_P$ in discussions of the relative binding strength of the investigated variants. We note, however, that according to Equation 3, $K_P$ can be readily converted to a dissociation constant (in units of M), $K_d = \Gamma/K_P$. This latter parameter carries larger uncertainty because it requires conversion to molar substrate concentration, and it is hence subject to error propagation through division of two parameters from the regression analysis ($\Gamma$ and $K_P$). However, $K_d$ is useful in comparisons with other studies that report affinity through the dissociation constant.
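To derive experimental values of $K_P$ and $\Gamma$ from the titration measurements, we combine Equations 1-3. This yields

$$E_0\theta^2 - \left(E_0 + S_0^{*}\Gamma + \frac{\Gamma}{K_P}\right)\theta + S_0^{*}\Gamma = 0 \qquad (4)$$

Solving the quadratic Equation 4 for $\theta$ and combining with Equation 1 gives the following.

$$\theta = \frac{\left(E_0 + S_0^{*}\Gamma + \Gamma/K_P\right) - \sqrt{\left(E_0 + S_0^{*}\Gamma + \Gamma/K_P\right)^2 - 4E_0 S_0^{*}\Gamma}}{2E_0} \qquad (5)$$

$$F_x = F_{\max}\,\theta \qquad (6)$$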
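The binding curve implied by the mass balance and Scheme 1 can be sketched numerically. The following is a minimal illustration, not the regression routine used in this work: the function names are hypothetical, and the model simply takes the physically meaningful root of the quadratic in the bound fraction, assuming $K_d = \Gamma/K_P$.

```python
import math


def bound_fraction(s0, e0, gamma, k_p):
    """Equilibrium fraction of enzyme in the ES complex for E + S <-> ES.

    s0    : cellulose load in g/L
    e0    : total enzyme concentration in mol/L
    gamma : attack-site density in mol/g
    k_p   : partition coefficient in L/g (so K_d = gamma / k_p, in mol/L)
    """
    if s0 == 0:
        return 0.0
    sites = s0 * gamma          # total molar concentration of attack sites
    k_d = gamma / k_p           # dissociation constant in mol/L
    b = e0 + sites + k_d
    # Smaller root of e0*theta^2 - b*theta + sites = 0 (the one with theta <= 1)
    return (b - math.sqrt(b * b - 4.0 * e0 * sites)) / (2.0 * e0)


def fluorescence_increment(s0, e0, gamma, k_p, f_max):
    """Predicted signal increment F_x at a given load, F_x = F_max * theta."""
    return f_max * bound_fraction(s0, e0, gamma, k_p)


# Illustration with the WT parameters quoted in the text
# (250 nM enzyme, 6.4 umol sites per g RAC, K_P = 122 L/g).
for load in (0.0, 0.01, 0.05, 0.2, 1.0):
    print(load, round(bound_fraction(load, 250e-9, 6.4e-6, 122.0), 3))
```

In a fit, `gamma`, `k_p`, and `f_max` would be the adjustable parameters and `s0`, `e0` the known inputs; any nonlinear least-squares routine could be wrapped around `fluorescence_increment`.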
Equations 5 and 6, which show a standard relationship for ligand binding (17), were implemented in a nonlinear regression routine (Origin 2018b; OriginLab, Wellesley Hills, MA) and used to derive maximum likelihood values for $K_P$, $\Gamma$, and $F_{\max}$. We consistently found that Equations 5 and 6 accounted well for the experimental data, and examples of measurements and regression results are shown in Fig. 2 (a schematic representation of how the raw data from the titration experiments were converted into binding curves can be found in Fig. S2). The parameters ($K_P$, $\Gamma$, and $F_{\max}$) for all investigated enzymes are listed in Table 1. Note that $F_{\max}$ is reported as the ratio (in %) of variant and Cel7A WT.

Time-course experiments
Fig. 3 shows representative examples of the real-time data for ligand binding; the results for other enzymes may be found in Fig. S3. We used 250 nM enzyme, and at t = 0, we added enough regenerated amorphous cellulose (RAC) suspension (always 100 μl) to reach the total substrate load listed in the legend of Fig. 3. To analyze the data, we first assessed the instrument dead time in control experiments with a nonbinding protein. As detailed in Fig. S4, effects of mixing tapered off with a half-time of 150 ms and became undetectable (compared with the experimental scatter) after ~0.5 s. Hence, we discarded data points for t < 0.5 s for all runs. The results could be described by a single exponential function of the type

$$F_x(t) = F_{\mathrm{eq}}\left(1 - e^{-kt}\right)$$

where $F_x(t)$ is the real-time fluorescence signal, $F_{\mathrm{eq}}$ is the equilibrium value specified by the plateau level for each RAC load as illustrated in Fig. 3, and $k$ is a fitting parameter. In some cases, a slight improvement of the variance of the fits could be gained by using the sum of two exponential terms. We did not associate any theoretical meaning with the exponential fitting parameters but used them as a practical way to find the initial slope of the fluorescence traces as the mathematical derivative for t → 0. Combining with Equation 1 and $F_{\max}$ from the titration experiments (Table 1), we converted slopes for t → 0 to the initial rate of complex formation, $v_{\mathrm{on}}$, in units of M/s. We plotted $v_{\mathrm{on}}$ as a function of the RAC load as exemplified in Fig. 4. These plots consistently showed that $v_{\mathrm{on}}$ scaled proportionally to the RAC load and hence confirmed that the reaction was first order with respect to the substrate, at least in the concentration range studied here.
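The conversion from a fitted trace to an initial rate can be illustrated as follows. This is a sketch under stated assumptions: a single-exponential trace of the form F_eq(1 − exp(−kt)) is assumed, the function names are hypothetical, and the numbers in the example are invented for illustration rather than taken from Fig. 3 or Fig. 4.

```python
def initial_rate(f_eq, k_fit, f_max, e0):
    """Initial rate of complex formation, v_on, in mol/(L s).

    For F_x(t) = f_eq * (1 - exp(-k_fit * t)) the slope at t -> 0 is
    f_eq * k_fit. Dividing by f_max turns the signal slope into the rate of
    change of the bound fraction, and multiplying by e0 gives d[ES]/dt.
    """
    return f_eq * k_fit / f_max * e0


def slope_through_origin(loads, rates):
    """Least-squares slope alpha of rates = alpha * loads (no intercept)."""
    return sum(x * y for x, y in zip(loads, rates)) / sum(x * x for x in loads)


def mass_k_on(alpha, e0):
    """Mass-based on-rate constant in (g/L)^-1 s^-1 from alpha = k_on * e0."""
    return alpha / e0


# Hypothetical example: three RAC loads (g/L) and their initial rates (M/s).
loads = [0.1, 0.2, 0.4]
rates = [1.0e-8, 2.0e-8, 4.0e-8]
alpha = slope_through_origin(loads, rates)
print(mass_k_on(alpha, 250e-9))
```

Forcing the regression line through the origin reflects the expectation that no complex forms without substrate; whether an intercept was allowed in the original analysis is not stated.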
According to Scheme 1, the initial on rate may be expressed as $v_{\mathrm{on}} = k_{\mathrm{on}} S_0^{*} E_0$, and it follows that the slope, $\alpha$, of the lines in Fig. 4 is $\alpha = k_{\mathrm{on}} E_0$. We used $\alpha$ values from Fig. 4 and (known) $E_0$ to calculate the $k_{\mathrm{on}}$ values listed in Table 1. On-rate constants determined in this way are in the somewhat unusual (mass-based) units of (g/liter)⁻¹ s⁻¹, and to emphasize this, we will henceforth use the symbol ${}^{\mathrm{mass}}k_{\mathrm{on}}$ for this parameter. However, because we know the density of attack sites on the substrate ($\Gamma$ in mol/g cellulose) from the titration experiments, we may convert ${}^{\mathrm{mass}}k_{\mathrm{on}}$ to an on-rate constant, ${}^{\mathrm{molar}}k_{\mathrm{on}}$, in the usual units of M⁻¹ s⁻¹ (${}^{\mathrm{molar}}k_{\mathrm{on}} = {}^{\mathrm{mass}}k_{\mathrm{on}}/\Gamma$). This value is also listed in Table 1. Finally, combination of Equation 3 and the general relation between $K_d$ and rate constants, $K_d = k_{\mathrm{off}}/{}^{\mathrm{molar}}k_{\mathrm{on}}$, allows us to determine the off-rate constant, $k_{\mathrm{off}} = {}^{\mathrm{mass}}k_{\mathrm{on}}/K_P$. We note in passing that all parameters could have been derived from the time-course measurements without conducting titration measurements at all. However, we found that the experimental scatter was lessened by making separate titration experiments with repeated additions of RAC to the same enzyme sample, and we hence preferred this procedure.

Cellulase kinetics is known to depend on the physical properties of the substrate, and to investigate this aspect we attempted to repeat the above experiments on bacterial microcrystalline cellulose (BMCC), which is mainly crystalline. Unfortunately, excessive light scattering from BMCC limited this work to fairly low loads of cellulose (i.e. far from saturation with superimposed curves at increasing loads as in Fig. 3). This prevented systematic analysis, but raw data for Cel7A WT at low BMCC loads suggested kinetics comparable with that observed on RAC (Fig. S5). In fact, the movement toward equilibrium was slightly faster (half-time of ~5 s) on this more crystalline substrate.
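The unit conversions described above amount to three divisions. The sketch below applies them to the WT values quoted in the text (mass-based on-rate constant 0.59 (g/L)⁻¹ s⁻¹, attack-site density 6.4 μmol/g, partition coefficient 122 L/g); the function name is hypothetical.

```python
def derived_constants(mass_k_on, gamma, k_p):
    """Convert mass-based binding parameters to molar constants.

    mass_k_on : on-rate constant in (g/L)^-1 s^-1
    gamma     : attack-site density in mol/g
    k_p       : partition coefficient in L/g
    Returns (molar_k_on in M^-1 s^-1, k_off in s^-1, k_d in M).
    """
    molar_k_on = mass_k_on / gamma   # (L/g)/s divided by mol/g -> L/(mol s)
    k_off = mass_k_on / k_p          # (L/g)/s divided by L/g   -> 1/s
    k_d = gamma / k_p                # mol/g divided by L/g     -> mol/L
    return molar_k_on, k_off, k_d


molar_k_on, k_off, k_d = derived_constants(0.59, 6.4e-6, 122.0)
print(f"molar k_on = {molar_k_on:.2e} M^-1 s^-1")
print(f"k_off      = {k_off:.4f} s^-1")
print(f"K_d        = {k_d * 1e9:.0f} nM")
```

The three results are mutually consistent by construction, since k_off divided by the molar on-rate constant reproduces K_d.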

Discussion
Cellulases and other enzymes that attack solids must break interactions in the substrate surface before a complex can be formed, and this "excavation" of the ligand is likely to be important for the energy landscape of heterogeneous enzyme reactions. Heterogeneous reactions are widespread both in vivo and in technical enzyme applications (18-22), and insights into the thermodynamics and kinetics of ligand binding appear particularly important for mechanistic understanding. Earlier works on this topic for cellulases have mostly reported adsorption data derived from assays that distinguish between enzymes in the bulk and enzymes adsorbed on the surface. The results from this type of work have provided quite consistent conclusions regarding the strength of binding (see below), but the picture has been less clear with respect to dynamics. Some works have found that Cel7A and other cellulases build up quite slowly on the substrate surface, with apparent adsorption equilibrium established after tens of minutes (23-27). Others have reported a faster process, which leveled off toward a constant adsorbed population within seconds or tens of seconds (28, 29). Recently, more detailed insights have emerged from single-molecule imaging methods (24, 30-33). This work has elucidated key parameters such as the rate of movement and residence times of enzymes on the substrate, although on- and off-rate constants derived in different studies have been quite divergent (24, 30, 31, 33). Other studies have used specialized biochemical assays, which enable distinction between cellulases that are either free in solution, complexed with substrate, or surface-adsorbed with an empty binding tunnel (i.e. not complexed; Fig. 1) (29, 34-36). This type of work has elucidated the population of these three states and the dynamics of their interconversion, but the necessary assays are very labor-intensive and hence not apt for comparative investigations. In the current work, we address the complexation and decomplexation of Cel7A through a new fluorescence-based method that monitors ligand binding in real time. The observed changes in fluorescence signal occur because the ligand causes dequenching of Trp residues when it enters the binding tunnel, and the method hence distinguishes threaded enzyme from apoenzyme, whereas it is insensitive to whether the enzyme is adsorbed with an empty binding tunnel or free in the bulk. We argue that this makes it particularly useful in studies of complexation dynamics. We first focus on the rate of threading of Cel7A WT and subsequently discuss the roles of the conserved Trp residues and the possible structure of the transition state for the complexation step.

Wildtype Cel7A and variants with native Trp pattern
The partition coefficient, $K_P$, for Cel7A WT was 122 ± 33 liters/g (Table 1), which corresponds to a dissociation constant, $K_d = \Gamma/K_P$, of ~52 nM. This binding strength falls in the same range as earlier reports on Cel7A (14, 27, 36-38). Affinity data from these and other references are listed in Table S1. This table shows that earlier work with submicromolar enzyme concentrations (like the current work) found $K_d$ values in the low nanomolar range in accordance with the results reported here. Experiments with tens or hundreds of micromolar enzyme, on the other hand, indicate much weaker binding ($K_d$ in the micromolar range). This difference probably reflects heterogeneity of the binding sites and thus a tendency to detect weaker, unproductive binding at higher concentrations (37, 39). The value for the attack site density on amorphous cellulose ($\Gamma$ = 6.4 ± 0.6 μmol attack sites per gram of RAC; Table 1) is in excellent accordance with earlier adsorption studies and biochemical assessments of productive binding capacities (13, 40). Thus, results from the titration experiments generally supported the validity of the approach and interpretation based on Scheme 1. This simplified interpretation is in line with the argument that the method only differentiates complexed and uncomplexed states, and it is further supported by the observation that the WT and the E212Q variant have almost the same on- and off-rate constants (Table 1). The E212Q variant is practically inactive on insoluble cellulose (41) and hence formally meets the premises of Scheme 1.
The off-rate constant for WT Cel7A was 0.0048 ± 0.0013 s⁻¹ (Table 1), and this corresponds to a mean ES lifetime ($1/k_{\mathrm{off}}$) of ~200 s. This number is in good agreement with recent measurements of Cel7A residence times based on single-molecule fluorescence imaging, and it is also in line with biochemical data (4, 33). The low $k_{\mathrm{off}}$ value has the important corollary that slow dissociation governs the reaction rate at saturation (4, 28), and as a result, the maximal turnover number may be approximated as $V_{\max}/E_0 \approx n k_{\mathrm{off}}$ (42), where $n$ is the average processivity number. Earlier work has suggested that $n$ ≈ 15-20 for Cel7A acting on RAC (42-45), and multiplying this value by $k_{\mathrm{off}}$ from Table 1 predicts a maximal turnover of ~0.1 s⁻¹. This matches $V_{\max}/E_0$ values measured directly in biochemical assays (3, 4), and this prediction of the maximal turnover from the current ligand-binding studies (without any activity measurements) further supports applicability of the method. The on-rate constant for Cel7A WT determined from the initial slope in Fig. 3 was 0.59 ± 0.01 (g/liter)⁻¹ s⁻¹ (Table 1). As seen directly in Fig. 3, this corresponds to a rather swift complexation process, which moves toward equilibrium with a half-time of ~10 s under the conditions studied here. As argued above, the mass-based on-rate constants (${}^{\mathrm{mass}}k_{\mathrm{on}}$) in Table 1 can readily be converted to ${}^{\mathrm{molar}}k_{\mathrm{on}}$ in the conventional units of M⁻¹ s⁻¹. The values of ${}^{\mathrm{molar}}k_{\mathrm{on}}$ calculated in this way are also listed in Table 1. We posit that the rate of complexation scales with the molar concentrations of enzyme and attack sites like a normal bimolecular rate constant and hence that it could be meaningful to compare it with on-rate constants for other enzyme reactions. Typical values for on-rate constants fall in the (1-50) × 10⁶ M⁻¹ s⁻¹ range for enzymes binding small, soluble ligands such as amino acids, nucleotides, small metabolites, or hydrogen peroxide (46). 
The current ${}^{\mathrm{molar}}k_{\mathrm{on}}$ for Cel7A was 0.1 × 10⁶ M⁻¹ s⁻¹ (Table 1) and thus 1-2 orders of magnitude lower than typical values for small-molecule ligands. This slower complexation probably reflects energy barriers associated with the polymeric nature of the ligand and its binding to the surface prior to complexation, but we note that the difference in on-rate constants for solid cellulose and soluble ligands is not exceptionally large. This suggests that Cel7A has developed an effective stabilization of the transition state for complexation, and we return to this below. We may also compare the off-rate constant in Table 1 with typical values for small, soluble ligands. This comparison suggests that Cel7A is far more atypical with respect to decomplexation, because typical off-rate constants for small ligands are in the 10³-10⁴ s⁻¹ range (46) and hence larger than $k_{\mathrm{off}}$ for Cel7A by some 6 orders of magnitude. Overall, these numbers illustrate the slow complexation/decomplexation dynamics of Cel7A, and the uncommonly low value of $k_{\mathrm{off}}$ underscores that Cel7A binds its substrate more tightly than average enzymes (as $K_d = k_{\mathrm{off}}/{}^{\mathrm{molar}}k_{\mathrm{on}}$). This latter point is confirmed by comparing the standard free energy of binding for Cel7A WT, $\Delta G^{\circ} = RT\ln K_d \approx -42$ kJ/mol, with values in a recent meta-analysis of literature data for hydrolases acting on soluble substrates (47). Only a few percent of the hydrolases bound their cognate substrate as strongly as Cel7A WT, and the average $\Delta G^{\circ}$ was approximately −20 kJ/mol.
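The back-of-envelope numbers in this section (ES lifetime, predicted maximal turnover, and standard free energy of binding) can be checked with a few lines. This is an illustrative sketch: the temperature of 298.15 K is an assumption, since the text does not state the temperature used for the free-energy conversion, and the function names are hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol K)


def es_lifetime(k_off):
    """Mean lifetime of the ES complex in seconds, 1/k_off."""
    return 1.0 / k_off


def max_turnover(processivity, k_off):
    """Approximate V_max/E_0 in s^-1 when dissociation is rate limiting."""
    return processivity * k_off


def binding_free_energy(k_d, temperature=298.15):
    """Standard free energy of binding in kJ/mol, RT ln(K_d), K_d in mol/L.

    The default temperature is an assumption made for this illustration.
    """
    return R_KJ * temperature * math.log(k_d)


k_off = 0.0048   # s^-1, WT value from the text
k_d = 52e-9      # M, WT value from the text
print(f"lifetime  ~ {es_lifetime(k_off):.0f} s")
print(f"turnover  ~ {max_turnover(15, k_off):.2f}-{max_turnover(20, k_off):.2f} s^-1")
print(f"dG(bind)  ~ {binding_free_energy(k_d):.1f} kJ/mol")
```

The standard state implied by inserting $K_d$ in mol/L is 1 M, which is the usual convention for such comparisons.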

Trp variants and Φ-factor analysis
Tryptophan residues play a key role as drivers of protein-carbohydrate interactions. This is reflected, for example, in a distinct overrepresentation of Trp in contact regions of noncovalently bound carbohydrates in protein crystal structures (48). This special role of the indole side chain of Trp for carbohydrate binding is also evident in Cel7A, which has four Trp residues along the substrate-binding tunnel (11, 49) as illustrated in Fig. 1. Functional roles of these Trp residues in Cel7A and analogous Trp residues in other processive glycoside hydrolases have been widely discussed, and several mutational studies have highlighted their importance for e.g. ligand-binding strength, initial threading, processivity number, and processive sliding (10, 32, 50-57). We first notice that the maximal change in fluorescence emission, $F_{\max}$, differed markedly among the investigated enzymes (Table 1 and Fig. S1). The WT and variants with a native Trp pattern (i.e. E212Q and catalytic domain (CD)) all showed approximately the same $F_{\max}$, whereas the Trp variants had smaller changes. This is in line with the interpretation that the fluorescence changes represent dequenching of these Trp residues. The most conspicuous effect was in the two variants that included the W38A mutation. In these cases, $F_{\max}$ was reduced to 15% (W38A/W376A) and 26% (W38A) of the WT value, respectively (Table 1), and this suggested that Trp-38 makes a particularly large contribution to measured $F_{\max}$ values in Cel7A WT. This may reflect a particularly strong dehydration of Trp-38 upon ligand binding, and this interpretation was supported by MD simulations as described in Fig. S6. Thus, in the complex, no water was found within 5 Å of the center of mass of Trp-38, whereas the other three Trp residues showed some residual hydration with 1-3 water molecules within this cutoff (Fig. S6).
Although the parameters in Table 1 generally revealed a role of the Trp residues for the thermodynamics and kinetics of ligand binding, the observed effects were moderate, and this is in contrast to some earlier suggestions on the roles of these residues. In particular, the on-rate constants, ${}^{\mathrm{mass}}k_{\mathrm{on}}$, were quite similar and fell within 30% of the WT for all Trp point mutants. We note that ${}^{\mathrm{mass}}k_{\mathrm{on}}$ is a robust, model-free parameter and that the low sensitivity of the on rate to Trp point mutations appeared directly from the initial slopes of the raw data in Fig. 3 and Fig. S3. Variants with two Trp replacements showed some reduction in the rate of complexation, but again the changes were not dramatic, and the drop in ${}^{\mathrm{mass}}k_{\mathrm{on}}$ for both W367A/W376A and W38A/W376A was less than 3-fold compared with Cel7A WT. We conclude that at least on amorphous cellulose, one Trp residue can be replaced with limited effect on the threading rate of Cel7A. Even two replacements resulted in on rates of the same order of magnitude, and this suggests that maintaining a high on rate is not the major evolutionary driving force for the conservation of Trp in the binding tunnel of Cel7A. Binding affinity of the Trp point mutants was similar or moderately lowered compared with the WT. The relative ligand affinity was Cel7A WT ≈ W376A > W367A ≈ W40A > W38A. The difference in standard free energy of binding, $\Delta\Delta G^{\circ}$, between the boundaries in this sequence (i.e. between W38A and the WT) was ~4 kJ/mol. This $\Delta\Delta G^{\circ}$ is in line with an earlier kinetic investigation of these two enzymes (53). For the variants W367A and W40A, the free energy of ligand binding was ~2 kJ/mol higher than for the WT, thus signifying weak favorable interactions from the Trp residues in these positions.
Table 1 lists both rate constants and ligand affinities for variants with mutations in different regions of the binding area, and this opens the way for so-called Φ-factor analysis to elucidate the transition state (TS) of the complexation/decomplexation reactions. This approach has been successfully used in protein folding studies, and the basic idea is to compare changes in binding energy and activation energy, respectively. Principles and practices of the approach have been lucidly discussed elsewhere (58, 59) and will not be reiterated here, except for some key aspects of the current application, which are sketched out in Fig. 5. For the present data, we may define Φ as follows,

$$\Phi = \frac{\Delta\Delta G^{\ddagger}}{\Delta\Delta G^{\circ}} \qquad (7)$$

where $\Delta\Delta G^{\ddagger}$ and $\Delta\Delta G^{\circ}$ are the changes in activation free energy and standard free energy of binding, respectively, caused by the mutation.

Figure 5. Φ-factor analysis. A shows experimental Φ-factors for complexation ($\Phi_{\mathrm{on}}$) and decomplexation ($\Phi_{\mathrm{off}}$). The cartoon above the histogram indicates the location of the investigated mutations. B and C illustrate the interpretation of Φ-factors on the basis of two hypothetical transition states for the complexation process. In B, the TS is late in the process just before the ligand fully fills the tunnel, whereas in C, it is early in the complexation process. If we mutate a residue located at the yellow star, which interacts with the ligand, the energy landscape changes differently in the two cases (dashed lines). Thus, in B the free energies of the bound state (ES) and the TS (ES‡) will shift in parallel because the mutated residue interacts with the ligand in both states, and this yields $\Phi_{\mathrm{on}}$ = 1. Conversely, in C, only ES (and not ES‡) will be affected by the mutation, and $\Phi_{\mathrm{on}}$ will be 0. This difference allowed us to assess whether a given mutation was located before or after the reaction coordinate of the TS. An analogous analysis may be conducted for decomplexation. In this case, the $\Phi_{\mathrm{off}}$-factors that indicate whether the mutation is before or after the TS are 0 and −1, respectively.

If we first insert on-rate constants for different mutants in Equation 7, we obtain Φ-factors for complexation, $\Phi_{\mathrm{on}}$. A $\Phi_{\mathrm{on}}$ value of ~1 for a given point mutation suggests that ligand interactions at this position are established before the TS (the mutation affects the free energy of ES and ES‡ equally; Fig. 5B). Conversely, if ligand interactions are established in the downhill part of the energy diagram after the TS, the $\Phi_{\mathrm{on}}$ value will be ~0 (only ES is affected; Fig. 5C). The analogous values for decomplexation (i.e. going from right to left in Fig. 5, B and C) are $\Phi_{\mathrm{off}}$ = 0 and $\Phi_{\mathrm{off}}$ = −1, respectively. We plotted the Φ-factors derived by Equation 7 from the data in Table 1 in the histogram in Fig. 5A. In this figure, the columns are organized according to the position of the mutation along the binding tunnel, and the horizontal axis hence represents the reaction coordinate of the essentially one-dimensional threading process (the ligand enters the tunnel opening near the CBM and Trp-40 and gradually moves forward to fill the tunnel).
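The limiting cases described above can be reproduced with a small calculation. The sketch below assumes the common convention that the change in activation free energy is −RT ln(k_mut/k_WT) and the change in binding free energy is RT ln(K_d,mut/K_d,WT); this explicit form, the function name, and the rate constants in the example are assumptions made for illustration and are not values from Table 1.

```python
import math


def phi_factor(k_wt, k_mut, kd_wt, kd_mut):
    """Phi = ddG(activation) / ddG(binding), both expressed in units of RT.

    k_wt, k_mut   : on- or off-rate constants for WT and variant
    kd_wt, kd_mut : dissociation constants for WT and variant
    Undefined (division by zero) if the mutation leaves K_d unchanged.
    """
    ddg_activation = -math.log(k_mut / k_wt)
    ddg_binding = math.log(kd_mut / kd_wt)
    return ddg_activation / ddg_binding


# Hypothetical WT constants.
k_on_wt, k_off_wt = 1.0e5, 5.0e-3
kd_wt = k_off_wt / k_on_wt

# Late TS (Fig. 5B): the mutation slows complexation 4-fold, k_off unchanged.
k_on_b, k_off_b = k_on_wt / 4.0, k_off_wt
kd_b = k_off_b / k_on_b
print(phi_factor(k_on_wt, k_on_b, kd_wt, kd_b),      # Phi_on close to 1
      phi_factor(k_off_wt, k_off_b, kd_wt, kd_b))    # Phi_off close to 0

# Early TS (Fig. 5C): k_on unchanged, decomplexation 4-fold faster.
k_on_c, k_off_c = k_on_wt, k_off_wt * 4.0
kd_c = k_off_c / k_on_c
print(phi_factor(k_on_wt, k_on_c, kd_wt, kd_c),      # Phi_on close to 0
      phi_factor(k_off_wt, k_off_c, kd_wt, kd_c))    # Phi_off close to -1
```

Because K_d equals k_off divided by k_on, the two factors for any variant differ by exactly one under this convention, which is why the on and off histograms carry the same information with an offset.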
Although Φ-factors are usually calculated for point mutations, we include the CD variant in this analysis as the CD is a distinct domain separated by a linker (the CD retains its native structure without the CBM (11)). As in other Φ-factor analyses (60), interpretation of Fig. 5A was challenged by experimental errors, but we see that $\Phi_{\mathrm{on}}$ is ~1 for the CD variant, attains an intermediate value for W40A, and is close to 0 for the next three Trp residues. This suggests that the transition state for ligand binding occurs early in the threading process at a stage where the CBM is interacting with the substrate and the cellulose reducing end makes contact with Trp-40 at the tunnel entrance. This implies that upon further forward movement, favorable enzyme-ligand interactions in the tunnel compensate for the work required to release more glucose moieties from the cellulose surface and hence make this part of the complexation process thermodynamically downhill. This interpretation is supported by analyzing $\Phi_{\mathrm{off}}$ values derived from the decomplexation rate constants (Table 1). Thus, $\Phi_{\mathrm{off}}$ values for residues near the middle or exit of the tunnel (Trp-376, Trp-367, Glu-212, and Trp-38) were approximately −1 (Fig. 5A), suggesting that the TS of decomplexation occurs after ligand interactions with these residues have been broken. W40A has an intermediate value, whereas the CBM-less variant (CD) has $\Phi_{\mathrm{off}}$ near 0, and this again points toward a TS with the reducing end near the tunnel entry and Trp-40. Fig. 6 provides a structural interpretation of results from the Φ-factor analysis. We stress that this is a coarse representation and that other structures could be in accord with the data, but the figure serves to illustrate that the TS has several glucose moieties broken away from the cellulose surface. However, the TS does not involve ligand interactions at the middle of the tunnel or near the catalytic residues. 
This observation of a TS (in both directions) with limited ligand penetration was also proposed in an enzyme kinetic study (61), and it is interesting to note that ligand interactions with Trp-40, which are established around the TS in Fig. 6, have been highlighted for their role in Cel7A efficacy in different earlier works (10, 32, 44, 62). The unfavorable free energy of the structure in Fig. 6 relies on a balance between attractive ligand interactions in the cellulose surface layer and the enzyme complex, respectively. Strong interactions in the complex are illustrated by the large binding free energy found here ($\Delta G^{\circ} \approx -42$ kJ/mol) and elsewhere (14, 27, 36-38), but for the complex to form, several pyranose moieties must be dislodged from a bound state in the solid surface.
The current results suggest that before the reducing end reaches Trp-40, it is favorable for the cellulose strand to flick back to a bound state on the cellulose surface. However, when the reducing end has passed Trp-40, it is thermodynamically downhill to proceed forward and establish enzyme-ligand interactions.
To address this interpretation quantitatively, one may (as a first approximation) estimate the free energy of this TS by inserting the molar k_on in the Eyring equation (although this approach may be questionable for a diffusive process like complexation). Doing so suggests an activation free energy of ~45 kJ/mol, and it is interesting to compare this with computational estimates of the work required to pull off cellooligosaccharides from the surface of crystalline cellulose. This work is in the range of 12-18 kJ/(mol glucose moieties) (5,63), and these numbers suggest that the activation free energy of complexation corresponds to pulling off 3-4 glucose moieties. Although this is only a very coarse estimate, this stretch of cellulose strand seems to be of a realistic order of magnitude.
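The estimate above can be reproduced with a few lines of code. The sketch below inserts the molar on-rate constant of the WT (about 1 × 10⁵ M⁻¹ s⁻¹) in the Eyring equation, assuming a transmission coefficient of 1, a 1 M standard state, and the experimental temperature of 20 °C. The function name is hypothetical, and the same caveat applies as in the text: the Eyring treatment is only a first approximation for a diffusive process.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34    # Planck constant, J s
R = 8.314462618       # gas constant, J/(mol K)


def eyring_activation_free_energy(rate_constant, temperature_k=293.15):
    """Activation free energy in J/mol from k = (k_B*T/h) * exp(-dG/(R*T)).

    Assumption: transmission coefficient of 1; for a second-order rate
    constant the value refers to a 1 M standard state.
    """
    prefactor = K_B * temperature_k / H
    return -R * temperature_k * math.log(rate_constant / prefactor)


dg_act = eyring_activation_free_energy(1.0e5)  # molar k_on of the WT
print(f"Activation free energy: {dg_act / 1000:.1f} kJ/mol")

# Work to detach one glucose moiety from crystalline cellulose (range quoted above).
for work_kj in (12.0, 18.0):
    print(f"{work_kj:.0f} kJ/mol per glucose -> {dg_act / 1000 / work_kj:.1f} moieties")
```

With these inputs the barrier comes out at about 44 kJ/mol, and dividing by 12-18 kJ/mol per glucose moiety gives a stretch of a few glucose units, of the same order as the 3-4 moieties mentioned above.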

Conclusions
The intrinsic fluorescence of Cel7A has previously been shown to change distinctively upon ligand binding (64,65). Based on this, we have implemented a novel, real-time method and a simple kinetic analysis, which provide on and off rates for Cel7A on its insoluble substrate. The approach appears robust and derives on rates directly from initial slopes of raw progress curves, independent of models or regression analyses. The setup used a stirred cuvette, hence obviating potential problems of handling insoluble particles in, for example, stopped-flow systems. The setup had subsecond dead time, and that was sufficient to monitor essentially the full complexation process in real time. We envision that the method could be broadly applicable to cellulases and other glycosyl hydrolases because changes in tryptophan fluorescence are quite common for these enzymes (66). We found that complexation occurred fairly rapidly with an on-rate constant of the WT of 0.59 (g/liter)⁻¹ s⁻¹. Taking the density of attack sites on RAC into account, this on-rate constant corresponded to ~0.1 × 10⁶ M⁻¹ s⁻¹, and this is only moderately slower than typical on-rate constants for enzymes and small, soluble substrates. The off-rate constant was 0.0048 s⁻¹, and this is similar to some values reported earlier for Cel7A and many orders of magnitude slower than typical off rates for soluble substrates. The roles of the four highly conserved Trp residues in the binding tunnel were elucidated through the characterization of several mutants. We found that these Trp residues promoted the strength and rate of complexation, but the contributions were low or moderate, and we propose that their high conservation must rely on functional factors other than rapid complexation. Finally, we used the kinetic and thermodynamic data for the mutants in a so-called Φ-factor analysis, which suggested that the transition state for the on/off step occurred when the reducing end of the ligand reached the tunnel entrance near Trp-40, and the remainder of the tunnel was empty. Because earlier work has suggested that complexation or decomplexation (rather than the chemical changes associated with bond breakage) governs the maximal turnover of Cel7A, we propose that a structure akin to the one in Fig. 6 underlies the apex of the complex free energy landscape of this enzyme.

Figure 6. Structural interpretation of the Φ-factor analysis. The transition state of complexation and decomplexation of Cel7A is proposed to occur at a stage in which the reducing end is near Trp-40, whereas the rest of the tunnel is empty. The unfavorable free energy of this structure arises from the (uncompensated) detachment of several glucose moieties from the cellulose surface. At a higher reaction coordinate of the complexation process (when the tunnel fills), compensating ligand interactions occur in the tunnel and hence lower the free energy. At a lower reaction coordinate, the free energy becomes more favorable as the ligand interacts with the cellulose surface.

Enzymes, substrate, and buffer
The cellobiohydrolase Cel7A from T. reesei and a number of Cel7A variants specified below were expressed in Aspergillus oryzae as detailed elsewhere (67). The enzymes were purified from fermentation broth, validated to show a single band on NuPAGE 4-12% Bis-Tris SDS-PAGE, and quantified by UV absorption as described previously (8). We expressed and purified nine enzymes for this work. In addition to the WT (denoted Cel7A WT), we made six enzymes with variations in the pattern of Trp residues shown in Fig. 1. Specifically, we made four mutants in which each of the Trp residues had been replaced with Ala (W38A, W40A, W367A, and W376A) and two mutants with two replacements of Trp (W367A/W376A and W38A/W376A). Finally, we expressed the CD from Cel7A (i.e. a variant with truncated linker and carbohydrate binding module, CBM) and a point mutant in which the catalytic nucleophile, Glu-212, had been replaced with Gln. This latter variant, E212Q, has previously been shown to retain WT-like structure while being essentially inactive on insoluble cellulose (41). The substrates were RAC and BMCC, and both were prepared by standard methods (68). The load of cellulose in the stocks was determined as dry matter content. All experiments were conducted in 50 mM sodium acetate buffer, pH 5, referred to as standard buffer.

Fluorescence spectroscopy
Fluorescence measurements were conducted in a Jasco FP-8200 spectrofluorometer equipped with a custom-made injection inlet directly to the cuvette, as well as a standard thermostat and magnetic stirrer (Jasco STR-811). The experimental temperature was 20 °C. All measurements relied on changes in the emission spectrum when Cel7A binds the ligand, and to assess this, we initially recorded full spectra (310-500 nm) for all investigated enzymes at several (typically 10) different loads of RAC. We used an excitation wavelength of 295 nm, which primarily probes Trp residues (12). In accordance with earlier reports on Cel7A (64,65), the emission spectra (see supporting information) showed a blue shift and increased intensity upon ligand binding. The changes were particularly pronounced around 328 nm (Fig. S1), and we used this wavelength in all subsequent measurements. The strength of ligand binding was assessed from titration measurements. In these trials, 2 ml of enzyme dissolved in pure standard buffer was initially transferred to a stirred 1 × 1-cm quartz cuvette. Subsequently, the enzyme solution was successively titrated with 10-20 aliquots of RAC suspension until we did not observe further changes in the signal (typically at a total RAC load of 0.2-0.4 g/liter). After each RAC addition, we waited for the fluorescence signal to reach a constant value (i.e. apparent binding equilibrium) and then measured the average emission over 20 s. These titration trials were conducted for all variants at two enzyme concentrations (0.25 and 0.50 μM) and used to derive binding constants and binding site densities as described below. The rate of ligand-enzyme binding was investigated in real time in a separate series of time-course experiments. In this case, the fluorescence from 2 ml of 0.25 μM enzyme solution with vigorous stirring was monitored continuously.
When the signal was stable, one 100-μl aliquot of cellulose suspension was added quickly from a syringe, and the changes in fluorescence were recorded (10 data points/s) over ~500 s. The experiment was repeated (with a fresh enzyme solution) for ~10 different RAC loads for each enzyme variant so that the kinetics of binding was monitored for RAC loads between 0.005 and 0.3 g/liter (the concentration of the injected RAC aliquot was adjusted to reach the desired final load upon addition of 100 μl of sample in all cases). We made a number of control measurements with similar additions of RAC suspension to either pure standard buffer or a solution of a nonbinding protein (BSA; Sigma-Aldrich catalog no. 05470). These results were used partly to estimate the instrument dead time (i.e. the time required to mix the reactants in the cuvette) and partly to quantify the loss of emission associated with light scattering from RAC particles.
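The adjustment of the injected RAC aliquot follows from a simple dilution balance, sketched below. The example assumes that volumes are additive, that the cuvette holds 2 ml of enzyme solution before injection, and that the aliquot is 100 μl, as described above; the function names are hypothetical and serve only as illustration.

```python
def stock_load_for_target(final_load_g_per_l, cuvette_ml=2.0, aliquot_ml=0.1):
    """RAC load (g/liter) required in the injected aliquot to reach the
    desired final load in the cuvette, assuming additive volumes."""
    return final_load_g_per_l * (cuvette_ml + aliquot_ml) / aliquot_ml


def enzyme_conc_after_injection(initial_um=0.25, cuvette_ml=2.0, aliquot_ml=0.1):
    """Enzyme concentration (micromolar) after dilution by the aliquot."""
    return initial_um * cuvette_ml / (cuvette_ml + aliquot_ml)


# Lowest and highest final RAC loads used in the time-course experiments.
for target in (0.005, 0.3):
    stock = stock_load_for_target(target)
    print(f"final load {target} g/liter -> aliquot at {stock:.3f} g/liter")

print(f"enzyme after injection: {enzyme_conc_after_injection():.3f} uM")
```

With these volumes the aliquot must be 21 times more concentrated than the target load, and the injection dilutes the enzyme by about 5%, which may be worth accounting for when rate constants are normalized to enzyme concentration.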