The carbohydrate-binding module and linker of a modular lytic polysaccharide monooxygenase promote localized cellulose oxidation

Lytic polysaccharide monooxygenases (LPMOs) are copper-dependent enzymes that catalyze the oxidative cleavage of polysaccharides such as cellulose and chitin, a feature that makes them key tools in industrial biomass conversion processes. The catalytic domains of a considerable fraction of LPMOs and other carbohydrate-active enzymes (CAZymes) are tethered to carbohydrate-binding modules (CBMs) by flexible linkers. These linkers preclude X-ray crystallographic studies, and the functional implications of these modular assemblies remain partly unknown. Here, we used NMR spectroscopy to characterize structural and dynamic features of full-length modular ScLPMO10C from Streptomyces coelicolor. We observed that the linker is disordered and extended, creating distance between the CBM and the catalytic domain and allowing these domains to move independently of each other. Functional studies with cellulose nanofibrils revealed that most of the substrate-binding affinity of full-length ScLPMO10C resides in the CBM. Comparison of the catalytic performance of full-length ScLPMO10C and its isolated catalytic domain revealed that the CBM is beneficial for LPMO activity at lower substrate concentrations and promotes localized and repeated oxidation of the substrate. Taken together, these results provide a mechanistic basis for understanding the interplay between catalytic domains linked to CBMs in LPMOs and CAZymes in general.

A biobased economy strives to derive biofuels, biomaterials, and commodity chemicals from biocatalytically processed biomass (1). Although cellulose is the most abundant form of biomass, its crystalline nature affects enzyme accessibility and hence the efficiency and costs of biochemical conversion processes (2). A major breakthrough came with the discovery of lytic polysaccharide monooxygenases (LPMOs) (3-6), which are monocopper enzymes that catalyze oxidative cleavage of glycosidic bonds in crystalline regions of polysaccharides such as chitin and cellulose (4, 7-11). By acting on crystalline regions, LPMOs generate new access points for hydrolytic enzymes such as endoglucanases and processive cellobiohydrolases (3-5, 11, 12). Furthermore, it is conceivable that LPMO action promotes the dissociation of cellobiohydrolases, which is known to be rate-limiting under many conditions, e.g. by removal of obstacles (13-17). The synergistic action between LPMOs and cellulases results in a significant enhancement in the efficiency of biomass saccharification and has led to the inclusion of LPMOs in commercial enzyme mixtures for degradation of lignocellulosic biomass (18, 19).
Like many other industrially important cellulose-degrading enzymes, a considerable fraction of LPMOs contain a carbohydrate-binding module (CBM). Such CBMs may be connected to the catalytic domain through a variety of linkers, differing in length, sequence, and flexibility (20). These linkers often contain regions of low sequence complexity and are predicted to be extended and flexible. The nature of these linkers hampers structural studies of the full-length proteins, although such studies could yield important insights into the interplay between the domains, substrate binding, and overall enzyme functionality. This lack of information exists for LPMOs and for carbohydrate-active enzymes in general. In an attempt to bridge this knowledge gap, we used NMR spectroscopy to characterize the structure and dynamics of a cellulose-active modular LPMO.
ScLPMO10C (previously known as CelS2) is an LPMO from Streptomyces coelicolor A3(2) that cleaves cellulose by oxidizing C1 in the susceptible β-1,4-glycosidic bond, thus generating aldonic acids (7). Full-length ScLPMO10C is composed of an N-terminal catalytic domain (hereafter called ScAA10; residues 35-234) connected by a linker region (30 residues; residues 235-264) to a C-terminal family 2 carbohydrate-binding module (hereafter called ScCBM2; residues 265-364). The X-ray structure of the ScAA10 domain (Protein Data Bank (PDB) code 4OY7) depicts the characteristic LPMO β-sandwich fold with a densely packed hydrophobic core and a flat substrate-binding surface that includes the exposed catalytic copper site (21). The copper ion is coordinated by a histidine brace composed of the side chain (Nδ1) and the amino nitrogen of the N-terminal histidine (His35) and the side chain (Nε2) of His144. The linker region (Fig. 1A) consists of a Gly-rich part starting in the C-terminal end of the ScAA10 domain that is followed by a Pro-Thr-Asp-rich part and a Gly-Ser dyad just in front of the ScCBM2 domain. ScCBM2 belongs to the CBM2a subfamily whose members are known to bind cellulose and for which there is structural information (22-24). The presence of ScCBM2, the structure of which is not known, significantly enhances ScLPMO10C binding to cellulose (25) and results in higher yields of oxidized products compared with a truncated version comprising only the catalytic domain (21). Despite additional studies on LPMOs (26-28), the mechanism through which CBMs enhance product yields and the interactions among CBMs, linker regions, and catalytic domains are not well-understood.
We have used NMR to investigate structural and dynamic aspects of full-length ScLPMO10C in solution. As a result, we have solved the first NMR structure of an LPMO-associated CBM (ScCBM2) and generated a structural model for full-length ScLPMO10C by combining the structures of ScCBM2 and ScAA10 with the dihedral angle constraints derived from the secondary chemical shift data for the linker region. This model was complemented with NMR relaxation data (T1, T2, and ¹H-¹⁵N NOE) to provide a description of the dynamic features of ScLPMO10C. Furthermore, we have used NMR to map the substrate interaction surface of ScAA10 and ScCBM2 binding to cellulose nanofibrils and have compared the catalytic performance of full-length ScLPMO10C and isolated ScAA10. Taken together, the results provide a comprehensive, experimentally supported picture of the interplay among the catalytic domain, flexible linker, and CBM and its impact on LPMO functionality.

Structure and dynamics of full-length ScLPMO10C
A model of full-length ScLPMO10C (Fig. 1B) was generated from X-ray crystal diffraction data (21) (for the ScAA10 domain; PDB code 4OY7) and NMR data (29) for the linker region (Biomagnetic Resonance Data Bank accession number 27078) and the ScCBM2 domain (Fig. S1). The result was an ensemble showing large conformational variation as a result of the low structural restriction in the linker region. The ensemble of conformers provides an experimentally supported picture of the overall conformational dynamics of ScLPMO10C (see Figs. 1 and 6 and Videos S1 and S2). Assignment of secondary structure based on chemical shifts (30) analyzed using TALOS-N (31) indicated that the linker has an extended conformation (Fig. 1C), which was confirmed by dynamic light scattering. Indeed, the average radius of gyration for the model ensemble calculated by YASARA (32) was 40 ± 10 Å, and the z-average hydrodynamic radius determined by dynamic light scattering was 35 ± 9.5 Å (Fig. S2); a hydrodynamic radius of about 25 Å would be expected for a spherical protein of similar mass. The linker region of ScLPMO10C is evidently disordered as its amide proton chemical shifts are distributed in a narrow region (8.0-8.6 ppm) of the ¹⁵N HSQC spectrum (Fig. S3B) and show narrower line widths (Fig. S3A) compared with signals from the ScAA10 and ScCBM2 domains.
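The reference value of about 25 Å for a compact sphere can be reproduced with a back-of-the-envelope estimate. The sketch below is illustrative only: the mass of 34.5 kDa is the value quoted for full-length ScLPMO10C in this section, whereas the partial specific volume (0.73 cm³/g) and the hydration layer (3.2 Å) are assumed typical values for globular proteins and are not parameters taken from this study.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def sphere_radius_angstrom(mass_da, partial_specific_volume=0.73, hydration_layer=3.2):
    """Radius (in Å) of a compact, hydrated sphere with the given molecular mass.

    partial_specific_volume (cm^3/g) and hydration_layer (Å) are assumed
    typical values for globular proteins, not values from the paper.
    """
    volume_cm3 = partial_specific_volume * mass_da / AVOGADRO  # volume of one molecule
    radius_cm = (3.0 * volume_cm3 / (4.0 * math.pi)) ** (1.0 / 3.0)
    return radius_cm * 1e8 + hydration_layer  # 1 cm = 1e8 Å


radius = sphere_radius_angstrom(34_500)
print(f"Estimated radius of a compact sphere: {radius:.1f} Å")
```

Under these assumptions the estimate falls close to 25 Å, clearly below both the measured hydrodynamic radius and the radius of gyration of the model ensemble, which is consistent with an extended rather than compact molecule.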
To investigate the global mobility features of ScLPMO10C, rotational correlation times (τc) were calculated for isolated ScAA10 and ScCBM2 and full-length ScLPMO10C (Fig. 1B) based on the ratio between the T1 (Fig. S4) and T2 (Fig. 1D) relaxation times (33). τc is a measure of the overall tumbling of a protein, and it increases proportionally with molecular weight. We found that the τc values for the isolated domains (ScAA10, 9.40 ± 0.98 ns; ScCBM2, 7.40 ± 0.36 ns) are not significantly different from the values observed when the domains are linked together in full-length ScLPMO10C (ScAA10, 9.38 ± 1.59 ns; ScCBM2, 8.57 ± 0.87 ns). Notably, these τc values are significantly lower than the ~17 ns that would be expected for a 34.5-kDa globular protein (34), suggesting that the domains tumble independently of each other when tethered together in ScLPMO10C by the flexible linker. NMR relaxation data support this by showing that the linker is conformationally dynamic (Fig. 1D).
Figure 1. A shows the amino acid sequences of the C terminus of ScAA10 (green; residues 221-234), the linker region (blue; residues 235-264), and the N terminus of ScCBM2 (red; residues 265-269). Amino acids in black do not have their chemical shifts assigned. B shows a representative model of full-length ScLPMO10C. The N and C termini are labeled as are the ScAA10 (green) and ScCBM2 (red) domains with their respective τc. The blue colored patches on the linker region correspond to the residues for which NMR assignments are available. C shows the secondary structure propensity (SSP) of the linker region (blue) and adjacent C-terminal residues of ScAA10 (green) and N-terminal residues of ScCBM2 (red) derived from the secondary chemical shifts for ScLPMO10C. D shows ¹H-¹⁵N NOEs and T2 relaxation times for full-length ScLPMO10C (black bars) and isolated ScAA10 (green dots) and ScCBM2 (red dots). The region 221-269, which includes the linker (residues 235-264), is indicated by a box.

Cellulose binding
Interactions of ¹⁵N-labeled ScAA10 and ¹⁵N-labeled ScCBM2 with cellulose nanofibrils were probed by measuring the changes in amide H and N H signal intensities in ¹⁵N HSQC spectra upon addition of cellulose nanofibrils to each of the protein samples (Fig. S5). The largest reduction of signal intensities is expected to occur at the cellulose-binding surface as substrate binding reduces protein mobility, resulting in signal broadening and decrease of intensity (35). Both proteins showed decreased signal intensities for residues clustered on their putative binding surfaces (Figs. 2 and S5), in particular around the active-site histidines (His35 and His144) and Tyr79 for ScAA10 and around the conserved tryptophans (Trp275 and Trp312) and His331 for ScCBM2. The decrease in signal intensity for all the affected residues was on average almost 10 times higher for ScCBM2 than for ScAA10, indicating that cellulose binds more strongly to the CBM than to the catalytic domain (Fig. S5).
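The per-residue comparison described above amounts to computing, for each assigned amide signal, the fraction of intensity lost upon addition of cellulose. A minimal sketch of that bookkeeping is given below; the residue labels match those named in the text, but the intensity values are invented for illustration and the function names are hypothetical, not part of any NMR software used in this work.

```python
def relative_intensity_loss(free, bound):
    """Return {residue: 1 - I_bound / I_free} for residues seen in both spectra."""
    return {
        residue: 1.0 - bound[residue] / free[residue]
        for residue in free
        if residue in bound and free[residue] > 0
    }


def mean_loss(losses, residues):
    """Average fractional intensity loss over a chosen set of residues."""
    values = [losses[r] for r in residues if r in losses]
    return sum(values) / len(values)


# Hypothetical peak intensities (arbitrary units), NOT measured data.
free_spectrum = {"His35": 100.0, "Tyr79": 80.0, "Gly150": 90.0}
with_cellulose = {"His35": 60.0, "Tyr79": 56.0, "Gly150": 85.5}

losses = relative_intensity_loss(free_spectrum, with_cellulose)
surface_mean = mean_loss(losses, ["His35", "Tyr79"])
for residue, loss in sorted(losses.items(), key=lambda item: -item[1]):
    print(f"{residue}: {100 * loss:.0f}% intensity lost")
```

Residues with the largest fractional loss are candidates for the binding surface; comparing the mean loss between two proteins measured under matched conditions gives a qualitative, not quantitative, indication of relative binding strength.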
Cellulose binding was also assessed by measuring the equilibrium between free and bound enzyme in reactions with Avicel (Fig. 3). The substrate-binding capacity (Bmax) for the full-length enzyme was 2.8 ± 0.2 µmol/g of Avicel, and the equilibrium dissociation constant (Kd) was 7.8 ± 1.0 µM. As only minimal binding was detected for the isolated ScAA10 domain (Fig. 3A), the obtained values primarily reflect the binding strength of ScCBM2. Indeed, the isolated CBM2 domain and the full-length protein showed similar binding properties (Fig. 3A).
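Reporting a binding capacity and a dissociation constant implies a saturation-type binding isotherm. The sketch below assumes the simplest such model, a one-site Langmuir isotherm; the exact fitting procedure used for Fig. 3 is not restated here, so the functional form should be read as an assumption. The numerical values 2.8 and 7.8 are the fitted Bmax and Kd given above, in the units stated in the text.

```python
def bound_enzyme(free_conc, b_max, k_d):
    """One-site Langmuir isotherm: bound = Bmax * [free] / (Kd + [free]).

    free_conc and k_d must share the same concentration unit; the result
    has the unit of b_max (amount of enzyme per gram of substrate).
    """
    return b_max * free_conc / (k_d + free_conc)


B_MAX = 2.8  # binding capacity reported for full-length ScLPMO10C
K_D = 7.8    # equilibrium dissociation constant reported for full-length ScLPMO10C

half_saturation = bound_enzyme(K_D, B_MAX, K_D)
for free in (1.0, 7.8, 20.0, 100.0):
    occupancy = bound_enzyme(free, B_MAX, K_D) / B_MAX
    print(f"free = {free:6.1f}  ->  {100 * occupancy:5.1f}% of Bmax")
```

By construction, half of the binding capacity is occupied when the free enzyme concentration equals Kd, which is a convenient sanity check on any fitted curve.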

Cellulose degradation
To investigate the effect of the CBM2 on cellulose degradation, Cu(II)-loaded ScLPMO10C and ScAA10 were incubated with 10 g/liter Avicel in the presence of 1 mM ascorbic acid for 24 h (Fig. S6). As observed before (21), both enzyme variants yielded aldonic acids with a degree of polymerization ranging from 2 to 9 with yields after 24 h being almost 10 times higher for the full-length enzyme (Fig. S6).
It has recently been shown that, in the case of LPMOs, the effect of the CBM is not only due to the generally accepted targeting effect (i.e. increased substrate binding (36)). In LPMOs, substrate binding protects the enzyme from auto-oxidative inactivation, which implies that weaker substrate binding as a result of CBM removal leads to decreased enzyme stability (28, 37). For the same reasons, at nonsaturating substrate concentrations, the substrate concentration will affect not only LPMO activity but also LPMO stability. Another complicating factor in the study of CBM functionality resides in the trade-off between CBM-mediated substrate affinity and possible negative effects of the CBM related to low off-rates and/or nonproductive binding (38). To overcome these complex issues and to gain true insight into the role of the CBM, reactions with ScLPMO10C and ScAA10 were set up with varying concentrations of Avicel (2-40 g/liter), and formation of soluble products was monitored over time (Fig. 4). At the last time point (60 min), we analyzed the total amounts of oxidized sites (i.e. soluble and insoluble) (Fig. 4, F and G) and the product distributions in the soluble fraction (Figs. 5 and S7) for each substrate concentration.
The action of a C1-oxidizing LPMO such as ScLPMO10C on crystalline cellulose will lead to release of soluble oxidized products if the same cellulose chain is cut twice at internal positions that are maximally ~10 glycosidic bonds apart (longer fragments have too low solubility). Oxidized products may also result from a single cleavage near the nonreducing end of a cellulose chain, whereas a single cleavage close to the reducing end will yield a native (nonoxidized) cello-oligomer. Indeed, all reactions displayed in Figs. 4, 5, and S7 and discussed below yielded both oxidized and native products. Fig. 4 shows that at the lower substrate concentrations ScAA10 is less effective than ScLPMO10C and that this is in part due to rapid enzyme inactivation (Fig. 4, A-C, nonlinear dotted progress curves). At these concentrations, the initial rate of ScAA10 seems to increase with the substrate concentration, which, thus, is nonsaturating. At higher substrate concentrations, ScAA10 is stable during the 60-min incubation time (Fig. 4, D and E), and measurements of the total amount of oxidized sites after 60 min (Fig. 4G) show that enzyme activity keeps increasing with substrate concentration up to 40 g/liter. Because a plateau in product formation is not reached, a substrate concentration of 40 g/liter is not a saturating substrate concentration for ScAA10. At the lower substrate concentrations, ~30% of the oxidized sites are in the insoluble fraction. Interestingly, the proportion of nonsoluble oxidized sites increases at the higher substrate concentrations, reaching ~60% at 40 g/liter (Fig. 4G).
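The relationship between cut positions and product type described at the start of this paragraph can be made explicit with a small bookkeeping model. The sketch below is a simplification under stated assumptions: glucose units are numbered from the nonreducing end, every cut is a C1-oxidizing cleavage, every fragment except the one carrying the original reducing end is counted as oxidized, and fragments of up to 10 units are treated as soluble. The sharp cutoff of 10 and the function name are illustrative choices, not values or tools from this study.

```python
def classify_fragments(chain_length, cut_bonds, max_soluble_dp=10):
    """Classify the fragments produced by C1-oxidizing cuts in one cellulose chain.

    Units are numbered 1..chain_length from the nonreducing end; a cut at
    bond k separates unit k from unit k + 1. Returns a list of tuples
    (degree_of_polymerization, 'oxidized' | 'native', is_soluble).
    """
    cuts = sorted(set(cut_bonds))
    if any(k < 1 or k >= chain_length for k in cuts):
        raise ValueError("cut positions must lie between 1 and chain_length - 1")
    boundaries = [0] + cuts + [chain_length]
    fragments = []
    for index in range(len(boundaries) - 1):
        dp = boundaries[index + 1] - boundaries[index]
        carries_reducing_end = index == len(boundaries) - 2
        kind = "native" if carries_reducing_end else "oxidized"
        fragments.append((dp, kind, dp <= max_soluble_dp))
    return fragments


# Two nearby internal cuts release one short oxidized oligomer.
two_close_cuts = classify_fragments(200, [100, 104])
# A single cut near the reducing end releases a short native oligomer.
cut_near_reducing_end = classify_fragments(200, [197])
print(two_close_cuts)
print(cut_near_reducing_end)
```

In this model a single internal cut releases nothing soluble, which is why the fraction of soluble oxidized products reports on how often the same chain is cut twice within a short distance.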
For the CBM-containing full-length enzyme, stability issues are only observed at the lowest substrate concentration (2 g/liter; Fig. 4A, nonlinear solid curve). At all substrate concentrations from 5 g/liter and higher, stability issues are not observed (Fig. 4, B-E), and the total amount of oxidized sites after 60 min is the same (Fig. 4F). Thus, at 5 g/liter, the substrate concentration becomes saturating. In contrast with ScAA10, at lower substrate concentrations, almost all oxidized sites appear as soluble products (Fig. 4F). Interestingly, the absolute amount and the fraction (relative to the total) of soluble oxidized sites decrease at higher substrate concentrations, and this effect is more pronounced compared with ScAA10 (Fig. 4, F and G). It is also worth noting that, at the highest substrate concentration tested, the truncated enzyme generates more oxidized sites than the full-length enzyme (Fig. 4, E-G).
The fact that, at lower substrate concentrations, the CBM-containing enzyme produces a higher fraction of soluble products indicates a higher probability of the same cellulose chain being cut twice. This could be due to an immobilizing effect of the CBM that would keep the catalytic domain in the proximity of a previous cut for a prolonged time. Such an immobilizing effect would also increase the chances of two cuts in the same cellulose chain happening close to each other, meaning that, on average, one would expect shorter soluble oxidized products for the full-length enzyme. Fig. 5 shows that this is indeed the case. At the lower substrate concentrations, ~60% of the oxidized products solubilized by full-length ScLPMO10C have degrees of polymerization of 2-4 as compared with ~40% for ScAA10.
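The percentages quoted for products with degrees of polymerization of 2-4 are relative distributions, obtained by summing the peak areas of the monitored oxidized products and setting that sum to 100%. A minimal sketch of this normalization follows; the peak areas are invented for illustration, and equal detector response for all oligomers is an assumption of the sketch rather than a claim about the chromatographic method.

```python
def relative_distribution(peak_areas):
    """Convert {degree_of_polymerization: peak_area} into percentages of the total."""
    total = sum(peak_areas.values())
    return {dp: 100.0 * area / total for dp, area in peak_areas.items()}


def short_product_share(peak_areas, dp_min=2, dp_max=4):
    """Percentage of the summed peak area contributed by products in [dp_min, dp_max]."""
    distribution = relative_distribution(peak_areas)
    return sum(share for dp, share in distribution.items() if dp_min <= dp <= dp_max)


# Hypothetical peak areas (arbitrary units) for oxidized products, NOT measured data.
example_areas = {2: 10.0, 3: 20.0, 4: 30.0, 5: 20.0, 6: 10.0, 7: 5.0, 8: 5.0}
share = short_product_share(example_areas)
print(f"Products with DP 2-4: {share:.0f}% of total peak area")
```

Because the normalization removes differences in absolute yield, such relative distributions are best read alongside the absolute amounts when two enzyme variants with different overall activities are compared.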
Most of the functional differences, including the difference in product distribution, were not visible at the highest of the tested substrate concentrations where the two enzyme forms show quite similar functionalities (Figs. 4 and 5). Furthermore, differences between the full-length and the truncated enzymes were largely absent when assessing the production of native soluble cello-oligomers (Fig. S7), which are the result of single cleavages near reducing chain ends.

Discussion
Taken together, the experiments described above yield a structural (Figs. 1, 2, S1, and S2) and a dynamic (Figs. 1, 6, and S3 and Videos S1 and S2) model of the interactions between the CBM-containing LPMO and its substrate, cellulose. Despite a lack of structural information for full-length proteins, models of full-length carbohydrate-active enzymes containing both a CBM and a flexible and extended linker have appeared in the literature (e.g. Ref. 39). To our knowledge, none of these models is supported by atomic-resolution structural data for the full-length protein molecule of the type we present here.
Comparative functional characterization of the full-length enzyme and its isolated catalytic domain revealed complexities that are unique for LPMOs and that relate to the multiple effects of substrate affinity and substrate binding on LPMO performance as discussed above. Although not the core focus of this study, these complexities are of major importance and must be taken into account when interpreting existing functional data on the effect of CBMs on LPMO efficiency (21, 26-28) and when planning novel studies of the roles of CBMs in LPMOs. Clearly, characterization of LPMO variants by measuring product formation at one single time point and/or at one single substrate concentration is not sufficient to fully appreciate the interplay between an LPMO and its CBM.
Functional studies at low substrate concentrations revealed clear differences between the full-length and truncated LPMOs, which can largely be explained by the CBM promoting binding to internal positions on the substrate surface and by the immobilizing effect of the CBM, which promotes multiple cleavages in the same region and same cellulose chain. As a consequence, the full-length protein produces a higher fraction of soluble oxidized products (relative to insoluble products; Fig. 4F), and the soluble products are shorter (Fig. 5).
Figure 5. Oxidized products generated by full-length ScLPMO10C (A and C) and its catalytic domain, ScAA10 (B and D). The panels show the relative distribution (A and B) and the absolute values (C and D) of oxidized products with a degree of polymerization of 2-8, released after incubating the enzymes for 60 min with Avicel, at varying concentrations. To produce A and B, the peak areas for each of the eight monitored products were summed, and the sum was set to 100%. Note that the method used for product quantification differs from the method used to produce Fig. 4.

Native products can only emerge from cleavage at chain ends, which for Avicel, with an average degree of polymerization of 200 (40), are lower in concentration than the internal LPMO-binding sites. Thus, one would expect the amount of native products to increase with substrate concentration all the way up to 40 g/liter as is indeed observed (Fig. S7). The dependence of native product formation on the substrate concentration differs between the two enzyme variants. The increase in product formation upon increasing the substrate concentration is less pronounced for the full-length enzyme (Fig. S7, compare B and C). This is likely because at high substrate concentrations the overall efficiency of the CBM-containing enzyme becomes lower than the efficiency of the truncated enzyme (for reasons discussed below). This efficiency difference is clearly apparent from the quantification of oxidized products shown in Fig. 4, F and G. Interestingly, at lower substrate concentrations, the ratio between native and oxidized products is considerably lower for the full-length LPMO than for the truncated version (Fig. S7, B and C). This shows that the CBM promotes binding of the LPMO at internal positions on the crystalline surface.
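The scarcity of chain ends relative to internal positions follows directly from the degree of polymerization. The arithmetic is sketched below for an idealized, monodisperse cellulose with a degree of polymerization of 200; the molar masses are standard values, whereas treating every glycosidic bond as a potential internal site is a deliberate overestimate, because only bonds exposed on the crystal surface are accessible to the enzyme.

```python
ANHYDROGLUCOSE_G_PER_MOL = 162.14  # one glucose unit within a cellulose chain
WATER_G_PER_MOL = 18.02            # accounts for the two chain termini


def chain_concentration(cellulose_g_per_liter, degree_of_polymerization):
    """Molar concentration of chains, i.e. of reducing ends, in mol/liter."""
    molar_mass = degree_of_polymerization * ANHYDROGLUCOSE_G_PER_MOL + WATER_G_PER_MOL
    return cellulose_g_per_liter / molar_mass


def glycosidic_bonds_per_chain(degree_of_polymerization):
    """Number of glycosidic bonds in a single linear chain."""
    return degree_of_polymerization - 1


reducing_ends = chain_concentration(10.0, 200)
bonds = glycosidic_bonds_per_chain(200)
print(f"Reducing ends at 10 g/liter: {reducing_ends * 1e3:.2f} mM")
print(f"Glycosidic bonds per reducing end: {bonds}")
```

Even allowing for the fact that most internal bonds are buried, the roughly 200-fold excess of internal bonds over reducing ends illustrates why native products remain substrate-limited over the whole concentration range tested.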
Although the distribution of oxidized products showed that the full-length enzyme generates shorter products than the truncated enzyme (Fig. 5), a similar trend is not observed for the production of native products, which are the result of one single cleavage. Substrate binding by the CBM to some extent immobilizes the LPMO, allowing it to carry out multiple oxidations in the same substrate region. It is thus not surprising that the effect of the CBM on the product distribution is only seen for oxidized products, which are primarily the result of two spatially close chain cleavages.
The effect of the substrate concentration on several of the analyzed functional features is remarkable. First, the data (Figs. 4, 5, and S7) show that removal of the CBM is beneficial for overall enzyme performance at the highest substrate concentrations, similar to what has been observed and discussed for CBMs in certain cellulases (38). The work by Várnai et al. (38) on cellobiohydrolases indicated that there may be a payoff between (beneficial) CBM-mediated substrate affinity and possible negative effects of the CBM related to low off-rates and/or nonproductive binding.
Second, the data show that the product mixtures generated by the full-length and the truncated enzymes become more similar as the substrate concentration increases. At the highest substrate concentration, the two enzyme forms behave similarly in terms of the ratio of soluble versus total oxidized products (Fig. 4), the length distributions of the oligomeric products (Figs. 5 and S7), and the ratio between native and oxidized soluble products (Fig. S7). Thus, at the highest substrate concentration, the tendency of the full-length enzyme to promote multiple cleavages in the same region of the substrate disappears, and the cleavage patterns become equally random for both enzyme forms. Although the similar overall activities can be explained by the compensatory effect of a high substrate concentration on weaker substrate affinity, this effect does not explain the changes in product profiles. The freely moving LPMO domain of a bound ScLPMO10C molecule could act on a cellulose chain in another fibril to which it is not directly bound. It is conceivable that this effectively more random, as opposed to local, mode of action, which is also expected for the truncated CBM-free enzyme, becomes more prominent as the substrate concentration increases. This would explain why the product profiles generated by the full-length and the truncated enzymes become more similar as the substrate concentration increases.
Probing interactions between LPMOs and their polymeric substrate is challenging but has been accomplished by using a soluble polymeric substrate (xyloglucan (41)) and by using hydrogen/deuterium exchange to measure binding of insoluble chitin fibrils to a chitin-active LPMO (42). Here, we show that cellulose nanofibrils forming a stable suspension can be used to probe interactions between ScLPMO10C and its substrate. This approach allowed direct mapping of the cellulose-binding surfaces of ScCBM2 and ScAA10 and showed, together with classical binding studies, that ScCBM2 binds much more strongly to the insoluble cellulose substrate than ScAA10. The clear difference in binding strength implies that ScCBM2 prolongs the residence time of ScLPMO10C on cellulose, keeping ScAA10 proximal to the substrate.
Using chemical shifts as a probe of backbone conformation, we provide experimental evidence for an extended linker region. This was expected as proline residues in the linker confer rotational restriction around their peptide bonds, which has been observed previously for Pro-rich linkers (43, 44). Increased T2 values and decreased ¹H-¹⁵N NOEs in the linker region (Fig. 1D) show that, overall, the linker is conformationally dynamic. As a consequence, the linker in ScLPMO10C decouples the motions of the tethered ScCBM2 and ScAA10 domains, which can move independently of each other (Video S1), as clearly indicated by the rotational correlation times (Fig. 1B).
The dynamic features described above should be regarded qualitatively. In particular, residue-specific relaxation data should not be evaluated individually (as "spikes" in the data may arise from integration errors or signal overlap) but in the context of clusters of amino acids displaying similar trends. Such trends can be clearly seen in the linker region but can also be observed for flexible loop regions in the structured domains (e.g. residues 174-176 and 199-202 as seen in Fig. 1D). In the case of ScLPMO10C, the dynamic nature of the linker will in part be due to the various glycine residues adjacent to the Pro-Thr-Asp-rich region (Fig. 1A). It should be noted, however, that NMR studies of a shorter (20-residue) Pro-Thr linker in Xyn10A, a xylanase from Cellulomonas fimi (43), also showed considerable conformational flexibility.
The linker in ScLPMO10C serves as a flexible spacer, maintaining a distance between the carbohydrate-binding and catalytic domains while simultaneously enabling these domains to move independently of each other. Due to the prolonged residence time on a local area of cellulose, caused by ScCBM2, the linker keeps the LPMO domain in proximity of the substrate. The consequences of these features are visible in Video S2, which is based on an ensemble of conformations calculated from amino acid-specific NMR data for full-length ScLPMO10C. The video, summarized in Fig. 6, shows that upon binding to the cellulose surface by ScCBM2 the ScAA10 domain moves around as a result of the flexibility conferred by the linker, sampling an area of ~1300 Å², equaling around 300 glucose residues. This dynamic model of ScLPMO10C explains the preferential release of shorter, soluble oxidized products by the full-length protein as discussed above. Of note, although Fig. 6 may give the impression that the CBM is statically bound to the substrate surface, lateral diffusion of the CBM on the cellulose surface cannot be excluded.
Despite the insights presented here, several questions related to the interplay of catalytic domains, flexible linkers, and CBMs remain. One issue concerns the events that trigger eventual desorption and relocation of the CBM. Another issue concerns the effect of glycosylation on linker structure and dynamics. Such glycosylation is known to happen in CAZymes from fungi (39,45) and actinomycetes (46), and its impact is currently receiving considerable attention (39,45). In their study on the C. fimi xylanase, Poon et al. (43) concluded that glycosylation of the 20-residue Pro-Thr linker had limited effects on linker structure and dynamics. Studies on the roles of glycosylated linkers in fungal modular CAZymes have revealed an impact of glycosylation on substrate binding and proteolytic resistance, but information on the impact of glycosylation on linker shape and dynamics is scarce. Interestingly, in a recent study, Amore et al. (45) concluded that glycosylation of the linker in a fungal cellobiohydrolase ensures the separation between the catalytic domain and the CBM. The present data show that the nonglycosylated linker of ScLPMO10C has an extended conformation that separates the domains. Another issue concerns the functional implications of variation in linker length and composition as exemplified by the contrast between the 30-residue linker of mixed nature in ScLPMO10C versus the 20-residue strict Pro-Thr linker in Xyn10A from C. fimi. A final question concerns the biological implications of the localized multiple substrate oxidations that are enabled by the CBM. One could envisage that such localized multiple action helps to create "weak spots" in an otherwise tough substrate, which could facilitate further degradation or, perhaps, in the case of an invading plant pathogen, easier penetration.
In conclusion, the present results provide a mechanistic description of the function, conformation, and dynamics of a modular carbohydrate-active enzyme with a flexible linker. Furthermore, we have unraveled some of the pitfalls in assessing the effect of a CBM on LPMO functionality, and by dealing with these pitfalls, we were able to generate functional data that contribute to a better understanding of the role of CBMs in LPMOs. Taking into account the abundance and diversity of CAZymes that have CBMs and flexible linkers (39,43,47), the present experimental insights will contribute to a greater understanding of CAZymes in general and lignocellulose-degrading enzymes in particular.

Sample preparation
The protocols used for the production of pure, isotopically labeled (¹⁵N and ¹³C) full-length ScLPMO10C; the catalytic domain, ScAA10; and the carbohydrate-binding domain, ScCBM2, as well as sample conditions for NMR measurements have been described previously (29). Briefly, ScLPMO10C was produced using Escherichia coli HI-Control™ BL21(DE3) and the Expresso™ pETite N-His SUMO T7 expression vector (Lucigen). The SUMO-fused protein was subsequently purified on a nickel column, the SUMO tag was proteolytically cleaved using SUMO Express Protease, and pure ScLPMO10C was obtained by inverse nickel-affinity purification to remove the SUMO tag and SUMO Express Protease. ScAA10 was produced in E. coli RV308 using an LPMO expression cassette (48). The protein was isolated from the periplasm and purified by ion-exchange chromatography and size-exclusion chromatography. ScCBM2 was produced in E. coli BL21(DE3) using the isopropyl 1-thio-β-D-galactopyranoside-inducible pNIC-CH vector (Addgene) with a C-terminal His tag and purified by immobilized metal affinity chromatography as described previously (29). The His tag was not removed after purification. Prior to NMR experiments, samples of isotopically labeled ScLPMO10C, ScAA10, and ScCBM2 were incubated in 10 mM Na-EDTA for 1 h after which the buffer was exchanged to 20 mM sodium phosphate pH 5.5 buffer with 10 mM NaCl.
Samples of ScLPMO10C and ScAA10 for cellulose degradation experiments were produced as described previously (25). Before use, these enzymes were saturated with copper, using Cu(II)SO4, and the buffer was then exchanged to 50 mM sodium phosphate, pH 7.0, using a PD MidiTrap G-25 (GE Healthcare) as described previously (49).

NMR spectroscopy
NMR spectra were recorded at 25°C on a Bruker Ascend 800-MHz spectrometer Avance III HD equipped with a 5-mm Z-gradient CP-TCI (H/C/N) cryoprobe at the NV-NMR Center/Norwegian NMR Platform (NNP) in Trondheim, Norway. NMR data were processed and analyzed using Bruker TopSpin version 3.5 and Protein Dynamic Center software version 2.3.1 from Bruker BioSpin. The NMR assignments of ScLPMO10C, ScAA10, and ScCBM2 have been published elsewhere (29).
Nuclear spin relaxation times T1 and T2 and heteronuclear ¹H-¹⁵N NOE measurements of amide ¹⁵N for all three proteins were carried out using transverse relaxation optimized spectroscopy (TROSY) experiments (50). T1 and T2 spectra were recorded as pseudo-3D spectra where two frequency dimensions correspond to the amide ¹H and ¹⁵N chemical shifts, respectively, and the third dimension is made up of variable relaxation time delays. For T1, the time points were 0.1, 0.2, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, and 4.5 s. For T2, the time points were 17, 34, 68, 136, 170, 204, 238, and 272 ms. The ¹H-¹⁵N NOE spectra were composed of two 2D planes recorded with and without presaturation, respectively. Overall τc values were determined from the ratio between T1 and T2 (33).
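The two numerical steps implied by this paragraph, extracting a relaxation time from signal decay over the listed delays and converting a T1/T2 ratio into a rotational correlation time, can be sketched as follows. The sketch rests on assumptions that should be stated plainly: a mono-exponential decay fitted by log-linear least squares, and a widely used closed-form approximation for rigid, slowly tumbling proteins, τc ≈ sqrt(6·T1/T2 − 7) / (4π·νN), where νN is the ¹⁵N resonance frequency. Whether the analysis software used here applies exactly these forms is not stated in the text, and the ¹⁵N frequency of about 81.1 MHz is an assumed value for an 800-MHz spectrometer.

```python
import math


def fit_relaxation_time(delays, intensities):
    """Fit I(t) = I0 * exp(-t / T) by linear least squares on ln(I); return T."""
    n = len(delays)
    logs = [math.log(value) for value in intensities]
    mean_t = sum(delays) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in delays)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(delays, logs))
    slope = sxy / sxx  # equals -1 / T for a mono-exponential decay
    return -1.0 / slope


def tau_c_from_t1_t2(t1, t2, nu_n_hz):
    """Approximate rotational correlation time (s) from the T1/T2 ratio.

    Closed-form approximation for slow isotropic tumbling; residues with
    internal motion or exchange broadening should be excluded beforehand.
    """
    return math.sqrt(6.0 * t1 / t2 - 7.0) / (4.0 * math.pi * nu_n_hz)


NU_N_HZ = 81.1e6  # assumed 15N frequency at a 1H frequency of 800 MHz

# T2 delays from the text (converted to seconds) with synthetic, noise-free intensities.
t2_delays = [0.017, 0.034, 0.068, 0.136, 0.170, 0.204, 0.238, 0.272]
synthetic = [1000.0 * math.exp(-t / 0.050) for t in t2_delays]
fitted_t2 = fit_relaxation_time(t2_delays, synthetic)

# Round trip: build the T1/T2 ratio that corresponds to 9.4 ns, then recover it.
target_tau = 9.4e-9
ratio = ((4.0 * math.pi * NU_N_HZ * target_tau) ** 2 + 7.0) / 6.0
recovered_tau = tau_c_from_t1_t2(ratio * fitted_t2, fitted_t2, NU_N_HZ)
print(f"T2 = {fitted_t2 * 1e3:.1f} ms, tau_c = {recovered_tau * 1e9:.2f} ns")
```

With real data, a nonlinear fit that weights points by their noise is generally preferable to the log-linear form, and the approximation should be applied only to residues in well-ordered regions.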

Structure determination
For ScCBM2, NOE cross-peak intensities were converted into distance restraints using the CALIBA (51) subroutine in CYANA 3.97 (52,53). Dihedral torsion angles (φ and ψ) calculated from chemical shift data (Cα, Cβ, HN, N, and C′) by TALOS-N (31) were included as conformational restraints as was one disulfide bridge (Cys265-Cys361). Based on this input, the structure was calculated using CYANA by generating 256 conformers that were optimized using 10,000 steps of simulated annealing to fit the NOE-derived distance restraints and the TALOS-N-derived dihedral angle restraints. The 20 conformers with lowest CYANA target function values were energy-minimized using YASARA (32) with the YASARA force field (54). The first minimization step, in vacuo, was followed by minimization in water and calculating electrostatics by applying the particle mesh Ewald method (55). The coordinates of the 20 ScCBM2 conformers with lowest energy (Table S1) have been deposited in the Protein Data Bank under code 6F7E.
The model of full-length ScLPMO10C was made in CYANA 3.97 by generating 256 conformers. These were then optimized using 10,000 steps of simulated annealing to simultaneously fit the following three conformational constraint inputs. Input I consisted of 1H-1H distances shorter than 5 Å that were extracted from the X-ray crystal structure of ScAA10 (PDB code 4OY7) (21) by using MOLMOL software (56). Input II was derived from the chemical shift assignment (29) of ScLPMO10C, which was used as an input for TALOS-N (31) to generate dihedral angle restraints for the linker region. Input III consisted of the same 1H-1H distance constraints that were used for calculation of the structure of ScCBM2.
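The idea behind Input I, keeping only proton pairs closer than a cutoff, can be sketched as a simple pairwise distance filter. This is an illustrative sketch only: the extraction described here was done with MOLMOL, the function name is hypothetical, and the coordinates below are invented rather than taken from PDB entry 4OY7.

```python
import math
from itertools import combinations

def short_proton_pairs(protons, cutoff=5.0):
    """Return (name_a, name_b, distance) for proton pairs closer than cutoff (in Å)."""
    pairs = []
    for (name_a, xyz_a), (name_b, xyz_b) in combinations(protons, 2):
        d = math.dist(xyz_a, xyz_b)  # Euclidean distance, Python 3.8+
        if d < cutoff:
            pairs.append((name_a, name_b, round(d, 2)))
    return pairs

# Hypothetical proton labels and coordinates in Å (not from a real structure).
example_protons = [
    ("A35.HN", (0.0, 0.0, 0.0)),
    ("A36.HN", (3.0, 0.0, 0.0)),
    ("A90.HA", (0.0, 4.0, 2.0)),
    ("A120.HB2", (9.0, 9.0, 9.0)),
]

restraints = short_proton_pairs(example_protons)
for name_a, name_b, d in restraints:
    print(f"{name_a} - {name_b}: {d} Å")
```

Each retained pair would then be supplied to the structure calculation as a distance constraint, which keeps the catalytic domain close to its crystal structure while the linker remains free to sample conformations.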
Videos S1 and S2 were produced from the experimentally determined ensemble of 32 conformers (calculated from amino acid-specific NMR data for full-length ScLPMO10C as described above). Video S1 was made by aligning conformers of ScLPMO10C with respect to all the α-carbons and using UCSF Chimera version 1.11.2 (57) to create the animation by using each conformer as a different frame. To make Video S2, the same 32 conformers of ScLPMO10C from the experimentally determined ensemble were aligned with respect to the CBM2 domain. A low-energy pathway between each of the conformers was calculated with YASARA (32) using simulated annealing with the YASARA force field (54) to generate a total of 3800 frames from the experimentally determined conformers. UCSF Chimera version 1.11.2 (57) was used to create the animation from the frames generated by YASARA. The cellulose fibril was generated using Cellulose-builder (58).

Binding to cellulose nanofibrils
Cellulose nanofibrils were produced from never-dried softwood bleached pulp fibers. A mechanical pretreatment, i.e. beating in a Claflin mill (1000 kWh/ton for 1 h), was performed before fibrillation. Subsequently, the fibrillation was done by using a Rannie15 type 12.56X homogenizer (APV, SPX Flow Technology, Silkeborg, Denmark) with a pressure drop of 1000 bar in each pass. The cellulose nanofibrils were collected after three passes. The concentration of the cellulose dispersions used for homogenization was 1%, and the final product contained 0.98% dry weight (w/v).
Reference 15N HSQC spectra were recorded for 15N-labeled samples of ScAA10 (0.1 mM) and ScCBM2 (0.1 mM) in 20 mM sodium phosphate buffer, pH 5.5, with 10 mM NaCl. Cellulose nanofibrils were added in a 20:1 ratio (w/w) to the protein samples, and new 15N HSQC spectra were recorded. The normalized signal intensity for each amino acid was estimated from the ratio between peak intensities in the 15N HSQC spectra recorded with and without cellulose nanofibrils.
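The per-residue normalization amounts to dividing each peak intensity measured with cellulose nanofibrils by the intensity of the same peak in the reference spectrum. A minimal sketch is given below; the residue labels and intensities are invented for illustration, and the function name is hypothetical.

```python
def normalized_intensities(reference, with_cnf):
    """Per-residue ratio: peak intensity with cellulose / intensity without."""
    return {
        residue: with_cnf[residue] / reference[residue]
        for residue in reference
        # Skip peaks missing from either spectrum or with no reference signal.
        if residue in with_cnf and reference[residue] > 0
    }

# Hypothetical peak intensities keyed by residue label (arbitrary units).
reference = {"W241": 1200.0, "G250": 800.0, "T300": 1000.0}
with_cnf = {"W241": 120.0, "G250": 400.0, "T300": 950.0}

ratios = normalized_intensities(reference, with_cnf)
for residue, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{residue}: {ratio:.2f}")
```

A ratio well below 1 indicates that a residue's signal is attenuated in the presence of the substrate, which is the readout used to locate binding.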

Binding to Avicel
Binding to Avicel PH-101 (Fluka) was assessed in reaction mixtures containing 10 mg/ml substrate and 0.08 mg/ml protein (ScLPMO10C or ScAA10) in 50 mM sodium phosphate buffer, pH 7.0, that were incubated at 22°C in an Eppendorf Comfort Thermomixer set to 1000 rpm. At various time points (2.5, 5, 15, 30, and 60 min), a sample was taken and filtered using a 96-well filter plate (Millipore) operated by a Millipore vacuum manifold to remove insoluble substrate and substrate-bound protein. The relative amount of protein in the supernatant was determined by measuring A280 (Eppendorf BioPhotometer, Eppendorf, Hamburg).
The Kd and Bmax for the full-length enzyme were determined by mixing protein solutions of varying concentrations (0, 10, 20, 50, 75, 150, 300, and 500 μg/ml) with 10 g/liter Avicel. Before adding Avicel, the A280 was measured for each of the prepared protein solutions (in 50 mM sodium phosphate buffer, pH 7.0) to create a standard curve. After addition of Avicel, the solutions were placed at 22°C in an Eppendorf Comfort Thermomixer set to 1000 rpm for 60 min. Subsequently, samples were filtered using a 96-well filter plate (Millipore), and the concentration of free protein in the supernatant was determined by measuring A280. All assays were performed in triplicate and with blanks (buffer and 10 g/liter Avicel). The Kd (μM) and Bmax (μmol/g of Avicel) were determined by fitting the binding isotherms to a one-site binding equation where P represents protein: [Pbound] = Bmax × [Pfree]/(Kd + [Pfree]). The fitting was done by nonlinear regression using Prism 7 software (GraphPad, La Jolla, CA).
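The one-site binding fit can be illustrated with a small dependency-free sketch. Bound protein per gram of substrate is taken as the depleted concentration divided by the Avicel concentration, and the two parameters are estimated by least squares. This is not the Prism algorithm: here Bmax is solved in closed form for each trial Kd, and Kd is found by a log-spaced scan refined with a golden-section search. All function names are hypothetical, and the data are synthetic.

```python
import math

def bound_per_gram(total, free, avicel_g_per_l=10.0):
    """Bound protein per gram of substrate from depletion: (total - free) / [Avicel]."""
    return (total - free) / avicel_g_per_l

def one_site(p_free, bmax, kd):
    """One-site binding isotherm: bound = Bmax * [Pfree] / (Kd + [Pfree])."""
    return bmax * p_free / (kd + p_free)

def fit_one_site(p_free, p_bound, kd_min=1e-3, kd_max=1e3, n_grid=200):
    """Least-squares estimate of (Kd, Bmax); p_free needs at least one nonzero value."""
    def bmax_for(kd):
        # For a fixed Kd the model is linear in Bmax, so Bmax has a closed form.
        x = [f / (kd + f) for f in p_free]
        return sum(a * b for a, b in zip(x, p_bound)) / sum(a * a for a in x)

    def sse(kd):
        bmax = bmax_for(kd)
        return sum((y - one_site(f, bmax, kd)) ** 2 for f, y in zip(p_free, p_bound))

    # Coarse scan over log-spaced Kd values, then refine around the best point.
    step = (math.log(kd_max) - math.log(kd_min)) / (n_grid - 1)
    grid = [math.exp(math.log(kd_min) + i * step) for i in range(n_grid)]
    best = min(range(n_grid), key=lambda i: sse(grid[i]))
    a, b = grid[max(best - 1, 0)], grid[min(best + 1, n_grid - 1)]
    golden = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(100):
        c, d = b - golden * (b - a), a + golden * (b - a)
        if sse(c) < sse(d):
            b = d
        else:
            a = c
    kd = (a + b) / 2.0
    return kd, bmax_for(kd)

# Synthetic, noise-free data; units follow whatever the inputs use.
free = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0]
bound = [one_site(f, bmax=1.5, kd=2.0) for f in free]
kd_fit, bmax_fit = fit_one_site(free, bound)
print(f"Kd = {kd_fit:.3f}, Bmax = {bmax_fit:.3f}")
```

With free protein in μM and Avicel in g/liter, `bound_per_gram` returns μmol/g directly, which matches the units quoted for Bmax.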

Cellulose degradation experiments
For all cellulose degradation experiments, 0.5 μM Cu(II)-loaded ScLPMO10C or ScAA10 was incubated with 2-40 g/liter Avicel PH-101 in 50 mM sodium phosphate buffer, pH 7.0, in the presence of 1 mM ascorbic acid. The reaction mixtures were incubated in an Eppendorf Thermomixer set to 40°C and 1000 rpm. Samples were taken at 10, 20, 30, and 60 min, and solubilized products were separated from the insoluble fraction by filtration as described above. The soluble products were further degraded by incubation with 0.5 μM endoglucanase TfCel5A (59) at 37°C for 16 h, yielding oxidized products with a degree of polymerization of 2 and 3 (GlcGlc1A and Glc2Glc1A), which were quantified to yield the total concentration of soluble oxidized sites. Cellobiose (98% purity; purchased from Sigma-Aldrich) and cellotriose (95% purity; purchased from Megazyme) were used as substrates for production of C1-oxidized dimer (cellobionic acid; GlcGlc1A) and trimer (cellotrionic acid; Glc2Glc1A) by incubation with 1.5 μM cellobiose dehydrogenase from Myriococcum thermophilum (MtCDH) (60). These in-house-made standards were used to quantify LPMO-generated products.
To determine the total concentration of oxidized products (soluble and insoluble) at the last time point (60 min), the LPMOs were inactivated by boiling the reaction mixture for 15 min at 100°C. Subsequently, the reaction mixtures were diluted with 50 mM sodium phosphate buffer, pH 7.0, to a final concentration of 2 g/liter LPMO-treated Avicel and further hydrolyzed by adding a mixture of two endoglucanases, TfCel5A (1.2 μM) and TfCel6A (61) (5 μM), and incubating for 48 h at 50°C with shaking at 1000 rpm. As a result of this procedure, all cellulose is solubilized, and oxidized products appear as dimers and trimers only.
Native and oxidized products were analyzed by high-performance anion-exchange chromatography using a Dionex™ ICS-5000 system (Thermo Scientific, Sunnyvale, CA) set up with a disposable electrochemical gold electrode. Five-microliter samples were injected on a CarboPac PA1 (2 × 50-mm) column operated with 0.1 M NaOH (eluent A) at a flow rate of 0.25 ml/min and a column temperature of 30°C. Elution was achieved using a stepwise gradient with increasing amounts of eluent B (0.1 M NaOH + 1 M NaOAc) as follows: 0-10% B over 10 min, 10-30% B over 25 min, 30-100% B over 5 min, 100-0% B over 1 min, and 0% B (reconditioning) for 9 min. For analysis of endoglucanase-treated samples containing only oxidized cellobiose and cellotriose, a steeper gradient of acetate was used as follows: 0-10% B over 10 min, 10-14% B over 5 min, 14-30% B over 1 min, 30-100% B over 2 min, 100-0% B over 0.1 min, and 0% B over 10.9 min. Eluted oligosaccharides were monitored using a pulsed amperometric detector, and chromatograms were recorded using Chromeleon 7.0 software (62).
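The two gradient programs can be written down as step tables and queried for the eluent composition at any time point, which is a convenient way to check run lengths. The sketch below assumes a linear ramp within each step (the text gives only start and end values), and the function names are hypothetical.

```python
def build_program(steps, start_percent=0.0):
    """Convert (duration_min, end_percent_b) steps into cumulative (time, %B) points."""
    points = [(0.0, start_percent)]
    for duration, end_percent in steps:
        points.append((points[-1][0] + duration, end_percent))
    return points

def percent_b(points, t_min):
    """Linearly interpolate %B at time t_min (assumes linear ramps within steps)."""
    for (t0, b0), (t1, b1) in zip(points, points[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise ValueError("time outside gradient program")

# Step tables transcribed from the text: (duration in min, %B at end of step).
standard = build_program([(10, 10), (25, 30), (5, 100), (1, 0), (9, 0)])
steep = build_program([(10, 10), (5, 14), (1, 30), (2, 100), (0.1, 0), (10.9, 0)])

print(f"standard run: {standard[-1][0]:.1f} min, steep run: {steep[-1][0]:.1f} min")
print(f"%B at 37.5 min (standard): {percent_b(standard, 37.5):.1f}")
```

Summing the step durations gives total run times of 50 min for the standard program and 29 min for the steeper one.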