Structural and functional variation of chitin-binding domains of a lytic polysaccharide monooxygenase from Cellvibrio japonicus

Among the extensive repertoire of carbohydrate-active enzymes, lytic polysaccharide monooxygenases (LPMOs) have a key role in recalcitrant biomass degradation. LPMOs are copper-dependent enzymes that catalyze oxidative cleavage of glycosidic bonds in polysaccharides such as cellulose and chitin. Several LPMOs contain carbohydrate-binding modules (CBMs) that are known to promote LPMO efficiency. However, structural and functional properties of some CBMs remain unknown, and it is not clear why some LPMOs, like CjLPMO10A from the soil bacterium Cellvibrio japonicus, have multiple CBMs (CjCBM5 and CjCBM73). Here, we studied substrate binding by these two CBMs to shine light on their functional variation and determined the solution structures of both by NMR, which constitutes the first structure of a member of the CBM73 family. Chitin-binding experiments and molecular dynamics simulations showed that, while both CBMs bind crystalline chitin with Kd values in the micromolar range, CjCBM73 has higher affinity for chitin than CjCBM5. Furthermore, NMR titration experiments showed that CjCBM5 binds soluble chitohexaose, whereas no binding of CjCBM73 to this chitooligosaccharide was detected. These functional differences correlate with distinctly different arrangements of three conserved aromatic amino acids involved in substrate binding. In CjCBM5, these residues show a linear arrangement that seems compatible with the experimentally observed affinity for single chitin chains. On the other hand, the arrangement of these residues in CjCBM73 suggests a wider binding surface that may interact with several chitin chains. Taken together, these results provide insight into natural variation among related chitin-binding CBMs and the possible functional implications of such variation.

Among the extensive repertoire of carbohydrate-active enzymes, lytic polysaccharide monooxygenases (LPMOs) have a key role in recalcitrant biomass degradation. LPMOs are copper-dependent enzymes that catalyze oxidative cleavage of glycosidic bonds in polysaccharides such as cellulose and chitin. Several LPMOs contain carbohydrate-binding modules (CBMs) that are known to promote LPMO efficiency. However, structural and functional properties of some CBMs remain unknown, and it is not clear why some LPMOs, like CjLPMO10A from the soil bacterium Cellvibrio japonicus, have multiple CBMs (CjCBM5 and CjCBM73). Here, we studied substrate binding by these two CBMs to shine light on their functional variation and determined the solution structures of both by NMR, which constitutes the first structure of a member of the CBM73 family. Chitin-binding experiments and molecular dynamics simulations showed that, while both CBMs bind crystalline chitin with K d values in the micromolar range, CjCBM73 has higher affinity for chitin than CjCBM5. Furthermore, NMR titration experiments showed that CjCBM5 binds soluble chitohexaose, whereas no binding of CjCBM73 to this chitooligosaccharide was detected. These functional differences correlate with distinctly different arrangements of three conserved aromatic amino acids involved in substrate binding. In CjCBM5, these residues show a linear arrangement that seems compatible with the experimentally observed affinity for single chitin chains. On the other hand, the arrangement of these residues in CjCBM73 suggests a wider binding surface that may interact with several chitin chains. Taken together, these results provide insight into natural variation among related chitin-binding CBMs and the possible functional implications of such variation.
Chitin is a linear and water insoluble polysaccharide composed of β-1,4-linked GlcNAc units found in the cell wall matrix of fungi and the exoskeletons of arthropods. Despite being the second most abundant polymer in nature, after cellulose, chitin does not accumulate in most ecosystems and tends to be absent in fossils (1). This is testimony to the capacity of nature to depolymerize and recycle chitin.
Carbohydrate Active enZymes (CAZymes), such as chitinases and LPMOs, may just be composed of a single catalytic domain (CD) or may contain one or more non-CDs such as carbohydrate-binding modules (CBMs). Currently (as of September 2021), the CAZy database (17) contains 88 families of CBMs with a wide variety of binding specificities, including crystalline polysaccharides and short, soluble oligosaccharides (18,19). The major role of CBMs is to keep an enzyme in close proximity of a substrate, thereby enhancing the effective concentration of the enzyme and overall reaction efficiency (18). In the context of LPMOs, CBMs may have a particularly important role because proximity to the substrate not only contributes to enzyme efficiency but also protects the enzyme from autocatalytic inactivation (20)(21)(22). Several studies have shown that removal of CBMs has a negative effect on LPMO performance (21)(22)(23)(24)(25). There are multiple families of chitin-binding and cellulose-binding CBMs, which may have different substrate specificities (e.g., (23,26)). For example, Lehtiö et al. (26) showed that cellulose-binding modules belonging to two different CBM families bind to different parts of cellulose. Therefore, it is not trivial to predict or determine the role of CBMs, and a better understanding of the ways in which they bind their substrates is needed.
To address functional variation among chitin-binding CBMs, we have used chitin-active CjLPMO10A from Cellvibro japonicus as a model system. The CD of this LPMO, which belongs to the auxiliary activity family 10 (AA10) in CAZy, is appended to two type A (18,19) chitin-binding CBMs: an internal family 5 CBM (CjCBM5) and C-terminal family 73 CBM (CjCBM73) (Fig. 1). The three domains of CjLPMO10A are connected by linkers that are rich in serine residues and are both approximately 30 amino acids long (Fig. 1). A previous study has shown that both CBMs bind to αand β-chitin, thus enhancing substrate binding by the LPMO, and that the fulllength (FL) protein is more efficient in comparison to the CD alone (25). In the present study, we have compared multiple truncated variants of CjLPMO10A (Fig. 1A) to understand the roles of the appended CBMs in LPMO functionality. Furthermore, we have used NMR spectroscopy to elucidate the solution structures of the two CBMs: CjCBM5 and CjCBM73, where the latter is the first structure to be determined for a member of the CBM73 family. We also used NMR titration experiments to investigate binding of the CBMs to chitohexaose. These results were complemented with molecular dynamics simulations to gain more insights into CBM binding to α-chitin. Overall, the results show that while CjCBM5 and CjCBM73 are similar in overall structure and both bind to crystalline chitin, they differ in apparent Tm, binding site architecture, and the ability to bind individual chitin chains.

Results
The effect of CBMs on chitin oxidation is substrate concentration dependent To better understand the functional roles of CjCBM5 and CjCBM73 in relation to full-length CjLPMO10A, we started by testing the performance of the three catalytically active  A   B   10  20  30  40  50  MFNTRHLLAG VSQLVKPASM MILAMASTLA IHEASAHGYV SSPKSRVIQC  60  70  80  90  100  KENGIENPTH PACIAAKAAG NGGLYTPQEV AVGGVRDNHD YYIPDGRLCS  110  120  130  140  150  ANRANLFGMD LARNDWPATS VTPGAREFVW TNTAAHKTKY FRYYITPQGY  160  170  180  190  200  DHSQPLRWSD LQLIHDSGPA DQEWVSTHNV ILPYRTGRHI IYSIWQRDWD  210  220  230  240  250  RDAAEGFYQC IDVDFGNGTG TGSSSSVASS VVSSVTSSSV ASSVASSLSN  260  270  280  290  300  DTCATLPSWD ASTVYTNPQQ VKHNSKRYQA NYWTQNQNPS TNSGQYGPWL  310  320  330  340  350  DLGNCVTSGG SSSVASSSVA SSVASSVTSS VASSVVSGNC ISPVYVDGSS  360  370  380 390 YANNALVQNN GSEYRCLVGG WCTVGGPYAP GTGWAWANAW ELVRSCQ Ser-rich Ser-rich Figure 1. Domain architecture and primary structure of CjLPMO10A. A, domain architecture and molecular weight of CjLPMO10A and the truncated variants used in this study. The numbers above the full-length enzyme show the transitions between the domains and the linkers. The signal peptide (residues 1-37) is cleaved off during secretion. The indicated molecular weights are based on the mature protein, that is, enzymes without signal peptides. B, primary structure of CjLPMO10A FL with color coding according to panel A. Aromatic residues located on the binding surfaces of the two CBMs, as determined in this study, are printed in bold face; cysteine residues involved in disulfide bonds are underlined. CBM, carbohydrate-binding module; CD, catalytic domain, Ser-rich linker; FL, full-length; His ×6, polyhistidine tag; SP, signal peptide.
versions of CjLPMO10A, namely CjLPMO10A FL (FL for fulllength), CjLPMO10A ΔCBM73 (for truncation of the CBM73 domain; see Fig. 1A) and fully truncated CjLPMO10A CD at different concentrations of α-chitin (2, 10, or 50 g/l; Fig. 2). At all substrate concentrations, the FL enzyme and the enzyme lacking only one CBM, CjLPMO10A ΔCBM73 , stayed active for the full duration of the experiment. At the two lowest substrate concentrations, product formation by CjLPMO10A CD ceased rapidly, and faster at the lowest substrate concentration, indicative of enzyme inactivation. However, at a substrate concentration of 50 g/l, all three variants showed similar progress curves and final product levels.
At the two lowest substrate concentrations, the amount of soluble oxidized products (relative to the total amount) was higher for the CBM-containing variants of CjLPMO10A (>85%) compared with CjLPMO10A CD (about 50%) (Fig. 2, D and E). This indicates that, at these lower substrate concentrations, the presence of at least one CBM leads to more localized oxidation, generating a higher fraction of short soluble products, as discussed in Ref. (22) and later. At the highest substrate concentration (Fig. 2F), however, the fraction of soluble oxidized products was close to 50% for all three enzyme versions. All in all, the experiments depicted in Figure 2 did not show significant differences between the catalytic behavior of the two CBM-containing variants, but deletion of both CBMs had a major effect.

Thermal stability and oxidative performance
To assess possible functional differences between the FL enzyme and the variant lacking only the CBM73, we analyzed the effect of temperature on the oxidative performance of these variants (Fig. 3). It is believed that CAZymes with multiple CBMs have an advantage at elevated temperatures as the CBM(s) can counteract the loss of binding because of increased temperature (27)(28)(29). Interestingly, at the highest tested temperature (70 C), CjLPMO10A FL showed significantly higher activity than CjLPMO10A ΔCBM73 . Thus, the presence of the CBM73 indeed has a beneficial effect on LPMO performance at higher temperatures. Determination of melting curves showed that the deletion of the CjCBM73 had some effect on the shape of the curve but not on the apparent Tm of approximately 70 C (Fig. S1). The apparent Tms of the isolated CBMs were 57.2 C for CjCBM5 and 75.4 C for CjCBM73, whereas the apparent Tm of the CjLPMO10A CD was 70.2 C. In accordance with previous studies on the effect of copper binding on the stability of AA10 (30,31) and AA9 (32) LPMOs, the apo variant of CjLPMO10A CD showed reduced stability (T m,app = 56.6 C). [GlcNAcGlcNAc1A] (μM) Figure 2. Chitin degradation by CjLPMO10A variants. Panels A-C show progress curves for the formation of soluble oxidized products by CjLPMO10A FL (solid black line), CjLPMO10A ΔCBM73 (dashed black line), and CjLPMO10A CD (solid gray line) at substrate concentrations of 2 g/l (A), 10 g/l (B), and 50 g/l (C) α-chitin. Panels D-F show quantification of solubilized (gray bars) and total oxidized sites (black bars) after 24 h of LPMO incubation at the various substrate concentrations, that is, 2 g/l (D), 10 g/l (E), and 50 g/l (F). The fraction of soluble oxidized products is given as a percentage of the total for each reaction.
All reactions were carried out with 0.5 μM LPMO and 1 mM ascorbic acid in 50 mM sodium phosphate at pH 7.0 in a thermomixer set to 37 C and 800 rpm. For quantification of soluble products, the solubilized fraction was further degraded by 0.5 μM SmCHB prior to HPLC quantification. For quantification of total products (i.e., soluble and insoluble fraction), samples were heat inactivated after which all α-chitin (diluted to 2 g/l) was degraded with a combination of 2.0 μM SmChiA and 0.5 μM SmCHB. Note that the LPMO variants were used directly after purification and that their copper saturation levels may have varied; thus, the progress curves in panels A-C cannot be used for direct comparison of catalytic initial rates. The error bars show ±SD (n = 3). LPMO, lytic polysaccharide monooxygenase.

Solution structures of CjCBM5 and CjCBM73
The solution structures of CjCBM5 (Protein Data Bank [PDB] ID: 6Z40) and CjCBM73 (PDB ID: 6Z41) were determined by NMR spectroscopy (Fig. 4 and Table S1). The chemical shift assignment completion for the backbone (N, H N , C α , H α , and C 0 ) and side chains (H and C) of CjCBM5 (BioMagnetic Resonance Databank [BMRB] ID: 34519) was >88% and >65%, respectively, whereas these values were >87% and >59% for CjCBM73 (BMRB ID: 34520). Because of the cloning procedure, both proteins contained a Met at the N terminus and an Ala followed by a 6×His tag at the C terminus. For CjCBM5, no resonances from these additional amino acids were assigned, whereas for CjCBM73, the backbone resonances of the additional Ala and the first His in the 6×His tag were assigned.
The structures of CjCBM5 and CjCBM73 are similar (C α RMSD = 5.6 Å) and share the same overall fold (Fig. 4). This fold has previously been described (33) as an "L" shape or "ski boot" fold because of the loop region attached perpendicularly to an antiparallel β-sheet. The structures of both CBMs are stabilized by a disulfide bridge connecting the N-and C-terminal ends of the domain. The structure of CjCBM73 has a short 3 10 helix (residues 371-374) that is linked to the central β-strand by an additional disulfide bridge. These features are unique for the CBM73 family ( Fig. S2) and lack in CjCBM5 and other structurally characterized members of the CBM5 family.
Most CBMs rely on exposed aromatic residues that bind carbohydrates through CH-π interactions (18,34). Based on structural information alone, Figure 4 shows that Y282, W283, and Y296 in CjCBM5, and W371, Y378, and W386 in CjCBM73, could be involved in substrate binding. As shown in   , the aromatic pair Y282-W283 is almost fully conserved within the CBM5 family, whereas Y296 is less conserved. In the context of the CBM73 family ( Fig. S2), W371, Y378, and W386 appear to be highly conserved. To test interactions between chitin and these aromatic patches and their neighboring polar residues, we performed NMR titrations with a soluble chitin substrate, chitohexaose (GlcNAc) 6 .

Probing interactions between soluble chitin and CBMs by NMR
For CjCBM5, titration with (GlcNAc) 6 led to significant 15 N-1 H N chemical shift perturbation for W283 and Y296 as well as for residues in the neighboring loop region (T284, Q285, and G297) that are part of the putative binding surface (Fig. 5). The chemical shift perturbations were used to calculate a K d = 2 ± 1 mM. Of note, this K d value is some three orders of magnitude higher than the value obtained with solid α-chitin (see later).
In contrast to the experiment with CjCBM5, titration of CjCBM73 with up to 6.5 mM (GlcNAc) 6 did not result in any significant chemical shift perturbations, indicating that this CBM does not bind this soluble substrate.
Binding of CjCBM5 and CjCBM73 to oxidized and nonoxidized α-chitin Previous binding studies have shown that both CBMs bind with micromolar affinity to both αand β-chitin (25). These previous studies indicated similar K d values (for α-chitin) for CjCBM5 and CjCBM73. Here, we tested binding using a similar setup, using both the same batch of α-chitin and a batch of α-chitin that had been preoxidized with CjLPMO10A CD as described later and in the Experimental procedures section.
Oxidized chitin was prepared to assess whether surface oxidation would affect CBM binding, one idea being that gradual oxidation of the substrate surface could facilitate release of otherwise strongly bound CBMs. The material was prepared by treating chitin with the CD of CjLPMO10A CD , followed by washing to remove solubilized oxidized chitooligosaccharides and residual LPMO (see Experimental procedures section for further details). The degree of oxidation of the solid fraction was determined upon complete enzymatic hydrolysis of the fraction, which entails that all oxidized sites end up as chitobionic acid. Data from six independent reactions, containing 20 mg/ml chitin, which correspond to approximately 45 mM of oxidized dimer in a theoretical 100% conversion reaction, indicated a degree of oxidation of about 0.3% (number obtained by dividing the chitobionic acid recovered from the solid fraction by the amount of chitobionic acid that would be obtained in a 100% conversion reaction). In an alternative approach, we divided the amount of chitobionic acid recovered from the solid fraction by the total amount of sugars (GlcNAc and chitobionic acid) recovered from this fraction, which indicated approximately 1% oxidation. Hence, the degree of oxidation of the insoluble fraction was estimated to be between 0.3% and 1%, and we assume that oxidation essentially happened on the substrate surface. Figure 6 shows binding curves for the two CBMs with "nonoxidized" (panel A and C) or "preoxidized" (panel B and D) α-chitin. The data show that CjCBM73 (K d = 2.9 μM) binds with slightly higher affinity than CjCBM5 (K d = 8.5 μM). The binding studies with partly oxidized chitin showed similar results. The data showed a 20% increase in the K d for CjCBM5, indicating that binding by this CBM may be negatively affected by surface oxidation, but the difference was not statistically significant.  6 . Affected residues (W283, T284, Q285, Y296, and G297) are highlighted in green on the surface model of CjCBM5. Other surface-exposed aromatic residues for which no significant chemical shift perturbation was detected (Y265 and Y282) are shown in blue for illustration purposes. HSQC, heteronuclear single quantum coherence.

Simulations provide insight into binding of CBMs to α-chitin
Coarse-grained (CG) simulations were performed to further investigate interactions between CBMs and a model of the surface of α-chitin. CG models based on the Martini force field represent 3 to 4 atoms by a single "bead," thereby reducing the number of particles that are simulated (35). This allows simulations to be run longer and to sample longer timescales, compared with atomistic simulations. We combined CG models of chitin and the CBMs with well-tempered metadynamics (WT-MetaD) simulations to further enhance sampling of CBM-chitin binding/unbinding events, which occur on long timescales. In the WT-MetaD approach, protein conformations along a set of collective variables are biased by a history-dependent potential. The total bias (i.e., sum of the Gaussians in the potential) forces the system to escape from local free-energy minima and explore different regions of the collective variable space. For the CBM-chitin model, we used two collective variables as proxies for binding: (i) the Euclidean distance (r chitin ) and (ii) number of contacts, Cc w D, between aromatic residues in the putative substrate-binding surfaces (CjCBM5: Y282, W283, and Y296; CjCBM73: W371, Y378, and W386) and the chitin surface. Details on the calculation of these collective variables are provided in the Experimental procedures section.
To promote binding to chitin by the CBMs, it was necessary to rescale the interaction strengths between chitin beads and protein beads in the Martini model (see Experimental procedures section for details). The effect of rescaling these interactions by 0% (unchanged) or by an up to 15% increase in the strength of the chitin-protein interaction was evaluated by running umbrella-sampling simulations (Fig. S3) on the rescaled models and by comparing dissociation constants calculated from these simulations with experimentally determined values (Fig. 6). The results (Table 1) show that the best agreement with experiments was attained with a 10% increase in the chitin-protein interaction strength. The free-energy surfaces of CjCBM5 and CjCBM73 have similar appearances, but CjCBM73 has a deeper well than CjCBM5, which correlates with its experimentally observed stronger affinity for chitin (Fig. S3).
The number of contacts between all amino acids in each CBM and the α-chitin surface was calculated for every frame (n = 15,000) in the WT-MetaD simulation and reweighted using the bias from the simulation (see Experimental procedures section for details). The results (Fig. 7) show which residues have the most contacts, that is, Cc w D > 0.5, with the substrate over time. For CjCBM5 (Fig. 7, A and C), regions with most contacts include, and are to a large extent limited to, the three aromatic residues of the putative binding surface (Y282, W283, and Y296). In addition, the region around Y265 seems to be somewhat involved in substrate binding albeit  The modeled values were calculated from umbrella-sampling simulations in which the interaction strength between chitin beads and protein beads remained unchanged (0%) or was increased by 5, 10, and 15%.
Structures and functional roles of two chitin-binding CBMs . Panels A and B show the weighted average number of contacts observed during simulations (CcD w ; red bars with dark red error bars) and the chemical shift perturbations (Δδ; black dots with gray error bars) observed by NMR upon addition of (GlcNAc) 6 . The error bars for the number of contacts were calculated using block analysis (89); error bars for chemical shift perturbations correspond to 0.003 ppm. Note that no significant chemical shift perturbations were recorded for CjCBM73. Panels C and D show the number of contacts between each amino acid and the α-chitin surface per frame of the 15 μs simulations, c i , using a cutoff distance of 0.3 nm (see Experimental procedures section for details). We note that because of the use of coarse-grained models and because of the use of metadynamics, that is, enhanced sampling, the time scales do not here correspond to a physical time scale. Panels E and F show the substrate-binding surfaces of representative conformations of the bound state of CjCBM5 and CjCBM73, respectively. The side chains of amino acids on the binding surface that have most contacts (Cc w D > 0.5) with chitin are colored blue, whereas the side chains of amino acids with fewer contacts (0.2 < Cc w D < 0.5) are colored cyan.
with much fewer contacts. These observations are in good agreement with the chemical shift perturbation data for binding of (GlcNAc) 6 . Similar observations were made for CjCBM73 (Fig. 7, B and D), in the sense that also in this case the interacting regions include, and are to a large extent limited to, the three aromatic residues of the putative binding surface (W371, Y378, and W386). Furthermore, also in this case, interactions with fewer contacts (0.2 < Cc w D < 0.5) with a fourth aromatic residue, Y351, were observed. In order to match the experiments as closely as possible, we included the C-terminal His in the simulations and found that these have a number of contacts with the substrate (Fig. 7, A and B). All in all, these analyses show that the amino acids on the surface of CjCBM5 with most chitin contacts form a relatively linear arrangement (Fig. 7E), perhaps reflecting that interactions are limited to a single chitin chain, whereas the arrangement of aromatic amino acids on the surface of CjCBM73 is wider and suggests a more extended substrate-binding surface (Fig. 7F).

Discussion
According to the Pfam database, only about 25% of AA10 LPMOs (Pfam ID: PF03067) contain one or more additional domains, and the large majority of these multimodular enzymes contain a CBM. While single-domain LPMOs can be efficient and may bind well to their substrates, as exemplified by the archetypal chitin-active LPMO CBP21 (6,36,37), CBMs tethered to the LPMO domain are known to have significant impact on the catalytic efficiency of multimodular LPMOs (22,24,25,38). Therefore, it is important to gain a deeper understanding of the mechanisms by which CBMs recognize and bind their target substrates. Here, we have investigated two CBMs from CjLPMO10A: CjCBM5 and CjCBM73, to illuminate structural and functional differences between these chitin-binding domains. The present results include the first structure for a member of the CBM73 family.
The NMR solution structures show that, although both CBMs have similar overall folds, CjCBM73 has a 3 10 -helix connected by an additional disulfide bridge. These features appear to be conserved in the CBM73 family (Fig. S2). To obtain further insight into the structural variation between small chitin-binding CBMs, we compared the structures of CjCBM5 and CjCBM73 with the structures of five CBM5s and a CBM12 (Fig. 8). The CBM12 is included because the CBMs in this family are closely related to family 5 CBMs (18). It has previously been shown (39)(40)(41) that, in addition to conserved surface-exposed aromatic residues, these CBMs share two additionally conserved aromatic amino acids (Y265 and W299 in CjCBM5) that also occur in CBM73s (Y351 and W390 in CjCBM73; Fig. S2). These residues are a part of the hydrophobic core of the proteins. All CBMs (39)(40)(41)(42)(43)(44)(45) in Figure 8 bind chitin.
Previous studies (40,41) have established the importance of the two consecutive and conserved aromatic residues, Y-W or W-W, in family 5 CBMs (Y282-W283 in CjCBM5). Sitedirected mutagenesis studies have shown that a third aromatic residue, Y296, present on the surface of CjCBM5, PfChiA_CBM5 and MmChi60_CBM5, also contributes to chitin binding (41). Whereas the NMR titration experiment with soluble chitohexaose did not show binding for CjCBM73, results for CjCBM5 showed that both W283 and Y296 are involved in binding (GlcNAc) 6 . In addition, the polar residues T284 and Q285 also appear to contribute to binding (GlcNAc) 6 . These observations suggest that chitin binding by CjCBM5 likely involves a combination of CH−π interactions (34) and hydrogen bonding. Binding of chitohexaose to a CBM5 has previously been addressed by Itoh et al. (46) and Akagi et al. (40) for SgChiC_CBM5 using isothermal titration calorimetry and NMR titration, respectively. Interestingly, these studies yielded dissociation constants of 2 mM and 1.6 ± 0.3 mM, respectively, which are consistent with the dissociation constant determined here for CjCBM5 (K d = 2 ± 1 mM).
CjCBM5, like other CBM5s in Figure 8, has three exposed aromatic residues with a close to linear arrangement of the side chains on the surface. This type of arrangement is often found in cellulose-binding domains (47)(48)(49), where the distance between the three aromatic residues coincides with the spacing of every second glucose ring in a single chain (48,50). Compared with the other CBMs in Figure 8, the arrangement of the three exposed aromatic residues in CjCBM73: W371, Y378, and W386, differs, which suggests that CjCBM73 has a wider binding surface that may interact with several chitin chains. This could explain why CjCBM73 cannot bind (GlcNAc) 6 , a single-chain analog, whereas CjCBM5 can.
The distinct arrangements of amino acids on the binding surfaces of CjCBM5 and CjCBM73 may also explain the experimentally and computationally observed differences in binding to α-chitin (Figs. 6 and 7). The side chains of the amino acids with most chitin contacts in the simulations (Fig. 7E) form a linear arrangement in CjCBM5 but are distributed on a larger and wider surface in CjCBM73 (Fig. 7F). Both experiments and simulations indicated that CjCBM5 binds to chitin with lower affinity compared with CjCBM73 (Table 1), whereas this is the other way around for (single chain) (GlcNAc) 6 . The stronger affinity of CjCBM73 for insoluble α-chitin can be explained by its binding surface covering a larger area than the binding surface of CjCBM5.
At low substrate concentrations, the catalytic performance of CjLPMO10A FL and CjLPMO10A ΔCBM73 is superior to that of CjLPMO10A CD , and the progress curves in Figure 2, A and B show that this is due to rapid inactivation of CjLPMO10A CD . Forsberg et al. (25) have previously shown that almost all the binding affinity for chitin in CjLPMO10A FL resides on the CBMs. The strong binding provided by the CBMs ensures that the LPMO stays close to its substrate, thus increasing the chances that the interaction of the reduced CD with the oxygen cosubstrate leads to a productive reaction (i.e., cleavage of chitin) rather than futile turnover that may lead to autocatalytic enzyme inactivation (20), as has previously been observed for other CBM-containing LPMOs (22,38). At the highest substrate concentration (Fig. 2C), inactivation of CjLPMO10A CD was much reduced, likely because the high substrate load favors CjLPMO10A CD binding to chitin, reducing the frequency of futile turnovers and the concurrent risk of enzyme inactivation. This observation is in agreement with a previous study (22) showing that the negative effect of truncation of the CBM2 from a two-domain cellulose-active LPMO was smaller at higher substrate concentrations.
The protective effect of the substrate, mediated by the CBMs, became more evident at higher temperatures (Fig. 3), where CjLPMO10A FL showed higher catalytic performance than CjLPMO10A ΔCBM73 , indicating that CjCBM73 appears to provide additional protection to the enzyme from thermal inactivation. It is conceivable that the increased performance at higher temperatures translates into increased performance at lower, more physiologically relevant temperatures where the enzyme may experience other types of stress, such as very low substrate concentrations or high levels of oxidant.
The aforementioned previous study with a two-domain cellulose LPMO (22) shows that the anchoring effect of the CBMs leads to a higher fraction of soluble oxidized products relative to oxidized sites on the insoluble substrate. A similar effect was also observed for CjLPMO10A FL and CjLPMO10A ΔCBM73 , which produced a higher fraction of soluble oxidized products compared with CjLPMO10A CD (Fig. 2, D-F). Interestingly, this difference became less at higher substrate concentrations, which may perhaps be due to the fact that higher substrate concentrations increase the chance that a substrate-anchored, but otherwise freely moving CD, acts on a neighboring fibril rather than the fibril to which it is bound, as discussed previously (22).
Considering the LPMO reaction cycle and considering that anchoring by the CBMs could lead to multiple oxidized sites localized on the chitin surface around the CBM-binding site, it is conceivable that accumulation of oxidized sites could trigger unbinding of the CBMs. The results of our attempts to test this Structures and functional roles of two chitin-binding CBMs hypothesis by studying CBM binding to partially oxidized chitin (Fig. 6) were not conclusive but did indicate that substrate oxidation slightly weakened chitin binding by CjCBM5.
In conclusion, we have revealed structural and functional variation between the two chitin-binding domains in CjLPMO10A. While it is clear that the presence of these CBMs has a significant effect on the catalytic performance of the LPMO, the question why nature has evolved enzymes with two different chitin-binding domains remains to be answered. Chitin occurs in different crystalline forms and may be intertwined with other polymers, such as β-glucans in fungal cell walls or proteins in crustacean shells. Perhaps, some CBMs are adapted to interacting with chitin in specific copolymeric contexts that are absent in the heavily processed α-chitin used in this study. Indeed, it is possible that functional differences between the CBM5 and the CBM73 remain undetected in the present experiments because of the choice of substrate. The possibility that different CBMs are adapted to different crystalline forms or different faces of a chitin fiber is supported by multiple studies in the cellulose field, which have shown that cellulose-binding domains belonging to different CBM families bind at different locations on cellulose crystals (26,51). Interestingly, chitin-binding studies with a chitinase from Thermococcus kodakaraensis, which contains two CDs and three CBMs, led Kikkawa et al. (52) to propose that two highly similar chitin-binding CBM2 domains ensure binding to the chitin surface, whereas a CBM5 would ensure binding to single chitin chains, perhaps near chain ends. It is conceivable that a similar scenario applies to CjLPMO10A; in this case, the CBM73 would promote binding to the chitin surface, whereas the CBM5, which was shown to bind soluble chitooligosaccharides, would promote binding to single chains. Eventually, insights into these two CBMs will increase our understanding of how LPMOs depolymerize insoluble polysaccharides.
To obtain better expression of CjLPMO10A FL , the codonoptimized gene encoding mature CjLPMO10A FL (residues 37-397) was cloned behind an IPTG-inducible T7 promoter in the pD441-CH expression vector by ATUM, resulting in a fusion construct with an N-terminal E. coli OmpA signal peptide and a C-terminal His 6 motif (Gly-(His) 6 ). The expression vector was transformed into chemically competent E. coli BL21 (New England Biolabs). Production of CjLPMO10A FL was achieved by fed-batch fermentation of the expression strain in a 1-l fermenter (DASGIP benchtop bioreactors for cell culture; Eppendorf), essentially as described previously (53), with the following modifications: at the start of the feed phase, the temperature was switched to 25 C, and 0.6 mM IPTG was added to the glucose feed solution for continuous induction of gene expression. After 18 h of glucose feed, the cells were removed by centrifugation. The culture supernatant containing the target protein was concentrated threefold and buffer exchanged against six volumes of working buffer (50 mM Tris-HCl, 300 mM NaCl, and pH 8.0) by crossflow filtration (Millipore Pellicon 2 mini filter, regenerated cellulose, 3 kDa molecular weight cutoff [MWCO]). After centrifugation for 30 min at 35,000g to remove precipitated proteins and filtration through a 0.2 μm Nalgene Rapid-Flow sterile bottle-top filter unit (Thermo Scientific), the culture filtrate was applied to a 20-ml nickel-nitrilotriacetic acid sepharose column connected to an ÄKTA express FPLC system (GE Healthcare Life Sciences). After washing with ten column volumes (CVs) of working buffer containing 20 mM imidazole, bound protein was eluted with a buffer containing 200 mM imidazole. Fractions containing the target protein were pooled and buffer exchanged into 20 mM Tris-HCl, 200 mM NaCl, pH 7.5 by gel filtration over Sephadex G25 (GE Healthcare, 4× HiPrep Desalting 26/10 columns).
CjLPMO10A ΔCBM73 and CjLPMO10A CD were expressed in lysogeny broth (LB) media containing 50 μg/ml ampicillin. Cells harboring the plasmid were grown at 30 C for 24 h, without any induction, prior to harvest. The protein was extracted from the periplasmic space using an osmotic shock method that was first described by Manoil and Beckwith (54), followed by purification using a two-step chromatography protocol. The periplasmic extract was adjusted to 50 mM Tris-HCl at pH 9.0 (loading buffer) and loaded onto a 5 ml Q Sepharose anion exchange column (GE Healthcare). Proteins were eluted using a linear salt gradient (0-500 mM NaCl) over 60 CVs using a flow rate of 2.5 ml/min. LPMO-containing fractions were pooled and concentrated to 1 ml before being loaded onto a HiLoad 16/60 Superdex 75 size exclusion column (GE Healthcare) operated with a running buffer consisting of 50 mM Tris at pH 7.5 and 200 mM NaCl, at a flow rate of 1 ml/min. Fractions containing pure LPMO were identified by SDS-PAGE and subsequently pooled and concentrated using Amicon Ultra centrifugal filters (Millipore) with an MWCO of 10 kDa. Protein concentrations were measured using the Bradford assay (Bio-Rad). The protein solutions were stored at 4 C until further use.
Typical yields of purified protein were 95, 10, and 10 mg per liter of culture for CjLPMO10A FL , CjLPMO10A ΔCBM73 , and CjLPMO10A CD , respectively. The absence of free copper in the preparations of purified LPMOs was confirmed by measuring hydrogen peroxide production upon addition of ascorbic acid using the Amplex Red assay as described by Kittl et al. (55). The presence of free copper would lead to drastically increased levels of hydrogen peroxide production (56), and this was not observed.
Expression plasmids for CjCBM5 (residues 251-309) and CjCBM73 (residues 338-397) based on the pNIC-CH vector (Addgene) were used for cytoplasmic expression as previously described (25). This cloning procedure adds a Met residue to the N terminus as well as one Ala residue and a polyhistidine tag (6×His tag) to the C terminus of both proteins. Precultures in 5 ml LB medium (10 g/l tryptone, 5 g/l yeast extract, and 5 g/l NaCl) were used to inoculate 500 ml of terrific broth medium supplemented with 50 μg/ml kanamycin. The cultures were grown at 37 C for approximately 3 h in a LEX-24 Bioreactor (Harbinger Biotechnology) using compressed air for aeration and mixing. Expression was induced by adding IPTG to a final concentration of 0.1 mM at an absorbance of 0.6 at 600 nm (OD 600 ), followed by incubation for 24 h at 23 C. Cells were harvested by centrifugation (5500g, 10 min) followed by cell lysis using pulsed sonication in a buffer containing 50 mM Tris-HCl at pH 8.0, 500 mM NaCl, and 5 mM imidazole. Cell debris was removed by centrifugation (75,000g, 30 min), and the supernatant was loaded onto a 5 ml HisTrap HP Ni Sepharose column (GE Healthcare) equilibrated with lysis buffer. The protein was eluted by applying a 25 CV linear gradient to reach 100% of a buffer containing 50 mM Tris-HCl at pH 8.0, 500 mM NaCl, and 500 mM imidazole, at a flow rate of 2.5 ml/min. Protein-containing fractions were analyzed by SDS-PAGE and subsequently concentrated, with concomitant buffer exchange to 20 mM Tris-HCl at pH 8.0, using an Amicon Ultra centrifugal filter (Millipore) with a 3 kDa cutoff. The concentrations of CjCBM5 and CjCBM73 were determined by measuring absorbance at 280 nm (A 280 ) and calculated using theoretical molar extinction coefficients (ε 280, CjCBM5 = 22,585 M −1 cm −1 ; ε 280, CjCBM73 = 28,210 M −1 cm −1 ). Typical yields of these procedures were 5 and 2 mg of pure protein per liter of culture for CjCBM5 and CjCBM73, respectively.

Production of CjCBM5 and CjCBM73 for NMR studies
CjCBM5 and CjCBM73 samples for NMR studies were produced both with 13 C and 15 N isotopic labeling and 15 N labeling only. A preculture was grown in 6 ml LB medium supplemented with 50 μg/ml kanamycin in a shaking incubator at 30 C, 225 rpm, for 6 h. A main culture of 500 ml M9 medium (6 g/l Na 2 HPO 4 , 3 g/l KH 2 PO 4 , and 0.5 g/l NaCl) supplemented with 500 μg/ml kanamycin, 0.5 g ( 15 NH 4 ) 2 SO 4 , 6 ml glycerol, 5 ml 15 N Bioexpress Cell Growth Medium (Cambridge Isotope Laboratories), 5 ml Gibco MEM Vitamin Solution (100×), 1 ml MgSO 4 (1 M), and 5 ml of a trace-metal solution (0.1 g/l ZnSO 4 , 0.8 g/l MnSO 4 , 0.5 g/l FeSO 4 , 0.1 g/l CuSO 4 , and 1 g/l CaCl 2 ) was inoculated with 1% of the preculture and incubated at 22 C in a LEX-24 Bioreactor as described previously. After 18 h, the cultures induced with 0.5 mM IPTG to a final concentration of 0.5 mM were followed by incubation at 22 C for 24 h. Cells were harvested by centrifugation at 4 C, 6000g, for 5 min. The pellet was resuspended in 20 ml lysis buffer (50 mM Tris-HCl, 50 mM NaCl, 0.05% Triton X-100, and pH 8.0) supplemented with a tablet EDTA-free cOmplete ULTRA protease inhibitor (Roche) followed by pulsed sonication. Cell debris was removed by centrifugation at 4 C, 16,600g, for 45 min. The supernatant was sterilized by filtration through a 0.2 μm Sterile-flip filter unit (Nalgene). Buffer B (50 mM Tris-HCl, 400 mM imidazole, and pH 8.0) was added to the filtered lysate to obtain a final concentration of 20 mM imidazole. The proteins were purified by loading the supernatant onto a 1 ml HisTrap HP Ni-sepharose column (GE Healthcare Life Sciences) equilibrated with 5 CV of 95% buffer A (50 mM Tris-HCl, pH 8.0) and 5% buffer B with a flow rate of 1 ml/ min. Impurities were removed by washing with 95% buffer A and 5% buffer B for 10 CV. The protein was eluted using a 30 CV gradient of 5 to 100% buffer B. The purity of the protein fractions was assessed with SDS-PAGE. The yields of the labeled proteins were 1 and 0.2 mg per liter of culture for CjCBM5 and CjCBM73, respectively.
The protocol for production and purification of nonlabeled samples of CjCBM5 and CjCBM73 for NMR studies was as described above, except that 2× LB medium (20 g/l tryptone, 10 g/l yeast extract, and 5 g/l NaCl) was used instead of M9. The yields of the nonlabeled proteins were 4 and 3 mg per liter of culture for CjCBM5 and CjCBM73, respectively.
Fractions shown to contain CjCBM73 were pooled and concentrated using Amicon Ultra protein concentrators (MWCO = 3 kDa) at 10 C and 7000g to obtain a volume of 5 ml. This protein solution was loaded onto a size-exclusion chromatography column (HiLoad 16/600 Superdex 75 pg; 120 ml CV) that had been equilibrated with 1 CV of sizeexclusion chromatography-buffer pH 7.5 (50 mM Tris-HCl and 20 mM NaCl). Protein fractions were eluted using a 1 ml/min flow rate, and the concentration was measured as mentioned previously.
The buffer in the protein-containing fractions was exchanged to NMR buffers (for structure elucidation: 25 mM sodium phosphate and 10 mM NaCl, pH 5.5; for interaction studies: 50 mM sodium phosphate [CjCBM5] or 25 mM sodium phosphate [CjCBM73], pH 7.0) prior to concentrating to 70 μM and a final volume of 400 μl. All steps were performed by centrifugation using Amicon Ultra protein concentration tubes (MWCO = 3 kDa) at 10 C and 7000g. NMR samples were prepared by adding D 2 O to a final ratio of 90% H 2 O/10% D 2 O.

Chitin degradation experiments
Unless stated otherwise, reactions were performed with 0.5 μM LPMO in 50 mM sodium phosphate buffer at pH 7.0 in the presence of 1 mM ascorbic acid at 37 C and 800 rpm in an Eppendorf thermomixer. All reactions were performed in triplicates.

Preparation of oxidized chitin for binding studies
CjLPMO10A CD , which is known to bind weakly to α-chitin (25) and which was expected to oxidize the chitin surface more randomly compared with the full-length enzyme ( Fig. 1; (22)), was used to prepare oxidized chitin. Six 1-ml reactions, each containing 20 g/l α-chitin suspended in 50 mM sodium phosphate at pH 7.0, were supplemented with 1 μM CjLPMO10A CD and 1 mM ascorbic acid three times with 24-h intervals (i.e., to a final concentration of 3 μM enzyme and 3 mM ascorbic acid). The reactions were incubated in a thermomixer set to 37 C and 800 rpm. After 72 h of incubation, samples were taken from all six reactions and diluted in buffer supplemented with 2 μM SmChiA (57) and 0.5 μM SmCHB (58) to a substrate concentration of 2 g/l. These reaction mixtures were incubated for 24 h at 37 C at 800 rpm after which oxidized products were analyzed quantitatively to determine the total degree of oxidation in the LPMO-treated chitin. The rests of the 20 g/l reactions were centrifuged in an Eppendorf centrifuge (12,000g for 3 min), the supernatant was removed, and the soluble products in the supernatant were subjected to degradation with 2 μM SmChiA and 0.5 μM SmCHB as described previously, to determine the amount of solubilized oxidized products. The pelleted oxidized chitin was washed with buffer (3 × 1 ml of 50 mM sodium phosphate at pH 7.0) by repetitively suspending the chitin in buffer and removing the supernatant after centrifugation. Finally, the oxidized chitin was suspended in buffer to 20 g/l. Again, samples were taken from all six reactions and diluted in buffer supplemented with 2 μM SmChiA and 0.5 μM SmCHB to a substrate concentration of 2 g/l. The reactions were incubated for 24 h at 37 C at 800 rpm, and the resulting samples were used to determine the amount of insoluble oxidized products. Compounds in SmChiA/SmCBH degraded samples were quantified as described later.

Quantitative analysis of chitobionic acid (GlcNAcGlcNAc1A)
Prior to product quantification, LPMO-generated products were degraded with only SmCHB (soluble fractions) or a combination with SmChiA and SmCHB (for total or insoluble fractions) to yield a mixture of nonoxidized monomeric GlcNAc and the oxidized dimer, chitobionic acid, which consists of a GlcNAc and an oxidized GlcNAc in the aldonic acid form (GlcNAc1A). Analysis and quantification of GlcNAcGlcNAc1A were carried out using an RSLC system (Dionex) equipped with a 100 × 7.8 mm Rezex RFQ-Fast Acid H+ (8%) (Phenomenex) column operated at 85 C. Samples of 8 μl were injected to the column, and sugars were eluted isocratically using 5 mM sulphuric acid as mobile phase with a flow rate of 1 ml/min. Standards of GlcNAcGlcNAc1A (10-500 μM) were used for quantification. GlcNAcGlcNAc1A was generated in house by complete oxidation of N-acetylchitobiose (Megazyme; 95% purity) by the Fusarium graminearum chitooligosaccharide oxidase as previously described (58,59).

Determination of apparent Tm
The apparent T m of the proteins was determined according to a protein thermal shift assay (Thermo Fisher Scientific) based on using SYPRO orange, a fluorescent dye, to monitor protein unfolding (60). The quantum yield of the dye is significantly increased upon binding to hydrophobic regions of the protein that become accessible as the protein unfolds. The fluorescence emission (relative fluorescence unit) was monitored using a StepOnePlus real-time PCR machine (Thermo Fisher Scientific). T m was calculated as the temperature corresponding to the minimum value of the derivative plot (−d[relative fluorescence unit]/dT versus T; Fig. S1). 0.1 g/l LPMO in 50 mM sodium phosphate buffer (pH 7.0) was heated in the presence of the dye in a 96-well plate from 25 to 95 C, over 50 min. For each protein, the experiment was carried out in quadruplicates (i.e., n = 4).

Binding studies with CjCBM5 and CjCBM73
Binding studies were performed as previously described (25). The equilibrium binding constants (K d ) and binding capacity (B max ) were determined for CjCBM5 and CjCBM73 by mixing protein solutions of varying concentrations (0, 20, 50, 75, 100, 150, 300, and 500 μg/ml for CjCBM5 and 0, 10, 20, 50, 75, 100, 150, and 300 μg/ml for CjCBM73; protein concentration was determined by A 280 ) with 10 mg/ml preoxidized (see aforementioned one) or untreated α-chitin. Before adding the chitin, A 280 was measured for each of the prepared protein solutions (in 50 mM sodium phosphate buffer, pH 7.0), to create individual standard curves for each protein. After addition of chitin, the solutions were placed at 22 C in an Eppendorf Comfort Thermomixer set to 800 rpm for 60 min. Subsequently, samples were filtered using a 96-well filter plate (Millipore), and the concentration of free protein in the supernatant was determined by measuring A 280 . All assays were performed in triplicate and with blanks (buffer and 10 mg/ml α-chitin). The equilibrium dissociation constants, K d (μM), and substrate-binding capacities, B max (μmol/g α-chitin), were determined by fitting the binding isotherms to the one-site binding equation, where P represents protein: ½P bound ¼ B max ½P free =ðK d þ½P free Þ, by nonlinear regression using the Prism 6 software (GraphPad Software, Inc).

NMR spectroscopy
NMR spectra of 70 μM CjCBM5 and CjCBM73 in NMR buffer (25 mM sodium phosphate and 10 mM NaCl, pH 5.5) containing 10% D 2 O were recorded at 25 C on a Bruker Ascend 800 MHz spectrometer with an Avance III HD (Bruker Biospin) console equipped with a 5 mm Z-gradient CP-TCI (H/ C/N) cryogenic probe at the NV-NMR-Centre/Norwegian NMR Platform at NTNU, the Norwegian University of Science and Technology. 1 H chemical shifts were referenced internally to the water signal, whereas 13 C and 15 N chemical shifts were referenced indirectly to water based on the absolute frequency ratios (61). Backbone and side-chain assignments of CjCBM5 and CjCBM73 were obtained using 15 N-heteronuclear single quantum coherence (HSQC), 13

Structure elucidation
The NMR data were recorded and processed with TopSpin version 3.6 (Bruker), and analyzed with CARA version 1.5.5 (63). For structure determination, 3D 13 C-edited and 15 Nedited NOESY-HSQC spectra as well as 2D 1 H-1 H NOESY spectra were recorded. NOE crosspeaks were manually identified, assigned, and integrated using the NEASY program within CARA version 1.5.5. Dihedral torsion angles (φ and ψ) were calculated from chemical shift data (C α , C β , H N , H α , H β , N, and C 0 ) by TALOS-N (64). Structures were calculated using the torsion angle dynamics program CYANA version 3.97 (65). The structure calculation started by generating 200 conformers with random torsion angles and the dihedral angles in each conformer were optimized using simulated annealing in 10,000 steps to fit the restraints. The 20 conformers with the lowest CYANA target function values were energy minimized using YASARA (66), first in vacuo, followed by using water as the explicit solvent and calculating electrostatics by applying the particle mesh Ewald method (67). In both these steps, the YASARA force field (68) was applied. The coordinates of the minimized CBM conformers have been deposited in the PDB under the IDs 6Z40 (CjCBM5) and 6Z41 (CjCBM73). The two structures were aligned using the combinatorial extension algorithm, which determines the longest continuous alignment between fragment pairs (69).

Titration of CBMs with chitohexaose
The interaction between CjCBM5 and chitohexaose, (GlcNAc) 6 was investigated using NMR spectroscopy. A 15 N-HSQC spectrum was recorded of a sample of 15 N-labeled CjCBM5 (70 μM) in 50 mM sodium phosphate containing 10% D 2 O and used as reference. Another sample of 15 N-labeled CjCBM5 (70 μM) with 10 mM (GlcNAc) 6 in 50 mM sodium phosphate containing 10% D 2 O was prepared. After recording a 15 N-HSQC spectrum of this latter sample, it was mixed with the reference sample to obtain the following concentrations of (GlcNAc) 6 0.2, 0.5, 1.0, and 5.0 mM, while maintaining a constant protein concentration. A new 15 N-HSQC spectrum was recorded at each (GlcNAc) 6  The same procedure was applied for the NMR titration of CjCBM73 with (GlcNAc) 6 , using CjCBM73 (500 μM) and the following (GlcNAc) 6 concentrations: 0.2, 0.6, 1.2, 2.8, 5.6, and 6.5 mM.

Modeling CG α-chitin
The crystal structure of α-chitin at 300 K (71) was used to generate an all-atom α-chitin surface composed of 12 chains with 20 residues each, in UCSF Chimera, version 1.13.1 (72). The all-atom model was coarse-grained using the bead mapping and topology parameters proposed by Yu and Lau (73) and bead types from the CG Martini version 3.0.beta.4.17 force field (35). A rectangular simulation box was defined with the same size as the surface in the y (110 Å) and z (100 Å) dimensions and 150 Å in the x dimension.

CG simulations
The CG Martini version 3.0.beta.4.17 force field was used in combination with GROMACS version 5.1.4 (74) to simulate interactions between the CBMs and α-chitin. The constructs were coarse-grained using the Martinize2 program (75). An elastic network model (76) was used to constrain the overall structure of the CBMs. The beads in the chitin surface were kept in place by applying a harmonic potential with a force constant of 1000 kJ mol −1 nm −1 on the x, y, and z positions. Models of CjCBM5 or CjCBM73 were manually placed in the simulation box above the chitin surface by using PyMOL (Schrödinger, Inc) (77). The simulation box was filled with water beads, and the system was neutralized with beads corresponding to Na + and Cl − ions to an ionic strength of 0.15 M. The complex was energy minimized using a steepest-descent algorithm (100 steps, 0.03 nm maximum step size) prior to being relaxed for 1 ns, with a time step of 5 fs, using the velocity-rescale thermostat (78), Parrinello-Rahman barostat (79), and Verlet cutoff scheme (80). Simulations were run on the relaxed models with a time step of 20 fs, using the velocityrescale thermostat, Parrinello-Rahman barostat, and Verlet cutoff scheme in the isothermal-isobaric (NPT) ensemble. Frames were written every 1 ns for each trajectory. Representative conformations of CG models of CjCBM5 and CjCBM73 were backmapped to atomistic models by using the CG2AT2 program (81).

Adjusting the protein-chitin interaction strength in the Martini force field
Initially, binding between CBMs and chitin was not observed; therefore, drawing inspiration from Larsen et al. (82), we considered the following approach to modify the Martini version 3.0.beta.4.17 force field to promote CBM binding to chitin. First, the interaction strength (i.e., ε parameter in the Lennard-Jones potential) between chitin beads and protein beads was increased by 10%, and WT-MetaD simulations (see later) were run until binding between the CBMs and chitin was observed. Then, a binding path for each CBM, which included bound and unbound conformations, was selected from the WT-MetaD simulations, and umbrella-sampling simulations were performed on these conformations as described later. Finally, we tested different interaction strengths by generating topologies where the interaction strength was modified from 0% (unchanged) to 15% increase of the chitin-protein interaction strength and ran umbrella-sampling simulations for each interaction strength. Fig. S3 shows the free-energy surfaces calculated for each umbrella-sampling simulation using the weighted histogram analysis method (WHAM) (http://membrane.urmc. rochester.edu/?page_id=126) (83). Dissociation constants calculated from each free-energy surface are shown in Table 1.
Structures and functional roles of two chitin-binding CBMs WT-MetaD simulations WT-MetaD simulations (84) were performed using the PLUMED 2.5 plugin (85)(86)(87) and the Martini model where chitin-protein interactions had been increased by 10%. We used the distance (r chitin ) and number of contacts (cutoff distance, r 0 = 0.7 nm) between aromatic residues in the putative substrate-binding surfaces (CjCBM5: Y282, W283, and Y296; CjCBM73: W371, Y378, and W386) and the chitin surface in 15-μs long (15,000 frames) simulations as collective variables to sample the binding of CBMs to chitin. Gaussian hills were added every 10 ps, with a starting height of 2.0 kJ mol −1 , width of 0.5, and bias factor of 50. Fig. S4 shows the evolution of the collective variables and deposition of Gaussian hills over the course of the simulation.
Since the modeled chitin surface corresponds to the x,y plane in the simulation box, r chitin , was calculated using Equation 1 (shown below with CjCBM5 as an example), which uses the geometric center of the z coordinates, z gc , calculated using Equation 2. Here, z i is the z coordinate of each amino acid or chitin bead, and N is the number of beads in the amino acid or chitin.
The number of contacts between all beads in each amino acid and all beads in the chitin surface, c i , were calculated using the COORDINATION routine in PLUMED 2.5 (i.e., Equation 3), with a cutoff distance r 0 = 0.3 nm, r i is the distance between all beads in each amino acid in the CBMs and all beads in the chitin surface.
Weights for each frame, w i , were calculated from the bias in the WT-MetaD simulation using the REWEIGHT_BIAS routine in PLUMED 2.5 (see Ref. (88) and https://www. plumed.org/doc-v2.5/user-doc/html/_r_e_w_e_i_g_h_t__b_i_a_ s.html for details). The weighted average number of contacts, CcD w , was calculated using Equation 4, and errors in CcD w were estimated using block analysis (89).
Dissociation constants from umbrella-sampling simulations A binding path for each CBM that included bound and unbound conformations was selected from the WT-MetaD simulations. Umbrella-sampling simulations were performed by running twenty-three 100 ns-long replicas using the PLUMED 2.5 plugin, where the distance between the putative binding surface and the chitin surface (r) was restrained from 0.0 to 4.4 nm in steps of 0.2 nm using a harmonic restraint with a force constant of 100 kJ mol −1 nm −1 . Free-energy surfaces were calculated using the WHAM (http://membrane.urmc.rochester.edu/? page_id=126) (83), where the errors were estimated by Monte-Carlo resampling. Dissociation constants (K d ) were calculated from the free-energy surfaces (Fig. S3) by using Equations 5-7, where r 0 = 0.0 nm (i.e., the minimum distance that for which WHAM calculated a nonzero probability), r c = 1.85 nm, k B is Boltzmann's constant, T is the temperature in Kelvin, P 0 is the protein concentration in the simulations, N A is Avogadro's number, V box is the volume of the simulation box, and _ α ¼ 4:38 nm is the upper limit of r chitin .

Data availability
The data for NMR structures and their restraints have been deposited in the Research Collaboratory for Structural Bioinformatics Protein Data Bank (https://www.rcsb.org/) under the following PDB ID codes: 6Z40 (CjCBM5) and 6Z41 (CjCBM73). NMR chemical shift assignments have been deposited in the BioMagnetic Resonance Databank (BMRB) under the IDs 34519 (CjCBM5) and 34520 (CjCBM73). All the data and PLUMED input files required to reproduce the simulation results reported in this article are available online at https://github.com/gcourtade/ papers/tree/master/2021/CBM5-CBM73-MetaD-US and on PLUMED-NEST (www.plumed-nest.org), the public repository of the PLUMED consortium (85) as plumID: 21.015. r chitin ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi À z gc;Y282 −z gc;chitin Á 2 þ À z gc;W283 −z gc;chitin Á 2 þ À z gc;Y296 −z gc;chitin Á 2 q (1) All other data are available in the main text and SI Appendix.