Insights into an unusual Auxiliary Activity 9 family member lacking the histidine brace motif of lytic polysaccharide monooxygenases

Lytic polysaccharide monooxygenases (LPMOs) are redox enzymes involved in biomass degradation. All characterized LPMOs possess an active site of two highly conserved histidine residues coordinating a copper ion (the histidine brace), which are essential for LPMO activity. However, some protein sequences that belong to the AA9 LPMO family display a natural N-terminal His to Arg substitution (Arg-AA9). These are found almost entirely in the phylogenetic fungal class Agaricomycetes, associated with wood decay, but no function has been demonstrated for any Arg-AA9. Through bioinformatics, transcriptomic, and proteomic analyses we present data suggesting that Arg-AA9 proteins could have a hitherto unidentified role in fungal degradation of lignocellulosic biomass in conjunction with other secreted fungal enzymes. We present the first structure of an Arg-AA9, LsAA9B, a naturally occurring protein from Lentinus similis. The LsAA9B structure reveals gross changes in the region equivalent to the canonical LPMO copper-binding site, whereas features implicated in carbohydrate binding in AA9 LPMOs have been maintained. We obtained a structure of LsAA9B with xylotetraose bound on the surface of the protein, although with a considerably different binding mode compared with other AA9 complex structures. In addition, we have found indications of protein phosphorylation near the N-terminal Arg and the carbohydrate-binding site, for which the potential function is currently unknown. Our results are strong evidence that Arg-AA9s function markedly differently from canonical AA9 LPMOs but may nonetheless play a role in fungal conversion of lignocellulosic biomass.

glycoside hydrolases, carbohydrate esterases, polysaccharide lyases, and oxidoreductases. Lytic polysaccharide monooxygenases (LPMOs) are copper-dependent oxidoreductases shown to be pivotal in efficient plant biomass degradation (1). These enzymes activate molecular oxygen or hydrogen peroxide to introduce oxidative chain breaks into polysaccharide chains (2, 3) (Fig. 1). Some LPMOs are single domain enzymes, whereas others are linked to CBMs (carbohydrate binding modules) or domains of unknown function (4,5). LPMOs are classified into seven Auxiliary Activity families in the CAZy database, AA9-AA11 and AA13-AA16 (6), and have been found in bacteria, viruses, fungi, and recently in arthropod species and in plants (7,8). In the active site a copper ion is held by a motif denoted the histidine brace (His-brace) formed by the highly conserved N-terminal histidine and a second histidine later in the sequence (9). The His-brace motif is strictly conserved in all LPMOs studied to date (10-12) and is considered essential for LPMO activity. This is supported by mutational analysis of both AA9 and AA10 LPMOs (first performed on TtAA9E and SmAA10A, followed by later studies on MtAA9D and ScAA10C) in which replacement of these critical residues within the His-brace or the secondary coordination sphere of the copper ion resulted in reduced LPMO activity and hence reduced glycoside hydrolase boosting (13-16).
LPMOs display different substrate specificities, with activity demonstrated on cellulose, hemicelluloses, chitin, and starch (2,9,14,17-19) and interact with their polysaccharide substrates through a surface on which the His-brace is found (20-24). This surface can have varying contour and polarity, presumably dictating substrate specificity (19,23,25,26). Additionally, for some AA9 LPMOs activity has been demonstrated on soluble β-1,4-linked polysaccharides and also oligosaccharides (17,22,23,27,28), a feature that was instrumental in the determination of the first enzyme-substrate complex structures of an AA9 from Lentinus similis (LsAA9A) (22,29). The structures revealed polar residues in a loop denoted L3 (after notation in Ref. 25) interacting with cellooligosaccharides near the active site and demonstrated for the first time that a conserved Tyr (Tyr-203 in LsAA9A) located on the same surface was involved in substrate binding (22,23). In addition, these structures revealed a new and highly unusual lone pair-aromatic interaction between the pyranose ring O5 and the imidazole of the N-terminal His of the His-brace, further highlighting its importance for LPMO function (29).
In both Basidiomycetes and Ascomycetes fungal species, multiple AA9-encoding genes are found (more than 20 and 30 in Schizophyllum commune and Podospora anserina, respectively), the vast majority of which encode proteins with N-terminal His (either demonstrated or predicted). For several of these genes the expression and subsequent secretion of the proteins is readily induced by growth on plant biomass (30). Interestingly, some AA9 members have an intriguing substitution of the N-terminal His to an Arg (from here on referred to as Arg-AA9). This can be observed in genomes of species belonging to the Lentinus genus, where, for example, Lentinus tigrinus has 16 AA9s, one of which is an Arg-AA9 (https://genome.jgi.doe.gov/mycocosm/proteins-browser/browse;3gQkyP?p=Lenti7_1). Similarly, for L. similis at least seven proteins classified as AA9s are described in patent literature (31,32), and one of these proteins (GH61-5 in Ref. 32 and denoted LsAA9B from this point) has an N-terminal Arg.
This type of His to Arg substitution was first noted by Yakovlev and co-workers (33) in the Russulales fungus Heterobasidion irregulare in an AA9 denoted HiGH61G (HiAA9G from this point). Yakovlev et al. (33) found that genes encoding six AA9s, including HiAA9G, were up-regulated during growth on lignocellulosic biomass. We have previously reported comprehensive integrative omics studies of the saprotrophic white-rot fungi Pycnoporus coccineus BRFM310 (34) and Polyporus brumalis BRFM985 (35) of the Polyporales order. There we reported co-regulation of glycoside hydrolases (GH) commonly involved in plant biomass breakdown (e.g. GH5_5) and Arg-AA9 members (protein ID 1430659 and 1403153 on the JGI Mycocosm public database), which in each species were highly up-regulated during fungal growth on wheat straw (for the latter during solid-state fermentation). In addition, in a recent bioinformatics study (5) it was shown that Arg-AA9s are found in several fungal species almost entirely restricted to one phylogenetic class of largely wood-decaying members, namely the Agaricomycetes. These Arg-AA9 sequences clustered together (cluster 63 in Ref. 5), and appear to have been maintained for a novel function rather than simply being a decaying AA9 gene product. This indicates a putative biological function for the Arg-AA9s related to lignocellulosic biomass degradation, but to date no function has been ascribed to this novel AA9 subgroup.
Using bioinformatics, transcriptomic, and secretomic approaches, we have obtained data supporting a potential role for Arg-AA9s from Agaricomycetes species in plant biomass conversion. We present a high-resolution X-ray crystal structure of a naturally occurring protein from the Polyporales fungus L. similis, LsAA9B, the first for any AA9 with an N-terminal Arg. The LsAA9B structure has an overall fold highly similar to AA9 LPMOs, but in addition to the N-terminal His to Arg substitution, reveals extensive changes in residues equivalent to those forming the copper-binding site and the secondary coordination sphere of the copper (essential for fully functioning AA9 LPMOs). Thus, perhaps predictably, no LPMO activity could be detected for LsAA9B. Unexpectedly, however, the structure appears with a post-translational modification in the form of a phosphoserine (pSer-25), in a position adjacent to the N-terminal Arg side chain. In addition, we have obtained a complex structure of LsAA9B with xylotetraose (Xyl4) bound at the protein surface in a conserved cleft near the speculated phosphorylation site. Sequence alignment also indicates that other Arg-AA9 proteins contain potentially similar putative phosphorylation sites and carbohydrate-binding sites.
Our results strongly suggest a role related to biomass degradation for these Arg-AA9 proteins, which seem to be strikingly different compared with canonical AA9 LPMOs. The functional significance of the N-terminal His to Arg substitution, the phosphoserine, and the Xyl4 ligand requires further investigation, but the characterization and structure determination of LsAA9B presented here will serve as an important starting point for the elucidation of the function and exact biological role of Arg-AA9s.

Phylogenetic distribution and regulation of AA9s with N-terminal Arg
To assess the phylogenetic distribution of fungal AA9 family sequences with N-terminal Arg (Arg-AA9s), we analyzed AA9 sequences belonging to cluster 63 (5) in the CAZy database (36), which we found were restricted to species of the Polyporales, Agaricales, and Russulales orders (all of the Agaricomycetes class in Basidiomycetes fungi). Notably, the sequence alignment showed that residues critical for function in canonical AA9 LPMOs (His-1, His-68, His-142, Gln-151, and Tyr-153 in TtAA9E (14)) were all changed in the Arg-AA9 sequences (Fig. S1). Interestingly, among the Agaricales and Russulales sequences a small number displayed N-terminal Lys instead (and on a few occasions Asn). The Arg and Lys sequences clustered together phylogenetically, meaning that for species of these orders Arg and Lys are perhaps interchangeable at the N-terminal position. We found that many Arg-AA9 members have signal peptides that indicate they would be secreted. As an example, we found that for the Arg-AA9 of P. coccineus BRFM310 (JGI protein ID 1430659) the SignalP server (37) reports with high certainty a putative signal peptide with predicted cleavage before the Arg, meaning that the protein is most likely targeted for the secretory pathway.
To augment previous reports on transcriptional regulation of Arg-AA9-encoding genes during fungal growth on plant biomass (33-35), we sought additional evidence in other fungal species. Using RNA-seq analysis (as previously described in Ref. 38), we identified a number of genes in several Polyporales species (with around 70% sequence identity to LsAA9B, Fig. 2), which were up-regulated when the fungi were cultivated on cellulose or complex biomass compared with control growth on maltose as an easily up-takable carbon source ( [...] protein ID 248104), and Trametes elegans BRFM 1663 (JGI protein ID 360271), which were highly up-regulated at day 3 on Avicel, pine, aspen, and wheat straw. In addition, these were co-regulated with several other secreted CAZymes targeting plant biomass polysaccharides (Table 1 and Fig. 2).

Figure 2. Multiple sequence alignment of Arg-AA9 sequences found to be up-regulated during growth on aspen, pine, and wheat straw. The signal peptides are included for each sequence. The N-terminal Arg of the mature proteins is indicated with a star above the sequence. Numbering above the sequence alignment is according to the mature LsAA9B protein with Arg in the N terminus. Secondary structure elements of LsAA9B are shown above the alignment in yellow and red for β-strands and α-helices, respectively. The JGI protein ID is given next to each sequence. Residues with more than 70% sequence identity are indicated with blue (i.e. residues shared by at least seven of the nine protein sequences).
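The shading rule quoted for the sequence alignment (more than 70% identity, i.e. at least seven of the nine sequences) can be checked with a short sketch. The function names and the toy alignment columns are illustrative assumptions and are not taken from the original analysis; gaps are counted against conservation here, which is also an assumption.

```python
import math
from collections import Counter


def min_sequences_above(fraction: float, n_sequences: int) -> int:
    """Smallest count k for which k / n_sequences is strictly above `fraction`."""
    return math.floor(fraction * n_sequences) + 1


def column_is_conserved(column: str, fraction: float = 0.70) -> bool:
    """True if the most common residue in one alignment column is shared by
    more than `fraction` of all sequences. Gap characters ('-') are ignored
    when picking the residue but still count in the denominator."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return False
    top_count = Counter(residues).most_common(1)[0][1]
    return top_count / len(column) > fraction


# Nine sequences, as in the alignment described above.
print(min_sequences_above(0.70, 9))
# Toy columns (hypothetical): seven of nine identical, then six of nine.
print(column_is_conserved("RRRRRRRKK"), column_is_conserved("RRRRRRKKK"))
```

With nine sequences, six shared residues give 66.7% and fall below the threshold, whereas seven give 77.8%, which is why the cut-off corresponds to seven sequences.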

LsAA9B, an AA9 with an N-terminal Arg from L. similis, is secreted upon growth on hardwood pulp
Because the bioinformatics analysis and transcriptomic data suggested that Arg-AA9s might be functionally expressed during fungal growth on biomass, we sought to structurally and functionally characterize one of these proteins. The Polyporales fungus L. similis contains at least seven genes encoding proteins that are classified as AA9, of which LsAA9A has been well-characterized (22,23,29). Also among these proteins is a single domain Arg-AA9 protein of 221 amino acid (aa) residues, which we denote LsAA9B (GH61-5 in Ref. 32, GenBank accession MN265867). The LsAA9B sequence includes a signal peptide (MKTWAVLSSLALLASSVSA) that suggests that this protein is secreted by L. similis. Indeed, from secreted fractions obtained after induction experiments of L. similis with hardwood pulp, we identified peptides corresponding to the mature LsAA9B protein by electrospray ionization tandem-MS (ESI MS/MS) (Fig. 3 and Table S1), showing that the LsAA9B protein is produced extracellularly when the fungus grows on biomass. This is to our knowledge only the second report of identification of up-regulation of an Arg-AA9 at the protein level. The first one was of P. brumalis BRFM985 (JGI protein ID 1403153), which as reported in Table S3 in Ref. 35 was found in the fungal secretome when P. brumalis was grown on wheat straw.
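As a minimal sketch of how a mature N terminus is assigned once a signal peptide is known, the snippet below strips the LsAA9B signal peptide quoted above and labels the first mature residue. The cleavage site is supplied by the caller rather than predicted (no SignalP call is made), the function names and labels are hypothetical, and the residues after the N-terminal Arg are written as X placeholders because the mature sequence is not given in the text.

```python
# Signal peptide of LsAA9B as quoted in the text (19 residues).
SIGNAL_PEPTIDE = "MKTWAVLSSLALLASSVSA"

# Illustrative labels; only His, Arg, Lys, and Asn are discussed in the text.
N_TERMINAL_LABELS = {
    "H": "canonical AA9 (N-terminal His)",
    "R": "Arg-AA9",
    "K": "Lys variant",
    "N": "Asn variant",
}


def mature_sequence(precursor: str, signal_peptide: str) -> str:
    """Remove a known signal peptide from the start of a precursor sequence."""
    if not precursor.startswith(signal_peptide):
        raise ValueError("precursor does not start with the given signal peptide")
    return precursor[len(signal_peptide):]


def classify_n_terminus(mature: str) -> str:
    """Label a mature AA9 sequence by its first residue."""
    if not mature:
        raise ValueError("mature sequence is empty")
    return N_TERMINAL_LABELS.get(mature[0], "other")


# Placeholder precursor: signal peptide + Arg-1 + unknown residues (X).
precursor = SIGNAL_PEPTIDE + "R" + "XXXX"
print(classify_n_terminus(mature_sequence(precursor, SIGNAL_PEPTIDE)))
```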
Sequence comparison with well-characterized canonical AA9 LPMOs indicated that residues Arg-1, Asn-84, Leu-158, Gln-167, and Phe-169 in LsAA9B were equivalent to the His-brace, conserved Gln and His residues in the secondary coordination sphere, and a Tyr that occupies the axial position of the active site copper (which in TtAA9E are His-1, His-68, His-142, Gln-151, and Tyr-153 (14)). Thus, it appeared that, except for Gln-167, the residues important for AA9 LPMO activity (13,15) were changed in LsAA9B. LsAA9B was heterologously expressed in Aspergillus oryzae and purified for further biochemical studies (see "Experimental procedures" and Ref. 32). As expected, with the purified LsAA9B protein no enzymatic activity could be detected on cellulosic and hemicellulosic substrates (PASC, AZCL-cellulose, or AZO-Xylan) under the experimental conditions tested, whereas the previously characterized LsAA9A showed clear activity on PASC and AZCL-cellulose.

The LsAA9B X-ray crystal structure reveals a fold very similar to active AA9 LPMOs
Crystals of LsAA9B were obtained in a range of conditions with combinations of polyethylene glycols (PEGs) (see Table 2 for further details). Structures could be determined by molecular replacement (MR) in space group P2₁2₁2₁ to better than 1.60-Å resolution. We obtained structures of both a naturally glycosylated and a partly deglycosylated form of LsAA9B (LsAA9B and LsAA9B_deglyc). In addition, using the glycosylated LsAA9B batch, we obtained structures after soaking experiments with either a solution containing transition metals (LsAA9B_metalsoak) or a solution containing xylotetraose (LsAA9B-Xyl4). For all structures, all 221 residues corresponding to the mature protein could be modeled in the electron density and the final structures showed good refinement statistics (Table 3). The structure of LsAA9B revealed a typical LPMO topology ( [...] Table S2).

Table 1. Transcriptional regulation of genes coding for Arg-AA9 in six Polyporales species. Normalized transcript read counts after 3 days growth on Avicel (Av), aspen (As), pine (Pi), or wheat straw (WS) were compared to normalized read counts after 3 days growth on maltose. Coregulated genes were identified that shared similar transcript levels and transcription profiles on the five carbon sources.
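The comparison described for Table 1 (normalized read counts on each substrate relative to maltose) is commonly expressed as a log2 fold change. The sketch below assumes that convention and a pseudocount of 1 to avoid division by zero; neither is stated in the text, the actual procedure is the one described in Ref. 38, and the read counts shown are hypothetical.

```python
import math


def log2_fold_change(treatment: float, control: float, pseudocount: float = 1.0) -> float:
    """log2 ratio of normalized read counts, with a pseudocount (assumed)."""
    return math.log2((treatment + pseudocount) / (control + pseudocount))


def regulation_profile(counts: dict, control_key: str = "maltose") -> dict:
    """log2 fold change of every carbon source relative to the control."""
    control = counts[control_key]
    return {
        source: log2_fold_change(value, control)
        for source, value in counts.items()
        if source != control_key
    }


# Hypothetical normalized read counts for one gene after 3 days of growth.
example_counts = {"maltose": 99, "Av": 799, "As": 399, "Pi": 199, "WS": 1599}
for source, lfc in regulation_profile(example_counts).items():
    print(f"{source}: log2FC = {lfc:.1f}")
```

Genes with similar profiles across the carbon sources could then be grouped as co-regulated, for example by correlating these vectors, although the grouping method used for Table 1 is not specified here.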

The LsAA9B structures show striking differences compared with canonical LPMO active sites, but share structural features with AA9 LPMOs active on oligosaccharides
In all the LsAA9B structures the N-terminal Arg (Arg-1) and the residues Asn-84, Leu-158, Gln-167, and Phe-169 (equivalent to His-1, His-68, His-142, Gln-151, and Tyr-153 in TtAA9E, which are important for AA9 LPMO activity) were clearly defined in the electron density map (Fig. 4b). These residues in LsAA9B (Fig. 4c and Fig. S2) are arranged in a similar configuration compared with NcAA9C (the closest structural match, PDB 4D7U), as defined by the backbone positions, but are not associated with any metal or positioned compatibly with a likely metal binding function (guanidinium group of Arg-1 points away from Asn-84) (Fig. 4). This confirms that residues important for AA9 LPMO activity are changed in LsAA9B, as was indicated by sequence comparison (note that in other LPMO families with demonstrated activity Phe is found in positions equivalent to Phe-169). Furthermore, the structure does not indicate that nearby residues could functionally compensate for the loss of the conserved residues to provide a copper-binding site/LPMO activity. Near the N-terminal Arg, LsAA9B Tyr-206 is found (Fig. S2) in a position equivalent to Tyr-204 in NcAA9C and Tyr-203 in LsAA9A, shown to be involved in AA9 LPMO oligosaccharide interactions (21-23). In those structures the Tyr residues are well-defined and make hydrogen bonds to the backbone amide of the conserved His of the secondary copper coordination sphere (His-147 in LsAA9A and His-155 in NcAA9C). However, in LsAA9B the rotamer of the apolar Leu-158 side chain interferes with the hydrogen bonding of the backbone amide to Tyr-206, which in many of the structures was not well-defined [...] (23)), and Tyr-206 (Fig. S1). However, for Asn-159 and Glu-161 (equivalent to residues in LsAA9A interacting with the substrate in the minus subsites) there seems to be virtually no conservation (Fig. 5 and Fig. S1).

Table 2. Crystallization conditions for crystals from which datasets were collected. Drop sizes were 0.3 µl with protein:reservoir ratios of 3:1. Column headers: Primary precipitant; Additive a; Buffer b (pH); Notes; Structure.

Table 3. Crystallographic data and refinement statistics. The highest resolution shell is shown in parentheses.

Other structural features of LsAA9B
O-Linked and N-linked glycosylation at Thr-59 and Asn-134, respectively, were also observed in all of the LsAA9B structures. From Thr-59 one α-linked mannosyl unit could be modeled in all cases. In the LsAA9B structure, two N-acetylglucosamine (GlcNAc) units as well as one poorly defined mannosyl unit could be modeled from Asn-134. In the LsAA9B_deglyc structure, one well-defined and one less defined GlcNAc unit were modeled, indicating that the protein batch was not fully deglycosylated (as confirmed by MS).

Near Asn-134 a small pocket was found (mainly formed by Tyr-71 and a GlcNAc unit from the N-glycosylation site) (Fig. 5), which was often occupied by electron density that could be attributed to crystallization condition components (e.g. glycine, sulfate, or a MES molecule). In the LsAA9B_metalsoak structure a MES molecule could quite confidently be modeled in this position (Fig. 5b) interacting with His-65 and Tyr-71, and in part with the GlcNAc unit (Fig. 5c). A putative functional role is supported by full conservation of His-65 and Tyr-71 (in loop L3) among Arg-AA9s, whereas the glycosylated Asn-134 can be substituted with Phe/Tyr, which could assume a similar structural role in forming a pocket. Additional full conservation of three prolines (Pro-76, -79, and -82) suggests that the extended L3 loop has a conserved rigid structure in this AA9 subgroup (Fig. 5d and Fig. S1). In some canonical AA9 LPMOs charged or polar residues in the extended L3 loop are putative determinants of specificity toward soluble substrates.
For many of the crystal structures determined during this work, Fourier difference map density was visible proximal to Ser-25. Because additional evidence suggested a putative phosphorylation site in LsAA9B (see "Indications of a potential phosphorylation site in LsAA9B"), the final structure includes a phosphoserine (pSer-25), which fits the electron density well in two of the four structures (LsAA9B_deglyc and LsAA9B_metalsoak). In these structures, the pSer-25 and Ser-25 are modeled in alternative conformations with 70 and 30% occupancy, respectively (Fig. 4b).

LsAA9B is not a copper-binding protein
When inspecting the LsAA9B structures no bound metals were found. We speculated whether Arg-1 would be able to adopt a different conformation in response to addition of transition metals allowing for LsAA9B to coordinate a metal cofactor at this surface. Co-crystallization and soaking experiments with transition metals were attempted with solution mixtures containing Fe²⁺, Cu²⁺, Mn²⁺, and Co²⁺. X-ray diffraction data were collected at an appropriate wavelength of 1.35 Å (~9.184 keV) to ensure an anomalous signal would be obtained to allow location of any bound metal ions in the crystal structure. Following map generation, no peaks that could be interpreted as bound metals could be found when inspecting an anomalous Fourier difference map. The highest peaks were found near the sulfur atoms in the Cys residues of the protein, confirming that these metals did not bind in any of these experiments. None of the structures inspected during this work showed any sign of bound metals.
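The wavelength and photon energy quoted for the anomalous data collection are related by E = hc/λ. The sketch below reproduces that conversion and compares the resulting energy with approximate tabulated K absorption edges of the four metals used in the soaks. The edge energies are standard reference values and are not taken from the text; the function names are illustrative.

```python
# h*c expressed in keV·Å, derived from the SI values of h, c, and e.
HC_KEV_ANGSTROM = 12.398419843

# Approximate K absorption edge energies in keV (reference values, assumed here).
K_EDGES_KEV = {"Mn": 6.539, "Fe": 7.112, "Co": 7.709, "Cu": 8.979}


def wavelength_to_energy_kev(wavelength_angstrom: float) -> float:
    """Photon energy in keV for an X-ray wavelength given in Å."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


def metals_above_k_edge(energy_kev: float) -> list:
    """Metals whose K edge lies below the photon energy, sorted by symbol."""
    return sorted(metal for metal, edge in K_EDGES_KEV.items() if energy_kev > edge)


energy = wavelength_to_energy_kev(1.35)
print(f"{energy:.3f} keV")
print(metals_above_k_edge(energy))
```

At 1.35 Å the energy evaluates to about 9.184 keV, matching the value quoted above, and it lies above the K edges of all four metals, which is consistent with the stated aim of obtaining an anomalous signal from any of them.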
Metal binding was also investigated by thermal shift assays with differential scanning fluorimetry (DSF) and differential scanning calorimetry (DSC). For three completely independent measurements with DSF, the T_i for LsAA9B was 65.5 ± 0.75°C. The presence of copper ions at slightly over stoichiometric amounts (0.048 mM copper acetate) resulted in a decrease of T_i to 62.4°C. A similar approximate 4°C decrease of T_m was found with the addition of copper measured by DSC. No significant change of T_m was found for a range of other metals by DSC.
As a final step, microPIXE (Proton-induced X-ray Emission) analysis (42) was used to detect any additional elements present in the sample with no added metals, to ascertain whether LsAA9B copurified with any metal ions that might hint at function. The analysis did not reveal significant amounts of copper, iron, or any other metal typically associated with redox enzymes present in the sample above trace levels (Fig. S4 and Table S3) (calculations suggest a lower detection limit for copper of 4 × 10⁻³ atoms per protein molecule).

Indications of a potential phosphorylation site in LsAA9B
Although it did not reveal any natively bound metals, the PIXE analysis detected the presence of a single atom of phosphorus in two point spectra (Fig. S4 and Table S3) for every protein molecule in the sample. In many of the structures we obtained during this study, density was observed in close proximity to the side chain of Ser-25 upon inspection of the Fourier difference map. In light of these data, this density was interpreted as a phosphoserine. Modeling a phosphorylated Ser (pSer-25), with the phosphate group hydrogen bonding to the side chain of the N-terminal Arg (2.9 Å), led to an improved fit to the electron density map and model quality (improved R-factor and clash score). The position of the phosphate group does not appear to interfere with possible copper binding (Fig. S3C).

Following this observation, we tried to identify phosphorylation using MS. Making several measurements of the Endo H-deglycosylated LsAA9B batches with matrix-assisted laser desorption ionization time of flight (MALDI-TOF) we identified the largest peaks with m/z values that could correspond to LsAA9B with glycosylation (N-GlcNAc and O-Man) and a phosphorylation (23,255 Da + 203 Da + 162 Da + 80 Da = 23,670 Da). Using MALDI-TOF to measure the glycosylated LsAA9B we found that the m/z values could correspond with a single phosphorylation, two GlcNAc and varying degrees of mannosylation (m/z increasing by 162 Da for every mannosyl unit). MS techniques with a trypsin-digested preparation of LsAA9B only identified the nonmodified peptide (total sequence coverage of 55%); thus, conclusive evidence for phosphorylation is lacking.
However, it is interesting to observe that in positions corresponding to Ser-24 and Ser-25 in LsAA9B most of the analyzed Arg-AA9 sequences have Ser or Thr (Figs. 4b and 5, and Fig. S1), which are potential phosphorylation sites, as predicted by the NetPhos 3.1 Server (43). Thus, at least theoretically, it seems that a number of the Arg-AA9 sequences could be phosphorylated.

Investigation of oligosaccharide binding
As already stated, LsAA9B contains residues (e.g. Asn-26, His-66, and Tyr-206) equivalent to some of those involved in LsAA9A oligosaccharide binding (Fig. S2). When superimposing either the LsAA9A-Cell5 (PDB 5NLS) or LsAA9A-Xyl5 (PDB 5NLO) complex onto the LsAA9B structure, we found that some of these residues in LsAA9B came within hydrogen bonding distance of the Cell5 and Xyl5 ligands of the LsAA9A complex structures. In addition, we observed that the sugar moiety of the +2 subsite in LsAA9A (the C6-hydroxyl and the C2-hydroxyl for Cell5 and Xyl5, respectively) could interact with the putative phosphorylation in LsAA9B. Thus, LsAA9B carbohydrate binding was investigated by cocrystallization with monosaccharides, and crystal soaking experiments and thermal shift assays with oligosaccharides.
Addition of oligosaccharides at 50 mM concentration in solution caused very little change in T_i ( [...] (30-100 mM) because the presence of chloride ions has been shown to enhance oligosaccharide binding to canonical AA9 (22). No electron density that could be interpreted as cellooligosaccharides was identified at the protein surface.
However, we did identify a Xyl4 ligand bound at the surface of LsAA9B, but the binding mode is distinct compared with LsAA9A-Xyl5 (Fig. 6a). One explanation for this could be that glycosylation of a symmetry-related molecule occludes this binding site (Fig. 6b). Instead, the Xyl4 ligand bound in a small cleft formed by residues 41-49 (TGFIQPVSK) and 99-101 (TQS). Three xylosyl units of the Xyl4 ligand (numbered 1-3 from the reducing end) are well-defined and could easily be modeled in the electron density, whereas the fourth xylosyl unit at the nonreducing end is only partially defined (Fig. 6c). Many of the glycosidic torsion angles of the Xyl4 ligand deviate from ideal values, e.g. between xylosyl units 1 and 2 (in particular the Ψ-torsion angle) (Table S4).

Discussion
LsAA9B can be heterologously expressed and is a well-folded and stable protein, with melting temperatures comparable with enzymatically active members of the family (44-47), but showed no detectable LPMO activity under the conditions tested. The X-ray crystal structure of LsAA9B in fact confirms a complete disruption of the copper-binding site that is the hallmark of LPMOs (9) and absence of other metal-binding sites, even though the overall fold remains very similar to that expected. We clearly demonstrated that residues Arg-1 and Asn-84 in LsAA9B are positioned equivalently to the His-brace of AA9 LPMOs, but that neither of these (nor any other) residues coordinate copper. In all structures (with or without the putative phosphorylation site modeled and/or metals added), Arg-1 in fact assumes a rotamer conformation that is not optimal for forming a His-brace-like metal-binding site. PIXE spectroscopy (42), crystal soaking experiments with metal ions, and thermal shift analyses in solution failed to indicate binding of any metal. This is consistent with a survey of metal-binding sites in protein structures reporting that Asn and Arg interact with elements such as potassium, sodium, magnesium, and calcium more frequently than with transition metals (of which only manganese is reported), whereas in contrast His is the most common protein residue to bind copper (48). Thus, our work confirms the expectation that LsAA9B and other Arg-AA9s are very unlikely to be copper-dependent LPMOs.
At the same time, evidence presented here and elsewhere shows that a number of genes (in wood-degrading fungal Polyporales species) encoding Arg-AA9 proteins are up-regulated during fungal growth on plant biomass alongside lignocellulose-degrading CAZymes (most commonly members of GH and AA families). The majority of the Arg-AA9 sequences possess a signal peptide (with predicted processing before Arg), which is expected to target the mature N-terminal Arg proteins for the secretory pathway. Indeed, both LsAA9B and the Arg-AA9 of P. brumalis BRFM 985 (JGI protein ID 1403153) have been detected in fungal secretomic data. In addition, the mature LsAA9B protein with N-linked glycosylation is secreted when recombinantly produced in Aspergillus from a construct with the native L. similis signal peptide. Thus, there is circumstantial evidence that Arg-AA9 proteins, like conventional LPMOs, could be involved in plant biomass degradation. The open question remains how, and although this work does not provide a conclusive answer, the structural analysis suggests a number of new avenues for further investigation.
For example, we found that the relatively conserved L3 loop is likely structurally rigid in the majority of Arg-AA9 proteins and together with the Asn-134 glycosylation forms a small pocket. In the LsAA9B_metalsoak structure, a MES molecule interacts with the completely conserved His-65 and Tyr-71 in this pocket. It is possible that the MES molecule mimics biologically relevant interactions and, notably, MES bears some resemblance to phenolic compounds (e.g. p-coumaryl, coniferyl, and sinapyl alcohols) that make up lignin, and thus the possible connection to lignin degradation/detoxification pathways should be investigated in the future.
Structural comparison suggested that LsAA9B could bind oligosaccharides in a manner similar to LsAA9A, but this could not be confirmed by our structural studies, perhaps because of glycosylation from a symmetry-related molecule occluding this binding site. In contrast, we obtained an LsAA9B structure with a Xyl4 ligand bound at a distinct binding site, in a small cleft made up mostly by relatively conserved residues (no similar cleft is present in LsAA9A where this space is occupied by the peptide stretch VDNRVV formed by residues 43 to 48). It is difficult to establish whether the binding site is biologically relevant: the expected structural conservation of this region in Arg-AA9 and the fact that there is only one hydrogen bond of Xyl4 with a symmetry-related molecule suggest this is not a crystal artifact. On the other hand, the bound conformation of Xyl4 deviates considerably from what is usually observed in xylooligosaccharides and xylan (49) (Table S4), and Xyl5 in solution caused no thermal shift. The path connecting this binding site and the binding site in LsAA9A is definitely occluded by crystal contacts, and thus binding of longer oligosaccharides (connecting the two binding sites) is not possible in this crystal form. However, the discovery of a Xyl4-binding site warrants additional investigations of polysaccharide binding by this and other Arg-AA9s.
Finally, we found indications that Ser-25 could be phosphorylated in LsAA9B as supported by some high-resolution structures and by PIXE data, and consistent with MALDI-TOF analysis, although a phosphorylated peptide could not be conclusively identified. Intriguingly, the phosphoserine modeled in the structure interacts with the N-terminal Arg (2.9 Å hydrogen bond) and could additionally potentially interact with bound polysaccharides near the binding sites observed for LsAA9A and LsAA9B.
Potentially similar phosphorylation sites (equivalent to Ser-24 and Ser-25 of LsAA9B, Figs. 4b and 5d) are also predicted for Arg-AA9 protein sequences (presented here in Fig. 2 and Table 1 and previously (34,35,38)), for which the transcription of the corresponding genes was up-regulated during fungal growth on plant biomass. In addition, in the Fungi Phosphorylation Database (FPD) (50) documented Ser phosphorylation sites in fungal species (including Aspergillus) can be found that are consistent with the LsAA9B sequence surrounding the Ser-25 phosphorylation, supporting the notion that phosphorylation could play a role in fungal plant biomass degradation. It is known that proteins can be phosphorylated in the secretory pathway (51,52), in mammalian cells by the Fam20C kinase (53,54), and phosphorylation does appear to play a role extracellularly during some biological events (e.g. during microbial host infection (55-57)). Thus, it is possible that phosphorylated Arg-AA9s could also exist extracellularly, because the evidence points to an extracellular location for these proteins.
In summary, our results indicate the importance of Arg-AA9s in fungal adaptation to biomass degradation. We have shown that the transcription of Arg-AA9 genes is up-regulated in fungal species belonging to the phylogenetic class of Agaricomycetes (of which many are wood or litter decayers) when cultivated on plant biomass. We have determined the first structure of an Arg-AA9 protein, LsAA9B (belonging to the wood-decaying Lentinus genus (58)), and have identified interesting structural features, including a potential carbohydrate-binding cleft, a small conserved pocket, and a potential phosphorylation site. Sequence conservation suggests that these features are found in the majority of the Arg-AA9s as well. These findings provide a solid framework for future investigations of this subgroup of AA9 and for the pursuit of their potential role in fungal biology. The original structures of AA9 (14) and AA10 (24) family members were of key importance toward the subsequent identification of the nature of LPMO action on biomass. We hope that the structure of LsAA9B will likewise act as the starting point for further characterization of these unusual AA9 family members.

Fungal transcriptomic and secretomic data
L. similis was cultivated at 30°C on a basic fungal medium including 0.5% (w/v) hardwood BCTMP (bleached chemithermomechanical pulp) for induction experiments. Samples were taken after 0, 3, 5, 7, and 10 days to analyze secreted proteins and identify LsAA9B with ESI-MS/MS. Tryptic digests were prepared by a filter-aided sample preparation method. Following digestion, the extracted peptides were analyzed on a nano LC-MS/MS system: UltiMate 3000 RSLC nano/LTQ Orbitrap Velos Pro (Thermo/Dionex). For protein identification the data were searched against the complete L. similis proteome (internal database) using the Mascot search engine (Matrix Science) in the Genedata Expressionist software (1% false discovery rate cutoff). Relative protein concentrations were calculated by label-free quantification from peptide volumes using a Hi3 standard method in Genedata Expressionist.
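As a rough illustration of the Hi3 idea mentioned above, the following Python sketch scores each protein by the mean of its three most intense peptide volumes. The function name, the data layout, and the example numbers are hypothetical, and the sketch is not the Genedata Expressionist implementation, whose details are not given here.

```python
from statistics import mean


def hi3_abundance(peptide_volumes, top_n=3):
    """Return a Hi3-style abundance score per protein.

    peptide_volumes maps a protein id to a list of peptide volumes.
    Proteins with fewer than top_n quantified peptides are skipped.
    """
    scores = {}
    for protein, volumes in peptide_volumes.items():
        if len(volumes) < top_n:
            continue  # not enough peptides for a Hi3-style estimate
        top = sorted(volumes, reverse=True)[:top_n]
        scores[protein] = mean(top)
    return scores


# Hypothetical peptide volumes in arbitrary units (not measured data).
example = {"LsAA9B": [120.0, 90.0, 300.0, 30.0], "other": [10.0, 20.0]}
print(hi3_abundance(example))
```

Averaging only the most intense peptides makes the score less sensitive to the poorly ionizing peptides that vary most between runs, which is one common motivation for this kind of estimate.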
P. coccineus BRFM 1662, T. ljubarskyi BRFM 1659, Leiotrametes sp. BRFM 1775, and T. elegans BRFM 1663 strains were obtained from the CIRM collection at the National Institute of Agricultural Research. Transcriptome and secretome data were collected from triplicate independent 3-day cultures in the presence of 20 g × liter⁻¹ of maltose, 15 g × liter⁻¹ of Avicel, 15 g × liter⁻¹ of ground wheat straw, 15 g × liter⁻¹ of ground pine wood, or 15 g × liter⁻¹ of ground aspen wood as the sole carbon source. RNA libraries were prepared and sequenced on Illumina HiSeq 2500 as described in Ref. 59. Secreted proteins were collected from the same cultures, diafiltered, and identified by ESI-MS/MS (as in Ref. 59). Transcript reads were analyzed as described in Ref. 38. For each strain, the genes with similar transcription profiles on the five carbon sources were grouped into nodes using the Self-organizing maps Harboring Informative Nodes with Gene Ontology Pipeline (SHIN+GO (34)). All sequence data are available on the Mycocosm public database at Joint Genome Institute (https://genome.jgi.doe.gov/programs/fungi) (60).

LsAA9B protein production, purification, and activity measurements
An AA9 protein from L. similis (LsAA9B) was recombinantly produced in A. oryzae MT3568 and purified as described in the patent literature (32) (similar to the previously characterized LsAA9A (22,31)). The mycelium was removed by filtration and the broth collected for protein purification by chromatography with a 50-mm diameter 167-ml Butyl-ToyoPearl 650 column (ToSoh BioSciences, Stuttgart, Germany) with a gradient of 0-100% of buffer A (25 mM Tris-HCl, 1.0 M ammonium sulfate, pH 7.5) and buffer B (25 mM Tris-HCl, pH 7.5). To ensure purity suitable for structural studies, two further purification steps were applied. After buffer change by ultrafiltration (Vivaspin, Sartorius) the sample was applied to a 100-ml Q-Sepharose column (Sigma) in 20 mM Tris-HCl, pH 8.0. The protein was eluted with a gradient of 0 to 500 mM NaCl. Fractions containing protein were further purified on a 26-mm Superdex 75 size exclusion (Sigma) column with isocratic 20 mM MES, 125 mM NaCl, pH 6.0. All protein batches were buffered in 20 mM MES, pH 6.0. Activity was measured as previously described by depolymerization of AZCL-cellulose (61) and PASC (9) in parallel on LsAA9A and LsAA9B in the presence of a variety of electron donors (pyrogallol, 4-OH-5-CH3-3-furanone, ascorbate, and cysteine) and under similar conditions. In addition, LsAA9B activity was measured (according to manufacturer instructions) on AZOxylan (purchased from Megazyme; product code S-AXBL) without and with equimolar concentrations (1 μM) of copper or manganese in the presence of ascorbate (1 mM).

Mass spectrometry analysis of purified LsAA9B
MS was performed using MALDI-TOF. The matrix was prepared by saturating a TA solution (0.1% (v/v) trifluoroacetic acid (TFA) and acetonitrile in a 2:1 (v/v) ratio) with sinapic acid. Matrix and protein sample were sequentially applied to the target plate (in a final 2:1 ratio) and left for solvent to evaporate. The TOF experiments were performed in linear mode calibrated for 10-50 kDa molecules.
Products following a trypsin digest of the heterologously expressed LsAA9B batches were analyzed using a Thermo Fisher Scientific Q-Exactive HF-X Orbitrap mass spectrometer that was operated in positive mode with an MS1 resolution of 120,000. A top 10 method was utilized with a resolution of 45,000 for MS2 fragment scans. Peptides were separated on a 15-cm column (75 μm inner diameter) packed in-house with 1.9 μm C18 particles. An EASY-nLC 1200 nano-LC system coupled to the mass spectrometer was utilized to separate injected peptides over an increasing gradient of buffer B (80% acetonitrile, 0.1% formic acid). The collected raw data were analyzed using MaxQuant (version 1.6.1.11).

Crystallization and data collection
Deglycosylation of protein for crystallization was performed in 20 mM MES, pH 6.0, 125 mM NaCl, by incubating about 10 μl of endoglycosidase H (from Roche Diagnostics, 11643053001) per mg of LsAA9B protein overnight at room temperature. Crystallization was carried out by sitting drop vapor diffusion in 96-well MRC-2 plates using an Oryx8 crystallization robot (Douglas Instruments) with drop sizes of 0.3-0.4 μl (with protein:reservoir:Milli-Q water ratios of 3:1:1, 3:1:0, 1:1:1, or 1:1:0) and 100-μl reservoirs, or in 24-well VDX plates with drop sizes of 2-4 μl and reservoirs of 1 ml. Morpheus screens (62) were set up using both a naturally glycosylated batch (11 mg/ml) and a deglycosylated batch (4-5 mg/ml) of LsAA9B. For all batches, crystals could be obtained in quite similar conditions (Table 2).
With an initial protein batch a crystal grew in Morpheus condition 28 in well C4 (Table 2), from which a complete 1.6 Å resolution X-ray diffraction dataset of 155 frames (155°) was collected. With a glycosylated LsAA9B, intergrown crystal plates were obtained in Morpheus condition 85 (Table 2). A single plate was mounted separately from a cluster and a dataset of 200 images (200°) was collected. Crystals were also obtained with a deglycosylated LsAA9B (Morpheus condition 6 in well A6) and were reproducible in MRC-2 plates with protein concentrations from 2 to 5 mg/ml and 24-48% (w/v) PPT4 (a 1:1:1 mixture of MPD (racemic), PEG1000, and PEG3350), and a dataset of 324 images (162°) was collected (Table 2). For all datasets, data collection was carried out with the crystals mounted in nylon loops at 100 K without additional cryoprotectant, because the Morpheus screen composition is already cryoprotecting.
Cocrystallization experiments with transition metals were performed in VDX plates with 33-42% (w/v) of PPT1 (a 2:1 mixture of PEGMME500 and PEG20,000) or PPT4 at pH 6.5, 7.5, or 8.5 and a metal mixture additive of 5-50 mM CuSO4, MnCl2, and CoCl2, but failed to provide crystals of sufficient quality. For soaking experiments, crystals were grown in PPT4 at pH 7.5 (with either carboxylic acid, amino acid, or monosaccharide additives). Transition metal mixture solutions were added to an approximate final concentration of 6-7 mM of each of Fe2+, Cu2+, Mn2+, and Co2+. Crystals were mounted at different time points from 1 min to 5 h 20 min and datasets (generally to better than 1.7 Å resolution) were collected as before but at a wavelength of 1.35 Å (9.184 keV), at which the anomalous signal should be significant enough to detect all of the metals. No peaks that could account for bound metals were found when inspecting an anomalous Fourier difference map. In fact, the highest peaks were found near the sulfur atoms in the cysteine residues.
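The wavelength and photon energy quoted above are related by E = hc/λ. The short Python sketch below reproduces the conversion and checks that the chosen energy lies above the K absorption edges of the four soaked metals, which is the condition for an appreciable anomalous signal from each. The K-edge energies are approximate tabulated values added here for illustration and do not come from the present work.

```python
# h*c expressed in keV * Angstrom
HC_KEV_ANGSTROM = 12.398419843


def wavelength_to_energy_kev(wavelength_angstrom):
    """Convert an X-ray wavelength in Angstrom to photon energy in keV."""
    return HC_KEV_ANGSTROM / wavelength_angstrom


# Approximate K absorption edges in keV (tabulated values, assumed here).
K_EDGES_KEV = {"Mn": 6.539, "Fe": 7.112, "Co": 7.709, "Cu": 8.979}

energy = wavelength_to_energy_kev(1.35)
print(f"{energy:.3f} keV")
for metal, edge in K_EDGES_KEV.items():
    # True means the beam energy is above the edge for this metal.
    print(metal, energy > edge)
```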
Optimized cocrystallization with monosaccharide additives (20-100 mM) was carried out in VDX plates with 33-42% (w/v) of either PPT1 or PPT4 at pH 6.5, 7.5, or 8.5 (because intergrown protein crystals in conditions with monosaccharide additives were observed in the screens). However, no crystals appropriate for data collection were obtained. Soaking experiments with oligosaccharides were carried out in MRC-2 plates using crystals grown with 6-11 mg/ml of protein in 27-42% (w/v) PPT1 at either pH 6.5 or 8.5 with additives of either 30 or 100 mM MgCl2 and CaCl2. Crystals were transferred to reservoir solutions containing 400-800 mM of cellotetraose (Cell4), cellopentaose (Cell5), or xylotetraose (Xyl4) and soaked for 40-80 min.
All data were collected at the MX beamline I911-3 at the MAX-IV laboratory, Lund (Sweden), or by remote access at beamline ID23-1 at the ESRF, Grenoble (France). All crystals were isomorphous, and the data were processed and scaled with XDS/XSCALE (63) in P2₁2₁2₁ with similar cell dimensions (Table 3).

Structure determination and refinement
From sequence alignment the closest AA9s with structures available at the time were TtAA9E (PDB entry 3EJA (14)) and NcAA9D (PDB entry 4EIR (40)) with 42.9 and 45.1% sequence identity, respectively. MR with MOLREP (64) using only the protein coordinates (or Sculptor (65) modified models) of either of PDB entries 3EJA or 4EIR as search models gave solutions, which following one round (10 cycles) of restrained refinement (with Refmac5 (66), CCP4 suite (67)) resulted in R-factors of ~40%, after which parts of the map could be relatively easily interpreted. Modeling in COOT (68, 69) followed by several rounds of restrained refinement resulted in an initial preliminary structure, which was used to refine additional structures obtained later. Structures of LsAA9B were refined in Refmac5 (CCP4 suite) (66, 67) using Rfree flags imported from the previous structure factor file. All LsAA9B structures obtained after this point were refined with anisotropic B factors for all protein atoms (including glycosylation) and isotropic B factors for all other atoms. Refinement and validation statistics are shown in Table 3.

Construction of multiple sequence alignments and structural comparison
Superpose (CCP4 suite) was used for calculation of the RMSD of the Cα trace using secondary-structure matching (39,67). Multiple sequence alignment was constructed using a standalone version of STRAP (STRuctural Alignments of Proteins) and programs within (70,71). For multiple sequence alignment, MAFFT was used (72). For the structure-based sequence alignment TMalign was used to align the Cα trace of the AA9 3D structures (73). Mapping of sequence on the LsAA9B structure was done using CAMPO (74) available in the PyMod 2.0 plugin for PyMOL (75).
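For readers unfamiliar with the quantity reported by the superposition step, the following Python sketch computes the RMSD over paired Cα coordinates. It assumes that the two structures have already been superposed and that equivalent residues have been paired; it does not perform the secondary-structure matching or the fitting done by Superpose. The function name and the toy coordinates are hypothetical.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD (in the input units) between two paired lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates in Angstrom: every atom pair is displaced by 1 A along z.
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
model_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0)]
print(ca_rmsd(model_a, model_b))
```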

Thermal shift assays
Thermal shift assays with DSF were carried out using intrinsic fluorescence on the NanoTemper Tycho NT.6 (NanoTemper Technologies) according to the manufacturer's instructions. Tycho NT.6 follows the unfolding process by recording sample fluorescence at 330 and 350 nm during thermal unfolding. A constant heating rate of 30°C/min is applied to the sample, heating from 35 to 95°C. A shift in the inflection temperature (Ti) to higher temperatures is usually indicative of ligand binding. 10 μl of each sample were prepared, incubated for 5 min, and then loaded in capillaries for measurements. LsAA9B was at a concentration of 1 mg/ml in 20 mM MES, pH 6.0. All measurements were carried out at least in triplicate. Thermal shift assays with capillary DSC (Northampton, MA) were carried out with a heating rate of 1.5°C/min with LsAA9B at pH 5.0 in the presence of 1 mM divalent metal ions (Cu2+, Ni2+, Fe2+, Zn2+, Mn2+, and Ca2+) or 1 mM DTPA (as chelating agent).
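The way a thermal shift is read from such traces can be sketched as follows: the 350/330 nm fluorescence ratio is differentiated numerically, the temperature of steepest change is taken as the inflection temperature, and the shift is the difference between samples with and without ligand. This is a minimal Python sketch on synthetic sigmoidal traces. It is not the instrument's algorithm, and all names and numbers are hypothetical.

```python
import math


def inflection_temperature(temps, f330, f350):
    """Temperature at which the F350/F330 ratio changes fastest."""
    ratios = [hi / lo for lo, hi in zip(f330, f350)]
    best_temp, best_slope = None, 0.0
    for i in range(1, len(temps) - 1):
        # Central finite difference of the ratio with respect to temperature.
        slope = (ratios[i + 1] - ratios[i - 1]) / (temps[i + 1] - temps[i - 1])
        if abs(slope) > abs(best_slope):
            best_temp, best_slope = temps[i], slope
    return best_temp


def synthetic_trace(temps, midpoint):
    """Idealized unfolding trace: constant F330, sigmoidal F350."""
    f330 = [1.0 for _ in temps]
    f350 = [0.8 + 0.4 / (1.0 + math.exp(-(t - midpoint) / 2.0)) for t in temps]
    return f330, f350


temps = list(range(35, 96))  # 35 to 95 degrees C in 1-degree steps
apo = inflection_temperature(temps, *synthetic_trace(temps, 60))
bound = inflection_temperature(temps, *synthetic_trace(temps, 63))
shift = bound - apo
print(apo, bound, shift)
```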

Particle-induced X-ray emission (microPIXE) analysis
LsAA9B, purified as described above, was prepared for microPIXE analysis by passing down a 16/600 Superdex 75 (GE Healthcare) column to buffer exchange into 20 mM ammonium acetate, pH 5.5, 200 mM KBr. The final sample was then concentrated to 7.7 mg/ml using a 10-kDa cutoff VivaSpin Concentrator (Sartorius). 0.1 μl of sample was placed onto a 4-μm polypropylene film (Prolene supplied by Fluxana GmbH & Co., Germany), which was then allowed to dry naturally in a closed environment to prevent dust contamination. The sample was mounted in the path of a 2.5-MeV proton beam with a beam diameter of ~2.5 μm at the Ion Beam Centre, University of Surrey. Prior to the experiment a glass standard was analyzed and validated to ensure accurate elemental quantitation. Data were collected by first scanning the proton beam across the sample to generate a set of elemental maps. Point spectra were then measured from two distinct regions of the sample based on the distribution of sulfur in the map, signifying where protein was located. The collected spectra were processed using the Q-factor method (76) as implemented in the OMDAQ-3 software and converted into the number of metal atoms per protein molecule as described in Ref. 42.
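The final conversion step can be illustrated with a short Python sketch that uses sulfur as an internal standard: the molar ratio of an element to sulfur is multiplied by the number of sulfur atoms per protein molecule known from the sequence. This assumes the sulfur-normalization approach commonly used in microPIXE and is not taken from the procedure in Ref. 42. The function name, the concentrations, and the sulfur count are hypothetical.

```python
# Standard atomic masses in g/mol (rounded).
ATOMIC_MASS = {"P": 30.974, "S": 32.06, "Mn": 54.938, "Cu": 63.546}


def atoms_per_protein(element, element_conc, sulfur_conc, sulfur_atoms_per_protein):
    """Estimate atoms of `element` per protein molecule.

    element_conc and sulfur_conc are mass concentrations in the same units.
    sulfur_atoms_per_protein is the Cys + Met count from the sequence.
    """
    moles_element = element_conc / ATOMIC_MASS[element]
    moles_sulfur = sulfur_conc / ATOMIC_MASS["S"]
    return moles_element / moles_sulfur * sulfur_atoms_per_protein


# Hypothetical point-spectrum values, not data from this study.
estimate = atoms_per_protein("P", element_conc=10.0, sulfur_conc=62.0,
                             sulfur_atoms_per_protein=6)
print(round(estimate, 2))
```

Normalizing to sulfur avoids the need to know the absolute amount of protein in the dried droplet, because both signals come from the same irradiated volume.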