Characterization and structural analyses of a novel glycosyltransferase acting on the β-1,2-glucosidic linkages

The IALB_1185 protein, which is encoded in the gene cluster for endo-β-1,2-glucanase homologs in the genome of Ignavibacterium album, is a glycoside hydrolase family (GH) 35 protein. However, most known GH35 enzymes are β-galactosidases, which is inconsistent with the components of this gene cluster. Thus, IALB_1185 is expected to possess novel enzymatic properties. Here, we showed using recombinant IALB_1185 that this protein has glycosyltransferase activity toward β-1,2-glucooligosaccharides, and that the kinetic parameters for β-1,2-glucooligosaccharides are not within the ranges for general GH enzymes. When various aryl- and alkyl-glucosides were used as acceptors, glycosyltransfer products derived from these acceptors were subsequently detected. Kinetic analysis further revealed that the enzyme has wide aglycone specificity regardless of the anomer, and that the β-1,2-linked glucose dimer sophorose is an appropriate donor. In the complex of wild-type IALB_1185 with sophorose, the electron density of sophorose was clearly observed at subsites −1 and +1, whereas in the E343Q mutant–sophorose complex, the electron density of sophorose was clearly observed at subsites +1 and +2. This observation suggests that binding at subsites −1 and +2 competes through Glu102, which is consistent with the preference for sophorose as a donor and unsuitability of β-1,2-glucooligosaccharides as acceptors. A pliable hydrophobic pocket that can accommodate various aglycone moieties was also observed in the complex structures with various glucosides. Overall, our biochemical and structural data are indicative of a novel enzymatic reaction. We propose that IALB_1185 be redefined β-1,2-glucooligosaccharide:d-glucoside β-d-glucosyltransferase as a systematic name and β-1,2-glucosyltransferase as an accepted name.

Carbohydrate chains are important polymer compounds for all organisms, which is attributed to the wide variety of carbohydrate chain structures. Such complexity of the structures is thought to be responsible for the repertoire of enzymes that synthesize and degrade carbohydrate chains. The functions and structures of these enzymes have become extensively diversified through molecular evolution. To date, various kinds of enzymes related to carbohydrates have been found and added to the Carbohydrate-Active enZYmes (CAZy) database (http://www.cazy.org) (1,2). This database classifies these enzymes, called CAZymes, into families mainly based on their amino acid sequences and is still expanding. However, obtaining carbohydrates is often difficult due to their rarity or inhomogeneity in nature, which limits exploration of novel enzymes.
Genes encoding β-1,2-glucan-degrading enzymes had not been identified until 2014, when a novel phosphorylase that can act on linear β-1,2-glucans was first found in Listeria innocua (20)(21)(22). (Hereafter, β-1,2-glucan represents a linear form unless otherwise noted.) This enzyme was named 1,2-β-oligoglucan phosphorylase (SOGP) and was given a new EC number (EC 2.4.1.333) (23). After that, a putative glycoside hydrolase family (GH) 3 enzyme in the SOGP gene cluster was found by functional and structural analyses to be a β-glucosidase that preferentially hydrolyzes Sop 2 (24). A carbohydrate-binding subunit of a putative ABC transporter in the same gene cluster was also found to be a Sop n s-binding protein (25). These results are the first biochemical evidence of the existence of a gene cluster involved in β-1,2-glucan metabolism. A large-scale preparation method for β-1,2-glucan has been established using SOGP and inexpensive sugars as materials (26,27). The prepared β-1,2-glucan was used for identification of endo-β-1,2-glucanases (SGLs) from a bacterium and a fungus. Both SGLs have been successfully identified and classified into new families (GH144 and GH162, respectively) (28,29). This finding enables us to explore SGL homologs and SGL gene clusters. An SGL homolog possessing a region of unknown function at the N terminus has been identified as a novel exolytic enzyme that releases Sop 2 from the nonreducing ends of Sop n s (30). A β-glucosidase preferentially acting on longer Sop n s and β-1,2-glucan has also been found in an SGL gene cluster in Bacteroides thetaiotaomicron (31). The structure-function relationships of these enzymes have also been analyzed (24,25,28,29,32). However, the abovementioned reports account for most of the studies on β-1,2-glucan-degrading enzymes, implying that the variety of β-1,2-glucan-associated enzymes is still insufficiently understood.
Here, we focus on an SGL gene cluster in the genome of Ignavibacterium album, a moderately thermophilic anaerobic bacterium found in a hot spring in Japan (33). This gene cluster includes two genes encoding putative GH144 enzymes, and β-1,2-glucan-related genes encoding a putative GH3 β-glucosidase, a putative GH94 enzyme (a homolog of the SOGP), and a putative Sop n s-binding protein in an ABC transporter (Fig. S1). The gene cluster also contains a gene (ialb_1185) encoding a putative GH35 enzyme (IALB_1185, hereafter IaSGT). While GH35 enzymes, which are distributed in a wide range of microorganisms, plants, and animals, are mainly β-galactosidases (β-galactosidase, exo-β-1,4-galactanase, and β-1,3-galactosidase) according to the CAZy database (34)(35)(36), several GH35 enzymes from Archaea have been found to be β-glucosaminidases (GlmAs), and GlmA from Thermococcus kodakaraensis (TkGlmA) hydrolyzes chitosan and chitooligosaccharides (37,38). The binding modes of natural substrates are not sufficiently understood, since only the glucosamine (GlcN) complex is available among GlmAs (39). Though many GH35 enzymes have been reported, as described above, no glucoside-acting enzyme has been reported in this family. In this study, we characterize the first β-1,2-glucan-associated GH35 enzyme biochemically and structurally and furthermore describe why the enzyme is a novel enzyme that should be given a new EC number.

Phylogenetic and sequence analysis of IaSGT
Phylogenetic analysis was performed using the amino acid sequences of characterized GH35 enzymes in the CAZy database and IaSGT. While eukaryotic GH35 enzymes are divided into two clusters, the bacterial and archaeal GH35 enzymes each form a single cluster (Fig. S2 and Table S1). Notably, IaSGT and its homologs form a cluster distinct from the known GH35 enzymes. Though the group of archaeal GlmAs is close to that of IaSGT in the phylogenetic tree, the amino acid sequence identity between these enzymes and IaSGT is low (only 28%).
IaSGT has no N-terminal signal peptide, suggesting that it is localized in the cytosol. Nucleophile and acid/base residues in GlmAs (Glu347 and Glu179 in TkGlmA, respectively) are conserved in IaSGT (E343 and E176, respectively) (Fig. S3). Though most substrate recognition residues at subsite −1 in the GlmAs (Tyr53, Glu103, Glu179, Glu347, and Tyr379 in TkGlmA) are also conserved in IaSGT, aspartate residues (Asp178 in TkGlmA) considered to be responsible for specificity to the amino group in GlcN are replaced by an asparagine residue in IaSGT (Asn175). Furthermore, subsite plus side regions (around Glu184 and Leu282 in TkGlmA) are not conserved at all. These differences imply that IaSGT has different substrate specificity from the GlmAs.

General properties
The purified IaSGT migrated as a single band corresponding to approximately 75 kDa/m on an SDS-PAGE gel (Fig. S4A). The enzyme was eluted at the time corresponding to 141 kDa/m on size-exclusion chromatography (Fig. S4B). Thus, this enzyme should form a dimer. Since IaSGT acted on Sop 2 to release glucose (Glc) as described below in detail, quantification of Glc was used for investigation of pH and temperature profiles. IaSGT showed high activity at pH 5.0−8.0 (over 90% activity relative to the highest) and was stable at pH 5.0 to 11.0 (Fig. S5A). IaSGT showed optimum activity at 55 °C and was stable up to 60 °C after incubation for 1 h (Fig. S5B), which is consistent with the growth temperature of the bacterium.
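The dimer assignment above follows from simple arithmetic on the two apparent masses. The following minimal sketch makes that step explicit; the function name is illustrative and not part of the study, and the only inputs are the two values quoted in the text (about 141 from size-exclusion chromatography and about 75 from SDS-PAGE).

```python
def estimate_subunit_count(native_mass: float, subunit_mass: float) -> int:
    """Estimate the oligomeric state as the nearest whole number of subunits.

    Both masses must be given in the same unit. This is a rough estimate only,
    because elution position also depends on molecular shape.
    """
    if native_mass <= 0 or subunit_mass <= 0:
        raise ValueError("masses must be positive")
    return round(native_mass / subunit_mass)


# Values quoted in the text above.
ratio = 141 / 75
subunits = estimate_subunit_count(141, 75)
print(f"native/subunit mass ratio = {ratio:.2f}; nearest oligomeric state = {subunits}")
```

A ratio close to, but slightly below, an integer is common for such estimates, so the result is best read as supporting evidence that is later checked against the crystal structure.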

Substrate specificity and reaction mode of IaSGT
Since most GH35 enzymes show β-galactosidase activity, the activity of IaSGT toward β-galactosides was investigated. However, IaSGT did not show any hydrolytic activity toward lactose (β-Gal-1,4-Glc) (Fig. 1A) or p-nitrophenyl (pNP)-β-galactopyranoside (Gal), an artificial substrate (less than 0.01 U/mg for 1 mM pNP-β-Gal). Nor did the enzyme act on various oligosaccharides such as cellooligosaccharides (Cel 2−5), laminarioligosaccharides (Lam 2−5), maltose (α-Glc-1,4-Glc), gentiobiose (β-Glc-1,6-Glc), or sucrose (α-Glc-1,2-β-Fru) (Fig. 1, A-C). On the other hand, IaSGT clearly showed activity toward Sop 2−5 (Fig. 1D). IaSGT produced oligosaccharides with both lower and higher DPs than those of the substrates. This disproportionation of DPs proceeded by transfer of a glucose unit. The products appeared not to show decreases in their average DPs even after reaction overnight. The enzyme did not show hydrolytic (glucose-releasing) activity toward Sop 3 (less than 0.01 U/mg for 20 mM Sop 3). These results indicated that the reaction mode of IaSGT was that of a glycosyltransferase. After the products at the beginning of the reaction with Sop 3 were fractionated by size-exclusion chromatography, only the product at the position corresponding to Sop 4 on the TLC plate was collected and analyzed by 1H-NMR (Fig. S6A). The chemical shifts of the product matched those of the reference Sop 4 completely (Fig. S6, B and C), indicating that the reaction product is Sop 4 and that the enzyme transfers a glucose unit to produce a β-1,2-glucosidic bond. Such elongation by transfer of Glc units has been found in a GH16 elongating β-transglycosylase, though the GH16 enzyme acts on β-1,3/1,4-linkages (40). To understand the DPs of the reaction products in detail, the products after the overnight reaction with Sop 5 were separated on the TLC plate by developing twice. As a result, Sop n s with DPs at least up to 9 were clearly detected (Fig. 1E).
This is consistent with the fact that Sop 2−9 were clearly detected by electrospray ionization-mass spectrometry (ESI-MS) (Fig. S7A). Sop n s with DPs of 10 or more could also be assigned. Though velocities of disproportionation appeared to slow down after 3 h of the reactions (Fig. 1D), this is probably because the proportion of Sop n s molecules with the highest and the lowest DPs to all the substrate molecules in the reaction solution was reduced.
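The argument that disproportionation, unlike hydrolysis, leaves the average DP unchanged can be illustrated with a toy simulation. The sketch below is not a kinetic model of IaSGT: it assumes, purely for illustration, that any molecule with DP of 2 or more can donate one Glc unit and that any other molecule in the pool can accept it with equal probability. The pool size, number of events, and seed are arbitrary.

```python
import random


def disproportionate(pool, n_events, seed=0):
    """Move single Glc units between randomly chosen molecules.

    pool: list of DPs, one entry per molecule.
    A donor must have DP >= 2 (it keeps at least one Glc unit); any other
    molecule may act as the acceptor (illustrative assumption).
    """
    rng = random.Random(seed)
    pool = list(pool)
    for _ in range(n_events):
        donors = [i for i, dp in enumerate(pool) if dp >= 2]
        if not donors or len(pool) < 2:
            break
        donor = rng.choice(donors)
        acceptor = rng.choice([i for i in range(len(pool)) if i != donor])
        pool[donor] -= 1      # donor is shortened by one Glc unit
        pool[acceptor] += 1   # acceptor is elongated by one Glc unit
    return pool


start = [3] * 200                       # a pool of Sop 3 molecules
end = disproportionate(start, n_events=1000)
print("mean DP before:", sum(start) / len(start))
print("mean DP after: ", sum(end) / len(end))
print("DP range after:", min(end), "to", max(end))
```

Because each event only moves a Glc unit from one molecule to another, both the number of molecules and the total number of Glc units are conserved, so the mean DP cannot change even though the spread of DPs widens. Hydrolysis would instead add a new molecule per event and lower the mean.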

Kinetics for Sop n s
In order to determine whether Sop n s are appropriate substrates for IaSGT, the kinetic parameters of the glucosyl transferase activity of the enzyme toward Sop 2−5 were determined (Table 1). Though IaSGT showed modest k cat values, the K m values were remarkably large, especially for Sop 2 and Sop 3 (120 mM and 300 mM, respectively), as a GH enzyme. Consequently, the k cat /K m values were quite small (less than 0.1 s −1 mM −1 for Sop 2 , Sop 3, and Sop 5 , and less than 0.5 s −1 mM −1 for Sop 4 ). Since the substrates in the transferase reaction are both donors and acceptors, the quite large K m values suggest that Sop n s are inappropriate as at least either donors or acceptors.

Determination of acceptor substrates
To determine optimal acceptors of IaSGT, the effects of various monosaccharides and disaccharides (1 mM D-mannose, D-glucose, D-galactose, D-xylose, D-talose, L-arabinose, D-fructose, L-rhamnose, D-gluconate, Lam 2, Cel 2, gentiobiose, sucrose, maltose, α,α-trehalose, or lactose) as acceptors on activity toward 2 mM Sop 2 were investigated. However, no remarkable increase in specific activity was found (less than 15% increase in specific activity, data not shown). Then, the glycosynthase activity of the E343G mutant in the presence of α-D-glucosyl fluoride (α-GlcF) as a donor was investigated by TLC analysis. Among the examined monosaccharides and disaccharides, the mutant showed glycosynthase activity only in the presence of glucose as an acceptor, though α-GlcF itself acted as an acceptor as well (Fig. S8). When pNP-α-Glc was used as an acceptor, a synthetic product was observed. Therefore, various aryl- and alkyl-glucosides were investigated as acceptors using the WT IaSGT in the presence of Sop 2 as a donor. Reaction products were detected regardless of the anomer of the acceptors, except for methyl-β-Glc (Fig. 2). Considering the kinetic analysis described later, spots of reaction products derived from methyl-β-Glc seemed to overlap those of Sop n s. ESI-MS analysis using phenyl-α-Glc and Sop 2 clearly detected peaks assigned to phenyl-α-Glc linked with one or two Glc units, though the peak corresponding to Glc was small (Fig. S7B). This result is consistent with detection of two spots indicated by arrows below the spot of phenyl-α-Glc in the TLC plate (Fig. 2).

Kinetic analysis of glucosides
We determined the kinetic parameters of the glycosyltransfer activity of IaSGT using Sop 2 as a donor and various glucosides as acceptors. The enzyme showed remarkably higher activity toward most of the investigated acceptors than that in the absence of the acceptors (Table 2). The K m values for the α-glucosides were in the range of 0.044−0.38 mM (approximately 300−2600 times smaller than that of Sop 2 without the acceptors), and the k cat /K m values for the acceptors were approximately 70−570 times higher than that of Sop 2 without an acceptor. In the case of β-glucosides, the K m values were in the range of 0.021−0.15 mM (approximately 770−5500 times smaller than that of Sop 2 without the acceptors), and the k cat /K m values were approximately 30−800 times higher than that of Sop 2 without an acceptor. Overall, β-glucosides showed smaller K m and k cat values than those of α-glucosides. However, the k cat /K m values for acceptors with both types of anomers were within a similar range and were as large as those typical of GH enzymes. Therefore, the enzyme can act on a wide range of glucosides with various aryl and alkyl groups and both types of anomers as acceptors. In addition, IaSGT acted comparably on amygdalin, a gentiobioside, as an acceptor, though its K m value was rather larger and its k cat /K m value rather smaller than those of the β-glucosides.
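The fold changes in K m quoted above can be roughly reproduced from numbers given in the text. The sketch below uses the rounded K m of 120 mM for Sop 2 alone from the kinetics section and the acceptor K m ranges from this paragraph; the function and variable names are illustrative. With the rounded reference value the ratios come out slightly higher than the quoted 300−2600 and 770−5500, presumably because the quoted ratios were derived from the unrounded value in Table 1 (an assumption, since the table is not reproduced here).

```python
KM_SOP2_ALONE_MM = 120.0  # rounded K m for Sop 2 without an added acceptor

# Acceptor K m ranges (mM) quoted in the paragraph above.
ACCEPTOR_KM_RANGES_MM = {
    "alpha-glucosides": (0.044, 0.38),
    "beta-glucosides": (0.021, 0.15),
}


def fold_decrease(reference_km_mm: float, acceptor_km_mm: float) -> float:
    """How many times smaller the acceptor K m is than the reference K m."""
    if reference_km_mm <= 0 or acceptor_km_mm <= 0:
        raise ValueError("K m values must be positive")
    return reference_km_mm / acceptor_km_mm


for name, (lowest_km, highest_km) in ACCEPTOR_KM_RANGES_MM.items():
    smallest_fold = fold_decrease(KM_SOP2_ALONE_MM, highest_km)
    largest_fold = fold_decrease(KM_SOP2_ALONE_MM, lowest_km)
    print(f"{name}: K m is {smallest_fold:.0f} to {largest_fold:.0f} times smaller")
```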
Next, the kinetic parameters for Sop 2 and Sop 3 as donors were determined in the presence of pNP-α-Glc as an acceptor in order to investigate donor specificity (Table 2). The K m and k cat values for Sop 2 were approximately two times smaller and six times higher, respectively, than those for Sop 3, resulting in an approximately 14 times higher k cat /K m value for Sop 2 than that for Sop 3.
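The relationship between the individual fold changes and the fold change in k cat /K m is multiplicative, which the short sketch below spells out. It uses only the approximate factors quoted above; the function name is illustrative. The product of the rounded factors is 12 rather than 14, and the gap is presumably a rounding effect (for example, factors nearer 2.2 and 6.4 would give about 14), which is an assumption because the underlying Table 2 values are not reproduced here.

```python
def efficiency_fold(kcat_fold_higher: float, km_fold_smaller: float) -> float:
    """Fold difference in k cat / K m between two substrates.

    (kcat_a / Km_a) / (kcat_b / Km_b) = (kcat_a / kcat_b) * (Km_b / Km_a)
    """
    return kcat_fold_higher * km_fold_smaller


# Approximate factors quoted in the text: k cat about 6 times higher and
# K m about 2 times smaller for Sop 2 than for Sop 3.
print("from rounded factors:", efficiency_fold(6, 2))
```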

Overall structure of IaSGT
First, we determined the ligand-free structure of wild-type IaSGT at 1.75 Å resolution after solving the initial phase by the SAD method (Table S2). Complex structures with various ligands were also solved, as shown in Table S2. The overall conformational change on binding of ligands is small (rmsd, approximately 0.2 Å). Thus, the wild-type (WT)-Sop 2 complex was used as a representative structure in Figure 3. There are two molecules in an asymmetric unit with almost the same configuration (rmsd between subunits A and B, 0.5 Å).

Figure 2. TLC analysis of glycosyltransfer activity toward various acceptor glucosides. M, markers containing 0.5% each of Glc and Sop 2−5; + and − represent with and without WT IaSGT in the reaction mixture, respectively. Acceptors are shown below the TLC plates. Arrows indicate the products derived from phenyl-α-Glc. The TLC plates were developed once with 80% (v/v) acetonitrile in water.

TIM-barrel domain.
A structural homology search was performed using the Dali server (http://ekhidna2.biocenter.helsinki.fi/dali/) (42). TkGlmA was found to be the enzyme most similar to IaSGT (Z-score, 42.6; rmsd, 2.5 Å). IaSGT has a similar quaternary structure and domain configuration to those of TkGlmA, though the C-terminal domain is smaller than that of TkGlmA (Figs. 3C and S3). IaSGT has much smaller numbers of hydrogen bonds and salt bridges at the subunit interface (25 and 8, respectively) than TkGlmA (58 and 32, respectively), based on interface analysis with PISA (https://www.ebi.ac.uk/msd-srv/prot_int/pistart.html) (43).

Binding mode of Sop 2
In order to understand the binding modes of ligands, the complex structures of WT with glucose (Glc) and Sop 2 and the complex of the E343G mutant with 1-deoxynojirimycin (DNJ) were determined (Fig. 4). The electron densities of Sop 2 were clearly observed in both subunits of the WT-Sop 2 complex (Fig. 4A). The fitted Sop 2 molecules in subunits A and B are the β- and α-anomers of Sop 2, respectively. When the GlcN recognition residues in the TkGlmA-GlcN complex are aligned with the corresponding residues in the WT-Sop 2 complex, the Glc moiety at the nonreducing end of Sop 2 is well superimposed with the GlcN molecule (Fig. 5A), suggesting that Sop 2 is bound at subsites −1 and +1. Most recognition residues at subsite −1 are spatially conserved in TkGlmA. Cys101 in TkGlmA is replaced by Leu100 in IaSGT, though Leu100 is disordered in subunit A and flips in subunit B. The O6 atom of the Glc moiety is recognized by Arg349, which has no corresponding residue in TkGlmA. One of the 2-hydroxy group recognition residues (Asn175) is replaced by Asp178 in TkGlmA, as described above. This observation implies that these residues are important to distinguish between Glc and GlcN.
One of the most distinctive features of the IaSGT structure is the firm recognition of a Glc moiety at subsite +1. Arg349 hydrogen-bonds with the 6-hydroxy group and O5 atom of the pyranose ring in the Glc moiety (Fig. 5B). The β-anomeric hydroxy group is also recognized by the same residue in subunit A, while the corresponding hydroxy (α-anomeric) group in subunit B is not. The 3-, 4-, and 6-hydroxy groups of the Glc moiety form hydrogen bonds with the main chain atoms of Gly278, Trp279, and Asp309, including those mediated by a water molecule. Glu176, a candidate for an acid/base catalyst, forms hydrogen bonds with the 2- and 3-hydroxy groups. Therefore, all (or all but anomeric) oxygen atoms in the Glc moiety are recognized by the enzyme, implying narrow specificity to the Glc moiety at subsite +1. Glu343 and Glu176 are also well superimposed with the nucleophile and acid/base residues of TkGlmA, respectively (Figs. 1 and 5A). The activities of E176Q and E343Q were investigated in the presence of 1 mM Sop 2 as a donor and 0.4 mM pNP-α-Glc as an acceptor. E176Q and E343Q showed less than 0.4% and 0.1% activity relative to the WT enzyme, respectively, suggesting that Glu176 and Glu343 are an acid/base and a nucleophile, respectively. At subsite −1 in the WT-Sop 2 complex, the distance between the carboxy group oxygen atom in the Glu343 side chain and the anomeric carbon atom (3.0 Å for both subunits) and the angle formed by the two atoms and the glycosidic bond oxygen atom in the Sop 2 molecule (159° for subunit A and 165° for subunit B) are suitable for nucleophilic (in-line) attack on the anomeric center of the Glc moiety by Glu343 (Fig. 5C). The carboxy group of the Glu176 side chain interacts with the scissile bond oxygen atom in the Sop 2 molecule (2.6 Å between the two atoms) and is located where anti-protonation can occur (44). These results suggest that Glu343 and Glu176 are the nucleophile and acid/base, respectively.
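The in-line attack geometry described above is defined by one distance and one angle: the distance from the nucleophile oxygen to the anomeric carbon, and the angle nucleophile O, anomeric C, glycosidic O with the anomeric carbon at the vertex. The sketch below shows how such values can be computed from three atomic positions. The coordinates are hypothetical and were constructed only so that the result matches the 3.0 Å and 159 degrees reported for subunit A; they are not taken from the deposited structure.

```python
import math


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in range(3)))


def angle_deg(a, vertex, c):
    """Angle a-vertex-c in degrees."""
    v1 = [a[i] - vertex[i] for i in range(3)]
    v2 = [c[i] - vertex[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(y * y for y in v2))
    cos_angle = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return math.degrees(math.acos(cos_angle))


# Hypothetical coordinates (in angstroms), for illustration only.
nucleophile_o = (-3.0, 0.0, 0.0)                      # carboxy O of the nucleophile
anomeric_c = (0.0, 0.0, 0.0)                          # C1 of the Glc at subsite -1
glycosidic_o = (1.43 * math.cos(math.radians(21.0)),  # O of the scissile bond
                1.43 * math.sin(math.radians(21.0)),
                0.0)

attack_distance = distance(nucleophile_o, anomeric_c)
attack_angle = angle_deg(nucleophile_o, anomeric_c, glycosidic_o)
print(f"O...C1 distance: {attack_distance:.1f} A")
print(f"O...C1-O angle:  {attack_angle:.0f} degrees "
      f"(deviation from linear: {180 - attack_angle:.0f})")
```

An ideal in-line displacement has an angle of 180 degrees, so the reported values of 159 and 165 degrees correspond to deviations of 21 and 15 degrees from linearity.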

Binding modes of DNJ and Glc
The electron densities of DNJ molecules were observed only in subunit A in the E343G mutant (Fig. 4B). The DNJ molecules are located at subsites −1 and +1 like Sop 2 in the WT-Sop 2 complex (Fig. S9). The DNJ molecule at subsite +1 is well superimposed with the corresponding moiety in the WT-Sop 2 complex. In contrast, the position of the other DNJ molecule is shifted slightly below that of the corresponding distorted Glc moiety (³H₂ conformation, Φ = 322.034°, θ = 125.390°, Q = 0.532, according to the Cremer-Pople parameter calculator) (45) of the WT-Sop 2 complex at subsite −1 (Fig. 5C). This is probably due to pushing up of the anomeric area by the Glu343 side chain, because the potential distance between the side chain oxygen atom in the carboxy group of Glu343 and the carbon atom corresponding to an anomeric one in the DNJ molecule is too close (1.3 Å). The absence of the side chain of the residue in the E343G mutant enables accommodation of the DNJ molecule without distortion (⁴C₁ conformation, Φ = 145.774°, θ = 4.556°, Q = 0.579).
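The puckering parameters quoted above (Q, θ, Φ) can be computed from the six ring-atom coordinates. The sketch below follows the general Cremer-Pople definitions for a six-membered ring; it is an independent illustration and not the calculator cited in the text. The atom order is assumed to follow the ring (for pyranoses, conventionally O5, C1, C2, C3, C4, C5), and the test ring is an idealized chair with made-up dimensions rather than a ligand from the structures.

```python
import math


def cremer_pople_six(ring):
    """Return (Q, theta_deg, phi_deg) for six ring atoms given in ring order."""
    n = 6
    if len(ring) != n:
        raise ValueError("expected exactly six ring atoms")
    # Shift the ring so that its geometric centre is at the origin.
    centre = [sum(p[k] for p in ring) / n for k in range(3)]
    pts = [[p[k] - centre[k] for k in range(3)] for p in ring]
    # Mean plane: normal = R' x R'' (Cremer-Pople definition).
    r1 = [sum(pts[j][k] * math.sin(2 * math.pi * j / n) for j in range(n)) for k in range(3)]
    r2 = [sum(pts[j][k] * math.cos(2 * math.pi * j / n) for j in range(n)) for k in range(3)]
    normal = [r1[1] * r2[2] - r1[2] * r2[1],
              r1[2] * r2[0] - r1[0] * r2[2],
              r1[0] * r2[1] - r1[1] * r2[0]]
    length = math.sqrt(sum(c * c for c in normal))
    normal = [c / length for c in normal]
    # Out-of-plane displacement of each atom.
    z = [sum(p[k] * normal[k] for k in range(3)) for p in pts]
    # Puckering coordinates q2 (with phase phi) and q3.
    q2_cos = math.sqrt(2 / n) * sum(z[j] * math.cos(4 * math.pi * j / n) for j in range(n))
    q2_sin = -math.sqrt(2 / n) * sum(z[j] * math.sin(4 * math.pi * j / n) for j in range(n))
    q3 = sum((-1) ** j * z[j] for j in range(n)) / math.sqrt(n)
    q2 = math.hypot(q2_cos, q2_sin)
    total_q = math.hypot(q2, q3)
    theta = math.degrees(math.atan2(q2, q3))             # 0..180 degrees
    phi = math.degrees(math.atan2(q2_sin, q2_cos)) % 360.0
    return total_q, theta, phi


# Idealized chair (hypothetical dimensions): atoms alternate above and below the plane.
ideal_chair = [(1.4 * math.cos(math.radians(60 * j)),
                1.4 * math.sin(math.radians(60 * j)),
                0.25 * (-1) ** j) for j in range(6)]
chair_q, chair_theta, chair_phi = cremer_pople_six(ideal_chair)
print(f"Q = {chair_q:.3f}, theta = {chair_theta:.1f} degrees")
```

For a perfect chair θ is 0 or 180 degrees (which pole is obtained depends on the numbering direction) and Φ is undefined, so the reported θ of about 4.6 degrees for DNJ corresponds to a nearly ideal chair, whereas θ of about 125 degrees for the Glc moiety at subsite −1 lies well away from either pole.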
When a crystal was soaked in a solution containing Glc, the electron densities of the Glc molecules were clearly observed in both subunits as the β-anomer at subsite +1 at almost the same positions as the DNJ molecule and the Glc moiety of Sop 2 (Figs. 4C and 5C). In contrast, Glc is absent at subsite −1 in the complex unlike in the DNJ and Sop 2 complexes. The potentially too close distance between the DNJ molecule at subsite −1 and the Glu343 side chain suggests that an α-anomeric configuration of a Glc molecule is not allowed at subsite −1. If two Glc molecules bound at both subsites −1 and +1, the 1-hydroxy group (β-anomer) of a Glc molecule at subsite −1 and the 2-hydroxy group of a Glc molecule at subsite +1 would collide. These observations suggest that binding at subsite +1 is obviously stronger than that at subsite −1.
The positions of Glu102 in the complexes with ligands should be noted, since the positions are related to substrate preference, as described later. In the ligand-free structure, the side chain of Arg349 is disordered beyond the C δ atom. The side chain of Glu102 is also disordered (subunit A) or flips out from subsite −1 (subunit B) (Fig. 6A). In the WT-Glc complex, Arg349 participates in substrate recognition to give a stable conformation, whereas Glu102 is still disordered or flips out as the ligand-free structure (Fig. 6B). Contrarily, in the WT-Sop 2 complex, the Glu102 residue in each subunit clearly faces subsite −1 to form hydrogen bonds with the Glc moiety, suggesting that the side chain of Glu102 must interact with a Glc moiety at subsite −1 to face in the direction of subsite −1 (Fig. 6C).

Binding mode of Sop 2 in the E343Q-Sop 2 complex
In order to understand the binding mode at subsite +2, crystals of WT IaSGT were soaked in a solution containing Sop 3. However, only the same structure as the Sop 2 complex was obtained (data not shown), probably due to Sop 2 production through enzymatic reaction in the crystals. In addition, only a poor-quality Sop 2 complex was obtained when the E343G mutant was used. Thus, crystals of the E343Q mutant (the mutant of the nucleophile) were soaked in a solution containing Sop 4 (Fig. 7A). Clear electron density fitting a Sop 2 molecule was observed at subsites +1 and +2 only in subunit A. Electron density beyond subsite +2, which would lie in the solvent region, is almost absent, suggesting that there is no subsite +3 in IaSGT. Importantly, as shown later, electron density fitting ligands was not observed at subsite −1 at all.
The Glc moiety at subsite +2 is hydrogen-bonded with the side chain of Gln106 and the main chain of Glu102. Tyr392 may also participate in binding to the O5 atom in the Glc moiety. The aromatic ring of Phe391 undergoes hydrophobic interaction with the C6 atom in the moiety. The position of the Glu102 side chain should be noted. In the WT-Sop 2 complex, both the side chain carboxy group oxygen atoms of Glu102 participate in the substrate recognition at subsite −1 (Fig. 7B). However, the distance between one of the oxygen atoms in Glu102 and a potential hydrogen atom generated by PyMOL on the C3 atom of the Glc moiety at subsite +2 (2.2 Å) is at the outer limit for the van der Waals distance between nonbonded hydrogen and oxygen atoms. The distance is smaller than the normally allowed one (2.4 Å), as shown by the overlapped spheres (46) (Fig. 7B). This observation suggests that binding at subsites −1 and +2 competes mildly through Glu102. This competition further suggests that Sop 2 can be a preferable donor but Sop n s (n ≥ 3) that have to bind at subsite +2 cannot, and that Sop n s are unfavorable as acceptors, though not completely excluded.
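The reasoning about the Glu102 contact rests on comparing one interatomic distance with two thresholds for a nonbonded H...O pair: the normally allowed distance (2.4 Å) and the outer limit (2.2 Å), both quoted above. A minimal sketch of that classification follows; the function name and the category labels are illustrative, not terminology from the study.

```python
def classify_h_o_contact(distance_a: float,
                         normally_allowed_a: float = 2.4,
                         outer_limit_a: float = 2.2) -> str:
    """Classify a nonbonded H...O distance (angstroms) against two limits."""
    if distance_a >= normally_allowed_a:
        return "allowed"
    if distance_a >= outer_limit_a:
        return "strained (between outer limit and normal limit)"
    return "clash (below outer limit)"


# Modeled H...O distance between Glu102 and the Glc moiety at subsite +2.
print(classify_h_o_contact(2.2))
```

A distance that sits exactly at the outer limit is strained but not forbidden, which fits the interpretation that occupation of subsites −1 and +2 competes only mildly.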

Binding modes of glucoside acceptors
We determined the complex structures of WT IaSGT with various glucoside acceptors listed in Table S2 to elucidate the binding modes of the acceptors. Aryl-and alkyl-glucosides (both anomers) with well-observed electron densities at their aglycones are shown in Figure 8. The Glc moieties of all these acceptors tightly bind to subsite +1 at almost the same position and orientation as in the WT-Glc complex (Fig. 9A). Though the positions of the aglycones deviate from each other, all aglycones shown in the figure bind within the same specific area.
In subunit B of the pNP-α-Glc complex, Val179, Phe180, and Leu183 in helix α6 and Leu100 form a hydrophobic dent to interact with the aromatic ring in pNP-α-Glc (Fig. 9B). Such hydrophobic interactions are also found in complexes with various aryl- and alkyl-glucosides regardless of the anomer. This hydrophobic hollow is clearly different from subsite +2 in position and is likely to be too small for monosaccharide moieties to be accommodated, judging from the van der Waals spheres of aromatic ring atoms fitting the hollow (Fig. 9B).
One of the noticeable features of these complex structures is the side chain of Leu100 and helix α6. There are two rotamers for the side chain of Leu100; one faces subsite +1 (in), and the other faces in the opposite direction (out). There are also two possible positions for helix α6; one is close to the substrate pocket (near), and the other is a little far from the pocket (far). Three types of combinations are found in the complex structures, as summarized in Table 3. Subunit A of the ligand-free structure adopts a conformation in which both the Leu100 side chain and helix α6 are close to the substrate pocket (type 1), while subunit B adopts a conformation in which both are away from the substrate pocket (type 3) (Fig. 10A). Subunit A of methyl-β-Glc, Sop 2 and DNJ complexes, and both subunits of the Glc complex adopt the type 1 conformation, which is likely to be observed if the aglycone moiety of ligands is absent or small ( Fig. 10B and Table 3). Large aglycone moieties are likely to push the Leu100 side chain away from subsite +1. The side chains of Leu100 (out) and Phe180 in helix α6 (near) are potentially located within the range of steric hindrance (around 1.5 Å). To avoid such hindrance, both side chains are disordered (e.g., subunit A of the pNP-α-Glc complex) or the side chain of Phe180 flips out (e.g., subunit A of the esculin complex) for type 2 (Fig. 10C). Otherwise, helix α6 is pushed out by Leu100 for type 3 (e.g., both subunits of phenyl-α-Glc) (Fig. 10D). The dent near helix α6 does not accommodate carbohydrate moieties due to its small size, but its conformational variety may allow various aglycones.

Classification of IaSGT
In this study, we found that IaSGT was a glycosyltransferase acting on β-1,2-glucosidic bonds. The enzymatic reaction and structure-function relationship are shown schematically in Figure 11. Interestingly, suitable acceptors were not Sop n s but various aryl- and alkyl-glucosides, though gentiobiosides are likely to be allowed as acceptors as well, based on the results of the assay with amygdalin. Many of the investigated glucosides are natural compounds; for example, ethyl-α-Glc is one of the umami compounds found in fermented foods such as liquor and some seasonings, and β-arbutin, gastrodin, salicin, and esculin are found in plants (47)(48)(49)(50). Such wide acceptor specificity among glucosides is attributed to the strong recognition of the Glc moiety at subsite +1 (Figs. 5 and 8) and the mobile hydrophobic region accommodating aglycones (Fig. 9B). The side chain of Glu102 binding to the Glc moiety at subsite −1 makes Sop n s unfavorable as acceptors due to steric hindrance with subsite +2, though the hindrance is not strict enough to inhibit production of Sop n s completely (Figs. 7B and 11B). This hindrance also makes IaSGT prefer Sop 2 over the other Sop n s as a donor (Fig. 11B; Tables 1 and 2). However, unlike in the case of Sop n s as acceptors, Sop 3 cannot be sufficiently excluded as a donor from an enzymological point of view (Tables 1 and 2). In the case of donors, binding at subsite −1 compensates for the disadvantage in binding at subsite +2. This perspective suggests that it is more appropriate to regard Sop n s, rather than only Sop 2, as donors for classification of enzymatic reactions.
The reaction mode of IaSGT comprises transfer of a β-1,2-linked glucose unit without any hydrolytic activity, unlike some GH35 hydrolases possessing glycosyltransfer activity such as β-galactosidase from Aspergillus niger (51). Though IaSGT is similar to SOGP in the transfer of a β-1,2-linked glucose unit, there is no space necessary for accommodation of an inorganic phosphate below the anomeric center attacked by a nucleophile. While IaSGT prefers aryl- and alkyl-glucosides regardless of the anomer as acceptors, SOGP requires sophorose as a minimum acceptor in the synthetic reaction. In addition, IaSGT is presumed to exhibit an anomer-retaining mechanism based on the conservation of catalytic residues in the GH35 family, while SOGP exhibits an anomer-inverting mechanism. Overall, IaSGT is a novel enzyme that should be given a new EC number. We propose β-1,2-glucooligosaccharide:D-glucoside β-D-glucosyltransferase as a systematic name and β-1,2-glucosyltransferase as an accepted name (Fig. 11A).

Comparison of substrate recognition residues with those of GH35 enzymes
Characterization and structural analyses of IaSGT revealed residues important for substrate recognition and catalysis. The recognition residues at subsite −1 and catalytic residues (Y52, E102, N175, E176, N275, E343, and Y378) in IaSGT are well conserved among IaSGT homologs, implying that IaSGT homologs share the same specificity at subsite −1 and a fundamental reaction mechanism.

Figure 8 (a white cartoon for IaSGT). B, surface representation of the WT-pNP-α-Glc complex. pNP-α-Glc is shown as a yellow green stick as in Figure 8A. Hydrophobicity is shown as a gradient of red to white (high to low hydrophobicity) in the surface representation (https://web.expasy.org/protscale/pscale/Hphob.Eisenberg.html) (62). Aglycone recognition residues with hydrophobic residues are colored as in the surface representation and are shown as sticks. The polypeptide chain around these residues is shown as a cartoon with the same color usage as for the surface representation. The Sop 2 molecules in the WT-Sop 2 and the E343Q-Sop 2 complexes are superimposed and are shown as thin white and light blue sticks, respectively, with the same color usage as in Figure 7. The van der Waals radii of aromatic carbon atoms are shown as cyan dots.
One of the interesting issues regarding GH35 is how the enzymes distinguish Glc/GlcN and Gal (O4 epimers). Rotamers of 6-hydroxy groups should also be taken into account, since the rotamer preference of the 6-hydroxy groups is affected by the orientations of 4-hydroxy groups to avoid proximity of the O4 and O6 atoms. Preferable conformations of 6-hydroxy groups of Glc and Gal are gg and tg, respectively. The O4 and O6 atoms are recognized by Glu102 and Tyr378 in IaSGT. The enzyme has Tyr397, which can potentially bind to the O6 atom in the tg rotamer. Though Tyr397 is not conserved among IaSGT homologs (Fig. S10), this residue is conserved in the β-galactosidase from Streptococcus pneumoniae TIGR4 (SpBgaC) (Fig. S11 right and left), implying the need for other factors to differentiate the specificity at subsite −1. Leu100, a residue spatially corresponding to Cys96 in the β-galactosidase, may disfavor the axial O4 orientation due to its hydrophobicity. Arg349, which recognizes O6 in the gg rotamer, is unlikely to be able to form a hydrogen bond with O6 in the tg rotamer, which may also affect the specificity for O4 epimers. In the case of TkGlmA, Cys101 is located at the position corresponding to Leu100 in IaSGT, as is Cys96 in SpBgaC (Fig. S11 middle), implying that Cys101 is unlikely to be involved in the distinction of O4 epimers (39). Instead, Trp308 at the bottom of the pyranose ring is likely to make the tg rotamer of a 6-hydroxy group unfavorable for TkGlmA. In addition, TkGlmA does not possess a residue corresponding to Tyr305 in SpBgaC, which recognizes an O6 atom with the tg conformation.

Diversity of IaSGT homologs
It should be noted that there is interesting diversity at subsite +1 and the aglycone-binding region among IaSGT homologs. The subsite firmly recognizes a Glc moiety by many residues and shows the strongest affinity for the Glc moiety in IaSGT based on structural observation (Figs. 4C and 5C). Among the recognition residues, Arg349 is the key for substrate specificity. Arg349 provides its side chain for substrate binding, whereas the other bindings are provided by main chain atoms (Gly278, Trp279, and Asp309) or by a catalyst that cannot be replaced by the other residues (Glu176) (Fig. 5). Nevertheless, Arg349 is not conserved among many IaSGT homologs (Fig. S10).
IaSGT also has a hydrophobic region for the binding of aglycones. The residues (Leu100, Val179, Phe180, and Leu183) recognizing alkyl and aryl groups in the acceptors are conserved as hydrophobic ones among the homologs. Among them, Leu100 plays a unique role in the conformational change of the aglycone recognition region (Fig. 9). However, Leu100 is replaced by a Met residue in many of the homologs (Fig. S10). Such differences among IaSGT homologs lead us to expect diversity in substrate specificity or preference.

Speculated physiological role of the IaSGT gene cluster
The IaSGT gene forms a gene cluster with the genes encoding carbohydrate transporters and GH family members (Fig. S1). The speculated physiological roles of these proteins are shown in Fig. S12. Extracellular β-1,2-glucans may be hydrolyzed by GH144 IALB_1179 protein to Sop n s or shorter β-1,2-glucans, since this protein was predicted to be located at the extracellular or cell inner membrane. The products might be transported into the periplasm by a TonB-dependent transporter, though there is no biochemical evidence of close homologs. Sop n s can be further degraded by GH144 homologs and be transported into the cytoplasm by an ABC transporter. IALB_1187 is a homolog of the Sop n s-binding protein from L. innocua (25). IALB_1177 is a homolog of a C-terminal SOGP domain in a cyclic β-1,2-glucan synthase, the protein being expected to release Sop 2 as a final product. However, there is no β-glucosidase homolog in the gene cluster, implying that Sop 2 is supplied for IaSGT as a donor.

Concluding remarks
We found that IaSGT is an enzyme exhibiting a novel glycosyltransfer reaction. It is intriguing that various aryl- and alkyl-glucosides of both anomers were acceptors for the enzyme. Furthermore, X-ray crystallographic analysis revealed structural features underlying the substrate specificity. This is also the first report of a β-glucoside-acting enzyme and of a glycosyltransferase in the GH35 family. Glycosyltransfer reactions are of use for oligosaccharide synthesis, but glycosyltransferases in GH families basically share their reaction mechanisms with glycoside hydrolases. Complete comprehension of their reaction mechanisms is required to freely control conversion between transferases and hydrolases, though this is still an open issue. Our findings provide important biochemical data for understanding the diversity of CAZymes and a fundamental structural basis for further investigation of the reaction mechanisms of CAZymes.

[Table footnotes, fragment] In the case of both subunits of α-arbutin, benzyl-α-Glc, 2-naphthyl-α-Glc, gastrodin, pNP-β-Glc, salicin, and amygdalin, and for subunits that are not shown in the table, the electron densities of the aglycone moieties of the ligands were partially obscure, the ligands were not observed, or binding modes of the ligands were apparently artificial due to potential steric collision with the Glu102 side chain participating in binding at subsite −1 (data not shown). b The electron density states of the Phe180 side chain are shown in parentheses. c The E343G mutant was used for the complex with DNJ. The WT enzyme was used for the other structures.

Phylogenetic analysis
The amino acid sequences of the characterized GH35 enzymes were retrieved from the CAZy database. The homologous sequences of IaSGT and GlmAs were obtained by means of Protein BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi). The sequences were aligned using MUSCLE, and the phylogenetic tree was constructed and visualized by the maximum likelihood method using MEGA X version 10.2.5 (52).

Cloning, expression, and purification
The gene encoding the IALB_1185 protein was amplified by PCR using KOD -Plus- (TOYOBO), with the genomic DNA of I. album (DSM19864) purchased from DSMZ as a template and the primers listed in Table S3. The amplified gene was inserted into the NdeI and XhoI sites of the pET30a(+) vector (Novagen). Mutants of IaSGT were generated using a PrimeSTAR Mutagenesis Basal Kit (TakaraBio) with the primers listed in Table S3. The constructed plasmid was introduced into E. coli BL21(DE3). The transformant was cultivated in LB medium containing 30 μg/ml kanamycin at 37 °C until the cell culture reached the log phase (A600 of 0.7), followed by induction using 100 μM IPTG at 20 °C with shaking at 200 rpm overnight. For expression of selenomethionine-labeled IaSGT, the plasmid was introduced into E. coli B834(DE3), and the transformant was cultivated in LeMaster medium containing 30 μg/ml kanamycin. The recombinant protein, expressed as a His6-tag fusion protein, was extracted from E. coli cells by sonication and purified by affinity chromatography using a HisTrap FF crude column (GE Healthcare, US) (linear gradient of 0-500 mM imidazole), followed by a HiTrap Butyl HP column (GE Healthcare) (linear gradient of 1.5-0 M ammonium sulfate). The buffer used for purification was 50 mM MOPS (pH 7.5) containing 100 mM NaCl. The recombinant protein was purified to homogeneity, as judged by SDS-PAGE. The IALB_1185 solution was dia-

[Figure legend, fragment] Figure 8 (green, cyan, and magenta for B-D, respectively). Leu100, Phe180, and aglycone moieties are shown as thick sticks. Val179, Leu183, and Glc moieties are shown as thin sticks. B, type 1 conformation. Subunit B of the ligand-free structure is superimposed semitransparently. Leu100 and Phe180, and helix α6 in the subunit are shown as thin sticks and a white cartoon, respectively. C and D, type 2 (C) and type 3 (D) conformations. Subunit A is shown translucently in the same way as subunit B in (B).

Analysis of the effects of pH and temperature
The effect of pH on the enzymatic activity of IALB_1185 was determined by measuring glycosyltransfer activity toward Sop 3 in various pH buffers (Britton-Robinson buffer, pH 3-12, and sodium acetate buffer, pH 4.5-5.5). Each reaction was performed in a reaction mixture (50 μl) containing 20 mM Sop 3 and 5.6 μg of IaSGT at 37 °C for 60 min. Similarly, the effect of temperature was determined by measuring the glycosyltransfer activity toward Sop 3 at various temperatures (20-80 °C). Each reaction was performed in the reaction mixture containing 20 mM Sop 3 and 5.6 μg of IaSGT in 20 mM sodium acetate buffer (pH 5.0) for 60 min. The enzymatic activity was determined by measuring the concentration of Sop 2 released from Sop 3 through glycosyltransfer activity. Sop 2 in the sample solution was hydrolyzed to glucose with 0.1 mg/ml HjCel3A (54), which can hydrolyze only Sop 2 among Sop n s, in 100 mM sodium acetate buffer (pH 5.5) at 40 °C for 30 min. The amount of glucose in the mixture was determined by the GOPOD method (24), and the concentration of Sop 2 derived through glycosyltransfer activity was calculated. All experiments were carried out in triplicate.
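The Sop 2 quantification described above reduces to simple arithmetic: complete hydrolysis of one Sop 2 molecule by HjCel3A yields two glucose molecules, so the Sop 2 concentration is half of the glucose concentration measured by the GOPOD method. The following Python sketch illustrates this under stated assumptions; the function name, the blank and dilution corrections, and the example readings are hypothetical and are not taken from this study.

```python
from statistics import mean, stdev

GLC_PER_SOP2 = 2  # sophorose is a glucose dimer, so full hydrolysis yields 2 Glc


def sop2_from_glucose(glc_mm, blank_mm=0.0, dilution=1.0):
    """Convert a GOPOD glucose reading (mM) into a Sop2 concentration (mM).

    blank_mm and dilution are hypothetical corrections for background glucose
    and for any dilution between the reaction and the GOPOD measurement.
    """
    return (glc_mm - blank_mm) * dilution / GLC_PER_SOP2


# Made-up triplicate glucose readings (mM), for illustration only.
readings = [2.0, 2.2, 1.8]
sop2 = [sop2_from_glucose(g) for g in readings]
print(f"Sop2 = {mean(sop2):.2f} +/- {stdev(sop2):.2f} mM")
```

Reporting the mean and sample standard deviation mirrors the triplicate design of the experiments; any other summary statistic could be substituted.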

Substrate specificity of IaSGT
We examined the activity of recombinant IaSGT toward various sugars (Cel 2−5, Lam 2−5, Sop 2−5). The reaction mixtures, comprising 10 mM carbohydrate and 70 μg/ml IaSGT (0.2 mg/ml for maltose, gentiobiose, melibiose, sucrose, and lactose) in 50 mM sodium acetate (pH 5.0), were incubated at 30 °C overnight. Each sample solution or marker containing 0.5% of each carbohydrate (0.5 μl) was spotted onto a TLC plate. The TLC plates were developed with 75% (v/v) acetonitrile in water. After soaking in 5% (v/v) sulfuric acid in methanol, the TLC plates were heated until bands were visualized sufficiently. The reaction for Figure 1E was performed using 10 mM Sop 5 and 0.1 mg/ml IaSGT. A marker was prepared by the reaction using SOGP from Enterococcus italicus (approximately 0.3 mg/ml) (55) in the presence of 1% Sop n s mixture (hydrolysates of β-1,2-glucan by SGL from C. pinensis) and approximately 10 mM sodium phosphate (pH 7.0) at 37 °C overnight. Each sample solution (1 μl) or the marker (0.5 μl) was spotted on the TLC plate, and the plate was developed twice.

Figure 11. Schematic representation of the reaction and the mechanism of substrate preference of IaSGT. A, IaSGT transfers a glucose moiety from the nonreducing end of Sop n to a glucoside and links it through a β-1,2-glucosidic bond. B, Glc molecules and moieties are shown as gray circles. The colors of ones bound at subsite +2 and covalently bound with the enzyme are black and white, respectively. Subsites in the enzyme are shown as semicircles and the numbers are assigned for the subsites. Unfavorable factors for binding are indicated by slash lines in subsite +2. The hydrophobic pocket is shown as a bold partial circle. Letter R represents an aryl or alkyl group.

TLC analysis for investigation of acceptors
For exploration of acceptor substrates of IaSGT, we used the IaSGT E343G mutant as a glycosynthase and the WT enzyme. The reaction mixture comprising 90 μg/ml IaSGT E343G, 10 mM α-GlcF, and 10 mM various glucosides in 20 mM sodium acetate buffer (pH 5.0) was incubated at 30 °C overnight. Similarly, each reaction mixture comprising 90 μg/ml IaSGT (WT), 10 mM Sop 2, and 10 mM various glucosides in 20 mM sodium acetate buffer (pH 5.0) was incubated at 30 °C overnight. Reaction products were visualized by TLC analysis as described above.

NMR analysis
The enzymatic reaction was performed in a reaction mixture comprising 350 μg/ml IaSGT, 50 mM Sop 3, and 5 mM sodium acetate buffer (pH 6.0) overnight. The reaction product was purified by size-exclusion chromatography using a Toyopearl HW-40F column (approximately 2 L gel), as described previously (23). Briefly, after the injection of the reaction mixture (approximately 10 ml), the sample was eluted with distilled water. The eluates were fractionated into 10 ml portions, and the fraction containing only Sop 4 was lyophilized. The resultant powder was dissolved in D2O, and acetone was added as an internal standard for calibration of chemical shifts. The chemical shifts were recorded relative to the signal of the methyl group of acetone (2.22 ppm). As a reference, Sop 4 was also dissolved in the same solvent. 1H-NMR spectra were recorded using a Bruker Avance 400 spectrometer (Bruker BioSpin).

ESI-MS analysis
The enzymatic reactions (100 μl) were performed overnight as described for the TLC analysis, except that 5 mM sodium acetate (pH 5.0) was used as a buffer. Amberlite MB4 (Organo) was added to each sample to remove ionic compounds. After the solution was collected, 60 μl of water was added to the beads, and the wash solution was also pooled. The samples were diluted 100-fold with methanol/water (1/1, v/v) containing 5 mM ammonium acetate. After filtration, the samples were loaded on a Sciex X500R QTOF (Sciex) in positive mode at a flow rate of 20 μl/min.

Assay of glucosyltransferase activity
Reaction mixtures comprising 82.6 μg/ml IaSGT and various concentrations of Sop n s (0.5-40 mM Sop 2, 0.5-40 mM Sop 3, 0.5-40 mM Sop 4, or 0.5-40 mM Sop 5) in 20 mM sodium acetate buffer (pH 5.0) were incubated at 37 °C for 1 h, and then the reaction was stopped by heat treatment at 100 °C for 5 min. To determine the kinetic parameters of IaSGT for Sop 2-5, Glc concentrations in the samples were determined by the GOPOD method based on the manufacturer's instructions (Megazyme) after the treatments described below. Glc was used as a standard. To determine activity toward Sop 2, the concentration of glucose released from Sop 2 was measured. For activity toward Sop 3, Sop 2 released from Sop 3 was hydrolyzed to Glc with HjCel3A. To determine the kinetics for Sop 4, Sop 5 generated by IaSGT was hydrolyzed to Sop 2 and Sop 3 with SGL from C. pinensis (0.12 mg/ml). Then the released Sop 2 was treated as in the assay for Sop 3. To determine activity toward Sop 5, the reaction products were reduced using a one-fifth volume of 1 M NaBH4. The same volume of 1 M acetate as that of the NaBH4 solution was added to each sample to neutralize NaBH4. Then, the samples were treated with 0.12 mg/ml of SGL from C. pinensis at 40 °C for 20 min to release Sop 2 from the reduced Sop 6. The released Sop 2 was quantified in the same way as in the assay for Sop 3. When the effect of an acceptor on activity toward Sop 2 was examined, the assay was performed using 35 μg/ml IaSGT in the presence of 2 mM Sop 2 and 1 mM acceptor. The acceptors used were mannose, D-gulose, Gal, D-xylose, Cel 2, Lam 2, gentiobiose, sucrose, maltose, α,α-trehalose, lactose, L-arabinose, L-rhamnose, fructose, and gluconate.
To determine the kinetic parameters for glycosides as acceptors, enzymatic reactions coupled with colorimetric determination were carried out as described below. The reaction mixture comprised appropriate concentrations of IaSGT, various concentrations of glycosides, 100 U/ml hexokinase, 100 U/ml G6PDH, 1 mM ATP, 1 mM thio-NAD+, and 10 mM MgCl2 in 20 mM sodium acetate buffer, pH 5.0. Each reaction mixture was incubated at 37 °C, and the increase in absorbance at 398 nm derived from thio-NADH was monitored for 10 min. In the reaction with Sop 3 as an acceptor, HjCel3A (0.1 mg/ml) was added to the reaction mixture to hydrolyze Sop 2 to glucose. The extinction coefficient used for thio-NADH was 11,900 M−1 cm−1, according to the manufacturer's instructions (https://www.oyc.co.jp/bio/IVD_research/coenzyme/ThioNAD.html). The E176Q and E343Q mutants were assayed by the coupling method described above in the presence of 1 mM Sop 2 and 0.4 mM pNP-α-Glc. All kinetic parameters in the study were determined by fitting the data to the Michaelis-Menten equation using GraFit Version 7.0.3.
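The data reduction for the coupled assay can be sketched in two steps: the absorbance slope at 398 nm is converted into a rate with the Beer-Lambert law using the extinction coefficient given above, and the rates are then fitted to the Michaelis-Menten equation. The Python sketch below is only an illustration and is not the GraFit procedure used in this study; the 1 cm path length, the function names, the golden-section search over Km, and the synthetic data are all assumptions.

```python
import math

EPSILON_THIO_NADH = 11_900.0  # M^-1 cm^-1 at 398 nm, value stated in the text


def rate_um_per_min(dA_per_min, path_cm=1.0):
    """Beer-Lambert conversion of an absorbance slope into uM/min.

    A 1 cm path length is an assumption; adjust it for the cuvette or plate used.
    """
    return dA_per_min / (EPSILON_THIO_NADH * path_cm) * 1e6


def fit_michaelis_menten(s, v, iters=200):
    """Least-squares fit of v = Vmax * S / (Km + S).

    For a fixed Km the best Vmax has a closed form, so only Km is searched,
    here by golden-section search on log(Km). Returns (Km, Vmax).
    """
    def vmax_and_sse(km):
        x = [si / (km + si) for si in s]
        vmax = sum(vi * xi for vi, xi in zip(v, x)) / sum(xi * xi for xi in x)
        sse = sum((vi - vmax * xi) ** 2 for vi, xi in zip(v, x))
        return vmax, sse

    positive = [si for si in s if si > 0]
    lo = math.log(min(positive) / 100.0)
    hi = math.log(max(positive) * 100.0)
    g = (math.sqrt(5) - 1) / 2  # inverse golden ratio
    for _ in range(iters):
        a = hi - g * (hi - lo)
        b = lo + g * (hi - lo)
        if vmax_and_sse(math.exp(a))[1] < vmax_and_sse(math.exp(b))[1]:
            hi = b
        else:
            lo = a
    km = math.exp((lo + hi) / 2)
    return km, vmax_and_sse(km)[0]


# Synthetic, noise-free data generated with Km = 2 mM and Vmax = 10 (arbitrary units).
conc = [0.5, 1, 2, 5, 10, 20, 40]
rates = [10 * c / (2 + c) for c in conc]
km_fit, vmax_fit = fit_michaelis_menten(conc, rates)
print(f"Km = {km_fit:.3f} mM, Vmax = {vmax_fit:.3f}")
```

The search assumes that the residual has a single minimum in Km over the scanned range, which is reasonable for well-behaved data but is not checked here. Dividing Vmax by the molar enzyme concentration would give kcat.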

Assaying of hydrolytic activity
For assaying of β-galactosidase activity, reaction mixtures comprising 82.6 μg/ml IaSGT and 1 mM pNP-β-D-galactopyranoside in 20 mM sodium acetate buffer (pH 5.0) were incubated at 37 °C for 1 h. An equal volume of a 0.5 M Na2CO3 solution was added to each reaction mixture, and then the absorbance at 405 nm of each sample was measured. The extinction coefficient of pNP used was 18,500 M−1 cm−1. In order to evaluate the hydrolytic activity toward Sop 3, a reaction mixture comprising 82.6 μg/ml IaSGT and 20 mM Sop 3 in 20 mM sodium acetate buffer (pH 5.0) was incubated at 30 °C for 1 h. The concentrations of Glc released from Sop 3 were determined by the GOPOD method described above. All experiments were carried out in triplicate.
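As a worked illustration of the pNP readout, the absorbance at 405 nm can be converted into the concentration of pNP released in the original reaction mixture. The Python sketch below uses the extinction coefficient given above and a twofold dilution for the equal volume of Na2CO3 solution; the 1 cm path length, the blank subtraction, the function name, and the example reading are assumptions for illustration only.

```python
EPSILON_PNP = 18_500.0  # M^-1 cm^-1 at 405 nm, value stated in the text


def pnp_released_um(a405, blank_a405=0.0, path_cm=1.0, dilution=2.0):
    """Return pNP released in the reaction mixture (uM) from an A405 reading.

    dilution=2.0 accounts for the equal volume of Na2CO3 solution added before
    the measurement; path_cm and blank_a405 are assumed, not taken from the text.
    """
    conc_m = (a405 - blank_a405) / (EPSILON_PNP * path_cm)
    return conc_m * dilution * 1e6


# Made-up reading, for illustration only.
print(f"{pnp_released_um(0.185):.1f} uM pNP released")
```

Dividing the result by the incubation time and the enzyme concentration would give a specific activity in the units of choice.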