The structure of a family 110 glycoside hydrolase provides insight into the hydrolysis of α-1,3-galactosidic linkages in λ-carrageenan and blood group antigens

α-Linked galactose is a common carbohydrate motif in nature that is processed by a variety of glycoside hydrolases from different families. Terminal Galα1–3Gal motifs are found as a defining feature of different blood group and tissue antigens, as well as the building block of the marine algal galactan λ-carrageenan. The blood group B antigen and linear α-Gal epitope can be processed by glycoside hydrolases in family GH110, whereas the presence of genes encoding GH110 enzymes in polysaccharide utilization loci from marine bacteria suggests a role in processing λ-carrageenan. However, the structure–function relationships underpinning the α-1,3-galactosidase activity within family GH110 remain unknown. Here we focus on a GH110 enzyme (PdGH110B) from the carrageenolytic marine bacterium Pseudoalteromonas distincta U2A. We showed that the enzyme was active on Galα1–3Gal but not the blood group B antigen. X-ray crystal structures in complex with galactose and unhydrolyzed Galα1–3Gal revealed the parallel β-helix fold of the enzyme and the structural basis of its inverting catalytic mechanism. Moreover, an examination of the active site reveals likely adaptations that allow accommodation of fucose in blood group B active GH110 enzymes or, in the case of PdGH110, accommodation of the sulfate groups found on λ-carrageenan. Overall, this work provides insight into the first member of a predominantly marine clade of GH110 enzymes while also illuminating the structural basis of α-1,3-galactoside processing by the family as a whole.

The ABH glycan antigens define the ABO blood types, the appropriate matching of which is a key consideration in blood transfusions and organ transplantations. The O blood group, defined by the smaller H antigen, is considered a universal donor. Accordingly, enzymatic conversion of the more elaborate and immunogenic A/B antigens to the H antigen provides an attractive route to avoid antigen mismatching and create a ready supply of universal donor blood. A campaign to identify enzymes that could hydrolyze terminal α-1,3-linked N-acetyl-D-galactosamine and/or D-galactose from the A and B antigens, respectively, and thereby provide a set of biocatalytic tools to perform this antigen switching, revealed a set of enzymes with specific B antigen exo-α-1,3-galactosidase activity (1, 2). These enzymes were the founding members of glycoside hydrolase family 110 (GH110), of which all currently characterized members are exo-α-1,3-galactosidases that are able to hydrolyze the B antigen glycan (Galα1–3(Fucα1–2)Gal-R) (1–3). The GH110 family presently has over 400 classified members in the Carbohydrate-Active Enzyme Database (4). These putative enzymes are encoded by genes found in the genomes of a variety of environmental and host-adapted bacteria. Most recently, genes encoding proteins that are classified by amino acid sequence identity into GH110 have been identified in polysaccharide utilization loci (PULs) from human gut microbiome bacteria and marine bacteria; these PULs are postulated to target λ-carrageenan, which is a polysaccharide found in marine algae (5–7).
Most of the photosynthetically fixed carbon present on Earth is in land plants, with the amount of photosynthetically fixed carbon in the oceans (i.e. present in microalgae and macroalgae (seaweed)) estimated to be only ~1/200 that of terrestrial plant biomass. However, the annual turnover rate (total mass/time) of photosynthetically fixed carbon in the oceans is roughly equal to that on land, indicating a highly dynamic process with a normalized rate of recycling that is ~2 orders of magnitude greater than on land (8). This is a remarkable biotransformation that occurs in the marine environment, and it is largely made possible by the metabolic capabilities of marine microbes, which return the carbon locked in algal storage and structural polysaccharides to the global carbon cycle (9). The dedicated metabolic pathways, which include numerous glycoside hydrolases, deployed by marine microbes to perform this biotransformation are uniquely adapted to the distinctive chemical compositions of marine algal polysaccharides. However, the identification and molecular details of several pathways that target major classes of marine polysaccharides, such as λ-carrageenan, remain to be uncovered. Understanding these pathways is key to, for example, the development of complete biogeochemical models of the global carbon cycle (10), comprehending the rise and fall of algal blooms (11), unlocking farmable seaweed biomass feedstocks for the generation of biofuels or other high-value products (12), and even identifying and engineering the metabolic capabilities of the human gut microbiome (7, 13, 14).
Carrageenans are a family of abundant marine polysaccharides comprising unbranched, high-molecular-mass sulfated galactans. They are typically found in the cell walls of marine red macroalgae, where they can make up to 50% of the dry mass. The backbone comprises D-galactose with alternating α-1,3- and β-1,4-linkages. The occurrence and specific patterns of sulfate esters on the free hydroxyl groups of the galactose backbone, along with the presence or absence of 3,6-anhydro-D-galactose, give rise to numerous different carrageenan families (15–17). λ-Carrageenan is made of neocarrabiose motifs in which D-galactose-2,6-sulfate is α-1,3-linked to D-galactose-2-sulfate, and this disaccharide is joined by β-1,4-glycosidic linkages, forming a linear λ-carrageenan polymer that is distinguished from other carrageenans by its lack of 3,6-anhydro-D-galactose. Presently, how microbes process λ-carrageenan is poorly understood, with only endo-acting β-1,4-λ-carrageenases from Pseudoalteromonas carrageenovora 9T (18, 19) having been identified. To date, no other enzymes having activity consistent with λ-carrageenan processing have been experimentally identified; in particular, no enzymes active on the α-1,3-linkages have been identified. We postulate that this is an activity performed by the GH110 enzymes found in λ-carrageenan PULs.
Toward testing this hypothesis, we characterized the structure and function of PdGH110B. This enzyme is encoded by a gene we identified in the recently reported genome of Pseudoalteromonas distincta U2A (referred to as U2A for brevity) (6). PdGH110B has ~25% amino acid sequence identity with the characterized GH110 enzymes from Bacteroides sp. (1, 2) and ~94% amino acid sequence identity with a putative GH110 enzyme present in the P. carrageenovora 9T PUL that is proposed to target λ-carrageenan (5). Here we demonstrate that PdGH110B is an α-galactosidase that can hydrolyze the Galα1–3Gal disaccharide, but not the blood group B trisaccharide [Galα1–3(Fucα1–2)Gal], thus distinguishing it from previously characterized GH110 enzymes. The structural determination of PdGH110B in complex with substrate and product revealed the parallel β-helix fold of GH110 enzymes and a −1 subsite (20) that accommodates an unmodified galactose residue. An analysis of the structure points to a potential +1′ subsite that is key to accommodating sulfate modifications, as present in λ-carrageenan, or fucose, a defining constituent of the blood group B antigen glycan. Overall, the results are consistent with PdGH110B representing the founding member of a GH110 subfamily that exo-hydrolytically processes terminal α-linked galactose residues in λ-carrageenan while providing general insight into the specificity of the family as a whole.

Results
Identification of a GH110 in Pseudoalteromonas distincta sp. U2A
U2A was isolated from the marine environment for its capacity to grow on macroalgal polysaccharides, including carrageenan, as previously described (6). We identified two adjacent genes (locus tags EU511_08545 and EU511_08540) encoding proteins with 30% sequence identity to one another and ~25% sequence identity to previously characterized GH110 enzymes. This pair of proteins displayed 98 and 94% amino acid sequence identity to two putative orthologous GH110 enzymes in P. carrageenovora 9T that are encoded by adjacent genes in a presumed λ-carrageenan PUL (5). A truncated form of the protein encoded by EU511_08540, which we refer to as PdGH110B, lacking the predicted signal peptide was overproduced and purified, followed by qualitative assessment of activity on the synthetic substrates pNP-α-D-galactopyranoside and pNP-β-D-galactopyranoside. Recombinant PdGH110B showed activity only on pNP-α-D-galactopyranoside, with a pH optimum of ~5.6 (Fig. S1A).
We further tested the activity of PdGH110B on more natural substrates using Galα1–3Gal (αG2) and Galβ1–4Gal (βG2), which represent the basic unmodified motifs present in λ-carrageenan, by quantifying galactose release. PdGH110B released D-galactose when incubated with αG2, whereas there was no activity on βG2 (Fig. 1A). The Km and kcat for αG2 were 5.9 ± 1.1 mM and 18.3 ± 0.002 s⁻¹, respectively (Fig. 1B). Given the activity of other GH110 enzymes on the blood group B glycan, we tested PdGH110B on this substrate using TLC but could not detect any activity (Fig. S1B).
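As an illustration of what the reported kinetic constants imply, the following minimal sketch evaluates the Michaelis–Menten equation and the catalytic efficiency kcat/Km using the Km (5.9 mM) and kcat (18.3 s⁻¹) given above. The function names and the enzyme concentration used in the demonstration are assumptions made for this example and are not taken from the assay conditions.

```python
# Kinetic constants reported in the text for alpha-G2 (Gal-alpha-1,3-Gal).
KM_MM = 5.9        # Michaelis constant, mM
KCAT_PER_S = 18.3  # turnover number, s^-1


def michaelis_menten_rate(substrate_mM, enzyme_uM, km_mM=KM_MM, kcat_per_s=KCAT_PER_S):
    """Initial rate (uM product per second) from the Michaelis-Menten equation."""
    return kcat_per_s * enzyme_uM * substrate_mM / (km_mM + substrate_mM)


def catalytic_efficiency(km_mM=KM_MM, kcat_per_s=KCAT_PER_S):
    """kcat/Km in M^-1 s^-1 (Km is converted from mM to M)."""
    return kcat_per_s / (km_mM * 1e-3)


if __name__ == "__main__":
    enzyme_uM = 1.0  # hypothetical enzyme concentration, for illustration only
    for s_mM in (0.5, 5.9, 50.0):
        rate = michaelis_menten_rate(s_mM, enzyme_uM)
        print(f"[S] = {s_mM:5.1f} mM -> v0 = {rate:.2f} uM/s")
    print(f"kcat/Km = {catalytic_efficiency():.0f} M^-1 s^-1")
```

With these constants, kcat/Km is on the order of 10³ M⁻¹ s⁻¹, and the rate at a substrate concentration equal to Km is half of the maximal rate, as expected from the equation.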

The overall structure of PdGH110B α-1,3-galactosidase in complex with D-galactose
A preliminary structure of PdGH110B was determined by single-wavelength anomalous dispersion using a cadmium derivative. This initial model was used to solve the structure of PdGH110B in complex with a D-galactose monosaccharide to 2.35 Å resolution. The crystal structure of PdGH110B in complex with D-galactose revealed two chains in the asymmetric unit. Residues 25–238/241–616 for one molecule and residues 21–182/186–238/241–616 for the second molecule were modeled, with the missing residues residing in loop regions. The noncrystallographic dimer shows no evidence of being stable; however, each monomer in the asymmetric unit forms a crystallographic dimer (Fig. S2A). The crystallographic dimers (Fig. 2A) have a total molecular interface, determined by PISA (Proteins, Interfaces, Structures and Assemblies) (21) analysis, of 2300 Å², predicting a stable dimeric state. Dynamic light scattering analysis of PdGH110B in solution at protein concentrations of 0.18, 0.37, and 0.73 mg/ml yielded a molecular mass of 136.3 ± 9.4 kDa. The expected molecular mass of the PdGH110B monomer is 67.9 kDa, resulting in an expected molecular mass of 135 kDa for a dimer. This indicates that PdGH110B adopts a dimeric quaternary structure, which is most likely the biologically relevant assembly.
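The oligomeric-state argument above is simple arithmetic, sketched here under the assumption that an oligomer is considered compatible when its expected mass falls within one reported deviation of the light-scattering estimate. The values are those given in the text; the function names and the acceptance criterion are illustrative choices.

```python
# Values reported in the text.
MONOMER_KDA = 67.9   # expected mass of one PdGH110B chain
DLS_MASS_KDA = 136.3  # mass estimated by dynamic light scattering
DLS_SD_KDA = 9.4      # reported deviation of that estimate


def apparent_subunits(observed_kda=DLS_MASS_KDA, monomer_kda=MONOMER_KDA):
    """Observed mass expressed as a number of monomer equivalents."""
    return observed_kda / monomer_kda


def is_compatible(n_subunits, observed_kda=DLS_MASS_KDA,
                  sd_kda=DLS_SD_KDA, monomer_kda=MONOMER_KDA):
    """True if an n-mer lies within one deviation of the observed mass.

    The one-deviation criterion is an assumption of this sketch.
    """
    return abs(observed_kda - n_subunits * monomer_kda) <= sd_kda


if __name__ == "__main__":
    print(f"apparent subunits: {apparent_subunits():.2f}")
    for n in (1, 2, 3):
        print(n, "subunit(s) compatible:", is_compatible(n))
```

Only the dimer satisfies this criterion, in line with the PISA interface analysis.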
The overall fold of PdGH110B is that of a right-handed parallel β-helix of 11 complete turns. A structural homology search using the DALI server (22) identified a fold most similar to those of a GH87 α-1,3-glucanase from Bacillus circulans (23), as well as two epimerases, AlgE4 and AlgE6, from Azotobacter vinelandii (16) (PDB code 5LW3). This core β-helix is surrounded by two small β-barrel domains (domains I and II) that contribute to the residues involved in dimerization of PdGH110B (Fig. 2B). The α-helix of domain II from the adjacent monomer folds over and along the wall of the active site, which was identified by a bound D-galactose, with several amino acid side chains protruding into the cleft that contains the active site pocket (Fig. 2C).
The bound D-galactose monosaccharide was identified by clear electron density (Fig. S2B) found in the central region of the β-helix domain of both active sites of the dimer. The modeled monosaccharide occupied a pocket that sequesters the monosaccharide in a fashion that is typical for glycoside hydrolases that are exo-acting on the nonreducing end of glycans (Fig. 2D). Specifically, Asp-344 is located within hydrogen bonding distance of the C1-OH, where the scissile bond of an intact substrate would be (Fig. 2E), indicating that this residue is a likely candidate to play the catalytically essential role of general acid. To trap an intact substrate complex of the enzyme, we targeted this residue to generate an inactive D344N mutant, which indeed lacked activity on pNP-α-D-galactopyranoside.

Structure of PdGH110B in complex with α-1,3-galactobiose (αG2)
Crystals of the PdGH110B D344N mutant were soaked with an excess of αG2, and the structure was determined to 2.20 Å resolution. The refined structure revealed four monomers of PdGH110B D344N in the asymmetric unit. The monomers were organized as two noncrystallographic dimers with identical arrangements to the crystallographic dimer observed in the PdGH110B D-galactose complex, supporting the concept that the dimer is a stable quaternary structure. Clear electron density for the αG2 disaccharide was found in each monomer active site (Fig. S3), with the intact glycosidic linkage spanning the −1 and +1 subsites and the catalytic machinery (Fig. 3A).

Figure 2 (caption fragment). [...] and domain II (dark gray). C, the association of the domain II α-helix extending into the neighboring chain's active site. The coloring is the same as in A. D, the active site pocket, shown as a solvent-accessible surface in gray, sequesters the galactose residue. The O1, which would be engaged in a glycosidic linkage, is indicated. E, the interaction of the galactose residue with the likely catalytic acid residue. Interacting side chains are shown in purple, and the water molecule is shown as a red sphere. In all panels (A–E), the D-galactose monosaccharide is represented as yellow sticks.
There is an extended network of hydrogen bonds, as well as a single interacting aromatic residue, made between the αG2 molecule and the enzyme active site (Fig. 3A). The C2–C6 portion of the α-face of the D-galactose unit in the −1 subsite sits on a hydrophobic platform created by Trp-486, interacting through CH–π interactions often employed by CAZymes (24). The remainder of the −1 subsite is created by an extensive hydrogen bond network comprising interactions between Asn-85, Glu-488, Asn-348, and Glu-480 and the C3-OH, C4-OH, and C6-OH hydroxyl groups. Arg-265 coordinates both C2-OH and C3-OH, and Arg-453 interacts with C6-OH and the endocyclic oxygen (Fig. 3A). The +1 subsite is formed exclusively by the positively charged side chains of Arg-451 and Arg-208 and by a water molecule coordinated by Lys-207. The electrostatic potential of the active site surface indicates a generally acidic −1 subsite but a +1 subsite and neighboring surfaces that are basic (Fig. 3B).
GH110 enzymes were previously shown to operate through a single displacement, or inverting, catalytic mechanism (1). We confirmed this for PdGH110B by using ¹H NMR to monitor the initial release of the β-anomer of D-galactose from pNP-α-D-galactopyranoside, which indicates inversion of the anomeric configuration of C1 involved in the glycosidic bond (Fig. 4A). The architecture of the PdGH110B catalytic center is also consistent with an inverting catalytic mechanism (25, 26) (Fig. 4B). In the αG2 complex with the mutant enzyme, Asn-344, which would be Asp-344 in the WT enzyme, is 3.1 Å from the glycosidic oxygen and appropriately positioned to act as a general acid. A water molecule that sits 3.5 Å beneath C1 of the D-galactose residue in the −1 subsite is suitably positioned to be activated as a nucleophile by Asp-321 and/or Asp-345 (Fig. 4B).

Discussion
The GH110 family was initially identified by examination of members that specifically removed the immunodominant α-1,3-linked galactose residues of blood group B antigen. Subsequent characterization of additional GH110 enzymes revealed some to be less stringent by possessing the ability to process the linear α-Gal epitope, as well as the blood group B antigen. In contrast to the majority of other α-galactosidases, which employ a retaining catalytic mechanism (i.e. GH4, GH27, GH31, GH36, GH57, and GH97 (27–32)), family GH110 was shown to utilize an inverting mechanism for this hydrolytic activity. Here, through examination of PdGH110B, which originates from a marine bacterium and has relatively low amino acid sequence identity to previously characterized GH110 members, we also demonstrated α-Gal activity, although it differed from other GH110 enzymes by its inability to hydrolyze the blood group B antigen. The structure of PdGH110B revealed the molecular basis of the inverting mechanism utilized by the family. Notably, the core parallel β-helix fold and catalytic machinery of PdGH110B are conserved within GH families 28, 49, and 87 (Fig. 4C) (33–35). GH28 and GH49 are classified into glycoside hydrolase clan GH-N. Given that the fold and catalytic machinery of these founding families of the clan are conserved with GH87 and GH110, we note that it is likely that the latter two GH families also belong to clan GH-N.
A phylogenetic tree constructed from 334 GH110 amino acid sequences, including PdGH110B and another from U2A (PdGH110A), displays several distinct clades (Fig. 5A and Fig. S4). Notably, the sequences originating primarily from marine microbes form their own clade that branches off directly from the origin, hinting at evolution toward potentially distinct functions. Indeed, the P. carrageenovora 9T orthologue of PdGH110B (~94% amino acid sequence identity) resides in a locus proposed to target the highly sulfated marine galactan λ-carrageenan (5). The genes neighboring that encoding PdGH110B in U2A show similarly high amino acid sequence identity to components of the putative P. carrageenovora 9T λ-carrageenan PUL. This led to the hypothesis that PdGH110B (and its orthologue) would target λ-carrageenan. The activity of PdGH110B on αG2, and the molecular recognition thereof, is consistent with exo-α-1,3-galactosidase activity on the nonreducing ends of λ-carrageenan. In contrast, because of its ¹C₄ chair conformation, the 3,6-anhydro-D-galactose residue found at the nonreducing termini of κ- and ι-neocarrageenoligosaccharides would not be accommodated in the active site of PdGH110B. Notably, the architecture of the −1 subsite leaves no room for a sulfate modification on either C2 or C6 of the galactose residue, indicating that PdGH110B requires an unmodified D-galactose at the nonreducing end of its substrate. This is consistent with our inability to observe activity for this enzyme on sulfated λ-neocarrageenoligosaccharides (not shown) and suggests that preprocessing of λ-carrageenan by as-yet-unidentified sulfatases would be necessary prior to the action of PdGH110B.
An examination of amino acid conservation within the −1 subsite of the representative GH110 sequences reveals this subsite to be remarkably conserved across the family (Fig. 5B), including among the characterized enzymes (Fig. S5), suggesting that exo-α-galactosidase activity is a general feature of the family. This observation points to features of the plus (+) subsites as being critical determinants of substrate selectivity. In the case of PdGH110B, a series of specific interactions, particularly those mediated by Arg-451, select for a galacto-configured sugar in the +1 subsite. In λ-carrageenan, the galactose residue in this subsite would likely bear a 2-sulfate group. The structure of PdGH110B in complex with αG2 shows a small pocket, which we will refer to as a +1′ subsite, with the appropriate size and charge to accommodate a sulfate modification (Fig. 3B). Although PdGH110B was active on αG2 (i.e. lacking sulfates), the high Km (~5 mM) suggests nonoptimal recognition of this substrate; 2-sulfation of the galactose in the +1 subsite may be expected to improve binding of the substrate. Thus, we suggest that the features of the PdGH110B active site are entirely consistent with λ-carrageenan recognition.
The +1 subsite has one conserved residue, Arg-451, which makes specific interactions with the galactose residue in the +1 subsite that are likely maintained throughout the family (Figs. 3A and 5B). The other residue in this subsite that makes a direct interaction, Arg-208, is not well-conserved (Fig. 5B). Nevertheless, the noteworthy conservation of the −1 subsite and conservation of Arg-451 in the +1 subsite, which makes directional interactions specific to a galacto-configured monosaccharide, reveal a mechanism that is shared by the GH110 family to recognize a Galα1–3Gal motif. Lys-207 makes a potential water-mediated hydrogen bond to the galactose in the +1 subsite; however, it appears more relevant to the +1′ subsite of PdGH110B, which is not well-conserved among the GH110 family (Fig. 5, B and C). In particular, Lys-195 (contributed by the other monomer of the dimer) and Lys-207 on either side of the +1′ subsite close off this subsite in PdGH110B, making it of insufficient size to accommodate the 2-fucosyl residue of the blood group B antigen. These residues are not well-conserved among the family as a whole (Fig. 5C) nor in the characterized enzymes (Fig. S5). In the characterized enzymes, Lys-195 resides in a variable region, which in PdGH110B comprises the "finger" that extends between the two monomers of the dimer (Fig. 2C and Fig. S5), whereas the position of Lys-207 is typically occupied by an amino acid with a smaller side chain (Fig. S5). These alterations seen in other GH110 enzymes likely open up the +1′ subsite, allowing it to accommodate the 2-fucosyl residue of the blood group B antigen. Furthermore, these structural changes, along with substitutions of the poorly conserved Arg-208, would likely reduce the basic nature of the +1′ subsite, making it less suitable for accommodating an anionic sulfate residue and more appropriate for a neutral fucosyl residue.

Figure 4. PdGH110B uses an inverting catalytic mechanism. A, ¹H NMR spectra of pNP-α-D-galactopyranoside treated with PdGH110B measured at various time points. Signals with NMR chemical shifts corresponding with those distinctive of pNP-α-D-galactopyranoside, α-galactopyranoside, and β-galactopyranoside are labeled, with integrated peak areas shown where applicable. B, αG2 (yellow sticks) bound in the active site of PdGH110B D344N. The three catalytic residues are represented as purple sticks, hydrogen bonds are dashed lines, and the water molecule is a red sphere. C, an overlay of the PdGH110B αG2 complex (purple sticks) with the GH49 isopullulanase from Aspergillus niger (green; PDB code 2z8g) (34), the GH28 exo-polygalacturonase from Yersinia enterocolitica (yellow; PDB code 2uvf) (36), and the GH87 α-1,3-glucanase from Paenibacillus glycanilyticus (gray; PDB code 6k0n) (35). The active site subsites are labeled in green. The putative acid is indicated with A, and the pair of putative bases is indicated with B.
To date, all characterized examples of GH110 enzymes display α-1,3-galactosidase activity. These enzymes partition into three classes of activity: enzymes that are specific for the blood group B antigen, those that are active on both the blood group B antigen and linear α-1,3-galactose epitopes, and now a class represented by PdGH110B that is selective for linear α-1,3-linked galactose. PdGH110B notably sorts into a clade that comprises proteins from primarily marine microbes. This observation likely reflects that enzymes in this clade have adapted to enable biological processing of λ-carrageenan. Indeed, the other entries in this clade that are not from marine microbes are typically from human gut bacteria, which themselves may also be able to target λ-carrageenan present in the diet. Further examination of this marine clade of GH110 enzymes may ultimately reveal additional adaptations that confer the ability to accommodate, or even specifically recognize, the sulfate modifications on λ-carrageenan.

Materials
All reagents, chemicals and other carbohydrates were purchased from Sigma unless otherwise specified.

Cloning and mutagenesis
The gene fragment encoding PdGH110B without the predicted signal peptide (amino acids 24–620) was amplified from Pseudoalteromonas distincta U2A genomic DNA using the oligonucleotide primers 5′-CTA GCT AGC AAT GAT AAA GTG ATA GAT G-3′ (PdGH110B_fwd) and 5′-CCG CTC GAG TTA GTT TTT AGC TCT TTT AT-3′ (PdGH110B_rev). The product was ligated into pET28a between the NheI and XhoI restriction sites (underlined in sequences).

Figure 5 (caption fragment). [...] Table S1 for complete annotation of the tree. B, the conservation of active site residues of GH110 mapped onto the structure of PdGH110B by ConSurf analysis (46). C, conservation of the active site shown as accessible surface representation. The color scheme representing degree of residue conservation in B and C is shown in B. Active site subsites and sugar residues are labeled in green.

All DNA amplifications were done using CloneAmp HiFi PCR premix. Mutations were created by site-directed mutagenesis (QuikChange site-directed mutagenesis kit) using the primers 5′-CTC CAT GGA TGT TAG TAG CAT CAT TCT TTT GAC TTT CGA ATA AGT TAT C-3′ and 5′-GAT AAC TTA TTC GAA AGT CAA AAG AAT GAT GCT ACT AAC ATC CAT GGA G-3′. All constructs were sequence confirmed by bidirectional sequencing.
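The primer sequences in this section can be sanity-checked with a few lines of code. The sketch below, which is an illustration and not part of the reported methods, locates the NheI (GCTAGC) and XhoI (CTCGAG) recognition sites in the cloning primers and confirms that the two mutagenesis primers are reverse complements of one another. The recognition sequences are standard values for these enzymes; the variable and function names are chosen for this example.

```python
# Translation table for complementing an unambiguous DNA sequence.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq):
    """Remove whitespace from a primer written in spaced triplets."""
    return "".join(seq.split()).upper()


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return clean(seq).translate(COMPLEMENT)[::-1]


# Primer sequences as given in the text (5' to 3').
CLONING_FWD = clean("CTA GCT AGC AAT GAT AAA GTG ATA GAT G")
CLONING_REV = clean("CCG CTC GAG TTA GTT TTT AGC TCT TTT AT")
MUT_PRIMER_1 = clean("CTC CAT GGA TGT TAG TAG CAT CAT TCT TTT GAC TTT CGA ATA AGT TAT C")
MUT_PRIMER_2 = clean("GAT AAC TTA TTC GAA AGT CAA AAG AAT GAT GCT ACT AAC ATC CAT GGA G")

# Standard recognition sequences of the two restriction enzymes.
NHEI_SITE = "GCTAGC"
XHOI_SITE = "CTCGAG"

# Zero-based position of each site in its primer (-1 if absent).
nhei_pos = CLONING_FWD.find(NHEI_SITE)
xhoi_pos = CLONING_REV.find(XHOI_SITE)
primers_are_complementary = reverse_complement(MUT_PRIMER_1) == MUT_PRIMER_2

if __name__ == "__main__":
    print("NheI site at index", nhei_pos, "of PdGH110B_fwd")
    print("XhoI site at index", xhoi_pos, "of PdGH110B_rev")
    print("mutagenesis primers complementary:", primers_are_complementary)
```

Both restriction sites sit directly after a three-nucleotide 5′ overhang, and the mutagenesis primers form a fully complementary pair, as expected for this type of site-directed mutagenesis.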

Protein expression and purification
The PdGH110B expression plasmid was transformed into Escherichia coli BL21 (DE3) Star, and cells were grown in 2-liter cultures of LB broth containing 50 µg ml⁻¹ kanamycin sulfate at 37°C with agitation at 180 rpm until the cell density reached an A600 of ~0.5, at which time the temperature was dropped to 16°C, and recombinant protein production was induced with a final concentration of 0.5 mM isopropyl β-D-thiogalactopyranoside and allowed to proceed for an additional 16 h.
PdGH110B protein was purified by applying the clarified lysate supernatant to a nickel-affinity chromatography column and eluting with 20 mM Tris-HCl (pH 8.0), 500 mM NaCl, with a stepwise increase in imidazole concentration of 5, 10, 15, 20, 40, 50, 100, and 500 mM. All samples containing the protein of interest as judged by SDS-PAGE were concentrated with an Amicon ultrafiltration cell (EMD Millipore) with a 10-kDa molecular-mass cutoff. Proteins of interest were further purified using a HiPrep 16/60 Sephacryl S-300 HR size-exclusion chromatography column in 20 mM Tris (pH 8.0) and 500 mM NaCl.

pH studies
The reactions with synthetic substrates were set up in triplicate using a 96-well plate and incubated at 25°C in the dark for 1 h. The reactions (100 µl) contained 50 mM McIlvaine buffer (pH 2.9–9.6), 1 mM 1,4-dithio-D-threitol in binding buffer with 7.5% glycerol, 1 mM pNP-α-D-galactopyranoside or pNP-β-D-galactopyranoside in dH2O, and 1 µM recombinant PdGH110B. The reactions were stopped with an equal volume of 100 mM sodium hydroxide. The plate was read at 25°C and 405 nm with five technical replicates, using a Molecular Devices Spectramax Plus plate reader.

Enzyme activity measuring galactose release
The activity of recombinant PdGH110B against αG2 and βG2 was tested using the Megazyme L-arabinose/D-galactose (Rapid) assay kit (Megazyme) to detect galactose release via the oxidation of β-D-galactose and the reduction of NAD⁺ to NADH by a galactose dehydrogenase. The assays were performed in 83 mM HEPES (pH 7.5), 10 µl of kit solution 2 (NAD⁺), 2 µl of kit suspension 3 (dehydrogenase and mutarotase), 4 mg of αG2, βG2, or D-galactose in binding buffer, and 0.81 µM PdGH110B.
The reactions were set up in triplicate, blanked before the addition of the kit enzymes (suspension 3), and read after addition of the enzymes every 30 s for 30 min at 25°C and 340 nm, using a Molecular Devices Spectramax Plus plate reader.
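Because the coupled assay produces one NADH per galactose oxidized, absorbance readings at 340 nm can be converted to galactose concentrations with the Beer–Lambert law and then fitted to obtain an initial rate. The sketch below illustrates that conversion; it is not the authors' analysis. The extinction coefficient (6220 M⁻¹ cm⁻¹, a commonly used literature value for NADH at 340 nm), the optical path length, and the example readings are all assumptions made for this illustration.

```python
# Commonly used molar extinction coefficient of NADH at 340 nm (assumed here).
NADH_EPSILON_340 = 6220.0  # M^-1 cm^-1


def nadh_uM_from_absorbance(delta_a340, path_cm):
    """Convert a blank-corrected A340 change to NADH concentration in uM."""
    return delta_a340 / (NADH_EPSILON_340 * path_cm) * 1e6


def linear_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    path_cm = 0.5                          # hypothetical path length in a microplate well
    times_s = [0, 30, 60, 90, 120]         # readings every 30 s, as in the assay
    a340 = [0.00, 0.02, 0.04, 0.06, 0.08]  # hypothetical blank-corrected readings
    galactose_uM = [nadh_uM_from_absorbance(a, path_cm) for a in a340]
    rate = linear_slope(times_s, galactose_uM)
    print(f"initial rate: {rate:.3f} uM galactose per second")
```

In practice the path length depends on the well volume and plate geometry, so a D-galactose standard, such as the one included in the assay, is the more reliable calibration.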

¹H NMR
Recombinant PdGH110B (1.3 ml at 1 mg/ml) was buffer-exchanged overnight at 4°C in 1 liter of 50 mM sodium phosphate buffer (pH 5.8). PdGH110B was then buffer-exchanged into NMR buffer (50 mM sodium phosphate (pH 5.9) in 99.9% D2O) through dilution and reconcentration in an Amicon centrifugal filtration unit with a 10-kDa cutoff. The reaction contained 8.3 mM pNP-α-D-galactopyranoside and 2.9 µM recombinant PdGH110B in NMR buffer. A 10 mM D-galactose control reaction in NMR buffer was also monitored as a standard. The reactions were measured before the addition of PdGH110B and after subsequent incubations at the 5-min, 15-min, 30-min, 1-h, 2-h, and 24-h time points using a Bruker Avance II 500 MHz NMR spectrometer and a 5-mm TXI inverse probe. The data were processed using the MestReNova 10 software package.

Phylogenetic analysis
The sequences from glycoside hydrolase family 110 were retrieved. Of the 417 sequences, only 332 had entries in the NCBI database. A multiple sequence alignment was performed using Clustal Omega (36). The evolutionary relationships of PdGH110A, PdGH110B, and the other GH110 sequences were inferred by the maximum likelihood method based on the Jones–Taylor–Thornton matrix-based model using FastTree (37). The phylogenetic tree was visualized and annotated using the iTOL web tool.

Crystallization, diffraction data collection, and processing
All crystals were grown at 18°C by hanging-drop vapor diffusion with 1:1 ratios of crystallization solution and protein. A native PdGH110B crystal grown in 0.5 M NaI, 2% Tacsimate, 0.1 M HEPES (pH 7.5), and 17% PEG 3350 was soaked for 10 min in this mother liquor containing 1 mM CdCl2 prior to data collection. To obtain the product complex, a PdGH110B crystal grown in 0.1 M HEPES (pH 7.5), 9% PEG 6000, and 5.5% 2-methyl-2,4-pentanediol was soaked in mother liquor containing excess D-galactose prior to data collection. To obtain a substrate complex, PdGH110B_D344N was crystallized in 1 M NaI, 2% Tacsimate, 0.1 M HEPES (pH 7.5), and 17% PEG 3350 and then soaked in this mother liquor containing excess α-1,3-galactobiose prior to data collection. Ethylene glycol at 25% (v/v) was used as the cryoprotectant for all crystals, with the exception of the D-galactose-soaked crystal. The 2-methyl-2,4-pentanediol content of the D-galactose-soaked crystal was increased from 5.5% (mother liquor concentration) to 20% for cryoprotection.
Diffraction data were collected on an instrument comprising a Pilatus 200K 2D detector coupled to a MicroMax-007HF X-ray generator with a VariMax-HF ArcSec confocal optical system and an Oxford Cryostream 800. The data were integrated, scaled, and merged using HKL2000. The data processing statistics are shown in Table 1.

Structure solution and refinement
The structure of PdGH110B was determined by the single-wavelength anomalous dispersion method using the cadmium derivative. Initial phases were determined using the SHARP/autoSHARP pipeline (38). Phases were improved using PARROT (39) to perform density modification and noncrystallographic averaging. An initial model of ~95% completeness was constructed by autobuilding using ARP/wARP (40). The most complete monomer from this initial model was used as a molecular replacement model to determine the structure of the PdGH110B galactose complex, which was finished by manual building with COOT (41) and refinement with REFMAC (42). A monomer from this model was then used to solve the structure of PdGH110B_D344N in complex with αG2 by molecular replacement using PHASER (43) and the same building and refinement procedures.
For all structures, the addition of water molecules was performed in COOT with FINDWATERS and manually checked after refinement. In all data sets, refinement procedures were monitored by flagging 5% of all observations as "free" (44). Model validation was performed with MOLPROBITY (45). The model refinement statistics are shown in Table 1.

Data availability
The atomic coordinates for the two crystal structures reported here have been deposited in the Research Collaboratory for Structural Bioinformatics Protein Data Bank under accession codes 7JW4 and 7JWF. All other data are available in the article and in the supporting information.
Acknowledgments: We thank Edward Meier for technical assistance.
Conflict of interest: The authors declare that they have no conflicts of interest with the contents of this article.