Endo-fucoidan hydrolases from glycoside hydrolase family 107 (GH107) display structural and mechanistic similarities to α-l-fucosidases from GH29

Fucoidans are chemically complex and highly heterogeneous sulfated marine fucans from brown macroalgae. Possessing a variety of physicochemical and biological activities, fucoidans are used as gelling and thickening agents in the food industry and have anticoagulant, antiviral, antitumor, antibacterial, and immune activities. Although fucoidan-depolymerizing enzymes have been identified, the molecular basis of their activity on these chemically complex polysaccharides remains largely uninvestigated. In this study, we focused on three glycoside hydrolase family 107 (GH107) enzymes: MfFcnA and two newly identified members, P5AFcnA and P19DFcnA, from a bacterial species of the genus Psychromonas. Using carbohydrate-PAGE, we show that P5AFcnA and P19DFcnA are active on fucoidans that differ from those depolymerized by MfFcnA, revealing differential substrate specificity within the GH107 family. Using a combination of X-ray crystallography and NMR analyses, we further show that GH107 family enzymes share features of their structures and catalytic mechanisms with GH29 α-L-fucosidases. However, we found that GH107 enzymes have the distinction of utilizing a histidine side chain as the proposed acid/base catalyst in their retaining mechanism. Further interpretation of the structural data indicated that the active-site architectures within this family are highly variable, likely reflecting the specificity of GH107 enzymes for different fucoidan substructures. Together, these findings begin to illuminate the molecular details underpinning the biological processing of fucoidans.

Originating from various brown macroalgae, fucoidans are chemically complex sulfated fucans that possess a variety of physicochemical and biological activities; thus, their study is of increasing interest. They are a known source of nutritional value and, because of their rheological properties, are used as additives in the food industry as gelling and thickening agents. In addition, fucoidans are recognized as potential novel therapeutic agents because they possess several health-related biological activities, including anticoagulant, antiviral, antitumor, antibacterial, and immune-inflammatory actions (1-6).
Although fucoidan-depolymerizing enzymes have been identified, the molecular basis of their action remains largely uninvestigated. In this study we focus on three GH107 family enzymes: FcnA, which we will specify as MfFcnA, and two newly identified members, P5AFcnA and P19DFcnA, from marine Psychromonas species SW5A and SW19D, respectively. Here we show that P5AFcnA and P19DFcnA are active on fucoidans from Laminaria hyperborea and Macrocystis pyrifera, thus distinguishing them from MfFcnA, which lacks activity on these substrates. To probe the molecular basis of fucoidan activity and provide some potential insight into the differential specificity of this enzyme family, we determined the full structure of P5AFcnA and the structures of catalytically active fragments of MfFcnA by X-ray crystallography. These structures revealed similarities to GH29 α-L-fucosidases and led to the proposal of a catalytic mechanism that retains the stereochemistry of the substrate at C1 of the hydrolyzed bond, support for which was provided by NMR analyses. Although the catalytic machinery is conserved throughout the family, including the unusual use of a histidine residue as the proposed catalytic acid/base, the structural data point to nonconserved subsites outside of the primary −1 subsite as possible key contributors to determining the specificity of GH107 enzymes for different fucoidan substructures.

Members of the GH107 family depolymerize a number of sulfated fucans
MfFcnA is the founding member of GH107 and the canonical example of an endo-fucoidanase. The activity of this enzyme, however, was reported on fucoidan from only one source, and since that report no additional members of GH107 have been characterized. Therefore, we sought to provide additional assessment of substrate specificity within the GH107 family, with MfFcnA serving as our standard. Because it is a relatively large, multimodular protein, we based our initial truncation, comprising residues Gln29-Thr794, on the construct of Colin et al. (19); we refer to this truncation as MfFcnA2 (Fig. 1A) and produced it in Escherichia coli.
In the genome sequences of two newly isolated fucoidanolytic marine Psychromonas sp., we identified two genes encoding proteins that we refer to as P5AFcnA and P19DFcnA. These are 403-amino acid proteins, including predicted 15-amino acid secretion signal peptides, which display 81% amino acid sequence identity to one another. Initial annotation of these proteins indicated identity (E values of ~10⁻⁸) with proteins classified as GH107 enzymes and no identity with other GH families. More detailed comparison of P5AFcnA and P19DFcnA with members of GH107, including MfFcnA, yielded amino acid identities between 16 and 21%. Although the identities of P5AFcnA and P19DFcnA with the canonical GH107 MfFcnA were comparatively low, they were suggestive of membership in GH107 and possible fucoidanase activity. P5AFcnA and P19DFcnA were produced as recombinant proteins in E. coli with only their predicted secretion signal peptides truncated (Fig. 1B).
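The pairwise identities quoted above can be illustrated with a minimal sketch. It assumes two sequences that have already been aligned (gaps written as "-") and counts identity over the columns in which both sequences have a residue; other denominators, such as the length of the shorter sequence, are also in common use, so this is not necessarily how the values in the text were computed. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy alignment (hypothetical, not GH107 sequence): 5 of 6 ungapped columns match.
print(round(percent_identity("ACDE-FG", "ACDQKFG"), 1))
```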
To assess the activity of recombinant MfFcnA2, P5AFcnA, and P19DFcnA against fucoidans originating from a variety of algal sources, we used carbohydrate-PAGE (C-PAGE) (Fig. 2). Consistent with previous reports, MfFcnA2 was capable of degrading fucoidan and displayed activity on fucoidan from Chorda filum and fucoidan from Ascophyllum nodosum (Fig. 2). It also showed some capacity to degrade fucoidan from Saccharina japonica (previously known as Laminaria japonica) (Fig. 2), which is a complex and highly variable fucoidan that contains both class 1 and 2 backbone structures. P5AFcnA and P19DFcnA both showed the capacity to degrade fucoidans, supporting assignment of fucoidanase activity and membership in family GH107 (Fig. 2). They qualitatively displayed the same specificity for fucoidans from L. hyperborea and M. pyrifera, with apparently some limited ability to degrade fucoidans from S. japonica and C. filum (Fig. 2). The distinct laddering pattern observed upon treatment of fucoidan from L. hyperborea with either P5AFcnA or P19DFcnA implies endo-fucoidanase activity.
Figure 1 (legend, partial). The proteins were recombinantly expressed without their predicted secretion signal peptides (shown in yellow). C, the modular structures of the remaining currently classified GH107 enzymes. Fda1 and Fda2 are the entries from Alteromonas sp. SN-1009 and SVI_0379 is from Shewanella violacea DSS12. AXE80_07420, AXE80_07425, AXE80_07310, and AXE80_07305 are from Wenyingzhuangia fucanilytica CZ1127. In all panels, the domains are: yellow, signal peptide; blue, GH107 (β/α)₈ D1 domain; gray, immunoglobulin-like R domain (IPR013783); green, secretion system C-terminal sorting domain (IPR026444); purple, concanavalin A-like lectin/glucanase domain superfamily (IPR013320); light blue, archaeal/bacterial C-terminal peptidase (IPR007280); and red, conserved all β-strand domain. Module/domain assignments were based on a combination of manual assignment by BLAST and automated assignment by InterProScan (14).

Structural analysis of GH107 members
In an effort to provide molecular insight into fucoidan degradation by GH107 enzymes, we made extensive attempts at crystallizing P5AFcnA and P19DFcnA and obtained X-ray diffraction-quality crystals of P5AFcnA. This protein crystallized in the space group P3₁21 with a single molecule in the asymmetric unit. Phases were initially determined to ~1.98 Å resolution by a single isomorphous replacement with anomalous signal (SIRAS) experiment using an iodide derivative. The resulting preliminary model was used with a higher-resolution (1.55 Å) data set to determine the final refined structure. This revealed a single-domain organization comprising a (β/α)₈-barrel domain (Fig. 3). The (β/α)₈-barrel fold contains defects in the secondary structure such that the polypeptide regions corresponding to one of the α-helices and one β-strand do not form regular secondary structures and are better described as random coil. Electron density for one metal ion, which was modeled as Ca²⁺, was found weakly bound to the loop leading into α-helix 7 and was coordinated by one nitrogen ligand from Lys259 and one water molecule (Fig. 3).
Initial attempts at crystallizing MfFcnA2 were unsuccessful, and we reasoned that this was likely due to our inability to purify the protein in a fully intact and homogeneous form. To explore potential regions of disorder in MfFcnA2 and possibly inform the generation of constructs that could be crystallized, we performed a hydrogen-deuterium exchange (HDX) MS analysis of MfFcnA2. This highlighted a highly dynamic region at the C terminus comprising residues Thr731-Thr769 (Fig. 4A). Subsequently, a new construct, MfFcnA4, was designed that comprised residues Gln29-Ser734 (Fig. 1A). This new construct could be purified as a stable, homogeneous polypeptide and produced crystals of X-ray diffraction quality.
MfFcnA4 crystallized in the space group P3₁21 with a single molecule in the asymmetric unit. The phases were initially determined to ~2.8 Å resolution by a SIRAS experiment using an iodide derivative. The resulting preliminary model was used with a higher-resolution (2.2 Å) data set to determine the final refined structure. This revealed a four-domain organization comprising a large N-terminal (β/α)₈-barrel domain, which displayed a root mean square deviation (RMSD) of 2.21 Å over 287 aligned residues and 24% sequence identity when compared with the P5AFcnA structure (Fig. 4, B and C). This N-terminal domain, which we will refer to as D1, also contained the same secondary structure defects as P5AFcnA. In MfFcnA4 the D1 domain is linked to three consecutive Ig-like domains that wrap around the D1 domain (Fig. 4, B and C). Electron density for two well-defined metal ions, modeled as Ca²⁺ (based on coordination geometry, protein ligands, and refined B-factor), was found in the N-terminal domain. One Ca²⁺ (Ca1) was found bound to the loop leading into α-helix 5 and was octahedrally coordinated by five oxygen ligands from the protein (side chains of Asp330, Asn335, and Asp336; backbone carbonyls of Phe327 and Arg332) and one water molecule (Fig. 4B). The second and likely more weakly bound Ca²⁺ (Ca2) was found near the C terminus of α-helix 7. This was also octahedrally coordinated but by four water molecules and two oxygen ligands: one from the carboxylate of Asp79 and the other from the backbone carbonyl of Thr77 in the protein (Fig. 4B).
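As a reminder of what the RMSD values quoted here measure, the sketch below computes the root mean square deviation between two equal-length lists of paired atom coordinates (for example, aligned Cα positions). It is an illustration only: the superposition and residue pairing carried out by structure-comparison software are assumed to have been done beforehand, and the function name and toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (same units as the input, e.g. Å) between paired, pre-superimposed 3D points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy example: every atom displaced by exactly 1 Å along z.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]
print(rmsd(a, b))
```

Because the sum is divided by the number of paired atoms, an RMSD is only meaningful together with the number of aligned residues, which is why the text reports both.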
As noted by Colin et al. (19), the three repeating domains following D1, referred to as R1 to R3, had folds consistent with membership in the immunoglobulin-like superfamily, with each domain binding a calcium atom at an apical end (Fig. 4, B and C). The pairwise amino acid sequence identities range from 27 to 48%, whereas structural superimposition gives RMSD values of 1.6 Å (R1 cf. R2), 1.7 Å (R1 cf. R3), and 0.7 Å (R2 cf. R3). The high frequency of carbohydrate-binding modules (CBMs) in carbohydrate-active enzymes, combined with the observation that these often have Ig-like folds, lends support to the idea that the repeated domains in MfFcnA4 may be CBMs (22,23). However, CBMs invariably employ carbohydrate-binding sites with solvent-exposed aromatic amino acid side chains (22,23), which the R domains lack; the R domains therefore do not display features consistent with carbohydrate recognition. Structure similarity searches using the DALI server (24) show that the domains most closely related to the R domains are the C-terminal tandem repetitive modules CDHL-1 and CDHL-2 from the ligand-binding region of the Staphylococcus aureus surface-exposed serine-rich repeat glycoprotein (SRRP) (25) (PDB codes 4M00 and 4M03). R1 shows structural similarity to the C-terminal module CDHL-2 (PDB code 4M03) (25% identity over 85 residues, with an RMSD of 2.0 Å). R2 and R3 show structural similarity to the CDHL-1 module (PDB code 4M00) (22 and 24% identity over 92 and 86 residues, with RMSDs of 2.8 Å and 2.1 Å, respectively). The CDHL modules, which resemble calcium-dependent eukaryotic cadherins, function as a relatively rigid stem to extend the N-terminal lectin module of the glycoprotein outwards from the S. aureus cell surface.
Like the R domains in MfFcnA, the CDHL modules bind one Ca²⁺ ion per module and are suggested to facilitate the structural rigidity needed to adopt the extended conformation in solution (25). On this basis we propose that the R1-R3 domains could carry out a similar function, whereby the R modules constitute a structural spacer separating the D1 catalytic domain from the unknown C-terminal domain.
Figure 4 (legend, partial). The deuterium incorporation levels are colored according to the legend, where amino acids that have low deuterium incorporation are colored blue and amino acids that have high deuterium incorporation are colored red (see also Fig. S1). B, the MfFcnA4 structure colored by secondary structure and oriented to show the three consecutive C-terminal Ig-like domains (R1, R2, and R3) that wrap around the large N-terminal (β/α)₈-barrel catalytic domain (D1) that coordinates an ethylene glycol (green sticks). C, the solvent-accessible surface of MfFcnA4. D, the MfFcnA9 structure (yellow) superimposed over the MfFcnA4 (gray) structure to show the similar structural arrangements of the composite domains in different crystallographic conditions. In all panels, calcium atoms are shown as green spheres, a sodium atom is shown as a purple sphere, and bound ethylene glycol and malonate ion are green sticks.
We made extensive attempts to generate a structure of MfFcnA4 in complex with fucoidan oligosaccharides purified after treatment of C. filum fucoidan with this enzyme. These attempts were unsuccessful, and we reasoned this to be largely because of occlusion of the active site in the crystal packing of the existing crystal form and an inability to generate alternative crystal forms. To obtain a construct that might crystallize differently, we designed a truncation based on the MfFcnA4 structure comprising only the D1-R2 domains (amino acids Gln29-Asp623), which we called MfFcnA9 (Fig. 1A). Recombinant purified MfFcnA9 crystallized and its structure was determined; however, we were again unable to generate the structure of the bound complex. Nevertheless, a comparison of the MfFcnA9 coordinates with MfFcnA4 provided insight into the conformational preference of the composite domains in these polypeptides. An overlay of the two structures revealed an RMSD of 0.65 Å over all of the residues present in the MfFcnA9 model (Fig. 4D). Notably, the two different crystal forms of MfFcnA4 and MfFcnA9 had quite different crystallographic environments, suggesting that the very similar arrangements of the first three domains (D1-R2) in these different truncations of MfFcnA represent a configuration that is consistently stabilized by the interdomain interfaces. The R3 domain in MfFcnA4, however, is fully solvent-exposed, suggesting that the terminal domain in this construct may be somewhat flexible (Fig. 4D). Supporting this is the observation that the average B-factor for R3 was ~10 Å² (or ~20%) higher than for the R1 and R2 domains, which were otherwise similar.
Despite extensive attempts to generate structures of substrate and/or product complexes of GH107s, we were unable to do so. Therefore, to provide greater insight into the location and architecture of the active site, we performed a structural comparison of both MfFcnA4 and P5AFcnA with BiAfcB from Bifidobacterium longum subsp. infantis (27) in complex with lacto-N-fucopentaose II (LNFP II), which terminates in the Leᵃ antigen trisaccharide (Galβ1-3(Fucα1-4)GlcNAc) (PDB code 3UET) (Fig. 5A). Consistent with the DALI comparison, this revealed similarities between only the (β/α)₈ domains and not the accessory domains appended to the main catalytic domain in MfFcnA4.
A more detailed comparison of residues in the BiAfcB active site showed the structural conservation of the catalytic nucleophile, which is Asp172 in BiAfcB, with Asp226 and Asp201 in the proposed active sites of MfFcnA4 and P5AFcnA, respectively (Fig. 5B). When the ligand for BiAfcB from the superimpositions was examined in the context of the putative MfFcnA4 and P5AFcnA active sites, we found that Asp226 and Asp201 were in approximately the correct position to attack C1 of a fucose in the −1 subsite, consistent with a possible role as the catalytic nucleophile in these GH107s. The general acid/base in BiAfcB is Glu217, which is structurally replaced by His294 in MfFcnA4 and His276 in P5AFcnA (Fig. 5B). The Nε2 nitrogens of the histidine side chains are roughly positioned to act as acid/bases, suggesting that His294 and His276 may act in this capacity. To provide support for this, the His294 residue of MfFcnA was conservatively mutated to a glutamine, which abolished the activity of the mutant protein, MfFcnA4_H294Q, on the fucoidan substrates from C. filum and A. nodosum (Fig. 5C). The crystal structure of MfFcnA4_H294Q was determined and revealed no structural defects, other than substitution of the histidine, that would lead to the inactivity of the enzyme (not shown). This supports the general assignment of this pocket as the catalytic site and points to a specific catalytic role for His294 in MfFcnA and, by extension, His276 in P5AFcnA.
Figure 5 (legend, partial). Because the BiAfcB-LNFP II complex was generated with a D172A mutant, we overlapped the apo-BiAfcB structure (PDB code 3MO4) with the BiAfcB-LNFP II complex, and it is the unmutated active site of BiAfcB that is shown with the ligand from the complexed structure. In A and B, the substrate bound to BiAfcB is shown as yellow sticks. C, C-PAGE analysis of the activity of MfFcnA4_H294Q on fucoidan from C. filum and A. nodosum.

GH107s depolymerize fucoidan via an endohydrolytic retaining mechanism
MfFcnA is reported to have endo-fucoidanase activity (19), and the pattern of products from P5AFcnA activity on L. hyperborea fucoidan is also suggestive of endo-acting hydrolytic activity (Fig. 2C). The structures of these enzymes reveal active sites that are contoured as grooves on the surfaces of the enzymes (Fig. 6), consistent with recognition and internal cleavage of fucoidan chains. Furthermore, the different surface contouring of MfFcnA and P5AFcnA is consistent with the different specificities of the enzymes for fucoidans from various sources.
The ability of MfFcnA to hydrolyze fucoidan from C. filum, which is reported to be a class one fucoidan comprising an α(1-3)-linked fucose backbone with α(1-2) branches, contradicts the described specificity of the enzyme for α(1-4)-linkages. To resolve this ambiguity, we purified two products resulting from the digestion of C. filum fucoidan with MfFcnA2. One was the major, rapidly appearing product (within minutes), and the second appeared only after extensive incubation (hours). NMR analysis of the major product revealed it to be a tetrasaccharide comprising an α(1-4)-linked repeat of the disaccharide L-fucopyranose-2,3-disulfate-α(1-3)-L-fucopyranose-2-sulfate, which is the same as the tetrasaccharide product reported by Colin et al. (19) as resulting from the MfFcnA-catalyzed endohydrolytic digestion of fucoidan from the brown alga P. canaliculata. The product appearing after extensive digestion was the L-fucopyranose-2,3-disulfate-α(1-3)-L-fucopyranose-2-sulfate disaccharide. These results reveal that C. filum fucoidan does contain some fraction of α(1-4)-linkages. The production of the disaccharide likely results from slow conversion of the tetrasaccharide product, which may bind the enzyme with low affinity. Together, these results are consistent with those of Colin et al. (19) and support the conclusion that MfFcnA endohydrolytically cleaves α(1-4)-linkages in fucoidan, which appear to be present in the preparation of C. filum fucoidan that we used.
On the basis of these results, which suggest that the −1 subsite in MfFcnA accommodates L-fucopyranose-2-sulfate, we modeled this monosaccharide into the active site using the overlap of the MfFcnA4 structure with the coordinates of the BiAfcB-LNFP II complex as a guide. This suggested the presence of a basic pocket in the −1 subsite that may accommodate and complement the charge of this acidic sulfate ester group, thus implying that the sulfation pattern of fucoidan may play an important role in its recognition by the enzyme (Fig. 6A). A similar approach with P5AFcnA also reveals basic patches in the active-site groove, also suggesting an ability to accommodate sulfate groups (Fig. 6B). However, the unknown structures of P5AFcnA substrates and the different surface contouring of the P5AFcnA active site relative to that of MfFcnA make the role of sulfate accommodation less clear in this enzyme.
Given the similarities in the positioning of the catalytic residues in GH29 enzymes and the proposed nucleophile and acid/base residues in MfFcnA, we hypothesized that the GH107 enzyme should catalyze hydrolysis of the glycosidic bonds in fucoidan with a retaining catalytic mechanism, as used by GH29 enzymes (18,28). Direct determination of the stereochemical outcome of hydrolysis by MfFcnA by ¹H NMR proved challenging because of the low reaction rates and the relatively fast mutarotation of the hemiacetal product. To overcome this, we used a strategy wherein transglycosylation to an alcohol is monitored because the first formed acetal product is stereochemically stable; mercaptoethanol proved to be a suitable alcohol for this purpose (29,30). Incubation of C. filum fucoidan with MfFcnA2 in the presence of 10% 2-mercaptoethanol led to cleavage to oligosaccharide products. The smallest product formed was purified, and its structure was determined by NMR (Fig. 7A). This analysis revealed that in this product the -OCH₂CH₂SH group is connected to C1 of the newly generated disaccharide with an α configuration. Formation of an α-glycoside product from an α-glycoside substrate confirms the retention of stereochemistry and thus reaction through a double-displacement mechanism. By extension to the hydrolysis reaction, this supports a retaining hydrolysis mechanism catalyzed by MfFcnA (Fig. 7B).
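The inference drawn here reduces to comparing the anomeric configuration at C1 before and after the reaction. The sketch below encodes only that comparison; the function name and string labels are hypothetical, and it deliberately ignores complications such as mutarotation of a free hemiacetal, which is the reason a stable glycoside product was trapped with 2-mercaptoethanol.

```python
def stereochemical_outcome(substrate_anomer: str, product_anomer: str) -> str:
    """Classify a glycosidase reaction from the anomeric configuration at C1."""
    valid = {"alpha", "beta"}
    if substrate_anomer not in valid or product_anomer not in valid:
        raise ValueError("anomers must be 'alpha' or 'beta'")
    # Same configuration before and after: net retention (two inversions,
    # as in a double-displacement mechanism). Opposite: net inversion.
    return "retaining" if substrate_anomer == product_anomer else "inverting"


# alpha-linked fucoidan substrate, alpha-configured mercaptoethyl glycoside product
print(stereochemical_outcome("alpha", "alpha"))
```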

Discussion
Glycoside hydrolase family 107 presently has eight formally classified members, with MfFcnA the only example whose characterization has been reported in the literature. The sizes of these putative proteins range from 574 amino acids to the largest, MfFcnA, at 1007 amino acids (Fig. 1). The bioinformatics analysis of MfFcnA by Colin et al. (19) suggested that this protein comprises multiple modules. This was supported by our structural analysis of MfFcnA4, which clearly revealed the presence of four distinct modules within residues 29-732 of the protein; the structure and function of the C-terminal region of full-length MfFcnA remain unknown. The comparison of the MfFcnA4 and MfFcnA9 structures suggests that the first three modules, D1 to R2, comprise a structurally stable unit. This is consistent with the observations of Colin et al. (19) and our observations (not shown) that variants truncated smaller than a construct terminating after R1 could not be produced as soluble protein. However, only the (β/α)₈ D1 module of MfFcnA houses the catalytic machinery, with no apparent contribution to the active site from the R modules, which suggests that the GH107 family is largely defined by the D1 structural unit. Indeed, supporting this proposition, only the ~400-amino acid D1 module is conserved among the classified members of GH107 (Fig. 7). Although all of the presently classified members of GH107 are considerably larger than the conserved D1 domain, suggesting that a common feature of these enzymes is multimodularity (Fig. 1), the (β/α)₈ D1 module defining the GH107 family appears to be the minimal catalytically functional unit of these enzymes, as it clearly is in P5AFcnA and P19DFcnA.
The (β/α)₈-barrel adopted by the D1 domain of MfFcnA and P5AFcnA revealed highest structural similarity to GH29 exo-α-fucosidases. Structural conservation of functionally similar amino acid side chains with the catalytic residues of GH29 enzymes allowed us to postulate a retaining catalytic mechanism for this enzyme, which was supported by the results of NMR analyses. The apparent utilization of a histidine side chain in both enzymes as a catalytic residue, as suggested by the structures and reinforced by the MfFcnA4_H294Q mutant lacking detectable activity, is unusual for a glycoside hydrolase, but it has been associated with phosphorolytic enzymes (31). However, an analysis of fucoidan hydrolysis in a phosphate-containing buffer by NMR did not provide any evidence for the formation of sugar phosphates. Thus, we propose that the current evidence is most consistent with MfFcnA utilizing a retaining hydrolytic mechanism whereby the side chain of Asp226 acts as the nucleophile and the side chain of His294 acts as the acid/base (Fig. 7B). Supporting this, both of these residues are conserved among currently classified GH107 enzymes, as well as P5AFcnA and P19DFcnA (Fig. 8A). Precedent for nonphosphorolytic glycoside hydrolases using a putative catalytic histidine exists in the glycoside hydrolase family 117 exo-α-agarases, where a histidine residue is proposed to act as an acid in an inverting catalytic mechanism (32,33), and in GH145 L-Rhα-α-1,4-GlcA α-L-rhamnohydrolases, where a histidine also functions as a catalytic residue in a unique mechanism suggested to involve an epoxide intermediate (34).
Other examples in which a (potentially) anionic carboxyl side chain has been replaced by a neutral residue such as histidine are found in the GH1 myrosinases, where the acid/base Glu is replaced by Gln and an exogenous base (ascorbate) binds in its place (35); in the sialidases, where a neutral tyrosine residue takes the place of a carboxylate nucleophile (36); and in many sialyl transferases, where His serves as the general base catalyst (37). The majority of these enzymes bind and work upon an anionic substrate; thus it is supposed that the replacement of the anionic carboxylate in the active site by a neutral residue such as histidine allows the anionic substrate to bind effectively. Such is presumably also the case for GH107 enzymes so that they may accommodate the sulfated substrate.
The similarity of the MfFcnA4 and P5AFcnA structures with those of the GH29 enzymes provided initial evidence of the likely catalytic residues in these enzymes, but the remainder of the conservation between the GH107 active site and GH29 active sites is relatively limited. Indeed, this appears to be largely the case within the complete GH107 family. Conservation of amino acid residues in the D1 domain of this family, including P5AFcnA and P19DFcnA, was assessed by ConSurf analysis (38) and mapped onto the D1 domain of the MfFcnA structure; this shows that the overall sequence conservation on the protein surface is quite low (Fig. 8B). There are 13 residues of the total ~400 residues in the catalytic D1 domain that show conservation in all current members of the GH107 family, and notably, this includes the proposed catalytic Asp and His residues and four additional residues that are present in the −1 subsite. Thus, this family displays the anticipated conservation of the putative catalytic machinery and the features presumably allowing accommodation of fucose in the −1 subsite.
The lack of conservation in the surface regions neighboring the −1 subsite in GH107 enzymes likely reflects the different subsite architectures that allow the enzymes to recognize different fucoidan structures. This is exemplified by the differing groove-like topologies of the MfFcnA and P5AFcnA substrate-binding sites, which appear to reflect the need for the catalytic sites to accommodate different polymers of fucose. For example, MfFcnA binds a substrate with alternating α(1-3)- and α(1-4)-linkages, such as is found in A. nodosum fucoidan (7,10,39) and, based on our results, in C. filum fucoidan as well. However, MfFcnA lacked activity on Fucus vesiculosus fucoidan, which is thought to have the same class two backbone structure as A. nodosum fucoidan (7,12,40), suggesting additional subtleties, such as sulfation patterns, in the recognition of fucoidan by individual enzymes. Indeed, we hypothesize a pocket in the −1 subsite of MfFcnA that accommodates a specific sulfate substituent (Fig. 6A). In contrast, the substrates utilized by the new fucoidanases P5AFcnA and P19DFcnA were the same for both enzymes and different from the MfFcnA substrates. Because these two enzymes were active on fucoidans whose structures are poorly defined (L. hyperborea and M. pyrifera) or are highly variable and chemically complex (S. japonica), we presently cannot comment on their precise substrate requirements (2,12,3). However, it is clear that their preference is distinct from that of MfFcnA, and this property must be imparted by different active-site topologies that allow them to accommodate unique fucoidan structures.
Figure 8 (legend, partial). ... green arrow), the conserved putative histidine acid/base (indicated by yellow arrow), and the conserved tryptophan residue in the −1 subsite. The numbering above the alignment represents that of the MfFcnA4 structure. B, the conservation of amino acids of the full D1 domain from A mapped onto the solvent-accessible surface of the D1 domain structure from MfFcnA4 calculated by ConSurf (38). Amino acids colored turquoise are considered variable through to amino acids colored maroon, which are considered highly conserved. The inset focuses on the putative −1 subsite with the highly conserved residues shown as sticks and colored as in A. The bound ethylene glycol molecule is shown as green sticks.
Together, these findings begin to illuminate the molecular details underpinning the biological processing of fucoidan. Although fucoidan is broadly classified into two types based on the stereochemistry of the fucose backbone, it is clear that this is a structurally diverse family of polysaccharides resulting from the enormous range of potential modifications to the backbone, including, for example, sulfations, branching, and decoration with other sugars. This high structural diversity appears to be reflected in the differential specificity of even a very small glycoside hydrolase family like GH107, which shows evidence of a conserved catalytic mechanism but likely variable active-site architectures to provide differential substrate specificity. These molecular-level observations provide a foundation that may ultimately aid in the informed implementation of such enzymes as biocatalytic tools to characterize fucoidans and/or produce potential biologically active molecules from fucoidan.

Materials
All reagents, chemicals, and other carbohydrates were purchased from Sigma unless otherwise specified. Purified fucoidans from M. pyrifera and F. vesiculosus were from Sigma; fucoidans from C. filum, A. nodosum, and Durvillaea antarctica were from Elicityl; L. hyperborea fucoidan was from FMC; S. japonica fucoidan was from Cactus Botanicals; Undaria pinnatifida fucoidan was from Marinova/NZP; and Cladosiphon okamuranus fucoidan was from Kanehide Bio Co. Ltd.

Fucoidanase cloning, recombinant expression, and purification
The E. coli codon-optimized MfFcnA2 gene, encoding amino acid residues Gln29-Thr794 of the full-length protein fused to an N-terminal six-histidine tag by a thrombin cleavage site, was generated synthetically and inserted into pET28a (Genscript) to make the construct pET28a_MfFcnA2. The pET28a_MfFcnA4 construct was obtained by amplifying the truncated region of the MfFcnA2 gene, including the N-terminal six-histidine tag, from the template pET28a_MfFcnA2 using the primers MfFcnA2_F (5′-AGCCATATGGCTAGCCAAGTACCAGATCCAAACCAAG) and MfFcnA4_R (5′-GTGGTGGTGCTCGAGTTAACTAATAATAGTTGCAATTACGTTGA) and inserting it into pET28a using the In-Fusion HD cloning kit (Takara Bio USA, Inc.). The pET28a_MfFcnA9 construct was obtained by the same procedure but using the primers MfFcnA2_F (5′-AGCCATATGGCTAGCCAAGTACCAGATCCAAACCAAG) and MfFcnA9_R (5′-GTGCTCGAGTTAATCCACGGTAATCTTGATTTCGG). The MfFcnA_H294Q mutant was constructed using the QuikChange site-directed mutagenesis method (Agilent Technologies) with the plasmid pET28a_MfFcnA4 as the template. The following primers were used: MfFcnA_H294Q_F (5′-TATACCTTTGGCCAGCCGTTTGGTGGCGCGGGTAA) and MfFcnA_H294Q_R (5′-GCCACCAAACGGCTGGCCAAAGGTATAATCGTCAAACAGG). The P5AFcnA and P19DFcnA gene sequences, obtained from genomes of marine isolates SW5A and SW19D (see below), were synthetic E. coli codon-optimized genes, excluding signal peptides and including an N-terminal six-histidine tag. Plasmids with these genes inserted into pET28a, giving the constructs pET28a_p5AFcnA and pET28a_P19DFcnA, respectively, were ordered from Genscript. The sequence fidelity of all constructs was confirmed by bidirectional DNA sequencing.
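As a quick consistency check on the H294Q mutagenic primers listed above, the following minimal sketch confirms that the two primers anneal to opposite strands over a shared region and that the forward primer carries a glutamine codon (CAG). The primer sequences are copied from the text as contiguous strings. The reading frame (frame 0 of the forward primer) is an assumption made for illustration and is not stated in the text.

```python
# Primer sequences as given in the text (5' to 3'), written as contiguous strings.
H294Q_F = "TATACCTTTGGCCAGCCGTTTGGTGGCGCGGGTAA"
H294Q_R = "GCCACCAAACGGCTGGCCAAAGGTATAATCGTCAAACAGG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def suffix_prefix_overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for n in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:n]):
            return n
    return 0


# Express the reverse primer on the same strand as the forward primer.
top_from_reverse = reverse_complement(H294Q_R)
overlap = suffix_prefix_overlap(top_from_reverse, H294Q_F)

# ASSUMPTION: the forward primer is read in frame 0.
codons = [H294Q_F[i:i + 3] for i in range(0, len(H294Q_F) - 2, 3)]

print("overlap (nt):", overlap)
print("codon 5 of forward primer:", codons[4])
```

With these sequences the two primers share a 27-nucleotide complementary region, and the fifth codon of the forward primer in the assumed frame is CAG, a glutamine codon, consistent with a His-to-Gln substitution.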
Plasmids were used to transform E. coli BL21 (DE3), and subsequent transformants were grown in Luria-Bertani medium with 50 µg/ml kanamycin at 37°C at 180 rpm to an A600 of ~0.5. The growth temperature of the cultures was then reduced to 16°C for 45 min before adding isopropyl β-D-thiogalactopyranoside to a final concentration of 0.05 mM to induce protein expression. Following further incubation of 15-20 h, the harvested cells were chemically lysed by resuspension in 35% (w/v) sucrose, 1% (w/v) deoxycholate, 1% (v/v) Triton X-100, 10 mg of lysozyme, and 0.2 µg/ml DNase in 20 mM Tris-HCl (pH 8.0). Cell lysate was centrifuged at 16,000 × g for 30 min to separate cellular supernatant and pellet. The supernatant was applied to a nickel-affinity chromatography column and eluted with 20 mM Tris-HCl, 0.5 M NaCl (pH 8.0), with increasing imidazole concentrations (20-500 mM). Eluted protein was concentrated and then further purified by size-exclusion chromatography using a HiPrep 16/60 Sephacryl S-200 HR column in 20 mM Tris-HCl, 0.5 M NaCl (pH 8.0).

Isolation of fucoidan-degrading bacteria
The fucoidanolytic marine Psychromonas sp. isolates SW5A and SW19D were isolated from brown macro algae collected from the shores of Cadboro Bay and Willows Beach (Victoria, Canada). Fragments of the macro algae were used to inoculate 10 ml of minimal seawater medium supplemented with 0.1% (w/v) of various purified fucoidan preparations and incubated at 25°C at 200 rpm for 20 h. The cultures were used to inoculate another 10 ml of minimal seawater medium supplemented with 0.1% (w/v) fucoidan and incubated at 25°C at 200 rpm for 20 h. The cultures were then used to inoculate Zobell Marine plates 2216 (HiMedia Laboratories) and incubated at 25°C for up to 3 days. Single colonies were used to inoculate 10 ml of minimal seawater medium supplemented with 0.1% (w/v) fucoidan and incubated at 25°C at 200 rpm for 20 h. The cultures were then used to inoculate another 10 ml of minimal seawater medium supplemented with 0.1% (w/v) fucoidan and incubated at 25°C at 200 rpm for 20 h. Two isolated bacterial strains designated SW5A and SW19D were found to grow on purified commercial fucoidan.

Sequencing, assembly, and annotation of fucoidan-degrading bacteria genomes
Genomic DNA was extracted from SW5A and SW19D grown in 5 ml of Zobell Marine broth, incubated for 20 h at 25°C at 200 rpm, using the DNeasy blood and tissue kit (Qiagen). Paired-end DNA libraries were prepared using NEBNext dsDNA Fragmentase (New England BioLabs), SPRIselect reagent kit (Beckman Coulter, Inc.), and NEBNext Ultra™ II DNA library prep kit for Illumina (New England BioLabs). The libraries were quantified using the QuBIT™ dsDNA HS assay kit (Invitrogen), the NEBNext Library Quant kit (New England BioLabs), and an Agilent 2100 Bioanalyzer high sensitivity DNA chip (Agilent Technologies). Next generation sequencing of libraries was performed on the Illumina MiSeq platform with the MiSeq V2 reagent kit (500 cycles) (Illumina). Sequencing analysis of paired-end reads and genome assembly were performed using the A5-miseq pipeline (41). Analysis of the 16S rRNA revealed that both bacterial strains belong to the genus Psychromonas; they are consequently referred to as Psychromonas sp. SW5A and Psychromonas sp. SW19D. The dbCAN webserver and database (42) were used to annotate all carbohydrate-active enzymes present in the genomes, including the identification of P5AFcnA from Psychromonas sp. SW5A and P19DFcnA from Psychromonas sp. SW19D.

Assays for fucoidanase activity
Fucoidanase activity assays were conducted using C-PAGE analysis of the release of anionic oligosaccharides (43). Briefly, 0.25% (w/v) fucoidan and 1 µM fucoidanase in 20 mM Tris-HCl (pH 8.0) buffer with 0.5 M NaCl were incubated for 40 h at room temperature. Sulfated carrageenan oligosaccharides were used at 0.05% (w/v) as standards. Loading dye (5% (w/v) glycerol, 0.005% (w/v) bromphenol blue) was added to the samples and standards, which were run on an acrylamide/bis-acrylamide electrophoresis gel with a 5% (w/v) stacking layer and 24% (w/v) resolving layer in 75 mM Tris-HCl (pH 8.8) for 15 min at 100 V, 20 min at 150 V, and 30 min at 200 V. Gels were stained with 0.1% (w/v) O-toluidine blue in 1% (v/v) acetic acid and destained with 1% acetic acid.

HDX-MS
HDX reactions were conducted in triplicate with 31 pmol of MfFcnA2 and were initiated by the addition of 45 µl of D2O buffer solution (10 mM HEPES, pH 7.5, 50 mM NaCl, 97% D2O), to give a final concentration of 87% D2O. Exchange was carried out for 3 s on ice and was terminated by the addition of ice-cold quench buffer (final concentration, 0.6 M guanidine HCl, 0.8% formic acid). A fully deuterated sample was prepared by incubating MfFcnA2 in 3 M guanidine HCl at room temperature for 30 min, prior to a 1-h exposure to D2O buffer solution on ice. The samples were flash frozen in liquid nitrogen and stored at −80°C until mass analysis. The peptides were generated by running the sample over two immobilized pepsin columns (Applied Biosystems; porosyme, 2-3131-00) at 10 and 2°C at 200 ml/min for 3 min. Mass analysis was completed as described previously (44-46). Complete peptide data are given in Fig. S1.
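The dilution arithmetic behind the stated final D2O concentration can be sketched as follows. The sketch assumes that the buffer volume is 45 µl, that the protein sample contains no D2O, and that volumes are additive. The second function shows a commonly used normalization against a fully deuterated control; it is an assumption about how such a control is typically used, not a description of the authors' analysis, and the masses passed to it are hypothetical.

```python
def sample_volume_for_final_d2o(buffer_ul: float,
                                buffer_d2o_fraction: float,
                                final_d2o_fraction: float) -> float:
    """Volume (µl) of D2O-free sample that dilutes the D2O buffer
    from buffer_d2o_fraction down to final_d2o_fraction."""
    # buffer_ul * f_buffer = (buffer_ul + sample_ul) * f_final
    return buffer_ul * (buffer_d2o_fraction / final_d2o_fraction - 1.0)


def percent_deuteration(m_observed: float,
                        m_undeuterated: float,
                        m_fully_deuterated: float) -> float:
    """Deuterium uptake of a peptide relative to a fully deuterated control.
    ASSUMPTION: a common normalization convention, not taken from the text."""
    return 100.0 * (m_observed - m_undeuterated) / (m_fully_deuterated - m_undeuterated)


# 45 µl of 97% D2O buffer diluted to 87% D2O
protein_ul = sample_volume_for_final_d2o(45.0, 0.97, 0.87)
print(f"implied protein sample volume: {protein_ul:.2f} µl")

# Hypothetical peptide centroid masses (Da), for illustration only
print(percent_deuteration(1001.0, 1000.0, 1004.0))
```

Under these assumptions, the stated concentrations imply a protein sample volume of roughly 5 µl per reaction.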

Crystallization, data collection, structure solution, and refinement
Protein crystals were grown at 18°C using the hanging-drop vapor-diffusion method. Hanging drops were made by adding 1 µl of protein solution (protein between 10 and 20 mg/ml in 20 mM Tris-HCl, pH 8.0, 500 mM NaCl) to 1 µl of crystallization condition. Crystals used to solve the MfFcnA4 and MfFcnA4_H294Q structures grew in the crystallization condition 0.1 M NaCl, 1.4 M (NH4)2SO4, sodium acetate:acetic acid (pH 4.5). Crystals used to solve the MfFcnA9 structure grew in the crystallization condition 0.2 M MgCl2, 0.1 M Bis-Tris (pH 5.6), 19% PEG 3350. Crystals used to solve the P5AFcnA structures grew in the crystallization condition 2 M sodium malonate (pH 6-7). P5AFcnA crystals required no additional cryo-protection prior to flash cooling in a nitrogen stream at 100 K. However, crystals of the MfFcnA constructs required cryo-protection in crystallization solution supplemented with 30% ethylene glycol before being flash-cooled in a nitrogen stream at 100 K.
Diffraction data were collected either on an "in-house" instrument comprising a PILATUS3 R 200K hybrid pixel array detector coupled to a MicroMax-007HF X-ray generator with a Vari-Max™-HF Arc)Sec confocal optical system and an Oxford Cryostream 800 or on Beamline 7-1 at the Stanford Linear Accelerator Center (Stanford Synchrotron Radiation Lightsource (SSRL)). In-house diffraction data were indexed, integrated, and scaled using the HKL2000 suite (47) and converted to an mtz file using SCALEPACK2MTZ in the CCP4 platform. The data collected at SSRL were processed using iMosflm (48) for indexing and integration and Aimless for truncation and scaling (49). Initial structures of MfFcnA4 and P5AFcnA were solved by SIRAS using the SHARP/autoSHARP package (50) with 2.8 Å (MfFcnA4) and 1.98 Å (P5AFcnA) native data sets and 2.9 Å (MfFcnA4) and 2.4 Å (P5AFcnA) data sets collected on derivatives produced by a 10-15-min soak of crystals in cryo-protecting solution supplemented with 1 M NaI.
For the MfFcnA4 structure, the resulting phases were of sufficient quality for BUCCANEER to automatically build a ~60-70% complete model comprising mainly the backbone of the protein. This model was improved using a higher-resolution 2.2 Å MfFcnA4 data set collected using synchrotron radiation: the initial model served as a template for molecular replacement using PHASER (51), and BUCCANEER was then used to automatically build a new, nearly complete model. Iterative model building and refinement were used to complete the model. This model was used as the starting point to determine the structure of the MfFcnA4 H294Q mutant using data collected to 2.85 Å in-house. Diffraction data for MfFcnA9 were collected in-house, and the structure of MfFcnA9 was solved by molecular replacement using the final MfFcnA4 model as a search model and the program PHASER (51) in the CCP4 platform. The MfFcnA9 model was completed by iterative building and refinement as above.
For the P5AFcnA structure, the resulting phases were of sufficient quality for BUCCANEER to automatically build an initial model. The initial model was used as a template for molecular replacement using PHASER (51) and a 1.55 Å P5AFcnA data set. The resulting phases were used as a starting point for iterative model building and refinement to complete the P5AFcnA model.
All model building and refinement were performed using COOT (52) and REFMAC (53). The addition of water molecules was performed in COOT with FINDWATERS and manually checked after refinement. In all data sets, refinement procedures were monitored by flagging 5% of all observations as "free" (54). Model validation was performed using MolProbity (55). All data collection, processing, and model refinement statistics are shown in Table 1.
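The idea of setting aside 5% of observations as a "free" set can be illustrated with a minimal sketch. It is not the procedure implemented in the crystallographic software named above: the random selection, the function names, and the amplitudes are hypothetical and serve only to show how a cross-validation set is held out and how an R factor is computed separately on the working and free sets.

```python
import random


def assign_free_flags(n_reflections: int,
                      free_fraction: float = 0.05,
                      seed: int = 0) -> list:
    """Randomly mark a fixed fraction of reflections as 'free' (True).
    Free reflections are excluded from refinement and used only for validation."""
    rng = random.Random(seed)
    n_free = round(n_reflections * free_fraction)
    free = set(rng.sample(range(n_reflections), n_free))
    return [i in free for i in range(n_reflections)]


def r_factor(f_obs, f_calc) -> float:
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over the supplied reflections."""
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)


# Hypothetical structure-factor amplitudes, for illustration only
f_obs = [10.0, 20.0, 15.0, 30.0]
f_calc = [9.0, 22.0, 15.0, 27.0]
flags = [False, False, False, True]  # last reflection held out as 'free'

work = [(o, c) for o, c, free in zip(f_obs, f_calc, flags) if not free]
test = [(o, c) for o, c, free in zip(f_obs, f_calc, flags) if free]
r_work = r_factor(*zip(*work))
r_free = r_factor(*zip(*test))
print(r_work, r_free)
```

Because the free reflections never enter refinement, a free R factor that diverges from the working R factor is a warning sign of overfitting.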

Preparation of fucoidan oligosaccharides
Fucoidan from C. filum (500 mg) was resuspended in 20 mM Tris (pH 8.0) with excess MfFcnA4 (10 µM) and incubated for 3 h at room temperature on an Enduro™ MiniMix rotating mixer (Labnet International). The assay was frozen, lyophilized, resuspended in 0.5 ml of distilled water, and centrifuged at 12,000 rpm for 10 min in a 3-kDa MWCO Amicon Ultra centrifugal filter (Merck Millipore). This process was repeated two times, and all flow-through, containing small molecular weight oligosaccharides, was collected, pooled, and purified on a size-exclusion-chromatography column containing Bio-Gel P-2 gel in 0.1 M ammonium bicarbonate. Elution fractions were analyzed by TLC using the solvent system 8:4:1 formic acid:butanol:water. The resulting pure samples were pooled, lyophilized, resuspended in 0.5 ml of distilled water, and vacuum-concentrated to remove excess ammonium carbonate. 1 H, 13

The stereochemistry of hydrolysis by NMR
To a 1% (w/v) solution of C. filum fucoidan in Na-HEPES buffer (10 mM, pH 7) containing 10% (w/w) 2-mercaptoethanol was added 5 µM MfFcnA2, and the reaction was incubated for 3 days at 25°C. The formation of hydrolysis products was monitored by TLC (1:1:1 water:isopropanol:ethyl acetate), as well as 1H NMR. The reaction mixture was lyophilized to dryness and sonicated in MeOH. The fraction solubilized in MeOH was shown to contain the free disaccharide (Rf = 0.67) and a new disaccharide spot (Rf = 0.71), whereas the remaining solid contained the free disaccharide (Rf = 0.67) and tetrasaccharide (Rf = 0.31). The new disaccharide product (Rf = 0.71) was isolated by preparative TLC (1:1:1 water:isopropanol:ethyl acetate) and analyzed by NMR (Fig. S2). 1H HMBC and NOESY experiments revealed that -OCH2CH2SH is connected to the C-1 of the newly generated disaccharide with α configuration (J1,2 = 3.8 Hz). In a control reaction, no detectable disaccharide acetal was formed when incubating the free disaccharide with 2-mercaptoethanol under the same conditions.
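The reasoning that links the measured coupling constant to the stereochemical outcome can be summarized in a short sketch. It relies on the general rule that, in fucopyranosides, a small J1,2 (about 3-4 Hz) indicates an α anomer and a large J1,2 (about 7-8 Hz) a β anomer. The numerical cutoffs and function names below are illustrative assumptions, not values taken from this study.

```python
def anomeric_configuration(j12_hz: float) -> str:
    """Assign the anomeric configuration of a fucopyranoside from J(H1,H2).
    ASSUMPTION: illustrative cutoffs; small couplings -> alpha, large -> beta."""
    if j12_hz < 5.0:
        return "alpha"
    if j12_hz >= 7.0:
        return "beta"
    return "ambiguous"


def stereochemical_outcome(substrate_anomer: str, product_anomer: str) -> str:
    """Compare the configuration of the cleaved linkage with that of the trapped product."""
    return "retaining" if substrate_anomer == product_anomer else "inverting"


# The substrate linkages are alpha; the 2-mercaptoethanol adduct shows J1,2 = 3.8 Hz.
product = anomeric_configuration(3.8)
print(product, stereochemical_outcome("alpha", product))
```

An α-configured transglycosylation product formed from an α-linked substrate is the outcome expected for a retaining mechanism.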