Analysis of Keystone Enzyme in Agar Hydrolysis Provides Insight into the Degradation (of a Polysaccharide from) Red Seaweeds*

Background: The catalytic mechanism and substrate recognition required for cleavage of the α-linkage in agarose are unclear. Results: Structural analysis of a family 117 glycoside hydrolase details substrate recognition and supports an inverting mechanism. Conclusion: GH117 enzymes use substrate distortion and an unusual general acid for catalysis. Significance: Microbes may utilize alternate strategies to catalyze the degradation of polysaccharides with unique structural characteristics.

Agars are abundant polysaccharides from marine red algae, and their chemical structure consists of alternating D-galactose and 3,6-anhydro-L-galactose residues, the latter of which are presumed to make the polymer recalcitrant to degradation by most terrestrial bacteria. Here we study a family 117 glycoside hydrolase (BpGH117) encoded within a recently discovered locus from the human gut bacterium Bacteroides plebeius. Consistent with this locus being involved in agarocolloid degradation, we show that BpGH117 is an exo-acting 3,6-anhydro-α-(1,3)-L-galactosidase that removes the 3,6-anhydrogalactose from the non-reducing end of neoagaro-oligosaccharides. A Michaelis complex of BpGH117 with neoagarobiose reveals the distortion of the constrained 3,6-anhydro-L-galactose into a conformation that favors catalysis. Furthermore, this complex, supported by analysis of site-directed mutants, provides evidence for an organization of the active site and positioning of the catalytic residues that are consistent with an inverting mechanism of catalysis and suggests that a histidine residue acts as the general acid. This latter feature differs from the vast majority of glycoside hydrolases, which use a carboxylic acid, highlighting the alternative strategies that enzymes may utilize in catalyzing the cleavage of glycosidic bonds.

Algae are the plants of the ocean; however, they produce oligo- and polysaccharides that are structurally and chemically distinct from those found on land. Although microbial degradation of these algal carbohydrates is of ecological importance and of increasing interest for biofuel applications (1), how bacteria depolymerize and metabolize algal carbohydrates remains unclear in many cases. Like the cell walls of terrestrial plants, the cell wall and storage polysaccharides of marine macroalgae comprise a massive sink of biomass and photosynthetically fixed carbon (2). Unlike the cell walls of terrestrial plants, however, the cell walls of macroalgae either lack lignin or are much less lignified. Furthermore, their matrix polysaccharides form flexible gels instead of crystalline polysaccharides like those of terrestrial plants, and this confers important properties to macroalgae (3). First, it permits dehydration and rehydration of the seaweed cell wall and thereby protects against desiccation occurring in the intertidal zone. Second, it enables the macroalgae to remain flexible enough to withstand the physical challenge imposed by this particular environment (e.g. waves and tides). Agars are the dominant gel-forming cell wall matrix polysaccharide (70%) in red seaweeds (agarophytes) and comprise a class of heterogeneous polymers having various degrees of sulfation and other substitutions including O-methylation. One of the major constituents of the cell wall of agarophytes, however, is the neutral and linear polysaccharide termed agarose (4,5).
Agarose is made of neoagarobiose motifs in which 3,6-anhydro-L-galactose is α-1,3-linked to D-galactose; the resulting disaccharide is linked by β-1,4-glycosidic bonds to form the linear polymer (see Fig. 1A). Agarose forms a helical structure stabilized by intra- and intermolecular hydrogen bonds to generate a high gel strength polymer. The 3,6-anhydro-L-galactose, which is a bridged bicyclic system containing fused five- and six-membered rings, is critical for helix formation. Polysaccharides that lack this modification, like porphyran, which is a sulfated agarocolloid related to agarose, form viscous solutions instead of flexible gels (6). The high gel strength of agarose and its resistance to degradation by many microbes have supported its wide utilization in microbiological, molecular biological, and food applications (7). However, marine bacteria have evolved systems of metabolic enzymes to enable utilization of this abundant marine polymer as both a carbon and energy source (4). These metabolic pathways are initiated by specific carbohydrate-active enzymes (CAZymes) (8) that are responsible for the initial depolymerization steps.
The largest class of carbohydrate-degrading CAZymes are the glycoside hydrolases, which hydrolyze glycosidic bonds found within high molecular weight carbohydrates to produce mono-, di-, and short oligosaccharides. These enzymes are ubiquitous throughout all domains of life. Currently, 130 families of GHs are classified in the CAZy database (Carbohydrate-Active Enzyme database) (8). Enzymes within one family are related by amino acid sequence and thus share common protein folds and catalytic mechanisms (8). In general, glycoside hydrolases catalyze hydrolysis of the glycosidic bond using either a two-step double displacement mechanism resulting in retention of stereochemistry at the anomeric carbon or a one-step, single displacement mechanism that leads to a product with inverted stereochemistry at the anomeric center (9).
The highly polymerized nature of agarose, the presence of both α- and β-bonds in its backbone, and its composition of two different monosaccharide units create a challenge for its degradation by CAZymes. As such, both endo-acting (cleaving internally in the backbone) and exo-acting (acting only on the chain ends) CAZymes that cleave the β-1,4-glycosidic bond of agarose (Fig. 1A) have been described; these belong to glycoside hydrolase families 16, 50, 86, and 118. CAZymes that cleave the α-1,3-glycosidic bond presently belong to families 96 and 117. β-Agarases are relatively well studied, and much mechanistic and structural insight into the cleavage of the β-bond in agarose has been gained (4, 10, 11); however, comparatively little is known about the α-agarases (12).
Family GH96 comprises a unique but very small family of endo-α-agarases. In contrast, family GH117 is substantially larger and more widely distributed in nature, suggesting a more general role of this enzyme family in agarose metabolism. Members of this family are known to cleave the α-1,3-glycosidic bond in neoagaro-oligosaccharides to release 3,6-anhydro-L-galactose. This family was created based on the identification of the neoagarobiose hydrolase from Saccharophagus degradans (13), whereas the first structural and bioinformatic description for family GH117 was provided for Zg4663 (ZgAhgA) from the marine flavobacterium Zobellia galactanivorans (14). Proteins belonging to GH117 are the only known enzymes that release monosaccharides from agaro-oligosaccharides; thus, they are of critical importance for polysaccharide utilization in agarolytic bacteria. Most GH117 enzymes are found in bacteria associated with seaweed, coral, and the intestines of fish, which is consistent with the abundance of agarose in the marine environment, but surprisingly these enzymes are also found, albeit with lower frequency, in bacteria from soil, the oral cavity, and the human intestine (14,15). The x-ray crystal structures of two GH117 enzymes, ZgAhgA from Z. galactanivorans (14) and SdGH117 from S. degradans (13), revealed the structural similarity between GH117 and GH43 enzymes (16). In particular, conservation of active site residues has led to the postulation that the GH117 enzymes share an inverting catalytic mechanism with the GH43 enzymes (14). Significant alterations in the active sites of the two enzyme families, however, have kept predictions about the general mechanism of catalysis and the identity of the catalytic residues in GH117 speculative. Here we present a structural and mutational analysis of the GH117 enzyme from Bacteroides plebeius called BpGH117.
The x-ray crystallographic analysis of an inactive mutant of the enzyme in complex with unhydrolyzed substrate reveals the complex suite of residues involved in recognizing substrate and hydrolyzing the glycosidic bond and provides strong support for the proposed inverting hydrolytic mechanism. This analysis reveals distortion of the constrained 3,6-anhydro-L-galactose residue to a conformation conducive to hydrolysis of the glycosidic bond. In the proposed inverting catalytic mechanism, a histidine residue is suggested to act as the general acid, which would be the first observation of its kind for a glycoside hydrolase and would add to the recent example of a histidine acting as the acid/base catalytic residue for the retaining catalytic mechanism used by the N-acetylglucosaminidase from Bacillus subtilis (17).

EXPERIMENTAL PROCEDURES
Gene Cloning, Protein Production, and Protein Purification-The portion of the gene encoding BpGH117 (locus tag Bacple_1671) without the signal peptide, which was identified by the SignalP server (18), was amplified by polymerase chain reaction from B. plebeius genomic DNA and comprised amino acid residues 22-402 (see Table 1 for primer sequences). The DNA that served as template for the PCR was extracted from B. plebeius obtained from DSMZ GmbH. The PCR product was cloned into pET-28 by standard directional cloning with the restriction enzymes NheI and XhoI. The sequence was verified by bidirectional sequencing. The encoded protein, referred to as BpGH117, contained an N-terminal 6-histidine tag separated from the domain of interest by a thrombin cleavage site. This cloned gene construct was used as a template for site-directed mutagenesis using the QuikChange site-directed mutagenesis kit (Stratagene catalogue number 200518). The standard protocol was used with the primers listed in Table 1. All mutations were verified by bidirectional DNA sequencing. The protein production and purification protocol was identical for BpGH117 and mutants. Plasmids were transformed into chemically competent Escherichia coli BL21(DE3) Star cells (Invitrogen). For protein expression, 5 ml of LB medium supplemented with 50 μg/ml kanamycin was inoculated with a single bacterial colony and incubated at 30°C for 8 h. One milliliter was used to inoculate 1 liter of autoinduction medium (19), and the cultures were incubated with shaking for 5 days at 20°C.
Cells were harvested by centrifugation and resuspended in 25 ml of sucrose solution (25% w/v), 50 mM Tris-HCl, pH 8.0. Ten milligrams of lysozyme was added to the suspension, and stirring was continued for 10 min. 50 ml of a deoxycholate solution containing 1% deoxycholate (w/v), 1% Triton X-100 (v/v), 20 mM Tris-HCl, pH 7.5, 100 mM NaCl was added with continued stirring for 10 min. The solution was adjusted to 5 mM MgCl2, and 2 mg of DNase was added. Cellular debris was separated by centrifugation at 15,000 rpm for 45 min at 4°C.
The supernatant was applied onto a 2-ml Ni2+-charged nitrilotriacetic acid-Sepharose resin (GE Healthcare), which was equilibrated with binding buffer (20 mM Tris-HCl, pH 8.0, 500 mM NaCl). Proteins were eluted with a linear gradient of imidazole from 0 to 500 mM in binding buffer. Fractions were analyzed by SDS-PAGE, and fractions containing protein were pooled and dialyzed overnight into 20 mM Tris-HCl, pH 8.0. After dialysis, the proteins were further purified by anion exchange chromatography on a Bio-Rad UNO Q6 column. Proteins were eluted in 20 mM Tris-HCl, pH 8.0 with a linear gradient to 1 M NaCl; protein-containing fractions were concentrated in a stirred cell ultrafiltration unit using a 5-kDa molecular mass cutoff membrane. Following concentration, a final polishing step of size exclusion chromatography was performed on a Sephacryl 16/60 Hiprep S-200 column using 20 mM Tris-HCl, pH 8.0. The proteins were concentrated in a stirred ultrafiltration unit prior to crystallization and activity measurements. Selenomethionine-containing protein was produced in E. coli B834(DE3) cells in minimal autoinduction medium that lacked yeast extract and tryptone and was supplemented with selenomethionine (40 mg/liter) and the SelenoMet Nutrient Mix (5.1 g/liter), both from Athena Enzyme Systems.
Crystallization and Data Collection-Crystallization experiments were performed at 18°C by the vapor diffusion method using both hanging drops and sitting drops. Crystals of BpGH117 and mutants were obtained in conditions comprising polyethylene glycol 3350 at 15-25% (w/v) with 100 mM ammonium acetate (Sigma), 100 mM Bis-Tris, pH 5.5 (Sigma). Prior to crystallization, mother liquor and protein solution were mixed and centrifuged for 30 min at 12,000 × g and 4°C to remove precipitated protein; crystallization experiments were set up with the supernatant. The selenomethionine derivative of BpGH117 was crystallized in 15% PEG 3350 with 100 mM ammonium acetate, 100 mM Bis-Tris, pH 5.5 by streak seeding with microseeds of native crystals. A complex with neoagarobiose was obtained by co-crystallizing BpGH117E303Q (at 10 mg/ml) with 2 mM neoagarobiose in 0.1 M HEPES, pH 7.5, 0.15 M ammonium acetate, 17-21% PEG 3350.
The crystals were cryoprotected by soaking in crystallization solution supplemented with 20% ethylene glycol or glycerol. The crystals were flash frozen in liquid nitrogen, and data collection was carried out on either BL9-2 at the Stanford Synchrotron Radiation Laboratories (Stanford, CA) or the CMCF1 beam line at the Canadian Light Source (Saskatoon, Canada). Data were processed using MOSFLM (20) and SCALA (21) as part of the CCP4 suite (22).
Structure Solution and Model Refinement-The structure of BpGH117 was solved by the single wavelength anomalous dispersion method using an x-ray wavelength optimized for the selenium peak determined by a fluorescence scan (λ = 0.97911 Å). SHELXC/D was used to determine the selenium substructure, which comprised 20 of the possible 22 total seleniums present in the two BpGH117 monomers in the asymmetric unit (23). Phasing was performed with SHARP, which resulted in figures of merit of 0.43 and 0.17 for acentric and centric reflections, respectively, and a phasing power of 1.5 over the full resolution range of the data set (24). The 2-fold non-crystallographic symmetry operators were determined using PROFESS followed by phase improvement by solvent flattening (with an estimated solvent content of 44%) and non-crystallographic symmetry averaging using DM (22). These phases were sufficient for automatic model building of a nearly complete model using ARP/wARP (25). This model was corrected and completed by refinement with REFMAC (26) against the high resolution native data set followed by model building using Coot (27). The completed model was used as a starting point to solve the structures of the neoagarobiose complex and mutant proteins. In all cases, water molecules were added using Coot Find Waters and inspected visually prior to deposition. Five percent of the observations were flagged as "free" and used to monitor refinement procedures (28). Model validation was performed with MolProbity (29). All data collection and model statistics are given in Table 2.
Activity Assays-Detection of enzyme activity by TLC was performed as described previously (30). The activity of the BpGH117 mutants was compared with that of the native protein by a time course digestion assay with an enzyme concentration of 0.5 μM and neoagarobiose at a concentration of 0.5% (1.54 mM) in 50 mM NaPO4, pH 7.0. Aliquots were taken from each reaction at 5-min intervals and added to an equal volume of 0.1 M NaOH to stop the reaction. The samples were subsequently diluted into sterile distilled water to a final sugar concentration of 20 μM. The reactant and product sugars were then separated by high performance anion exchange chromatography with pulsed amperometric detection using a Dionex ICS 3000 HPLC equipped with an ASI 100 automated sample injector and an ED50 electrochemical detector (Dionex) with a gold working electrode and an Ag/AgCl reference electrode. Products were analyzed using a CarboPac PA-20 column set (analytical plus guard column). The running buffer comprised 0.1 M NaOH, and elution of the sugars was achieved using a linear gradient of 0-300 mM sodium acetate in 0.1 M NaOH.
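The substrate concentration in the activity assay is quoted both as a percentage (w/v) and as a molarity, and the two can be cross-checked with a short calculation. The sketch below is illustrative only: it assumes the molecular formula C12H20O10 for neoagarobiose (D-galactose plus 3,6-anhydro-L-galactose minus one water), which is not stated in the text, and all function names are hypothetical. Under that assumption, 1.54 mM corresponds to roughly 0.05% (w/v), whereas 0.5% (w/v) corresponds to roughly 15.4 mM, so the quoted percentage and molarity are worth checking against each other.

```python
# Assumed molecular formula of neoagarobiose (not given in the text):
# C6H12O6 (D-galactose) + C6H10O5 (3,6-anhydro-L-galactose) - H2O = C12H20O10
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}
NEOAGAROBIOSE_FORMULA = {"C": 12, "H": 20, "O": 10}


def molar_mass(formula):
    """Return the molar mass in g/mol for a dict of element counts."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def percent_wv_to_millimolar(percent_wv, mass_g_per_mol):
    """Convert % (w/v), i.e. grams per 100 ml, to mmol/liter."""
    grams_per_liter = percent_wv * 10.0
    return grams_per_liter / mass_g_per_mol * 1000.0


def millimolar_to_percent_wv(millimolar, mass_g_per_mol):
    """Convert mmol/liter back to % (w/v)."""
    grams_per_liter = millimolar / 1000.0 * mass_g_per_mol
    return grams_per_liter / 10.0


mw = molar_mass(NEOAGAROBIOSE_FORMULA)
print(f"molar mass: {mw:.2f} g/mol")
print(f"1.54 mM -> {millimolar_to_percent_wv(1.54, mw):.3f} % (w/v)")
print(f"0.5 %   -> {percent_wv_to_millimolar(0.5, mw):.2f} mM")
```

The same two helpers apply to any of the oligosaccharide standards once their formula is supplied.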
Bioinformatics-Homologues of BpGH117 were extracted from GenBank through Phylogeny.fr (31). The calculation of conservation scores was carried out with ConSurf (32) using 33 amino acid sequences, and the conservation was shown on the BpGH117 structure. The phylogenetic tree was constructed with MEGA5 (33) using the same data set that was used for the ConSurf calculation. The phylogenetic tree was calculated with the maximum likelihood method. A model for a member of Clade C was created using the Phyre server (34) with BpGH117 and the sequence of a homologue from Clade C, Zg3615 from Z. galactanivorans, which shares 40% sequence identity with BpGH117.
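The homologue selection and the sequence identity figures above can be illustrated with a small sketch. It is not the Phylogeny.fr or ConSurf procedure: it only shows one common way to compute percent identity from a pairwise alignment (identical positions divided by ungapped columns; other denominators are also in use) and how hits might be filtered with the cutoffs reported under "Results" (over 35% identity and over 75% query coverage). The `Hit` record, the function names, and every sequence and hit value are hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One database hit; all values used below are hypothetical."""
    name: str
    identity_pct: float
    coverage_pct: float


def percent_identity(aligned_a, aligned_b):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


def select_homologues(hits, min_identity=35.0, min_coverage=75.0):
    """Keep hits that exceed both the identity and the query coverage cutoffs."""
    return [
        hit for hit in hits
        if hit.identity_pct > min_identity and hit.coverage_pct > min_coverage
    ]


hits = [
    Hit("hypothetical_homologue_1", 62.0, 98.0),
    Hit("hypothetical_homologue_2", 40.0, 90.0),
    Hit("hypothetical_low_identity", 28.0, 95.0),
    Hit("hypothetical_low_coverage", 55.0, 60.0),
]
kept = select_homologues(hits)
print([hit.name for hit in kept])
print(round(percent_identity("MKT-AYIAK", "MRTLAY-AR"), 1))
```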

RESULTS
BpGH117 Is α-1,3-3,6-Anhydro-L-galactosidase-BpGH117 was overproduced by heterologous expression in E. coli and purified to homogeneity with three chromatographic steps. The activity of this recombinant protein on neoagaro-oligosaccharides was examined using thin layer chromatography to visualize the reaction products. Two products, one of which migrated faster than the D-galactose standard (G residue), appeared from degradation of neoagarobiose (LA-G~), neoagarotetraose (LA-G-LA-G~), and neoagarohexaose (LA-G-LA-G-LA-G~) (Fig. 1B). This new high mobility product showed the typical color (not shown) and ratio-to-front (Rf) of 3,6-anhydro-L-galactose (LA residue) as observed with the resorcinol staining procedure (30). The lower running bands could therefore be interpreted as G~, G-LA-G~, and G-LA-G-LA-G~ products. This product pattern is identical to the reaction profiles observed for ZgAhgA (14) and the S. degradans enzyme described by Suzuki et al. (35), indicating that BpGH117 possesses the same activity as these other two GH117 enzymes and therefore cleaves the α-1,3-glycosidic bond to release 3,6-anhydro-L-galactose from the non-reducing end of neoagaro-oligosaccharides.
General Structural Features of BpGH117-The x-ray crystal structure of BpGH117, determined to 1.7-Å resolution, revealed a dimer in the asymmetric unit (Fig. 2, A and B). Residues 33-401 for one molecule (Chain A) and residues 30-402 for the second molecule in the dimer (Chain B) could be continuously modeled into the electron density. Each monomer folds into a globular 5-fold β-propeller comprising five β-sheets arranged radially around a central axis (Fig. 2, A and B). The β-sheets contain three to four antiparallel β-strands. BpGH117 forms a dimer with a total molecular interface of 8700 Å2 of buried surface area, ~7500 Å2 of which is contributed through interactions of N- and C-terminal extensions (Fig. 2, A and B). The structures of ZgAhgA and BpGH117, superposed as their dimers, display a root mean square deviation of 1.0 Å (36) over 708 aligned residues. This is consistent with their high amino acid sequence identity (>60%) and also reveals both the high structural identity of the monomers and the organization of the dimer. Similar to ZgAhgA, the N-terminal 28 residues of BpGH117 (residues 33-59) comprise a helix-turn-helix domain that reaches over to interact with the other monomer in the dimer. Thus, the main function of the helix-turn-helix domain in both enzymes is suggested to be in stabilizing the dimer by formation of extensive interactions between the two monomers. However, a notable difference between BpGH117 and ZgAhgA is that a 12-residue segment at the C terminus of BpGH117 (residues 390-401) was found to be ordered and formed a loop structure that, like the N-terminal helix-turn-helix, interacts with the surface of the adjacent monomer in the dimer, thus increasing the contact surface between the monomers and likely stabilizing its quaternary structure. A similar set of interactions was observed for the dimer of the GH117 enzyme SdGH117 from S. degradans (37). The five blades of the BpGH117 β-propeller fold around a deep central cavity.
In structurally related glycoside hydrolases from Clans F (GH43 and GH62) and J (GH32 and GH68), a similarly positioned pocket harbors the active site (16,38,39). In BpGH117 and ZgAhgA, this central cavity houses a water/metal cluster with the metal being completely coordinated by water molecules in its inner coordination sphere (Fig. 2C) (supplemental Fig. 1). On the basis of the bond lengths (2.05-2.19 Å), B-factors, and coordination geometry, the metal atom bound to the BpGH117 monomers was modeled as a Mg2+ ion. This choice of metal ion was further supported by the absence of any detectable anomalous difference peaks for the atom in data sets collected with x-ray wavelengths of 0.95-0.98 or 1.5419 Å (unreported data sets were ultimately superseded by higher resolution synchrotron data). The metal binding site is absolutely conserved. In the case of ZgAhgA, however, the ion was modeled as Zn2+; a fluorescence scan of BpGH117 crystals at the zinc edge produced no signal, supporting the identity of the BpGH117 bound ion as something other than Zn2+, namely Mg2+. Likewise, this metal site is relatively well conserved with the structurally related GH43 enzymes like the α-L-arabinase Arb43A from Cellvibrio japonicus of GH43 (Protein Data Bank code 1gyd) in which it was modeled as a chloride ion (16). This and the appearance of Zn2+ in ZgAhgA imply a degree of plasticity in the use of ions by these structurally similar and functionally related enzymes.
(Fig. 1B, caption fragment) Lanes 1, 2, and 3 show the untreated standards, whereas lanes 4, 5, and 6 show the products resulting from treatment of the di-, tetra-, and hexasaccharide standards with BpGH117.
APRIL 20, 2012 • VOLUME 287 • NUMBER 17 JOURNAL OF BIOLOGICAL CHEMISTRY 13989

Analysis of α-Agarase from B. plebeius
The coordination sphere of water molecules surrounding the Mg2+ adopts a near-perfect octahedral bipyramidal geometry, all of which is accommodated in a cavity of relatively large volume within the protein. This cavity is in part formed by the unique structural features of proline residues. Three central β-strands forming the wall of the metal cavity are interrupted by prolines (Pro91, Pro166, and Pro246), one located in each of these strands (supplemental Fig. 1), and these residues are highly conserved in GH117. These prolines lead to a bulge in the three β-strand structures, creating a larger space in the center of the β-propeller where the metal/water cluster is located. The cluster is locked in this cavity by the carboxyl groups of Glu167 from one side and by Asp245 and Asp90 on the active site face and by a number of main chain carbonyl oxygen atoms. In addition to hosting the bound metal, the central cavity also connects the putative active site with the distal side of the protein through a solvent channel running along the rotational axis of the β-propeller leading to a connected water network through the entire protein (supplemental Fig. 1), resembling ZgAhgA in this regard. Although it is clear that this metal ion is important to the catalytic activity of the enzymes in this family, its role in catalysis remains unknown.
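How closely a six-coordinate site approaches ideal octahedral geometry can be expressed as a single number. The sketch below is one hedged way to do so and is not the analysis used for the structure: it computes the 15 ligand-metal-ligand angles and reports their root mean square deviation from the ideal set of twelve 90° and three 180° angles. The function names and all coordinates are hypothetical; the 2.1 Å distances merely fall within the 2.05-2.19 Å range quoted for the BpGH117 site.

```python
import math
from itertools import combinations


def angle_deg(center, p, q):
    """Angle p-center-q in degrees."""
    v = [p[i] - center[i] for i in range(3)]
    w = [q[i] - center[i] for i in range(3)]
    dot = sum(a * b for a, b in zip(v, w))
    norm = math.sqrt(sum(a * a for a in v)) * math.sqrt(sum(b * b for b in w))
    cosine = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cosine))


def octahedral_deviation(metal, ligands):
    """RMS deviation (degrees) of the 15 ligand-metal-ligand angles from ideal."""
    if len(ligands) != 6:
        raise ValueError("an octahedral site needs exactly six ligands")
    angles = sorted(angle_deg(metal, p, q) for p, q in combinations(ligands, 2))
    ideal = [90.0] * 12 + [180.0] * 3  # sorted ideal octahedral angles
    squared = sum((a - b) ** 2 for a, b in zip(angles, ideal))
    return math.sqrt(squared / len(ideal))


# Hypothetical coordinates (Å): a metal at the origin with six waters at 2.1 Å.
metal = (0.0, 0.0, 0.0)
ideal_waters = [
    (2.1, 0.0, 0.0), (-2.1, 0.0, 0.0),
    (0.0, 2.1, 0.0), (0.0, -2.1, 0.0),
    (0.0, 0.0, 2.1), (0.0, 0.0, -2.1),
]
# Same site with one water displaced, to show a non-zero deviation.
distorted_waters = ideal_waters[:5] + [(0.5, 0.0, -2.0)]

print(octahedral_deviation(metal, ideal_waters))
print(octahedral_deviation(metal, distorted_waters))
```

Sorting both angle lists before comparison avoids having to decide which ligands are mutually trans, at the cost of being insensitive to severe distortions that reorder the angles.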
The active site of BpGH117 is a pronounced pocket, which is typical for exo-acting glycoside hydrolases. To determine the catalytic residues and infer the catalytic mechanism, we structurally aligned BpGH117 with the catalytic domain of the β-xylosidase XynB3 from Geobacillus stearothermophilus (39) and with the α-L-arabinase Arb43A from C. japonicus; the catalytic residues for both of these enzymes have been described previously (16) (Fig. 2C). For these inverting enzymes, the general base has been predicted to be Asp38 in Arb43A and Asp15 in XynB3. In BpGH117, the residue Asp90 aligns well with these two residues. The general acid has been assigned to Glu221 in Arb43A and to Glu187 in XynB3, whereas a histidine, His302, is located at this structural position in BpGH117. We found that His302 has a strained geometry and is an outlier in the Ramachandran plot, which suggested a possible catalytic function. However, because a histidine is an unusual candidate for a catalytic residue in glycoside hydrolases, the adjacent residue Glu303, which is not conserved in GH43, has instead been proposed as a catalytic residue in GH117 (14). Thus, there remains ambiguity as to the identity of the catalytic residues in GH117 and the catalytic mechanism that this family uses. To provide insight into this, we systematically mutated residues in the active site.
Mutation of Active Site Residues-The predicted catalytic residues Glu303, Asp90, and Asp245 were mutated to Gln, Asn, and Asn, respectively. Additionally, Glu167, which is located behind the metal/water cluster, was mutated to Gln, and His302, which is located in place of the general acid in GH43, was mutated to both Glu and Gln. All of these mutants crystallized under the same conditions as the wild-type enzyme and eluted without apparent signs of aggregation during the gel filtration purification (not shown), indicating the fidelity of their folds, but they all lacked activity under conditions where the wild-type enzyme hydrolyzed >95% of the neoagarobiose substrate (Fig. 2D). Although this did not resolve the ambiguity in the identity of the catalytic residues, it did indicate that all of the targeted residues are important to catalysis. This analysis also provided a tool to probe the molecular basis of the enzyme-substrate interaction and the more specific roles of these enzymes by obtaining the inactive E303Q mutant, which could be crystallized with neoagarobiose.
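The comparison of wild-type and mutant activity rests on how much neoagarobiose remains over the time course. A minimal sketch of that bookkeeping follows. It assumes the substrate peak area from the chromatogram is proportional to substrate concentration, which is an assumption rather than something stated in the text, and the peak areas, sampling times, and function names are all hypothetical.

```python
def percent_hydrolyzed(substrate_area_t0, substrate_area_t):
    """Percent of substrate consumed, from substrate peak areas at time 0 and time t."""
    if substrate_area_t0 <= 0:
        raise ValueError("initial substrate peak area must be positive")
    remaining_fraction = substrate_area_t / substrate_area_t0
    return 100.0 * (1.0 - remaining_fraction)


def time_course(areas_by_minute):
    """Map each sampling time to percent hydrolyzed relative to the earliest sample."""
    start = areas_by_minute[min(areas_by_minute)]
    return {
        minute: percent_hydrolyzed(start, area)
        for minute, area in sorted(areas_by_minute.items())
    }


# Hypothetical substrate peak areas (arbitrary units) sampled every 5 min.
wild_type = {0: 100.0, 5: 60.0, 10: 30.0, 15: 12.0, 20: 4.0}
mutant = {0: 100.0, 5: 100.0, 10: 99.5, 15: 99.0, 20: 99.0}

for label, series in (("wild type", wild_type), ("mutant", mutant)):
    final = time_course(series)[20]
    print(f"{label}: {final:.1f}% hydrolyzed after 20 min")
```

With these invented numbers the wild-type series ends above 95% hydrolysis while the mutant series stays near 1%, mirroring the qualitative pattern described above.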
Structure of BpGH117 in Complex with Neoagarobiose-BpGH117E303Q was co-crystallized with an excess of neoagarobiose under the same conditions and in the same space group as the native protein. Clear electron density for fully occupied neoagarobiose molecules was found in both active sites of the dimer, allowing unambiguous modeling of the carbohydrate (Fig. 3A). The disaccharide was found with an intact glycosidic linkage between the LA residue in the −1 subsite and the G residue in the +1 subsite (40). An extended network of hydrogen bonds and hydrophobic interactions is made between the neoagarobiose molecule and the enzyme active site with particular interactions in each of the subsites providing specificity for the sugar subunits (Fig. 3B) (41).
The β-face of the G residue in the +1 subsite sits on a hydrophobic platform formed by Trp127 and Phe125. The side chains of Arg273 and Glu262 approach the α-face of the G residue and make hydrogen bonds with the C4-OH. Remarkably, the C terminus of the adjacent monomer in the dimer, in addition to contributing to the dimer interface, plays a role in contouring the +1 subsite (Fig. 3B). The C terminus from the adjacent monomer folds over the active site, narrowing the opening and introducing His392 into the cleft. His392 forms hydrogen bonds with both the endocyclic oxygen and the C6-OH of the G residue in the +1 subsite. Although the C-terminal region was disordered in ZgAhgA, His392 is conserved in GH117 enzymes from marine Bacteroidetes, suggesting that His392 may have a conserved functional role in substrate recognition. The C1-OH of the G residue occupying the +1 subsite is solvent-exposed, consistent with the ability of BpGH117 to remove LA from the non-reducing end of longer agarose oligosaccharides. Notably, in this complex, His302 was found to be suitably located so that its Nε2 is positioned at a distance of 2.6 Å from the glycosidic oxygen and appropriately oriented to make a hydrogen bond (Fig. 3A). Furthermore, Asp320 interacts closely (2.6 Å) with the Nδ1 of His302, suggesting these two residues may act in tandem to effect general acid catalysis and relay a proton from Asp320 to the leaving group (Fig. 3B) (17).
The LA residue is fully enclosed in the −1 subsite. Hydrogen bonds are made between C4-OH and Gln180, the 3,6-anhydro bridge oxygen to Thr165, and C2-OH to both Lys260 and Asp245 (Fig. 3B). In addition, the C4-OH makes a water-mediated interaction with a Cl− ion, which is structurally conserved in ZgAhgA (14). Trp127 contributes to forming one wall of this subsite; however, this residue does not interact in the classical way of protein-sugar recognition where the aromatic residue forms a platform for the face of a sugar in a sandwich-like manner. Here Trp127 is oriented toward the side of the LA residue with the endocyclic ring oxygen (Fig. 3B).

GH117 Is Keystone Enzyme in Agarocolloid Degradation-Agarolytic bacteria with sequenced genomes contain genes encoding varied numbers and combinations of GH86, GH16, and GH50 enzymes, all of which act on the β-1,4-glycosidic bond of agarose. For example, Z. galactanivorans utilizes mainly GH16 β-agarases (42) to degrade agar, whereas the estuarine bacterium S. degradans utilizes a mixture of GH86, GH50, and GH16 (43). All of these organisms also contain at least one gene encoding a conserved GH117 (supplemental Table 1), raising the question whether GH117 enzymes are the predominant enzymes to break the α-glycosidic bond in agarose or whether there are alternative endo-acting α-agarases that may play a significant role in agar degradation. The only other known examples with such an activity are α-agarases of family GH96; however, none of the currently available microbial genomes that are annotated in CAZy contain such an enzyme (supplemental Table 1). Compared with families GH16, GH50, GH86, GH117, and GH118, enzymes in family GH96 are extremely rare with only four sequences in CAZy (8). Both endo-acting α- and β-glycoside hydrolases are identified by agar lysis around agarolytic bacteria, which has been the classical method for functional screening and agarase discovery (44,45), identifying both α- and β-activities. Thus, screening bias does not account for the discrepancy in total numbers of highly abundant endo-β-agarases versus extremely rare endo-α-agarases, strongly suggesting that the latter are actually rare in nature. Because all agarolytic bacteria appear to contain at least one conserved GH117 α-agarase, it seems at present that this family is the primary evolutionary solution to breaking the α-glycosidic bond in agarose. This suggests that family GH117 proteins are keystone enzymes in agarose metabolism.
Relating Sequence to Specificity in GH117-Using Blast Explorer (Phylogeny.fr), we extracted 35 sequences with over 35% identity and over 75% query coverage to BpGH117. Examination of these family GH117 sequences, in light of the complex structure, reveals near invariance for the catalytic and sugar-binding residues His302, Glu303, Asp90, Trp127 (except Cyclobacter marinum), Asp245, Gln180 (except C. marinum), Thr165, and His244 (except Lentisphaera araneosa and C. marinum). When the conservation of residues in this data set was calculated with ConSurf, a small degree of variation was found for residues around the G residue in subsite +1 (Fig. 4A). The residues in subsite −1 are invariant, showing that recognition of the LA residue is conserved among these GH117 enzymes. Thus, with the exception of C. marinum and L. araneosa, we can confidently predict these proteins to be active on agarose oligosaccharides for cleavage of the α-1,3-glycosidic bond. The phylogenetic tree initially shown by Rebuffet et al. (14) and an updated version with additional sequences presented here show that three separate clades (A, B, and C) are formed within GH117. Apart from the highly conserved subsites +1 and −1, the highest sequence variation in the active site of GH117 is found in Clade C. We found that the structural variation in Clade C is concentrated in proximity to the C4-OH of the LA residue in the −1 subsite (Fig. 4A). These residues are bulky aromatic amino acids in Clade A (i.e. Phe115, Phe125, Phe162, and Tyr186 in BpGH117) but are substituted with smaller amino acids, often serines and alanines, in Clade C. This leaves considerably more space at the −1 subsite of Clade C enzymes, and we therefore hypothesize that sequences belonging to Clade C may have additional subsites, possibly allowing accommodation of longer substrates. Using the BpGH117 structure as a template, we created a model (34) of a Clade C homologue, Zg3615 from Z. galactanivorans, which shares 40% sequence identity with BpGH117. The model showed structurally aligned residues for neoagarobiose recognition and an extended space, possibly for an additional −2 subsite. This may allow binding of a G residue in addition to the LA residue bound in subsite −1. A system containing additional GH117 enzymes with an extra minus (−) subsite would be advantageous for agarolytic bacteria because characterized enzymes of Clade A are inactive on their own reaction products, i.e. G-LA-G~, because the new non-reducing end (a G residue) blocks re-entry into the active site. These products, however, may be substrates for a GH117 enzyme of Clade C and be further degraded by it, allowing synergistic degradation of agarose oligosaccharides. This may explain the occurrence of multiple GH117 enzymes in Z. galactanivorans and other agarolytic microbes. In conclusion, the unique features encountered in this class of enzymes reflect the unusual glycochemical character of its substrate. Once microbes evolved an efficient solution, the high constraints for the distortion of the 3,6-anhydro-L-galactose may have acted as a selection pressure, explaining the conservation observed in the active site of GH117 enzymes.
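The sequence selection step described above (hits with over 35% identity and over 75% query coverage to the query) can be illustrated with a minimal sketch. The `Hit` record, the function name, and the example values are hypothetical and chosen only for illustration; they are not the output format of Blast Explorer and not data from this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One homology search hit (hypothetical record layout)."""
    name: str
    percent_identity: float  # % identical residues in the alignment
    query_coverage: float    # % of the query covered by the alignment


def select_homologues(hits, min_identity=35.0, min_coverage=75.0):
    """Keep hits strictly above both thresholds ("over 35%" and "over 75%")."""
    return [
        h for h in hits
        if h.percent_identity > min_identity and h.query_coverage > min_coverage
    ]


# Made-up hits for illustration only; these are not values from the study.
example_hits = [
    Hit("hypothetical_A", 62.0, 98.0),
    Hit("hypothetical_B", 40.0, 80.0),
    Hit("hypothetical_C", 34.9, 95.0),  # fails the identity threshold
    Hit("hypothetical_D", 55.0, 60.0),  # fails the coverage threshold
]

selected = select_homologues(example_hits)
print([h.name for h in selected])
```

Both thresholds are applied as strict inequalities to match the wording "over"; a hit sitting exactly at 35% identity would be excluded under this assumption.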
FIGURE 4. Residues for agarose recognition are highly conserved throughout sequences belonging to glycoside hydrolase family 117. A, the conservation of active site residues (shown as sticks) in GH117 as calculated with ConSurf (see "Experimental Procedures"). The residues that interact with neoagarobiose in BpGH117 are largely conserved in GH117. A cluster of bulky hydrophobic residues that is located at the non-reducing end of the LA residue is less conserved in GH117; these are labeled. B, the phylogenetic tree of GH117 as proposed by Rebuffet et al. (14) and updated here with additional sequences. BpGH117 is indicated by a black circle, and GH117 from Z. galactanivorans is indicated by white circles. The clustering into clades is indicated. (A more detailed phylogenetic tree showing the identity of GH117 members is given in supplemental Fig. 2.) C, the BpGH117 active site is shown with the neoagarobiose complex and the aromatic residues that close the minus (−) subsite at the non-reducing end of the LA residue. BpGH117 belongs to Clade A of the phylogenetic tree, the sequences of which all share the cluster of bulky aromatic side chains at the non-reducing end of the LA residue and thus contain one minus (−) subsite. D, surface representation of a homology model of Zg3615 with neoagarobiose modeled into the active site.
Substrate Distortion and Catalytic Mechanism in GH117-Quantum mechanical calculations predict that the 3,6-anhydro bond constrains the 3,6-anhydropyranose ring in either a ¹C₄ chair or B₁,₄ boat conformation with an energy barrier of 10 kcal mol⁻¹ to other conformations (46). Consistent with this latter point, only the ¹C₄ chair has been experimentally reported in the x-ray crystal structures of enzymes and carbohydrate binding modules in complex with neoagaro-oligosaccharides (42,47). Because the LA residue preferentially adopts a ¹C₄ chair rather than a ⁴C₁ chair, the α-1,3-glycosidic bond in agarose mimics the conformation of a β-glycosidic bond. The enzyme-catalyzed hydrolysis of β-glycosidic bonds is facilitated by distortion of the substrate to position the leaving group in a pseudoaxial orientation, thereby placing the lone pair on the endocyclic oxygen antiperiplanar to the scissile bond (48). This distortion relieves unfavorable 1,3-diaxial interactions involving the incoming nucleophile and permits in-line attack of the nucleophile at the anomeric center. The conformational change in the substrate also brings the glycosidic oxygen in position for efficient general acid catalysis by His302. Thus, one would predict that distortion of the terminal LA residue occurs during hydrolysis of the α-1,3-glycosidic bond by GH117 enzymes.
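The antiperiplanar relationship invoked above is a statement about a torsion angle close to 180°. As a minimal sketch, the helper below computes the dihedral angle defined by four Cartesian points and tests it against the ±150° to 180° window commonly used to define antiperiplanar. The function names and the toy coordinates are assumptions for illustration; they are not taken from the BpGH117 structure, and a lone pair would have to be represented by a pseudo-atom position to use such a helper on real coordinates.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_deg(p0, p1, p2, p3):
    """Torsion angle (degrees, in [-180, 180]) about the p1-p2 axis."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)  # unit vector along the central bond
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def is_antiperiplanar(angle_deg, tolerance_deg=30.0):
    """True when the torsion angle lies within tolerance_deg of 180 degrees."""
    return abs(angle_deg) >= 180.0 - tolerance_deg


# Toy coordinates (not from any crystal structure): a planar trans arrangement
# and an arrangement rotated by a quarter turn about the central bond.
trans = ((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, -1.0, 0.0))
quarter = ((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 0.0, 1.0))

print(is_antiperiplanar(dihedral_deg(*trans)))
print(is_antiperiplanar(dihedral_deg(*quarter)))
```

The sign of the returned angle depends on the chosen convention, so the antiperiplanar test uses only its absolute value.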
Indeed, the Michaelis complex formed from BpGH117 and neoagarobiose reveals that the LA residue in the −1 subsite adopts a B₁,₄ conformation (Fig. 5A), suggesting a reaction coordinate involving distortion of the free substrate from a ¹C₄ conformation to a less favorable B₁,₄. The resulting substrate conformation enables efficient engagement of the catalytic machinery.
Based on partial conservation of active site residues with GH43 glycoside hydrolases, the GH117 enzymes were postulated to use an inverting mechanism. Glycoside hydrolases that use an inverting catalytic mechanism employ general acid and base catalysis from two residues, typically either aspartate or glutamate. The general base enhances the nucleophilicity of a suitably positioned water molecule, enabling it to attack the anomeric carbon. In BpGH117, the pseudoaxial orientation of the leaving group arising from substrate distortion enables this in-line attack of a water molecule, positioned only 3.14 Å from C1 of the LA residue in the −1 subsite, on the σ* orbital at the anomeric carbon (9, 49, 50) (Fig. 3A). This water interacts with the side chain carboxylates of both Asp90 and Glu303. Although Asp90 is well positioned for an in-line activation of the catalytic water molecule, the Glu303 residue may modulate the electrostatic environment and tailor the pKa of the general base Asp90. This is supported by mutation of either residue to the corresponding amide, each of which inactivated the enzyme. Of these two residues, only Asp90 is structurally conserved with the general base in the GH43 enzymes Arb43A and XynB3, generating further support for this residue as the more likely candidate for the general base catalytic residue.
The distorted conformation of the substrate (Figs. 3A and 5A) positions the glycosidic oxygen such that it could accept a proton from His302, which, based on its distance from and orientation to the glycosidic oxygen, is the only candidate for a general acid. His302 is structurally conserved with the position of Glu187 in the GH43 XynB3 (Protein Data Bank code 2exh) and with Glu221 in Arb43A (Protein Data Bank code 1gyd) (Fig. 2B). In support of the catalytic role of His302, substitutions of this amino acid with glutamate or glutamine abolished the activity of the enzyme. Although glutamine is unable to act as a general acid, glutamate is a common general acid found within glycoside hydrolases. Given the structural conservation with GH43 and the superposition of His302 with the general acid of GH43, we expected that glutamate could have conferred activity to the H302E variant. To analyze why H302E lacked activity, we crystallized the protein together with the substrate neoagarobiose. The structure of H302E lacks a bound neoagarobiose molecule, but nevertheless, its comparison with the Michaelis complex reveals that the side chain of Glu302 is too distant from the glycosidic oxygen to serve as a general acid (3.48 Å) (Fig. 5B). Furthermore, whereas His302 interacts with Asp320 (2.6 Å) in the native enzyme (Fig. 3B), the two most likely acting in tandem as a proton relay system, the Glu302 of the H302E mutant makes no interactions with Asp320, resulting in greater mobility of the Glu302 side chain as judged by its elevated B-factors.
The structural features of the active site revealed by the complex of BpGH117 with neoagarobiose are therefore consistent with the assignment of an inverting catalytic mechanism. This mechanism displays the common feature of a carboxylate side chain, in this case Asp90, acting as a general base to activate the catalytic water (Fig. 6). An unusual feature of this mechanism, however, is the utilization of a histidine, His302, as the general acid (Fig. 6). Indeed, the Asp320-His302 dyad of BpGH117 is similar to the Asp232-His324 catalytic dyad observed for family 3 glycoside hydrolase BsNagZ from B. subtilis (17). A notable difference between the mechanisms of these two enzymes, however, is that BsNagZ uses a retaining mechanism, and therefore the histidine residue acts as both acid and base, whereas the histidine in BpGH117 acts only as an acid in the proposed inverting mechanism of catalysis for this enzyme.
FIGURE 5. A, the neoagarobiose molecule shows distortion of the 3,6-anhydro-L-galactose into the B₁,₄ conformation. B, the catalytic residues in the mutant H302E are shown in green and are superposed with the E303Q mutant in complex with neoagarobiose (faint gray). The H302E mutant is inactive likely due to the longer distance between its carboxyl group and the glycosidic bond (3.6 Å) compared with the native His302 residue (2.7 Å). WAT, water.
GH117 enzymes are the only known enzymes that release monosaccharides from agar, the major cell wall polysaccharide of agarophytes. Thus, the structural and mechanistic data presented add to the increasing understanding of how bacteria degrade seaweed polysaccharides, insight that is much needed for a refined understanding of the ocean carbon cycle. Furthermore, the unusual use of a histidine as a catalytic residue in this family of enzymes highlights the alternative mechanistic strategies that glycoside hydrolases can use to catalyze the hydrolysis of glycosidic bonds.