Role of Glycoside Phosphorylases in Mannose Foraging by Human Gut Bacteria*

Background: The relations between the gut microbiota, food, and host play a crucial role in human health. Results: Prevalent bacterial glycoside phosphorylases are able to break down dietary carbohydrates and the N-glycans lining the intestinal epithelium. Conclusion: GH130 enzymes are new targets to study interactions between host and gut microbes. Significance: Glycoside phosphorylases are key enzymes of host glycan catabolism by gut bacteria. To metabolize both dietary fiber constituent carbohydrates and host glycans lining the intestinal epithelium, gut bacteria produce a wide range of carbohydrate-active enzymes, of which glycoside hydrolases are the main components. In this study, we describe the ability of phosphorylases to participate in the breakdown of human N-glycans, from an analysis of the substrate specificity of UhgbMP, a mannoside phosphorylase of the GH130 protein family discovered by functional metagenomics. UhgbMP is found to phosphorolyze β-d-Manp-1,4-β-d-GlcpNAc-1,4-d-GlcpNAc and is also a highly efficient enzyme to catalyze the synthesis of this precious N-glycan core oligosaccharide by reverse phosphorolysis. Analysis of sequence conservation within family GH130, mapped on a three-dimensional model of UhgbMP and supported by site-directed mutagenesis results, revealed two GH130 subfamilies and allowed the identification of key residues responsible for catalysis and substrate specificity. The analysis of the genomic context of 65 known GH130 sequences belonging to human gut bacteria indicates that the enzymes of the GH130_1 subfamily would be involved in mannan catabolism, whereas the enzymes belonging to the GH130_2 subfamily would rather work in synergy with glycoside hydrolases of the GH92 and GH18 families in the breakdown of N-glycans. 
The use of GH130 inhibitors as therapeutic agents or functional foods could thus be considered as an innovative strategy to inhibit N-glycan degradation, with the ultimate goal of protecting, or restoring, the epithelial barrier.

the structure and/or quantity of host glycans due to biosynthetic defects or microbial degradation alter their barrier function and are thought to be involved in the initiation and the maintenance of mucosal inflammation in IBDs and in the development of intestinal cancer (12).
To break down these complex carbohydrates of either plant or human origin, gut bacteria produce a full repertoire of carbohydrate-active enzymes (CAZymes, listed in the CAZy database (13)), of which glycoside hydrolases (GHs) are the main constituents, as revealed by Gill et al. (14). In particular, the degradation of dietary mannans requires both endo-β-mannanases and β-mannosidases to release mannose (15). The hydrolysis of N-glycans by a broad consortium of endo- and exo-acting α- and β-glycosidases, in particular those produced by the prominent gut bacterium Bacteroides thetaiotaomicron (16-20), has also been described. Until August 2013, only GHs had been implicated in N-glycan breakdown.
However, other types of CAZymes participate in the breakdown of complex carbohydrates, particularly those of plant origin, by working in synergy with GHs as follows: carbohydrate esterases (CE), polysaccharide lyases (PLs), and glycoside phosphorylases (GPs). The presence of CEs and PLs is readily detectable in gut bacterial genomes, metagenomes, and metatranscriptomes, because they are sufficiently divergent from GHs as to be classified in their own CAZyme families. This is not the case for GPs, which are found both in the glycosyltransferase (GT) and glycoside hydrolase families, depending on the sequence and catalytic mechanism similarities shared with GT and GH archetypes, respectively. The prevalence of GPs and their role in the metabolism of carbohydrates is therefore difficult to evaluate on the basis of sequence data alone, as a given family of GHs (or of GTs) may include both GHs (or GTs) as well as GPs. GPs catalyze the breakdown of a glycosidic linkage from oligosaccharide or polysaccharide substrates with concomitant phosphate glycosylation to yield a glycosyl-phosphate product and a sugar chain of reduced length. These enzymes are also able to perform reverse phosphorolysis (in the so-called "synthetic reaction") to form a glycosidic bond between the glycosyl unit originating from the glycosyl-phosphate, which acts as the sugar donor, and a carbohydrate acceptor (23). Retaining GPs, for which phosphorolysis occurs via overall retention of the substrate anomeric configuration, are found in CAZy families GT4, GT35, and GH13. Inverting GPs are classified in families GH65, GH94, GH112, and GH130. Inverting GH-related GPs and hydrolytic enzymes use a similar single displacement mechanism, differing in the requirement of GPs for a single catalytic residue (the proton donor), and inverting GHs for two catalytic residues. 
In GP-catalyzed reactions, the reaction begins with the direct nucleophilic attack by phosphate on the glycosidic bond with the aid of the catalytic residue, which donates a proton to the glycosidic oxygen atom, and then proceeds through an oxocarbenium cation-like transition state. In GH reaction mechanisms, the nucleophilic attack on the C-1 of the glycoside is performed by a water molecule activated by the catalytic base. The natural structural and functional diversity of GPs thus appears to be highly restricted because of the following: (i) they are found in only 7 of the 226 GH and GT families listed in the CAZy database (March 2013); (ii) only approximately 15 EC entries are currently assigned to GPs (24); (iii) their specificity toward glycosyl phosphates is limited to α- and β-D-glucopyranose 1-phosphate (25-29), which are the most prevalent substrates; α-D-galactopyranose 1-phosphate (30); N-acetyl-α-D-glucosamine 1-phosphate (31); and α-D-mannopyranose 1-phosphate (32).
FIGURE 1. Schematic representation of N-glycan processing by glycoside hydrolases and phosphorylases. Green spheres, mannosyl residues; blue squares, N-acetyl-D-glucosamine residues. The genomic clusters containing the GH130_2-encoding gene and other genes involved in N-glycan processing are shown for the B. stercoris ATCC 43183 genome and the UhgbMP-encoding metagenomic clone (accession number GU942931). CAZyme-encoding genes are colored according to the linkage they break down in the N-glycan structure, based on known biochemical data of the families to which they belong.
As of July 2013, α-D-mannopyranose 1-phosphate specificity had been described for just one enzyme produced by a human gut bacterium, the Bacteroides fragilis NCTC 9343 mannosylglucose phosphorylase (BfMP), which converts β-D-mannopyranosyl-1,4-D-glucopyranose and phosphate into α-D-mannopyranose 1-phosphate and D-glucose (32). This enzyme has been implicated in the catabolism of linear dietary mannans, assisted by a β-1,4-mannanase and a mannobiose 2-epimerase. During review of this study, Nihira et al. (33) reported the discovery of a metabolic pathway for N-glycans, which includes a β-D-mannosyl-N-acetyl-1,4-D-glucosamine phosphorylase named BT1033, produced by the human gut inhabitant B. thetaiotaomicron VPI-5482. BfMP and BT1033 are two of the four enzymes characterized so far in the recently created GH130 family, which includes a total of 447 entries from archaea, bacteria, and eukaryotes. The other characterized GH130 enzymes are RaMP1 and RaMP2 from the ruminal bacterium Ruminococcus albus NE1 (34). These enzymes have also been proposed to participate in mannan catabolism in the bovine rumen, assisted by an endomannanase and an epimerase, via RaMP2- and RaMP1-catalyzed phosphorolysis of β-1,4-manno-oligosaccharides and 4-O-β-D-mannopyranosyl-D-glucopyranose, respectively (34). X-ray crystallographic studies show that GH130 enzymes share a five-bladed β-propeller fold. Currently, atomic coordinate data sets for four protein structures are available in the RCSB Protein Data Bank (35) as follows: BACOVA_03624 protein from Bacteroides ovatus ATCC 8483 (3QC2); BDI_3141 protein from Parabacteroides distasonis ATCC 8503 (3TAW); BT_4094 protein from B. thetaiotaomicron VPI-5482 (3R67); and TM1225 protein from Thermotoga maritima MSB8 (1VKD). However, no function has yet been attributed to these four proteins, thus limiting the understanding of structure-specificity relations for GH130 enzymes and the investigation of their catalytic mechanism.
Recently, the sequence of another GH130 enzyme, which we refer to as UhgbMP (unknown human gut bacterium mannoside phosphorylase, GenBank™ accession number ADD61463.1), was discovered by functional metagenomics of the human gut microbiota (36). The 36.6-kbp metagenomic DNA fragment containing the UhgbMP-encoding gene was taxonomically assigned to an as yet unidentified bacterium belonging to the genus Bacteroides. However, UhgbMP itself presents 99% protein sequence identity with the hypothetical protein BACSTE_03540 from Bacteroides stercoris ATCC 43183 (accession number EDS13361.1).
Here, we present an integrative approach, based on analyses of UhgbMP substrate specificity and of metagenomic and genomic data at the level of the entire human gut ecosystem, to reveal the role of UhgbMP and of 64 other GH130 enzymes produced by known gut bacteria in the breakdown of host and dietary mannose-containing glycans. In addition, we establish the molecular basis of GH130 enzyme catalysis, supported by the experimental results of rational engineering of UhgbMP and the analysis of its three-dimensional molecular model. Finally, we discuss the potential of this enzyme for the development of therapeutic agents and functional foods to inhibit N-glycan degradation, with the ultimate goal of protecting the epithelial barrier in the IBD context.
To allow heterologous UhgbMP production in E. coli with a His6 tag at the N terminus, the PCR product was purified and subsequently cloned into the pCR8/GW/TOPO entry vector (Invitrogen) and then into the pDEST17 destination vector (Invitrogen), according to the manufacturer's recommendations. E. coli BL21-AI cells (Invitrogen) harboring the UhgbMP-encoding plasmid were cultured at 20 °C for 24 h in ZYM-5052 autoinduction medium (37) supplemented with 100 μg/ml ampicillin, inoculated at an A600 nm of 0.1. Cells were harvested, resuspended in 20 mM Tris-HCl, pH 7.0, 300 mM NaCl, and lysed by sonication. Soluble lysate was applied to a TALON resin loaded with cobalt (GE Healthcare) equilibrated in 20 mM Tris-HCl, pH 7.0, 300 mM NaCl. After column washing with 8 volumes of the same buffer supplemented with 10 mM imidazole, the protein was eluted in 20 mM Tris-HCl, pH 7.0, 300 mM NaCl, 150 mM imidazole. Finally, the protein sample was desalted on a PD-10 column (GE Healthcare) and eluted in 20 mM Tris-HCl, pH 7.0, 0.1% Tween 80 (v/v). In these conditions, 84% of UhgbMP remained soluble after 8 days at 4 °C, thus allowing further functional characterization. The purity of the wild-type UhgbMP and mutant preparations was estimated to be higher than 95% by SDS-PAGE using Any kD™ Mini-PROTEAN TGX™ Precast Gel (Bio-Rad) (supplemental Fig. 1). After migration, proteins were stained with the PageBlue Protein Staining Solution (Thermo Scientific) according to the manufacturer's recommendations. Protein concentrations were determined by spectrometry using a NanoDrop ND-1000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA). The NanoDrop measurement error was 5%. The calculated extinction coefficient of the purified UhgbMP fused to an N-terminal His6 tag was 76,630 M⁻¹·cm⁻¹.
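The conversion from an absorbance reading to a protein concentration follows the Beer-Lambert law, A = ε·c·l. The sketch below is only an illustration of that arithmetic: it assumes an absorbance at 280 nm already normalized to a 1-cm path length, uses the extinction coefficient quoted above, and applies the stated 5% measurement error as a simple relative bound. The function names and the example reading are hypothetical.

```python
def molar_concentration(a280: float, epsilon: float = 76_630.0,
                        path_cm: float = 1.0) -> float:
    """Protein concentration in mol/L from A280 (Beer-Lambert: A = epsilon * c * l)."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return a280 / (epsilon * path_cm)


def with_relative_error(value: float, rel_error: float = 0.05):
    """Lower and upper bounds of a value for a given relative measurement error."""
    return value * (1 - rel_error), value * (1 + rel_error)


# Hypothetical absorbance reading, not a value from the study.
conc = molar_concentration(0.7663)
low, high = with_relative_error(conc)
print(f"{conc * 1e6:.2f} uM (range {low * 1e6:.2f} to {high * 1e6:.2f} uM)")
```

Converting to mg/ml would additionally require the molecular mass of the tagged protein, which is not given in this section.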
The percentage of reverse phosphorolysis activity of the D104N, D304N, and E273Q variants, compared with that of the wild-type enzyme, was determined with 0.1 mg/ml of purified proteins by quantifying the α-D-mannopyranose 1-phosphate consumption rate from 10 mM α-D-mannopyranose 1-phosphate as glycosyl donor and 10 mM D-mannose as acceptor. α-D-Mannopyranose 1-phosphate was quantified by using high performance anion exchange chromatography with pulsed amperometric detection (HPAEC-PAD). Carbohydrates and α-D-mannopyranose 1-phosphate were separated on a 4 × 250 mm Dionex CarboPac PA100 column. A gradient of sodium acetate (from 0 to 150 mM in 15 min) and an isocratic step of 300 mM sodium acetate in 150 mM NaOH were applied at a 1 ml·min⁻¹ flow rate. Detection was performed using a Dionex ED40 module with a gold working electrode and an Ag/AgCl pH reference. Finally, the hydrolytic or phosphorolytic behavior of the wild-type UhgbMP and its Y103E variant was assessed by using 1 mM pNP-β-D-mannopyranose, in the absence or presence, respectively, of 10 mM inorganic phosphate. The pNP release was monitored at A405 nm on a Cary-100 UV-visible spectrophotometer (Agilent Technologies). Between three and five independent experiments were carried out to determine initial activity, kinetic constants, and percentage of inhibition by carbohydrates and polyols of wild-type UhgbMP and its mutants. For all reaction rate measurements, it was checked by HPAEC-PAD that less than 10% of substrate was consumed and that the amount of consumed or released α-D-mannopyranose 1-phosphate increased linearly with time.
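The two checks described above (less than 10% of substrate consumed, linear time course) and the expression of variant activity as a percentage of wild type can be sketched as follows. This is a minimal illustration under stated assumptions: initial rates are taken as the least-squares slope of concentration versus time, the time course shown is invented for the example, and all function names are hypothetical.

```python
def initial_rate(times_min, conc_mM):
    """Least-squares slope (mM/min) of concentration versus time."""
    n = len(times_min)
    if n < 2 or n != len(conc_mM):
        raise ValueError("need at least two paired (time, concentration) points")
    mean_t = sum(times_min) / n
    mean_c = sum(conc_mM) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (c - mean_c) for t, c in zip(times_min, conc_mM))
    return sxy / sxx


def fraction_consumed(initial_mM, final_mM):
    """Fraction of the substrate consumed between the first and last point."""
    return (initial_mM - final_mM) / initial_mM


def relative_activity(variant_rate, wild_type_rate):
    """Variant activity expressed as a percentage of the wild-type activity."""
    return 100.0 * variant_rate / wild_type_rate


# Invented time course of donor concentration (mM), for illustration only.
times = [0, 2, 4, 6]
donor = [10.0, 9.9, 9.8, 9.7]
rate = -initial_rate(times, donor)            # consumption rate, mM/min
within_limit = fraction_consumed(donor[0], donor[-1]) < 0.10
print(rate, within_limit)
```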
Three-dimensional Molecular Modeling-The UhgbMP sequence was submitted to the I-TASSER server for automated protein structure and function prediction (40). The homologous T. maritima TM1225 structure (PDB accession code 1VKD) was used to provide spatial restraints. The three-dimensional model of UhgbMP predicted by I-TASSER was then further refined by energy minimization using the CFF91 force field implementation in the DISCOVER module of the InsightII software suite (Accelrys, San Diego). The CFF91 cross-terms, a harmonic bond potential, and a dielectric constant of 1.0 were specified in the energy function. An initial minimization was performed with positional restraints on the protein backbone using a steepest descent algorithm followed by conjugate gradient minimization until the maximum RMS energy gradient was less than 0.5 kcal·mol⁻¹·Å⁻¹. The system was then fully relaxed without positional restraints. α-D-Mannopyranose 1-phosphate, β-D-mannoheptaose, and β-D-mannopyranosyl-1,4-N,N′-diacetylchitobiose were manually docked into the active site of UhgbMP. The ligand complexes were then optimized according to the minimization protocol described above with the ligand molecules free to move. Molecular graphics images were produced using PyMOL software (Schrödinger, LLC).
GH130 Multiple Sequence Alignment Analyses-The 369 public sequences of GH130 enzymes listed in the CAZy database in January, 2013, were aligned with MUSCLE version 3.7 (41). A distance matrix was generated from the multiple sequence alignment using the BLOSUM62 amino acid residue substitution matrix. The output result file was subjected to hierarchical clustering using Ward's method (39), and the resulting tree was visualized using DENDROSCOPE 3 (42). Two sequence clusters were clearly apparent, members of which were accordingly assigned to the GH130_1 or GH130_2 subfamilies.
Position-dependent amino acid residue variation in multiple sequence alignment data was analyzed using the Shannon information entropy measure (H_X), calculated using SEQUESTER software. The Shannon entropy (H_X) at residue alignment position (X), corrected for the normalized frequency of residue type occurrence, is computed as shown in Equation 1, where p_{i|X} is the conditional probability of residue type i occurrence at alignment position X, and p_i is the normalized probability of residue type i occurrence at any position. To minimize sampling bias, normalized residue type probability values were taken as those documented by Ranganathan and co-workers (43), garnered from sequence data for all natural proteins. Values of H_X lie in the range 0-1; a zero value corresponds to a fully conserved residue position, and a value of unity represents a distribution in which each residue type has an equal chance of occurrence. SEQUESTER appends Shannon entropies at aligned residue positions in a chosen reference protein structure to an atomic coordinate data file for convenient three-dimensional visual display.
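To make the 0-1 scale concrete, the sketch below computes a per-column Shannon entropy normalized with a base-20 logarithm, so that a fully conserved column scores 0 and a column with all 20 residue types equally represented scores 1. It is an assumption-laden simplification: it omits the background-frequency correction described above, ignores gaps and non-standard residues, and is not the SEQUESTER implementation. The function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_entropy(column: str) -> float:
    """Normalized Shannon entropy (log base 20) of one alignment column.

    Gaps and non-standard symbols are ignored. No background-frequency
    correction is applied (simplifying assumption).
    """
    residues = [r for r in column.upper() if r in AMINO_ACIDS]
    if not residues:
        raise ValueError("column contains no standard residues")
    total = len(residues)
    counts = Counter(residues)
    h = -sum((n / total) * math.log(n / total, 20) for n in counts.values())
    return max(0.0, h)  # avoid returning -0.0 for a fully conserved column


def alignment_entropies(sequences):
    """Entropy of every column of an alignment given as equal-length strings."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [column_entropy("".join(col)) for col in zip(*sequences)]


# Toy alignment: first column conserved, second column variable.
print(alignment_entropies(["DA", "DC", "DA", "DG"]))
```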
GH130 Genomic Context Analysis-The analysis of the habitat of the organisms displaying these 369 GH130 sequences, as referenced in the GOLD database, allowed us to identify 28 public genomes of human gut bacteria that display GH130 sequences. The genomic context of these 63 sequences, as well as of the UhgbMP and the GenBank™ ADD61810 sequences belonging to the metagenomic sequences GU942931 and GU942945, respectively, was analyzed to identify the CAZyme-encoding genes that are present in the same multigenic cluster as a GH130_1- or a GH130_2-encoding gene. CAZyme-encoding genes were searched on the same DNA strand as the GH130-encoding gene, with an increment of 10 kbp maximum upstream or downstream of the GH130 sequence. For each glycoside hydrolase (GH), polysaccharide lyase (PL), or carbohydrate esterase (CE) family, the frequency of co-occurrence in a multigenic cluster with a GH130_1 or a GH130_2 sequence was calculated, weighted by the number of GH130_1 or GH130_2 sequences, and used as edge attributes in a Cytoscape representation. In total, 74 and 62 co-occurrences of GH, PL, or CE sequences with GH130_1 or GH130_2 sequences, respectively, were counted. Prediction of transmembrane topology and signal peptides was performed using PHOBIUS.
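The neighborhood search and the co-occurrence weighting can be illustrated with a short sketch. Several points are assumptions rather than details given in the text: the 10-kbp criterion is read as a maximum gap between gene boundaries, a family is counted at most once per GH130 gene, and the frequency is the number of GH130 genes with that family nearby divided by the number of GH130 genes examined. The `Gene` record and function names are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass

WINDOW_BP = 10_000  # assumed reading of "10 kbp maximum upstream or downstream"


@dataclass(frozen=True)
class Gene:
    name: str
    family: str   # CAZy family such as "GH92"; empty string if not a CAZyme
    start: int
    end: int
    strand: str   # "+" or "-"


def neighbours(target, genes, window=WINDOW_BP):
    """CAZyme genes on the target's strand within `window` bp of the target."""
    found = []
    for g in genes:
        if g is target or not g.family or g.strand != target.strand:
            continue
        # Distance between gene boundaries; 0 when the genes overlap.
        gap = max(g.start - target.end, target.start - g.end, 0)
        if gap <= window:
            found.append(g)
    return found


def cooccurrence_frequency(neighbour_lists):
    """Per-family fraction of GH130 genes that have that family in their cluster."""
    n = len(neighbour_lists)
    if n == 0:
        return {}
    counts = Counter()
    for cluster in neighbour_lists:
        counts.update({g.family for g in cluster})  # count each family once
    return {family: c / n for family, c in counts.items()}
```

The resulting dictionary maps each family to a weight between 0 and 1, which is the kind of value that could be attached to an edge in a network representation.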

RESULTS
UhgbMP Substrate and Product Specificity-From the metagenomic sequence contained in the recombinant E. coli clone (accession number GU942931), we subcloned the UhgbMP-encoding gene to produce a soluble protein tagged with a His6 tag at the N terminus, with a yield of 32 mg of purified protein per liter of culture. We first characterized the substrate specificity of UhgbMP for carbohydrate phosphorolysis. In contrast to RaMP1 and BfMP, but similar to RaMP2 and BT1033, UhgbMP exhibits a relaxed specificity toward carbohydrate substrates (Table 1). UhgbMP is able to phosphorolyze β-D-mannopyranosyl-1,4-D-glucopyranose, β-1,4-linked D-manno-oligosaccharides, and mannan (β-D-Manp-1,4-(D-Manp)n, with n = 1-15), with a notable increase in specific activity with the degree of polymerization (DP) (Fig. 2). UhgbMP is thus, to date, the only characterized GH130 enzyme that is able to break down mannan, a constituent of hemicellulose in grains and nuts (45). Interestingly, UhgbMP is also able to phospho- […] ucts obtained during phosphorolysis reactions, irrespective of the carbohydrate substrate. UhgbMP is thus an exo-acting enzyme, able to break down only the first β-mannosidic linkage at the nonreducing end of oligosaccharides.
α-D-Mannopyranose 1-phosphate was therefore tested as a glycosyl donor for reverse phosphorolysis. First, even without any carbohydrate acceptor, UhgbMP produces mannose and, furthermore, manno-oligosaccharides of DP ranging from 1 to 12, indicating that water itself plays the role of first acceptor at the beginning of the reaction. The β-1,4 regio-specific synthesis of manno-oligosaccharides was characterized by ¹H and ¹³C NMR (supplemental Fig. 2). Various carbohydrates or polyols were tested as acceptors (Table 1). D-GlcpNAc and β-D-GlcpNAc-1,4-D-GlcpNAc were the best recognized acceptors, given that the UhgbMP Km values for these compounds are 4- and 55-fold lower than for D-mannose, respectively. Starting from α-D-mannopyra- […] (Fig. 3). For β-D-GlcpNAc-1,4-D-GlcpNAc and β-D-Manp-1,4-β-D-GlcpNAc-1,4-D-GlcpNAc concentrations lower than 1 and 0.5 mM, respectively, the most probable mechanism is also a mixed-type sequential random Bi Bi mechanism. The K value calculated in these conditions is 0.32, a value similar in magnitude to the value 0.658 obtained for D-Manp-1,4-β-D-Glcp phosphorolysis by RaMP2 (34). Further investigations of the UhgbMP mechanism will be necessary to determine the order of substrate binding and product release and to better understand how this enzyme works on its natural substrates, including β-D-Manp-1,4-D-GlcpNAc.
Finally, with all other compounds tested as acceptors, UhgbMP synthesizes the same (Man)n series as when α-D-mannopyranose 1-phosphate was the sole substrate. This shows that none of the other tested molecules were good acceptors, because water and additional mannose and manno-oligosaccharide units were used preferentially by the enzyme. However, the presence of cellobiose, D-fucose, L-rhamnose, L- and D-xylose, N-acetyl-D-galactosamine (D-GalpNAc), D-altrose, D-allose, xylitol, D-lyxose, or D-mannitol decreased enzyme-specific activity, in some cases quite markedly (Table 3). The concentrations of these compounds did not decrease during the reaction, however, and no significant additional product was formed compared with the reaction in the presence of α-D-mannopyranose 1-phosphate as sole substrate. These carbohydrates thus act as UhgbMP inhibitors. Various sugar-phosphates were also tested as glycosyl donors for reverse phosphorolysis as follows: α-D-fructose 1- and 6-phosphate, D-ribose 1-phosphate, α-D-galactosamine 1-phosphate, α-D-glucosamine 1-phosphate, D-mannose 6-phosphate, and α-D-glucopyranosyl 1- and 6-phosphates. None of them was consumed by UhgbMP, and no reaction product appeared on HPAEC-PAD chromatograms after 24 h at 37 °C.
Key Residues Involved in Mannoside Phosphorolysis Mechanism-To investigate the UhgbMP catalytic mechanism, we first built a three-dimensional model of the enzyme using the atomic coordinates of the TM1225 protein from T. maritima MSB8 (PDB accession code 1VKD) as a structural template. The enzyme, which adopts a five-bladed β-propeller fold (Fig. 4A), was identified from structural genomics initiatives. Of the four proteins of known structure that share the same fold, TM1225 aligns with the highest sequence identity to UhgbMP (59%), but it has yet to be functionally characterized.
NOVEMBER 8, 2013 • VOLUME 288 • NUMBER 45

JOURNAL OF BIOLOGICAL CHEMISTRY 32375
To identify putative catalytic amino acid residues, we analyzed the multiple alignment of the 369 protein sequences referenced in CAZy family GH130 (January 2013) (supplemental Fig. 3). The most conserved amino acid residue positions were then mapped onto the three-dimensional model of UhgbMP. Of these, Asp-104 and Asp-304 were found to be part of a narrow groove putatively considered to be the active site. Another conserved amino acid residue, Glu-273, present in a short ␣-helix turn section, was also identified in the groove.
N-Glycan Degradation by Bacterial Glycoside Phosphorylases

These three amino acid residues were individually mutated to investigate their potential role in the UhgbMP catalytic mechanism. Results showed that mutation of Asp-104 into an asparagine completely abolished UhgbMP activity, whereas the D304N and E273Q variants retained only 3.9 ± 0.5 and 0.2 ± 0.1% of the native catalytic activity, respectively. These three amino acids were thus considered as candidate catalytic residues. To further elucidate the functional roles of these residues, molecular modeling techniques were used to dock α-D-mannopyranose 1-phosphate into the putative active site groove. Of the identified docking modes, only one appeared compatible with the spatial constraints imposed by the UhgbMP reaction mechanism (Fig. 4A) as follows: (i) provision for the specific recognition of mannose at the −1 subsite; (ii) substrate interaction with at least one of the putative catalytic acidic residue side chains (Asp-104, Asp-304, or Glu-273); (iii) presence of a favorable phosphate-binding site allowing reverse phosphorolysis to take place; (iv) presence of binding subsites able to accommodate β-D-Manp-1,4-β-D-GlcpNAc-1,4-D-GlcpNAc and β-1,4-D-mannan chains in catalytically productive binding modes with respect to active site residue(s) implicated in the chemical reaction mechanism (Fig. 4, B and C).
In this binding mode, where the α-D-mannopyranosyl 1-phosphate ring structure is stabilized in the −1 subsite through stacking interactions with Tyr-103, the Asp-104 residue located on the β-face of the catalytic chiral center can act as the unique proton donor during catalysis, whereas Arg-150, Arg-168, and Asn-151 amino acid residues can favorably assist phosphate group positioning consistent with inversion of configuration at C-1 of mannose (Fig. 4A). Given the high conservation of these amino acid residues within the GH130 family alignment, we suggest that enzymes contained in this family should share the same single displacement mechanism described for GH-like inverting phosphorylases. In such a mechanism, the aspartic acid corresponding to Asp-104 in UhgbMP would be the sole catalytic residue, having the role of proton donor (Fig. 5). The phosphate group, stabilized through ionic interactions with highly conserved Arg-150, Arg-168, and Asn-151 residues in UhgbMP, would then act as the nucleophile (Fig. 4A).

Molecular Basis of Substrate Specificity of GH130 Enzymes-The biochemical data from the kinetic characterization of BfMP, RaMP1, RaMP2, BT1033, and UhgbMP revealed marked specificity differences among the five enzymes. The +1 subsite of BfMP and RaMP1 is highly specific for glucose, whereas RaMP2, BT1033, and UhgbMP display looser specificity both toward carbohydrate substrates for phosphorolysis and acceptors for synthetic reactions. We therefore investigated whether the five characterized proteins could represent different GH130 subfamilies. Indeed, in most cases, enzymes classified in a CAZy subfamily share the same substrate and/or product specificity, reflecting a high degree of conservation in their active site (46, 47). A phylogenetic tree was constructed, based on the multiple alignment of the GH130 enzyme sequences. Three clusters of sequences clearly appeared. BfMP and RaMP1 are contained in subfamily GH130_1 (79 sequences, Fig. 6 and supplemental Table 1). UhgbMP, RaMP2, and BT1033 are contained in subfamily GH130_2 (42 sequences), together with TM1225 (PDB accession code 1VKD). The GH130_NC cluster contains 248 sequences of as yet uncharacterized proteins that are too heterogeneous to permit the creation of univocal subfamilies. This cluster contains the other three proteins of known structure, namely the BDI_3141 protein from P. distasonis ATCC 8503, the BT_4094 protein from B. thetaiotaomicron VPI-5482, and the BACOVA_03624 protein from B. ovatus ATCC 8483. However, a key residue position occupied by Tyr-103 in UhgbMP allows the discrimination of the GH130_NC sequences from those of the two other subfamilies. The tyrosine residue at position 103 in UhgbMP, which lies close to the phosphate group binding site in the three-dimensional model of the docked complex of the enzyme with α-D-mannopyranose 1-phosphate, is strictly conserved in subfamilies GH130_1 and GH130_2 but is replaced by a glutamic acid in 234 of the 248 sequences of the GH130_NC group (supplemental Fig. 3).
Moreover, residues Arg-150, Arg-168, and Asn-151, which could favorably assist phosphate group positioning in the UhgbMP active site, are not conserved in the GH130_NC cluster, although they are perfectly conserved in the GH130_1 and GH130_2 sequences (supplemental Fig. 3). We therefore suspect that the majority of the enzymes classified as GH130_NC are hydrolases and not phosphorylases. The Glu residue corresponding to Tyr-103 in UhgbMP is expected to act as the second catalytic residue, taking on the role of base. To validate this hypothesis, we attempted to transform UhgbMP, which possesses low intrinsic hydrolase activity when assayed in the absence of inorganic phosphate, into a hydrolase, through the replacement of Tyr-103 by a glutamic acid. In the presence of 10 mM Pi, the pNP-β-D-mannopyranoside breakdown activity of the wild-type enzyme was 6.3 × 10⁻³ ± 3 × 10⁻⁴ μmol·min⁻¹·mg⁻¹, a value similar to that obtained for mutant Y103E (5.7 × 10⁻³ ± 1 × 10⁻⁴ μmol·min⁻¹·mg⁻¹). In contrast, without any phosphate, the hydrolytic activity of the wild-type enzyme was 1.4 × 10⁻³ ± 1 × 10⁻⁴ μmol·min⁻¹·mg⁻¹, whereas it was 3.8 × 10⁻³ ± 3 × 10⁻⁴ μmol·min⁻¹·mg⁻¹ for Y103E. Nevertheless, the Y103E mutant pNP release curve versus time reached a plateau after only 10 min of reaction at 37 °C with and without phosphate, although less than 5% of substrate was consumed (supplemental Fig. 4). This phenomenon was also observed for the wild-type enzyme without phosphate. This indicates that inorganic phosphate is probably involved in maintaining the UhgbMP active site conformation and that the Y103E mutation may alter this conformation. Because the product yields are so low without phosphate, it remains difficult to conclude that the Y103E mutant is really a hydrolase. Functional and structural investigations will thus be needed to confirm that the GH130_NC cluster contains mannoside hydrolases.
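The comparison between wild-type and Y103E activities can be expressed as a ratio with a propagated uncertainty. The sketch below is illustrative only: it assumes that the reported ± values are independent standard deviations (the text does not say how they were obtained) and uses first-order error propagation for a quotient. The function name is hypothetical, and the units cancel in the ratio.

```python
import math


def ratio_with_error(a, a_err, b, b_err):
    """Return a/b and its uncertainty, assuming independent errors on a and b."""
    if a == 0 or b == 0:
        raise ValueError("both values must be non-zero")
    ratio = a / b
    rel_err = math.sqrt((a_err / a) ** 2 + (b_err / b) ** 2)
    return ratio, abs(ratio) * rel_err


# Hydrolytic activities without phosphate, as quoted in the text.
y103e = (3.8e-3, 3e-4)
wild_type = (1.4e-3, 1e-4)

fold, fold_err = ratio_with_error(*y103e, *wild_type)
print(f"Y103E / wild type without phosphate: {fold:.2f} +/- {fold_err:.2f}")
```

Such a ratio only describes the initial rates; it does not address the early plateau of pNP release noted above.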
Our attempts to check the correct folding of the Y103E variant, as well as that of the wild-type enzyme and of variants D104N, D304N, and E273Q, by circular dichroism using a J-815 UV Spectrum spectropolarimeter (Jasco) failed because of the high absorbance of Tween 80 at wavelengths between 200 and 290 nm (48).
To study the active site conservation in family GH130, we projected the results of the multiple alignment of the GH130 protein sequences onto the UhgbMP three-dimensional model, in which β-D-Manp-1,4-β-D-GlcpNAc-1,4-D-GlcpNAc or β-1,4-linked D-mannoheptaose (Fig. 7) was docked to enable the mapping of putative carbohydrate-binding subsites. Clearly, residues lining the UhgbMP catalytic furrow are highly conserved only among the GH130_2 sequences. The putative +1 subsite appears delimited by Tyr-103, Asp-304, His-174, Tyr-240, and Phe-283 residues that are specifically conserved in the GH130_2 subfamily (Fig. 4B and supplemental Fig. 5). In the presence of β-D-Manp-1,4-β-D-GlcpNAc-1,4-D-GlcpNAc, Tyr-103 and Asp-304 residues can form H-bond interactions with the N-acetyl-D-glucosamine positioned in the +1 subsite. In the presence of β-1,4-linked D-mannoheptaose, Tyr-103 can interact with the mannosyl residue located in the +1 subsite, whereas Asp-304 establishes a hydrogen bond with the mannosyl moiety at the +2 subsite. Asp-304 thus probably plays a key role in substrate binding at the +1 and +2 subsites, explaining why its mutation dramatically alters UhgbMP activity.
The Tyr-242 and Phe-283 residues provide additional stacking platforms at the +1 subsite to stabilize the bound glycosyl unit. At the putative +2 subsite, the N-acetyl-D-glucosamine residue at the reducing end of β-D-Manp-1,4-β-D-GlcpNAc-1,4-D-GlcpNAc is found stabilized through hydrogen bonding interactions with Tyr-242, Pro-279, Asn-280, and Asp-304, while establishing van der Waals interactions with Val-278. Interestingly, in our three-dimensional model of the active site, the N-acetyl groups of the D-GlcpNAc moieties docked at the +1 and +2 subsites are found to fit nicely into a pocket formed by the loop Pro-271-Pro-284, which is locked in a "closed conformation" via a salt bridge between Asp-277 and Arg-65 residues (Fig. 4B and supplemental Fig. 5). The conformation of this loop may be modified by the E273Q mutation, thus altering oligosaccharide accommodation in the active site. Sequence analysis indicates that such a loop is very well conserved within the subfamily GH130_2 but less so within GH130_1 (supplemental Fig. 3). This could thus suggest an involvement of this loop in determining the specificity of GH130_2 enzymes toward manno-oligosaccharides longer than DP2.
The putative +3 to +6 subsites have been mapped using the docked mannoheptasaccharide (Figs. 4C and 7 and supplemental Fig. 5). Surprisingly, very few aromatic residues able to provide stacking interactions with the manno-oligosaccharide chain have been found. However, a dense network of likely hydrogen bonding interactions and van der Waals contacts could explain the increase of UhgbMP catalytic efficiency with the degree of polymerization of manno-oligosaccharides. In the putative +3 subsite, the mannosyl residue can be stabilized by interactions with Tyr-240, Asn-238, and Val-241 residues, whereas at the putative +4 subsite, the sugar moiety is rather engaged in interactions with Asn-280 and Asn-238 residues. Residues Tyr-264, Trp-208, and Arg-269 are seen to establish interactions with the mannosyl moiety located at subsite +5. The length of the funnel binding site is found to accommodate up to +5 subsites. Outside this funnel, the +6 subsite has been putatively defined as sitting on the aromatic Trp-208 residue, which is almost perfectly conserved among the GH130_2 subfamily and not in GH130_1.

Finally, two important differences between the GH130_1 and GH130_2 subfamilies, identified by sequence alignment analysis, could explain why GH130_1 enzymes seem to exhibit a very narrow specificity toward β-D-Manp-1,4-D-Glcp, as observed for RaMP1 and BfMP, whereas GH130_2 enzymes would be able to act on longer manno-oligosaccharides. First, the UhgbMP loop Gly-121–Gly-125, which defines the extremity of the −1 subsite, is very well conserved within the GH130_2 subfamily, whereas an insertion of 12 residues is observed in all GH130_1 sequences (supplemental Fig. 3). This longer loop could thus prevent the accommodation of long oligosaccharides in the negative subsite of GH130_1 enzymes. Second, we observed that the His-174 residue, which is not conserved in the GH130_1 subfamily and is contained in the UhgbMP Pro-169–Asp-179 loop, interacts with the sugar moiety bound at the +1 subsite. Interestingly, a motif containing five glycine residues, which could play a major role in substrate accommodation within the active site, was found within the GH130_1 subfamily in place of the His-174 observed in GH130_2 enzymes (supplemental Fig. 3).
Genomic Context of GH130 Encoding Genes, Focus on Human Gut Bacteria – Differences in substrate specificity of enzymes belonging to the GH130_1 and GH130_2 subfamilies may reflect different roles in the microbial ecosystem in which they are produced. Of the known sequences, 23 of the 79 GH130_1, 17 of the 42 GH130_2, and 25 of the 248 GH130_NC enzymes belong to human gut bacteria. To assess how globally prevalent these GH130 encoding genes are in gut microbiomes, we compared the 369 GH130 sequences with the human fecal metagenome sequences currently available, sampled from 162 individuals of the MetaHit cohort (3, 49) and 139 individuals of the NIH Human Microbiome Project (50). No fewer than 15 GH130_1, 10 GH130_2, and 14 GH130_NC sequences were detected in the fecal metagenome of at least 50 of the 301 individuals (Fig. 6). With the exception of four (one each from pig and chicken gut bacteria and two from cow rumen bacteria), these sequences all belong to human gut bacteria. The UhgbMP sequence was detected in the metagenomes of 93 of the 301 individuals considered. This indicates that it is not a rare gene and suggests that it plays an important role in mannose foraging in the gut. The highest occurrence values (sequences found in 173 to 197 individuals) were found for the GH130_1 sequence of Bacteroides uniformis ATCC 8492 and the GH130_2 sequences of B. ovatus ATCC 8483, Bacteroides caccae ATCC 43185, and B. thetaiotaomicron VPI-5482 (BT1033 sequence) (GenBank™ accession numbers ZP_02068954.1, ZP_02067106.1, ZP_01958898.1, and AAO76140.1, respectively), which can be considered common genes according to the definition of Qin et al. (3), as they were found in more than 50% of the individuals.
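The prevalence figures quoted above reduce to simple ratios, which the following minimal Python sketch recomputes from the counts given in the text. The helper name `is_common` and the variable names are illustrative choices only; the 50% threshold is the "common gene" criterion as it is stated in this paragraph.

```python
# Counts quoted in the text: (sequences from human gut bacteria, total sequences).
gut_counts = {
    "GH130_1": (23, 79),
    "GH130_2": (17, 42),
    "GH130_NC": (25, 248),
}

# 162 MetaHit individuals + 139 NIH Human Microbiome Project individuals.
COHORT_SIZE = 162 + 139


def is_common(n_individuals, cohort_size=COHORT_SIZE):
    """Illustrative helper: True if a gene is found in more than 50% of individuals."""
    return n_individuals / cohort_size > 0.5


total_gut = sum(gut for gut, _ in gut_counts.values())
total_all = sum(total for _, total in gut_counts.values())

for name, (gut, total) in gut_counts.items():
    print(f"{name}: {gut}/{total} = {gut / total:.1%} from human gut bacteria")

print(f"All GH130 sequences: {total_gut}/{total_all}")
print(f"UhgbMP: 93/{COHORT_SIZE} = {93 / COHORT_SIZE:.1%}, common: {is_common(93)}")
print(f"173 individuals: {173 / COHORT_SIZE:.1%}, common: {is_common(173)}")
```

The totals recovered this way (65 gut sequences out of 369) match the figures used in the rest of the text, and the lowest of the top occurrence values (173 of 301 individuals) clears the 50% threshold, whereas the UhgbMP sequence (93 of 301) does not.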
Interestingly, the mean number of GH130 BLAST hits per Mbp of sequence, obtained against the metagenomes of the 27 IBD patients (suffering from either Crohn disease or ulcerative colitis) of the MetaHit cohort, was 12, 28, and 58% higher than that obtained from the data sampled from the 135 other individuals (healthy or obese individuals), for GH130_1, GH130_2, and GH130_NC, respectively (supplemental Table 2). This indicates a higher prevalence of GH130 encoding genes, and particularly of GH130_2 and GH130_NC, in the gut microbiome of IBD patients. This difference in prevalence is small, but it could be significant, considering that the gut microbiome of IBD patients harbors, on average, 25% fewer genes than that of healthy individuals (3).
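As an illustration of the comparison described above, the sketch below normalizes BLAST hit counts by metagenome size and expresses the difference between two groups as a percentage. It assumes that hits are normalized per individual and then averaged, which the text does not specify. The function names and all input values are hypothetical and are not data from the study.

```python
from statistics import mean


def hits_per_mbp(n_hits, n_bases):
    """Normalize a BLAST hit count by the metagenome size expressed in Mbp."""
    return n_hits / (n_bases / 1e6)


def percent_excess(group_a, group_b):
    """Percent by which the mean of group_a exceeds the mean of group_b."""
    return (mean(group_a) / mean(group_b) - 1.0) * 100.0


# Hypothetical (hits, sequenced bases) pairs per individual, NOT study data.
ibd_samples = [(30, 2_000_000), (45, 3_000_000)]
other_samples = [(24, 2_000_000), (39, 3_000_000)]

ibd = [hits_per_mbp(h, b) for h, b in ibd_samples]
others = [hits_per_mbp(h, b) for h, b in other_samples]

print(f"IBD group is {percent_excess(ibd, others):.0f}% higher than the other group")
```

With these toy inputs the IBD group comes out about 20% higher. The values of 12, 28, and 58% reported in the text would be obtained in the same way from the real hit counts in supplemental Table 2, if the normalization assumed here matches the one actually used.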
The multigenic cluster containing the UhgbMP encoding gene and that from B. stercoris ATCC 43183 containing its homolog BACSTE_03540 (accession number EDS13361.1) are clear examples of this gene organization, with several GH92, GH18, and GH97 sequences surrounding a GH130_2 encoding gene (Fig. 1 and supplemental Fig. 6). In these cases, the results of in silico detection of signal peptides or transmembrane topology indicate that UhgbMP and BACSTE_03540 are probably intracellular, whereas the glycoside hydrolases encoded by the same multigenic clusters are more likely to be secreted.

DISCUSSION
The biochemical characterization of UhgbMP, a representative of the GH130 enzyme family, revealed its flexibility toward carbohydrate substrates. UhgbMP is capable of catalyzing the phosphorolysis of β-D-mannopyranosyl-1,4-D-glucopyranose, β-1,4-D-manno-oligosaccharides of DP >5 and mannan, as well as the N-glycan core oligosaccharide β-D-Manp-1,4-β-D-GlcpNAc-1,4-D-GlcpNAc. Based on a sequence analysis of the GH130 family and on a structural model of UhgbMP, supported by the experimental findings presented here and elsewhere by Senoura et al. (32), Nihira et al. (33), and Kawahara et al. (34), we propose the creation of two GH130 subfamilies, the members of which probably share the same single displacement mechanism involving a single catalytic acidic residue (corresponding to UhgbMP Asp-104) acting as the proton donor. Subfamily GH130_1 gathers together enzymes (including BfMP and RaMP1) exhibiting narrow specificity toward β-D-Manp-1,4-D-Glc. Conversely, enzymes of the GH130_2 subfamily show high promiscuity toward their substrates and products. Their active sites would be sufficiently extensible to accommodate complex carbohydrate structures such as β-D-mannan and β-D-manno-oligosaccharides, with or without β-D-GlcpNAc-1,4-D-GlcNAc, D-GlcpNAc, or D-glucose at their reducing end. The resolution of crystallographic structures of representatives of each of the GH130 clusters in complex with substrates and products will, however, be necessary to confirm the role of the key residues identified here and to deepen our understanding of the reaction mechanisms operating in these enzymes.
Among the 369 enzymes archived in the GH130 family (January 2013), 65 belong to human gut bacteria. The strong prevalence of the corresponding genes, in particular that coding for UhgbMP, in the human gut metagenome suggests that these enzymes play a major role in mannose foraging in this ecosystem. Here, we have demonstrated that UhgbMP, and probably also the 16 other enzymes in the proposed GH130_2 subfamily produced by gut bacteria (supplemental Table 1), including BT1033, are able to participate in breaking down the host mannosylated glycoproteins lining the intestinal epithelium.
In vivo, incorporation of GH reaction products into metabolic pathways is energy-consuming, whereas direct metabolism of the glycoside phosphates synthesized by GPs does not require ATP. Phosphorolysis reactions mediated by GPs would thus be more advantageous than hydrolysis under certain physiological conditions, for example during intensive use of carbohydrate resources or in anoxic environments such as the gastrointestinal tract, where ATP cannot be efficiently produced by the respiratory chain-linked phosphorylation process (51).
In common with other GPs, especially those previously identified in microorganisms having a facultatively anaerobic lifestyle like Bifidobacterium sp., Lactobacillus sp., or Clostridium sp. (50), enzymes in the proposed GH130_1 and GH130_2 subfamilies can therefore be considered as catabolic enzymes, working in tandem with catabolic GHs in dietary and host mannoside breakdown.
Based on the analysis of the genomic context of GH130_2 encoding genes in gut bacteria, and in accordance with Nihira et al. (33), we propose that mannoside phosphorolysis catalyzed by GH130_2 enzymes acts in concert with GH activities of enzymes in the GH18 and GH92 families to completely break down N-glycans, as proposed in Fig. 1. The role of the GH97s whose genes belong to the same multigenic systems as those coding for the proposed GH130_2 enzymes (for example, in B. stercoris ATCC 43183) is less clear. Family GH97 contains only three characterized enzymes, of which two are α-glucosidases with α-1,4-link specificities (52, 53). However, the functional diversity of GH97 enzymes has not yet been thoroughly explored, and although these enzymes were previously thought to contribute to the breakdown of dietary carbohydrates in the human gut, they may well also be active in the breakdown of N-glycans, for example by liberating, like the GH99 enzymes (18), the α-1,3-linked glucosyl residue at the nonreducing end of the immature N-glycans. In the particular case of the unknown gut bacterium producing UhgbMP and of B. stercoris ATCC 43183 producing its homolog BACSTE_03540, and based on the results of in silico detection of signal peptides, we assume that the physiological role of UhgbMP and BACSTE_03540 would be the intracellular phosphorolysis of short oligosaccharides that can be internalized in the cell (di- or trisaccharides (54)), like the β-D-Manp-1,4-D-GlcpNAc disaccharide or the β-D-Manp-1,4-β-D-GlcpNAc-1,4-D-GlcpNAc trisaccharide resulting from extracellular hydrolysis of N-glycans by glycoside hydrolases belonging to the GH18, GH92, and maybe also GH97 families. This deglycosylation arsenal thus allows the glycoproteins of the intestinal epithelium to be deprotected. Their protein part could then be degraded by endopeptidases, such as those putatively encoded by the genes ADD61465.1 and ADD61469.1, which belong to the same metagenomic DNA fragment as the UhgbMP encoding gene, so as to perforate the intestinal epithelium (Fig. 1). However, further studies will be needed to confirm the physiological role of these enzymes, such as metabolomic and transcriptomic analyses of gut bacteria producing GH130_2 enzymes (like B. stercoris ATCC 43183) in the presence of N-glycans as carbon source.
The results presented here show that in vitro, UhgbMP is also able to effectively degrade manno-oligosaccharides and mannan itself. This is also probably the case for other gut bacterial enzymes of the proposed GH130_2 subfamily, which show a similar active site topology, allowing the accommodation of long oligosaccharides. The intestinal bacteria that produce GH130_2 enzymes would thus be able to use dietary hemicelluloses and their hydrolysis products as carbon sources if needed, which would give them a competitive advantage over other bacteria in colonizing and maintaining themselves in the intestinal tract. The use of mannose-rich carbohydrates (plant or yeast mannans, linear β-1,4-linked manno-oligosaccharides, or β-D-Manp-1,4-β-D-GlcpNAc-1,4-D-GlcpNAc) as functional foods could thus allow the metabolism of the N-glycan-degrading bacteria to be bypassed and, eventually, may reduce the degradation of the epithelial barrier for therapeutic applications, especially for patients suffering from IBD. Indeed, in their gut metagenome, the known GH130_2 and GH130_NC encoding genes seem highly prevalent, although enzyme assays on fecal samples, as well as metatranscriptomic and metaproteomic studies, would be needed to establish whether GH130 enzymes could be biomarkers of IBD. However, this approach presents a risk of overfeeding this specific type of bacteria and increasing their prevalence in the gut, and thus the potential for N-glycan breakdown, in the case of reduced intake of exogenous mannose-rich carbohydrates. The utilization of GH130 enzyme inhibitors, like D-altrose, D-xylose, and D-allose for UhgbMP, or suicide inhibitors that could be specifically designed thanks to the available or future structural data, would thus be the preferred strategy in therapeutic contexts.
Finally, we have shown here that the relaxed specificity displayed by UhgbMP toward acceptor substrates can be used for the stereo- and regioselective synthesis of original glycoconjugates and oligosaccharides. In particular, this enzyme is able to synthesize β-1,4-linked D-manno-oligosaccharides, which have been reported to present prebiotic properties (55), even if their use as functional food may be risky, from our point of view. UhgbMP is also highly effective for production of N-glycan core oligosaccharides, such as β-D-Manp-1,4-D-GlcNAc and β-D-Manp-1,4-β-D-GlcpNAc-1,4-D-GlcpNAc, whose commercial price today exceeds $10,000 per mg. UhgbMP-based β-mannoside synthesis processes appear highly attractive compared with those based on mannosyltransferases, which use expensive activated sugar nucleotides as donors (56, 57), or on transmannosylation catalyzed by native or engineered mannosidases or mannanases (44, 58–62). Indeed, a two-step UhgbMP-based process would use phosphate and β-mannan as substrates, with phosphorolysis as the first step and reverse phosphorolysis in the presence of hydroxylated acceptors as the second step, to synthesize mannosylated products. UhgbMP is thus an enzymatic tool with high potential for synthesizing molecules like N-glycan core oligosaccharides, which are of major importance for studying, and potentially also for controlling, interactions between host and gut microbes.