Characterization of Glycosaminoglycan (GAG) Sulfatases from the Human Gut Symbiont Bacteroides thetaiotaomicron Reveals the First GAG-specific Bacterial Endosulfatase

Background: Sulfatases are emerging as key adaptive tools of commensal bacteria to their host.
Results: The first bacterial endo-O-sulfatase and three exo-O-sulfatases from the human commensal Bacteroides thetaiotaomicron, specific for glycosaminoglycans, have been discovered and characterized.
Conclusion: Commensal bacteria possess a unique array of highly specific sulfatases to metabolize host glycans.
Significance: Bacterial sulfatases are much more diverse than anticipated.

Despite the importance of the microbiota in human physiology, the molecular bases that govern the interactions between these commensal bacteria and their host remain poorly understood. We recently reported that sulfatases play a key role in the adaptation of a major human commensal bacterium, Bacteroides thetaiotaomicron, to its host (Benjdia, A., Martens, E. C., Gordon, J. I., and Berteau, O. (2011) J. Biol. Chem. 286, 25973–25982). We hypothesized that sulfatases are instrumental for this bacterium, and related Bacteroides species, to metabolize highly sulfated glycans (i.e. mucins and glycosaminoglycans (GAGs)) and to colonize the intestinal mucosal layer. Based on our previous study, we investigated 10 sulfatase genes induced in the presence of host glycans. Biochemical characterization of these potential sulfatases allowed the identification of GAG-specific sulfatases selective for the type of saccharide residue and the attachment position of the sulfate group. Although some GAG-specific bacterial sulfatase activities have been described in the literature, we report here for the first time the identity and the biochemical characterization of four GAG-specific sulfatases. Furthermore, contrary to the current paradigm, we discovered that B. thetaiotaomicron possesses an authentic GAG endosulfatase that is active at the polymer level. This type of sulfatase is the first one to be identified in a bacterium.
Our study thus demonstrates that bacteria have evolved more sophisticated and diverse GAG sulfatases than anticipated and establishes how B. thetaiotaomicron, and other major human commensal bacteria, can metabolize and potentially tailor complex host glycans.

Bacteria are by far the most abundant and diverse microorganisms inside the human gastrointestinal tract, with Firmicutes and Bacteroidetes being the major phyla (1-3). How these bacteria compete efficiently in this fierce ecosystem is still poorly understood despite their major role in human physiology (4). Although the human gut provides a constant source of nutrients and tightly controlled conditions, only a few phyla are abundant, suggesting a selective adaptation of these bacteria to their host (1). Key studies have highlighted that the prominent gut bacterium Bacteroides thetaiotaomicron has evolved complex relationships with its host (5,6). Of particular interest, it has been established that B. thetaiotaomicron relies on host glycan foraging to enhance its fitness in the host (7). Recently, we have shown that sulfatases play a pivotal role in this process, allowing B. thetaiotaomicron to persist in the gastrointestinal tract (8).
Sulfatases catalyze the hydrolysis of sulfate groups from a broad range of substrate molecules, ranging from small organic compounds to large macromolecules such as mucins or glycosaminoglycans (GAGs). In humans, sulfatases have been studied in great detail and have been shown to be involved in genetic disorders and cancer (9). Most of them are lysosomal, and at least six are exo-enzymes responsible for GAG degradation (10). In addition, one GAG endosulfatase has been identified and shown to be critical for heparan sulfate (HS) remodeling (11). In contrast to the mammalian sulfatases, few bacterial GAG sulfatases have been characterized, with three heparin sulfatases identified in the bacterium Flavobacterium heparinum (now called Pedobacter heparinus) (12-14). In addition, several GAG sulfatase activities acting on chondroitin sulfate (CS) have been reported in B. thetaiotaomicron (15) and Proteus vulgaris (16).
Among hydrolases from bacteria to humans, sulfatases are unique in requiring a critical post-translational modification of a key active site residue. This modification leads to the formation of the so-called "formylglycine" (3-oxoalanine), which is the only naturally occurring amino acid with an aldehyde function (17). Although the role of formylglycine in catalysis is still not completely understood, we have shown this modification to be dependent on three distinct enzymatic systems in bacteria (17-20). In B. thetaiotaomicron only one of these systems, the anaerobic sulfatase-maturating enzyme (anSME), which is part of the large and diverse family of radical S-adenosyl-L-methionine enzymes (21,22), is involved in sulfatase activation (20). By inactivating anSME, we have demonstrated that the ability of B. thetaiotaomicron to survive in the gastrointestinal tract is strongly impaired (8). Transcriptomic analyses further revealed that, in the absence of a functional anSME, the expression profile of genes related to host glycan metabolism, especially those encoding putative sulfatases, was markedly altered, suggesting their involvement in host glycan metabolism.
To decipher the function of these potential sulfatases, we cloned 10 B. thetaiotaomicron genes, nine of which are induced by host glycans and five of which belong to polysaccharide utilization loci (Table 1) (7,8). These genes were expressed in Escherichia coli in the presence of the B. thetaiotaomicron anSME (anSMEbt) that we have previously characterized (20,23). The activity of the soluble purified proteins was then assayed with synthetic and natural GAG substrates. We especially focused on heparin/HS and chondroitin sulfate/dermatan sulfate (CS/DS), two major classes of sulfated GAGs, to determine the activity and specificity of these enzymes. Both classes of GAGs are characterized by alternating units of a hexuronate and an amino sugar, N-acetylglucosamine (GlcNAc) or N-acetylgalactosamine (GalNAc), respectively, with the heparan polymer containing GlcUAβ1-4GlcNAcα1-4 and CS containing GlcUAβ1-3GalNAcβ1-4 as basic repeat units.
During biosynthesis, these structures are further modified by extensive sulfation and epimerization. In HS, some of the GlcNAc units are N-deacetylated and N-sulfated during biosynthesis, resulting in an N-sulfated GlcN unit (GlcNS). The hexuronate, either glucuronate (GlcUA) or its C5-epimer iduronate (IdoUA), may be 2-O-sulfated, whereas the GlcNAc and GlcNS units can further be 6-O- and, rarely, 3-O-sulfated. CS/DS can be sulfated at position 2 on IdoUA and, more rarely, on GlcUA, and at positions 4 and 6 of the GalNAc unit. These sulfate groups are not only critical for the functional properties of the GAGs but also confer resistance to the hydrolytic activity of bacterial glycosidases (24).
In B. thetaiotaomicron, we identified two different 6-O-sulfatases with a strict specificity for either gluco- or galactosaminoglycans. We also report a 2-O-sulfatase that removes sulfates from a hexuronate unit independent of the parent GAG and whose structure is reported. Whereas these three enzymes are exolytic hydrolases, like all bacterial sulfatases reported to date, we also identified, in a totally unanticipated manner, a unique GAG endosulfatase. This novel enzyme very efficiently removes sulfate groups in the 4-O-position from CS/DS substrates ranging from disaccharides to large polymeric chains and is thus the first bacterial GAG endosulfatase reported to date that is active at the polymer level.
We not only discovered novel bacterial sulfatases but also demonstrate the complex interplay between these enzymes in host glycan degradation. Finally, we show that these enzymes are widespread in gut-associated bacteria, underlining the importance of such metabolic pathways for a dominant phylum of the human microbiota, the Bacteroidetes. These novel sulfatases constitute attractive targets to manipulate the human microbiota and could potentially influence gut epithelial integrity through their unique capacity to selectively modify human GAGs.
Cloning of Sulfatase Enzymes-Sulfatases were cloned into the pRSF-Duet1 expression vector (Novagen) at the MCS1 cloning site with an N-terminal His6 tag that allowed purification on a Ni-NTA column. The sulfatase-maturating enzyme from B. thetaiotaomicron (anSMEbt (20)) was cloned into the same vector at the MCS2 cloning site, which allowed the sulfatase to be activated during expression in E. coli. The expression vectors were transformed into One Shot BL21 Star (DE3) E. coli (Invitrogen). The following primers were used for the corresponding sulfatase genes: BT_0756, 5′-GGA TCC AAT GGC CTC TGC TGT GCA-3′ and 5′-CTG CAG TTA TTT TTT TAA TGA TGA TTT CTT CAC TT-3′; BT_1596, 5′-GGA TCC AAT GGG ATT AGC CCT TTG TGG-3′ and 5′-CTG CAG TTA TTT TCT TTT GAG GAT CTC CCG-3′; BT_1624, 5′-GGA TCC AAT GAG AAA AGA ATT TTA TGG TAT ATT ACC-3′ and 5′-CTG CAG TTA TAG TGG CAG ACC GTA GCG-3′; BT_1628, 5′-GGA TCC AAT GCC GGA AGG CCA TC-3′ and 5′-CTG CAG TTA TTC CTT GTC CCT TTC CG-3′; BT_1918, 5′-GGA TCC AAT GAT TAA CCT GAA ATG TAC ATT TGC-3′ and 5′-CTG CAG TTA TCG CTT TTC TTT CGG ATA GTT-3′; BT_3095, 5′-GCA TGA CTG AAT TCA ATG CAG CGT TTT GTA TTA CGG-3′ and 5′-GAG TCT ACG TCG ACT TAT TGT TCG GGT TGG AGA TAA TT-3′; BT_3101, 5′-GGA TCC AAT GAA TAG ACT ATT TTT GAG TGT TTC TGT-3′
Expression and Purification of Sulfatase Enzymes-E. coli was used for cloning and expression of sulfatase genes using previously reported procedures (20). Briefly, transformants carrying plasmids coexpressing a sulfatase gene with anSMEbt were selected on LB-agar plates containing 50 µg/ml kanamycin and grown in LB medium supplemented with the same antibiotic at 37°C under agitation at 200 rpm. When the A600 reached 0.7, the incubation temperature was lowered to 21°C, and sulfatase expression was induced by adding isopropyl β-D-1-thiogalactopyranoside (500 µM final concentration). The culture was continued overnight, and bacteria were harvested by centrifugation at 5000 × g for 20 min at 4°C. After resuspension in buffer A (50 mM Tris/HCl, pH 7.0, 100 mM KCl, 10 mM MgCl2), the cells were disrupted by sonication and centrifuged at 100,000 × g for 1 h at 4°C. The cell extract was then loaded onto a Ni-NTA-Sepharose column equilibrated with buffer A. The column was washed extensively with the same buffer. Weakly adsorbed proteins were washed off by applying 3 column volumes of 25 mM imidazole in buffer A, followed by 2 column volumes of 100 mM imidazole in the same buffer. Sulfatases were then eluted by applying 1 column volume of 500 mM imidazole in buffer A. Imidazole was removed using PD-10 desalting columns (GE Healthcare) equilibrated in buffer A. The sulfatase-containing fractions were immediately concentrated in Ultrafree cells (Millipore) with a molecular mass cutoff of 10 kDa. Sulfatase purity was assessed by SDS-PAGE, and the identity of each protein was confirmed by mass spectrometry.
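The primers listed above carry short 5′ extensions that look like restriction sites. As a minimal sketch (not the authors' procedure), the Python snippet below scans two of the listed primer pairs for four standard recognition sequences. The recognition sequences themselves are well established; that these particular enzymes were used for cloning is an assumption, as the text does not name them. The names `SITES`, `PRIMERS`, and `sites_in` are illustrative.

```python
# Standard recognition sequences (5'->3') of four common restriction enzymes.
SITES = {
    "BamHI": "GGATCC",
    "PstI": "CTGCAG",
    "EcoRI": "GAATTC",
    "SalI": "GTCGAC",
}

# (forward, reverse) primers copied from the text for two of the genes.
PRIMERS = {
    "BT_0756": (
        "GGA TCC AAT GGC CTC TGC TGT GCA",
        "CTG CAG TTA TTT TTT TAA TGA TGA TTT CTT CAC TT",
    ),
    "BT_3095": (
        "GCA TGA CTG AAT TCA ATG CAG CGT TTT GTA TTA CGG",
        "GAG TCT ACG TCG ACT TAT TGT TCG GGT TGG AGA TAA TT",
    ),
}


def sites_in(primer):
    """Return the names of all recognition sequences found in a primer."""
    seq = primer.replace(" ", "").upper()
    return [name for name, motif in SITES.items() if motif in seq]


for gene, (forward, reverse) in PRIMERS.items():
    print(gene, "forward:", sites_in(forward), "reverse:", sites_in(reverse))
```

Under this reading, most primer pairs would be BamHI/PstI pairs, whereas the BT_3095 pair would use EcoRI and SalI instead.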
To measure sulfatase activity on natural GAGs, sulfatases were incubated at a final concentration of 10 µM in 20 µl of 50 mM Tris/HCl, pH 7.5, 100 mM KCl, 10 mM MgCl2, containing 1 µg of GAG at 30°C for 8 h. Sulfatase activity was stopped by heating the sample at 96°C for 10 min. For specificity analyses, variable enzyme concentrations were used as indicated in the figure legends. Sulfated saturated mono- and unsaturated disaccharides were used at a 20 µM final concentration and enzymatically digested as indicated, followed by analysis by reversed-phase ion pairing-high performance liquid chromatography (RPIP-HPLC).
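To make the assay quantities concrete, the short Python sketch below converts the stated conditions into absolute amounts. It reads the conditions as 10 µM enzyme in a 20 µl reaction containing 1 µg of GAG; the function and variable names are illustrative only.

```python
def picomoles(conc_uM, volume_ul):
    """Amount of solute in pmol (1 µM x 1 µl = 1 pmol)."""
    return conc_uM * volume_ul


# Enzyme: 10 µM in a 20 µl reaction.
enzyme_pmol = picomoles(10, 20)

# GAG: 1 µg in 20 µl; µg/µl is numerically equal to mg/ml.
gag_mg_per_ml = 1 / 20

print(f"enzyme per reaction: {enzyme_pmol} pmol")
print(f"GAG concentration: {gag_mg_per_ml} mg/ml")
```

Each reaction therefore contains 200 pmol of enzyme and GAG at 0.05 mg/ml.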
Analysis of Sulfatase Products by Capillary Electrophoresis-Analyses by capillary electrophoresis (CE) were carried out on an Agilent CE apparatus, using a bare fused-silica capillary (64 cm × 50 µm). New capillaries were conditioned by successive flushes with NaOH (1 and 0.1 M) and water for 10, 5, and 10 min, respectively. Borate buffer (Agilent Technologies; 23 mM, pH 9.0, filtered through a 0.2-µm filter before use) was used as the CE background electrolyte. The capillary was thermostatically controlled at 25°C, and CE experiments were performed by applying a positive voltage of 30 kV at the capillary inlet. Prior to each sample injection, the capillary was flushed with the separation electrolyte for 10 min. Samples were loaded hydrodynamically by applying 50 millibars at the capillary inlet for 4 s. Detection was performed at 192 nm.
Analysis of Sulfatase Products by Reversed-phase Ion Pairing Chromatography-Sulfatase-treated GAGs (1 µg) were prepared for RPIP-HPLC analysis by exhaustive digestion to disaccharides with either 20 milliunits of chondroitinase ABC (for CS/DS) or a mixture of 0.4 milliunit each of heparin lyases I-III (for heparin or HS) at 37°C overnight as described (28), followed by heat inactivation of the enzymes. Products were separated by RPIP-HPLC and monitored by post-column fluorescence detection as described earlier (28).
Structure of the BT_1596 Sulfatase-The BT_1596 structure was solved by the Joint Center for Structural Genomics (29) and deposited in the PDB under code 3B5Q. Manual docking of the enzyme substrate was performed using the disaccharide ΔHexA2S-GalNAc4,6S derived from a structurally characterized heparin tetrasaccharide (PDB code 1BFB). The human cerebroside-3-sulfate 3-sulfohydrolase (ArsA) in complex with a synthetic substrate (PDB code 1E2S) was superimposed on the BT_1596 structure using Coot and the SSM program. The heparin disaccharide was manually docked using the ArsA substrate as reference.
NMR Analysis-NMR experiments were performed on a Bruker 600 MHz Avance III spectrometer using a 2.5-mm 1H/13C inverse detection probe, a 5-mm 1H/13C/15N/31P inverse detection QCI probe, or a 5-mm 1H/13C/15N/31P cryoprobe, all equipped with a z-gradient and controlled by Topspin 2.1 and 3.0 software. The incubated samples were repeatedly exchanged with D2O with intermediate lyophilization. The residual HOD signal was suppressed by saturation of the water peak during the recycle delay or by using the WATERGATE pulse sequence. For the one-dimensional NMR experiments with the substrates incubated with the enzymes, the diffusion-edited NMR experiment (ledbpgp2s1d) was used for effective suppression of the residual water and buffer signals while retaining the signals from the substrates. The diffusion delay Δ and the gradient pulse length δ were 100 and 1.9 ms, respectively. The gradient strength was set to 95%. Assignments of 1H and 13C resonances were achieved using COSY, TOCSY, NOESY, HSQC, HMBC, and HSQC-TOCSY experiments from the Bruker pulse sequence library.

Expression and Functional Assay of B. thetaiotaomicron Sulfatases-Based on our and other transcriptomic studies (7,8), we selected 10 potential sulfatase genes in the B. thetaiotaomicron genome, induced by host glycans (Table 1). These potential sulfatases were cloned and expressed in E. coli in the presence of the sulfatase-activating enzyme (anSMEbt) from B. thetaiotaomicron (20). Indeed, we have shown earlier that anSMEbt is critical for the post-translational modification and thus the activation of sulfatases from B. thetaiotaomicron (20). Purified sulfatases were first assayed with the simple chromogenic substrate pNP-S (Table 1). Half of these enzymes proved to be active on this synthetic substrate, identifying them as authentic sulfatases.
We further synthesized a more relevant substrate in which the sulfate group was on a carbohydrate moiety rather than on an aromatic ring. This substrate (GlcNAc6S-O-pNP; see Scheme 1a and "Experimental Procedures") (25) also contained a chromogenic group allowing the spectroscopic monitoring of the reaction by a coupled enzyme assay (Table 1). All the purified sulfatases were tested on this substrate; however, only two sulfatases, BT_4656 and to a much lesser extent BT_1628, hydrolyzed this reagent. Interestingly, despite its strong activity on this substrate, the BT_4656 sulfatase was inactive on pNP-S (Table 1).
Sulfatase Activity on Synthetic GAG-like Substrates-The B. thetaiotaomicron sulfatases were then tested against a library of synthetic saturated CS disaccharides with sulfate groups on the 2-O-position of the uronate unit and the 4-O- and 6-O-positions of the amino sugar as described previously (27,30).
Only the sulfatase BT_3349 (Table 1) proved to be active on these disaccharides. This enzyme hydrolyzed all the disaccharides containing a sulfate group in the 4-O-position, whether or not a sulfate group was present in the 6-O-position. To confirm the specificity of BT_3349, we incubated this enzyme with the synthetic di-O-sulfated disaccharide GlcUA-GalNAc4,6S (Scheme 1b). Comparison of the NMR spectra of the disaccharide in the absence or presence of BT_3349 (Fig. 1A) showed that the signals at ~4.8 and 4.9 ppm corresponding to the H4 of the GalNAc4,6S (α, β) unit disappeared, in agreement with desulfation at the 4-O-position of the GalNAc residue.
These results were consistent with transcriptomic analysis showing that two putative sulfatase genes, BT_3349 and BT_3333, are induced when B. thetaiotaomicron grows with CS as the sole carbon source (7). B. thetaiotaomicron is known to possess two distinct CS sulfatase activities, one specific for the 4-O-
As exoenzymes would not be expected to attack the sulfate groups on the sugar in the reducing-end position, we treated these disaccharides with a bacterial Δ4,5-glycuronidase that catalyzes the hydrolysis of the unsaturated hexuronate ring (31). The resulting monosaccharide mixtures were assayed with the BT_3333 and BT_4656 sulfatases. As shown, the monosaccharides GalNAc6S and GlcNAc6S were hydrolyzed by BT_3333 and BT_4656, respectively (Fig. 1B). Using a mixture of differentially sulfated CS disaccharides as substrates, we demonstrated that only the 6-O-sulfate group on the GalNAc unit was hydrolyzed by BT_3333 after removal of the ΔHexA unit (Fig. 1C) (where Δ stands for the 4,5-unsaturated ring structure and HexA for the uronate unit).

1H NMR analysis of BT_4656 in the presence of GlcNAc6S-O-pNP showed that the signals at ~4.4 and 4.2 ppm, corresponding to the two H6 protons of the substrate, disappeared from the NMR spectrum (Fig. 1D), consistent with its identity as an N-acetylglucosamine-6-O-sulfatase. In contrast, BT_4656 was inactive on a synthetic disaccharide with the GlcNAc6S unit in the reducing-end position (i.e. IdoUA2S-GlcNAc6S-O-Me, Scheme 1c and "Experimental Procedures"), in agreement with the experiments performed on the enzymatically produced disaccharides.
In HS, not only the GlcNAc unit but also the N-sulfated GlcN unit can be 6-O-sulfated. Furthermore, the rare 3-O-sulfation of the GlcN unit may hinder 6-O-sulfatase activity. We therefore tested the impact of additional sulfation of the GlcNAc unit on 6-O-desulfation. Although substitution with either an N-acetyl or an N-sulfate group did not affect sulfatase activity, 3-O-sulfation totally abolished enzyme activity, demonstrating tight substrate specificity (Fig. 1E). Taken together, these data established that the sulfatases BT_3333 and BT_4656 are strict 6-O-sulfatases with an exclusively exolytic mode of action.
Link between the CS and HS Degradation Pathways in B. thetaiotaomicron-Further investigation of the B. thetaiotaomicron sulfatases led to the identification of BT_1596 as a sulfatase active not only on HS-derived disaccharides but also on CS-derived disaccharides (Fig. 1B). Analysis of the different disaccharides present in solution showed that all types of 2-O-sulfated disaccharides, independent of the type of hexosamine unit and the presence of additional sulfate groups, were hydrolyzed by this enzyme. Notably, ΔHexA2S-GlcNAc, ΔHexA2S-GlcNAc6S, ΔHexA2S-GlcNS, and ΔHexA2S-GlcNS6S were completely desulfated (Fig. 2A), as were ΔHexA2S-GalNAc4S, ΔHexA2S-GalNAc6S, and ΔHexA2S-GalNAc4,6S (Fig. 2B). Comparison of BT_1596 activity on ΔHexA2S-GalNAc and ΔHexA2S-GlcNAc indicated similar hydrolytic activities on both CS- and HS-derived disaccharides (data not shown). However, BT_1596 had no effect on polymeric CS or HS, indicating that this enzyme is an exolytic sulfatase (Fig. 2C).
To confirm its exolytic nature and specificity, BT_1596 was incubated with a heparin-derived hexasaccharide (Scheme 1d) (32), and the reaction was analyzed by NMR (Fig. 2D). Comparison of the 1H and 13C signals of the hexasaccharide before and after incubation with BT_1596 showed chemical shift differences for the signals H1/C1 to H4/C4 at the nonreducing end. The large upfield shift of H2 by 0.84 ppm, from 4.64 to 3.80 ppm, together with the upfield shift of C2 by 3 ppm, proved desulfation at this position. The changes in chemical shifts for the other signals were also characteristic of a change from a Δ4,5-HexA2S to the nonsulfated Δ4,5-HexA (33,34). BT_1596 is thus a 2-O-sulfatase active on unsaturated nonreducing-end hexuronate units.
The activity of the BT_1596 sulfatase was further investigated by incubating the enzyme with the semisynthetic ultralow molecular weight heparin AVE5026 (Scheme 1f) (36). AVE5026 consists of a mixture of different oligosaccharides with an average Mr of ~2400 g/mol (i.e. ~8 units), in which the internal IdoUA residues are partially sulfated at position 2. NMR analysis showed AVE5026 to be an enzyme substrate (Fig. 2E). It also demonstrated that BT_1596 exclusively hydrolyzes the sulfate ester at the 2-O-position of the Δ4,5-HexA2S nonreducing end of all oligosaccharides present in AVE5026. Furthermore, no change in the C2 chemical shifts or the H1 signal intensity of the internal IdoUA2S units was observed, confirming that they are not targets of the enzyme (33). Our study thus establishes BT_1596 as a Δ4,5-hexuronate-2-O-sulfatase connecting HS and CS metabolism in B. thetaiotaomicron, consistent with transcriptomic analysis showing its induction in the presence of host glycans (7).
Structure of the Δ4,5-Hexuronate-2-O-sulfatase BT_1596-Investigation of public databases revealed that the structure of the BT_1596 enzyme has been solved by the Joint Center for Structural Genomics (29) (PDB code 3B5Q) and identified as a putative sulfatase. Its structure shows an α/β topology, similar to other solved arylsulfatase structures (9). It is composed of six β-sheets in the larger N-terminal domain and of the canonical four anti-parallel β-sheets in the C-terminal domain, both surrounded by α-helices (Fig. 2F). In the highly charged active site, the critical residue (i.e. Ser-64) is located, as expected, at the beginning of an α-helix (9). Only one structure of a sulfatase in complex with a substrate analog is available. It was obtained using an alanine mutant of the human cerebroside-3-sulfate 3-sulfohydrolase (ArsA) and the synthetic substrate p-nitrocatechol sulfate (37). Interestingly, the superposition of ArsA and the BT_1596 sulfatase shows that the two enzymes share high structural similarity (root mean square deviation of 2.1 Å) and conserve several amino acids involved in catalysis and substrate binding. Inside the BT_1596 active site, in addition to the catalytic residue Ser-64 (formylglycine 64 in the post-translationally activated enzyme), His-119 and Arg-68 are potentially involved in the elimination of the sulfate ester intermediate, whereas His-180, Lys-117, and Lys-296 may assist the binding of the anionic sulfate of the substrate (Fig. 2G). Moreover, although no cation has been modeled in the BT_1596 crystal structure, a putative conserved metal-binding region is found, composed of Asp-24, His-25, Asp-283, and His-284, likely coordinating a metal cation in the active enzyme. Incubation of BT_1596 in the presence of EDTA abolished its activity, further supporting the requirement for a cation in the enzyme active site.
Based on this analysis and using the ArsA substrate as a reference, we manually docked a disaccharide containing a Δ4,5-HexA2S (PDB code 1BFB) inside the BT_1596 active site (Fig. 2G). Interestingly, our model is supported by the facts that the docked disaccharide superimposes with a sulfated piperazine crystallized within the enzyme active site and that the four sulfate oxygen atoms coincide with four water molecules located inside the enzyme active site. In this model, two charged amino acids (i.e. Lys-296 and Lys-117) are ideally positioned to interact with the sulfate group, whereas Glu-378 and Asn-95 seem to be responsible for the correct orientation of the unsaturated unit inside the enzyme active site.
bacterial sulfatases. In contrast, BT_3349 was active on the reducing-end unit of saturated (Fig. 1A) or unsaturated CS disaccharides (Fig. 1B) but not on monosaccharide units (30), suggesting a possible endolytic mode of activity.

BT_3349 Is a Bacterial Endo-4-O-sulfatase Active from Di-to
To probe this hypothesis, we assayed BT_3349 on various polymeric CS with different degrees and locations of sulfate modifications (Table 2). With all CS used, we observed extensive polymer desulfation related to the chain content of 4-O-sulfate groups (Fig. 3, A and B). Even with an excess of enzyme, only 4-O-sulfate groups were removed from any of the substrates, indicating a high degree of specificity of this enzyme.
To unequivocally demonstrate 4-O-desulfation, we incubated the BT_3349 sulfatase with CS from shark cartilage (CSD) and followed the reaction by NMR. Analysis of the two-dimensional COSY, TOCSY, and HSQC NMR spectra showed shifts of all GalNAc4S H4/C4 signals by −0.61/−8.6 ppm, H3/C3 signals by −0.21/+5 ppm, and H5/C5 signals by −0.13/+0.4 ppm upon enzyme addition (Fig. 3, C and D). These changes in 1H and 13C chemical shifts indicate desulfation at the 4-O-position of GalNAc4S (38-40). Taken together, these data demonstrated that the vast majority of the 4-O-sulfate groups present on the GalNAc units in the polymer chain were hydrolyzed. Quantitative analysis confirmed that, depending on the CS used, between 90 and 100% of 4-O-sulfate groups were removed (Fig. 3B).
Nevertheless, the substitution of CS units with sulfate groups in other positions and the presence of epimerized hexuronate units affected the sulfatase activity (Fig. 4A). Mono-4-O-sulfated disaccharides were essentially completely hydrolyzed by <0.5 µM enzyme at a constant substrate level, whereas the presence of iduronate or sulfation of the uronate residue (more often on IdoUA than GlcUA) in the chain reduced the efficiency of the enzyme (Fig. 4, cf. CSB versus CSA, -D, or -E). Substitution of the 4-O-sulfated GalNAc with an additional sulfate group in position C6 reduced the efficiency slightly, with all 4-O-sulfate groups hydrolyzed by ~1 µM enzyme (Fig. 4A, CSE).
As the distribution of sulfate groups in different chain types is heterogeneous, analysis at the chain level gives only a semi-quantitative indication of the impact of modifications surrounding the target sulfate group. We therefore employed lyase-generated disaccharides to determine the influence of sulfate groups in position C2 of the hexuronate and C6 of the hexosamine unit on 4-O-sulfatase efficiency. Although 6-O-sulfation increased the enzyme concentration required to hydrolyze a disaccharide substrate from ~0.05 µM (for mono-4-O-sulfated disaccharides) to ~0.5 µM (for di-4,6-O-sulfated disaccharides), an additional 10-fold more enzyme (~5 µM) was necessary when a sulfate group was present in position C2 of the hexuronate (Fig. 4B). Sulfation in all three positions further hindered sulfatase activity.
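The effect of neighboring sulfate groups can be summarized as a fold increase in the enzyme concentration required relative to the mono-4-O-sulfated disaccharide. The Python sketch below does this arithmetic with the approximate concentrations quoted above, read as micromolar values. The labels 4S, 4S6S, and 2S4S are shorthand chosen here, and assigning the ~5 µM value to the 2S4S disaccharide is an interpretation of the text.

```python
# Approximate enzyme concentrations (µM) needed for complete 4-O-desulfation,
# as quoted in the text; the substrate labels are illustrative shorthand.
required_uM = {
    "4S": 0.05,   # mono-4-O-sulfated disaccharide
    "4S6S": 0.5,  # additional 6-O-sulfate on GalNAc
    "2S4S": 5.0,  # additional 2-O-sulfate on the hexuronate (assumed label)
}

baseline = required_uM["4S"]

# Fold increase relative to the mono-4-O-sulfated substrate.
fold_increase = {name: round(conc / baseline, 1)
                 for name, conc in required_uM.items()}

for name, fold in fold_increase.items():
    print(f"{name}: {fold:g}-fold")
```

On these numbers, 6-O-sulfation costs about one order of magnitude in enzyme and 2-O-sulfation about two.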

DISCUSSION
Despite their broad distribution in living organisms, only scarce information is available on sulfatases. Notably, because no sulfatase has been crystallized with its physiological substrate, almost no information about their specificity and selectivity is available. Furthermore, even if GAG-active sulfatases have been reported in some bacterial species, the identity of many of them remains unknown, limiting biochemical characterization and sequence/structure comparisons. Finally, because of the lack of knowledge on these enzymes, their role in bacteria is usually confined to sulfate scavenging, although we have recently demonstrated their critical involvement in bacteria/host relationships in the context of the human microbiota (8).

GlcNS6S. B, analogous experiment with lyase-produced CS disaccharides incubated with BT_1596 and analyzed. C, HS and heparin were incubated without or with BT_1596 for 8 h under conditions described under "Experimental Procedures" and thereafter exhaustively digested by a mixture of heparin lyases (I-III) before analysis by RPIP-HPLC and quantification as described in Fig. 1C. The average sulfation degree was calculated and plotted on the y axis. CS-derived disaccharides are abbreviated as follows: 0S, ΔHexA-GalNAc; 4S, ΔHexA-GalNAc4S; 6S, ΔHexA-GalNAc6S; 2S, ΔHexA2S-GalNAc; 4S6S, ΔHexA-GalNAc4,6S; 2S4S, ΔHexA2S-GalNAc4S; 2S4S6S, ΔHexA2S-GalNAc4,6S. D, one-dimensional 1H NMR spectra of a size-defined, lyase-produced heparin-derived hexasaccharide before (lower trace) and after (upper trace) incubation with BT_1596. The large upfield shift of H2 and smaller upfield shifts of the other signals from the nonreducing-end unsaturated uronate residue (Δ) are indicative of desulfation at the C2 position. E, superimposition of 1H-13C HSQC spectra of AVE5026 alone (red, CH and CH3 groups, and blue, CH2 groups) and incubated with BT_1596 (black, CH and CH3 groups, and green, CH2 groups) shows the large shifts experienced by the carbon and proton signals of the nonreducing-end residue upon desulfation at the C2 position, whereas signals for internal IdoUA2S remain unaltered. The upper spectrum displays the anomeric region and the lower spectrum the C2/H2 to C6/H6 region. F, overall structure of the BT_1596 sulfatase (PDB code 3B5Q) with the disaccharide ΔHexA2S-GalNAc4,6S (PDB code 1BFB) manually docked into the active site. G, zoom of the BT_1596 sulfatase showing conserved critical residues. Residues involved in cation coordination are shown in salmon, and residues involved in substrate coordination are shown in pink. Serine 64 is likely a Cα-formylglycine in the active enzyme. The green sphere represents a modeled cation.
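The legend above refers to an average sulfation degree derived from the disaccharide analysis. As a minimal sketch, assuming that the sulfation degree is defined as the mean number of sulfate groups per disaccharide, the Python snippet below computes it overall and per position (C2, C4, C6) from a CS disaccharide composition, using the abbreviations given in the legend. The example composition is hypothetical and is not taken from the paper's data.

```python
# Sulfated positions carried by each CS-derived disaccharide (legend abbreviations).
SULFATES = {
    "0S": (),
    "4S": ("C4",),
    "6S": ("C6",),
    "2S": ("C2",),
    "4S6S": ("C4", "C6"),
    "2S4S": ("C2", "C4"),
    "2S4S6S": ("C2", "C4", "C6"),
}


def sulfation_degree(composition):
    """Return (mean sulfates per disaccharide, per-position degree).

    `composition` maps a disaccharide abbreviation to its amount
    (any unit, for example pmol or percent); amounts are normalized.
    """
    total = sum(composition.values())
    per_position = {"C2": 0.0, "C4": 0.0, "C6": 0.0}
    for name, amount in composition.items():
        for position in SULFATES[name]:
            per_position[position] += amount / total
    return sum(per_position.values()), per_position


# Hypothetical composition in percent, for illustration only.
example = {"0S": 10, "4S": 60, "6S": 20, "4S6S": 10}
average, by_position = sulfation_degree(example)
print(f"average: {average:.2f} sulfates per disaccharide")
print({pos: round(value, 2) for pos, value in by_position.items()})
```

Comparing the per-position values before and after enzyme treatment would show which sulfate position a given sulfatase removes.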

TABLE 2 Composition of polysaccharide chains used as sulfatase substrates
Disaccharides obtained by exhaustive enzymatic cleavage of chains were quantified, and the relative proportions of different disaccharide units were calculated for each type of chain.

We thus undertook a comprehensive analysis of potential GAG-specific sulfatases from the major human gut commensal B. thetaiotaomicron. In our study, we identified four sulfatases (BT_1596, BT_3333, BT_3349, and BT_4656) specific for GAG degradation. Three of these sulfatases are exolytic enzymes, in line with what is known about bacterial sulfatases, whereas one is an authentic endolytic O-sulfatase.
The BT_4656 sulfatase proved to be an N-acetylglucosamine 6-O-sulfatase sharing 57.5% identity with the heparin/HS 6-Osulfatase from P. heparinus (14). Both enzymes exhibit similar properties, notably their specificity toward glucosamine over galactosamine residues. Using a high homology threshold (E-value 0, identity Ͼ50%), we identified more than 200 related potential sulfatases genes in bacterial genomes. This group contains two other sulfatase genes from B. thetaiotaomicron, BT_3177 and BT_1628 (52.5 and 51.7% identity, respectively). We have shown here that the BT_4656 sulfatase has a different specificity compared with the BT_1628 enzyme, suggesting that these 200 genes are unlikely to be GAG-specific glucosamine 6-O-sulfatases, although the BT_1628 exhibited weak 6-O-sulfatase activity. Phylogenetic analysis revealed that the glucosamine 6-O-sulfatase from B. thetaiotaomicron and P. heparinus defined a gene cluster containing 61 sulfatase genes. Interestingly, this cluster is composed mostly of genes originating from 38 major gut Bacteroides species, including Bacteroides cellulosilyticus, Bacteroides eggerthii, Bacteroides finegoldi, Bacteroides intestinalis, Bacteroides ovatus, Bacteroides pyogenes, Bacteroides stercoris, B. thetaiotaomicron, Bacteroides uniformis, Bacteroides xylanisolvens and several uncharacterized Bacteroides species.
BT_3333 is the other 6-O-sulfatase we have identified. It represents the first bacterial N-acetylgalactosamine-6-O-sulfatase gene identified so far, although this type of activity has been reported in P. vulgaris (16). This enzyme does not show significant sequence homology with the human N-acetylgalactosamine-6-sulfatase (UniProt P34059, identity <24%) and is thus currently annotated only as an arylsulfatase in bacterial genomes. We were able to identify 64 homologs (E-value 0, identity >50%) with a distribution among gut bacteria that essentially mirrors that of BT_4656.
Table 2 for composition) were treated by BT_3349 before analysis as described in A. Degree of sulfation is plotted for each of the three different positions, C4 and C6 in GalNAc and C2 in ΔHexA. C, ¹H-¹H COSY of CS before (red) and after (black) addition of BT_3349. D, superimposition of the ¹H-¹³C HSQC of CS before (red) and after (black) addition of BT_3349. The arrows show the change in chemical shifts of H4/C4, H3/C3, and H5/C5 of the GalNAc residues upon desulfation at the C4 position.
Whereas both of these 6-O-sulfatases are highly specific for either CS or HS units and cannot cross-react, we identified the BT_1596 sulfatase as a Δ4,5-hexuronate-2-O-sulfatase able to efficiently hydrolyze both types of substrates, with a slight preference for HS-derived structures. BT_1596 exhibits only 29% sequence identity with the analogous enzyme identified in P. heparinus (12). The latter enzyme, in contrast to BT_1596, has been described as a 2-O-sulfatase that kinetically prefers HS-derived unsaturated disaccharides yet is capable of cleaving 2-O-sulfated unsaturated CS-derived units under saturating conditions. Whereas both enzymes hydrolyze sulfate groups on 4,5-unsaturated units derived from HS and CS disaccharides, we demonstrate here that the BT_1596 sulfatase is efficient on longer oligosaccharides and strictly active on the unsaturated units. Indeed, even after an extended period of time, BT_1596 cannot hydrolyze saturated units.
The solved structure of this enzyme shows the typical α/β topology of sulfatases and a narrow active site consistent with its strict exolytic activity. Our structural analysis allowed us to identify characteristic residues likely involved in catalysis and cation coordination, although no metal ion was found in the electron density map. A model of P. heparinus sulfatase was previously built, and several residues were predicted to be essential for substrate interaction (41). None of them were conserved in the structure of the BT_1596 sulfatase. Structural analysis revealed the presence of two charged amino acids, likely involved in the interaction with the sulfate group. Interestingly, both residues (i.e. Lys-296 and Lys-117) are conserved in the P. heparinus sulfatase. Based on our model, we also identified Glu-378 and Asn-95 as likely candidates for the selectivity of BT_1596 toward Δ4,5HexA2S over IdoUA2S units.
Interestingly, the BT_1596 sulfatase has a narrower distribution, with only 32 homologs (E-value 0, identity >85%) found exclusively in a few Bacteroides species including B. xylanisolvens, B. pyogenes, B. ovatus, B. finegoldii, and B. uniformis. Because of its lack of homology with previously known sulfatases, all these genes are annotated as putative sulfatases (similar to the yidJ gene from E. coli (18)) or as hypothetical proteins. The P. heparinus 2-O-sulfatase (31) has no homolog at such a threshold, indicating the existence of at least two distinct groups of Δ4,5-hexuronate-2-O-sulfatases among bacteria.
Finally, our study allowed us to identify a fourth GAG-specific sulfatase (BT_3349) in B. thetaiotaomicron, which we have demonstrated to be a CS/DS-specific 4-O-sulfatase. We have established that this enzyme efficiently removes sulfate groups from a broad range of substrates, from saturated disaccharides to high molecular weight polymers. As expected, this enzyme shares no significant homology with the human N-acetylgalactosamine-4-O-sulfatase, ArsB (UniProt ID: P15848), which is a strict exo-enzyme active only on the sulfate group present on the GAG nonreducing end (42).
The BT_3349 enzyme exhibits similar properties to an enzyme from P. vulgaris (16). Although the P. vulgaris sulfatases were never further characterized, two sulfatase activities (a CS 4-O- and a CS 6-O-sulfatase) were found in this bacterium, as in B. thetaiotaomicron. A few major differences are apparent between the activities identified in P. vulgaris and the enzymes we characterized from B. thetaiotaomicron. First, we have unequivocally demonstrated that the CS 6-O-sulfatase (N-acetylgalactosamine-6-O-sulfatase) from B. thetaiotaomicron (i.e. BT_3333) is an exo-enzyme active only on the nonreducing end of CS oligosaccharides, whereas the P. vulgaris CS 6-O-sulfatase was reported to be specific for the sulfate groups present at the reducing end of hexasaccharides (16). Second, the P. vulgaris CS 4-O-sulfatase was shown to be active at the reducing end of oligosaccharides up to hexasaccharides and, under some conditions, able to act on monosulfated internal units. Here, we demonstrate that BT_3349 efficiently hydrolyzes sulfate groups from substrates of a broad range of sizes, from disaccharides to high molecular weight CS and DS polymers. We have shown that this enzyme is active on mono- to tri-O-substituted units with a strict specificity for the 4-O-sulfate groups of galactosamine. Furthermore, a noteworthy feature of this enzyme is its ability to extensively and efficiently remove sulfate groups on the intact polymer; as such, it is the first reported bacterial GAG endosulfatase active at the polymer level.
To date, only one other GAG endosulfatase, the HS 6-O-endosulfatase, present in eukaryotes from Drosophila to humans, is able to hydrolyze 6-O-sulfate groups at the chain level without requiring prior polymer degradation. This latter enzyme has attracted considerable interest during the last 10 years because of its major roles in metazoan development and homeostasis (11). Although no structural data are available, this enzyme has been reported to be able to act as an endosulfatase because of the unique presence of an additional positively charged domain. This domain has been speculated to be critical for the interaction with the highly negatively charged GAG chains. The BT_3349 sulfatase does not possess such an additional domain and has a predicted topology similar to other bacterial sulfatases. This enzyme thus represents a novel group of GAG-specific endosulfatases and is the first identified CS-endosulfatase and the first bacterial GAG-specific endosulfatase ever reported.
Interestingly, whereas genes encoding homologs of BT_3349 are also present in several Bacteroides species (e.g. B. xylanisolvens, B. pyogenes, B. ovatus, B. finegoldii, and B. uniformis), no homologs were found in the P. heparinus genome. Conversely, the sequenced Proteus species are apparently devoid of HS-related sulfatases. The gut Bacteroides are thus uniquely equipped to extensively modify and degrade both types of host GAGs.
Based on our biochemical studies, we can tentatively propose a metabolic pathway for GAG degradation in B. thetaiotaomicron (Fig. 5). The CS and HS degradation pathways might differ in the first step because CS can be extensively desulfated at the 4-O-position in the extracellular environment by the BT_3349 sulfatase. CS and HS are then cleaved by lyases, producing unsaturated di- to oligosaccharides. Because most of these enzymes belong to polysaccharide utilization loci, they are likely to exert their activity in the bacterial environment, and the resulting oligosaccharides are likely to be imported into the bacterium as described for other types of glycans (43).
The 2-O-sulfate groups present on the unsaturated hexuronate units of CS and HS oligosaccharides are cleaved off by the BT_1596 sulfatase. These products are then further hydrolyzed by glycosidases, releasing oligosaccharides containing 6-O-sulfated galactosamine or glucosamine at their nonreducing end. The BT_4656 and BT_3333 sulfatases are involved in the final steps, reducing the sulfation of oligo- or monosaccharides. Despite our attempts, we failed to identify in B. thetaiotaomicron an N- and a 3-O-sulfatase, whose activities are required for completion of GAG desulfation, even though we expressed and assayed BT_3095 and BT_3101, the closest homologs of the P. heparinus N-sulfoglucosamine sulfohydrolase. Although we cannot exclude that these activities would require different assay conditions, it is likely that genes coding for these enzymes are present among the 14 other putative sulfatase genes of this organism. Interestingly, although many bacteria and Bacteroides species have either the CS or the HS degradation pathway, only a few Bacteroides species possess both complete degradation pathways. These are dominant species of the microbiota such as B. thetaiotaomicron, B. ovatus, or B. uniformis.
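The specificities that underlie this proposed pathway can be summarized in a toy model. The sketch below is only an illustration under strong simplifying assumptions: a chain is a list of units with the nonreducing end first, each sulfatase is reduced to a residue type, a sulfate position, and an endo/exo mode, and kinetics, N- and 3-O-sulfation, and cellular localization are ignored. The names `Unit`, `Sulfatase`, and `desulfate`, and the example tetrasaccharide, are hypothetical and not taken from the experiments.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Unit:
    residue: str  # "GalN", "GlcN", "HexA", or "dHexA" (4,5-unsaturated hexuronate)
    sulfates: Set[int] = field(default_factory=set)  # sulfated O-positions


@dataclass(frozen=True)
class Sulfatase:
    name: str
    residue: str   # residue type recognized
    position: int  # O-position of the sulfate removed
    endo: bool     # True: any unit in the chain; False: nonreducing-end unit only


# Specificities as summarized in the text (simplified).
SULFATASES: Dict[str, Sulfatase] = {
    s.name: s
    for s in (
        Sulfatase("BT_3349", "GalN", 4, endo=True),
        Sulfatase("BT_1596", "dHexA", 2, endo=False),
        Sulfatase("BT_3333", "GalN", 6, endo=False),
        Sulfatase("BT_4656", "GlcN", 6, endo=False),
    )
}


def desulfate(enzyme: Sulfatase, chain: List[Unit]) -> int:
    """Remove matching sulfates in place and return how many were removed.

    chain[0] is the nonreducing end; exo-enzymes only see that unit.
    """
    targets = chain if enzyme.endo else chain[:1]
    removed = 0
    for unit in targets:
        if unit.residue == enzyme.residue and enzyme.position in unit.sulfates:
            unit.sulfates.discard(enzyme.position)
            removed += 1
    return removed


if __name__ == "__main__":
    # Hypothetical CS-derived tetrasaccharide produced by a lyase.
    chain = [Unit("dHexA", {2}), Unit("GalN", {4, 6}),
             Unit("HexA"), Unit("GalN", {4})]
    for name in ("BT_3349", "BT_1596", "BT_3333"):
        print(name, desulfate(SULFATASES[name], chain))
    # Dropping the first unit mimics glycosidase removal of the hexuronate,
    # which exposes GalN(6S) at the nonreducing end for BT_3333.
    chain = chain[1:]
    print("BT_3333 after trimming", desulfate(SULFATASES["BT_3333"], chain))
```

In this representation the 6-O-sulfatases only act once the preceding hexuronate has been removed, which mirrors the order of steps proposed above. The model does not enforce that BT_3349 acts before the lyases.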
Although the enzymes we have discovered likely serve to metabolize host or food-derived GAGs, the unexpected discovery of a bacterial GAG endosulfatase raises the possibility that, as we previously hypothesized (8), Bacteroides use sulfatases to shape their glycosidic landscape. Indeed, these enzymes provide B. thetaiotaomicron (and other major human commensal Bacteroides) with the unique ability not only to metabolize host glycans but possibly also to modify host macromolecules, with a potential impact on host physiology and pathology. This hypothesis will nevertheless require further investigation. Finally, these enzymes represent unique tools for the selective modification and characterization of GAGs; notably, the novel endosulfatase identified here could be employed to produce novel CS-derived polymers.
FIGURE 5. Proposed functions of B. thetaiotaomicron sulfatases in the bacterial degradation pathways of glycosaminoglycans. Chondroitin and dermatan sulfate can be first desulfated by the unique endo-4-O-sulfatase (BT_3349) before lyases act, whereas heparin/HS are degraded directly into oligosaccharides and disaccharides. Independently of their origin, GAG oligosaccharides and disaccharides are further processed by the Δ4,5-hexuronate-2-O-sulfatase (BT_1596) and hydrolyzed into shorter oligosaccharides or monosaccharides. Oligosaccharides with a nonreducing end hexosamine and monosaccharides then become substrates for the two specific 6-O-sulfatases (BT_3333 and BT_4656). Further action of an N-sulfamidase and 3-O-sulfatase is required to obtain totally desulfated units from HS and heparin.
AUGUST 29, 2014 • VOLUME 289 • NUMBER 35