Three structurally and functionally distinct β-glucuronidases from the human gut microbe Bacteroides uniformis

The glycoside hydrolases encoded by the human gut microbiome play an integral role in processing a variety of exogenous and endogenous glycoconjugates. Here we present three structurally and functionally distinct β-glucuronidase (GUS) glycoside hydrolases from a single human gut commensal microbe, Bacteroides uniformis. We show using nine crystal structures, biochemical, and biophysical data that whereas these three proteins share similar overall folds, they exhibit different structural features that create three structurally and functionally unique enzyme active sites. Notably, quaternary structure plays an important role in creating distinct active site features that are hard to predict via structural modeling methods. The enzymes display differential processing capabilities toward glucuronic acid–containing polysaccharides and SN-38-glucuronide, a metabolite of the cancer drug irinotecan. We also demonstrate that GUS-specific and nonselective inhibitors exhibit varying potencies toward each enzyme. Together, these data highlight the diversity of GUS enzymes within a single Bacteroides gut commensal and advance our understanding of how structural details impact the specific roles microbial enzymes play in processing drug-glucuronide and glycan substrates.

The carbohydrates and glycoconjugates that reach the human gastrointestinal (GI) tract are remarkably complex and sample a wide range of structural diversity. Despite the numerous and diverse carbohydrates humans consume, most of the enzymes required to process these molecules are not encoded by the human genome (1). Fortunately, a mutually beneficial relationship exists between the microbial inhabitants of the GI tract and the human host, in which the human gut microbiota (HGM) expand the host's metabolic capabilities via carbohydrate-active enzymes (CAZymes) (2). These CAZymes include glycoside hydrolases (GHs) and polysaccharide lyases (PLs) that mediate the fermentation of nondigestible carbohydrates and glycosides (3,4). The major products of these processes, short chain fatty acids, account for up to 10% of the dietary energy in humans (5) and have been associated with a myriad of health benefits (6-8). In return, the HGM gain a stable energy source, which is crucial for microbial survival and maintaining balance within the HGM.
The Gram-negative phylum Bacteroidetes, one of two dominant bacterial phyla in the human gut microbiome, is a key metabolizer of diverse glycans in the GI tract. Members of the Bacteroidetes, the majority belonging to the genus Bacteroides, degrade both dietary and host-derived carbohydrates, and many species ferment multiple different polysaccharides (1,9). Bacteroides thetaiotaomicron, for example, forages both host mucus glycans and plant polysaccharides, depending on their availability (10). Accordingly, Bacteroides encode genes for large numbers of CAZymes, particularly GHs (3), that are organized in polysaccharide utilization loci (PULs), a distinctive feature of their genomes (11).
Given the thousands of CAZymes that occur in Bacteroides, the functional and structural diversity of GHs within individual Bacteroides species remains largely unexplored. The CAZyme classification system, which groups GHs into families based on their amino acid sequences, is reliable for the prediction of catalytic mechanisms and overall structural folds, but substrate specificity and unique structural features are difficult to predict. For example, the GH2 family comprises β-glucuronidases, β-glucosidases, β-galactosidases, and β-mannosidases, all of which possess an (α/β)8 TIM barrel fold (3). Thus, it is important to experimentally characterize GHs to understand their structure and to assign function.
Recently, we presented a structure-guided approach to differentiate β-glucuronidase (GUS) proteins from their GH2 family members (12). As reported, 279 unique GUS enzymes were identified from the 4.8 million unique genes present in the stool sample database of the Human Microbiome Project (HMP) (12). This provided the first atlas of GUS enzymes in the human gut microbiome. Within that effort, we identified and characterized a GUS from the human gut bacterium Bacteroides uniformis, which has been reported to be highly abundant in the human GI tract (13). We demonstrated that it acts as a β-glucuronidase and is able to process both a small-molecule glucuronide and a polysaccharide with a terminal glucuronic acid moiety (12). As outlined below, in an attempt to gain further insight into its role in polysaccharide degradation, we searched the genomic region surrounding this GUS (Fig. 1A). We found two additional GH2 enzymes in the same PUL that retain sequence features previously identified as unique to GUS enzymes (Fig. 1A). To our knowledge, no previously characterized PUL contains three potential GUS enzymes. For this reason, we were interested in their differential structural properties and their abilities to cleave diverse glucuronic acid (GlcA)-containing substrates.
Here we demonstrate that three putative GUS proteins from a single B. uniformis microbe share the TIM barrel structural fold but exhibit distinct tertiary and quaternary structures, not obvious from sequence analysis, and harbor unique structural features within their active sites that likely afford them specific substrate processing capabilities. Indeed, these GUS enzymes displayed differential activities toward a variety of glucuronide substrates, including GlcA-containing polysaccharides and SN-38-G, a metabolite of the cancer drug irinotecan. Additionally, we tested the ability of both selective bacterial GUS inhibitors and a pan-GUS inhibitor to inhibit the three GUSs from this microbe, which revealed distinct propensities for inhibition. We further examined the potential for these glucuronidases to act on other sugar acid-containing substrates, such as those that contain galacturonic acid, iduronic acid, or mannuronic acid. These results highlight the broad structural and functional diversity among GUS enzymes within a single human gut microbe. Furthermore, the data presented here provide a foundation for understanding the specialized roles of GUS enzymes in the deconstruction of a sugar acid-containing carbohydrate and the ability of the HGM to reactivate drug-glucuronide conjugates.

Discovery and sequence analysis of GUS enzymes from a B. uniformis PUL
A GUS from the human gut bacterium B. uniformis strain 3978 T3i (BuGUS) was previously discovered in the HMP database (12). Further inspection of the genomic region flanking this GUS gene revealed a hallmark of PULs, a nearby susC/susD-like gene pair. These two genes are involved in the binding of polysaccharides on the outer membrane (SusD) and transport into the periplasm (SusC) (Fig. 1B) (11). The presence of the susC/susD homologs indicates that BuGUS is located in a PUL, which means BuGUS likely contributes to the orchestrated degradation of a GlcA-containing polysaccharide. Two additional enzymes predicted to belong to the GH2 family were also identified adjacent to BuGUS and the susC/susD-like pair (Fig. 1B). Each of these proteins possesses key sequence features that are characteristic of GUS enzymes, including the asparagine-X-lysine (NXK) motif and catalytic glutamates that recognize and cleave glucuronides, respectively (12, 14) (Fig. 1A). Only BuGUS-1 and BuGUS-2, however, possess the GUS-specific tyrosine residue (Tyr-480 and Tyr-495, respectively) that hydrogen bonds with the nucleophilic glutamate. In BuGUS-3, a tryptophan (Trp-483) replaces the tyrosine (Fig. 1A and Fig. S1).
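The NXK motif described above is a simple tripeptide pattern (asparagine, any residue, lysine), so a first-pass scan for it can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sequence below is a made-up toy string, not a BuGUS sequence, and the function name is hypothetical. A match alone does not identify a GUS, because the text relies on several features together (the motif at the right structural position, the catalytic glutamates, and the tyrosine or tryptophan).

```python
import re

# Hypothetical toy sequence for illustration; NOT a real BuGUS sequence.
TOY_SEQUENCE = "MSTNAKLLWGENGKPYE"


def find_nxk_motifs(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position, tripeptide) for every N-X-K match.

    A lookahead is used so that overlapping matches are all reported.
    """
    pattern = re.compile(r"(?=(N[A-Z]K))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence.upper())]


hits = find_nxk_motifs(TOY_SEQUENCE)
for position, tripeptide in hits:
    print(f"{tripeptide} at residue {position}")
```

For the toy sequence, the scan reports NAK at residue 4 and NGK at residue 12; in a real analysis each hit would still need to be checked against an alignment or structure.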
Sequence alignments with the previously characterized BuGUS (now termed BuGUS-2) and these two new GUS enzymes, termed BuGUS-1 and BuGUS-3, revealed sequence identity of 27 and 29%, respectively, whereas BuGUS-1 and BuGUS-3 share 18% sequence identity (Fig. S1). Sequence analysis also revealed that BuGUS-1, BuGUS-2, and BuGUS-3 fall into the previously defined No Loop (NL), Loop 2 (L2), and Mini Loop 2 (mL2) classes, respectively; these classifications are related to the size and location of loops at the active site of gut microbial GUS enzymes, and have been shown to play key roles in substrate specificity (12) (Fig. 1C). Utilizing the signal peptide prediction tool, SignalP 4.1 Server (15), we found that BuGUS-1, BuGUS-2, and BuGUS-3 have a signal peptide and are thus expected to be periplasmic. Together, this sequence analysis indicates that a PUL from B. uniformis contains three putative GUS enzymes with distinct sequence features.
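As a rough illustration of how pairwise identities such as the 27, 29, and 18% values above are derived from an alignment, the sketch below counts matching residues over aligned columns. It rests on assumptions: the aligned strings are toy examples, the function name is hypothetical, and the denominator used here (columns where both sequences have a residue) is only one of several conventions. The text does not state which convention or alignment tool produced the reported values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where both sequences have a residue.

    Gaps are written as '-'. Columns with a gap in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


# Toy alignment (hypothetical): 7 comparable columns, 6 identical.
identity = percent_identity("MKT-LLNAK", "MRTALL-AK")
print(f"{identity:.1f}% identity")
```

Changing the denominator (for example, to the full alignment length or the shorter sequence length) shifts the percentage, which is why identities from different tools are not always directly comparable.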

BuGUS-1 and BuGUS-2 exhibit β-glucuronidase activity with 4-MUG
To begin to elucidate the substrate specificities of these three putative GUS enzymes, we performed in vitro activity assays with their purified protein products. BuGUS-2 has been previously shown to exhibit GUS activity (12). To confirm that BuGUS-1 and BuGUS-3 are also GUS enzymes, we synthesized, cloned, expressed, and purified their protein products. We then utilized the standard substrates 4-methylumbelliferyl-β-D-glucuronide (4-MUG) and p-nitrophenyl-β-D-glucuronide to assess the pH profile and kinetic parameters of GUS activity, respectively. BuGUS-1 (kcat/Km = 3.4 × 10⁵ s⁻¹ M⁻¹) and BuGUS-2 (kcat/Km = 3.8 × 10⁵ s⁻¹ M⁻¹) both efficiently processed these standard substrates, indicating that they are GUS enzymes (Table 1 and Fig. S2). However, BuGUS-3 was unable to catalyze the hydrolysis of these substrates (Table 1). Thus, whereas BuGUS-1 and BuGUS-2 can hydrolyze glucuronides, BuGUS-3 may have a distinct but related activity despite its GUS-like sequence features.
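To make the meaning of the reported catalytic efficiencies concrete, the following sketch shows the standard Michaelis-Menten rate law and its low-substrate limit, in which the initial rate is approximately (kcat/Km)[E][S]. Only the two kcat/Km values come from the text. The enzyme and substrate concentrations and the individual kcat and Km values are hypothetical, chosen purely so that their ratio reproduces the reported efficiency for BuGUS-1; they are not measured constants.

```python
def michaelis_menten_rate(kcat: float, km: float, enzyme: float, substrate: float) -> float:
    """Initial rate v = kcat * [E] * [S] / (Km + [S]); concentrations in M, rate in M/s."""
    return kcat * enzyme * substrate / (km + substrate)


def low_substrate_rate(efficiency: float, enzyme: float, substrate: float) -> float:
    """Approximation v ~ (kcat/Km) * [E] * [S], valid only when [S] << Km."""
    return efficiency * enzyme * substrate


# Catalytic efficiencies (s^-1 M^-1) as reported in the text.
EFFICIENCY = {"BuGUS-1": 3.4e5, "BuGUS-2": 3.8e5}

# Hypothetical assay conditions and kinetic constants, for illustration only.
ENZYME_M = 1e-8      # 10 nM enzyme (assumed)
SUBSTRATE_M = 1e-6   # 1 uM substrate (assumed)
KCAT = 34.0          # s^-1 (assumed; chosen so that KCAT / KM = 3.4e5)
KM = 1e-4            # M (assumed)

approx = low_substrate_rate(EFFICIENCY["BuGUS-1"], ENZYME_M, SUBSTRATE_M)
full = michaelis_menten_rate(KCAT, KM, ENZYME_M, SUBSTRATE_M)
relative_gap = abs(full - approx) / approx

print(f"low-substrate estimate: {approx:.3e} M/s")
print(f"full rate law:          {full:.3e} M/s")
print(f"relative difference:    {relative_gap:.2%}")
```

Because the two enzymes have nearly equal kcat/Km values, this limit predicts nearly equal rates at low substrate concentration, which is the sense in which both are described as processing the standard substrates efficiently.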
Although BuGUS-1 maintains a similar tertiary structure to other GUS enzymes, it possesses unique active site residues, particularly Tyr-382 and Trp-383 (Fig. 2A, highlighted in yellow). These positions are generally occupied by smaller residues in previously characterized GUS enzymes, such as BuGUS-2 (Fig. 2B). In addition to these unique active site residues, BuGUS-1 is only the second tetrameric bacterial GUS characterized that does not contain a Loop 1 by sequence analysis. Instead, remarkably, and unpredictably by sequence analysis alone, an N-terminal loop (NTL) (Fig. S1) is donated from an adjacent protomer and resembles the loop-based active sites of previously characterized Loop 1 GUS structures, as outlined below (Fig. 8A).
Compared with previously characterized GUS enzymes, BuGUS-3 deviates the most in its active site composition, containing five unique residues (yellow) (Fig. 2C). Most notably, three arginines in BuGUS-3 replace the small, polar residues, such as asparagine, which are conserved in previously characterized GUS enzymes, and Arg-391 and Arg-466 are positioned to form ionic interactions with the catalytic glutamates (Fig. 2C). Furthermore, the BuGUS-3 active site contains Trp-483, which replaces the conserved tyrosine in all other GUS enzymes characterized. Finally, Trp-431 and Arg-391, localized across the active site from the NXK motif (green), have not been observed in any other GUS enzymes (Fig. 2C). These distinct active site features may explain why BuGUS-3 does not process the standard glucuronide substrates despite the presence of the NXK motif and catalytic glutamates necessary for the recognition and cleavage of glucuronides.
BuGUS-1 displays a unique quaternary structure despite having similar tertiary structure to EcGUS (Fig. 3A and Fig. S3). BuGUS-1 forms a unique inverted tetramer compared with the previously determined structures of E. coli GUS (EcGUS), Streptococcus agalactiae GUS, Clostridium perfringens GUS, and the human GUS ortholog (Fig. 3, A and B) (14,16,17). In BuGUS-1, individual protomers interact via their N termini, whereas in the previously examined GUS enzymes outlined above the interface of protomers is formed by their C termini (Fig. 3, A and B). The consequence of this oligomeric organization is a solvent-exposed active site.
Unlike the tetrameric BuGUS-1, BuGUS-2 and BuGUS-3 both form dimers (Fig. 4, A and B, and Fig. S4) and contain extra domains at their C termini (Fig. 2, B and C, and Fig. S5). Excluding these additional domains, the core tertiary structures of BuGUS-2 and BuGUS-3 are TIM barrel folds with two β-sandwich-like domains, similar to previously characterized GUS enzymes (Fig. 2, B and C).
Sequence and structural analysis of the C-terminal domains of BuGUS-2 revealed that the most C-terminal domain (yellow) is a member of the CBM 57 family, based on malectin that binds to developing glycans in the endoplasmic reticulum (Fig. 4C) (18). The remaining domains in BuGUS-2 and BuGUS-3 are "domains of unknown function" (DUF) and are not formally defined as, and simply may not be, CBMs (Fig. 2, B and C, and Fig. S5). Both sequence (NCBI BLAST) and structure-based (PDBeFold) searches of these additional C-terminal DUFs revealed hits for the C-domains of antibodies. These domains may only serve a role in the oligomeric organization of these proteins. The DUFs from the BuGUS enzymes are similar in structure and have been observed once previously, in the structure of Bacteroides fragilis GUS (BfGUS), where the domain was also designated as a DUF (Fig. S5). Collectively, the unique C-terminal domains of BuGUS-2 and BuGUS-3 may play roles in carbohydrate binding and quaternary structure.
We also find that BuGUS-2 contains a well-organized predicted calcium-binding site (Fig. 5, A and B) that is unique to this GUS both in B. uniformis and in GUS enzymes of known structure to date. Approximately 24 Å from the active site of BuGUS-2 are three aspartic acids and three ordered water molecules that coordinate a predicted calcium ion (Fig. 5, A and B). Site-directed mutagenesis of Asp-341 and Asp-367 to alanines led to a complete loss of GUS activity, and the crystal structure of this mutant (space group: P2₁2₁2₁, molecules in asymmetric unit: 2, Table S1) revealed significant structural changes at the enzyme active site (Fig. 5, C and D). CD analysis also revealed a small loss in structural order for the predicted calcium-binding mutant compared with the wildtype (WT), but an equivalent melting temperature indicated no significant change in overall protein stability (Fig. S6). Sequence analysis of the 279 previously discovered GUS enzymes (12) revealed 17 additional GUS proteins with a predicted calcium-binding site (Table S2 and Fig. S7). Thus, it appears that the predicted calcium-binding site plays a key role in the structure and function of BuGUS-2 and is conserved among other GUS proteins in the human gut microbiome.

Figure legend. A, tertiary structure of BuGUS-1 with the sugar acid-recognizing NXK motif highlighted as green spheres and catalytic glutamates as deep salmon spheres and zoom-in of active site with unique active site residues highlighted in yellow. B, tertiary structure of BuGUS-2 with core fold highlighted in magenta and additional C-terminal domains in green (DUF1) and yellow (CBM57) and zoom-in of the active site. C, tertiary structure of BuGUS-3 with core fold in blue, the sugar acid-recognizing NXK motif highlighted as green spheres, and catalytic glutamates as deep salmon spheres and additional C-terminal domains in green (DUF1) and yellow (DUF2) and zoom-in of the active site with unique active site residues highlighted in yellow.

BuGUS enzymes differentially process GlcA-containing polysaccharides
Given the distinct active site architectures of the three GUS enzymes examined here, as well as their differential processing of standard glucuronide substrates, we examined a set of pure synthetic polysaccharide substrates (Fig. 6, A and B). We chose heparin-like nonamers (9-mers) that contain GlcA and are either acetylated or sulfated. We also examined shorter polysaccharides (5-mers) and a substrate with GlcA at the penultimate rather than the terminal (nonreducing end) position (NAc 4-mer) (Fig. 6B). Both BuGUS-1 and BuGUS-2 were able to process the acetylated heparin-like nonamer substrate (NAc 9-mer), but BuGUS-3 showed no activity (Fig. 6A). However, all three GUS enzymes, including BuGUS-3, were able to process the terminal ends of a sulfated heparin-like substrate (NS 9-mer; Fig. 6A). We next examined a 9-mer with a doubly sulfated glucosamine moiety at the penultimate position (NS6S 9-mer). We found that this change eliminated activity with all three enzymes (Fig. 6A).
We tested the effect of polysaccharide length on activity by examining shorter 5-mer substrates. Our results were similar to the 9-mer data outlined above: BuGUS-1 and BuGUS-2 processed the NAc 5-mer, whereas BuGUS-3 did not, and all three GUS enzymes processed the NS 5-mer (Fig. 6A). Interestingly, though, BuGUS-3 displayed much weaker activity with the NS 5-mer than it did with the NS 9-mer (Fig. 6A). Finally, to confirm that these proteins act as exolytic enzymes toward substrates with terminal GlcA moieties, we examined a 4-mer polysaccharide with GlcA at the penultimate position (Fig. 6B). As expected, the three enzymes examined failed to process this compound, indicating that they do not act as endolytic enzymes toward this particular substrate (Fig. 6A). Taken together, these data using six distinct polysaccharide substrates related to compounds found in humans reveal that all three BuGUS enzymes are able to process the sulfated NS 9-mer and NS 5-mer, whereas only BuGUS-1 and BuGUS-2 cleaved the acetylated heparin-like 9- and 5-mers. Moreover, the enzyme activity appears limited to removing terminal GlcA groups. Such data provide an initial molecular framework to understand the potential for microbial GUS enzymes to utilize polysaccharide substrates within the human GI tract.
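The qualitative outcomes described in this section can be collected into a small lookup table, which makes the pattern across the six substrates easier to query. The sketch below encodes only what the text states about Fig. 6A as processed or not processed. The variable and function names are hypothetical, and the boolean encoding deliberately discards quantitative differences, such as the weaker activity of BuGUS-3 with the NS 5-mer.

```python
# True = processed, False = not processed, as described in the text for Fig. 6A.
ACTIVITY = {
    "NAc 9-mer":  {"BuGUS-1": True,  "BuGUS-2": True,  "BuGUS-3": False},
    "NS 9-mer":   {"BuGUS-1": True,  "BuGUS-2": True,  "BuGUS-3": True},
    "NS6S 9-mer": {"BuGUS-1": False, "BuGUS-2": False, "BuGUS-3": False},
    "NAc 5-mer":  {"BuGUS-1": True,  "BuGUS-2": True,  "BuGUS-3": False},
    # BuGUS-3 activity here is reported as much weaker than with the NS 9-mer.
    "NS 5-mer":   {"BuGUS-1": True,  "BuGUS-2": True,  "BuGUS-3": True},
    # GlcA is at the penultimate position in this substrate.
    "NAc 4-mer":  {"BuGUS-1": False, "BuGUS-2": False, "BuGUS-3": False},
}


def substrates_processed_by(enzyme: str) -> list[str]:
    """List the substrates an enzyme processed, in the order given above."""
    return [substrate for substrate, row in ACTIVITY.items() if row[enzyme]]


for name in ("BuGUS-1", "BuGUS-2", "BuGUS-3"):
    print(name, "->", ", ".join(substrates_processed_by(name)))
```

Viewed this way, BuGUS-1 and BuGUS-2 share the same four substrates, whereas BuGUS-3 is restricted to the two NS substrates.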

BuGUS enzymes may process additional uronic acid-containing substrates
Given the diversity of uronate-containing polysaccharides, we considered the possibility that these GUS enzymes would process uronic acid conjugates beyond glucuronides. Thus, we docked into the three BuGUS enzymes the following four uronic acids: glucuronic acid (GlcA), galacturonic acid (GalA), mannuronic acid (ManA), and iduronic acid (IdoA). These monosaccharides were identified from the PDB and docked manually in PyMOL based on the glucuronate-bound structure of BuGUS-1 (PDB 6D6W). Despite the differences in stereochemistry between these sugar acids, docking suggests that each may be accommodated within all three GUS active sites (Fig. S8). Galacturonate, which has an axial hydroxyl at the 4-position that could clash with the aspartic acid side chain conserved in all three BuGUS enzymes, appeared to be the most sterically strained sugar (Fig. S8). To test the hypothesis that substrates with terminal sugar acids beyond GlcA could be utilized as substrates, we examined the ability of p-nitrophenyl-β-D-galacturonide (pNP-GalA) to act as a substrate for BuGUS-1, BuGUS-2, and BuGUS-3 (Fig. S9a). We found that only BuGUS-1 was able to process this galacturonide (Fig. S9b). Kinetic analysis of BuGUS-1 with both pNP-GlcA and pNP-GalA revealed catalytic efficiencies (kcat/Km) of 2.2 × 10⁵ and 3.1 × 10⁴, respectively, suggesting that whereas BuGUS-1 can hydrolyze galacturonides, it does so less efficiently than the analogous glucuronide (Fig. S9c). A model of galacturonic acid docked in the active site of BuGUS-1 suggests that a clash between the aspartic acid (green) and the hydroxyl at the 4-position may cause this lower efficiency (Fig. S9d). Taken together, docking studies and kinetics suggest that the GUS enzymes considered here may act on polysaccharide substrates containing terminal sugar acids beyond glucuronate, including mannuronate, iduronate, and galacturonate.
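As a quick worked check on the size of this preference, the ratio of the two catalytic efficiencies reported above for BuGUS-1 can be written out. The ratio is unitless, so it does not depend on the units of the two values, which are not restated in this passage.

```latex
\frac{(k_{\mathrm{cat}}/K_m)_{\text{pNP-GlcA}}}{(k_{\mathrm{cat}}/K_m)_{\text{pNP-GalA}}}
  = \frac{2.2 \times 10^{5}}{3.1 \times 10^{4}}
  \approx 7.1
```

On these numbers, BuGUS-1 is roughly sevenfold more efficient with the glucuronide than with the galacturonide.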

BuGUS structures in complex with substrate analogs
To gain a better understanding of substrate recognition by these novel GUS enzymes, we incubated them with the nonhydrolyzable substrate analog phenyl-thio-β-D-glucuronide (PTG) and attempted co-crystallization. Co-crystallization of a PTG-BuGUS-1 complex was successful (space group: P12₁1, molecules in asymmetric unit: 6, Table S1), and the crystal structure revealed a conformational shift in the active site in which the catalytic acid/base Glu-421 shifts away from the active site (Fig. 7, A and B). This conformational change is accommodated by additional shifts adjacent to the active site, in which Glu-453 and Lys-454 undergo 7.8- and 5.9-Å changes in position, respectively, relative to the unliganded structure (Fig. 7B). In line with previous studies, the carboxylate of PTG is recognized by Asn-573 and Lys-575 (NXK motif), as well as Tyr-484 (Fig. 7A). In addition to a PTG complex, we also determined the structure of BuGUS-1 in complex with GlcA (space group: C2, molecules in asymmetric unit: 4, Table S1). GlcA was bound to BuGUS-1 as the α-anomer (Fig. 7, C and D), and much like PTG, the carboxylate of GlcA is recognized by the NXK motif and other residues that contact its hydroxyl groups (Fig. 7C). Additionally, Trp-533 participates in C-H interactions with the nonpolar face of GlcA (Fig. 7C). The anomeric hydroxyl group forms a hydrogen bond with Glu-508, the catalytic nucleophile (Fig. 7D). Together, these structural data highlight how GUS specifically recognizes its glucuronide substrate.

Differential SN-38-G processing by BuGUS enzymes
GUS enzymes are promiscuous and can hydrolyze a variety of glucuronides related to mammalian gut toxicity (12, 14, 16, 19-25). Thus, we sought to determine whether these GUS enzymes are capable of reactivating the inactive metabolite SN-38-G of the cancer drug irinotecan. Despite their localization in a PUL, BuGUS-1 and BuGUS-2 hydrolyzed the small-molecule glucuronide SN-38-G (Fig. 8B). Strikingly, BuGUS-1 hydrolyzed SN-38-G with an efficiency that rivals previously characterized Loop 1 GUS enzymes that are not located in PULs and have been shown to prefer only small-molecule glucuronides over polysaccharides (Fig. 8B) (12). We hypothesized that the NTL identified in the structure of BuGUS-1 may play a key role in recognizing the aglycone moiety of SN-38-G (Fig. 8A). The NTL is defined as residues Tyr-54 through Ala-67, and forms a loop that sits by the active site of an adjacent protomer (Fig. 8C). Indeed, the NTL deletion variant (Δloop BuGUS-1) displayed much slower processing with both 4-MUG and SN-38-G compared with the WT BuGUS-1 (Table 1 and Fig. 8B). We solved the structure of Δloop BuGUS-1 (space group: P12₁1, molecules in asymmetric unit: 4, Table S1), which shows the absence of this key loop structure (Fig. S10).
As an additional control to test the importance of the NTL for SN-38-G processing by BuGUS-1, we cloned, expressed, and purified a Bacteroides multispecies (BMSP) GUS that is similar to BuGUS-1 but, importantly, lacks the NTL sequence necessary for efficient processing of small-molecule glucuronides (Fig. 8C). The 2.65-Å structure of BMSP GUS (space group: I4₁, molecules in asymmetric unit: 4, Table S1) reveals the same tetrameric organization as BuGUS-1 but lacks the N-terminal loop that forms the aglycone-binding site of BuGUS-1 (Fig. 8C and Fig. S11a). Importantly, BMSP displayed similar 4-MUG and SN-38-G processing efficiencies compared with the Δloop variant of BuGUS-1 (Fig. 8B and Table 1). These data suggest that an N-terminal sequence feature in the previ-

Discussion
The GHs encoded by Bacteroides species play key roles in the processing of carbohydrates and glycosides that reach the GI tract. Here we present three unique GUS enzymes from the human gut microbe B. uniformis that advance our understanding of the structural and functional diversity within this particular GH family. By analyzing the genes adjacent to a previously characterized B. uniformis GUS (12), we discovered two additional GUS enzymes from a B. uniformis PUL (Fig. 1, A and B). One of these GHs we termed BuGUS-1, as it retained the GUS-specific features previously used to identify GUS enzymes in the HMP (12). We demonstrated that it is a GUS capable of processing a variety of GlcA-containing substrates (Table 1, Fig. 6, and Fig. S2). BuGUS-3 also possessed several GUS-specific features, including the core fold, catalytic residues, and NXK motif; however, a tryptophan replaced the tyrosine that hydrogen bonds to and structurally stabilizes the nucleophilic glutamate (Fig. 2C). Although BuGUS-3 was unable to process 4-MUG, it did exhibit GUS activity toward the sulfated heparin-like 9-mer (Fig. 6A). This discovery suggests that the initial GUS rubric defined previously could allow either a tyrosine or a tryptophan at this sequence position (12). Indeed, a tryptophan is present at this position in the GUS module of BT0996, one of the enzymes responsible for the degradation of rhamnogalacturonan-II in B. thetaiotaomicron (27). This information indicates that the 279 GUS proteins previously identified represent an initial GUS atlas and should be reexamined and updated as new structural and functional data are determined regarding this enzyme family. Indeed, a preliminary analysis of the HMP identified 10 additional GUS proteins with a tryptophan residue in this position (Table S3); these novel proteins will be the subject of future studies.
As previously discussed by Pollet et al. (12), GUS enzymes with longer loops adjacent to the active site (e.g. Loop 1 GUS enzymes) were shown to process small glucuronides, and those possessing open active sites were able to process larger GlcA-containing polysaccharides. In previously determined GUS structures, the tetrameric interface between GUS protomers has been formed by their C termini, and active site-adjacent loop structures (Loop 1) from these adjacent protomers formed the aglycone-binding site (Fig. 3A), limiting the access of larger substrates (12, 14, 16). In contrast, BuGUS-2 was shown to form a dimer, leaving its active site open and solvent exposed to accommodate larger polysaccharides (12) (Figs. 2B and 4A). BuGUS-1, which exhibits an open active site via a unique N-terminal-mediated tetrameric arrangement (Fig. 3B), processed 4-MUG with a similar efficiency to BuGUS-2 (Table 1).
BuGUS-1 was also shown to process SN-38-G faster than both BuGUS-2 and BuGUS-3 (Fig. 8B). Surprisingly, the NL BuGUS-1 processed SN-38-G at an efficiency that rivaled that of L1 EcGUS, despite the lack of an active site loop at the canonical position in its amino acid sequence (Fig. 1C). Further examination of the crystal structure of BuGUS-1 revealed the presence of an N-terminal loop (NTL) donated from an adjacent protomer (Fig. 8A). This donated loop mimics the Loop 1 present in L1 GUS enzymes and appears to enable BuGUS-1 to process the small-molecule glucuronides 4-MUG (Table 1) and SN-38-G (Fig. 8B) at efficiencies similar to those of characterized Loop 1 GUS enzymes (Fig. 8B) (14, 16). Indeed, kinetic analysis of BMSP GUS and the Δloop variant of BuGUS-1 suggests that SN-38-G and 4-MUG processing by BuGUS-1 is greatly facilitated by its N-terminal loop (Table 1 and Fig. 8B). Examination of the GUS proteins present in the GI tracts of healthy humans (12) revealed that six additional enzymes beyond BuGUS-1 maintain an NTL (Table S4). Collectively, these data indicate that the determination of novel crystal structures of GUS enzymes will continue to enhance our understanding of the structural and functional variations present in this family of proteins.
To further investigate how the BuGUS-1 active site may interact with SN-38-G, we manually docked SN-38-G based on the PTG-bound structure of BuGUS-1. Our analysis shows that the planar, nonpolar aglycone of SN-38-G could interact favorably with the BuGUS-1 active site (Fig. 8A). Notably, Tyr-57 located in the donated loop participates in interactions with the aromatic scaffold of SN-38-G in the binding mode modeled (Fig. 8A). This may explain the ability of BuGUS-1 to efficiently hydrolyze this substrate. Docking of SN-38-G into the active sites of BuGUS-2 and BuGUS-3 suggests that they do not harbor the same active site features of BuGUS-1 that would allow them to recognize SN-38-G (Fig. S9). Specifically, the tyrosine in BuGUS-1 is replaced by an arginine and a tryptophan in BuGUS-2 and BuGUS-3, respectively, which do not appear to favorably interact with the aromatic scaffold of SN-38-G (Fig. S9, c and d).
In addition to small glucuronides, we demonstrated that all three GUS enzymes differentially processed GlcA-containing polysaccharides. Although bioinformatic analysis of the genes in this PUL does not reveal a clear polysaccharide substrate for these enzymes to act on, we showed that BuGUS-2 was capable of processing a sulfated heparin-like 9-mer and an acetylated heparin-like 9-mer, and BuGUS-3 processed the sulfated heparin-like 9-mer (Fig. 6A). The unique nature of the BuGUS-3 active site (Fig. 2C) compared with previously characterized GUS enzymes is likely a key feature that leads to its lack of activity with most of the glucuronide-containing polysaccharides. The differences in polysaccharide processing may also be explained by differences in quaternary structures. Although both BuGUS-2 and BuGUS-3 are dimers and contain extra C-terminal domains, the positioning of these domains is distinct and influences protomer organization (Fig. 4, A and B). Taken together, a combination of unique active site residues and quaternary structures likely dictates the specific substrates of these GHs.
Interestingly, BuGUS-1 was also shown to process both GlcA-containing polysaccharides tested (Fig. 6A). Compared with traditional L1 GUS enzymes, the active site of BuGUS-1 is more open due to its N-terminal-mediated tetrameric interface (Fig. 3B), which allows larger polysaccharides to access the active site. In addition to its unique tetrameric state, the flexible nature of its active site, as evidenced by the PTG-bound structure, may also explain the ability of BuGUS-1 to process bulkier polysaccharides. Upon PTG binding, several conformational shifts occur, including that exhibited by the catalytic base Glu-421, which appears to conflict with the large sulfur atom of PTG (Fig. 7, A and B). Although this conformation would preclude function, as Glu-421 is far from the position it would need to be to serve as an acid/base, the structure demonstrates that there is enough mobility in the active site to accommodate this shift and suggests that the active site is also capable of accommodating larger polysaccharide substrates.
We further found via docking that other sugar acids, like galacturonate, mannuronate, and iduronate, are likely to be accommodated in the active sites of these GUS enzymes (Fig. S8), and we confirmed that BuGUS-1 can utilize a small-molecule galacturonide as a substrate (Fig. S9). This finding expands our understanding of the substrate-utilization capacities of the gut microbial GUS enzymes and suggests that these enzymes may coordinate the degradation of polysaccharides that contain uronic acids beyond glucuronate.
Given the importance of quaternary structure relative to GUS function, we were interested in whether computational approaches would provide this critical information. We used Rosetta modeling to predict the tertiary and quaternary structures of the three GUS enzymes reported here. Although the core fold was predicted with a high degree of accuracy for all GUS enzymes analyzed, the critical loop structures as well as the orientation of C-terminal domains were more difficult to position and were heavily influenced by extant structures (Fig. S13). These results highlight the importance of using experimental structures to further refine modeling approaches to accurately predict protein quaternary structures.
Upon determining that BuGUS-1 and BuGUS-2 are targets to prevent GI side effects via SN-38-G processing, we tested whether they are susceptible to inhibition. Our GUS-specific inhibitors Inh1 and UNC10201652 did not inhibit BuGUS-1 and BuGUS-2 (Table 2). We previously showed that the loop in L1 GUS enzymes stabilizes Inh1 (16). Although BuGUS-1 contains an N-terminal loop that replaces L1 in the active site, it is distinct from classic Loop 1 GUS enzymes that form deep hydrophobic pockets constructed from two loops from adjacent monomers (Fig. 3, A and B) (16). Thus, the active site in BuGUS-1 is more hydrophilic and solvent accessible, making it unfavorable for binding to the hydrophobic scaffold of Inh1 and UNC10201652.
Although Inh1 and UNC10201652 did not inhibit GUS activity, the nonspecific GUS inhibitor D-glucaro-1,4-lactone did inhibit BuGUS-1 and BuGUS-2 (Table 2). The crystal structure of BuGUS-1 incubated with the inhibitor revealed D-glucaro-1,5-lactone bound instead of D-glucaro-1,4-lactone (Fig. 7, A and B). We hypothesize that D-glucaro-1,5-lactone is spontaneously generated in solution over the time scale of crystal formation, upon which it is stabilized by binding to the GUS active site. Previous studies indicate that hydrolases in general, and GUS specifically, bind more tightly to D-glucaro-1,5-lactone than to D-glucaro-1,4-lactone (28), which may explain its presence in the active site. The same result was observed for BuGUS-2, with D-glucaro-1,5-lactone apparent in the active site instead of the administered D-glucaro-1,4-lactone (PDB 6D50). Importantly, this pan-GUS inhibitor exhibited mid-micromolar potency against BuGUS-1 and BuGUS-2. These data suggest that other inhibitor chemotypes could be employed to prevent the actions of non-Loop 1 GUS enzyme-mediated reactivation of SN-38-G in the intestinal lumen.
The presence of three structurally and functionally unique GUS enzymes within a single B. uniformis PUL suggests that they have evolved to cleave distinct bonds in a uronic acid-rich polysaccharide. However, the action of GUS enzymes is not sufficient to carry out the complete catabolism of a complex polysaccharide. Thus, it is likely that these GUS enzymes act in concert with the glucuronyl hydrolase 88 enzyme, mannonate oxidase, mannonate dehydratase, and PL enzymes found in the same PUL to deconstruct a complex uronate-containing glycan found in the human gut. Additionally, the hallmark SusC/SusD proteins likely mediate the transport of the polysaccharide into the periplasmic space of B. uniformis for subsequent catabolism. Further studies are needed to determine the true polysaccharide associated with this PUL, but the data presented provide a basis for understanding the roles these GUS enzymes play in polysaccharide processing as well as their more established roles in drug-glucuronide reactivation.

Enzyme cloning
The full-length BuGUS-1, BuGUS-3, and BMSP genes were purchased from BioBasic in the pUC57 vector. Protein sequences were analyzed for signal peptide cleavage sites using the online SignalP 4.1 server (15). The mature gene lacking the signal peptide was amplified and inserted into the pLIC-His vector using the primers in Table S5.

Site-directed mutagenesis
The BuGUS-1 NTL deletion, BuGUS-2 D341A/D367A (BuGUS-2 ΔCa²⁺), and BuGUS-2 N591A/K593A mutants were created using site-directed mutagenesis. Primers were synthesized by Integrated DNA Technologies and are shown in Table S5. The mutant plasmids were sequenced to confirm the mutations. The mutants were produced and purified using Escherichia coli BL21 (DE3) Gold as described under "Protein expression and purification." The NTL deletion for BuGUS-1 encompassed residues Tyr-54 to Ala-67, which were replaced by a 6-residue linker (RGMKVY) based on the structure of BMSP GUS to maintain protein stability.

Protein expression and purification
Each β-glucuronidase expression plasmid was transformed into BL21 (DE3) Gold cells for enzyme expression. Cells were grown in the presence of ampicillin in LB medium with shaking at 225 rpm at 37°C to an A600 of 0.5, at which point the temperature was reduced to 18°C. At an A600 of 0.8, protein expression was induced by the addition of 0.1 mM isopropyl 1-thio-β-D-galactopyranoside (IPTG), and incubation continued overnight. Cells were collected by centrifugation at 4500 × g for 20 min at 4°C in a Sorvall (model RC-3B) swinging bucket centrifuge. Cell pellets were resuspended in Buffer A (20 mM potassium phosphate, pH 7.4, 50 mM imidazole, 500 mM NaCl) supplemented with DNase, lysozyme, and a Roche Complete EDTA-free protease inhibitor tablet. Resuspended cells were sonicated, and the lysate was clarified via centrifugation at 17,000 × g for 60 min in a Sorvall (model RC-5B). The lysate was flowed over a nickel-nitrilotriacetic acid HP column (GE Healthcare) loaded onto the ÄKTAxpress FPLC system (Amersham Bioscience) and washed with Buffer A. Protein was eluted with Buffer B (20 mM potassium phosphate, pH 7.4, 500 mM imidazole, 500 mM NaCl). For BuGUS-1 used for crystallography, the His tag was removed by overnight incubation with tobacco etch virus protease at 4°C in the presence of 1 mM DTT. This sample was then applied to the nickel-nitrilotriacetic acid column again, and the flow-through was collected. Fractions containing the protein of interest were combined and passed over a HiLoad™ 16/60 Superdex™ 200 gel filtration column. Proteins were eluted in S200 buffer (20 mM HEPES, pH 7.4, 50 mM NaCl), except for BMSP and the BuGUS-1 ΔN-terminal loop mutant, which were eluted in S200 buffer that contained 300 mM NaCl. Fractions were analyzed by SDS-PAGE, and those with >95% purity were combined and concentrated for long-term storage at −80°C.
Crystal specimens were cryo-protected in the crystallization conditions as described above containing 20% glycerol, and diffraction data were collected for all crystals at 100 K at APS Beamline 23-ID-D, except for BuGUS-2 ΔCa²⁺, which was collected at APS Beamline 23-ID-B. The data were processed with XDS, and all structures were solved via molecular replacement in Phenix (29) using the E. coli GUS structure (PDB 5CZK) as a search model for BuGUS-1-apo, the B. uniformis GUS structure (PDB 5UJ6) as a search model for BuGUS-2-ΔCa²⁺, the B. fragilis structure (PDB 3CMG) as a search model for BuGUS-3, and the BuGUS-1-apo structure (PDB 6D1N) for the remaining structures. The resulting starting model and maps from molecular replacement were then used in the AutoBuild function of Phenix. Structures were refined in Phenix and visually inspected and manually built using COOT (30). Final PDB coordinates for all structures have been deposited to the RCSB Protein Data Bank with corresponding PDB codes in parentheses: BuGUS-1 (PDB 6D1N), BuGUS-3 (PDB 6D1P), BMSP (PDB 6D8K), BuGUS-1 + G-1,5-L complex (PDB 6D41), BuGUS-2 + G-1,5-L complex (PDB 6D50), BuGUS-1 + GlcA complex (PDB 6D6W), BuGUS-1 Δloop (PDB 6D89), BuGUS-2 calcium-binding mutant (PDB 6D8G), and BuGUS-1 + PTG complex (PDB 6D7F).

GUS activity assay of 4-MUG hydrolysis
Initial pH screening was performed with PNPG, as described previously. Because PNPG is not amenable to continuous kinetic studies at pH below 6.5, we utilized an analogous but fluorescent GUS substrate, 4-MUG, for subsequent kinetic investigations. In vitro assays of GUS activity with 4-MUG were carried out in Costar black 96-well clear flat bottom plates. Total reaction volume was 50 µl, with 5 µl of GUS and 5 µl of 10× buffer (250 mM HEPES, 250 mM NaCl, pH 7.0) mixed and preincubated at 37°C before reaction initiation by addition of 40 µl of 4-MUG. Concentration of enzyme was specific to each GUS: 5 nM EcGUS, 5 nM BuGUS-1, 20 nM BuGUS-2, 40 nM BuGUS-1 Δloop, 80 nM BMSP GUS, and 320 nM BuGUS-3. Reactions were monitored continuously in a BMG Labtech PHERAstar plate reader with an excitation wavelength of 350 nm and an emission wavelength of 450 nm. Resultant progress curves were fit by a custom linear regression analysis program in MATLAB. Initial velocities were then analyzed in the enzyme kinetics module of SigmaPlot 13.0 by Michaelis-Menten fit to determine the catalytic turnover (kcat) and Michaelis constant (Km).
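
The two-step analysis described above (a linear fit to each progress curve, then a Michaelis-Menten fit to the initial velocities) can be sketched as follows. This is an illustrative sketch only, not the MATLAB program or SigmaPlot module used here: it substitutes a Hanes-Woolf linearization for the nonlinear Michaelis-Menten fit, and every function name and number in it is hypothetical and synthetic.

```python
# Illustrative sketch; not the authors' MATLAB/SigmaPlot workflow.
# All values are synthetic and chosen only to exercise the functions.


def linear_fit(x, y):
    """Ordinary least-squares line through (x, y); returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def initial_velocity(times, signal):
    """Initial velocity = slope of the linear portion of a progress curve."""
    slope, _ = linear_fit(times, signal)
    return slope


def michaelis_menten(s, vmax, km):
    """v = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)


def hanes_woolf(substrate, velocity):
    """Hanes-Woolf linearization: [S]/v = [S]/Vmax + Km/Vmax.

    Returns (Vmax, Km). This linearization stands in for a nonlinear fit.
    """
    ratio = [s / v for s, v in zip(substrate, velocity)]
    slope, intercept = linear_fit(substrate, ratio)
    vmax = 1.0 / slope
    return vmax, intercept * vmax


# Synthetic, noise-free example (hypothetical units: uM and uM/s).
TRUE_VMAX, TRUE_KM = 2.0, 100.0
substrate = [25.0, 50.0, 100.0, 200.0, 400.0, 800.0]
velocity = [michaelis_menten(s, TRUE_VMAX, TRUE_KM) for s in substrate]

vmax, km = hanes_woolf(substrate, velocity)
enzyme_uM = 0.005            # hypothetical enzyme concentration (5 nM)
kcat = vmax / enzyme_uM      # kcat = Vmax / [E], in 1/s
```

With real, noisy data a weighted nonlinear fit is generally preferable, because linearizations distort the error structure of the velocities.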

GUS activity assay of SN-38-G hydrolysis
In vitro assays of GUS activity with the substrate SN-38-G were carried out in Costar black 96-well clear flat bottom plates. Total reaction volume was 50 µl, with 5 µl of SN-38-G at a range of low substrate concentrations (15, 10, 7.5, 5, and 2.5 µM final), 5 µl of 10× buffer (250 mM HEPES, 250 mM NaCl, pH 7.0), and 35 µl of water mixed and preincubated at 37°C before reaction initiation by addition of 5 µl of GUS. Concentration of enzyme was specific to each GUS: 5 nM EcGUS, 5 nM BuGUS-1, 20 nM BuGUS-2, 40 nM BuGUS-1 Δloop, 80 nM BMSP GUS, and 320 nM BuGUS-3. Reactions were monitored continuously by fluorescence with an excitation wavelength of 230 nm and an emission wavelength of 420 nm. Resultant progress curves were fit by a custom linear regression analysis program in MATLAB. Initial velocities were then plotted against substrate concentration and fit with linear regression in Microsoft Excel to determine catalytic efficiency (kcat/Km).
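
The reasoning behind the linear fit is that when [S] is well below Km, the Michaelis-Menten equation reduces to v ≈ (kcat/Km)[E][S], so the slope of v against [S] divided by the enzyme concentration estimates kcat/Km. A minimal sketch of that calculation is shown below; the substrate concentrations mirror those listed above, but the velocities and the function names are synthetic and hypothetical, and the code is not the Excel analysis used here.

```python
# Illustrative sketch; velocities are synthetic, not measured values.


def linear_fit(x, y):
    """Ordinary least-squares line through (x, y); returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def catalytic_efficiency(substrate_uM, velocity_uM_per_s, enzyme_uM):
    """kcat/Km in 1/(uM*s), valid only under the assumption [S] << Km."""
    slope, _ = linear_fit(substrate_uM, velocity_uM_per_s)
    return slope / enzyme_uM


substrate_uM = [2.5, 5.0, 7.5, 10.0, 15.0]   # final concentrations from the assay
enzyme_uM = 0.005                             # 5 nM enzyme
assumed_efficiency = 0.2                      # hypothetical kcat/Km, 1/(uM*s)
velocity = [assumed_efficiency * enzyme_uM * s for s in substrate_uM]

efficiency = catalytic_efficiency(substrate_uM, velocity, enzyme_uM)
efficiency_per_M = efficiency * 1e6           # convert to 1/(M*s)
```

If the plot of v against [S] curves downward, the [S] << Km assumption no longer holds and the slope underestimates kcat/Km.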

GlcA-containing polysaccharide processing assay
The sulfated heparin-like nonasaccharide (GlcA-(GlcNS-GlcA)4-PNP, where GlcA is glucuronic acid and GlcNS is N-sulfated glucosamine) and the acetylated heparin-like nonasaccharide (GlcA-(GlcNAc-GlcA)4-PNP, where GlcA is glucuronic acid and GlcNAc is N-acetylglucosamine) substrates were from Glycan Therapeutics. The additional polysaccharides employed were synthesized in-house (37). Putative polysaccharide substrates were digested with each GUS enzyme for 3 h. Digestion reactions were composed of 0.5 µM GUS enzyme and 10 µg of oligosaccharide. Reactions were terminated by heating for 5 min at 95°C. Aliquots of the resultant solutions were analyzed by polyamine-based anion exchange (PAMN)-HPLC. Sugars were eluted from the PAMN column (0.46 × 25 cm from Waters) with a linear gradient of KH2PO4 from 0 to 1 M in 40 min at a flow rate of 0.5 ml/min. The eluent was monitored by a UV detector at 310 nm. Aliquots of the digestion reactions were analyzed by electrospray ionization-MS (ESI-MS) by first purifying the reaction mixture by C18 column eluted with a linear gradient of methanol with 1% TFA from 0 to 1 M in 60 min at a flow rate of 0.5 ml/min. The purified oligosaccharides were then dried. ESI-MS analysis was performed on a Thermo LCQ-Deca in negative ionization mode. A syringe pump (Harvard Apparatus) was used to introduce the sample by direct infusion (50 µl/min). The purified oligosaccharides were diluted in 200 µl of H2O with the electrospray source set to 3 kV and 150°C. The automatic gain control was set to 1 × 10^7 for full scan MS. The MS data were acquired and processed using Xcalibur 1.3.

GUS inhibition assay
In vitro assays of GUS activity with the substrate 4-MUG were carried out in Costar black 96-well clear flat bottom plates. Total reaction volume was 50 µl, with 5 µl of GUS, 10 µl of 5× buffer (125 mM HEPES, 125 mM NaCl, pH 7.0), and 5 µl of inhibitor mixed and preincubated at 37°C before reaction initiation by addition of 30 µl of 4-MUG. Reactions were monitored continuously in a PHERAstar plate reader at 410 nm. End point absorbance values after 1 h were converted to % inhibition values via the following equation,
$$\%\ \text{inhibition} = \left(1 - \frac{A_{\mathrm{exp}} - A_{\mathrm{bg}}}{A_{\mathrm{max}} - A_{\mathrm{bg}}}\right) \times 100$$
where $A_{\mathrm{exp}}$ is the end point absorbance at a particular inhibitor concentration, $A_{\mathrm{max}}$ is the absorbance of the uninhibited reaction, and $A_{\mathrm{bg}}$ is the background absorbance. Percent inhibition values were subsequently plotted against the log of inhibitor concentration and fit with a four-parameter logistic function in SigmaPlot 13.0 to determine the concentration at which 50% inhibition (IC50) is observed.
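
The conversion from end point absorbance to percent inhibition, and a rough IC50 estimate, can be sketched as below. The sketch assumes the conventional background-corrected normalization, (1 − (A_exp − A_bg)/(A_max − A_bg)) × 100, and it replaces the four-parameter logistic fit with a simple log-linear interpolation between the two points that bracket 50% inhibition. The function names and all numbers are hypothetical.

```python
# Illustrative sketch; not the SigmaPlot four-parameter logistic fit.
import math


def percent_inhibition(a_exp, a_max, a_bg):
    """Background-corrected percent inhibition (assumed conventional form)."""
    return (1.0 - (a_exp - a_bg) / (a_max - a_bg)) * 100.0


def ic50_by_interpolation(concentrations, inhibitions):
    """Crude IC50: interpolate on log10(concentration) across the 50% crossing.

    Returns None when the data never bracket 50% inhibition.
    """
    pairs = sorted(zip(concentrations, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(pairs, pairs[1:]):
        if i_lo <= 50.0 <= i_hi:
            if i_hi == i_lo:
                return c_lo
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + frac * (
                math.log10(c_hi) - math.log10(c_lo)
            )
            return 10.0 ** log_ic50
    return None


# Synthetic example: inhibitor concentrations in uM and percent inhibition.
example_ic50 = ic50_by_interpolation([1.0, 100.0], [25.0, 75.0])
```

Interpolation uses only two points and ignores the plateaus and Hill slope, so it is a sanity check on a fitted IC50 rather than a substitute for the logistic fit.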

SEC-MALS analysis of BuGUS enzymes
BuGUS-1, BuGUS-2, and BuGUS-3 were analyzed on a Superdex 200 size exclusion column connected to an Agilent FPLC system, Wyatt DAWN HELEOS II multi-angle light scattering instrument, and a T-rEX refractometer. The injection volume was 50 µl, and each protein was assessed at 10 mg/ml in 50 mM HEPES and 150 mM NaCl, pH 7.4, buffer. A flow rate of 0.5 ml/min was used. Light scattering and refractive index data were collected and analyzed using Wyatt ASTRA (version 6.1) software. A dn/dc value of 0.185 ml/g was used for calculations. Approximately 99% of BuGUS-1 eluted in a single peak with a molar mass of 275 kDa, indicating that it forms a tetramer in solution. In contrast, 99% of BuGUS-2 and 95% of BuGUS-3 eluted in single peaks with molar masses of 189 and 175 kDa, respectively, indicating that they form dimers in solution.
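
The oligomeric state follows from dividing the measured molar mass by the mass of one polypeptide chain and rounding to the nearest integer. The sketch below illustrates that arithmetic. The monomer masses are not stated in this section, so the values used are placeholders chosen only for illustration; the sequence-derived masses of the actual constructs should be substituted.

```python
# Illustrative sketch. Monomer masses below are HYPOTHETICAL placeholders,
# not values reported in this work.


def oligomeric_state(measured_kda, monomer_kda):
    """Nearest-integer number of subunits implied by a measured molar mass."""
    return round(measured_kda / monomer_kda)


def stoichiometry_ratio(measured_kda, monomer_kda):
    """Unrounded ratio; values far from an integer suggest a mixture."""
    return measured_kda / monomer_kda


measured = {"BuGUS-1": 275.0, "BuGUS-2": 189.0, "BuGUS-3": 175.0}  # kDa, from SEC-MALS
placeholder_monomer = {"BuGUS-1": 70.0, "BuGUS-2": 95.0, "BuGUS-3": 88.0}  # kDa, hypothetical

states = {
    name: oligomeric_state(mass, placeholder_monomer[name])
    for name, mass in measured.items()
}
```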

CD analysis of BuGUS-2 calcium-binding mutant
The protein stabilities of WT BuGUS-2 and the calcium-binding mutant were determined using the circular dichroism method (31). Enzyme (1 µM) in CD buffer containing 10 mM potassium phosphate, pH 7.4, and 100 mM potassium fluoride was loaded into a 1-mm cuvette. Using a Chirascan-plus instrument (Applied Photophysics Limited), spectra from 185 to 280 nm were recorded at 20 ± 1.0°C. Measurements were corrected for background signal using a CD buffer sample. The melting profile of the sample (5 µM) was monitored at 218 nm from 25 to 94°C.

Manual docking of monosaccharide in PyMOL
Galacturonate, mannuronate, and iduronate monosaccharides were accessed from the PDB in previously solved crystal structures (PDB 1KCC for galacturonate, PDB 3VLW for mannuronate, and PDB 4OBR for iduronate). These were then imported into PyMOL and manually aligned to the GlcA-bound BuGUS-1 structure (PDB 6D6W) with the 3-button editing tool. After manual alignment of the monosaccharides, structures of BuGUS-2 and BuGUS-3 were aligned to the GlcA-bound BuGUS-1 structure. Visual inspection was performed and final figures were generated in PyMOL after alignment.

Rosetta modeling
The full-length amino acid sequences of BuGUS-1, BuGUS-3, and BMSP GUS were submitted to the Robetta modeling server (32)(33)(34) to produce 3D homology models of these proteins, including their oligomeric complexes, based on template protein structures available in the Protein Data Bank from December 2017 to January 2018. The BuGUS-1 and BMSP GUS Robetta homology models were based on the E. coli β-glucuronidase structure (PDB 3LPF). For the BuGUS-3 homology model, the Robetta-selected template was a β-galactosidase from Bacillus circulans ATCC 31382 (PDB 4YPJ). Backbone Cα coordinates of the homology model protein structures were then superimposed onto X-ray crystal structures using TM-align algorithms (35).

Identification of predicted calcium-binding sites
To identify calcium-binding sites in GUS enzymes from the HMP dataset, the 279 GUS protein sequences previously identified (12) were aligned pairwise to BuGUS-2 using NCBI BLASTp (36). These alignments were then probed for the three aspartic acid residues in BuGUS-2 (Asp-176, Asp-341, and Asp-367) deemed necessary for calcium binding.
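
Probing a pairwise alignment for specific reference residues amounts to walking the two gapped strings together, counting only non-gap characters in the reference, and reading off the aligned residue in the hit at each position of interest. The sketch below shows one way to do this. It is not the script used here: the function names, the input format (two equal-length gapped strings plus the reference start coordinate), and the toy sequences are all assumptions made for illustration.

```python
# Illustrative sketch; input format and names are hypothetical.

# Reference positions in BuGUS-2 named in the text.
CALCIUM_SITE_POSITIONS = (176, 341, 367)


def residues_at_reference_positions(ref_aln, hit_aln, ref_start, positions):
    """Map 1-based reference positions to the aligned residues in the hit.

    ref_aln, hit_aln: equal-length gapped alignment strings ('-' = gap).
    ref_start: 1-based reference coordinate of the first aligned residue.
    Positions outside the aligned region are absent from the result.
    """
    wanted = set(positions)
    found = {}
    ref_pos = ref_start - 1
    for ref_char, hit_char in zip(ref_aln, hit_aln):
        if ref_char == "-":
            continue  # insertion in the hit; reference coordinate unchanged
        ref_pos += 1
        if ref_pos in wanted:
            found[ref_pos] = hit_char  # may be '-' if the hit has a deletion
    return found


def has_all_aspartates(ref_aln, hit_aln, ref_start, positions):
    """True only if every requested reference position aligns to Asp (D)."""
    found = residues_at_reference_positions(ref_aln, hit_aln, ref_start, positions)
    return all(found.get(p) == "D" for p in positions)


# Toy alignment (not real GUS sequence) with reference Asp at positions 3 and 6.
toy_ref = "MKD-AGDL"
toy_hit = "MRDQA-DL"
toy_result = has_all_aspartates(toy_ref, toy_hit, 1, (3, 6))
```

For the real dataset the same check would be run with `CALCIUM_SITE_POSITIONS` against each BLASTp alignment to BuGUS-2, using the reference start coordinate reported for that alignment.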

Identification of tryptophan substitutions
To identify additional GUS enzymes in the HMP Clustered genes (HMGC) dataset that possess a tryptophan rather than a tyrosine at the position corresponding to Trp-483 in BuGUS-3, the ~267,000 sequences previously determined to share 25% identity with EcGUS, SaGUS, CpGUS, and BfGUS (12) were aligned pairwise to these GUS enzymes and BuGUS-3 using NCBI BLASTp (36). The sequences were then probed for the presence of the NXKG motif, catalytic E residues, and N and W motifs.

Identification of N-terminal loops
To identify N-terminal loops in GUS enzymes from the HMP dataset, the 279 GUS protein sequences previously identified (12) were aligned pairwise to BuGUS-1 using NCBI BLASTp (36). These alignments were then probed for the N-terminal loop in BuGUS-1.