Structural Insights into the Broad Substrate Specificity of a Novel Endoglycoceramidase I Belonging to a New Subfamily of GH5 Glycosidases*

Endoglycoceramidases (EGCases) specifically hydrolyze the glycosidic linkage between the oligosaccharide and the ceramide moieties of various glycosphingolipids, and they have received substantial attention in the emerging field of glycosphingolipidology. However, the mechanism regulating the strict substrate specificity of these GH5 glycosidases has not been identified. In this study, we report a novel EGCase I from Rhodococcus equi 103S (103S_EGCase I) with remarkably broad substrate specificity. Based on phylogenetic analyses, the enzyme may represent a new subfamily of GH5 glycosidases. The X-ray crystal structures of 103S_EGCase I alone and in complex with its substrates monosialodihexosylganglioside (GM3) and monosialotetrahexosylganglioside (GM1) enabled us to identify several structural features that may account for its broad specificity. Compared with EGCase II from Rhodococcus sp. M-777 (M777_EGCase II), which possesses strict substrate specificity, 103S_EGCase I possesses a longer α7-helix and a shorter loop 4, which together form a larger substrate-binding pocket that could accommodate more extended oligosaccharides. In addition, loop 2 and loop 8 of the enzyme adopt a more open conformation, which also enlarges the oligosaccharide-binding cavity. Based on this knowledge, a rationally designed experiment was performed to examine the substrate specificity of EGCase II. The truncation of loop 4 in M777_EGCase II increased its activity toward GM1 (163%). Remarkably, the S63G mutant of M777_EGCase II showed a broader substrate spectrum and significantly increased activity toward bulky substrates (more than 1370-fold for fucosyl-GM1). Collectively, the results presented here reveal the exquisite substrate recognition mechanism of EGCases and provide an opportunity for further engineering of these enzymes.

EGCases display an unusual substrate specificity that accepts amphiphilic glycosphingolipids (GSLs) consisting of a hydrophilic oligosaccharide headgroup and a hydrophobic ceramide tail. The crystal structure of EGCase II from Rhodococcus sp. strain M-777 (M777_EGCase II) suggests that the substrate-binding site of EGCase is split into two noticeably different parts: a wide, polar cavity that binds the polyhydroxylated oligosaccharide moiety and a narrow, hydrophobic tunnel that binds the ceramide moiety of the substrates (16). However, the distinct substrate specificities of different EGCases imply different substrate-binding modes, particularly for the oligosaccharide moiety. Unfortunately, because only one crystal structure of an EGCase is available so far, detailed investigations of the substrate recognition mechanism have been hampered.
In this study, we report the molecular cloning and enzymatic characterization of a novel EGCase I from Rhodococcus equi 103S (103S_EGCase I). The recombinant protein showed high catalytic activity, broad substrate specificity, and a remarkably high expression level in Escherichia coli. Based on phylogenetic analyses, EGCase I may represent a new subfamily of GH5 glycosidases. The X-ray crystal structures of 103S_EGCase I alone and in complex with its substrates monosialodihexosylganglioside (GM3) and monosialotetrahexosylganglioside (GM1) were obtained and compared with the structures of M777_EGCase II. A detailed analysis of the substrate-binding mode offers valuable information that enables us to better understand its substrate recognition mechanism, which may facilitate subsequent enzyme engineering studies for the design of better EGCases.

Results
Overexpression and Characterization of EGCase I from R. equi 103S-A putative EGCase from R. equi 103S (103S_EGCase I) (GenBank™ accession number CBH49814) shares 90% sequence identity with EGCase I from Rhodococcus sp. M-750 (M750_EGCase I) (supplemental Fig. S1) (15). The gene was codon-optimized for E. coli, chemically synthesized, and subcloned into a pET28a vector. 103S_EGCase I was functionally overexpressed in E. coli at a very high level (Fig. 1). In a typical experiment, 80 mg of purified protein was obtained from a 1-liter E. coli shake flask culture after 12 h of induction, which is much higher than the previously reported expression of M750_EGCase I in the Rhodococcus system (~1 mg/liter, 24 h of induction).
The optimal substrate for 103S_EGCase I was GM1, whereas M777_EGCase II preferred GM3 over the GSLs with larger and branched sugar moieties, such as GM1, fucosyl-GM1, and Gb4Cer. Thus, these two enzymes probably possess different substrate-binding sites.
Based on the determination of the steady-state kinetic parameters, 103S_EGCase I and M777_EGCase II had similar Km values for the ganglioside GM1 (the raw data are shown in supplemental Fig. S3). However, the kcat and the kcat/Km of 103S_EGCase I were 119- and 130-fold higher, respectively, than the values for M777_EGCase II (Table 2).
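The two fold changes quoted above are linked through the definition of catalytic efficiency, kcat/Km. As a rough consistency check, the following minimal sketch derives the Km ratio implied by the 119- and 130-fold figures; the function name is illustrative and not part of the original analysis, and the absolute values in Table 2 are not used.

```python
def implied_km_ratio(kcat_fold: float, efficiency_fold: float) -> float:
    """Return Km(enzyme A) / Km(enzyme B) implied by two fold changes.

    kcat_fold       = kcat(A) / kcat(B)
    efficiency_fold = [kcat(A)/Km(A)] / [kcat(B)/Km(B)]
    Dividing the first by the second leaves Km(A) / Km(B).
    """
    if efficiency_fold <= 0 or kcat_fold <= 0:
        raise ValueError("fold changes must be positive")
    return kcat_fold / efficiency_fold


# Fold changes reported in the text for 103S_EGCase I versus M777_EGCase II.
ratio = implied_km_ratio(kcat_fold=119.0, efficiency_fold=130.0)
print(f"Implied Km(103S_EGCase I) / Km(M777_EGCase II) = {ratio:.2f}")
```

A ratio close to 1 agrees with the statement that the two enzymes have similar Km values for GM1, so the higher efficiency of 103S_EGCase I is carried almost entirely by kcat.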
Phylogenetic Analysis of EGCases-All of the known EGCases belong to the GH5 glycosidase family. EGCrP1 and EGCrP2 are assigned to subfamily GH5_12, eukaryotic EGCases belong to subfamily GH5_27, EGCase II belongs to subfamily GH5_28, and EGALC belongs to subfamily GH5_29 (17). However, EGCase I has not been classified into any GH5 subfamily. A phylogenetic analysis was performed to better understand the evolutionary background of EGCase I. The amino acid sequence of M750_EGCase I was used as a query for a BLAST search of the NCBI non-redundant protein sequence database, and five homologous sequences with >50% sequence identity were collected, including 103S_EGCase I (90%), WP_031939561 from Rhodococcus defluvii (86%), WP_042260331 from Nocardia brasiliensis (65%), WP_043573168 from Actinopolyspora erythraea (60%), and WP_051198817 from Gordonia shandongensis (57%). These sequences and 21 other EGCase-related proteins extracted from the CAZy database were used to derive a phylogenetic tree using the maximum likelihood method based on the JTT matrix-based model. As shown in Fig. 2, the tree clearly assigned these sequences to their corresponding subfamilies (17), confirming the validity of the phylogenetic analysis.
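The >50% identity cutoff used to collect homologs can be illustrated with a small sketch. It assumes two sequences that are already aligned (gaps written as "-") and counts identity over gap-free columns only, which is one common convention; BLAST reports identity over the full alignment length, so the numbers are not directly interchangeable. The toy sequences and function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


def passes_threshold(identity_percent: float, threshold: float = 50.0) -> bool:
    """Apply the strict >50% cutoff described in the text."""
    return identity_percent > threshold


# Hypothetical toy alignment, not taken from any real EGCase sequence.
toy_query = "MKT-AYLG"
toy_hit = "MRTQAY-G"
identity = percent_identity(toy_query, toy_hit)
print(f"{identity:.1f}% identity, retained: {passes_threshold(identity)}")
```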
103S_EGCase I and M750_EGCase I were obviously clustered with other putative EGCase I sequences in a branch distinct from the other EGCase-related subfamilies, suggesting that these EGCase I-related enzymes may belong to a new subfamily of the GH5 family. The assignment of EGCase I to a new subfamily was consistent with the observation that it possessed distinct substrate specificity compared with other EGCases (supplemental Table S1).
Architecture of 103S_EGCase I-The crystal structure of 103S_EGCase I was determined at 2.11 Å resolution with the space group C121. Rwork and Rfree were 19.4 and 22.3%, respectively. Two 103S_EGCase I molecules were present in one asymmetric unit (Fig. 3A). The two monomers of 103S_EGCase I were nearly identical, with an r.m.s.d. of 0.52 Å over 416 residues. Residues 68, 300-307, 309, and 342-343 were not modeled in chain B because of poor electron density. The monomer-monomer interface buried an extensive, predominantly hydrophobic area of ~962 Å², which corresponded to 12.7% of the total surface area of one monomer, as calculated by PISA (Fig. 3B) (18). The interface had three hydrogen bonds and 43 non-bonded contacts, suggesting that the observed dimer was not the biological form because the crystal contacts of homodimers or protein complexes tend to have 10-20 hydrogen bonds (19). Moreover, the interface between these two molecules had a 3.36% probability of being the biological interface according to the NOXclass analysis (20), providing further support for the hypothesis that the dimer only results from crystallographic packing.
Each monomer of 103S_EGCase I contained two distinct domains. The N-terminal domain exhibited the characteristic (α/β)8 TIM-barrel fold of all GH5 family members; it contained an internal core of eight β-strands connected by loops of various sizes to an external layer of eight α-helices. The C-terminal domain formed a β-sandwich fold composed of two sheets of four antiparallel β-strands (Fig. 3, A and C). Two disulfide bonds were present in the 103S_EGCase I structure: Cys224-Cys229 and Cys294-Cys313. A structure search using the DALI server (21) suggested that the crystal structure of 103S_EGCase I closely matched the structure of M777_EGCase II (PDB code 2OYM, DALI Z = 41.7, and r.m.s.d. = 2.6 Å for 400 equivalent Cα positions), despite their low sequence similarity (30% identity). Other similar structures included a cellulase (PDB code 4HTY, DALI Z = 26.6, and r.m.s.d. = 2.6 Å for 269 residues), an endo-β-mannanase (PDB code 4QP0, DALI Z = 25.8, and [...] (23). The C-terminal domain of 103S_EGCase I displayed a typical β-sandwich fold that resembled the folds of many carbohydrate-binding modules of glycoside hydrolases (24). The β-sandwich domain may not be involved in binding the carbohydrate portion of the substrate, because it is located on the opposite face of the (α/β)8 domain. Similar domains have been observed in other GH5 family members, including M777_EGCase II (16), endo-xyloglucanase (25), and β-glucanase (26). Indeed, many of these carbohydrate-binding modules do not bind the substrate independently (27). This domain may simply stabilize the catalytic (α/β)8 domain.
Structure of the Enzyme-Substrate Complex-103S_EGCase I was co-crystallized with each of its substrates, GM1 and GM3, to further understand the structural basis of the broad substrate specificity of EGCase I. Co-crystallization experiments were performed with the nucleophile mutant 103S_EGCase I/E339S to prevent substrate hydrolysis. The 103S_EGCase I-GM1 structure was determined at 2.15 Å resolution, and the 103S_EGCase I-GM3 structure was determined at 1.915 Å resolution. Both structures belonged to space group C121 and contained two molecules per asymmetric unit. Clear electron density was evident for all the pyranoside rings, with the exception of the sialic acid unit of GM3, which was only partially clear (Fig. 4, A-C). In both complexes, the ceramide moieties were only partially resolved. The superposition of the 103S_EGCase I-GM1 or 103S_EGCase I-GM3 structure on the 103S_EGCase I structure showed that the overall protein structure was almost unchanged, as reflected in r.m.s.d. values of 0.21 Å and 0.23 Å, respectively, over 449 common Cα atoms between the 103S_EGCase I structure and the ligand-bound forms.
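For reference, the r.m.s.d. values quoted above are root mean square deviations over paired Cα positions. The sketch below computes that quantity for two coordinate sets that are assumed to be already superposed and listed in matching order; it does not perform the optimal superposition itself, which dedicated structure-comparison software handles. The coordinates are toy values, not taken from the deposited structures.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between paired 3D points (in Å).

    Assumes both lists hold (x, y, z) tuples for equivalent atoms in the
    same order and that the structures have already been superposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: three atoms, each displaced by 0.2 Å along z.
apo = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
bound = [(0.0, 0.0, 0.2), (3.8, 0.0, 0.2), (3.8, 3.8, 0.2)]
print(f"r.m.s.d. = {rmsd(apo, bound):.2f} Å")
```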
Similar to the crystal structure of M777_EGCase II, the substrate-binding site of 103S_EGCase I contained two distinct regions. On one side of the catalytic residues, the active site channel was broad (~27.9 Å) and mainly lined with polar residues that formed the binding cavity for the oligosaccharide moiety. On the opposite side, the active site narrowed to an ~5.8-Å channel that was predominantly lined with hydrophobic residues, forming the ceramide-binding tunnel. This tunnel subsequently opened onto a distinctly flat surface of the enzyme, which also appeared largely composed of hydrophobic residues (Fig. 3, B and D).

FIGURE 2. [...] (15), whereas the other sequences included in the analysis were obtained from the NCBI database. All bootstrap values are displayed. Scale bar, 0.2 amino acid substitutions/site. The three-dimensional structure of 103S_EGCase I (GenBank™ accession number CBH49814) from R. equi 103S solved in this study is marked with a red star. M777_EGCase II (GenBank™ accession number AAB67050) was from Rhodococcus sp. strain M-777, EGALC (GenBank™ accession number BAF56440) was from R. equi, EGCrP1 (GenBank™ accession number BAL46040) was from C. neoformans, and EGCrP2 (GenBank™ accession number AFR99035) was from C. neoformans. The GH5 subfamily number for each branch is shown.
The coordination of GM1 and GM3 in the enzyme is described in detail in Fig. 4, D and E. The glucose unit was held in a fixed position by a hydrogen bond network consisting of Lys61, His131, Asp133, Asn213, Glu214, and Gln298. These residues are highly conserved among EGCases. Mutation of any of these residues to alanine completely abolished the enzymatic activity toward GM1 (Table 3; for the raw data, see supplemental Fig. S4). The inner galactose unit formed hydrogen bonds with Lys61, Tyr302, and Trp365. Both Lys61 and Trp365 are conserved among EGCases, and mutating them to alanine resulted in a dramatic loss of enzymatic activity (Table 3). Notably, the sialic acid unit showed a remarkable difference in conformation between the GM1 and GM3 complexes (Fig. 4C). In the GM3 complex, the sialic acid unit directly interacted with the inner galactose, whereas in the GM1 complex, it interacted with the N-acetylgalactosamine residue. In the 103S_EGCase I-GM1 complex, Asp62 formed a hydrogen bond with the N-acetylgalactosamine unit of GM1. However, the D62A mutant still retained ~70% of the normal activity, suggesting that Asp62 contributes little to catalysis. The terminal galactose unit of GM1 did not directly interact with the protein; instead, its interaction with Asp342 was mediated by a water molecule.
Interestingly, although Asp342 is not a conserved residue in the EGCase family, mutation of this residue caused a dramatic loss of activity, suggesting that it has an important role in catalysis. This residue was mutated to several other amino acids to better understand its potential function. Mutation of Asp342 to Asn or Gln left the enzyme with very low activity, whereas the other mutations completely abolished the enzymatic activity (supplemental Table S2). Because the mutations caused a loss of the enzymatic activities toward both GM1 and LacCer, and LacCer lacks a sialic acid unit, the interaction between Asp342 and the sialic acid unit cannot account for the loss of activity. We inferred that Asp342 might stabilize the conformation of the "cap," which might be important for catalysis, because it formed a hydrogen bond with the cap-forming amino acid, Tyr302, located in the α7-helix (Fig. 6A).
The ceramide moieties of GM1 and GM3 were only partially defined in the electron density map. Asn265 and Gln298 directly interacted with ceramide. Both residues are conserved among EGCases and important for catalysis (Table 3). The ceramide-binding channel was lined by the hydrophobic side chains of Phe162, Pro163, Leu167, Trp216, Phe225, Val262, Ile295, and Leu299.
The Substrate Recognition Mechanism and Molecular Engineering Guided by Structural Comparisons-The main difference between EGCase I and EGCase II was their substrate specificity toward oligosaccharide moieties. EGCase I efficiently hydrolyzes fucosyl-GM1 and globo-series GSLs that are resistant to EGCase II. Solving the 103S_EGCase I-GM1 structure (PDB code 5J7Z) enabled us to perform a detailed comparison of its structure with the structure of M777_EGCase II-GM3 (PDB code 2OSX). The superposition of 5J7Z and 2OSX monomers using Chimera gave an r.m.s.d. of 1.09 Å between 239 atom pairs. As shown in Fig. 5A, although the overall structure of 103S_EGCase I was similar to that of M777_EGCase II, it showed several major structural differences in the oligosaccharide-binding cavity.
First, the α7-helix of 103S_EGCase I was longer than the equivalent α8-helix in M777_EGCase II (Fig. 5B). The Tyr302 and Leu303 residues in the α7-helix, along with Phe162, Pro163, and Leu164 in loop 5 and the α6-helix, formed a broad cap over the ceramide-binding channel in 103S_EGCase I, whereas the cap in M777_EGCase II formed by Arg177 and Asp311 was much smaller (Fig. 6, A and B). Therefore, the opening of the active site of 103S_EGCase I was clearly smaller than the opening of M777_EGCase II (Fig. 6, A and B), which is mainly attributed to the presence of Tyr302.
Second, loop 4 (Gly140-Pro145) in 103S_EGCase I was shorter than the corresponding loop 4 (Thr144-Pro161) in M777_EGCase II (Fig. 5C). The shorter loop increased the volume of the sugar cavity of 103S_EGCase I, which may account for its broad substrate specificity. The truncation of loop 4 in M777_EGCase II from Asn148 to Gly154 may increase the space of the crowded sugar-binding cavity (Fig. 6D), resulting in enhanced activity toward GM1 (163%) and decreased activity (46.8%) toward LacCer (Table 4).
Third, the conformation of loop 2 was different in the two enzymes. Loop 2 in 103S_EGCase I (Val59-Thr73) was more open than loop 2 in M777_EGCase II (Ala62-Thr76) (Fig. 5D). Consequently, 103S_EGCase I possessed a larger sugar-binding pocket, which could accommodate the fucosyl unit of fucosyl-GM1 (Fig. 6C). By contrast, loop 2 in M777_EGCase II was disrupted by a short 1-helix (residues 63-66, SSAK), and thus it adopted a conformation that was closer to that of the substrate (Fig. 5D). The superimposition of the crystal structures of 103S_EGCase I-GM1 and M777_EGCase II clearly showed that the inclusion of Ser63 in M777_EGCase II resulted in a narrowed sugar-binding cavity, which may also cause its strict specificity. Indeed, the S63G mutant of M777_EGCase II efficiently hydrolyzed fucosyl-GM1, with its catalytic activity increasing more than 1370-fold. Moreover, its activity toward GM1 was also enhanced by ~10-fold (Table 4).

TABLE 3 Specific activities of wild-type 103S_EGCase I and its mutants
Representative HPLC chromatograms of the specific activity assay are presented in supplemental Fig. S4.

Finally, loop 8 also showed a large difference in conformation between 103S_EGCase I and M777_EGCase II (Fig. 5E). Compared with M777_EGCase II, loop 8 of 103S_EGCase I moved outward ~13 Å and was flatter, which also contributed to the enlarged sugar-binding cavity.

Discussion
EGCases are a group of glycoside hydrolases that are important in cellular glycosphingolipid-glycome analyses. In this study, we identified a new EGCase I from R. equi 103S (103S_EGCase I) that hydrolyzes ganglio-, lacto-, and globo-series GSLs, as well as fucosyl-GM1. Remarkably, 103S_EGCase I can be readily overexpressed in E. coli at a very high level (80 mg/liter purified protein, 12 h of induction), which is much higher than the previously reported expression of M750_EGCase I in the Rhodococcus system (~1 mg/liter, 24 h of induction). The broad substrate specificity, high catalytic activity, and ease of expression make 103S_EGCase I a good biocatalyst for cellular glycomics analysis of GSLs.
The GH5 family is one of the largest GH families, containing >6900 protein sequences with ~20 different enzyme activities (17). Previously, the EGCases were divided into the GH5_12, GH5_27, GH5_28, and GH5_29 subfamilies. A phylogenetic analysis was conducted in this study, and the results suggested that EGCase I genes belong to a new subfamily within the GH5 family. We obtained the first crystal structure of a member of this new subfamily, which may provide new insights into the mechanism of the substrate selectivity of EGCases.
Based on the detailed structural comparison, 103S_EGCase I and M777_EGCase II exhibit several major structural differences in their sugar-binding cavities, which explain their different substrate specificities. First, the α7-helix of 103S_EGCase I is longer than the equivalent α8-helix in M777_EGCase II, forming a larger cap over the glycosidic bond. Second, the flexible loop 4 between 1 and 2 is shorter than the corresponding loop in M777_EGCase II. Loop 4 may play a role in the substrate selectivity of EGCases. For larger substrates, such as GM1, the activity may be inhibited by the space limitation arising from the long loop 4 in M777_EGCase II, but for smaller substrates, such as LacCer, the loop may stabilize the substrates in the active site and efficiently facilitate catalysis. Third, loop 2 of 103S_EGCase I adopts an open conformation, whereas loop 2 of M777_EGCase II adopts a closed conformation that produces a smaller sugar-binding pocket unable to accommodate more extended oligosaccharides. Presumably, the size of this pocket is the reason why M777_EGCase II cannot hydrolyze fucosyl-GM1. Finally, loop 8 of 103S_EGCase I is displaced outward and flattened, which could also enlarge the oligosaccharide-binding cavity.
This structural information enabled us to identify a series of conserved amino acids that are important for substrate binding in 103S_EGCase I. The residues that interact with the first two sugar residues (Glc and Gal) are highly conserved in 103S_EGCase I and M777_EGCase II and include Lys61, His131, Asp133, Asn213, Glu214, Gln298, and Trp365. By contrast, the outer sugar subunits have few interactions with the enzyme; the sialic acid unit of GM1 only interacts with Asp342 through a water molecule. A structure-based sequence alignment of EGCases revealed several regions with low sequence identity. In particular, the main structural differences mentioned above are located in unconserved regions A, B, and C, which may play key roles in determining substrate specificity (Fig. 7).
These analyses provided valuable information for engineering the EGCase protein. As shown in the study by Ishibashi et al. (15), the deletion of loop 4 (residues Asn148-Gly154) from M777_EGCase II increased its catalytic activity toward GM1. In this study, the 103S_EGCase I crystal structure revealed that loop 4 in 103S_EGCase I is clearly shorter than the corresponding loop in M777_EGCase II. The superposition of the 103S_EGCase I-GM1 complex with the M777_EGCase II-GM3 structure suggested that the long loop 4 in M777_EGCase II might hamper the binding of GM1, which provides a structural explanation for the enhanced activity of its loop 4-deleted mutant. Loop 2 is another important region that differs in the structures of the two enzymes. Ser63 in M777_EGCase II appeared to be too close to the GalNAc residue in GM1. Indeed, the S63G mutation resulted in an activity enhancement of about 10-fold toward GM1 and at least 1370-fold toward fucosyl-GM1, implying that this mutation eliminates the steric hindrance and enlarges the sugar-binding pocket. More detailed analyses of this structural information may be helpful for further protein engineering of EGCases.
In conclusion, the biochemical and structural analyses in this study illustrate the structural basis of the substrate selectivity of EGCases. The broad specificity, high reaction efficiency, and ease of expression of 103S_EGCase I make it the best enzyme reported to date for use in the cellular glycomics analysis of GSLs. The structural knowledge obtained in this study revealed several regions that may be important for the substrate recognition of this enzyme class, providing possibilities for the rational design of these enzymes.
Experimental Procedures
Phylogenetic Analysis-Twenty-one members of the GH5_12, GH5_27, GH5_28, and GH5_29 subfamilies, with at least five members in each subfamily, were selected from the CAZy database. The protein sequences sharing >50% sequence identity with EGCase I from Rhodococcus sp. M-750 were collected by a BLAST search of the NCBI non-redundant protein sequence database. The sequences of the EGCases and EGCase-related proteins were aligned using ClustalX. Evolutionary analyses were conducted and visualized in MEGA6 (28). The phylogenetic tree of the EGCases was constructed using the maximum likelihood method based on the JTT matrix-based model with 100 bootstrap replications.
Protein Expression and Purification-The gene encoding the mature EGCase I from R. equi 103S, which lacks its 26-residue N-terminal secretion signal sequence, was codon-optimized for E. coli and chemically synthesized (Genscript Corp., Nanjing, China). The gene sequence was subcloned into a pET28a vector (Novagen, Madison, WI) using the BamHI/HindIII restriction sites and was transformed into E. coli BL21 (DE3) pLysS cells. Transformants were grown at 37°C in Luria-Bertani medium containing 100 μg/ml kanamycin until the optical density at 600 nm reached ~0.8. Then protein expression was induced by the addition of isopropyl β-D-1-thiogalactopyranoside to a final concentration of 0.1 mM at 16°C. After 12 h, the cells were harvested and disrupted by sonication, and the enzyme was purified by Ni2+-chelating affinity chromatography to >95% purity, as determined by SDS-PAGE analysis. The protein concentration was determined using a bicinchoninic acid protein assay with BSA as the standard. The activity of EGCase I was confirmed using a thin layer chromatography (TLC) assay with the substrate GM1, as described previously (15).
Enzymatic Assay and Kinetics-The activities of 103S_EGCase I and M777_EGCase II were measured in a standard enzymatic assay using GM1 as substrate. The reaction mixture contained 10 nmol of GM1 and an appropriate amount of enzyme in 20 μl of 50 mM sodium acetate buffer (pH 6.0) with 0.1% (w/v) Triton X-100. Following an incubation at 37°C for 10 min (103S_EGCase I) or 30 min (M777_EGCase II), the reaction was stopped by heating the mixture in a boiling water bath for 5 min to ensure that the initial velocity was measured. The generation of the GM1 oligosaccharide was measured using the HPLC-based protocol described by Neville et al. (29), with slight modifications. Briefly, a 20-μl sample of the reaction mixture was mixed with 100 μl of the 2-AA solution in 1.6-ml polypropylene screw cap freeze vials. The vials were capped tightly and heated at 80°C for 45 min for derivatization. After centrifugation at 13,000 rpm, the supernatant was transferred into a glass vial, and an aliquot (10 μl) was injected into a TSK gel-Amide 80 column (4.6-mm inner diameter × 250 mm, 5-μm particle size). Solvent system A consisted of 5% acetic acid and 3% triethylamine in water, and solvent system B consisted of 2% acetic acid in acetonitrile. The following gradient conditions were used: 30% A isocratic for 5 min followed by a linear increase to 70% A over 1 min, holding at 70% A for an additional 4 min, and then a linear decrease to 30% A over 3 min. The 2-AA-derivatized product was detected using a fluorescence detector (Agilent 1260 FLD, Eex = 360 nm, Eem = 425 nm) and quantified using a standard curve.
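The gradient program described above can be written down as a piecewise-linear table of time points. The sketch below encodes the stated segments (5 min at 30% A, 1 min ramp to 70% A, 4 min hold, 3 min ramp back to 30% A) and interpolates the solvent composition at any time; the names GRADIENT and percent_a are illustrative, and it assumes that solvent B makes up the remainder of the mobile phase.

```python
# (time in min, % solvent A) break points taken from the gradient description.
GRADIENT = [(0.0, 30.0), (5.0, 30.0), (6.0, 70.0), (10.0, 70.0), (13.0, 30.0)]


def percent_a(t_min: float) -> float:
    """Linearly interpolate the percentage of solvent A at time t_min."""
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-13 min gradient program")
    for (t0, a0), (t1, a1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return a0 + (a1 - a0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: break points cover the full range")


def percent_b(t_min: float) -> float:
    """Percentage of solvent B, assuming a binary A/B mobile phase."""
    return 100.0 - percent_a(t_min)


for t in (0.0, 5.5, 8.0, 11.5, 13.0):
    print(f"t = {t:4.1f} min: {percent_a(t):.0f}% A / {percent_b(t):.0f}% B")
```

Keeping the program as a table of break points makes it easy to check that the total run time implied by the text is 13 min.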
The substrate specificity of EGCase was presented as the specific activity toward different substrates using 10 ng of 103S_EGCase I or 100 ng of M777_EGCase II in a standard enzymatic assay. Substrate specificity was also presented as the reaction yield (percentage) for different substrates after a 24-h reaction in the presence of a sufficient concentration of enzyme. The HPLC method used to detect the oligosaccharides after 2-AA derivatization was similar to the method used for the GM1 oligosaccharides, except that the mobile phase ratio of A/B was adjusted according to the polarity of the released oligosaccharides (supplemental Fig. S2). For the kinetic analysis, 103S_EGCase I (10 ng) was incubated at 37°C for 10 min in 20 μl of reaction buffer. M777_EGCase II (100 ng) was assayed at 37°C for 30 min in 20 μl of reaction buffer. The concentrations of the substrates ranged from 10 to 2000 μM. The parameters Km and kcat were obtained by fitting the experimental data to the Michaelis-Menten kinetics model using GraphPad Prism version 5 software.
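To make the fitting step concrete, the following minimal sketch estimates Km and Vmax from initial velocities. It is not the nonlinear regression performed in GraphPad Prism; it uses the Hanes-Woolf linearization ([S]/v = [S]/Vmax + Km/Vmax) with ordinary least squares so that it runs with the standard library only. The data are synthetic, generated from assumed values (Km = 100 μM, Vmax = 5 in arbitrary units) over the 10 to 2000 μM range mentioned above, and do not reproduce any measurement from this study.

```python
def fit_michaelis_menten(substrate, velocity):
    """Estimate (Km, Vmax) by Hanes-Woolf linear regression.

    Regresses y = [S]/v on x = [S]; slope = 1/Vmax, intercept = Km/Vmax.
    """
    if len(substrate) != len(velocity) or len(substrate) < 2:
        raise ValueError("need at least two paired observations")
    if any(v <= 0 for v in velocity):
        raise ValueError("velocities must be positive")
    n = len(substrate)
    x = list(substrate)
    y = [s / v for s, v in zip(substrate, velocity)]
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("substrate concentrations must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return km, vmax


# Synthetic, noise-free data from assumed parameters (illustration only).
TRUE_KM, TRUE_VMAX = 100.0, 5.0
concentrations_um = [10.0, 50.0, 100.0, 500.0, 2000.0]
velocities = [TRUE_VMAX * s / (TRUE_KM + s) for s in concentrations_um]

km_est, vmax_est = fit_michaelis_menten(concentrations_um, velocities)
print(f"Km = {km_est:.1f} uM, Vmax = {vmax_est:.2f} (arbitrary units)")
```

With real, noisy data a direct nonlinear fit of v = Vmax[S]/(Km + [S]) is generally preferred because linearizations distort the error structure; kcat then follows from Vmax divided by the molar enzyme concentration.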
Crystallization-Crystallization experiments were conducted in 48-well plates using the hanging drop vapor diffusion [...] the E339S mutant with GM1 or GM3. The GM1 or GM3 substrate was dissolved in the protein solution to a final concentration of 10 mM and incubated for 2 h at 4°C before proceeding with the hanging drop crystallization experiments described above.
Data Collection and Structure Determination-For X-ray diffraction experiments, crystals were removed from the crystallization drop with a nylon loop, soaked briefly in a cryoprotectant solution consisting of the crystallization solution supplemented with 30% (v/v) ethylene glycol, and flash-cooled in liquid nitrogen. X-ray diffraction data sets were collected on the BL17U and BL19U beamlines at the Shanghai Synchrotron Radiation Facility. All diffraction data were indexed, integrated, and scaled using HKL-2000 (31).
Initial phases for each structure were determined by molecular replacement. The structure of 103S_EGCase I was solved using the program BALBES (32) with the Auto-Rickshaw pipeline (33). The structure was completed with alternating rounds of manual model building with Coot (34) and refinement with REFMAC5 (35) in the CCP4 suite (36). The structures of the 103S_EGCase I-substrate complexes were determined by molecular replacement with the program MOLREP (37) using the 103S_EGCase I structure as a search model. The structures of GM1 and GM3 were built with Coot Ligand Builder, and restraints were created using PRODRG (38). Iterative model building was performed in Coot, and refinement was conducted with REFMAC5 in the CCP4 suite. The final models were validated using MolProbity (39). Data collection and refinement statistics are provided in Table 5.
Structural Analysis-Searches for similar structures were performed using the DALI server (21). Structure-based sequence alignments were generated with PROMALS3D (40). The alignments were shaded in ESPript version 3.0 (41). Fucosyl-GM1 was modeled in the binding site by superimposing the structure on GM1, and the fucosyl unit was adjusted to a reasonable conformation in Coot. Figures were prepared using Chimera (42).