The Crystal Structure of Galacto-N-biose/Lacto-N-biose I Phosphorylase

Galacto-N-biose/lacto-N-biose I phosphorylase (GLNBP) from Bifidobacterium longum, a key enzyme for intestinal growth, phosphorolyses galacto-N-biose and lacto-N-biose I with anomeric inversion. GLNBP homologues are often found in human pathogenic and commensal bacteria, and their substrate specificities potentially define the nutritional acquisition ability of these microbes in their habitat. We report the crystal structures of GLNBP in five different ligand-binding forms. This is the first three-dimensional structure of glycoside hydrolase (GH) family 112. The GlcNAc- and GalNAc-bound forms provide structural insights into distinct substrate preferences of GLNBP and its homologues from pathogens. The catalytic domain consists of a partially broken TIM barrel fold that is structurally similar to a thermophilic β-galactosidase, strongly supporting the current classification of GLNBP homologues as one of the GH families. Anion binding induces a large conformational change by rotating a half-unit of the barrel. This is an unusual example of molecular adaptation of a TIM barrel scaffold to substrates.

Recently, two GH112 GalHexNAcPs from human pathogenic bacteria were reported to exhibit substrate specificities distinct from those of GLNBP. GalHexNAcPs from Clostridium perfringens ATCC13124 and Vibrio vulnificus CMCP6 are specific for GNB and LNB, respectively (4,6), and are therefore named galacto-N-biose phosphorylase (GNBP) and lacto-N-biose I phosphorylase (LNBP). C. perfringens is found in the gastrointestinal tract of humans, and some strains cause gas gangrene, septicemia, and food poisoning (7,8). V. vulnificus is often detected in the marine environment (9), and infection by this bacterium causes severe symptoms and high mortality in humans (10). The distinct substrate specificities of GalHexNAcPs in these commensal or pathogenic bacteria may be related to nutritional acquisition in their habitats (4).
Interestingly, most of the members of GH112 (GLNBP gene homologues) are found in human commensal or pathogenic bacteria as hypothetical proteins through genome projects. GNB and LNB structures are often found in biologically functional glycoconjugates of animals. GNB exists in O-glycans of mucin glycoproteins as core 1 or T-antigen disaccharide and also in glycosphingolipids (11-13). LNB exists in glycolipids as blood type antigens and human milk oligosaccharides (14-16). These two similar disaccharides are important oligosaccharides in cell surface glycoconjugates, and cancer-associated antigens have structures related to them (17-19). The presence of the genes in commensal and pathogenic bacteria suggests that the enzymes play an important role in their growth in human tissues by utilizing GNB and/or LNB as nutrients.
Bifidobacteria naturally colonize the human intestinal tract and have health-promoting effects, such as prevention of diarrhea (20-22), especially in infants (23-26). The GNB/LNB pathway of Bifidobacterium longum JCM1217 involves a GNB- and LNB-specific ABC transporter (27). In addition, the pathway includes a set of intracellular enzymes that channel the reaction products of GLNBP (Gal1P, GalNAc, and GlcNAc) in their entirety into the glycolytic and amino sugar metabolic pathways (28). Before cellular uptake, GNB and LNB are liberated from their natural substrates by the bifidobacterial extracellular enzymes endo-α-N-acetylgalactosaminidase (29), 1,2-α-L-fucosidase (30), and lacto-N-biosidase (31). We have hypothesized that a key role of this pathway is to metabolize LNB in human milk oligosaccharides (1,28). In contrast to the gut contents of formula-fed infants, the guts of breast-fed infants generally contain microflora dominated by bifidobacteria (32-35). Therefore, the selective growth of bifidobacteria observed in breast-fed infants has been attributed to human milk oligosaccharides (32,36-38), and our hypothesis explains its molecular basis.
The molecular scaffolds and mechanisms of sugar phosphorylases are interesting, because they have converged from various evolutionary origins. For example, pyridoxal phosphate-dependent glycogen phosphorylase shares structural and mechanistic similarities with typical NDP-dependent glycosyltransferases (GTs) (39) and is classified in the GT35 family. In contrast, some phosphorylases are very similar to standard GHs in their structures and reaction mechanisms (40). Therefore, sugar phosphorylases are classified in both the GH and GT classes in the CAZy Database. Another important aspect of phosphorylases is their ability to produce oligosaccharides due to the reversible nature of their reactions (41). GLNBP has in fact been used to prepare kilogram quantities of LNB (42). As a candidate for the bona fide bifidus factor, LNB attracts considerable interest in applications to improve infant health. In this study, we determined the crystal structures of GLNBP from B. longum JCM1217 in several ligand-binding forms and elucidated the molecular mechanism of the reaction and the structural basis for the substrate specificity of GalHexNAcPs. This is the first three-dimensional structure of a GH112 enzyme.

EXPERIMENTAL PROCEDURES
Enzyme Preparation-The nonlabeled GLNBP protein was expressed in Escherichia coli strain BL21 (DE3) and purified as described previously (2). The selenomethionine-labeled enzyme was expressed in the methionine auxotroph E. coli strain B834 (DE3) (Novagen). The purification procedures for the selenomethionine-labeled enzyme were the same as those for the nonlabeled enzyme. The point mutations of R32E, N166A, R210E, R358E, Y362F, Y362N, and F364N were made using the QuikChange site-directed mutagenesis kit (Stratagene). The oligonucleotide primers are listed in Table S1. Preparation, purification, and activity measurements of the mutant enzymes were done as described previously (2).
Crystallization and Data Collection-Ligand-free and selenomethionine-labeled GLNBP crystals were obtained at 4°C using the sitting drop vapor diffusion method by mixing 2 μl of a protein solution with 2 μl of a reservoir solution composed of 0.1 M sodium cacodylate (pH 6.5), 0.2 M Mg(NO₃)₂, and 15% (v/v) polyethylene glycol 4000. Rodlike crystals (0.1 × 0.1 × 0.2 mm) grew within 3-7 days. The GalNAc complex form crystals were obtained by co-crystallization using a reservoir solution containing 100 mM GalNAc. The GlcNAc complex and GlcNAc-NO₃-ethylene glycol (EG) complex form crystals were obtained by co-crystallization using a reservoir solution containing 100 mM GlcNAc. Crystals of the GlcNAc-SO₄ complex were obtained by co-crystallization using a reservoir solution containing 100 mM GlcNAc and 50 mM MgSO₄. The crystals were transferred to a reservoir solution containing 15% (v/v) glycerol (selenomethionine, ligand-free, GalNAc or GlcNAc complex, and GlcNAc-SO₄) or 15% (v/v) EG (GlcNAc-NO₃-EG) and then flash-cooled in a stream of cold nitrogen gas at 95 K. The data set for the selenomethionine-labeled crystal was collected at wavelengths of 0.9645 Å (remote), 0.9793 Å (peak), and 0.9797 Å (edge). X-ray diffraction data sets were collected using synchrotron radiation (beamlines BL-5A, BL-17A, and NW12A; Photon Factory, Tsukuba, Japan). The data sets were processed and scaled using HKL2000 (43). The statistics for data collection and processing are given in Table 1.
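The three wavelengths used for the selenomethionine data set can be expressed as photon energies with E = hc/λ, which makes their position relative to the selenium K absorption edge (near 12.66 keV) easier to see. The following is a minimal sketch; the function name and the dictionary layout are illustrative choices and are not part of the data processing described here.

```python
# Convert X-ray wavelengths (in Å) to photon energies (in keV) using E = hc / wavelength.
HC_KEV_ANGSTROM = 12.398419843  # h*c expressed in keV·Å


def wavelength_to_kev(wavelength_angstrom: float) -> float:
    """Return the photon energy in keV for a wavelength given in Å."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


# Wavelengths reported in the text for the selenomethionine-labeled crystal.
mad_wavelengths = {"remote": 0.9645, "peak": 0.9793, "edge": 0.9797}
mad_energies = {name: wavelength_to_kev(w) for name, w in mad_wavelengths.items()}

for name, energy in mad_energies.items():
    print(f"{name:>6}: {energy:.3f} keV")
```

The shorter the wavelength, the higher the energy, so the remote data set lies on the high-energy side of the peak and edge data sets.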
Phase Calculation and Refinement-The autoSHARP (44,45) and SOLVE/RESOLVE (46) programs were used for site detection of selenium, phase calculation, and initial model building of the multiple-wavelength anomalous dispersion data set. The resulting initial model was used as a search model for molecular replacement of the nonlabeled GLNBP data set with the MOLREP program (47). Visual inspection of the models, introduction of water molecules, and crystallographic refinement were achieved using Coot (48) and Refmac5 (49). The refinement statistics and contents in the asymmetric unit (ASU) are given in Table 2. The figures were prepared using PyMol (50), and the structural alignment was performed by LSQMAN (51). Pairwise structural comparisons were carried out using the whole structure, whereas the closed state and other structures were superimposed by excluding the half-barrel unit and C-terminal domain.
Docking Studies-The AutoDock program version 4.0 (52) was used for the automated docking of ligands to the GLNBP active site. The LNB, GNB, and Gal1P ligand models were prepared with the PCModel program (Serena Software, Bloomington, IN) and optimized using the MMX force field. Rotatable ligand bonds (13 in LNB and GNB and five in Gal1P) were defined using the AutoDockTools interface. The closed state subunit of the GlcNAc-NO₃-EG form was prepared for docking by removing the water molecules except for those included in the GlcNAc/GalNAc interactions. Retaining these water molecules (shown in Fig. 3) was essential for GlcNAc docking into subsite +1. EG and GlcNAc were removed for LNB/GNB docking, and EG and NO₃⁻ were removed for Gal1P docking. One of the two alternative conformations of the Arg-32 side chain in the closed state subunit of the GlcNAc-NO₃-EG form was selected based on its conformation in the GlcNAc-SO₄ form. After adding polar hydrogens, Gasteiger charges were calculated for the ligand and protein. Grid maps were prepared with 40 × 40 × 40 points for LNB and GNB, covering subsites -1 and +1, and with 40 × 30 × 30 points for Gal1P, covering subsite -1 and the anion binding site, with a point spacing of 0.375 Å. The Lamarckian genetic algorithm of the AutoDock 4.0 package performed 200 iterations with a maximum of 27,000 generations per iteration, a population size of 50 individuals, and a maximum of 2.5 × 10⁶ energy evaluations per generation. The results of the iterations were clustered so that no cluster member deviated >1.0 Å from other cluster members. After docking, all structures generated for a single compound were assigned to clusters based on a tolerance of 1.0 Å for all-atom root mean square (r.m.s.) deviations from the lowest-energy structure. The results of cluster analyses are shown in Table S2. The best docking results with the lowest E_inter values are shown in Fig. 3C and Fig. S2C.
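The clustering step described above can be sketched as a greedy, energy-ordered procedure: poses are visited from the lowest to the highest energy, and each pose joins the first cluster whose lowest-energy member lies within the 1.0 Å tolerance. This is a simplified illustration and not the AutoDock implementation; it ignores ligand symmetry, and the poses and energies below are hypothetical.

```python
import math


def rmsd(pose_a, pose_b):
    """All-atom r.m.s. deviation between two poses given as equal-length lists of (x, y, z)."""
    if len(pose_a) != len(pose_b) or not pose_a:
        raise ValueError("poses must be non-empty and contain the same number of atoms")
    total = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(pose_a, pose_b)
    )
    return math.sqrt(total / len(pose_a))


def cluster_poses(poses, energies, tolerance=1.0):
    """Greedy clustering: each cluster is a list of pose indices whose first
    element (the seed) is the lowest-energy member of that cluster."""
    order = sorted(range(len(poses)), key=lambda i: energies[i])
    clusters = []
    for i in order:
        for cluster in clusters:
            if rmsd(poses[i], poses[cluster[0]]) <= tolerance:
                cluster.append(i)
                break
        else:
            # No existing seed is close enough, so this pose seeds a new cluster.
            clusters.append([i])
    return clusters


# Hypothetical two-atom poses and docking energies (kcal/mol), for illustration only.
poses = [
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    [(0.2, 0.0, 0.0), (1.7, 0.0, 0.0)],
    [(5.0, 0.0, 0.0), (6.5, 0.0, 0.0)],
]
energies = [-7.2, -6.9, -5.1]
clusters = cluster_poses(poses, energies)
print(clusters)
```

Comparing every pose only with the cluster seed keeps the lowest-energy structure as the reference, which matches the description of deviations measured "from the lowest-energy structure".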

RESULTS
Crystallography-Five refined crystal structures are presented here, and they are hereafter designated according to the ligands found in the active site: ligand-free form; GalNAc complex; GlcNAc complex; quaternary complex with GlcNAc, nitrate, and ethylene glycol (GlcNAc-NO₃-EG); and ternary complex with GlcNAc and SO₄²⁻ (GlcNAc-SO₄) (Table 2). These structures also contain crystallization reagents (Mg²⁺ and NO₃⁻) and cryoprotectants (glycerol or EG) at their molecular surfaces. Ligand-free, GalNAc complex, GlcNAc complex, and GlcNAc-SO₄ crystals belong to space group P1 and contain four subunits per ASU, whereas GlcNAc-NO₃-EG belongs to space group C2 and contains two subunits per ASU. The ASU contents in these crystal structures are summarized in Table 2. In all crystal forms, each pair of molecules forms a dimer, as shown in Fig. 2A. This should correspond to the dimeric form of this enzyme in solution (1).
Overall Structure of Ligand-free Form-Fig. 2B shows a ribbon diagram of the monomer structure of the ligand-free form. The GLNBP monomer consists of four domains. The first is the (β/α)₈ barrel (TIM barrel) domain (residues 3-70 and 181-437; blue). The second domain, the Ig-like fold domain (residues 71-180; green), is inserted between β-3 and α-3 of the TIM barrel (Fig. S1). The remaining two regions form the α/β fold (residues 438-695; yellow) and the C-terminal β-sheet (residues 696-750; red) domains. The dimer interface is mainly formed by the Ig-like and α/β fold domains (Fig. 2A), and the buried molecular surface area is about 2,060 Å²/subunit. Structures of the four subunits in the ASU are almost identical; the r.m.s. deviations for Cα atoms between all pairs are within 0.5 Å, except for the relatively flexible half-barrel unit and the C-terminal domain (for discussion, see below).
GalNAc and GlcNAc Complex-In the GalNAc complex form, each subunit in the ASU holds a GalNAc molecule (Table 2). It is bound at the center of the C-terminal loop side of the TIM barrel domain. All four GalNAc molecules in the ASU are in an α-anomeric state and take a standard ⁴C₁ conformation (Fig. 3A). In addition to the hydrophobic interactions that recognize the pyranose moiety of GalNAc, two direct and several water-mediated hydrogen bonds are present. One direct hydrogen bond is formed between the O3 atom of GalNAc and the side chain of Asp-313. This residue has been proposed to be the catalytic proton donor residue based on a mutational analysis (2). Another direct hydrogen bond is formed between the carbonyl oxygen atom of the N-acetyl group and the Nε atom of the Trp-233 side chain. Most of the water-mediated hydrogen bonds are formed at one side of the GalNAc molecule (O1, O5, and O6). The water molecules are held by the side chains of Tyr-165, Glu-228, and Ser-612 (from the other subunit), and by the main chain carbonyl oxygen atom of His-460.
The overall structure of the GalNAc complex is very similar to that of the ligand-free form (Fig. S2B). In the complex form, however, an α-helix region in the Ig-like domain (residues 160-170) shifts toward the ligand (Fig. 4C, green), and the Cα displacement is 1.5 Å at Val-162 (Fig. 4A). This region participates in the active site formation and partially covers the substrate-binding pocket. Hereafter, this region is referred to as the "160-170 helix." The structural change is accompanied by a side chain switch of Pro-161 and Gln-217 at the root of the 160-170 helix (Fig. 4C). Because this conformation exhibits slight but clear active site closure compared with the "open" state of the ligand-free form, we designate it as the "semiclosed" state.
We have also determined the structure of the complex with GlcNAc, which is the C4 epimer of GalNAc. The binding interactions of the GlcNAc molecule as well as the overall structure are almost identical with those of the GalNAc complex (Fig. S3A).
GlcNAc-NO₃-EG Complex-In the GlcNAc-NO₃-EG complex form, the two subunits in the ASU form a dimer (Fig. 2A). One of the two subunits takes a semiclosed state whose active site pocket contains a GlcNAc molecule, as in the GalNAc and GlcNAc complex forms. The other subunit contains EG and NO₃⁻ as well as a GlcNAc molecule in the active site pocket (Fig. 3B). In comparison with the ligand-free form, this subunit exhibits a large domain closure at the substrate binding site (Figs. 2C and 4B). The conformational change is prominent, because about 50% of the TIM barrel elements (half-barrel unit, β1-α1-β2-α2-β3, and β7-α7-β8-α8 formed by residues 3-50 and 355-437) and the C-terminal domain (residues 696-750) shift toward the ligand molecules (r.m.s. deviation = 2.7 Å; Fig. 4A). The half-barrel unit rotates about 10° around a pivot point located at the bottom of the TIM barrel (Fig. 4B). The maximal difference is present at Gly-371 (6.8 Å), which is located at the C-terminal loop of the half-barrel unit. Hereafter, this conformation is referred to as the "closed" state. Note that the barrel structure of the closed state is highly deformed (Fig. 2C), whereas the β-strands in the "open" state form an almost perfect circle (Fig. 2B). In this crystal form, the dimer is clearly asymmetric, since the two subunits are in distinct conformational states. Moreover, the two active sites are clearly separated, and there is no apparent interaction between them (Fig. S2A). These results are in agreement with the nonallosteric behavior of this enzyme.
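The two quantities quoted for the domain closure, an r.m.s. deviation over the mobile region and a maximal displacement at a single residue, can both be derived from per-residue Cα distances between two models that have already been superimposed on their static parts. The sketch below shows that calculation in its simplest form; the coordinates are hypothetical and chosen only to keep the arithmetic easy to follow, and the superposition itself (performed here with LSQMAN) is not reproduced.

```python
import math


def ca_displacements(coords_ref, coords_mov):
    """Distances (Å) between matched Cα atoms of two pre-superimposed models."""
    if len(coords_ref) != len(coords_mov) or not coords_ref:
        raise ValueError("both models need the same non-zero number of Cα atoms")
    return [math.dist(a, b) for a, b in zip(coords_ref, coords_mov)]


def rms(values):
    """Root mean square of a list of distances."""
    return math.sqrt(sum(v * v for v in values) / len(values))


# Hypothetical Cα coordinates keyed by residue number (not taken from the structures).
open_state = {369: (10.0, 4.0, 2.0), 370: (12.0, 6.0, 3.0), 371: (14.0, 8.0, 4.0)}
closed_state = {369: (11.0, 4.0, 2.0), 370: (12.0, 8.0, 3.0), 371: (14.0, 8.0, 7.0)}

residues = sorted(open_state)
shifts = ca_displacements(
    [open_state[r] for r in residues],
    [closed_state[r] for r in residues],
)
rmsd_value = rms(shifts)
largest_shift, most_shifted = max(zip(shifts, residues))
print(f"r.m.s. deviation = {rmsd_value:.2f} Å; largest shift at residue {most_shifted}")
```

Because the r.m.s. value averages over all residues in the selection, it is always smaller than or equal to the largest single displacement, as with the 2.7 Å and 6.8 Å values reported above.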
Interactions with the GlcNAc molecule are virtually the same as in the semiclosed state (Fig. 3B). One of the hydroxyl groups in EG forms a hydrogen bond with the side chain of Asn-166, which is located in the 160-170 helix. The hydrophobic part of EG interacts with the Tyr-362 and Phe-364 side chains at the C terminus of β7 in the TIM barrel. The movement of the half-barrel unit allows these hydrophobic residues to approach the EG binding site. The binding site of NO₃⁻ is surrounded by three arginine residues (Arg-32, Arg-210, and Arg-358), and they form direct hydrogen bonds with NO₃⁻. Arg-32 has two alternative conformations (Fig. 4D), one of which directly interacts with NO₃⁻.
GlcNAc-SO₄ Complex-The crystal of the GlcNAc-SO₄ complex was obtained by co-crystallization with 50 mM MgSO₄ and 100 mM Mg(NO₃)₂. Three of the four subunits in the ASU are in the semiclosed state with a GlcNAc molecule in the active site (Table 2), and one subunit is in the closed state. However, a part of the electron density corresponding to the half-barrel unit is disordered (Fig. S2C). A strong electron density peak is present near the GlcNAc molecule in the active site of the subunit in the closed state (Fig. S3B). Considering its shape and height, the peak definitely comes from SO₄²⁻. The SO₄²⁻ ion forms hydrogen bonds with the side chain hydroxyl group of Tyr-362 as well as with the three arginine residues. In contrast to the GlcNAc-NO₃-EG complex form, Arg-32 in this form is clearly in a single conformation. The tetrahedral structure of SO₄²⁻ seems to be preferable to the planar triangular structure of NO₃⁻ for this binding site. These results suggest that the SO₄²⁻ binding site is also suitable for phosphate binding.
Docking Analysis-To investigate the possible interactions with the galactose moiety of natural substrates or products, we performed automated docking analysis of LNB and Gal1P in the closed state structure of the GlcNAc-NO₃-EG complex form. Several water molecules in the substrate binding pocket were retained to maintain the interactions around the ligands. The best results of the docking analysis with LNB and Gal1P are shown in Fig. 3C and Fig. S3C. The GlcNAc moiety of the docked LNB almost entirely overlaps with the crystallographic GlcNAc molecule (r.m.s. deviation = 0.43 Å). The galactose moiety of LNB also overlaps with the EG molecule in the crystal structure, suggesting that this site is subsite -1. This moiety is recognized by the side chains of Asn-166, Asp-313, Tyr-362, and Phe-364. The former two residues form bifurcated hydrogen bonds with the equatorial O3 and axial O4 atoms of the galactose moiety, whereas the latter two create hydrophobic interactions. When Gal1P is docked, the galactose and phosphate moieties are positioned at the binding sites for EG and SO₄²⁻, respectively. The side chain of Asn-166 again forms bifurcated hydrogen bonds, suggesting that this residue is important for recognizing the galactose moiety. The docking analysis with GNB produced a result very similar to that with LNB under the same conditions (data not shown). We could not obtain any reliable models when the open and semiclosed state structures were subjected to the docking analysis (data not shown).
Mutational Analysis-Kinetic parameters of the synthetic reaction of mutant enzymes were measured to confirm the importance of the residues in the ligand binding sites (Table 3). Mutants at the three arginine residues in the anion binding site (R32E, R210E, and R358E) showed no detectable activity, indicating that they are all critical for the reaction. Mutations at the putative galactose binding site (N166A, Y362F, Y362N, and F364N) also severely impaired the activity. In all of the mutant enzymes, the Km values for Gal1P increased, and the kcat values significantly decreased. Elimination of the side chain of Tyr-362 (Y362N) resulted in the complete loss of the activity, and the loss of the hydroxyl group (Y362F) resulted in about 1,000-fold reduction of the catalytic efficiency. These results are consistent with the docking results, since Tyr-362 is suggested to interact hydrophobically with the galactose pyranose ring and to form a hydrogen bond with phosphate. The activity of a mutant at the putative proton donor (D313N) was undetectable (2). All of these residues (Arg-32, Asn-166, Arg-210, Asp-313, Arg-358, Tyr-362, and Phe-364) are fully conserved in GH112 GalHexNAcPs (Fig. S1).
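A fold reduction in catalytic efficiency, such as the roughly 1,000-fold value quoted for Y362F, is the ratio of kcat/Km for the wild type to kcat/Km for the mutant. The sketch below shows the arithmetic; the (kcat, Km) pairs are hypothetical placeholders and are not the values of Table 3.

```python
def catalytic_efficiency(kcat: float, km: float) -> float:
    """Return kcat / Km (for example in s^-1 mM^-1 for kcat in s^-1 and Km in mM)."""
    if km <= 0:
        raise ValueError("Km must be positive")
    return kcat / km


def fold_reduction(wild_type, mutant) -> float:
    """Ratio of wild-type to mutant catalytic efficiency; each argument is (kcat, Km)."""
    return catalytic_efficiency(*wild_type) / catalytic_efficiency(*mutant)


# Hypothetical (kcat in s^-1, Km in mM) pairs, for illustration only.
wild_type = (100.0, 2.0)
mutant = (1.0, 20.0)

reduction = fold_reduction(wild_type, mutant)
print(f"catalytic efficiency reduced about {reduction:.0f}-fold")
```

In this made-up case a 100-fold drop in kcat combined with a 10-fold rise in Km gives a combined loss of about three orders of magnitude, which illustrates how the two parameters contribute multiplicatively.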

DISCUSSION
Reaction Mechanism of GLNBP-GLNBP forms a ternary complex with two substrates (i.e. phosphate and GNB/LNB) or two products (i.e. Gal1P and GalNAc/GlcNAc) during the reaction, because its phosphorolytic reaction follows the sequential bi-bi mechanism (6). Our results strongly suggest that the EG and anion binding sites correspond to those of galactose and phosphate, respectively. The anion binding site is in close proximity to the anomeric carbon of the galactose moiety of the docked LNB. In addition, a hydrogen bond is formed between the glycosidic bond oxygen and the Asp-313 side chain (Fig. 3), which has been suggested to be the catalytic proton donor (2). These structural features suggest that the inverting phosphorolytic reaction begins with a direct nucleophilic attack by phosphate on the anomeric carbon, but the evidence for the mechanism has yet to be obtained. This type of mechanism has also been proposed for cellobiose phosphorylase, the other inverting sugar phosphorylase in GH94 (53-55). In the crystal structures of cellobiose phosphorylase, subsite +1 is occupied by glucose, and the adjacent subsite -1 and phosphate binding site are occupied by glycerol and SO₄²⁻ or phosphate, respectively.
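For reference, the initial rate of a sequential bi-bi reaction in the absence of products is commonly written as v = Vmax[A][B] / (KiA·KB + KB[A] + KA[B] + [A][B]). The constant term KiA·KB reflects the requirement for a ternary complex and is what distinguishes this rate law from that of a ping-pong mechanism. The sketch below evaluates the textbook expression; the parameter names and numerical values are hypothetical and are not measurements for GLNBP.

```python
def bi_bi_initial_rate(a, b, vmax, k_a, k_b, k_ia):
    """Initial rate of a sequential bi-bi reaction without products.

    a, b  : concentrations of the two substrates (e.g. disaccharide and phosphate)
    k_a   : Michaelis constant for substrate A
    k_b   : Michaelis constant for substrate B
    k_ia  : dissociation constant of the enzyme-A complex
    """
    denominator = k_ia * k_b + k_b * a + k_a * b + a * b
    if denominator <= 0:
        raise ValueError("kinetic constants and concentrations must give a positive denominator")
    return vmax * a * b / denominator


# Hypothetical constants in arbitrary but consistent units, for illustration only.
params = dict(vmax=10.0, k_a=1.0, k_b=1.0, k_ia=1.0)

rate_equal = bi_bi_initial_rate(1.0, 1.0, **params)
rate_saturating = bi_bi_initial_rate(1e6, 1e6, **params)
print(rate_equal, rate_saturating)
```

The rate is zero when either substrate is absent and approaches Vmax only when both substrates are saturating, as expected for a mechanism in which both must be bound before catalysis.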
Efficient catalysis of the hydrolytic or phosphorolytic cleavage of β-glucosidic bonds is considered to require distortion of the glycon sugar ring (55-58). In this study, we tried to dock LNB with several distorted conformers at its glycon (e.g. ¹S₃, ¹S₅, and ²Sₒ at the galactosyl sugar ring), but we could not obtain any reasonably docked models. The docking procedures used here may not be sensitive enough to determine the most probable pretransition state conformer, because the structural differences are small between these conformers, or some rearrangements of the GLNBP active site may be required to accept a distorted galactose moiety. Alternatively, the conformational itinerary of β-galactosidases (or phosphorylases acting on β-galactosyl bonds) may be different from those of β-glucosidases because of potential clashes between the axial C4 hydroxyl and β-anomeric substituents. Espinosa et al. (59) conducted an NMR study of a nonhydrolyzable C-glycoside analogue of lactose bound to E. coli GH2 β-galactosidase. They suggested that the galactopyranose ring of the bound analogue is not in a distorted conformation (e.g. a half-chair or sofa), because the intensities of the H-1′/H-3′ or H-1′/H-5′ intraresidue nuclear Overhauser effects were strong. Conformational changes during β-galactosidic bond cleavage reactions need to be studied further.
Substrate Specificity Determinants of GalHexNAcPs-Major determinants of the substrate specificities of GalHexNAcPs are envisaged to be located around the GalNAc/GlcNAc binding site (subsite +1). In particular, the Val-162 residue of GLNBP, which is located very close to the axial O4 atom of GalNAc (Fig. 3A), is the primary candidate for the determinants. GNBP shows strong preference for GNB (Km = 1.9 mM) compared with LNB (Km = 26 mM) (6). It has a threonine residue at the position corresponding to Val-162 in GLNBP (Fig. S1). Therefore, there would be a favorable polar interaction between the side chain of the threonine and the GalNAc axial O4 atom in GNBP. The factors for LNB preferences in LNBP are more complicated, because both GLNBP and LNBP retain a valine residue at this position. One possibility is that large insertions and deletions in the α/β domain affect the substrate specificity of LNBP (Fig. S1). His-460 and Ser-612 of GLNBP, which are involved in subsite +1 recognition (Figs. 3 and 4, C and D), are both located in the α/β domain. Perhaps the insertions and deletions alter the tertiary and/or quaternary structures around the sugar binding site and prevent GNB binding to LNBP. In the future, many more putative GalHexNAcP (GH112 GLNBP homologue) genes will be found, since a number of genomic and metagenomic projects on human-related microbes are now in progress worldwide (e.g. through the International Human Microbiome Project) (60). Our study will provide useful information on the possible substrate preferences for those putative GalHexNAcPs retained by microbes living around human bodies.
Structural Neighbor of GLNBP-A structural homology search using the Secondary Structure Matching server (61) indicated that GLNBP is most similar to a GH42 β-galactosidase from Thermus thermophilus A4, A4-β-galactosidase (Protein Data Bank code 1KWK, Q score = 0.110, r.m.s. deviation = 3.73 Å for 370 residues) (62). A4-β-galactosidase consists of three domains: a TIM barrel, an α/β fold, and β-sheet domains. The structural similarity between GLNBP and A4-β-galactosidase ranges from the TIM barrel domain to the subsequent α/β fold domain (Fig. S4). Superimposition of the TIM barrel domain alone is shown in Fig. 5A (Q score = 0.150, r.m.s. deviation = 2.90 Å for 242 Cα atoms). Their TIM barrel domains have a similar "comma-like" shape with a long α-4 helix. A close-up view of the superimposition at the active site is shown in Fig. 5B. A4-β-galactosidase is a retaining GH, and the two catalytic residues, acid/base (Glu-141) and nucleophile (Glu-312), are located on the opposite side of the bound galactose. Interestingly, the catalytic components of A4-β-galactosidase overlap well with the general acid residue (Asp-313), the phosphate binding site, and subsite -1 (EG binding site) of GLNBP. The side chain positions of Asp-313 in GLNBP and Glu-141 in A4-β-galactosidase coincide, although they come from neighboring β-strands in the barrel scaffold; the former is from β5, and the latter is from β4. On the opposite side, the bound anion (NO₃⁻) of GLNBP overlaps with the side chain carboxyl group of the nucleophile residue of A4-β-galactosidase (Glu-312). An arginine residue involved in anion binding (Arg-358) and Glu-312 of A4-β-galactosidase are both located in the same position at the β7 strand of the barrel scaffold. Another interesting overlap was observed between Arg-32 of GLNBP (anion binding) and Arg-32 in A4-β-galactosidase, both being located at the same position in the barrel (β2 strand). Arg-32 in A4-β-galactosidase is a highly conserved residue and is considered to control the pKa of the nucleophile residue (62,63).
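The Q score reported by the Secondary Structure Matching server balances the number of aligned residues against the r.m.s. deviation and the sizes of both chains. Assuming the definition usually given for SSM, Q = Nalign² / ((1 + (r.m.s.d./R0)²)·N1·N2) with R0 = 3 Å, the dependence can be sketched as follows. The chain lengths used in the example are placeholders rather than values stated in this paper, so the output should be read as an order-of-magnitude illustration only.

```python
def q_score(n_aligned, rmsd, n_res_1, n_res_2, r0=3.0):
    """Q score under the assumed SSM definition.

    n_aligned : number of aligned residues
    rmsd      : r.m.s. deviation of the aligned Cα atoms (Å)
    n_res_1/2 : total residue counts of the two compared chains
    r0        : empirical distance scale (Å); 3.0 is the assumed default
    """
    if n_res_1 <= 0 or n_res_2 <= 0:
        raise ValueError("chain lengths must be positive")
    return n_aligned ** 2 / ((1.0 + (rmsd / r0) ** 2) * n_res_1 * n_res_2)


# Aligned length and r.m.s. deviation are taken from the text; the two chain
# lengths are hypothetical placeholders chosen only for illustration.
example_q = q_score(n_aligned=370, rmsd=3.73, n_res_1=750, n_res_2=650)
print(f"Q = {example_q:.3f}")
```

Under this definition, identical structures give Q = 1, and the score falls as either the r.m.s. deviation grows or the aligned region covers a smaller fraction of the two chains.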
In summary, the positions of the nucleophile (phosphate in GLNBP and Glu-312 in A4-β-galactosidase), subsite -1 (galactoside moiety in both enzymes), and the proton donor (Asp-313 and Glu-141) are located at the same positions in GLNBP and A4-β-galactosidase. Although there is no clear evidence, the structural overlaps of the catalytic components as well as overall structural similarity suggest a possible evolutionary relationship between the inverting phosphorylase (GLNBP) and the retaining GH (A4-β-galactosidase). Moreover, the catalytic mechanism of GLNBP seems to be similar to that of the GH-type inverting phosphorylase, GH94. Our data strongly support the current classification of GLNBP homologues as one of the GH families, GH112.
Conformational Changes on Ligand Binding-The 18 subunit structures in five crystals that were determined can be classified into three distinctive conformational states (Table 2). Subunits in the "semiclosed" state always contain a GlcNAc or GalNAc molecule in the active site, whereas the "open" conformation has no ligand. The structural difference is not large but is clear at regions important for sugar binding at subsite +1 (Fig. 4C). The formation of hydrophobic interactions induces the movement of Pro-161, Val-162, Tyr-165, Gln-217, and Trp-233 on sugar binding. The movements cause Cα atom shifts of about 1 Å around these residues, including the 160-170 helix (Fig. 4A, blue line). Notably, the Oδ2 atom of the catalytically important Asp-313 residue shifts about 1.6 Å to form a hydrogen bond with the O3 atom of GlcNAc/GalNAc. These structural changes can be interpreted as a small induced fit (see Movie S1).
Both "closed" state subunits contain an anion (NO₃⁻ or SO₄²⁻) at their active sites (Table 2, subunit A in GlcNAc-NO₃-EG and subunit D in GlcNAc-SO₄). The mobile domains in the closed state subunits (half-barrel unit and C-terminal domain) are not involved in crystal packing, indicating that this feature is not a crystal artifact. The closed state is characterized by a notable conformational change at the half-barrel unit of the catalytic domain (Fig. 4B). This conformational change simultaneously creates a hydrogen bond network between the anion and the three arginine residues in the half-barrel unit (Fig. 4D). The electrostatic interaction between these elements may drive the large conformational change.