Structural Basis for the Substrate Specificity of a Novel β-N-Acetylhexosaminidase StrH Protein from Streptococcus pneumoniae R6*

The β-N-acetylhexosaminidase (EC 3.2.1.52) from glycoside hydrolase family 20 (GH20) catalyzes the hydrolysis of the β-N-acetylglucosamine (NAG) group from the nonreducing end of various glycoconjugates. The putative surface-exposed N-acetylhexosaminidase StrH/Spr0057 from Streptococcus pneumoniae R6 was proved to contribute to the virulence by removal of β(1,2)-linked NAG on host defense molecules following the cleavage of sialic acid and galactose by neuraminidase and β-galactosidase, respectively. StrH is the only reported GH20 enzyme that contains a tandem repeat of two 53% sequence-identical catalytic domains (designated as GH20-1 and GH20-2, respectively). Here, we present the 2.1 Å crystal structure of the N-terminal domain of StrH (residues Glu-175 to Lys-642) complexed with NAG. It adopts an overall structure similar to other GH20 enzymes: a (β/α)8 TIM barrel with the active site residing at the center of the β-barrel convex side. The kinetic investigation using 4-nitrophenyl N-acetyl-β-d-glucosaminide as the substrate demonstrated that GH20-1 had an enzymatic activity (kcat/Km) of one-fourth compared with GH20-2. The lower activity of GH20-1 could be attributed to the substitution of active site Cys-469 of GH20-1 to the counterpart Tyr-903 of GH20-2. A complex model of NAGβ(1,2)Man at the active site of GH20-1 combined with activity assays of the corresponding site-directed mutants characterized two key residues Trp-443 and Tyr-482 at subsite +1 of GH20-1 (Trp-876 and Tyr-914 of GH20-2) that might determine the β(1,2) substrate specificity. Taken together, these findings shed light on the mechanism of catalytic specificity toward the β(1,2)-linked β-N-acetylglucosides.

Streptococcus pneumoniae is a commensal Gram-positive, encapsulated pathogen that is widely distributed in nature. It causes acute pneumonia, otitis media, meningitis, and several other serious diseases that lead to the death of millions of people worldwide annually. During colonization and infection of the human host, S. pneumoniae encounters a variety of glycoconjugates, including mucin, host defense molecules, and glycans associated with the epithelial surface. To deglycosylate these host glycoconjugates, a number of S. pneumoniae glycosidases have evolved (1). Genome sequencing studies suggested that a large and diverse array of glycosidases is necessary for full virulence of the pneumococci (2,3). Among them, three surface-exposed exoglycosidases, neuraminidase, β-galactosidase, and StrH, have been previously demonstrated to sequentially remove sialic acid, galactose, and NAG to expose the mannose on glycoconjugates of the host defense molecules such as human secretory component, lactoferrin, and immunoglobulin A1 (4). This cleavage was proposed to alter the clearance function of these molecules, facilitating the persistence of Streptococci in the airway. Moreover, the monosaccharides liberated from these glycoconjugates could be utilized for the growth of bacteria (5).
The N-acetylhexosaminidase StrH from S. pneumoniae is a pneumococcal surface protein that was proved to be a virulence factor (4,6). It belongs to glycoside hydrolase family 20 (GH20) (7-9), the members of which catalyze the hydrolysis of the β1-linked NAG group from the nonreducing end of various glycoconjugates, such as glycans, glycoproteins, and glycolipids (10). Members of this family, such as N-acetylhexosaminidase (EC 3.2.1.52) and lacto-N-biosidase (EC 3.2.1.140), have activities toward different substrates. They have been postulated to have specialized physiological functions, including posttranslational modification of N-glycans, degradation of glycoconjugates, as well as egg-sperm recognition (10). StrH is a 1312-residue protein that contains a tandem repeat of two GH20 domains. The recombinant protein had β-N-acetylhexosaminidase activity and cleaved the β(1,2)-linked NAG group from human defense molecules (11). The tandemly connected GH20 domains are indispensable for StrH to hydrolyze versatile physiological glycans or glycoconjugates. The primary sequences of both GH20 domains are homologous to the following GH20 enzymes of known structure: Serratia marcescens CHB (12), human HexA and HexB (13,14), Actinobacillus actinomycetemcomitans DspB (15), Streptomyces plicatus Hex (16), Streptococcus gordonii GcnA (17), Paenibacillus sp. TS12 Hex (18), and Ostrinia furnacalis Hex1 (19). Despite the catalytic domain having a common (β/α)8 TIM barrel fold harboring the active site, these GH20 enzymes demonstrate versatile physiological functions and specificity toward diverse substrates.
* This work was supported by Ministry of Science and Technology of China Grant 2009CB918804 and National Natural Science Foundation of China 30870488. The on-line version of this article (available at http://www.jbc.org) contains supplemental Figs. S1 and S2. Both authors contributed equally to this work.
All GH20 enzymes of known structure have only one catalytic GH20 domain and cleave either the β(1,4)- or β(1,6)-linked NAG group from diverse glycoconjugates. In contrast, StrH possesses a tandem repeat of two GH20 domains (designated as GH20-1 and GH20-2, respectively) and cleaves substrates with a β(1,2)-linkage. Here we report the 2.1 Å crystal structure of the first GH20 domain of StrH in complex with NAG. Although it shares a typical (β/α)8 TIM barrel core structure with other GH20 members, two loops at the entrance of the substrate binding pocket are distinct. Structural comparison in combination with enzymatic analyses revealed that residues Trp-443 and Tyr-482, which reside on these two loops, respectively, are crucial for the stringent specificity toward the β(1,2) substrate. This is the first report of the structural basis for the cleavage of the β(1,2)-linkage catalyzed by an N-acetylhexosaminidase.

Cloning, Expression, and Purification of StrH and Its Mutants-The coding region of the strH gene was amplified from the genomic DNA of S. pneumoniae R6. Genes encoding StrH and its mutants were individually cloned into a pET28a-derived expression vector with an N-terminal His6 tag and overexpressed in Escherichia coli strain BL21-RIL (DE3) (Novagen) using LB culture medium (10 g of NaCl, 10 g of Bacto-Tryptone, and 5 g of yeast extract/liter). The transformed cells were grown at 37°C in LB medium containing 30 μg/ml kanamycin and 34 μg/ml chloramphenicol until the A600 nm reached ~0.6. Expression of the recombinant proteins was then induced with 0.2 mM isopropyl β-D-1-thiogalactopyranoside for another 20 h at 16°C before harvesting. The cells were collected and resuspended in 40 ml of lysis buffer (50 mM HEPES, pH 7.5, 150 mM NaCl). After 2.5 min of sonication and centrifugation at 12,000 × g for 25 min, the supernatant containing the target protein was collected and loaded onto a nickel-nitrilotriacetic acid column (GE Healthcare) equilibrated with the binding buffer (50 mM HEPES, pH 7.5, 150 mM NaCl). The target protein was eluted with 300 mM imidazole and further loaded onto a Superdex 200 column (GE Healthcare; 50 mM HEPES, pH 7.5, 150 mM NaCl). Fractions containing the target protein were combined and concentrated to 10 mg/ml for crystallization. Samples for enzymatic activity assays were collected from the highest peak fractions without concentration. The purity of the protein was assessed by electrophoresis, and the protein sample was stored at −80°C.
The selenomethionine (Se-Met)-labeled StrH protein was expressed in E. coli strain B834 (DE3) (Novagen). Transformed cells were grown at 37°C in Se-Met medium (M9 medium with 25 μg/ml Se-Met and the other essential amino acids at 50 μg/ml) containing 30 μg/ml kanamycin until the A600 nm reached ~0.6 and were then induced with 0.2 mM isopropyl β-D-1-thiogalactopyranoside for 20 h at 16°C. Se-Met-substituted His6-StrH was purified in the same manner as native His6-StrH.
Site-directed mutagenesis was performed using the QuikChange site-directed mutagenesis kit (Stratagene, La Jolla, CA) with the plasmid encoding the wild-type StrH as the template. The mutant proteins were expressed, purified, and stored in the same manner as the wild-type protein.
Crystallization, Data Collection, and Processing-Both native and Se-Met-substituted StrH (amino acids 175-642) were concentrated to 10 mg/ml by ultrafiltration (Millipore Amicon) for crystallization. Crystals were grown at 16°C using the hanging drop vapor diffusion method, with the initial condition of mixing 1 μl of protein solution with an equal volume of the reservoir solution (30% PEG 4000, 0.2 M ammonium acetate, 0.1 M sodium citrate tribasic dihydrate, pH 5.6). The crystals were transferred to cryoprotectant (reservoir solution supplemented with 25% glycerol) and flash-cooled with liquid nitrogen. The Se-Met derivative data for a single crystal were collected at 100 K in a liquid nitrogen stream using beamline 17U with an MX225 CCD (MARresearch) at the Shanghai Synchrotron Radiation Facility. All of the diffraction data were integrated and scaled with the program HKL2000 (20).
Structure Determination and Refinement-The crystal structure of StrH was determined using the single-wavelength anomalous dispersion phasing method (21) from a single Se-Met-substituted protein crystal to a maximum resolution of 2.1 Å. The Autosol program from PHENIX (22) was used to locate the heavy atoms, and the phases were calculated and further improved with the program SOLVE/RESOLVE (23,24). Electron density maps showed clear features of secondary structural elements. Automatic model building was carried out using Autobuild in PHENIX. The initial model was refined using the maximum likelihood method implemented in REFMAC5 (25) as part of the CCP4i (26) program suite and rebuilt interactively using the program COOT (27). The final model was evaluated with the programs MOLPROBITY (28) and PROCHECK (29). Crystallographic parameters are listed in Table 1. All of the structure figures were prepared with PyMOL (30).
Enzymatic Activity Assay-The enzyme kinetic parameters of native StrH and its mutants were measured using 4-nitrophenyl N-acetyl-β-D-glucosaminide (pNp-NAG; Sigma) as the substrate, with product formation quantified against a standard curve of 4-nitrophenol, as described by Prag et al. (31) with minor changes. All of the assays were performed at 37°C in a buffer containing 50 mM NaH2PO4, pH 7.5. The reactions were initiated by the addition of StrH. The increase in absorption at 405 nm was monitored continuously using a DU800 spectrophotometer (Beckman Coulter, Fullerton, CA). Michaelis-Menten parameters (Vmax and Km) were extracted from these data by nonlinear fitting to the Michaelis-Menten equation using the program Origin 7.5.
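The nonlinear fit described above can be illustrated with a minimal pure-Python sketch. This is not the Origin 7.5 procedure used in the study. It is an assumed stand-in that exploits the fact that, for a fixed Km, the best Vmax has a closed form, so only Km needs a one-dimensional search. The function names, the search bounds, and the example data are all hypothetical.

```python
def best_vmax(substrate, rates, km):
    """Closed-form least-squares Vmax for a fixed Km."""
    x = [s / (km + s) for s in substrate]
    return sum(xi * v for xi, v in zip(x, rates)) / sum(xi * xi for xi in x)


def sse(substrate, rates, km):
    """Sum of squared residuals at the best Vmax for this Km."""
    vmax = best_vmax(substrate, rates, km)
    return sum((v - vmax * s / (km + s)) ** 2 for s, v in zip(substrate, rates))


def fit_michaelis_menten(substrate, rates, n_grid=200, n_iter=100):
    """Return (Vmax, Km) minimizing the squared error of v = Vmax*S/(Km+S)."""
    # Assumed search range for Km, spanning well beyond the tested concentrations.
    lo, hi = 0.01 * min(substrate), 100.0 * max(substrate)
    # Coarse logarithmic scan over Km to bracket the minimum.
    grid = [lo * (hi / lo) ** (i / (n_grid - 1)) for i in range(n_grid)]
    best = min(range(n_grid), key=lambda i: sse(substrate, rates, grid[i]))
    a, b = grid[max(best - 1, 0)], grid[min(best + 1, n_grid - 1)]
    # Golden-section refinement inside the bracket.
    phi = (5 ** 0.5 - 1) / 2
    c, d = b - phi * (b - a), a + phi * (b - a)
    for _ in range(n_iter):
        if sse(substrate, rates, c) < sse(substrate, rates, d):
            b, d = d, c
            c = b - phi * (b - a)
        else:
            a, c = c, d
            d = a + phi * (b - a)
    km = (a + b) / 2
    return best_vmax(substrate, rates, km), km


# Synthetic, noise-free example data (NOT measurements from this study).
conc = [0.05, 0.1, 0.25, 0.5, 1.0, 2.0, 4.0]   # substrate concentration, mM
rate = [2.0 * s / (0.5 + s) for s in conc]      # generated with Vmax=2.0, Km=0.5
vmax_fit, km_fit = fit_michaelis_menten(conc, rate)

# kcat = Vmax / [E]; the enzyme concentration here is a made-up value and must
# be in the same concentration units as the rate (per unit time).
enzyme_conc = 0.001
kcat_fit = vmax_fit / enzyme_conc
```

With real, noisy data the same routine applies unchanged. Weighting and parameter uncertainties, which dedicated fitting software reports, are deliberately left out of this sketch.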
Preparation of 1-Phenyl-3-methyl-5-pyrazolone (PMP) Derivatives of Saccharides-PMP derivatization of saccharides was carried out as described previously (32, 33) with minor changes. Briefly, each 10-μl reaction mixture was terminated by mixing with an equal volume of 0.3 M aqueous NaOH, and 10 μl of a 0.5 M methanol solution of PMP was added. Each mixture was allowed to react for 30 min at 70°C, then cooled to room temperature, and neutralized with 10 μl of 0.3 M HCl. The resulting solution was extracted with 100 μl of chloroform. After vigorous shaking and centrifugation, the organic phase was carefully discarded to remove the excess reagents. The extraction process was repeated three times; then the aqueous layer was diluted with 40 μl of water before HPLC analysis.
HPLC Analysis-The assays were performed at 37°C in a 10-μl system containing 50 mM citric acid/sodium phosphate buffer, pH 5.0, and the disaccharide NAGβ(1,2)Man (Dextra, UK) at a series of concentrations. The reactions were triggered by adding the purified protein solution and terminated by mixing with an equal volume of 0.3 M NaOH. After PMP derivatization as described above, the mixture was centrifuged at 12,000 × g for 10 min, and the supernatant was analyzed with an HPLC system (Agilent 1200 Series). Mannose and NAG standards were quantified by HPLC analysis using a series of concentrations ranging from 0.1 to 5 mM. Acetonitrile and 100 mM K2HPO4/KH2PO4, pH 7.0, were mixed to give a final mixture of 20% acetonitrile, which was used for equilibration of the column (Eclipse XDS-C18 column, 4.6 × 150 mm; Agilent) and separation of the components at a flow rate of 1 ml/min. The samples were injected in volumes of 10 μl. Retention times of monosaccharides were determined by separation of standard monosaccharide solutions in 10-μl injections, individually as well as in mixture. Three independent kinetic determinations were made to calculate the means and standard deviations for the reported Km and kcat values.
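The quantification steps above (a standard curve over 0.1 to 5 mM, then means and standard deviations from three determinations) can be sketched as follows. This is an illustrative outline only: the peak areas, the replicate values, and the helper names are invented for the example and are not data or software from this study, and a linear detector response is assumed.

```python
from statistics import mean, stdev


def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def concentration_from_area(area, slope, intercept):
    """Invert the standard curve to convert a peak area to a concentration."""
    return (area - intercept) / slope


# Hypothetical standard curve: monosaccharide standards spanning 0.1-5 mM.
std_conc_mM = [0.1, 0.5, 1.0, 2.5, 5.0]
std_area = [12.0, 60.0, 120.0, 300.0, 600.0]   # made-up peak areas
slope, intercept = linear_fit(std_conc_mM, std_area)

# Convert a (made-up) sample peak area into a product concentration in mM.
sample_mM = concentration_from_area(240.0, slope, intercept)

# Mean and sample standard deviation over three independent determinations
# (hypothetical Km values, arbitrary units).
km_replicates = [1.0, 1.2, 1.4]
km_mean, km_sd = mean(km_replicates), stdev(km_replicates)
```

Here `stdev` is the sample (n − 1) standard deviation, which is the usual choice for a small number of replicates. The text does not state which estimator was used.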

RESULTS AND DISCUSSION
Domain Organization of StrH-The 1312-residue StrH is composed of six distinct domains: a putative N-terminal signal peptide of 34 residues, a tandem repeat of two GH20 domains, followed by two G5 domains and a C-terminal domain that anchors the protein to the cell wall (Fig. 1A). The two 53% sequence-identical GH20 domains of β-N-acetylhexosaminidase are bridged by a linker of 67 residues. Each G5 domain is a module of ~80 residues that is found in a variety of enzymes such as Streptococcal IgA peptidases and various glycoside hydrolases in bacteria. These enzymes are usually involved in metabolism of bacterial cell walls and related to the adhesive function to the host (34,35). The C-terminal cell wall anchor domain contains a canonical LPXTG motif, which is covalently anchored to the cell wall upon cleavage of the LPXTG sequence by a transpeptidase (a sortase) (6).
Overall Structure of StrH-N-In an initial effort to obtain the full-length structure of StrH, we found that the protein was degraded during purification. Limited proteolysis combined with liquid chromatography-mass spectrometry enabled us to define a relatively stable fragment comprising residues Glu-175-Lys-642 (termed StrH-N). We obtained the Se-Met derivative crystals of StrH-N in the presence of NAG and determined the complex structure at 2.1 Å by the single-wavelength anomalous diffraction phasing method. The structure was refined to R/Rfree factors of 19.2%/22.1% and showed good geometry as determined by the programs MOLPROBITY (28) and PROCHECK (29). Each asymmetric unit contains two identical subunits (A and B), with an overall root mean square deviation of 0.7 Å over 422 Cα atoms. Residues Asn-181-Ala-630 in subunit A and residues Asn-181-Val-559 and Thr-567-Asn-609 in subunit B are well fitted in the final model. In addition, a molecule of NAG could be well defined at the active site of each subunit. An interface of ~500 Å² between the two subunits indicated that StrH-N exists as a monomer in solution, which was confirmed by size exclusion chromatography (data not shown).
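For readers unfamiliar with the R/Rfree statistics quoted above, the following sketch shows the standard definition R = Σ||Fo| − |Fc|| / Σ|Fo| and the idea of setting aside a small test set (5% in this study) that is excluded from refinement. It is a conceptual illustration, not the REFMAC5 implementation. Scaling between observed and calculated amplitudes is omitted, and the function names and numbers are hypothetical.

```python
import random


def r_factor(f_obs, f_calc):
    """Crystallographic R = sum(||Fo| - |Fc||) / sum(|Fo|), without scaling."""
    num = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    den = sum(abs(fo) for fo in f_obs)
    return num / den


def split_work_free(n_reflections, free_fraction=0.05, seed=0):
    """Randomly assign reflection indices to a working set and a free set."""
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = max(1, round(free_fraction * n_reflections))
    return sorted(indices[n_free:]), sorted(indices[:n_free])


# Made-up amplitudes, for illustration only.
f_obs = [100.0, 200.0, 300.0]
f_calc = [90.0, 210.0, 300.0]
example_r = r_factor(f_obs, f_calc)     # (10 + 10 + 0) / 600

# With 100 reflections, 5% gives 5 free and 95 working reflections.
work, free = split_work_free(100)
```

R is computed over the working set and Rfree over the free set. A small gap between the two, as in the 19.2%/22.1% reported here, is generally taken to indicate that the model is not overfitted.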
Overall structure comparison of GH20-1 using the Dali server (http://ekhidna.biocenter.helsinki.fi/dali_server/) (36) gave 53 hits for eight unique proteins with a Z-score higher than 17.0. All of these proteins were members of the GH20 family. Although GH20-1 of StrH shares a sequence identity of less than 17% with these GH20 enzymes, they all share a highly conserved central (β/α)8-barrel. The major differences are from the segments beyond the core structure, including the loops, helices α2, α4, and α6 and β-strands (β3 and β4), because of frequent sequence variation, deletion, or insertion in these regions of all glycosidases.
Table 1 footnotes: Fo and Fc are the observed and calculated structure-factor amplitudes, respectively. Rfree was calculated with 5% of the data excluded from the refinement. RMSD, root mean square deviation from ideal values. The categories were defined by Molprobity. The values in parentheses refer to statistics in the highest bin.
The Active Site-Previous studies suggested a substrate-assisted catalytic mechanism for GH20 enzymes (12,16,37). First, a catalytic glutamate residue acts as a general acid to catalyze the hydrolysis of the glycosidic linkage. As the glycosidic linkage is breaking, the anomeric carbon migrates toward the oxygen atom of the 2-acetamido group of the substrate to form a cyclic NAG-oxazolinium ion intermediate. Then a water molecule activated by the deprotonated glutamate attacks the anomeric center to form the final product. Another catalytic residue, an aspartate, is responsible for stabilizing the positively charged nitrogen of the 2-acetamido group (12,16,31,38). At the center of the (β/α)8 TIM barrel convex side of StrH-N, there is a molecule of NAG, the pyranose ring of which adopts a chair-like conformation (Fig. 2A). Three residues, Phe-415, Trp-439, and Trp-517, form a hydrophobic pocket to accommodate the hydrophobic pyranose ring. In addition, several polar residues at the active site form a hydrogen-bond network with NAG. The Nη2 and Nη1 of Arg-196 form two hydrogen bonds with O3 and O4 of NAG, respectively, whereas the carboxyl group of Asp-519 also makes two hydrogen bonds with O4 and O6 of NAG, respectively. Asn-225 interacts with O3 of NAG through a water molecule, Wat-1. The catalytic residue Asp-360 forms a hydrogen bond with NAG via Wat-1, whereas another catalytic residue, Glu-361, is stabilized by two hydrogen bonds with Nε and Nη2 of Arg-324.
Compared with the structure of N-acetylhexosaminidase SpHex in complex with NAG (Protein Data Bank 1M01, 2.1 Å), most residues at the active site of GH20-1 are structurally conserved. The catalytic residue Asp-360 adopts a similar pose to Asp-313 of SpHex. Three aromatic residues, Phe-415, Trp-439, and Trp-517, of StrH-N could be well superimposed with Trp-344, Trp-361, and Trp-442 in SpHex, respectively. In addition, residues Arg-196, Glu-519, and Asn-225 in StrH-N adopt a similar conformation to their counterparts in SpHex (Fig. 2B). However, there are significant differences in the active site between these two enzymes. In SpHex, the hydroxyl group of Tyr-393 donates a hydrogen bond to the oxygen atom of the 2-acetamido group. This hydrogen bond is important for catalysis, because it can fix the carbonyl oxygen of the NAG 2-acetamido group in the position for nucleophilic attack on the anomeric carbon C1 (16). However, SpHex-Tyr-393 is substituted by StrH-Cys-469, which has no interaction with NAG. Moreover, Tyr-393 in SpHex is conserved in other GH20 enzymes of known structure, such as Tyr-669 of SmCHB and Tyr-278 of AaDspB. This substitution of Tyr by Cys-469 in GH20-1 results in NAG adopting a distinct conformation. In SpHex, the 2-acetamido group of NAG turns back to its pyranose ring, and consequently the carbonyl oxygen lies under the mean plane of the pyranose ring and shows a distance of 2.5 Å to the anomeric carbon C1 for nucleophilic attack. In contrast, the 2-acetamido group in StrH-N points outwards against the pyranose ring, which results in a larger distance between the carbonyl oxygen of the 2-acetamido group and the anomeric carbon C1 (4.1 Å) (Fig. 2B). 
This conformation of NAG in GH20-1 is much more similar to that in the structure of the SpHex D313N mutant (Protein Data Bank 1M04) (38), in which the mutation has been shown to prevent the 2-acetamido group from providing efficient anchimeric assistance, resulting in a large reduction in enzymatic activity. Furthermore, another remarkable difference is that the catalytic residue Glu-361 adopts a distinct orientation compared with its counterpart, Glu-314 in SpHex (Fig. 2B). In SpHex, Glu-314 points to the substrate-binding site, and its carboxyl group makes a hydrogen bond (2.5 Å) with the anomeric hydroxyl of NAG. This conformation of Glu-314 favors its general acid-base catalysis of the glycosidic linkage. However, the carboxyl group of Glu-361 of StrH-N points 10.6 Å away from NAG and is stabilized by two hydrogen bonds formed with Nε and Nη2 of Arg-324 (Fig. 2A). The E361A mutant of GH20-1 had a Km comparable with that of the wild-type protein but a kcat of only one-fifth, suggesting that Glu-361 functions in catalysis (Table 2). Taken together, NAG at the active site of StrH-N adopts a conformation that is not suitable for catalysis. This is mainly due to the effect of Cys-469.
Figure legend (panel B): overall structure of GH20-1 (cyan) together with the linker (magenta) and the N-terminal α-helix of GH20-2 (cyan). The secondary structural elements were labeled sequentially. The NAG molecule at the active site of GH20-1 is shown as green sticks.
Activity Analyses of the Two GH20 Domains-The kinetic parameters of the wild-type, truncated, and mutant proteins were determined using pNp-NAG as the substrate (Table 2). GH20-1 had an activity comparable with that of the fragment covering GH20-1 and the linker. This is also the case for GH20-2, indicating that the linker has no effect on the activity of either individual GH20-1 or GH20-2 (Table 2). In addition, GH20-1 had an enzymatic activity (kcat/Km) one-fourth of that of GH20-2 and approximately one-fifteenth of that of GH20-1&2 (residues 175-977, covering both GH20-1 and GH20-2). Such a significantly lower activity of GH20-1 might be due to the presence of Cys-469 in GH20-1 in place of the counterpart Tyr-903 of GH20-2, because the counterpart Tyr residue in other GH20 enzymes was reported to favor catalysis by critically stabilizing the intermediate state of the substrate (16). In fact, the C469Y mutant of GH20-1 had an activity comparable with that of GH20-2 (Table 2).
Substrate Specificity of StrH-To date, all GH20 enzymes of known structure are reported to catalyze the hydrolysis of NAG with a β(1,4)- or β(1,6)-glycosidic linkage. The substrate-binding site could be divided into two subsites, termed subsites −1 and +1, respectively (39). Aromatic residues at subsite +1 were thought to play a crucial role in determining substrate specificity. For instance, Trp-685 at subsite +1 of SmCHB is responsible for stabilizing the +1 sugar in a proper pose, which is twisted 90° relative to the −1 NAG to facilitate the cleavage of the β(1,4)-glycosidic linkage (12). Similar conformations in other β(1,4)-N-acetylhexosaminidases were also observed, such as Trp-408 in SpHex (16), Trp-410 in PsHex (18), and Trp-490 in OfHex1 (19). However, in the structure of the β(1,6)-N-acetylhexosaminidase AaDspB, the open binding pocket contains no conserved aromatic residue for stacking the +1 sugar, which was proposed to favor the conformation of the β(1,6)-linked polymer (15).
Superposition of GH20-1 against SpHex yields an overall root mean square deviation of 2.9 Å over 293 Cα atoms, and 1.9 Å over 72 Cα atoms of the central (β/α)8-barrels. The main differences concern the loops connecting the central (β/α)8-barrel. It has been demonstrated that StrH specifically hydrolyzes the N-linked sugars with a glycosidic linkage of NAGβ(1,2)Man (11). For a better understanding of the structural basis of this substrate specificity, we attempted to obtain crystals in the presence of the disaccharide NAGβ(1,2)Man by either soaking or co-crystallization but were unsuccessful. Alternatively, we first calculated a putative substrate entrance path with the CAVER program (http://loschmidt.sci.muni.cz/caver/index.php). It revealed a dumbbell-shaped tunnel that was gated by two unique loops, LA (Trp-439-Ser-450) and LB (Cys-469-Asn-483) (Fig. 3A). LA between β8 and α8 exhibits an extended conformation, whereas LB connecting β9 and α9 shows a twisted conformation. Two aromatic residues, Trp-443 on LA and Tyr-482 on LB, were found to guard the entrance of the tunnel (Fig. 3A). Furthermore, we manually constructed a model with NAGβ(1,2)Man in the active site of our present structure by fixing the NAG moiety to subsite −1. The mannose at subsite +1 was fitted into the position with the best steric geometry (Fig. 3B). The results showed that the +1 mannose moiety of NAGβ(1,2)Man was twisted ~90° relative to the −1 NAG, packing against Trp-443 and Tyr-482. The distances from the +1 mannose moiety to Trp-443 and Tyr-482 are ~4.0 and 3.3 Å, respectively (Fig. 3B). Compared with SmCHB, subsite +1 of GH20-1 is structurally distinct, which could account for the different substrate specificity. Interestingly, sequence analysis revealed that these two aromatic residues are conserved in the GH20-1 and GH20-2 domains of StrH. The counterparts in GH20-2 are Trp-876 and Tyr-914, respectively.
To determine the specific roles of Trp-443 and Tyr-482 in GH20-1 (Trp-876 and Tyr-914 in GH20-2), enzymatic activity assays were performed using the wild-type proteins and the corresponding mutants. All of the protein samples were quality-controlled by circular dichroism spectroscopy, the results of which showed that the mutations did not introduce significant changes to the protein structures (supplemental Fig. S1). The disaccharide NAGβ(1,2)Man and pNp-NAG were used as the substrates of StrH. Compared with the wild-type enzyme, the W443A mutant of GH20-1 had comparable Km but much lower kcat values, resulting in enzymatic activities (kcat/Km) of ~1% toward pNp-NAG and one-fourth toward NAGβ(1,2)Man (Table 3). This suggested that Trp-443 at subsite +1 had no considerable effect on substrate binding but played an important role in enzymatic activity. The Y482A mutant of GH20-1 had a 2-fold increase in Km and one-third of the kcat value, leading to a much lower activity toward pNp-NAG. Notably, the Y482A mutation completely abolished the enzymatic activity toward NAGβ(1,2)Man. This indicated that Tyr-482 at subsite +1 is crucial for both substrate binding and catalysis and, more importantly, indispensable for the activity toward NAGβ(1,2)Man (Table 3). In addition, enzymatic activities toward pNp-NAG of the W443F and Y482F mutants of GH20-1, and the W876F and Y914F mutants of GH20-2, were measured. Compared with the wild-type enzymes, the kcat/Km values for the W443F and W876F mutants were approximately one-sixteenth and one-eighth, whereas they were approximately one-third for Y482F and comparable for Y914F, respectively (Table 3). On the other hand, all mutants to Phe showed somewhat higher enzymatic activities toward pNp-NAG compared with the corresponding mutants to Ala. Moreover, the double mutant W443A/Y482A of GH20-1 had no detectable activity toward either pNp-NAG or NAGβ(1,2)Man.
Similar results of the enzymatic activity were also observed for the Trp-876 and/or Tyr-914 mutants of GH20-2, suggesting the same roles of these residues at subsite +1 of GH20-2 (Table 3). Furthermore, the double mutant W443A/W876A of GH20-1&2 had significantly lower activities toward pNp-NAG and NAGβ(1,2)Man, whereas Y482A/Y914A completely abolished the activities. These results further confirmed the essential roles of these residues in substrate binding and catalysis for the full-length enzyme. To further investigate the substrate specificity of StrH, we also performed enzymatic assays toward chitobiose, which has a β(1,4)-linkage. The results showed that neither the wild-type StrH nor the Y482A or W443A mutants of GH20-1 (mutants Y914A or W876A of GH20-2) could hydrolyze the β(1,4)-linked chitobiose (data not shown). Multiple-sequence alignment revealed that the key residues at subsite +1 (Trp-443 and Tyr-482 in GH20-1 and Trp-876 and Tyr-914 in GH20-2) of StrH are highly conserved among Lactobacillales of Gram-positive bacteria (Fig. 4). We suggest that these proteins could also cleave substrates with a β(1,2)-linkage in a way similar to StrH.
During the second round of revision of our manuscript, we found the recently released structures of GH20-1 from S. pneumoniae TIGR4 complexed with NAGβ(1,2)Man (Protein Data Bank code 2YL8) and GH20-2 complexed with the pentasaccharide Nβ2Mα3M(Nβ4)β4N (Protein Data Bank code 2YLA), deposited by the group of Boraston et al. These two structures in combination with our structure made it possible for us to gain structural insights into the unique substrate specificity of StrH (11). The active site pockets of both GH20-1 and GH20-2, which are specific for the β(1,2)-linked NAG, could accommodate either of the branches of these oligosaccharides. In the structure of pentasaccharide-complexed GH20-2, the substrate Nβ2Mα3M(Nβ4)β4N is finely buried in the active site, with O4 of its +1 mannose directed outwards from the active site pocket (supplemental Fig. S2). Thus O4 of the +1 mannose could be further decorated by an NAG molecule. This explains why only the double branches of β(1,2)- and β(1,4)-linked NAG to the +1 mannose of the tetraantennary oligosaccharide could enter the active site for hydrolysis (11). In addition, only the full-length StrH of S. pneumoniae TIGR4, but not GH20-1, could hydrolyze the bisected biantennary oligosaccharide. Structural comparison of S. pneumoniae TIGR4 GH20-1 (Protein Data Bank 2YL8) and GH20-2 (Protein Data Bank 2YLA) revealed that the active site pockets are similar, both of which have two extended loops (LA and LB) and adopt similar conformations. The most significant difference is the variable sequences of the LB loops from GH20-1 and GH20-2. In particular, Tyr-482 in GH20-1 is substituted by Gly-914 in GH20-2. This substitution alters the active site accessibility and thus broadens the substrate specificity. Tyr-482 of GH20-1 partially occupies the space for the bisected NAG molecule in GH20-2 (supplemental Fig. S2), making it impossible for GH20-1 to accommodate the bisected biantennary oligosaccharide.
This explains why only the full-length StrH, but not GH20-1 alone, could hydrolyze the bisected biantennary oligosaccharide. More notably, multiple-sequence alignment revealed that some S. pneumoniae strains, such as R6 in our study, have a Tyr instead of Gly at the LB loop of StrH GH20-2. Thus we propose that StrH from these strains, including R6, could not hydrolyze the bisected biantennary oligosaccharide, possibly because of the steric hindrance of the Tyr residue at the active site. Distinct from other GH20 enzymes, Streptococcal StrH possesses a tandem repeat of two GH20 domains. The enzymatic activity of GH20-1&2 is significantly higher than the sum of the two individual GH20 domains (~15- and 4-fold higher than GH20-1 and GH20-2, respectively). These results suggest that GH20-1 and GH20-2 may have a synergistic effect during hydrolysis of the glycoconjugates of the host defense molecules. To further verify this effect, we simply mixed GH20-1 and GH20-2 at a 1:1 molar ratio and compared the activity of the mixture toward pNp-NAG with that of the tandem repeat (GH20-1&2) and the individual domains. The mixed GH20 domains had a Km value of 840 ± 7.3 μM, similar to that of either individual GH20 domain, and a kcat value of 6.6 ± 0.9 s⁻¹, which is comparable with the sum of that for the two domains. However, the mixed GH20 domains had an enzymatic activity (kcat/Km) of only approximately one-third that of GH20-1&2. The results indicated that the two tandemly repeated GH20 domains of StrH had a synergistic effect. This effect could probably be attributed to the flexible interdomain linker that enables the two GH20 domains to easily approach each other during hydrolysis. Because of the substitution of the critical Tyr to Cys at the active site, GH20-1 only showed an enzymatic activity of approximately one-fourth compared with GH20-2. A sequence homology search revealed that all orthologs in other Streptococci have a conserved Tyr at the corresponding site.
The Tyr → Cys mutation was only found in GH20-1 from the avirulent strain S. pneumoniae R6. The proposed attenuated activity of StrH toward the glycoconjugates of the human host is in agreement with the low virulence of the R6 strain.
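The synergy argument made above for the tandem GH20 domains rests on simple ratios of catalytic efficiencies, which the following sketch works through. It uses only the approximate fold values and the mixed-domain Km and kcat quoted in the text (with the Km read as 840 μM). The normalization to GH20-1 = 1 and the variable names are assumptions made for illustration, and the exact values in Table 2 are not reproduced here.

```python
# Relative catalytic efficiencies (kcat/Km), normalized so that GH20-1 = 1.
# Fold values are the approximate figures quoted in the text.
gh20_1 = 1.0
gh20_2 = 4.0 * gh20_1      # GH20-1 has ~one-fourth the activity of GH20-2
tandem = 15.0 * gh20_1     # GH20-1&2 is ~15-fold more active than GH20-1

# If the two domains acted independently, efficiencies would simply add.
additive = gh20_1 + gh20_2
synergy_fold = tandem / additive           # tandem vs additive expectation

# Cross-check: tandem relative to GH20-2 alone (text: ~4-fold).
tandem_vs_gh20_2 = tandem / gh20_2

# Catalytic efficiency of the 1:1 mixture from the values in the text:
# kcat = 6.6 s^-1 and Km = 840 uM (Km unit read as micromolar).
mixed_kcat_per_s = 6.6
mixed_km_mM = 840.0 / 1000.0
mixed_efficiency = mixed_kcat_per_s / mixed_km_mM   # mM^-1 s^-1

print(f"additive expectation: {additive:.1f}x GH20-1")
print(f"tandem / additive:    {synergy_fold:.1f}")
print(f"tandem / GH20-2:      {tandem_vs_gh20_2:.2f}")
print(f"mixed kcat/Km:        {mixed_efficiency:.2f} mM^-1 s^-1")
```

On these rounded numbers the tandem construct is about 3-fold more efficient than the additive expectation. That matches the observation that the simple 1:1 mixture reaches only about one-third of the GH20-1&2 activity, so the mixture behaves roughly additively whereas the covalently linked domains do not.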