Structural Elucidation of Dextran Degradation Mechanism by Streptococcus mutans Dextranase Belonging to Glycoside Hydrolase Family 66

suicide substrate of the enzyme, revealed that the epoxide ring reacted to form a covalent bond with the Asp-385 sidechain. These structures collectively indicated that Asp-385 was the catalytic nucleophile and Glu-453 the acid/base of the double displacement mechanism, in which the enzyme showed a retaining catalytic character. This is the first structural report for a GH-66 enzyme elucidating the enzyme's catalytic machinery.

Dextranase is an enzyme that hydrolyzes dextran α-1,6 linkages. Streptococcus mutans dextranase (SmDex) belongs to glycoside hydrolase family 66, produces isomaltooligosaccharides of various sizes, and consists of at least five amino acid sequence regions. The crystal structure of the conserved fragment from Gln-100 to Ile-732 of SmDex, devoid of its N- and C-terminal variable regions, was determined at 1.6 Å resolution and found to contain three structural domains. Domain N possessed an immunoglobulin-like β-sandwich fold, domain A the enzyme's catalytic module, comprising a (β/α)₈

Streptococcus mutans is a Gram-positive bacterium that has been implicated as a major cariogenic bacterium (1,2) and that metabolizes sugars, including sucrose, glucose, fructose and lactose, to lactic acid. As S. mutans accumulates on tooth surfaces as dental plaque, the lactic acid concentration increases and lowers the oral cavity pH, which leads to demineralization of tooth enamel, the origin of dental caries. Extracellular glucans, synthesized from sucrose by glucosyltransferases, are known to enhance S. mutans biofilm formation. Glucosyltransferases synthesize both water-soluble glucans, such as dextrans with linear α-1,6-linked glucose units, and water-insoluble glucans, which are sticky polysaccharides composed of glucose units predominantly in α-1,3-linkage, with various degrees of branching, and which are associated with bacterial cell adhesion to the tooth surface. Endodextranase (EC 3.2.1.11; 6-α-D-glucan-6-glucanohydrolase) from S. mutans (SmDex) is an enzyme that hydrolyzes α-1,6-linkages of dextran (1,3,4) and produces isomaltooligosaccharides (IGs) of various sizes; it has been shown to modify glucan structure by controlling the amount and content of extracellular glucans (5,6). SmDex belongs to the glycoside hydrolase family (GH)-66 according to the CAZy database (http://www.cazy.org/) (7). GH

encoding SmDex enzyme, was first cloned from S. mutans Ingbritt, and its nucleotide and amino acid (aa) sequences were determined (11). The expressed protein is composed of 850 aa residues with a molecular mass of 94.5 kDa, but the formation of protease-associated multiple isoforms has been reported; similar observations have also been reported for many other GH-66 dextranases of native and recombinant forms (12). According to amino acid sequence analysis of GH-66 enzymes, SmDex has been divided into five regions: a signal peptide sequence (N-terminal 24 aa), an N-terminal variable region (Ser-25 to Asn-99), a conserved region (Gln-100 to Ala-615), a glucan-binding site (Leu-616 to Ile-732) and a C-terminal variable region (Asn-733 to Asp-850) (13-15). Cycloisomaltooligosaccharide glucanotransferase (CITase) has an extra-long insertion of ~90 aa inside the dextranase conserved region (16).
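The five-region division of the 850-residue SmDex sequence given above can be written down as a small lookup table. The following is only an illustrative sketch: the names `SMDEX_REGIONS`, `region_of` and `region_lengths` are hypothetical and not part of any published tool, and the boundaries are simply those quoted in the text.

```python
# Sequence regions of SmDex as listed in the text
# (1-based, inclusive residue numbers; names are illustrative).
SMDEX_REGIONS = [
    ("signal peptide", 1, 24),
    ("N-terminal variable region", 25, 99),
    ("conserved region", 100, 615),
    ("glucan-binding site", 616, 732),
    ("C-terminal variable region", 733, 850),
]


def region_of(residue_number):
    """Return the name of the region that contains the given residue number."""
    for name, start, end in SMDEX_REGIONS:
        if start <= residue_number <= end:
            return name
    raise ValueError(f"residue {residue_number} is outside 1-850")


def region_lengths():
    """Return a dict mapping each region name to its length in residues."""
    return {name: end - start + 1 for name, start, end in SMDEX_REGIONS}


if __name__ == "__main__":
    for residue in (258, 385, 453, 700):
        print(residue, region_of(residue))
    print(region_lengths())
```

Under these boundaries, the residues discussed later as catalytically important (Asp-258, Asp-385, Glu-453) all fall within the conserved region, and the five region lengths add up to the full 850 residues.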
Biochemical studies using site-directed mutagenesis based on amino acid sequence comparison with other glucosyltransferases have revealed that Asp-385 is essential for the catalytic reaction (14

Å resolution (space group P2₁) were collected using an ADSC-Q270 CCD detector (Area Detector Systems Corp., Poway, CA, USA). Crystals were cryocooled in a nitrogen gas stream to 95 K. For structural analyses of the enzyme in complex with isomaltotriose (IG3; Seikagaku Corp., Tokyo, Japan) or 4′,5′-epoxypentyl-α-D-glucopyranoside (E5G) (24,25), SmDexTM crystals were soaked in a drop containing 5% (w/v, 99 mM) IG3 or 1 mM E5G in the precipitant solution for 10 min or 1 h, respectively, before the diffraction experiment. Diffraction data of the IG3 complex to 1.90 Å resolution were collected at 100 K, using R-AXIS VII imaging plate area detectors and Cu Kα radiation generated by a MicroMax007 rotating-anode generator (Rigaku Corp., Akishima, Japan). The E5G complex data were collected at the beamline BL-5A, PF. All data were integrated and scaled using the DENZO and SCALEPACK programs in the HKL2000 program suite (26). The crystal structure was determined by the multiwavelength anomalous dispersion method using SeMet-labeled crystals (27); 12 selenium atom positions were determined and initial phases calculated using the SOLVE/RESOLVE programs (28,29). The solution was subjected to automated model building with the ARP/wARP program (30) in the CCP4 program suite (31). Manual model building and molecular refinement were performed using the Coot (32) and REFMAC5 (33,34) programs.
For the analyses of ligand-bound structures, structure determination was conducted using the ligand-free structure as the starting model, and the bound ligands were identified in the difference electron density map. Data collection and refinement statistics are shown in Table 1. Stereochemistry of the models was analyzed with the Rampage program (35) and structural drawings were prepared using the PyMol program (DeLano Scientific LLC, Palo Alto, CA, USA).
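The IG3 soaking concentration is quoted above both as 5% (w/v) and as 99 mM. As a quick consistency check, the conversion can be sketched as follows, assuming that isomaltotriose is the anhydrous trisaccharide C18H32O16 and using standard atomic masses; the function names are illustrative only.

```python
# Standard atomic masses (g/mol) for the elements in a simple sugar.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# Assumption: isomaltotriose (IG3) as the anhydrous trisaccharide C18H32O16,
# i.e. three glucose units (C6H12O6) minus two water molecules.
IG3_FORMULA = {"C": 18, "H": 32, "O": 16}


def molar_mass(formula):
    """Molar mass in g/mol for a formula given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def percent_wv_to_millimolar(percent_wv, molar_mass_g_per_mol):
    """Convert a % (w/v) concentration to mM; 1% (w/v) is 10 g per litre."""
    grams_per_litre = percent_wv * 10.0
    return grams_per_litre / molar_mass_g_per_mol * 1000.0


if __name__ == "__main__":
    mass = molar_mass(IG3_FORMULA)
    print(f"IG3 molar mass: {mass:.2f} g/mol")
    print(f"5% (w/v) IG3: {percent_wv_to_millimolar(5.0, mass):.1f} mM")
```

With a molar mass of about 504.4 g/mol, 50 g/L corresponds to roughly 99 mM, in agreement with the value given in the text.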

RESULTS
Overall structure of SmDexTM-The crystal structure of SmDexTM was determined by the multiwavelength anomalous dispersion method using SeMet derivative data; subsequently, the native structure and two ligand complex structures, with IG3 (SmDexTM/IG) or E5G (SmDexTM/E5G), were determined. Structural refinement statistics are summarized in Table 1, and the quality and accuracy of the final structures were further demonstrated by the fact that more than 98% of their residues fell within the common regions of the Ramachandran stereochemistry plot.

The four N-terminal residues (Met-98 to Lys-101) and the five C-terminal residues (His-736 to His-740) were not identified due to lack of electron density. The final model consisted of one SmDexTM molecule accompanied by one phosphate ion.

Domain A (Asp-211 to Lys-593) was mainly composed of a (β/α)₈-barrel, which is a catalytic domain in many glycoside hydrolases. Two catalytic amino acid residues, which were identified from the SmDexTM/IG complex structure described below, were located on the concave surface formed by the C-terminal side of the central β-barrel. A structural homology search using the Dali server (42) revealed that the domain A structure was similar to the catalytic domains of many GH-13 subfamilies, especially subfamily 20, such as Thermoactinomyces vulgaris R-47 α-amylases 1 (TVAI, PDB code 2D0H, subfamily unknown) (43) and 2 (TVAII, PDB code 3A6O, subfamily 20) (44) and maltogenic amylase (PDB code 1GVI, subfamily 20) (45), with Z-scores of 18.2, 18.7, and 18.6 and root-mean-square (rms) differences of 3.6, 3.7, and 3.5 Å, respectively. There were five relatively large loop regions in this domain, ranging from 23 to 50 residues in length, and the 50-residue loop 3 associated with loop 2 to form a small subdomain structure (Fig. 1, dark green), which was positioned similarly to domain B of α-amylases, forming one sidewall of the catalytic cleft.
Domain C (Val-594 to Ile-732) adopted an antiparallel β-sandwich structure, consisting of 10 β-strands but basically belonged to two

The domain arrangement of SmDex resembled that of some GH-13 proteins, including TVAI (43), which consists of domain N with an immunoglobulin fold, catalytic domain A with a (β/α)₈-barrel, domain B, which possesses a structure dissimilar from loop 3 of SmDex, and domain C with the Greek key motifs. The relative arrangement of domains N and C was, however, different between these enzymes when the catalytic domains were superimposed (Fig. 2).
Crystal structure of SmDexTM complexed with isomaltotriose-IG3 was used for enzyme-substrate/product complex analysis as a means to elucidate the enzyme's catalytic mechanism. In the electron density map, bound sugars were observed at two positions, in the catalytic cleft and on domain N (Fig. 1). In the catalytic cleft, the modeled ligand was composed of four glucose moieties and appeared as a tetraisomaltooligosaccharide (IG4, Figs. 3A and 3B), the product of a reverse or transglycosylation reaction by the enzyme. The other bound isomaltooligosaccharide showed two glucose moieties and appeared to be an isomaltose, located in the gap between domain N of one enzyme molecule and domain C of an adjacent molecule in the crystal (Fig. 3C). The two domains contributed to the binding through a few hydrogen bonds and hydrophobic interactions, but no aromatic sidechain was involved. Taking these facts into account, the latter ligand appeared to be a crystal packing artifact.

The overall structure of the SmDexTM/IG complex was almost identical to that of ligand-free SmDexTM with an rms difference of 0.73 Å, implying that ligand binding had little effect upon the overall structure. The main differences in the backbone structures were confined to the solvent-exposed loops in domains A and C, and four residues (Ile-249 to Lys-252) in loop 2 of domain A were invisible in the electron density map of the SmDexTM/IG complex.
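For readers less familiar with the rms difference used here to compare structures, the following is a minimal sketch of the calculation. It assumes the two coordinate sets have already been superimposed and that atoms are paired in the same order; which atoms were compared in the original analysis is not stated in the text, and the function name is illustrative.

```python
import math


def rms_difference(coords_a, coords_b):
    """Root-mean-square difference (in the input units, e.g. angstroms)
    between two equally long lists of (x, y, z) tuples.

    Assumes the structures are already superimposed and that the i-th
    atom in coords_a corresponds to the i-th atom in coords_b.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    model_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
    model_b = [(0.0, 0.5, 0.0), (1.5, 0.5, 0.0), (3.0, 0.5, 0.0)]
    print(f"rms difference: {rms_difference(model_a, model_b):.2f}")
```

A full comparison would normally perform an optimal superposition first; that step is deliberately left out of this sketch.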
The catalytic cleft was located at the center of the (β/α)₈-barrel surface. The enzyme subsites were named according to Davies et al. (46), and the transglycosylated ligand occupied subsites -4 to -1 from the non-reducing end to the reducing end, with the glucose moieties in subsites -4 to -1 designated as Glc-4 to Glc-1. All glucose moieties were in the relaxed chair conformation, with the glucose moiety Glc-1 in the β-anomeric conformation and its O1 atom in the proximity of the Asp-385 sidechain, which was the candidate catalytic nucleophile. However, dextran has α-1,6-linkages, and such hydrogen bonds cannot be formed with the scissile bond of the natural substrate. The Asp-385 Oδ2 atom was also close to the Glc-1 C1 atom, at a distance of 3.1 Å, implying its role as a nucleophile. The other acidic residue, Glu-453, formed a hydrogen bond with the Glc-1 O2 atom via its Oε2 atom. When Glc-1 adopts an α-anomeric conformation, as in natural substrates, the Glu-453 Oε1 atom would be situated within hydrogen-bonding distance of the Glc-1 O1 atom and could donate a proton to the scissile bond, suggesting that it would act as an acid/base in the catalytic reaction. The O3 and O4 atoms of Glc-1 were hydrogen-bonded to the Ala-559 mainchain O atom and the Tyr-257 Oη atom, respectively, and Glc-1 was recognized by 5 direct hydrogen bonds.
Glc-2 was also found to participate in 5 direct hydrogen bonds, with three tyrosine residues (Tyr-257, Tyr-260 and Tyr-307) and the aspartic acid Asp-258. Asp-258 recognized two oxygen atoms, O2 and O3, of Glc-2, playing an important role in substrate recognition at subsite -2. Besides the hydrogen bonds, subsite -2 was built by the hydrophobic residues Trp-277, Trp-280, and Met-309, and Glc-2 was strictly recognized by the protein. In contrast, Glc-3 showed no direct hydrogen bonding to the protein, and Glc-4 exhibited one hydrogen bond between its O2 atom and the sidechain of Thr-563. These two glucose moieties were packed against the two aromatic residues Trp-280 and Tyr-560 and appeared to be loosely recognized, with their electron density, especially that of Glc-4, being relatively weak. The average B-factors of the 6 ring atoms of the glucose rings were 51.6, 45.7, 56.7, and 64.8 Å² for Glc-1 to Glc-4, respectively, which also implied that subsites -1 and -2 showed strong glucose binding.
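The per-glucose B-factor averages quoted above are means over the six pyranose ring atoms. A minimal sketch of such a calculation from fixed-column PDB records is shown below. It assumes the standard PDB column layout and the usual glucose ring atom names (C1 to C5 and O5); `format_hetatm` is a hypothetical helper that only builds dummy records for the demonstration, and the B-factor values in the demo are invented.

```python
# Assumption: ring atoms of a pyranose carry the usual PDB names.
RING_ATOMS = {"C1", "C2", "C3", "C4", "C5", "O5"}


def mean_ring_b_factor(pdb_lines, chain_id, residue_number):
    """Average B-factor of the six ring atoms of one sugar residue.

    Reads fixed-column ATOM/HETATM records: atom name in columns 13-16,
    chain in column 22, residue number in columns 23-26 and
    B-factor in columns 61-66 (1-based column numbers).
    """
    b_values = []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        atom_name = line[12:16].strip()
        if (line[21] == chain_id
                and int(line[22:26]) == residue_number
                and atom_name in RING_ATOMS):
            b_values.append(float(line[60:66]))
    if not b_values:
        raise ValueError("no ring atoms found for the requested residue")
    return sum(b_values) / len(b_values)


def format_hetatm(serial, atom_name, res_name, chain_id, residue_number, b_factor):
    """Hypothetical helper: build a HETATM record with dummy coordinates."""
    return ("HETATM{:>5} {:<4} {:>3} {}{:>4}    {:>8.3f}{:>8.3f}{:>8.3f}{:>6.2f}{:>6.2f}"
            .format(serial, atom_name, res_name, chain_id, residue_number,
                    0.0, 0.0, 0.0, 1.0, b_factor))


if __name__ == "__main__":
    # Invented B-factors, for illustration only.
    demo = [format_hetatm(i + 1, name, "GLC", "A", 901, 40.0 + 2 * i)
            for i, name in enumerate(["C1", "C2", "C3", "C4", "C5", "O5", "O6"])]
    print(f"mean ring B-factor: {mean_ring_b_factor(demo, 'A', 901):.1f}")
```

Exocyclic atoms such as O6 are ignored by design, so the result reflects the ring only, as in the averages reported in the text.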
Crystal structure of SmDexTM complexed with E5G-The SmDexTM/E5G complex structure was determined to label and visually identify the enzyme's nucleophile residue. The resulting overall structure superimposed well with the ligand-free SmDexTM and SmDexTM/IG complex structures, with rms differences of 0.20 and 0.65 Å, respectively. The electron density at the active site accommodated a single E5G moiety and the E5G density was connected to the Asp-385 Oδ2 atom, revealing that E5G was covalently bound to the enzyme.

consisting of the 1′-4′-C atoms, occupied positions similar to those of the C6, C5, O5 and C1 atoms of Glc-1 of the IG3 complex, respectively, but was slightly shifted towards Asp-385 (Fig. 3E). This was probably due to the covalent linkage between the E5G 4′-C atom and the Asp-385 Oδ2 atom after opening of the epoxide ring formed by the 4′-C and 5′-C atoms and the epoxy oxygen.

DISCUSSION
In this study, the crystal structure of SmDexTM was determined, yielding the first example of a fully described GH-66 protein. The length of GH-66 proteins (including precursor forms) varies from 536 to 1686 aa residues. The

unclear, but carbohydrate-binding modules (CBMs) generally are involved in binding complicated substrates and in assisting catalysis in the catalytic domain, and they are assumed to be dextran-binding modules. In addition, some GH-66 enzymes possess regions in their C-terminus for cell wall association; these include sorting signals with LPXTG cell wall anchor motifs, as in SmDex (13), and S-layer homology (SLH) domains, which associate noncovalently with cell walls (50).

SmDex and the other GH-66 dextranases and CITases are retaining enzymes, targeting substrates with α-anomeric conformations and releasing an α-anomeric product, and thus their catalytic mechanism is presumed to be a double displacement mechanism (51). There have been reports suggesting that Asp-385 of SmDex (14) and the corresponding residues in other GH-66 enzymes are the nucleophiles (17,18). Here, the SmDexTM/IG complex structure revealed that the Asp-385 Oδ2 atom was located close to the Glc-1 C1 atom, at a distance of 3.1 Å. In addition, the SmDexTM structure in complex with E5G, a widely used suicide substrate employed to label the nucleophile of retaining glycosidases (52), showed that the epoxy ring had opened and that the 4′-C atom of the alkyl chain moiety formed a covalent bond with the Asp-385 Oδ2 atom. The structural evidence obtained here directly showed that Asp-385 is the nucleophile of SmDex. On the other hand, there has been no report identifying the catalytic acid/base in GH-66 proteins. The SmDexTM/IG complex structure here showed that the C1 hydroxyl group of Glc-1 took the β-anomeric conformation and that the distance between the Glu-453 Oε1 atom and the Glc-1 C1 atom was 3.9 Å. However, the Glc-1 O1 and Glu-453 Oε1 atoms would be within hydrogen-bonding distance if the Glc-1 O1 atom were in the α-anomeric conformation. Furthermore, in the ligand-free SmDexTM structure, the Glu-453 Oε1 and Asp-385 Oδ2 atoms were 6.0 Å apart, a reasonable distance between the two catalytic residues of a retaining-type glycosidase (51).
Taking these observations into account, Glu-453 was identified as this enzyme's catalytic acid/base.
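The distance argument used above can be illustrated with a short sketch. It assumes the commonly cited rule of thumb that the two catalytic carboxylates are roughly 5.5 Å apart in retaining glycosidases and roughly 10 Å apart in inverting ones; the 8.0 Å cutoff, the function names and the coordinates below are illustrative assumptions, not values from this study.

```python
import math


def interatomic_distance(atom_a, atom_b):
    """Euclidean distance between two (x, y, z) positions, in input units."""
    return math.dist(atom_a, atom_b)  # available in Python 3.8+


def carboxylate_separation_hint(separation, cutoff=8.0):
    """Rough mechanistic hint from the separation of the two catalytic
    carboxylates (in angstroms).

    Assumption: about 5.5 angstroms is typical of retaining enzymes and
    about 10 angstroms of inverting enzymes; the cutoff between them is an
    arbitrary illustrative midpoint, not an established threshold.
    """
    return "retaining-like" if separation < cutoff else "inverting-like"


if __name__ == "__main__":
    # Hypothetical coordinates placed 6.0 angstroms apart, for illustration.
    nucleophile_oxygen = (10.0, 4.0, 7.0)
    acid_base_oxygen = (16.0, 4.0, 7.0)
    separation = interatomic_distance(nucleophile_oxygen, acid_base_oxygen)
    print(f"{separation:.1f} angstroms: {carboxylate_separation_hint(separation)}")
```

Such a heuristic is only supporting evidence; in the text, the assignment of Glu-453 rests on the combination of ligand geometry, the covalent E5G adduct and the inter-residue distance.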
In addition to structural analyses, a separate mutational approach has been utilized to clarify the catalytic residues of GH-66 enzymes, using Paenibacillus sp. 598K CITase (PsCITase) (20) and Paenibacillus sp. dextranase (PsDex) (19). Three acidic residues, Asp-144, Asp-269, and Glu-341, are predicted to be the catalytically important residues for PsCITase; similarly, Asp-189, Asp-340, and Glu-412 are important for PsDex. When a chemical rescue reaction was applied to the D340G or E412Q mutants of PsDex by using α-isomaltotetraosyl fluoride with sodium azide (NaN₃), the D340G mutant formed β-isomaltotetraosyl azide and the E412Q mutant formed α-isomaltotetraosyl azide, implying that Asp-340 and Glu-412 are the nucleophile and the acid/base catalyst, respectively. PsCITase Asp-269 and PsDex Asp-340 correspond to Asp-385 of SmDex, and PsCITase Glu-341 and PsDex Glu-412 correspond to Glu-453 of SmDex. Biochemical analyses and structural observations agree that an aspartic acid is the catalytic nucleophile and a glutamic acid the catalytic acid/base of the GH-66 enzymes.
Mutational analyses also showed that another aspartic acid, Asp-144 of PsCITase and Asp-189 of PsDex, is essential for catalysis. These residues correspond to Asp-258 of SmDex, which formed two hydrogen bonds to the two oxygen atoms of Glc-2. This indicated that accurate substrate binding at subsite -2 was essential for the enzymatic activity, in addition to the catalytic residues directly involved in the hydrolysis.
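The residue correspondences among the three GH-66 enzymes discussed above can be collected into a small lookup table, sketched below. The residue numbers are those given in the text; the role labels, the dictionary name and the function name are illustrative.

```python
# Equivalent residues across GH-66 enzymes, as stated in the text.
# Role labels are illustrative shorthand.
EQUIVALENT_RESIDUES = {
    "nucleophile": {"SmDex": "Asp-385", "PsCITase": "Asp-269", "PsDex": "Asp-340"},
    "acid/base": {"SmDex": "Glu-453", "PsCITase": "Glu-341", "PsDex": "Glu-412"},
    "subsite -2 binding": {"SmDex": "Asp-258", "PsCITase": "Asp-144", "PsDex": "Asp-189"},
}


def equivalent_in(residue, source_enzyme, target_enzyme):
    """Return (equivalent residue in target_enzyme, role) for a residue
    of source_enzyme, or raise KeyError if it is not in the table."""
    for role, by_enzyme in EQUIVALENT_RESIDUES.items():
        if by_enzyme.get(source_enzyme) == residue:
            return by_enzyme[target_enzyme], role
    raise KeyError(f"{residue} of {source_enzyme} is not in the table")


if __name__ == "__main__":
    print(equivalent_in("Asp-340", "PsDex", "SmDex"))
    print(equivalent_in("Asp-144", "PsCITase", "SmDex"))
```

Keeping the roles as keys makes it easy to extend the table if further GH-66 sequences are aligned.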
The catalytic nucleophile of SmDex, Asp-385, was located at the end of strand Aβ4 and the acid/base catalyst, Glu-453, at the end of strand Aβ6. Many GHs have the (β/α)₈

are different from SmDex. The catalytic nucleophile is the aspartic acid located at the end of the 4th β-strand of the (β/α)₈-barrel and the proton donor the glutamic acid located at the end of the 5th β-strand (Fig. 5B) (44,58).

Most enzymes in the clan GH-D are exo-type enzymes, working from oligosaccharide chain ends, and their active sites form a pocket-type structure; in contrast, an isomaltodextranase belonging to GH-27 is considered to be an endo-type exception, acting within oligosaccharide chains (59). SmDex is an endo-type enzyme, which possesses a catalytic cleft positioned across the surface of the (β/α)₈-fold. The cleft was surrounded by the loops of the (β/α)₈-barrel, such that loops 2, 3 and 4 formed one sidewall and loops 6 and 7 the other (Fig. 6). In the SmDexTM/IG structure, four glucose moieties were observed in the glycone side of the cleft, and the bound ligand appeared similar to IG4, which was assumed to be the product of the reverse or transglycosylation reaction of the enzyme, as the protein used was a wild-type protein and a relatively high IG3 concentration was employed in the soaking experiment. Furthermore, continuous electron density was observed around the Glc-4 O6 atom, implying that the bound ligand might be an isomaltooligosaccharide with a degree of polymerization of five or greater. As Glc-4 was located at the end of the catalytic cleft, subsite -5 did not exist and the remaining glucoses might have been disordered. Among the four glucose moieties, Glc-2 showed the strongest electron density and had the lowest B-factor. In addition, the SmDexTM/E5G complex structure demonstrated that the E5G glucose moiety occupied subsite -2. These observations suggested that the substrate must be captured by the enzyme at subsite -2 for the catalytic reaction to proceed.

Amino acid sequence alignment with GH-66 proteins showed that five amino acids that hydrogen bond with Glc-1 and Glc-2 of the bound IG molecule (Tyr-257, Asp-258, Tyr-307, Asp-385 and Glu-453) are strictly conserved in GH-66 enzymes, while Tyr-260 is conserved among streptococcal dextranases (Fig. 4). This implied that other GH-66 enzymes can be expected to recognize dextran in a manner almost identical to SmDex. In contrast, amino acid residues contributing hydrophobic interactions with Glc-3 and Glc-4 are less conserved and represent regions that might influence an enzyme's product specificity.
On the other hand, no ligand was observed in the aglycone side of the catalytic cleft. The glycone side was covered by protruding loops 2 and 7, while the aglycone side of the cleft was open and wide (Fig. 6). Dextran shows high solubility and appears to have a flexible structure, such that the wide cleft might preferentially take in such an unstructured substrate. The aglycone side of the cleft mainly consisted of loops 3, 4, and 6, with some aromatic residues arranged on the loops that also appear to help with substrate uptake. Subsites of the aglycone side remained unclear, but judging from the position of Glc-1, subsite +1 appeared to be surrounded by the hydrophobic residues Ile-387, Val-434, and Trp-455, and Glc+1 did not seem to be specifically recognized.
Full-length SmDex contains a C-terminal variable region after domain C and an N-terminal variable region before domain N. In a previous paper, recombinant full-size SmDex protein (95.4 kDa) was expressed but was also proteolytically degraded to form a shorter, truncated isoform of 89.8 kDa (21). When a series of truncation mutants, with deleted C-terminal and/or N-terminal variable regions, were constructed and examined, SmDexTM, devoid of its N- and C-terminal variable regions, was found to be proteinase-resistant and to display enhanced substrate hydrolysis compared to the full-size SmDex protein. Here, the SmDexTM structure revealed that Asn-102, the N-terminal residue, was located in the proximal region of the Aα8 head, rather close to the glycone side of the catalytic cleft, and the C-terminus, His-735, was positioned on the opposite side. Although the structure of these regions remains unclear, it is possible that, in the proenzyme, the N-terminal variable region might approach the aglycone side of the catalytic cleft and the C-terminal variable region might approach the glycone side, and thus both might hinder pro-SmDex from accessing the substrate. Another possibility is that the unstructured N- and C-terminal variable regions might cause aggregation of the enzyme, leading to decreased hydrolytic activity.
Morisaki et al. have reported that SmDex domain C is necessary for dextran binding in deletion-mutant studies (15), but here, no apparent dextran-binding site was found in domain C of the SmDexTM/IG structure. This domain makes extensive contact with the catalytic domain and appears to contribute to its stabilization. In particular, an elongated loop between the β-strands Cβ3 and Cβ4 was observed here to be in contact with loops 6 and 7 of the catalytic domain, which were involved in the formation of the catalytic cleft, and appeared to hold their relative positions. Therefore, amino acid deletion in domain C might result in conformational changes in the catalytic cleft,

bound IG, gray stick model; two catalytic residues, pale red; estimated hydrogen bonds, cyan broken lines. B, 2Fobs-Fcalc electron density map of the bound IG3 derivative in the catalytic cleft; contour level, 1 σ. C, a second IG3 derivative found between domains N and C of different molecules in the SmDexTM/IG complex structure; electron density map around the ligand, 0.5 σ contour level. D, E5G molecule covalently bonded to Asp-385 in the SmDexTM/E5G catalytic cleft; E5G, teal stick model; electron density map around the ligand, 1 σ contour level. E, superimposition of the IG3 derivative and E5G molecules bound in the catalytic cleft of the SmDexTM/IG and SmDexTM/E5G complex structures.