Molecular Basis for the Recognition of Long-chain Substrates by Plant α-Glucosidases

Background: The origin of specificity of plant α-glucosidases for long malto-oligosaccharides remains uncertain. Results: The crystal structure and mutational analyses of sugar beet α-glucosidase revealed its substrate binding properties. Conclusion: The long-chain specificity was attributed to two structural elements, the N-loop and subdomain b2. Significance: A slight structural difference leads to significant differences in specificity for varying chain lengths of substrate. Sugar beet α-glucosidase (SBG), a member of glycoside hydrolase family 31, shows exceptional long-chain specificity, exhibiting higher kcat/Km values for longer malto-oligosaccharides. However, its amino acid sequence is similar to those of other short chain-specific α-glucosidases. To gain structural insights into the long-chain substrate recognition of SBG, the crystal structure of its complex with the pseudotetrasaccharide acarbose was determined at 1.7 Å resolution. The active site pocket of SBG is formed by a (β/α)8 barrel domain and a long loop (N-loop) bulging from the N-terminal domain, as in other related enzymes. Two residues (Phe-236 and Asn-237) in the N-loop are important for the long-chain specificity. Kinetic analysis of an Asn-237 mutant enzyme and a previous study of a Phe-236 mutant enzyme demonstrated that these residues create subsites +2 and +3. The structure also indicates that Phe-236 and Asn-237 guide the reducing end of long substrates to subdomain b2, which is an additional element inserted into the (β/α)8 barrel domain. Subdomain b2 of SBG includes Ser-497, which was identified as the residue at subsite +4 by site-directed mutagenesis.

The α-glucosidases of glycoside hydrolase family 31 (GH31) (1), referred to here as GH31AGs, are involved mainly in the metabolism of starch and its derivatives. For example, maltase-glucoamylase (MGAM) and sucrase-isomaltase in the mammalian small intestine are associated with the hydrolysis of malto-oligosaccharides generated by α-amylases (2). In contrast, some GH31AGs have different roles, such as α-glucosidase II, which is localized in the endoplasmic reticulum and is involved in the quality control of nascent glycoproteins.
Most fungal α-glucosidases (e.g. those of Aspergillus niger, Schizosaccharomyces pombe, and Schwanniomyces occidentalis) (3-5) also participate in the use of malto-oligosaccharides and exhibit the highest kcat/Km for a malto-oligosaccharide with a degree of polymerization (DP) of 3 (G3), with lower values for substrates longer than G3. These are known as short chain-specific GH31AGs.
During the germination of plant seeds, starch degradation to produce glucose is one of the most important events for obtaining energy. Four types of enzymes are believed to be involved in the conversion of starch to glucose: α-amylase, β-amylase, debranching enzymes, and α-glucosidase. α-Glucosidases are thought to act on maltose and other short malto-oligosaccharides produced by amylases. This is indeed the case for barley α-glucosidase, which shows short-chain specificity (6). However, most plant α-glucosidases tend to prefer long malto-oligosaccharides. For example, buckwheat α-glucosidase (7) and sugar beet α-glucosidase (SBG) (8) show 8- and 50-fold higher kcat/Km values for maltoheptaose than for maltose, respectively. In particular, SBG has exceptional specificity for long substrates, exhibiting a 90-fold higher kcat/Km for soluble starch than for maltose. It is of interest that GH31AGs show such different chain length specificities even though the enzymes share significantly similar amino acid sequences. Understanding the basis of the substrate specificity diversity in GH31AGs is a challenging task, and few studies have focused on the molecular mechanism of the diverse chain length specificities.
The crystal structures of several GH31AGs have been determined (9-13). The major domain of GH31AGs displays a (β/α)8 barrel fold. The active site pocket is formed by the (β/α)8 barrel domain and a long loop (N-loop) bulging from the N-terminal β-sandwich domain. Among the GH31AGs with known structures, the C-terminal subunit of human MGAM (CtMGAM) is the only long chain-specific enzyme and has a 10 times lower Km for G5 than for G2 (13). The long-chain specificity of CtMGAM results from an insertion of 21 amino acids, which form subsites +2 and +3. However, SBG and other plant GH31AGs have no such insertion, and another element must be responsible for their long-chain specificity. We previously identified Phe-236 in the N-loop of SBG as one of the important elements involved in the recognition of long-chain substrates based on a comparison of amino acid sequences and the results of site-directed mutagenesis. Substitution of Phe-236 with Ala or Ser decreased the kcat/Km values for the hydrolysis of soluble starch and malto-oligosaccharides (Gn, where n represents DP) except G2. In contrast, replacement of Thr-228, which is equivalent to Phe-236 in short chain-specific A. niger α-glucosidase, with Phe shifted the substrate with the highest kcat/Km from G3 to G4 (14).
In this study, we determined the crystal structure of SBG in an effort to obtain structural insights into the long-chain specificity. This is the first crystal structure of a plant GH31AG reported to date. The structure of the complex with acarbose, a pseudotetrasaccharide inhibitor, reveals that the N-loop forms subsites +2 and +3. Furthermore, we identified Ser-497 as the residue forming subsite +4 by site-directed mutagenesis.

EXPERIMENTAL PROCEDURES
Purification of α-Glucosidase from Sugar Beet Seeds-Sugar beet (Beta vulgaris L. cv. Abend) seeds with pericarp (1 kg) were milled, suspended in 4 liters of 0.1 M sodium acetate buffer (pH 5.4) (buffer A), and stirred for 10 h at 4°C. The crude extract was obtained from the suspension by filtration using a nylon net, centrifugation at 11,300 × g for 10 min, and Celite 535 (Wako Pure Chemical Industries, Osaka, Japan). Proteins were precipitated by treatment with 90% saturated ammonium sulfate for 30 h at 4°C, collected by centrifugation at 11,300 × g for 20 min, and dissolved in 20 mM buffer A containing 12% ammonium sulfate. The samples were loaded onto a Toyopearl butyl-650M column (3 cm (inner diameter) × 38 cm; Tosoh, Tokyo, Japan) equilibrated with 20 mM buffer A containing 12% ammonium sulfate. After washing the column with equilibration solution, the bound proteins were eluted with a linear gradient of 12 to 0% ammonium sulfate in 20 mM buffer A. The active fractions were collected, dialyzed against 20 mM buffer A, and loaded onto a CM-Sepharose Fast Flow column (3 cm (inner diameter) × 38 cm; GE Healthcare) equilibrated with 20 mM buffer A. The column was washed, and the bound proteins were eluted with a linear gradient of 0-1 M sodium chloride in 20 mM buffer A. The active fractions were concentrated using a Centriprep YM-50 unit (Millipore, Billerica, MA) and loaded onto a Toyopearl HW-55F column (2.7 cm (inner diameter) × 80 cm; Tosoh) equilibrated with 20 mM buffer A containing 100 mM sodium chloride. The purified native SBG fractions were collected and dialyzed against 20 mM buffer A.
Cloning of the Gene Encoding SBG-Sugar beet (cv. Abend) seeds were germinated for 9 days at 30°C. Their shoots were collected, immediately frozen, and crushed in liquid nitrogen. The whole DNA was extracted from the crushed shoots (100 mg) using ISOPLANT II (Nippon Gene, Toyama, Japan). The extracted DNA was used for PCR with a pair of synthesized primers: 5′-TCCTAAAAGCTCAACATTTATCGAGGGTTT-3′ and 5′-CACACACAAAATCAGAAAAACTCCAAGG-3′. These primers were designed according to a reported α-glucosidase cDNA cloned from sugar beet seeds (strain NK185-BR2; GenBank™ accession number AB698976). PCR was performed using PrimeSTAR Max DNA polymerase (Takara Bio, Otsu, Japan). The PCR product was ligated using a Ver. 2 ligation kit (Takara Bio) into the EcoRV site of pBluescript II SK(+) (Stratagene, La Jolla, CA) and propagated in Escherichia coli strain DH5α. The amplified DNA strand was sequenced using an automated DNA sequencer (Applied Biosystems 310 Genetic Analyzer and BigDye Terminator v3.1). The nucleotide sequence was deposited in GenBank™ with the accession number AB699590. The exons of the amplified DNA were predicted by Spidey (15) using the reported SBG cDNA (GenBank™ accession number AB698976) as a template.
In all cases, crystallization was performed by the hanging-drop vapor-diffusion method at 25°C for ~1 month. Several crystals of the ligand-free form were obtained in a drop consisting of 6 μl of Endo-F3-treated native SBG (2.16 mg/ml) and 3 μl of reservoir solution (50 mM sodium acetate buffer (pH 4.5), 100 mM ammonium sulfate, and 18% polyethylene glycol monomethyl ether 2000). Several co-crystals with acarbose were obtained in a drop consisting of 3 μl of Endo-F3-treated native SBG (2.16 mg/ml), 3 μl of reservoir solution (50 mM sodium acetate buffer (pH 4.0), 50 mM ammonium sulfate, and 16% polyethylene glycol monomethyl ether 2000), and 1 μl of 100 mM acarbose.
Data sets were collected under a stream of nitrogen at 100 K from a single crystal at beamline BL41XU of SPring-8 (Hyogo, Japan) at a wavelength of 1.000 Å. Each crystal was flash-cooled after soaking in each reservoir solution containing 20% glycerol (and 14 mM acarbose for the co-crystal) for several minutes. Diffraction data sets were collected using an MX225HE CCD detector (Rayonix, Norderstedt, Germany). The diffraction data were indexed, integrated, scaled, and merged with XDS (16).
The structure of the acarbose complex was determined by the molecular replacement method with AutoMR in PHENIX (17) using the N-terminal subunit of MGAM (NtMGAM; Protein Data Bank code 2QLY) as a search model. The ligand-free structure was determined using the acarbose complex structure as a search model. After several cycles of manual model corrections with Coot (18) and refinement with REFMAC5 (19) in CCP4 and phenix.refine (17), the refinement converged. Ramachandran plot analysis was performed using RAMPAGE (20) in CCP4. Coordinates and structure factors have been deposited in the Protein Data Bank with codes 3W37 and 3W38. Graphical representations were prepared using PyMOL (21).
Production of Recombinant Enzymes-Site-directed mutagenesis was performed using a PrimeSTAR mutagenesis basal kit (Takara Bio). The SBG-carrying pGAPZαA vector was used as the PCR template with primers 5′-AGCTTCGCTAGGGACCTTAACTTGTAT-3′ and 5′-GTCCCTAGCGAAGCTAGCAATGTCAGC-3′ for N237A and primers 5′-AATAATGCTGGAGGCCGTGTACCAATA-3′ and 5′-GCCTCCAGCATTATTGATCTTATATGG-3′ for S497A, where the underlined nucleotides indicate the mutated codons. The expression and purification of the mutant enzymes were performed according to a previous report (14).
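As a quick check on the mutagenic primers, the forward primers can be translated codon by codon. The following is a minimal sketch, assuming that each forward primer is read in frame from its first base (an assumption not stated in the text) and using the standard genetic code; the function name `translate` and the constant names are illustrative only.

```python
from itertools import product

# Standard genetic code, built in TCAG order (first, second, third base).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; a trailing partial codon is ignored."""
    dna = dna.upper().replace("-", "")  # drop line-break hyphens if present
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# Forward primers quoted in the text (line-break hyphens removed).
# ASSUMPTION: both are in frame starting at their first base.
N237A_FWD = "AGCTTCGCTAGGGACCTTAACTTGTAT"
S497A_FWD = "AATAATGCTGGAGGCCGTGTACCAATA"

if __name__ == "__main__":
    print(translate(N237A_FWD))  # SFARDLNLY
    print(translate(S497A_FWD))  # NNAGGRVPI
```

Under that frame assumption, the third codon of each forward primer (GCT) encodes Ala, which is in line with the N237A and S497A substitutions.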
Size-exclusion Chromatography-Size-exclusion chromatography was performed by HPLC using a TSKgel G3000SWXL column (7.8 mm (inner diameter) × 30 cm; Tosoh) equilibrated with 50 mM sodium acetate buffer (pH 4.5) containing 150 mM sodium chloride. Native SBG (0.517 μg, 10 μl) was applied to the column and eluted at a flow rate of 0.7 ml/min while the absorbance was monitored at 280 nm. The molecular mass of native SBG was estimated from its elution coefficient relative to those of the molecular mass marker proteins (gel filtration standard, Bio-Rad): thyroglobulin (670 kDa), bovine γ-globulin (158 kDa), chicken ovalbumin (44 kDa), equine myoglobin (17 kDa), and vitamin B12 (1.35 kDa).
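A molecular mass estimate of this kind is commonly obtained from a calibration line relating log10(molecular mass) of the standards to their elution position. The sketch below illustrates that idea with a least-squares fit; the marker masses are those listed above, whereas every elution volume is a hypothetical value invented for illustration, and the use of elution volume rather than a partition coefficient is a simplifying assumption.

```python
import math

# Molecular masses (kDa) of the gel filtration standards named in the text.
STANDARD_MASSES_KDA = [670.0, 158.0, 44.0, 17.0, 1.35]
# HYPOTHETICAL elution volumes (ml), invented for illustration only.
STANDARD_VOLUMES_ML = [6.2, 7.9, 9.1, 10.2, 12.4]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mass_kda(volume_ml, volumes=STANDARD_VOLUMES_ML, masses=STANDARD_MASSES_KDA):
    """Read a molecular mass off the log10(mass) versus elution volume line."""
    slope, intercept = fit_line(volumes, [math.log10(m) for m in masses])
    return 10 ** (slope * volume_ml + intercept)


if __name__ == "__main__":
    # 8.3 ml is a hypothetical elution volume for the sample, not a measured value.
    print(f"{estimate_mass_kda(8.3):.0f} kDa")
```

With real data, the measured elution positions of the five markers and of native SBG would replace the placeholder volumes.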

RESULTS AND DISCUSSION
Crystal Structure Analysis of SBG-SBG was purified from sugar beet seeds. Three amino acid differences (N423D, V871I, and R876L) were found in SBG as deduced by comparing the genomic DNA sequence with the reported sequence (14). Purified SBG was deglycosylated by treatment with Endo-F3, and deglycosylated SBG was crystallized and co-crystallized with acarbose. The crystals of SBG belong to the space group P2₁2₁2₁ (unit cell parameters a = 83.5, b = 95.5, and c = 107.7 Å for the apo enzyme crystal and a = 86.4, b = 98.2, and c = 108.8 Å for the acarbose complex), with one protein molecule present in each asymmetric unit. This observation was in agreement with the results of size-exclusion chromatography of SBG, indicating that it exists as a monomer in solution. Crystal structures were determined at 2.8 Å (ligand-free structure) and 1.7 Å (acarbose complex) resolution, respectively, with the molecular replacement method using the structure of NtMGAM (10) as a search model (Fig. 1 and Table 1). All 913 residues were built based  (23)).
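The plausibility of one molecule per asymmetric unit can be checked with a Matthews coefficient estimate from the unit cell parameters quoted above. This is a rough sketch only: the protein mass is not given in this section, so it is approximated here as 913 residues times an average residue mass of 110 Da (an assumption that also ignores glycans), and the function names are illustrative.

```python
# Unit cell parameters (angstroms) quoted in the text for the orthorhombic crystals.
CELLS = {
    "apo": (83.5, 95.5, 107.7),
    "acarbose complex": (86.4, 98.2, 108.8),
}
ASU_PER_CELL = 4  # space group P2(1)2(1)2(1) has four asymmetric units per cell
# ASSUMPTION: mass approximated as 913 residues x 110 Da; glycans are ignored.
ASSUMED_MASS_DA = 913 * 110.0


def matthews_coefficient(a, b, c, molecules_per_asu=1, mass_da=ASSUMED_MASS_DA):
    """Return V_M in cubic angstroms per Da for an orthorhombic cell (volume = a*b*c)."""
    return (a * b * c) / (ASU_PER_CELL * molecules_per_asu * mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from V_M using the common 1 - 1.23 / V_M estimate."""
    return 1.0 - 1.23 / vm


if __name__ == "__main__":
    for name, (a, b, c) in CELLS.items():
        vm = matthews_coefficient(a, b, c)
        print(f"{name}: V_M = {vm:.2f} A^3/Da, solvent ~ {100 * solvent_fraction(vm):.0f}%")
```

Under these assumptions, one molecule per asymmetric unit gives a V_M slightly above 2 Å³/Da for both crystals, whereas two molecules would give a value close to 1 Å³/Da, which is too low for a protein crystal.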
The crystal structure of the acarbose complex revealed that SBG is partially N-glycosylated, with electron density visible for β-N-acetylglucosaminyl-(1→4)-(α-fucosyl-(1→3))-β-N-acetylglucosaminyl-Asn-404, β-N-acetylglucosaminyl-Asn-728, and β-N-acetylglucosaminyl-(1→4)-β-N-acetylglucosaminyl-Asn-823 (Fig. 1C). These electron densities are unclear in the lower-resolution ligand-free structure. SBG has six potential N-glycosylation sites (Asn-Xaa-(Ser/Thr), where Xaa is not Pro). Among them, Asn-404 and Asn-728 form the Asx turn. The Asx turn is preferentially recognized by an oligosaccharyltransferase, which is located in the endoplasmic reticulum and catalyzes N-glycan transfer (24). Asn-823 is not involved in the Asx turn; however, the side-chain carbonyl of Asn-823 forms a water-mediated hydrogen bond with the hydroxy group of Thr-825. This hydrogen bond may induce the nitrogen to form an imidate tautomer, which is a competent nucleophile (25). The other conserved residues, Asn-54, Asn-495, and Asn-517, have neither such a secondary structure element nor such a hydrogen bond. The crystal structure of SBG indicates that the N-glycan was retained even though the enzyme was treated with Endo-F3 before crystallization. This apparent contradiction may be explained by the substrate specificity of Endo-F3, which has high activity for α-1,6-fucosylated N-glycans but not for α-1,3-fucosylated N-glycans (26).
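The sequon rule quoted above, Asn-Xaa-(Ser/Thr) with Xaa not Pro, is easy to apply to a one-letter protein sequence. The sketch below uses a regular expression with a lookahead so that overlapping sequons are also reported; the demo peptide is hypothetical and is not the SBG sequence, and the function and parameter names are illustrative.

```python
import re

# Asn-Xaa-(Ser/Thr) with Xaa != Pro; the lookahead allows overlapping matches.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence: str, first_residue_number: int = 1):
    """Return residue numbers of the Asn in each N-X-S/T sequon (X is not Pro)."""
    return [m.start() + first_residue_number for m in SEQUON.finditer(sequence.upper())]


if __name__ == "__main__":
    # HYPOTHETICAL peptide for illustration; not taken from SBG.
    demo = "MKNNSTANPTLNGSA"
    print(find_sequons(demo))  # [3, 4, 12]
```

In the demo peptide, Asn-8 is skipped because it is followed by Pro, while the adjacent Asn-3 and Asn-4 are both reported because their sequons overlap.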
Overall Structure-The overall structure of SBG was divided into four major domains and two subdomains similar to other GH31AGs (Fig. 1, A and B): the N-terminal β-sandwich domain (residues 39-298), the (β/α)8 barrel domain (residues 299-670), insertion subdomain b1 (residues 399-443) and subdomain b2 (residues 474-519), the proximal C-terminal domain (residues 671-756), and the distal C-terminal domain (residues 757-909). The N-terminal β-sandwich domain consists of four antiparallel β-sheets. Several strands are connected with long loops, one of which, from Trp-229 to Ser-245, forms part of the active site pocket. This is the so-called "N-loop" and plays a crucial role in substrate binding, as discussed below.
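The domain boundaries listed above can be captured in a small lookup table, which makes it straightforward to see where residues named elsewhere in the text fall. This is an illustrative sketch; the residue ranges are those given in the text, whereas the data layout and the function name `locate` are assumptions made for the example.

```python
# Inclusive residue ranges of the four major domains, as listed in the text.
DOMAINS = [
    ("N-terminal beta-sandwich domain", 39, 298),
    ("(beta/alpha)8 barrel domain", 299, 670),
    ("proximal C-terminal domain", 671, 756),
    ("distal C-terminal domain", 757, 909),
]
# Regions nested within the domains above (N-loop: Trp-229 to Ser-245).
SUBREGIONS = [
    ("N-loop", 229, 245),
    ("subdomain b1", 399, 443),
    ("subdomain b2", 474, 519),
]


def _find(regions, residue):
    """Return the name of the first region containing the residue, else None."""
    for name, start, end in regions:
        if start <= residue <= end:
            return name
    return None


def locate(residue: int):
    """Return (domain, nested region or None) for a residue number."""
    return _find(DOMAINS, residue), _find(SUBREGIONS, residue)


if __name__ == "__main__":
    for label, number in [("Phe-236", 236), ("Asn-237", 237), ("Asp-469", 469), ("Ser-497", 497)]:
        print(label, locate(number))
```

For example, `locate(236)` places Phe-236 in the N-loop of the N-terminal β-sandwich domain, and `locate(497)` places Ser-497 in subdomain b2 within the barrel domain.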
The (β/α)8 barrel domain is the major domain of SBG. The active site pocket of SBG is formed mainly by the (β/α)8 barrel domain and is extended by the N-loop. The (β/α)8 barrel fold has two insertions as subdomains b1 and b2. These two subdomains form part of the active site pocket, as in other GH31AGs. Subdomain b1, inserted into β→α loop 3, is well conserved among the GH31AGs for which structures are known, except CtMGAM, which contains four small strands and one small helix (13). Subdomain b2, inserted into β→α loop 4, has no typical secondary structure element. The overall structure of subdomain b2 is similar to other GH31AGs for which structures are known, but those of Sulfolobus solfataricus α-glucosidase (MalA) (9) and Ruminococcus obeum α-glucosidase (12) are quite distinct from that of SBG, which has an α-helix element.
The proximal C-terminal domain consists of three antiparallel β-sheets and two small α-helices. The distal C-terminal domain forms a nine-stranded antiparallel β-sandwich structure. Neither C-terminal domain has any interaction with the active site pocket. These domains appear to contribute to stabilization of the (β/α)8 barrel catalytic domain rather than substrate binding.
The overall structure of SBG is similar to those of other α-glucosidases in GH31, with root mean square deviations calculated by the Dali server (27).
A glycoside hydrolase that is able to attack polysaccharides generally bears an extra carbohydrate-binding domain and/or surface binding site for polysaccharide. In GH31 enzymes, Cellvibrio japonicus α-xylosidase has an extra PA14 domain in the N-terminal part of the enzyme to accommodate long xylo-oligosaccharides (28). Gracilariopsis lemaneiformis α-1,4-glucan lyase (Protein Data Bank code 2X2I) possesses a second substrate-binding site in the N-terminal domain (29). However, neither the extra domain nor the surface binding site is found in the crystal structure of SBG. The electron density of acarbose is visible only at the active site (Fig. 2).
Subsites −1 and +1-The structures of subsites −1 and +1 are almost identical to those of other GH31AGs. Two catalytic aspartic acid residues, Asp-469 and Asp-568, are located in β→α loops 4 and 6, respectively. The active site pocket is occupied by acarbose (Fig. 2). The valienamine unit (ring A) and the 4-amino-4,6-dideoxy-α-D-glucose unit (ring B) of acarbose, occupying subsites −1 and +1, respectively, are enclosed by a number of hydrogen bonds and van der Waals interactions (Fig. 3). Asp-357, Arg-552, Asp-568, and His-626 form hydrogen bonds with the hydroxy groups of ring A. Asp-398, Trp-432, and Asp-597 interact with the hydroxy groups of ring A through water-bridging hydrogen bonds. Ile-396, Trp-467, and Trp-565 are located at the bottom of the active site pocket. Trp-329, Ile-358, Trp-432, Phe-476, and Phe-601 are located at the entrance of the active site pocket and seem to form a hydrophobic barrier. Asp-232 in the N-loop and Arg-552 in the (β/α)8 barrel domain interact with ring B at subsite +1 through hydrogen bonds. Met-470, which is present in two conformations, appears to make contact with ring B. All of the above residues are invariant among GH31AGs except Trp-329. The residue equivalent to Trp-329 is conserved as Trp or Tyr in GH31AGs. The identity of this aromatic residue was reported to be related to the preference for α-1,4- or α-1,6-glucosidic linkages in several GH31AGs. For example, R. obeum α-glucosidase, possessing Trp-169 at this position, exhibits α-1,6-glucoside specificity, and the substitution of Trp-169 with Tyr switches the substrate preference of R. obeum α-glucosidase from α-1,6-glucoside to α-1,4-glucoside (12). In addition, for both NtMGAM and CtMGAM, the specificity constant kcat/Km for α-1,6-glucoside was increased by replacement of the Tyr residue with Trp at this position (13). The relatively high specificity of SBG for the α-1,6-glucosidic linkage (the kcat/Km for isomaltose is one-fifth of that for maltose) (8) is likely because of Trp-329.
Subsites +2 and +3-In contrast to the numerous interactions at subsites −1 and +1, only a few interactions hold the two glucose moieties (rings C and D) of acarbose at subsites +2 and +3 (Figs. 2 and 3). Subsites +2 and +3 are composed of residues provided by the N-loop. The nitrogen atom of Ala-234 and Nδ2 of Asn-237 interact with O6 and the ring oxygen (O5) of ring D through hydrogen bonds. Ile-233 and Phe-236 form a hydrophobic lining for rings C and D. We previously proposed that Phe-236 in the N-loop contributes to the formation of subsites +2 and +3 via a London dispersion force interaction based on the results of the site-directed mutagenesis study without the tertiary structure information (14). The present structural study confirmed these suggestions and also indicates a contribution from the side chain of Asn-237 to the substrate binding at subsite +3.
FIGURE 2. Acarbose recognition by SBG. Shown are stereo diagrams with electron density of acarbose bound to the active site pocket of SBG. A, amino acid residues and water molecules (red spheres) interacting with acarbose. The catalytic nucleophile and acid/base are Asp-469 and Asp-568, respectively. B, structure around the glucose moiety at the reducing end of acarbose. The contour level of the 2Fo − Fc map is 1σ, and the structure is colored as described in the legend to Fig. 1.
Site-directed Mutagenesis of Asn-237-To evaluate the contribution of Asn-237 to the substrate specificity of SBG, we produced the N237A mutant enzyme using a Pichia pastoris expression system and assessed the kinetic properties of this mutant enzyme for a series of malto-oligosaccharides (G2-G7), amylose (G18, average DP = 18), and soluble starch (Table 2). The N237A kcat values for all substrates were ~75% of those of wild-type recombinant SBG (rSBG). The reduction in kcat values is likely because of the change in the optimum pH. The optimum pH of N237A was pH 5.3, but the kinetic parameters were determined under the same reaction conditions as used for rSBG at pH 4.8 to better compare the kinetic constants. The N237A Km values for G2 and G3 were almost the same as those of rSBG, whereas the Km values for G4-G7 were 1.9-2.7 times higher than those of rSBG. The reduction in kcat/Km values for malto-oligosaccharides longer than G3 was larger than those for G2 and G3. These results indicate that the substitution of Asn-237 with Ala decreased the affinity for malto-oligosaccharides longer than G3. It is noteworthy that N237A displayed a smaller kcat/Km for G4 (49.6 s⁻¹ mM⁻¹) than for G3 (62.6 s⁻¹ mM⁻¹), whereas wild-type rSBG exhibited a larger kcat/Km value for G4 than for G3. This result indicates that the N237A mutant lost the increment in binding energy at subsite +3 and that Asn-237 contributed to the formation of subsite +3. A reduction in affinity at subsite +3 should increase the Km values for G4-G7 because subsite +3 would contribute to binding G4-G7.
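The loss of a binding energy increment can be put on an energy scale by converting the ratio of kcat/Km values for successive chain lengths into an apparent free-energy change, ΔG = −RT ln(ratio). The sketch below is a simplified estimate, not the subsite mapping analysis of the earlier study: it assumes that the ratio reflects only the added glucosyl unit, and the temperature of 37 °C (310.15 K) is an assumption because the assay temperature is not given here.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def increment_kj_per_mol(kcat_km_longer, kcat_km_shorter, temperature_k=310.15):
    """Apparent free-energy change (kJ/mol) on extending the substrate by one unit.

    Negative values mean the extra glucosyl unit improves kcat/Km;
    positive values mean it is unfavorable.
    ASSUMPTION: default temperature of 310.15 K (37 degrees C).
    """
    ratio = kcat_km_longer / kcat_km_shorter
    return -R * temperature_k * math.log(ratio) / 1000.0


if __name__ == "__main__":
    # N237A values quoted in the text (s^-1 mM^-1): G4 = 49.6, G3 = 62.6.
    print(f"{increment_kj_per_mol(49.6, 62.6):+.2f} kJ/mol")
```

For N237A, the G4/G3 ratio is below 1, so the apparent increment is positive (unfavorable), which is the quantitative counterpart of the statement that the mutant lost the binding energy gained at subsite +3.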
Additional Subsites-Previous subsite mapping analysis indicated that SBG possesses subsites from −1 to +6 (14). In addition, the kcat/Km values of N237A for G5-G7, G18, and soluble starch increased gradually with chain length. These results indicate that SBG has other subsites unrelated to Asn-237. However, the present acarbose complex structure does not provide information on additional subsites. Thus, we expected the other subsites to be located in the direction of the anomeric hydroxy group of ring D of the bound acarbose. The structure of the SBG-acarbose complex shows that ring D at subsite +3 is an α-glucosyl moiety (Fig. 2B). Another refinement was performed by placing a β-glucosyl moiety in this position, but no electron density for the equatorial O1 with β-configuration was observed. The axial hydroxy group of ring D is oriented toward subdomain b2, and thus, additional subsites likely exist in subdomain b2.
Site-directed Mutagenesis of Ser-497-Among the residues in subdomain b2, we anticipated that Ser-497 (with dual conformation), in which Oγ is at a distance of 6.7 Å from the anomeric hydroxy group, contributes to the formation of other subsites. To confirm this, the S497A mutant enzyme was produced and characterized. The optimum pH of the S497A mutant enzyme was the same as that of wild-type rSBG. S497A exhibited almost the same kinetic parameters for substrates G2, G3, and G4 as rSBG; however, the mutant enzyme exhibited a 2.1-fold larger Km and a 2.1-fold smaller kcat/Km for G5 compared with rSBG (Table 2). These results indicate that the substitution of Ser-497 with Ala had a negative effect on substrate binding at subsite +4. The substitution of Ser-497 decreased the specificity for substrates longer than G5, increasing Km values and decreasing kcat/Km for the substrates. The reduction of affinity at subsite +4 may affect the long-chain specificity of SBG.
The site-directed mutagenesis study provided evidence that Ser-497 contributed to the formation of subsite +4 and that the reducing end of the longer substrate moved to subdomain b2. It raised the possibility that subdomain b2 includes additional subsites beyond subsite +4. To our knowledge, this is the first example showing that subdomain b2 is involved in substrate binding in GH31AGs. The function of subdomain b2 in GH31AGs has been reported in only MalA and was related not to substrate binding but to maintenance of the quaternary assembly of the hexamer (9). Plant GH31AGs possess a similar subdomain b2 and a serine residue equivalent to Ser-497 (see Fig. 5A) except for barley GH31AG, which prefers shorter substrates (7, 8, 30-32). The Ser residues and subdomain b2 may contribute to the long-chain specificity of plant GH31AGs.
Divergence of Substrate Recognition in GH31AGs-As mentioned above, the structures of subsites −1 and +1 are almost identical among GH31AGs. However, those of subsites +2 and +3 are divergent. Subsites +2 and +3 of SBG contain the N-loop; however, NtMGAM and CtMGAM have different architecture at subsites +2 and +3 compared with SBG (Fig. 4). Subsites +2 and +3 of CtMGAM contain mainly Trp-1369 in the specific 21-amino acid insertion and Phe-1560 in β→α loop 7 of the catalytic domain. Pro-1159 on the N-loop is situated near subsite +3, but its contribution to substrate binding seems to be modest. NtMGAM has few interactions with sugar molecules bound at subsites +2 and +3, and its N-loop is unrelated to substrate binding. This structural feature is reflected in the Ki values for acarbose, i.e. the Ki of NtMGAM (62 μM) is higher than those of SBG (6.68 μM) and CtMGAM (14 μM) (11). In addition, the N-loop of SBG is likely to possess a different role, which is related to the long-chain specificity of SBG, i.e. the side chains of Phe-236 and Asn-237 in the N-loop make the reducing end of the long-chain substrates move toward subdomain b2, where subsite +4 and possible additional subsites exist (Fig. 4A). CtMGAM has no such machinery, although it displays specificity for longer substrates. The acarbose molecule in CtMGAM twists around Phe-1560; thus, the reducing end of acarbose is oriented away from subdomain b2 (Fig. 4B).
The structural comparison indicates that the N-loop is the key structural element that governs the long-chain specificity. GH31AGs, even short chain-specific enzymes, possess the N-loop; however, its amino acid sequence is divergent. This divergence probably causes the difference in the affinity for the substrate and determines the destination of the reducing end of the long-chain substrates. Plant GH31AGs with higher substrate specificity for longer substrates possess a conserved N-loop and an Asn residue equivalent to Asn-237 (Fig. 5B). This Asn residue is likely important for determining the binding of the reducing end of the longer substrate and contributes to the long-chain specificity. It is of interest that short chain-specific barley GH31AG has a Tyr residue in place of the Asn residue. The Tyr residue may not be adequate to hold the substrate and organize the direction of the reducing end of the longer substrates.
[Figure legend, partial] The catalytic residues and the residues related to substrate binding at subsites +2 and +3 are shown in stick representation. The catalytic nucleophile and acid/base are labeled with nu and a/b, respectively. Numbers indicate subsite numbers. The red sphere indicates the water molecule. The red helix in C represents the 21-amino acid insertion in CtMGAM, and other color coding is as described in the legend to Fig. 1.
In conclusion, we have presented a mechanism that explains the long-chain specificity of SBG. The N-loop and subdomain b2 appear to be associated with long-chain specificity. In particular, it is likely that Phe-236 and Asn-237 in the N-loop play key roles in determining the long-chain specificity by forming subsites +2 and +3 and guiding the reducing end of long substrates to subdomain b2, in which we identified subsite +4 as well as other possible more distant subsites.