Structural and Kinetic Analysis of Schwanniomyces occidentalis Invertase Reveals a New Oligomerization Pattern and the Role of Its Supplementary Domain in Substrate Binding

Schwanniomyces occidentalis invertase is an extracellular enzyme that hydrolyzes sucrose and releases β-fructose from various oligosaccharides and essential storage fructan polymers such as inulin. We report here the three-dimensional structure of Sw. occidentalis invertase at 2.9 Å resolution and its complex with fructose at 1.9 Å resolution. The monomer presents a bimodular arrangement common to other GH32 enzymes, with an N-terminal 5-fold β-propeller catalytic domain and a C-terminal β-sandwich domain for which the function has been unknown until now. However, the dimeric nature of Sw. occidentalis invertase reveals a unique active site cleft shaped by both subunits that may be representative of other yeast enzymes reported to be multimeric. Binding of the tetrasaccharide nystose and the polymer inulin was explored by docking analysis, which suggested that medium size and long substrates are recognized by residues from both subunits. The identified residues were mutated, and the enzymatic activity of the mutants against sucrose, nystose, and inulin was investigated by kinetic analysis. The replacements that showed the largest effect on catalytic efficiency were Q228V, a residue putatively involved in nystose and inulin binding, and S281I, involved in a polar link at the dimer interface. Moreover, a significant decrease in catalytic efficiency against inulin was observed in the mutants Q435A and Y462A, both located in the β-sandwich domain of the second monomer. This highlights the essential function that oligomerization plays in substrate specificity and assigns, for the first time, a direct catalytic role to the supplementary domain of a GH32 enzyme.

Fructans, the fructose-rich polymers derived biosynthetically from sucrose, are important storage oligosaccharides and polysaccharides in many bacteria and fungi and numerous plant species. Furthermore, sucrose is one of the most widespread disaccharides in nature and is especially ubiquitous in higher plants as the first free sugar resulting from photosynthesis. It is the major transport compound to bring energy and carbon skeletons from source to sink tissues. Carbohydrate partitioning and sugar sensing are intimately connected to sucrose metabolism; these processes are vital throughout plant development. Therefore, the enzymes involved in fructans and sucrose processing are essential to plant cell metabolism.
The enzymes that hydrolyze sucrose are referred to collectively as invertases or β-fructofuranosidases (EC 3.2.1.26) and catalyze the release of β-fructose from the nonreducing end of various β-D-fructofuranoside substrates (Fig. 1). The cleavage of the β-glycosidic bond is carried out by a double displacement catalytic mechanism that retains the configuration of the fructose anomeric carbon; two conserved residues, an aspartic acid and a glutamic acid, act as the nucleophile and the general acid-base catalyst, respectively. On the basis of their amino acid sequences (1), these enzymes are classified into family 32 of the glycosylhydrolases (GH32), which is included in the GH-J clan together with GH68 (the inulosucrase family).
To date, the three-dimensional structures of exoinulinase from Aspergillus niger (var. awamori; AaEI) (2), a β-fructofuranosidase from Thermotoga maritima (TmInv) (3), a plant FEH from Cichorium intybus (CiFEH) (4), and a cell wall invertase from Arabidopsis thaliana (AtInv) (5), all included in the GH32 family, have been solved by x-ray crystallography. These studies show a common bimodular arrangement for this family, with the enzymes folded into a catalytic domain and a β-sandwich domain of unknown function. In addition, the complexes of these enzymes with short substrates, i.e. fructose (2), sucrose (6), and the trisaccharides 1-kestose (7) and raffinose (8), have also been reported. These studies provide insights into the fructose-binding pocket, i.e. the −1 subsite following the nomenclature used for sugar-binding subsites (9), which identifies the cleavage as occurring between subsites −1 and +1. Some general features of the +1 and +2 subsites are also described: whereas the +1 subsite is formed by several residues making hydrophobic and polar interactions with the substrates, the sugar unit bound at subsite +2 seems to be more loosely recognized, mostly through a few polar interactions mediated by water molecules. Moreover, additional binding subsites for longer substrates have been ruled out (8).
It is worth noting that a GH32 yeast invertase structure has not been reported until now, a remarkable fact when taking into account that yeast invertases have been described as multimeric. As shown by electron microscopy (10), the basic structural unit of external and internal yeast invertase is a dimer, which seems able to associate into larger oligomers upon mannose binding (11). These studies suggest that the extent of glycosylation of invertase determines its oligomeric state, but the molecular mechanisms involved in higher aggregation are uncertain. It remains to be elucidated whether the supplementary β-sandwich domain of the enzyme plays the role of a carbohydrate recognition domain and, therefore, is involved in oligomerization, at least in yeast enzymes.
The nonconventional yeast Schwanniomyces occidentalis (also Debaryomyces occidentalis) has been regarded for years as a biotechnologically interesting organism because of its ability to grow in a broad range of inexpensive carbon sources, such as starch and inulin, using a number of activities (12-15), as well as for its efficient extracellular secretion of high molecular mass proteins (16,17) and its low attached glycosylation (18). An extracellular invertase/fructofuranosidase activity has also been reported in this yeast when lactose was used as the carbon source (14). This secreted protein showed broad substrate specificity, hydrolyzing sucrose, 1-kestose, nystose, and raffinose (Fig. 1). In addition, the enzyme is able to catalyze the synthesis of fructo-oligosaccharides (FOS), mainly the prebiotic 6-kestose (a β-(2-6)-linked FOS), with 1-kestose (a β-(2-1)-linked FOS) being a side product of the reaction (19). Currently the main industrial FOS producer is the fructosyltransferase from Aspergillus (20,21), which provides a mixture of FOS of the inulin-type structure β-(2-1) without producing 6-kestose. The protein amino acid sequence from Sw. occidentalis shares considerable sequence similarity with other yeast invertases, inulinases, levanases, and fructosyltransferases included in the GH32 family (22). In this article we report the crystal structure of the invertase produced by Sw. occidentalis (SoInv), which serves as a model for other yeast GH32 enzymes. Computational docking studies were done to identify the putative binding position of long substrates in the active site. Potential oligosaccharide-binding residues were mutated, and the activities of the mutants were characterized. Our results, illustrating the dimeric structure of yeast invertase, led to the conclusion that this oligomerization is involved directly in substrate recognition and specificity.

EXPERIMENTAL PROCEDURES
Protein Purification, Crystallization, and Data Collection-Extracellular SoInv was purified from Sw. occidentalis ATCC26077 and crystallized in an orthorhombic crystal form as described previously (23). Equal amounts of protein solution (7 mg ml⁻¹ SoInv in 20 mM Tris-HCl, pH 7.5, and 150 mM NaCl) and precipitant buffer (15% (w/v) PEG 3350 in 100 mM Hepes, pH 7) were mixed and vapor-equilibrated against a reservoir containing the latter solution in a sitting drop setup. These rod-shaped crystals belong to the P2₁2₁2₁ space group, with two molecules/asymmetric unit and 67% solvent content in the cell. For data collection, native crystals were transferred to cryoprotectant solutions containing 18% (w/v) PEG 3350 and 30% (w/v) glycerol before being cooled to 100 K in a stream of nitrogen. A full data set was collected at the European Synchrotron Radiation Facility (ESRF, Grenoble, France), using the ID14-1 beamline, up to 2.9 Å. Additional data sets collected at beamlines ID23-1 and ID23-2 yielded lower resolution data. In an attempt to increase the diffraction power of the crystals, the sample was deglycosylated by treatment with endoglycosidase H (New England Biolabs), which reduced the molecular mass from 73 kDa, as determined by mass spectrometry, to an apparent molecular mass below 66 kDa, as deduced from the SDS-PAGE analysis. The sample was concentrated to 10 mg ml⁻¹, and the screening of new crystallization conditions was performed with the PACT Suite from Qiagen. Small cubes and thick plates grew in drops containing 20% PEG 6000, 0.2 M MgCl₂, with the pH ranging from 6 to 8. These crystals belong to the P2₁ space group with two molecules in the asymmetric unit and 42% solvent content. For data collection, a plate grown at pH 7 was soaked in a solution containing mother liquor and 50 mM nystose for 3 min before being transferred to cryoprotectant solution consisting of 25% PEG 6000, 0.2 M MgCl₂, and 15% PEG 400 and cooled to 100 K in a nitrogen stream.
A full data set was collected to 1.9 Å at the ESRF ID14-4 beamline. Analysis of the diffraction patterns was performed using MOSFLM (24), and data were merged with the CCP4 suite (25). A summary of data collection and data reduction statistics is shown in Table 1.
Structure Solution and Refinement-The structure of SoInv was solved using the molecular replacement method with data to 2.9 Å collected from the orthorhombic crystal form. Initial phases were obtained with the MOLREP program (26) using the atomic coordinates of AaEI as the search model (Protein Data Bank code: 1Y4W). The best solution was obtained by using the sequence alignment of both enzymes to correct the model, which after rigid body fitting produced an R-factor of 0.49 with a correlation coefficient of 0.49. Crystallographic refinement was performed by using Refmac (27) in the CCP4 suite (25) with flat bulk solvent correction, including low resolution data to 50 Å, and using maximum likelihood target features. A subset of 5% randomly selected structure factor amplitudes was excluded from automated refinement to compute a free R-factor throughout the refinement. Large regions of the β-sandwich domain were removed from the model, as no electron density was observed for the polypeptide chain. These regions were mainly the loops of the β-sandwich but also some of the β-strands. Several rounds of refinement applying tight NCS (noncrystallographic symmetry) restraints between both molecules in the asymmetric unit were alternated with major rebuilding of these regions using the program O (28), after which the final 2Fo − Fc maps showed continuous density for the whole polypeptide chain. At the final stage, water molecules and several N-acetylglucosamine and glycerol molecules were included, combined with more rounds of positional and individual restrained B-factor refinement and translation/libration/screw-rotation correction. When the monoclinic high resolution crystal form became available, additional rounds of refinement using the same strategy but applying loose NCS restraints between both molecules in the asymmetric unit were carried out, and minor corrections were introduced into the model, mainly concerning the orientation of the peptide bond.
This led to a final R-factor of 0.21 (Rfree = 0.25) for all data to 1.9 Å resolution. The final refinement parameters are reported in Table 1. The stereochemistry of the model was checked by using PROCHECK (29) and MOLPROBITY (30), and Figs. 3, 4, and 6-8 were generated with PyMOL (31).
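To make explicit what the R-factor and free R-factor quoted above measure, the following minimal Python sketch computes R = Σ||Fobs| − |Fcalc|| / Σ|Fobs| over a working set and over a randomly selected 5% test set. The amplitudes are synthetic, the function names are hypothetical, and no refinement is carried out; the scaling, bulk solvent correction, and maximum likelihood targets applied by Refmac are not reproduced here.

```python
import random


def r_factor(f_obs, f_calc):
    """Conventional crystallographic R = sum(||Fo| - |Fc||) / sum(|Fo|)."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def split_free_set(n_reflections, fraction=0.05, seed=0):
    """Randomly flag a fraction of reflection indices as the 'free' test set."""
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = max(1, round(fraction * n_reflections))
    free = sorted(indices[:n_free])
    work = sorted(indices[n_free:])
    return work, free


# Synthetic amplitudes (illustrative only, not data from this study).
rng = random.Random(1)
f_obs = [rng.uniform(10.0, 1000.0) for _ in range(2000)]
f_calc = [f * rng.uniform(0.8, 1.2) for f in f_obs]

work, free = split_free_set(len(f_obs))
r_work = r_factor([f_obs[i] for i in work], [f_calc[i] for i in work])
r_free = r_factor([f_obs[i] for i in free], [f_calc[i] for i in free])
print(f"R(work) = {r_work:.3f}, R(free) = {r_free:.3f}")
```

In an actual refinement the test reflections are withheld from the target function, so that Rfree serves as a cross-validation statistic; in this sketch the two sets differ only by random sampling.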
Automated Docking of Ligands into SoInv Structure-The nystose coordinates were retrieved from the crystal structure (32) deposited in the Cambridge Structural Database (CSD). All hydrogen atoms were added to the ligand, and charges were assigned by the Gasteiger method using the AutoDockTools program (33). All bonds of hydroxyl groups and glycoside linkages, but not those of the sugar rings, were defined as rotatable bonds. The inulin chain coordinates were also retrieved from the crystal structure determined from electron diffraction data (34) and deposited in the Cambridge Structural Database. In this case, only hydroxyl groups were allowed to rotate, the glycoside linkages being maintained in a fixed helical conformation. The protein model contains the coordinates of the crystal structure of dimeric SoInv presented here after the removal of water molecules and all heteroatoms. Polar hydrogen atoms were then added using AutoDockTools. AutoDock 4.0 (33) was executed 50 times with the Lamarckian genetic algorithm, a population size of 150, elitism set at 1, mutation rate at 0.02, and crossover rate of 0.80. Simulations were performed with a maximum of 2,500,000 energy evaluations and a maximum of 27,000 generations. Docking results were clustered using a cut-off of 2 Å r.m.s. deviations. In the case of inulin, additional manual docking was followed by optimization of inulin using the local search (LS) algorithm (AutoDock 4.0) executed 30 times with 300 iterations and with the size of the local search space set at 1. In these calculations, the 11 C1-C2 angles linking two fructose units were allowed to rotate, as this has been suggested as the most flexible of the three glycoside torsions within each fructose-fructose pair (34).
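For readers wishing to reproduce a comparable search, the genetic algorithm settings listed above can be collected as a fragment of an AutoDock 4 docking parameter file (DPF). The Python sketch below only formats the reported values; the mapping onto DPF keywords (ga_run, ga_pop_size, and so on) is an assumption of this sketch, and a complete DPF additionally requires map, ligand, and output entries that are not given in the text.

```python
# Search settings reported in the text, keyed by AutoDock 4 DPF keywords.
# The keyword mapping is an assumption; the original DPF is not available.
GA_SETTINGS = {
    "ga_run": 50,                  # number of Lamarckian GA runs
    "ga_pop_size": 150,            # population size
    "ga_elitism": 1,               # top individuals carried over unchanged
    "ga_mutation_rate": 0.02,
    "ga_crossover_rate": 0.8,
    "ga_num_evals": 2500000,       # maximum energy evaluations
    "ga_num_generations": 27000,   # maximum generations
    "rmstol": 2.0,                 # clustering tolerance in angstroms
}


def dpf_fragment(settings):
    """Format keyword/value pairs as DPF lines, one keyword per line."""
    return "\n".join(f"{key} {value}" for key, value in settings.items())


fragment = dpf_fragment(GA_SETTINGS)
print(fragment)
```

Keeping the settings in a single dictionary makes it straightforward to log them alongside the docking results or to vary one parameter at a time.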
Mutagenesis and Protein Purifications-The sequence SoINV (GenBank accession number CQ890277) was amplified using the forward primer 5′-TAGGATCCAACATGGTACAAGTTTTAAGTGTATTAG-3′ and the reverse primer 5′-CATCTCTAGACTAGCCCTACTTATTTAGTTCTCT-3′. Restriction sites for BamHI and XbaI (shown in bold) were included in these primers for cloning the PCR product into the pYES2.0 shuttle vector (Invitrogen) under the control of the yeast GAL1 promoter. The generated construct was used as the template for site-directed mutagenesis using the method described previously (35) and the primers listed in Table 2. Saccharomyces cerevisiae EUROSCARF Y02321 (BY4741; MATa; his3Δ1; leu2Δ0; met15Δ0; ura3Δ0; YIL162w::kanMX4) was used as the expression host, and it was transformed by the lithium acetate method (36). Yeast cells were grown on YPGal medium (yeast extract 1%, peptone 1%, and galactose 2%) to the beginning of the stationary phase (OD660 = 6-7). Culture filtrates (1 liter) containing invertase activity were concentrated through a HY200 filter (30,000 molecular weight cut-off) using a VivaFlow 50 system (Vivascience). The concentrated fractions (50 ml) were applied to a DEAE-Sephacel chromatography column (10 ml) equilibrated with 20 mM Tris-HCl, pH 7 (buffer A). The proteins were eluted with a 0-0.5 M NaCl gradient at 1 ml min⁻¹. The 0.1 M fractions (2 ml) were pooled, dialyzed in buffer A, concentrated to the desired volume by using the Microcon YM-10 (Amicon) system, and stored at −70 °C. SDS-PAGE (8%) of the samples confirmed the purification of the proteins, and protein concentration was estimated photometrically at 280 nm (NanoDrop spectrophotometer ND-1000). The enzyme activities of all mutants were also detected by zymogram analysis (Fig. 2) using nondenaturing gradient gels (4-15%) from Bio-Rad, which were subsequently stained with 1% (w/v) 2,3,5-triphenyltetrazolium chloride (19).
Enzyme and Kinetic Analysis-All of the enzymatic assays were done by the dinitrosalicylic acid method adapted to a 96-well microplate scale as described previously (19). One unit of activity was defined as that catalyzing the formation of 1 μmol of reducing sugar min⁻¹ ml⁻¹. For kinetic analysis, velocity measurements were performed with 0.3-5.3 mg ml⁻¹ enzyme in triplicate at 2.5-100 mM sucrose or nystose and 1-10 mM inulin. Reactions were performed for 20 min in 50 mM NaAc, pH 5. Curve plotting and analysis of the curves were carried out using the SigmaPlot program (version 7.101). Kinetic parameters were calculated by fitting the initial rate values to the Michaelis-Menten equation.
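The fitting step can be illustrated with a short Python sketch of the Michaelis-Menten relation v = Vmax[S]/(Km + [S]). Note that the kinetic parameters in this work were obtained by fitting in SigmaPlot; the sketch below instead uses the Hanes-Woolf linearization ([S]/v = [S]/Vmax + Km/Vmax) with ordinary least squares, chosen only because it needs no external library. The substrate concentrations span the 2.5-100 mM range quoted above, but the rates are synthetic and the Vmax and Km values are hypothetical.

```python
def michaelis_menten(s, vmax, km):
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def fit_hanes_woolf(substrate, rates):
    """Estimate (Vmax, Km) from the line [S]/v = [S]/Vmax + Km/Vmax."""
    y = [s / v for s, v in zip(substrate, rates)]
    n = len(substrate)
    mean_x = sum(substrate) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in substrate)
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(substrate, y))
    slope = sxy / sxx                      # equals 1 / Vmax
    intercept = mean_y - slope * mean_x    # equals Km / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


# Synthetic, noise-free example (hypothetical parameters, arbitrary rate units).
substrate_mM = [2.5, 5.0, 10.0, 25.0, 50.0, 100.0]
true_vmax, true_km = 120.0, 20.0
rates = [michaelis_menten(s, true_vmax, true_km) for s in substrate_mM]

vmax_fit, km_fit = fit_hanes_woolf(substrate_mM, rates)
print(f"Vmax = {vmax_fit:.2f}, Km = {km_fit:.2f} mM")
```

With real, noisy measurements a direct nonlinear fit is generally preferable, because linearized forms distort the error structure of the data.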
Analytic Ultracentrifuge Analysis-Sedimentation equilibrium experiments were performed in a Beckman Optima XL-A Ultracentrifuge using a Ti50 rotor and six channel centerpieces of Epon-charcoal (optical path length, 12 mm). Samples of purified SoInv in the concentration range of 0.2 to 0.5 mg ml⁻¹ were equilibrated against 20 mM citrate, pH 4.5; 20 mM Tris, pH 7; 20 mM Hepes, pH 7; and 20 mM Tris, pH 9, and centrifuged at 8,000, 10,000, and 14,000 rpm at 293 K. Radial scans at 280 nm were taken at 12, 14, and 16 h. The three scans were identical, indicating that equilibrium conditions had been reached. The weight average molecular mass (Mw) of SoInv was determined by using the program EQASSOC with the partial specific volume of SoInv set to 0.71 at 293 K as calculated from its amino acid composition and referred to that reported for the monomer (72 kDa) (23).
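The principle behind this measurement can be sketched with the standard relation for a single ideal species at sedimentation equilibrium, c(r) = c(r0) exp[M(1 − v̄ρ)ω²(r² − r0²)/(2RT)], so that the slope of ln c against r² yields M. The Python sketch below is not the EQASSOC procedure used here; it assumes a solvent density of 1000 kg m⁻³, takes the partial specific volume as 0.71 ml g⁻¹, and uses a synthetic concentration profile and hypothetical radial positions.

```python
import math

R_GAS = 8.314462618  # gas constant, J mol^-1 K^-1


def angular_velocity(rpm):
    """Convert rotor speed in rpm to rad s^-1."""
    return rpm * 2.0 * math.pi / 60.0


def mass_from_profile(radii_m, conc, vbar, rho, rpm, temp_k):
    """Molar mass (kg/mol) from the least-squares slope of ln(c) versus r^2."""
    x = [r ** 2 for r in radii_m]
    y = [math.log(c) for c in conc]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    omega = angular_velocity(rpm)
    return slope * 2.0 * R_GAS * temp_k / ((1.0 - vbar * rho) * omega ** 2)


# Assumed conditions: 10,000 rpm, 293 K, water-like solvent density.
VBAR = 0.71e-3        # m^3 kg^-1 (0.71 ml/g; units assumed)
RHO = 1000.0          # kg m^-3 (assumed)
RPM, TEMP = 10000, 293.0
TRUE_MASS = 144.0     # kg/mol, i.e. a dimer of the 72 kDa monomer

# Synthetic equilibrium profile between 6.80 and 7.10 cm (hypothetical radii).
radii = [0.068 + 0.0005 * i for i in range(7)]
k = TRUE_MASS * (1.0 - VBAR * RHO) * angular_velocity(RPM) ** 2 / (2.0 * R_GAS * TEMP)
conc = [0.05 * math.exp(k * (r ** 2 - radii[0] ** 2)) for r in radii]

mass = mass_from_profile(radii, conc, VBAR, RHO, RPM, TEMP)
print(f"M = {mass:.1f} kg/mol, ratio to monomer = {mass / 72.0:.2f}")
```

Dividing the fitted mass by the monomer mass gives the apparent stoichiometry, which is how a value near 2 identifies a dimer.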

RESULTS AND DISCUSSION
Crystal Structure of SoInv Shows a Bimodular Folding-Previous attempts to crystallize the native form of SoInv expressed in S. cerevisiae failed to obtain untwinned, good quality diffracting crystals as reported; therefore, the extracellular enzyme was isolated directly and purified from Sw. occidentalis (23). This sample gives crystals belonging to the orthorhombic P2₁2₁2₁ space group, producing a diffraction pattern up to 2.9 Å resolution. The three-dimensional structure of the enzyme was solved by molecular replacement and refined. To improve the crystal quality, the sample was partially deglycosylated with endoglycosidase H, a treatment that yielded a monoclinic crystal form giving data to 1.9 Å resolution. Experimental details and structure determination procedures are given under "Experimental Procedures." The model of the native crystals contains six glycerol molecules from the cryoprotectant solution bound at the active site, whereas the deglycosylated protein crystal was soaked in the substrate nystose prior to data collection and contains the product fructose at the active site (supplemental Fig. S1). The SoInv structure obtained from both crystals is essentially the same, and therefore all of the following structural descriptions will be based on the high resolution deglycosylated form. As in the other GH32 family enzymes that are known structurally, the invertase molecule is bimodular and folds into a catalytic five-bladed β-propeller (residues 24-356) and a C-terminal β-sandwich (residues 365-535) linked by a short 10-residue segment (Fig. 3). The propeller is assembled from five blades (I-V), each composed of four antiparallel β-strands (A-D from the axis toward the outside of the propeller) connected by turns in a classical "W" pattern. This domain belongs to the β-furanosidase superfamily (37), which contains four other members: GH43 arabinase, GH62 arabinofuranosidase (with no three-dimensional structure known up to now), GH68 levansucrase, and a group of hypothetical proteins, some of which are also predicted to be glycosylhydrolases. Interestingly, this was the first glycosylhydrolase superfamily shown to include enzymes with both inversion (GH43) and retention (GH32 and GH68) mechanisms of action (37). Furthermore, this clan shares the five-bladed β-propeller fold with tachylectin II, which was the first protein described as showing this folding (38), and human apyrase (39).

FIGURE 2. Gel analysis of the SoInv mutants. Culture filtrates (500 ml) of the S. cerevisiae transformants that expressed the wild-type or mutant SoInv proteins (indicated at the top of the image) were processed for protein purification as described under "Experimental Procedures." Purified proteins (5-20 μg) were subjected to PAGE analysis, and the enzyme activities were revealed in situ. Numbers at the left indicate the positions of the bovine serum albumin molecular masses used as a control (in kDa).

TABLE 2. Oligonucleotides employed for mutagenesis. Forward (F) and reverse (R) sequences are shown with the mutations in bold letters. Columns: Mutant, Orientation, Sequence.
The second domain of invertase is composed of two antiparallel six-stranded β-sheets folded into a β-sandwich topology. This domain is the region that is most variable in sequence among GH32 family members (sequence identity of SoInv as low as 10-12% with enzymes from bacteria or plants); it belongs to the wide concanavalin A-like lectins/glucanases superfamily with which it shares similar levels of identity.
Five of the eight predicted glycosylation sites were visible in the electron density maps and were assigned to Asn-72, Asn-119, Asn-219, Asn-334, and Asn-394. Four of them are located in the catalytic domain linked to blades I, II, III, and V, and one is in the β-sandwich domain. They are all at the surface of SoInv, far from the active site. A different glycosylation pattern is possibly found in the enzyme expressed in S. cerevisiae, and as reported previously, this likely prevents proper crystal growth (23). In this regard, a single extracellular protein band of about 95 kDa was detected in SDS-PAGE analysis when SoInv was expressed and purified from S. cerevisiae cells, which is 20 kDa higher than the wild-type protein expressed in Sw. occidentalis.
The Loops That Confer Specificity to Enzymes within the GH32 Family-As explained previously, the three-dimensional structure has been reported for four other members of the GH32 family: AaEI (2), TmInv (3), CiFEH (4), and AtInv (5). These structures are good representatives of several of the diverse enzymatic functionalities found among the GH32 family and also correspond to different organisms. SoInv shares the highest identity (35%) with the fungal enzyme, whereas only 25 and 23% identity are found with respect to the bacterial and plant enzymes. However, the sequence conservation within the family is not equally distributed along the polypeptide chain, the identity being higher in the catalytic domain (39 and 24% in the catalytic and the β-sandwich domains, respectively, in the case of SoInv as compared with AaEI). The Cα atoms of the AaEI, TmInv, and CiFEH structures superimposed onto SoInv with r.m.s. deviations of 1.9, 2.1, and 2.6 Å, respectively (based on the overlap of 468, 408, and 449 atoms).
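The r.m.s. deviations quoted above summarize the distances between equivalent Cα atoms once the structures have been superimposed. As a minimal illustration of the quantity itself, the Python sketch below computes the r.m.s. deviation for two lists of paired coordinates that are assumed to be already superimposed; it does not perform the structural alignment or optimal fitting step, and the coordinates shown are made up.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between paired 3D points (same order, same length)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Made-up coordinates in angstroms: every atom of set B is displaced by (1, 2, 2).
set_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
set_b = [(x + 1.0, y + 2.0, z + 2.0) for x, y, z in set_a]

value = rmsd(set_a, set_b)
print(f"r.m.s. deviation = {value:.1f} A")
```

Because the value depends on which atom pairs are included, an r.m.s. deviation is only meaningful together with the number of overlapped atoms, as reported in the text.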
Structural superimposition of these enzymes onto SoInv is shown in Fig. 4a (AtInv is highly similar to CiFEH and has been omitted for clarity). Despite the moderate sequence identity among the family, the core of the propeller is very similar in shape, with blade I the most conserved and blade IV the least conserved region. Moreover, an insertion is observed in strand D4 (Figs. 4 and 5) creating a significant disruption of blade IV in AaEI and CiFEH that is not observed in the enzymes from bacteria and yeast, although this is the least regular sheet in the propeller. It is worth noting that this region is involved directly in the oligomerization of SoInv, which will be explained below.
Furthermore, the analysis of the largest differences observed in the surroundings of the β-propeller center enabled us to identify the substrate binding specificity determinants. As shown in Fig. 4, the topology of this folding leads to the catalytic active site being shaped by the loops connecting the different blades of the propeller (L1-L4) and the turns linking strand B to C within each blade (TI-TV). The structural alignment of the sequences with known structure helps greatly in delimiting the segments involved in shaping these loops (Fig. 5). As seen in Fig. 4, with the exception of TI and L4, these loops are highly variable in length and sequence, with those in AaEI being the longest and those in SoInv the shortest, and they create a different environment around the catalytic pocket in each enzyme. Interestingly, loop TV shapes the interface with the β-sandwich domain, and the different conformation presented by this loop in the plant enzymes produces a wider cleft that has been postulated to be the inulin-binding site (4). In contrast, this loop protrudes into the active site in the other enzymes and contains a tryptophan (Trp-314 in SoInv), which is likely involved in substrate recognition. Moreover, loop TIV contains a long insertion in SoInv and AaEI that interacts with the β-sandwich domain to a greater extent, consequently making a deeper pocket in the yeast and fungal enzymes. All of the relevant residues found at the active site of SoInv are located in these loops, as will be discussed later. A final, remarkable structural feature of SoInv is an extended 10-residue loop linking strands C5 to D5 in blade V of the propeller (Tyr-339-Leu-349), located at the opposite side of the active site, which is longer than in the other enzymes (Fig. 4a) and is involved in the oligomerization of the enzyme, as described below.
SoInv Is a Dimeric Enzyme-Two units associate in a "butterfly" shape to form a tight dimer related by noncrystallographic 2-fold symmetry, as shown in Fig. 3. The total surface area for this dimer is 36,680 Å², with a buried surface of 3,990 Å² (Pisa server (40)). Sixty percent of the contacts shorter than 4 Å in this interface are produced by the interaction of the two catalytic domains, mainly through blade IV of each unit. This region contains the only two hydrogen bonds that link the main chains of both subunits within the dimer (Table 3): Gly-223-O,N from L3 in one unit is linked to Ser-281-O and Thr-283-N from strand D4 of the other. In addition, the above mentioned extended loop, Tyr-339-Leu-349, from one subunit (significantly longer in SoInv, as noted previously) makes a large number of contacts with L3 and also with TIII from the other subunit. Interestingly, all of these contacts arise at the active site surface, as shown in Fig. 6. Furthermore, the catalytic pocket of one monomer is surrounded by loops from the β-sandwich domain of the other, and thus 16% of the total contacts within the dimer interface arise from the interaction between these two domains, which occurs essentially through TV. There are only two hydrogen bonds in this region, one in this loop TV through the Trp-314 main chain and the other from Asn-227 located in loop L3. However, it is interesting to note that these interactions participate in a network of polar interactions linking residues from the different domains in a fixed conformation that stabilizes the interface and forms the floor of the active site cavity within the dimer (Fig. 6). As Fig. 5 shows, all of the residues of the catalytic domain that participate in polar interactions or short contacts to the other subunit within the dimer are variable among GH32 enzymes from different sources.

FIGURE 4. a, structural superimposition of the GH32 enzymes onto SoInv showing that the major differences in the catalytic domain are concentrated in the highly variable loops connecting the different blades of the propeller (L1-L4) and the turns linking strand B to C within each blade (TI-TV). A long insertion in a SoInv loop located in blade V, involved in dimer formation, is indicated by an arrow. b, molecular surface of the SoInv-fructose (blue), AaEI-fructose (violet), TmInv-raffinose (brown), and CiFEH-1-kestose (green) complexes highlighting the above mentioned loops that shape the active site. Loops TIV and TV, defined in Fig. 5, are marked on the SoInv surface. The bound ligands are represented as spherical models.
The rest of the interface is made up of direct contacts between both β-sandwich domains, as shown in Fig. 3b. It has been proposed previously that this supplementary domain of yeast invertase might play the role of a carbohydrate recognition domain involved in higher oligomerization formation (3). As seen previously, the SoInv structure reveals additional functions for this domain, even though the arrangement in which both domains are found in the SoInv dimer suggests that such a role could be performed cooperatively by them.

FIGURE 5. Structural alignment of known GH32 catalytic domains. SoInv superimposed onto AaEI, TmInv, and CiFEH (Protein Data Bank codes 1Y4W, 1UYP, and 1ST8, respectively) as given by the DALI server (44). The black squares indicate amino acid similarity as calculated by the program ALSCRIPT (45). Secondary structure elements for SoInv as suggested by PROCHECK (29) are shown above the amino acid sequences. Letters A-D indicate the β-strands, from the axis toward the outside of the propeller, within each blade. L1-L4, connecting the different blades of the propeller, and TI-TV, linking strand B to C within each blade, are marked with an orange bar. Catalytic residues are highlighted with a black asterisk. Residues of SoInv involved in hydrophobic (caret) and polar (asterisk) interactions within the dimer interface are marked in green. Residues of SoInv involved in substrate recognition and mutated in this work are highlighted in brown.
Previous work on S. cerevisiae invertase (ScInv) has reported different oligomerization states of the internal isoform at different pH values, with the enzyme being a monomer at pH 9.4, a dimer at pH 8.3, and an apparent octamer at pH 4.9 (41). It has also been reported that external ScInv exists as a dimer that self-associates into higher aggregates in proportions that vary with the pH, time, and concentration of the enzyme (11). As we observed previously (23), SoInv is a dimer at neutral pH, but to further investigate the oligomeric state of SoInv at different conditions, we performed a series of analytical ultracentrifugation experiments using buffers at different pH values, i.e. 4.5, 7, and 9, which always led to a molecular mass corresponding to the dimer (supplemental Fig. S2). These results indicate that oligomerization of SoInv is apparently not pH-dependent. Moreover, higher aggregates have not been detected in SoInv. The polypeptide sequence of SoInv is only two residues longer than that of ScInv, with both sequences being 66% identical. Nevertheless, an analysis of the alignment between both sequences (not shown) points to the conclusion that only half of the hydrogen bonds found in SoInv could be present in a putative ScInv dimer; therefore, this dimer might present some structural differences that would lead to a different, higher aggregation pattern than the one observed in SoInv.
The Active Site Is Made up from Both Subunits-SoInv presents a narrow pocket, 12 Å deep and 11 Å wide, typical for an exo-enzyme and similar to that found in AaEI. In the bacterial and plant enzymes, the shorter TIV loop mentioned above makes the interface between the catalytic and the β-sandwich domains more open to the solvent, leaving a wider gate to the active site (Fig. 4). Furthermore, the previously reported analysis of the TmInv-raffinose complex (8) shows that the terminal α-galactose unit is located in the active site pocket close to the protein surface, suggesting that in longer substrates the additional units would dangle out into the solvent with no links to the enzyme. However, oligomerization of SoInv places the β-sandwich domain of the other subunit facing this pocket (see Fig. 3c) and creates a straight cavity 25 Å deep, which is accessible to the solvent through a slot between both subunits with a maximum width of 18 Å. Therefore, additional subsites may be created in this groove, and SoInv seems able to recognize oligosaccharide units distal from the scissile bond, as shown below. Fig. 7a is a close-up view of the active site pocket of SoInv showing the six glycerol molecules found in the native crystals; the position of the substrates is inferred from the structural superimposition of the TmInv-raffinose (8) and CiFEH-1-kestose (7) complexes onto the SoInv coordinates. As observed, the substrates are located in putative subsites close to three of the glycerol moieties, which are generally considered good indicators of sugar-binding sites. This suggests that the other glycerol molecules could reveal additional binding subsites that are able to recognize longer substrates. It is remarkable to find that these glycerol molecules interact with the other subunit of the dimer.
Failure to crystallize the recombinant form of SoInv prevented the genetic replacement of the catalytic residue in order to obtain crystals from the inactivated enzyme-substrate complexes. To overcome this impediment, we tried an alternative strategy proposed for the crystallographic analysis of glycosidases, the exploitation of sugars as a cryoprotectant substitute for glycerol (42), a method that had yielded successful results in our previous studies (43). Following this procedure, we soaked the deglycosylated crystals for a short time in a cryosolution containing the highest possible concentrations of nystose and fructosylnystose, which saturated the active sites of all of the protein molecules and which were subsequently immobilized by flash cryocooling in a nitrogen stream. As we observed some ice formation, a small amount of PEG 400 was also added to the cryoprotectant solution. However, at all of the soaking times tried, the substrates were processed, and only the product fructose was bound at the active site.
Automatic Docking of Substrates Suggests Relevant Details of the Active Site-In view of the impossibility of getting crystals from the complexes, the binding of long substrates such as tetrasaccharide nystose and polymeric inulin was investigated by docking analysis. The first trials of automatic docking of the substrates into the SoInv active site, as obtained in the crystal structure, failed to produce productive complexes, as the ligands were positioned with the fructose of the nonreducing end out from the catalytic −1 subsite. An inspection of the active site of SoInv showed a different orientation of the Glu-230 and Tyr-293 side chains than found in the above mentioned reported complexes from the GH32 family.
Therefore, these two residues were modeled in the same conformation as observed experimentally in the complexes, with this change leading to the successful docking of the nystose molecule into the SoInv active site in a biologically significant position. In the case of the polymer inulin, only an additional change in the Arg-178 side chain allowed the substrate to enter into the active site.
Using nystose as a ligand, we obtained the lowest total docking energy (−11.9 kcal/mol) for a position (yellow in Fig. 7c) in the highest ranked cluster, which contained 24 of 50 runs and had a mean docking energy of −9.0 kcal/mol. The second ranked cluster contains only three positions presenting a less extended oligosaccharide chain (not shown) with a mean docking energy of −9.7 kcal/mol. Finally, the third ranked cluster (with mean binding energy of −7.58 kcal/mol) accounted for 20 conformations that situated the nonreducing fructose ring significantly shifted away from the −1 binding site; these were not considered, as they are not biologically productive conformations. In the case of the polymeric inulin, a segment containing 12 fructose units was docked into a position with the lowest docking energy of −6.6 kcal/mol (Fig. 7d) included within the highest ranked cluster, which contained 37 of 50 runs. As this model displayed the terminal fructose of inulin slightly flipped with respect to that observed in the SoInv-fructose complex, this position was further improved by manual docking followed by local optimization (Fig. 7d). In summary, docking analysis of the substrates nystose and inulin into the SoInv active site located these ligands in energetically favorable positions, situating the nonreducing end in the fructose site observed in the SoInv-fructose complex and the distal oligosaccharide units in the positions occupied by glycerol molecules in the native crystals. Docking of these ligands involved a rearrangement of some of the residues of the SoInv active site, in particular Arg-178, Glu-230, and Tyr-293, which therefore must be directly implicated in substrate binding. The modeled complexes suggest that the nystose and inulin chains interact directly with residues from both subunits, a unique feature of SoInv.
Furthermore, the analysis of its molecular surface revealed that the dimer structure delineates an active site channel that is perfectly adapted to accommodate the polymeric chain of inulin, as illustrated in Fig. 8 showing the SoInv-inulin complex obtained by docking.
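The cluster statistics reported for the nystose docking can be tallied with a short script. The following is only an illustrative sketch: the class and function names are hypothetical, the numbers are transcribed from the text, and the `discarded` flag reflects solely the statement that the third cluster was rejected as non-productive. It is not the docking software's own analysis.

```python
from dataclasses import dataclass

TOTAL_RUNS = 50  # number of docking runs reported for nystose


@dataclass(frozen=True)
class DockingCluster:
    """One ranked cluster of docked poses (hypothetical container)."""
    rank: int
    members: int        # number of runs that fell into this cluster
    mean_energy: float  # mean energy quoted in the text, kcal/mol
    discarded: bool     # True if the text rejects the cluster as non-productive


# Values transcribed from the text for the nystose ligand.
NYSTOSE_CLUSTERS = [
    DockingCluster(rank=1, members=24, mean_energy=-9.0, discarded=False),
    DockingCluster(rank=2, members=3, mean_energy=-9.7, discarded=False),
    DockingCluster(rank=3, members=20, mean_energy=-7.58, discarded=True),
]


def population_fraction(cluster, total_runs=TOTAL_RUNS):
    """Fraction of all docking runs that ended in the given cluster."""
    return cluster.members / total_runs


def unassigned_runs(clusters, total_runs=TOTAL_RUNS):
    """Runs not accounted for by the clusters listed in the text."""
    return total_runs - sum(c.members for c in clusters)


if __name__ == "__main__":
    for c in NYSTOSE_CLUSTERS:
        status = "discarded" if c.discarded else "kept"
        print(f"cluster {c.rank}: {c.members}/{TOTAL_RUNS} runs "
              f"({population_fraction(c):.0%}), mean {c.mean_energy} kcal/mol, {status}")
    print("runs outside the three reported clusters:",
          unassigned_runs(NYSTOSE_CLUSTERS))
```

The tally makes explicit that the top-ranked cluster holds just under half of the runs and that the three clusters described do not account for all 50 runs.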
Substrate Specificity Is Elucidated by Kinetic Analysis of Mutants-An inspection of the SoInv active site in the putative complexes points to the residues that may be involved in recognizing substrates of different length. Fig. 7b shows the position of the substrate sucrose inferred from the structural superposition of the AtInv-Suc complex (6) onto the SoInv coordinates, whereas Fig. 7, c and d, shows the nystose and inulin positions obtained by docking. As observed, the pocket is surrounded by aromatic residues, i.e. Trp-76, Phe-110, Tyr-293, and Trp-314, which are strictly conserved among the GH32 family (see Fig. 3), as is the Asp-50/Glu-230 pair assigned as the catalytic residues. The residues Arg-178 and Asp-179 are also conserved and belong to the well known RDP motif involved in substrate binding. All of these residues appear to play a role in substrate recognition that is shared with the other GH32 enzymes. Therefore, we investigated the role of Asn-142, Gln-176, Gln-228, and Asn-254 in the enzymatic activity of SoInv, as these residues are not conserved and consequently must be responsible for the differences in specificity encountered among the family. The structure of the SoInv dimer interface (Fig. 6) points to Gln-435 and Tyr-462 as possible candidates involved in substrate recognition; these residues were also investigated by mutagenesis experiments. In addition, Ser-281, involved in one of the two main-chain hydrogen bonds linking both subunits (Table 3), was also mutated to a bulky residue with the aim of investigating the possible existence of an active monomer or at least how a distortion in the dimer could affect the enzymatic activity. Table 4 gives the mutants we produced and their corresponding kinetic parameters. The expected absence of activity in mutants D50A and E230A confirmed the catalytic role of these residues. Furthermore, the experimental results allowed grouping the other mutants into four general groups.
First, the Q228V and S281I mutants exhibited a strikingly large reduction in the catalytic efficiency (kcat/Km) against sucrose and inulin, which was even larger when measured against nystose (at least a 100-fold decrease). Gln-228 is a residue located at the highly variable L3; the equivalent TmInv Glu-188 was described previously as linking the glucose unit in the TmInv-raffinose complex (8). Indeed, our docking analysis of SoInv shows that Gln-228 may be involved directly in binding the glucose at the reducing end of nystose through hydrogen bonds (Fig. 7c) and also the third fructose unit of inulin (Fig. 7d); therefore a decrease in catalytic efficiency against long substrates could be expected. However, there is also a 5-fold reduction in activity against sucrose that is not explained by the putative SoInv-sucrose complex (Fig. 7b), as Gln-228 would not interact with the substrate in this case. Nevertheless, it is interesting to note that this residue is next to Asn-227, which is involved in a network of polar links to the other molecule; the Q228V substitution (a Val is found in AaEI) might alter the atomic interactions at the interface and consequently at the active site pocket, leading to a less efficient enzyme. The key role of the dimer in the enzymatic activity is further supported by the fact that the S281I substitution, at the dimer interface and far from the active site, yields similarly large reductions in the kinetic parameters, most probably by producing a somewhat distorted dimer (this mutant still migrates as a dimer in a native electrophoresis analysis; see "Experimental Procedures" and Fig. 2).

TABLE 4 Kinetic analysis of SoInv mutants
The kcat values were calculated assuming a protein molecular mass of 180 kDa. The ± sign refers to standard errors based on curve fitting using SigmaPlot. The mutants D50A and E230A gave no activity against any of the substrates. WT, wild type.
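The Table 4 legend states that kcat was computed for a protein molecular mass of 180 kDa, and the discussion compares mutants by fold changes in kcat/Km. The sketch below illustrates that arithmetic under stated assumptions: it supposes that Vmax is available as a specific activity in µmol min⁻¹ mg⁻¹ (the units are not given in this section), the function names are hypothetical, and the numbers in the usage lines are invented placeholders, not values from Table 4.

```python
def kcat_per_second(vmax_umol_per_min_per_mg, mass_kda=180.0):
    """Convert a specific Vmax (umol min^-1 mg^-1, assumed units) to kcat (s^-1).

    A protein of M kDa weighs M mg per umol, so 1 mg contains 1/M umol of
    enzyme; dividing Vmax by that amount gives turnovers per minute, i.e.
    Vmax * M, and a further division by 60 converts to turnovers per second.
    """
    return vmax_umol_per_min_per_mg * mass_kda / 60.0


def catalytic_efficiency(kcat, km):
    """Catalytic efficiency kcat/Km (units follow the inputs)."""
    return kcat / km


def fold_decrease(kcat_wt, km_wt, kcat_mut, km_mut):
    """How many times lower the mutant kcat/Km is than the wild-type value."""
    return catalytic_efficiency(kcat_wt, km_wt) / catalytic_efficiency(kcat_mut, km_mut)


if __name__ == "__main__":
    # Placeholder numbers for illustration only (not taken from Table 4).
    kcat_wt = kcat_per_second(100.0)   # hypothetical wild-type Vmax
    kcat_mut = kcat_per_second(10.0)   # hypothetical mutant Vmax
    print("kcat WT (s^-1):", kcat_wt)
    print("fold decrease in kcat/Km:",
          fold_decrease(kcat_wt, 10.0, kcat_mut, 50.0))
```

Because the fold change is a ratio of two efficiencies, it is unaffected by the assumed molecular mass as long as the same mass is used for the wild type and the mutant.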
A second group is formed by the mutants Q435A and Y462A, both of which show a 3-fold decrease in catalytic efficiency against inulin and a slighter effect against the other two substrates. The exception is the Y462A mutant assayed against nystose, for which a lower kcat and reduced affinity lead to a large 12-fold decrease in the overall catalytic efficiency. Both residues are located at the β-sandwich domain of molecule B and participate in the polar interaction network that shapes the interface and the active site pocket (Fig. 6); therefore their substitution must affect the interface and the efficient positioning of the oligosaccharides within the catalytic pocket. As deduced from the complexes shown in Fig. 7, c and d, Gln-435 could be hydrogen-bonded to nystose and inulin, whereas Tyr-462 might participate in nystose binding but seems too far from the inulin chain, and neither residue participates in sucrose binding. This fact reinforces the idea that even a small change in the dimer interface has a deleterious effect on activity, and the concerted action of both subunits determines the catalytic efficiency of SoInv.
The third group is composed of the N142Y and N254A mutants, which have a reduced catalytic efficiency most apparent when the activity is assayed against nystose. Asn-142, located at the end of strand B2, is a Tyr in the fungal and bacterial enzymes and has been described as binding the terminal galactose moiety of raffinose, through a water molecule, in the TmInv-raffinose complex (8). As seen in Fig. 7, direct interaction of Asn-142 with the substrates cannot be deduced from any of the complexes, but interaction through solvent molecules cannot be ruled out. Moreover, the SoInv crystal structure shows that the side chain of Asn-142 is hydrogen-bonded to Glu-107 from loop L1 and Gln-147 located at strand C2; its replacement by a Tyr probably rearranges the conformation of the active site residues, lowering the affinity for the substrates (Table 4). Unexpectedly, this lower affinity is compensated by a higher kcat, as found when it was measured against sucrose or inulin. On the other hand, Asn-254 from the variable loop T IV is located at the bottom of the pocket, and its side chain could be hydrogen-bonded to the fructose unit at subsite +2 in the putative SoInv-nystose complex (Fig. 7c). Accordingly, the substitution N254A decreases the affinity for nystose by 4-fold. No interaction is observed in the other complexes. It is noteworthy that the equivalent TmInv Thr-208 has been described previously as interacting also with the glucose unit of the TmInv-raffinose complex. However, our results showed no significant influence on activity against sucrose or inulin upon mutation, and therefore, this residue possibly plays a role in the binding of medium-size substrates.
Finally, we studied the role of Gln-176, located at the end of L2 and preceding the conserved RDP motif. This residue is linked to the essential Gln-228 and, more significantly, to Arg-178 from this RDP motif, which, as explained previously, was necessarily shifted to succeed in properly docking inulin into the SoInv active site. However, the replacement of Gln-176 by either Glu or Ser does not reveal any clear influence on SoInv catalytic efficiency: in the Q176E mutant it is slightly increased against nystose but somewhat reduced against the other two substrates, whereas the Q176S mutant curiously shows the opposite effect. Although the moderate magnitude of the changes in catalytic efficiency indicates that this residue may not be critical, the deviations observed in the kcat and Km values of the Gln-176 mutants suggest that this position should be investigated through the analysis of further mutants to determine the precise role of this residue in the activity of SoInv.
In summary, docking and mutagenesis analyses of SoInv reveal that substrates longer than sucrose might be recognized by hydrogen links to residues Gln-228 and Asn-254 from one subunit and Gln-435 and Tyr-462 from the other. However, the amino acid replacements that have a deleterious effect on catalytic activity, i.e. Q228V and S281I, seem to be related also to some change in the interface within the dimer or, at least, to the conformation of the active site cavity, supporting the idea that oligomerization is the key to delineating SoInv functionality.
Conclusions-Fructans and related compounds play a key role in metabolism and energy exchange and may be employed as storage polymers in organisms such as plants. Several microorganisms use them as regular energy and carbon sources by the expression of hydrolytic enzymes. In this study, we have described the structure of dimeric Sw. occidentalis invertase, which is an example of how these proteins have adapted, in many yeasts, to the hydrolysis of fructans. The crystal structure of the SoInv subunit shows a bimodular arrangement similar to that described previously for other GH32 enzymes. A detailed comparison of the catalytic domain with those of other known structures helped delimit the regions that are essential in determining substrate specificity among the family members. However, the active site of SoInv is exceptional, as it is built up by both subunits and presents substrate-binding sites additional to those described for known enzymes. This structure is an interesting example of how oligomerization can be a molecular mechanism for fine-tuning the functionality within a family of homologous enzymes. Moreover, the supplementary β-sandwich domain was previously postulated to play a role as an ancillary carbohydrate recognition domain, owing to its lectin-like fold and to what had been observed in other glycosyl hydrolases. However, the SoInv structure reveals that this domain is involved in contouring the active site cleft and also plays a direct role in binding the substrates, both medium-size oligosaccharides and the polymeric chain of inulin. This unique feature of the structure presented here may help in understanding the molecular mechanisms of protein-carbohydrate recognition implicated in modularity, a pivotal property within the glycosyl hydrolases that is crucial to many biologically relevant degradative processes.
The specificity of SoInv has been investigated by a docking analysis of long substrates into its active site, which points to the key role of residues from both subunits in substrate binding; this has been supported by kinetic analysis of mutants. The replacements that presented the most deleterious effect on activity were Q228V, a residue involved directly in substrate recognition, and S281I, a residue responsible for one of the two main-chain hydrogen-bonding links between both subunits and, therefore, significant to proper oligomerization. Mutation of the other identified residues, Asn-142, Gln-176, and Asn-254 from the catalytic domain and Gln-435 and Tyr-462 from the β-sandwich domain of the other subunit, was also shown to have an effect on substrate binding or the catalytic rate, confirming the interactions of these residues with the substrate. Consequently, our results assign a direct catalytic role to the supplementary β-sandwich domain of SoInv, the first time that such a role has been observed within GH32 enzymes.