Three-dimensional Structure of Saccharomyces Invertase

Background: Invertase is a fundamental enzyme for sugar metabolism in yeast and a classical model in early biochemical studies.
Results: Invertase shows an unusual octameric quaternary structure composed of two types of dimers.
Conclusion: A peculiar pattern of monomer assembly through non-catalytic domain interactions determines invertase specificity.
Significance: Unraveling the structural features that rule enzyme modularity casts new light on protein-carbohydrate recognition.
Invertase is an enzyme that is widely distributed among plants and microorganisms and that catalyzes the hydrolysis of the disaccharide sucrose into glucose and fructose. Despite the important physiological role of Saccharomyces invertase (SInv) and the historical relevance of this enzyme as a model in early biochemical studies, its structure had not yet been solved. We report here the crystal structure of recombinant SInv at 3.3 Å resolution showing that the enzyme folds into the catalytic β-propeller and β-sandwich domains characteristic of GH32 enzymes. However, SInv displays an unusual quaternary structure. Monomers associate in two different kinds of dimers, which are in turn assembled into an octamer, best described as a tetramer of dimers. Dimerization plays a determinant role in substrate specificity because this assembly sets steric constraints that limit the access to the active site of oligosaccharides of more than four units. Comparative analysis of GH32 enzymes showed that formation of the SInv octamer occurs through a β-sheet extension that seems unique to this enzyme. Interaction between dimers is determined by a short amino acid sequence at the beginning of the β-sandwich domain. Our results highlight the role of the non-catalytic domain in fine-tuning substrate specificity and thus supplement our knowledge of the activity of this important family of enzymes. In turn, this gives a deeper insight into the structural features that rule modularity and protein-carbohydrate recognition.

Invertase (EC 3.2.1.26; β-fructofuranosidase) catalyzes the hydrolysis of the disaccharide sucrose (table sugar) into glucose and fructose and is a major enzyme present in plants and microorganisms. Because the yeast Saccharomyces was one of the preferred materials in early biochemical studies, yeast invertase became one of the classical model enzymes. Mitscherlich described in 1842 the existence in yeast of a substance capable of inverting dextrorotatory cane sugar into a levorotatory sugar that was identified in 1847 by Dubrunfaut as a mixture of glucose and fructose. In 1860, for the first time, Berthelot carried out the isolation of invertase (see Ref. 1). Some years later, the whole theory of enzyme kinetics was based on experimental results obtained with yeast invertase (2). Because invertase was found to be intimately associated with the yeast cells and because its purification required the preparation of yeast extracts, it was considered to be an intracellular enzyme. However, de la Fuente and Sols (3) showed that the enzyme is secreted by yeast cells and that the hydrolysis of sucrose is extracellular. Subsequently, it was discovered that the yeast produces two types of invertase: a heavily glycosylated secreted form and a non-glycosylated intracellular form (4,5). Yeast invertase is encoded by a family of repeated SUC genes (6-9). The enzyme is normally secreted by yeast as a heavily glycosylated octameric protein. The large mass of the protein leads to it being trapped in the cell wall (10,11). Both the non-glycosylated cytoplasmic form and the secreted form of invertase are encoded by the same gene (8). These two forms are transcribed as two mRNAs of different length, which are translated into polypeptides of different size. In addition, the longer one encodes the signal peptide needed for secretion (12).
On the basis of sequence similarity, invertase is classified within family 32 of the glycoside hydrolases (13). In addition to invertases, this family (designated GH32) includes inulinases and levanases, involved in the hydrolysis of fructose-containing polysaccharides, and also transglycosylases with fructose transferase activity. GH32 enzymes act by a retaining mechanism in which an aspartate located close to the N terminus acts as the catalytic nucleophile and a glutamate acts as the general acid/base catalyst. The reaction proceeds through attachment of the aspartate nucleophile to a fructosyl unit of the donor substrate. The fructosyl is subsequently released by hydrolysis or transferred to an acceptor sugar substrate (transglycosylation). In recent years, the crystallographic structures of several GH32 enzymes from bacteria (14-16) and eukaryotes (17-21) have been reported. A close phylogenetic relative to Saccharomyces invertase (SInv) is an inulinase from Schwanniomyces occidentalis (22). The characteristic structural feature of GH32 enzymes, shared by GH68 enzymes included in the same GH-J clan, is a 5-fold β-propeller catalytic domain consisting of five blades (each composed of four antiparallel β-strands with a "W" topology), which surround a central negatively charged active site cavity. The GH32 family differs from the GH68 family by an additional β-sandwich domain appended to the catalytic domain.
Besides its historical importance in the development of biochemistry, SInv has extensive industrial applications. It is one of the most widely used enzymes in confectionery, where it serves to make liquid centers in candies, and it is also used during fermentation of cane molasses into ethanol. A new potential application is the synthesis of prebiotic fructo-oligosaccharides used in functional foods and pharmaceutical formulations (23). The use of prebiotics to orchestrate the gut microbiota composition, with the associated benefit to human health, is an emerging issue of the utmost biotechnological interest (24).
In this work, we report the three-dimensional structure of SInv, produced in Escherichia coli, by expression of the SUC2 coding sequence without the 5′-end of the gene corresponding to the secretion signal peptide. Our results reveal the structural basis of the unique oligomerization pattern observed in SInv and provide key factors to understanding the enzymatic activity and specificity of this important enzyme.

EXPERIMENTAL PROCEDURES
Crystals of SInv belonged to space group P3₁21, with eight molecules in the asymmetric unit and 75% solvent content within the unit cell. For data collection, native crystals were transferred to cryoprotectant solutions consisting of mother liquor plus 25% (v/v) glycerol before being cooled to 100 K in liquid nitrogen. Diffraction data were collected using synchrotron radiation at the European Synchrotron Radiation Facility (ESRF, Grenoble, France) on beamline ID23-1. Diffraction images were processed with iMOSFLM (26) and merged using the CCP4 package (27). A summary of data collection and data reduction statistics is provided in Table 1.
Structure Solution and Refinement-The structure of SInv was solved by molecular replacement using the MOLREP program (27). The structure of S. occidentalis β-fructofuranosidase (SoFfase; Protein Data Bank code 3KF3) was used to prepare the search model with the program CHAINSAW (28) and a protein sequence alignment of SInv and SoFfase. A solution containing eight molecules in the asymmetric unit (molecules A-H) was found using reflections within the 50-3.5 Å resolution range and a Patterson radius of 45 Å, which, after rigid body fitting, led to an R-factor of 46%. Crystallographic refinement was performed using the program REFMAC (29) within the CCP4 suite with flat bulk solvent correction and using maximum likelihood target features. Tight local non-crystallographic symmetry (NCS) and jelly body restraints were applied during the first steps of refinement. The free R-factor was calculated using a subset of 5% randomly selected structure factor amplitudes that were excluded from automated refinement. Some regions of the polypeptide chain, in particular loops 190-200 and 230-240 and β-strand 228-250, all located at the interface between dimers, were not visible in the electron density of molecules EFGH and were excluded from the model during the first stages of the refinement. Furthermore, two NCS groups were defined, composed of molecules ABCD and EFGH, respectively. After iterative refinement and rebuilding of these regions using the program Coot (30), the final 2Fo − Fc map showed continuous density for the whole protein. At the later stages, water molecules were included in the model, which, combined with more rounds of restrained refinement, led to a final R-factor of 22.9% (Rfree = 23.7%) for all data up to 3.3 Å resolution. Refinement parameters are reported in Table 1. Stereochemistry of the models was checked with PROCHECK (31) and MolProbity (32). The figures were generated with PyMOL (33).
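For readers less familiar with the refinement statistics quoted above, the following minimal Python sketch shows how a crystallographic R-factor and a randomly chosen 5% free set can be computed from paired observed and calculated structure factor amplitudes. It only illustrates the standard definition, R = Σ||Fobs| − |Fcalc|| / Σ|Fobs|. It is not the REFMAC procedure, which also involves scaling and bulk solvent correction, and the function names and amplitudes are hypothetical.

```python
import random


def r_factor(f_obs, f_calc):
    """Return sum(||Fobs| - |Fcalc||) / sum(|Fobs|) for paired amplitudes."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def split_free_set(n_reflections, fraction=0.05, seed=0):
    """Randomly assign a fraction of reflection indices to the free set.

    Returns (work_indices, free_indices); the free set is never used to
    drive refinement and only serves for cross-validation.
    """
    rng = random.Random(seed)
    n_free = max(1, round(n_reflections * fraction))
    free = set(rng.sample(range(n_reflections), n_free))
    work = [i for i in range(n_reflections) if i not in free]
    return work, sorted(free)


if __name__ == "__main__":
    # Hypothetical amplitudes, for illustration only (not data from this study).
    f_obs = [120.0, 85.5, 40.2, 15.8, 230.1, 66.0, 12.4, 98.7]
    f_calc = [110.3, 90.1, 35.0, 20.2, 215.6, 70.4, 10.9, 101.2]
    # A larger fraction than 5% is used here only because the toy list is tiny.
    work, free = split_free_set(len(f_obs), fraction=0.25, seed=1)
    r_work = r_factor([f_obs[i] for i in work], [f_calc[i] for i in work])
    r_free = r_factor([f_obs[i] for i in free], [f_calc[i] for i in free])
    print(f"R-work = {r_work:.3f}, R-free = {r_free:.3f}")
```

A small gap between the working and free values, as in the 22.9% and 23.7% reported here, is usually read as a sign that the model is not overfitted to the working reflections.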
Analysis of the interfacial surfaces and oligomer stability was done with the Protein Interfaces, Surfaces and Assemblies service (PISA) at the European Bioinformatics Institute (34). Root mean square deviation analysis was carried out with the program SUPERPOSE in the CCP4 package (27).
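As a rough illustration of the root mean square deviation statistic mentioned above, the sketch below computes the RMSD between two equal-length lists of Cα coordinates. It assumes that the two chains have already been optimally superposed and that atoms are paired in the same order; the superposition step that a program such as SUPERPOSE performs is deliberately left out, and the function name and coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between paired (x, y, z) coordinates that are already superposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Hypothetical Calpha positions in angstroms, for illustration only.
    chain_1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    chain_2 = [(0.0, 0.5, 0.0), (3.8, 0.0, 0.5), (7.6, 0.0, 0.0)]
    print(f"RMSD = {rmsd(chain_1, chain_2):.2f} A")
```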
Activity Assays-Purified SInv was incubated for different times with sucrose, 1-kestose, nystose, raffinose, or inulin (from dahlia tubers) in 100 mM acetate buffer (pH 4.8) at 50°C. The enzyme was inactivated by heating at 95°C for 10 min. The calculation of the initial velocity of hydrolysis was based on the kinetics of product release. The products of the reaction were analyzed by anion exchange chromatography using a CarboPac PA100 column (4 × 250 mm) coupled to a pulsed amperometric detector (Dionex) as described previously (23).
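One common way to obtain an initial velocity from a product release time course is to fit a straight line to the early, approximately linear time points and take its slope. The text does not state which fitting procedure was used, so the sketch below is only an assumed approach; the function name, units, and data points are hypothetical.

```python
def initial_velocity(times, product):
    """Least-squares slope of product concentration versus time.

    Only time points from the initial linear phase should be passed in;
    the slope is then an estimate of the initial velocity.
    """
    if len(times) != len(product) or len(times) < 2:
        raise ValueError("need at least two paired time/product values")
    n = len(times)
    mean_t = sum(times) / n
    mean_p = sum(product) / n
    s_tt = sum((t - mean_t) ** 2 for t in times)
    if s_tt == 0:
        raise ValueError("time points must not all be identical")
    s_tp = sum((t - mean_t) * (p - mean_p) for t, p in zip(times, product))
    return s_tp / s_tt


if __name__ == "__main__":
    # Hypothetical time course (minutes, micromolar product), illustration only.
    minutes = [0.0, 2.0, 4.0, 6.0, 8.0]
    fructose_um = [0.0, 9.8, 20.3, 29.9, 40.1]
    print(f"v0 = {initial_velocity(minutes, fructose_um):.2f} uM/min")
```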

RESULTS
The invertase of the yeast Saccharomyces was produced in E. coli, purified, and crystallized as reported previously (25). We present here the three-dimensional structure of the enzyme solved by molecular replacement at 3.3 Å resolution. The experimental and structure determination details are given under "Experimental Procedures" and in Table 1. The crystals belong to space group P3₁21, with the asymmetric unit containing a complete homo-oligomer of eight subunits. Each chain (A-B-C-D-E-F-G-H) consists of 512 residues with a molecular mass of 58.5 kDa as calculated from its primary structure. The imposition of tight NCS during early refinement led to a model with eight identical subunits. Most of the polypeptide chain exhibited good electron density, but there were some segments that remained undefined, mainly the regions located at the subunit interface. Consequently and according to the oligomerization pattern shown by SInv (described below), two strict NCS groups were defined so that chains A-B-C-D and E-F-G-H were refined independently. This scheme led to a continuous electron density in all the subunits, where all the residues could be fitted. The final model showed a root mean square deviation of 1.02 Å between respective Cα atoms from both NCS groups, with the differences being restricted to some particular regions of the polypeptide chain, as explained below.
Crystallized SInv Is an Octameric Enzyme-The molecular mass of purified SInv, 428 kDa (see Fig. 2b), was consistent with an octamer association as proposed previously (25). Structural analysis showed that it is a flat square-shaped octamer with dimensions of 130 × 130 × 110 Å and is made up of eight subunits related by non-crystallographic 2-fold symmetry parallel to the c axis in a 222 arrangement (Fig. 1a). Its molecular surface is 136,914 Å², with a total buried surface area of 22,296 Å².
However, this octamer is best described as a tetramer of dimers that oligomerize by intersubunit extension of the two β-sheets that end the β-sandwich domain within each subunit (Fig. 1b). Furthermore, close inspection revealed that SInv forms two classes of dimers, AB/CD and EF/GH, which are located at opposite vertices of the square. The subunits of these two classes of dimers associate differently with each other, and, thus, the EF/GH dimers can be described as an "open" assembly, whereas the AB/CD dimers form a "closed" arrangement. Structural superposition of chain B on chain F shows that a 15° rotation would be necessary to bring monomer A into the E position (Fig. 1c), with three regions of the SInv monomer acting as hinges (colored blue in Fig. 1d, left).
FIGURE 2. a, a sample of purified enzyme (2 mg/ml) in 0.05 M phosphate buffer (pH 7) and 150 mM NaCl was injected onto a Superdex 200 10/300 GL column coupled to an ÅKTA purifier system (GE Healthcare) previously equilibrated with the same buffer. Elution was carried out at a flow rate of 0.5 ml/min for 1.5 column volumes (solid line). SInv eluted mainly as a peak of ~430 kDa (peak 1), indicative of an octameric structure. A shoulder of this peak at lower elution volumes, highlighted with an arrow (peak 2), probably corresponds to higher molecular mass aggregates. Calibration of the column was performed with molecular mass standards (Bio-Rad catalog no. 151-1901) eluted under the same conditions (dotted line): peak a, thyroglobulin (670 kDa); peak b, γ-globulin (158 kDa); peak c, ovalbumin (44 kDa); peak d, myoglobin (17 kDa); and peak e, vitamin B12 (1.35 kDa). b, nondenaturing PAGE analysis of purified SInv performed on 6% polyacrylamide gels. Gels were stained with Coomassie Blue (left), or alternatively, invertase activity was detected by incubating the gels in sucrose and subsequently staining with 1% (w/v) 2,3,5-triphenyltetrazolium chloride in 0.25 M NaOH (right) as described previously (22). Oligomers lower than the octamer appeared after heating SInv at 47°C or incubating with urea. These treatments also decreased the proportion of the sample that formed higher aggregates. Other external conditions, such as addition of NaCl (0.1, 0.5, and 1 M), dilution (1:3, 1:5, and 1:10), or varying pH (4.5, 7.5, and 8.5), did not alter the pattern of untreated SInv (not shown).
FIGURE 3. [...] Fig. 1c, with one subunit being shown in ribbon representation for clarity. a, the AB/CD dimers are tightly made by interactions among both their catalytic and β-sandwich domains. The base of the catalytic pocket is additionally lined by hydrophobic interactions through Phe-388 and Phe-296. b, by contrast, the EF/GH dimers interact only through their β-sandwich domains. In addition, the catalytic pocket is also paved by a new salt bridge formed between Asp-45 and Lys-385 from the β-sandwich domain, which lines the cavity. A putative 1-kestose molecule is shown in spherical representation (inferred as explained in the legend to Fig. 1).
Like the known GH32 members, the SInv subunit folds into two domains, a catalytic β-propeller domain (residues 1-334) and a β-sandwich domain (residues 342-512), linked by a short loop (Fig. 1d, right). In turn, the β-propeller domain is assembled from five blades (I-V), each consisting of four antiparallel β-strands (A-D, from the axis toward the outside of the propeller) connected by turns in a classical W pattern. Blade I is the most and blade IV is the least conserved among GH32 family members. The catalytic site is located at the axis of the propeller and is shaped by the loops connecting the different blades (L1-L4) and the turns linking strand B to strand C within each blade (TI-TV). The second domain has two six-stranded antiparallel β-sheets folded into a β-sandwich topology and is the region presenting the lowest sequence homology among GH32 family members.
The quaternary structure of SInv in solution has been analyzed by different methods. Recombinant SInv subjected to gel filtration chromatography eluted mainly as a peak corresponding to the size of an octamer, although a small fraction of the protein appeared as aggregates of higher molecular mass (Fig.  2a). This tendency to aggregate was also observed on nondenaturing polyacrylamide gels (Fig. 2b). Oligomers of less than eight units (hexamers, tetramers, and dimers) were also detected, and incubation of the enzyme at 47°C for 1-2 h or treatment with 2 M urea stimulated the dissociation of the octameric form. The enzyme was active in all these different oligomerization states, as shown by the zymogram test (Fig. 2b). Variability in quaternary structure was also observed by ultracentrifuge analysis (data not shown), which indicated that SInv was octameric but showed a decrease in the average molecular mass at the highest centrifugation force (11,000 rpm). These results are in good agreement with those reported for both intracellular and secreted forms of the native enzyme (10,35).
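The mass and surface figures quoted in this section can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the text (58.5 kDa per chain, eight chains, an observed mass of 428 kDa, and surface areas of 136,914 and 22,296 Å²). The function names are hypothetical, and treating the ratio of the two areas as a "buried fraction" is a simplification made here for illustration rather than a quantity reported by PISA.

```python
SUBUNIT_MASS_KDA = 58.5         # calculated from the primary structure
OBSERVED_MASS_KDA = 428.0       # mass of purified SInv quoted in the text
MOLECULAR_SURFACE_A2 = 136914   # molecular surface of the octamer
BURIED_SURFACE_A2 = 22296       # total buried surface area


def theoretical_oligomer_mass(subunit_mass_kda, n_subunits):
    """Mass expected for an oligomer of n identical subunits."""
    return subunit_mass_kda * n_subunits


def apparent_subunit_count(observed_mass_kda, subunit_mass_kda):
    """Observed mass expressed in units of the subunit mass."""
    return observed_mass_kda / subunit_mass_kda


def buried_fraction(buried_area, molecular_surface):
    """Ratio of buried area to molecular surface (illustrative simplification)."""
    return buried_area / molecular_surface


if __name__ == "__main__":
    for n in (2, 4, 6, 8):
        mass = theoretical_oligomer_mass(SUBUNIT_MASS_KDA, n)
        print(f"{n}-mer: {mass:.1f} kDa")
    count = apparent_subunit_count(OBSERVED_MASS_KDA, SUBUNIT_MASS_KDA)
    print(f"apparent subunit count: {count:.2f}")
    ratio = buried_fraction(BURIED_SURFACE_A2, MOLECULAR_SURFACE_A2)
    print(f"buried/molecular surface: {ratio:.3f}")
```

The observed 428 kDa lies between the values expected for a hexamer (351 kDa) and an octamer (468 kDa) and is nearer the latter. Size-based mass estimates depend on molecular shape, so exact agreement with the calculated octamer mass is not expected.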
Different Atomic Interactions Define Dimer Association-Table 2 lists the polar interactions found within each interface of the octamer. First, the AB/CD dimers are tightly associated by interactions among both their catalytic and β-sandwich domains (Fig. 1c). Fourteen of a total of 32 hydrogen bonds are made between their catalytic domains, mainly through loop L3 and strand D4 and also through a long loop connecting strands C5-D5 (residues 316-328) that makes many interactions near the catalytic pocket, as shown in Fig. 3a. Furthermore, the catalytic pocket of one monomer is surrounded by loops from the β-sandwich domain of the other, and there is a polar interaction from Ser-439 to Gly-235 located at TIV that stabilizes the dimer interface at this region. The base of the catalytic pocket is additionally lined by hydrophobic interactions through Phe-388 and Phe-296 (TV) and direct polar interaction between both β-sandwich domains (Table 2).
By contrast, the EF/GH dimers lack atomic interactions between their catalytic domains, as shown in Table 2, and keep only some hydrogen links between their β-sandwich domains, with half of them not being conserved with respect to the closed dimers. However, the catalytic pocket is also strengthened by a new salt bridge formed between Asp-45 (at loop TI) and Lys-385 (at the loop linking β3-β4 of the β-sandwich domain), which encloses a well defined cavity (Fig. 3b). This different pattern of interaction between the subunits of the two kinds of dimers has two direct implications. First, the EF/GH active site has a wider pocket (Fig. 1c). Second, the lack of interactions between the catalytic domains produces a rearrangement in the dimer interface regions that leads to the structural differences of the two kinds of dimers (Fig. 1d).
Both types of dimers assemble through a similar interface that involves the extension of the two β-sheets of each β-sandwich domain (Fig. 4) centered on β1 and β2, respectively. It is interesting that the sheet constituting the "inner" part of the octamer (Fig. 4a) forms a regular antiparallel intermolecular β-sheet, similar to that reported in multimeric lectins, cytokines, and other proteins (36). These intermolecular interactions between the hydrogen-bonding edges of β-sheets are considered to be a fundamental form of biomolecular recognition (like DNA base pairing) and are involved not only in oligomerization and protein-protein interactions but also in protein aggregation, as they occur in β-amyloid fibril formation (37).
Table 3 summarizes the main features of the different interfaces found in the octamer as analyzed by the PISA program (34). As shown, the interface area of the closed dimers is twice that found in the open dimers, as is the number of polar interactions. The binding energy (ΔiG) is negative in all interfaces, indicating their hydrophobic nature, but the p value reveals that the AB/CD interfaces, with lowest ΔiG, have a large hydrophobicity at a higher confidence level. Therefore, only the closed dimers, AB/CD, might be expected to exist in solution. The other interfaces may represent weak interactions (38) existing in higher oligomers of SInv depending on external conditions. This agrees with the oligomerization behavior of both cytosolic and secreted SInv isoforms (10,35).
A Few Changes in Its Sequence Determine the Oligomerization Pattern of SInv-The specific role of the β-sandwich domain in GH32 enzyme functionality remained elusive for a long time. The first experimental evidence of its implication in dimerization and substrate binding became available when the first structure from a yeast enzyme, SoFfase, was reported (22). Very recently, the unique role of its β-sandwich domain in substrate recognition has been further demonstrated from the structure of two complexes with long substrates (39). Although SoFfase and SInv share 68% sequence homology (being 48% identical), SoFfase is a dimeric enzyme, and higher aggregation forms have not been detected.
Structural superimposition of SInv and SoFfase (Fig. 5) revealed that the catalytic domains are almost identical and that most regions of the β-sandwich domain are very similar. Accordingly, the hydrogen link pattern in the SoFfase dimer is very similar to that in the closed AB/CD SInv dimers. However, it is remarkable that none of the residues that provide the polar links for the β-sheet extension through β1 and β2 are conserved in SoFfase. Consequently, this region shows poor structural alignment (Fig. 5). Furthermore, the β1-β2 region in SoFfase has a high content of Lys residues (9/20), which are solvent-exposed at the base of the β-sandwich, yielding a positively charged surface that would prevent dimer association along this region due to electrostatic repulsion effects.
FIGURE 5. Structural differences between SInv and SoFfase. a, structural alignment of the SoFfase subunit (slate) and the SInv monomer (red) shows well conserved catalytic domains but larger differences in the β-sandwich domain, mostly at strands β1 and β2, at the base of the β-sandwich domain, which can be attributed to the interactions of the β-sandwich elements in the SInv octamer. b, the β-sandwich of an SInv subunit is shown in surface representation to highlight the effect of the different arrangement in the active site (magnified) at the dimer interface. The putative 1-kestose position is shown in spherical representation (inferred as explained in the legend to Fig. 1).
Quaternary Structure Determines Substrate Specificity-The active site of SInv is located at the interface within each pair of dimers. Because of the two different ways subunits can form dimers (Fig. 1c), the active sites in these dimers may have different environments. Thus, the AB/CD dimers form a very narrow pocket of 10 × 10 Å that seems unable to accommodate an oligosaccharide with more than three or four sugar units (Fig. 6a). In contrast, the wider (20 × 16 Å) entrance cavity observed in the EF/GH dimers might allow longer substrates (Fig. 6b), although this could involve a significant, energetically expensive distortion of the polysaccharide.
A comparison of the catalytic pockets of octameric SInv and SoFfase (Fig. 6c) reveals that the shape and size of the SoFfase cavity are more similar to those within the open EF/GH SInv dimers. More significantly, however, the chemical nature of the residues that form the β-sandwich domain surrounding the active center is quite different in the two enzymes. SoFfase has long chain residues, such as Arg-473, Glu-464, Tyr-462, and Tyr-468 and notably Gln-435, a residue that, together with Ser-412, makes direct polar links with the oligosaccharides at subsite +3 (Ref. 39; nomenclature according to Ref. 40). Furthermore, Asn-401, Asn-403, and Asp-471 protrude at the entrance of the slot and define the polar boundaries that make the cavity accessible from the solvent. Therefore, most of the relevant SoFfase residues are highly flexible. By contrast, most of the corresponding SInv residues are short chain amino acids, such as Ser-412, Ser-414, Ser-415, Thr-379, Thr-380, and Ser-447. Consequently, the SoFfase active site seems more flexible to accommodate long chain substrates compared with the two catalytic pockets of SInv. This feature may therefore illustrate the structural basis for the activity of SInv as an invertase, whereas SoFfase is in fact an inulinase. As observed when comparing the activity of SInv with different oligosaccharides, the highest efficiency was with sucrose as a substrate, followed by the trisaccharides raffinose and 1-kestose, whereas the tetrasaccharide nystose was hydrolyzed at a much lower rate (Table 4). SInv showed no significant activity with inulin as the substrate.
It has been reported that SInv and SoFfase have some degree of transglycosylating activity, giving 6-kestose as the main product. This activity can be enhanced by mutagenesis (23,39,41). In contrast, homologous plant enzymes of the same GH32 family yield mostly 1-kestose. To understand the β(2,6) or β(2,1) nature of the transglycosylation reaction, the binding site on the putative acceptor sucrose and its orientation relative to the fructose unit in the covalent intermediate must be identified. Fig. 6d illustrates a superimposition of SInv and SoFfase catalytic pockets showing the putative position of the products 1-kestose and 6-kestose. The figure suggests that the nucleophile (Asp-22 in SInv numbering) environment, together with the hydrophobic wall of the pocket (Trp-48, Phe-82, and Trp-291), is coincident in all three cavities. Gln-201 is well positioned to make polar links with the fructose unit at subsite +1 in both transfructosylating products, Asn-228 could link to the glucose moiety of 6-kestose, and the glucose moiety of 1-kestose would stack with Trp-48. An essential role is thus assigned to Gln-201 to facilitate the transfructosylation reaction by binding the acceptor sucrose, whereas Asn-228 would crucially determine the product specificity as predicted (23) and also as described for SoFfase (39).
In summary, the architecture of the active site, as determined by the way the enzyme monomers are assembled, explains both substrate specificity for hydrolysis (invertase versus inulinase activity) and transfructosylation product specificity. Thus, the dimerization mode of SInv modulates its hydrolytic activity, precluding the recognition of long chain substrates. This is more apparent for the closed dimers that would be predominant in aggregation states lower than the octamer or in alternative octameric forms composed exclusively of closed dimers (see "Discussion"). However, the transfructosylating mechanism would be the same in both enzymes.

DISCUSSION
The yeast Saccharomyces plays an outstanding role in human civilization as the fermentative agent that produces bread, wine, and beer. Saccharomyces owes its predominant position as a fermentative microorganism to a very successful metabolic strategy. In the sugar-rich ecological niches where it dwells, Saccharomyces performs a highly efficient mobilization of sugars, which can be channeled to the production of an antiseptic substance (ethanol) that prevents the proliferation of competing microorganisms, instead of being used for the production of biomass. The function of a set of enzymes involved in sugar metabolism, including invertase, is critical for the prevalence of Saccharomyces in its natural habitat. Despite the important physiological role of SInv and its historical relevance as a model enzyme in early biochemical studies (2), its crystallographic structure had not yet been reported. Although it is secreted by the yeast in large amounts, native SInv appears as a heavy hyperglycosylated protein of heterogeneous molecular mass, which remains trapped in the cell wall (10). It is therefore a rather unsuitable material for crystallization. The biosynthesis and secretion of extracellular proteins by eukaryotes is a complex process in which transit of the nascent polypeptide through the endoplasmic reticulum is coupled with core glycosylation and protein folding. Hyperglycosylation takes place at a later stage in the Golgi and does not affect SInv folding or catalytic activity (42). Because core glycosylation is often a necessary condition for proper protein folding, secreted eukaryotic proteins cannot be functionally expressed in E. coli. Therefore, the collection of a sufficient amount of protein for crystallization analysis requires either that it is abundantly produced by the organism from which it derives or that it is heterologously expressed in a eukaryotic host.
Examples of structurally determined SInv homologs of the GH32 family are therefore fungal inulinases abundantly produced by their natural host (17,20,22), plant fructosylases heterologously expressed in Pichia pastoris (18,19), and a plant fructosyltransferase abundant in leaves (21). Saccharomyces produces an intracellular non-glycosylated version of the invertase (4). The intracellular version, which is synthesized as a soluble protein, is encoded by the same gene as the secreted form (12). Secondary and tertiary structures of both forms are virtually identical, as revealed by CD spectroscopy analysis (35). These considerations led us to carry out heterologous expression of SInv in E. coli, which yielded suitable protein material for crystallization analysis (25). Analysis of the crystal structure at 3.3 Å resolution that we have reported here showed that SInv folds into the catalytic β-propeller and β-sandwich domains characteristic of GH32 enzymes. A dimer association that shapes the active site has been found, similar to that described in the phylogenetically close β-fructosidase from S. occidentalis (22). In contrast, however, SInv displays a special assembly of dimers into octamers through extension of each subunit β-sandwich domain. The particular geometry of the octamer generates closed and open dimers that are located alternately at the vertices of a rectangle. Analysis of the interfaces and binding energy calculation showed that "closed-type" dimers are more stable. Although the octameric form is predominant in a fresh preparation of the enzyme (Fig. 2a), the weaker association within the "open-type" dimer and the dimer-dimer interface may explain the instability manifested by the octamer upon incubation at higher (47°C) temperatures (Fig. 2b) or at a high centrifugal force.
Consequently, recombinant SInv is predominantly an octamer but may exist as a dimer and other oligomeric forms, as has been reported for both the cytosolic and secreted isoforms (10,11,35,43). Furthermore, as seen from the structure reported here, SInv dimerization plays a determinant role in substrate specificity, preventing binding of extended substrates, which explains its invertase character at the molecular level.
Electron micrographs of both internal and secreted SInv (10, 11) show a similar association pattern of spherical units with different oligomerization states: dimers, tetramers, hexamers, and octamers. Interestingly, secreted SInv octamers appear slightly open to one side, whereas intracellular invertase octamers appear mostly as nearly symmetrical closed rectangles, like the structure presented here. The electron micrographs of secreted invertase octamers resemble strikingly the model illustrated in Fig. 7, in which the protein would be composed of closed dimers linked by intermolecular ␤-sheets.
Although GH32 enzymes are generally highly conserved in the β-propeller and less so in the β-sandwich domain, SInv and SoFfase are very similar also in some regions of the β-sandwich domain, namely strands β8-β10 and region β6-β7. However, strands β1 and β2, which build the intermolecular β-sheet within the octamer, are less conserved. In fact, homology alignments of yeast GH32 sequences show that many of them contain a Pro residue at β1 or β2 that probably precludes intermolecular β-sheet formation (Fig. 8). In SoFfase, an unusually large number of solvent-exposed Lys residues are found in this region, resulting in a positively charged surface that likely interferes with β-sheet dimerization because of electrostatic repulsion effects. Therefore, the association of dimers into octamers seems to be a unique SInv feature.
To conclude, the structure of SInv presented here is an interesting new example of how non-catalytic domains with unknown function play a role in fine-tuning enzymatic function. The contribution of the β-sandwich domain in building the catalytic pocket of GH32 yeast enzymes has been reported previously. However, a role in higher oligomerization leading to new specificity seems unique to SInv. The production of a complex octameric hyperglycosylated enzyme precludes its diffusion outside the periplasmic space. The secretion of invertase that occurs in many microorganisms represents an evolutionary advantage by eliminating the need for a sucrose import mechanism. The unique quality of Saccharomyces is that it keeps the invertase trapped on the cell surface, avoiding its diffusion into the medium, where it would also aid competing organisms. We hope that the resolution of the sophisticated molecular architecture of this enzyme may contribute to the understanding of protein-carbohydrate interactions and to the design of novel, more efficient enzymes for biotechnological purposes.