Crystal structure of Thermotoga maritima alpha-L-fucosidase. Insights into the catalytic mechanism and the molecular basis for fucosidosis.

Fucosylated glycoconjugates are involved in numerous biological events, and alpha-l-fucosidases, the enzymes responsible for their processing, are therefore of crucial importance. Deficiency in alpha-l-fucosidase activity is associated with fucosidosis, a lysosomal storage disorder characterized by rapid neurodegeneration, resulting in severe mental and motor deterioration. To gain insight into alpha-l-fucosidase function at the molecular level, we have determined the crystal structure of Thermotoga maritima alpha-l-fucosidase. This enzyme assembles as a hexamer and displays a two-domain fold, composed of a catalytic (beta/alpha)(8)-like domain and a C-terminal beta-sandwich domain. The structures of an enzyme-product complex and of a covalent glycosyl-enzyme intermediate, coupled with kinetic and mutagenesis studies, allowed us to identify the catalytic nucleophile, Asp(224), and the Brønsted acid/base, Glu(266). Because T. maritima alpha-l-fucosidase occupies a unique evolutionary position, being far more closely related to the mammalian enzymes than to any other prokaryotic homolog, a structural model of the human enzyme was built to document the structural consequences of the genetic mutations associated with fucosidosis.

α-L-Fucosidases catalyze the removal of nonreducing terminal L-fucose residues linked via α-1,2, α-1,3, α-1,4, or α-1,6 bonds to oligosaccharides and their conjugates. Decreased α-L-fucosidase activity is related to a number of pathological conditions such as inflammation, cancer, and cystic fibrosis (1-3). Severe deficiency of α-L-fucosidases causes fucosidosis, an autosomal recessive lysosomal storage disease resulting in the lethal accumulation of fucosylated glycoconjugates in the lysosomes of most tissues, including the peripheral and central nervous systems (4). α-L-Fucosidases are also of considerable interest because fucosidase activity in serum can be used in the diagnosis of patients with early colorectal and hepatocellular cancers (5).
α-L-Fucosidases have received much attention due to the central role of fucosylated glycoconjugates in many biological events. The latter are widely distributed in the animal kingdom, where they can be encountered in a variety of tissues, including liver, brain, and spleen, in human amniotic fluid, and as a discriminative feature of ABO and Lewis blood group antigens (6). Accordingly, fucosylated glycans play a crucial role in many physiological and pathological processes, including immune response (7), signal transduction (8), development (9), and adhesion processes of pathogens (10). Implications of fucose in sperm-egg interaction (11), early embryogenesis (12), and apoptosis (13) have been reported, and alterations in the fucosylation pattern have been observed in a number of diseases, including cancer and diabetes (14,15). Fucose also appears to be an important immune modulator for the interactions between the selectin family of cell adhesion molecules and the sialyl Lewis X antigen of their glycoconjugate counterreceptors, enabling the rolling of leukocytes on the endothelium and subsequent extravasation (16).
α-L-Fucosidases are found exclusively in family GH29 of the classification of glycosidases based on sequence similarities, which reflect the folds, active site architecture, molecular mechanism, and, to a minor extent, substrate specificity (17). α-L-Fucosidases have been shown to follow a double displacement mechanism with net retention of the anomeric configuration, as initially proposed for human liver α-L-fucosidase in 1987 (18) and recently confirmed by biochemical studies on α-L-fucosidases from Thermus sp. Y5 (19), the marine mollusc Pecten maximus (20), Sulfolobus solfataricus (21), and Thermotoga maritima (22).
We have determined the crystal structure of the α-L-fucosidase from the marine hyperthermophilic bacterium T. maritima (TM aFuc). Currently, this enzyme is the closest bacterial relative of mammalian α-L-fucosidases and shares 38% identity with its human counterpart. The true biological role of TM aFuc and its natural substrate are not known. However, this bacterium appears to have evolved a remarkably large repertoire of enzymes dedicated to the breakdown of plant pectin and xyloglucan. These hemicelluloses often carry α-L-fucose side chains, and it is thus tempting to speculate that TM aFuc operates synergistically with other glycosidases to degrade these polymers. Alternatively, it is also conceivable that TM
* This work was supported in part by a European Molecular Biology Organization long-term fellowship (to G. S.) and by the Association Vaincre les Maladies Lysosomales, the Association Française contre les Myopathies, the Protein Engineering Network of Centres of Excellence of Canada, and the Natural Sciences and Engineering Research Council of Canada.
The crystal structures of complexes of TM aFuc with α-L-fucose and with a mechanism-based inhibitor, 2-deoxy-2-fluoro-α-L-fucopyranosyl fluoride (fucosyl fluoride), allowed us to delineate the substrate-binding pocket and the covalent 2-deoxy-2-fluoro-L-fucosyl-enzyme intermediate. Together with kinetic and mutagenesis experiments, these structures reveal for the first time the overall fold of glycoside hydrolase family GH29, allow the identification of the two catalytic residues, and provide a structural template to understand the molecular basis of fucosidosis.

EXPERIMENTAL PROCEDURES
Cloning-The coding sequence of the T. maritima gene (TM0306) was amplified by PCR from genomic DNA, and the PCR product was subcloned into the pDEST17 prokaryotic expression vector (Gateway, Invitrogen) following the manufacturer's instructions. This vector encodes an N-terminal His6 tag fused to the entire coding region of α-L-fucosidase. Sequence analysis revealed a single base difference compared with GenBank™/EBI accession number NC_000853, which resulted in a Y271C mutation.
Expression and Purification-Expression was carried out using Escherichia coli BL21(DE3) cells grown in LB medium at 37°C to A600 = 0.6 and induced by the addition of 0.5 mM isopropyl-1-thio-β-D-galactopyranoside. After incubation at 37°C for 4 h, the cells were sedimented, resuspended in 200 mM NaCl and 20 mM Tris-HCl (pH 8.0), and lysed with a French press. After centrifugation, the supernatant was incubated for 15 min at 60°C, and precipitated E. coli proteins were eliminated by centrifugation. The enzyme was further purified by Ni²⁺ affinity chromatography, and a gel filtration step yielded highly purified enzyme as judged by matrix-assisted laser desorption ionization time-of-flight mass spectrometry analysis. Selenomethionine-substituted enzyme was produced using the same bacterial strain grown in minimal M9 medium and supplemented, before induction, with selenomethionine and amino acids known to inhibit methionine biosynthesis (23).
Site-directed Mutagenesis and Kinetic Assays-The E66A, D224A, and E266A mutants were generated using the QuikChange™ mutagenesis kit (Stratagene) according to the manufacturer's instructions and were verified by DNA sequencing. These mutants, expressed under the conditions described above, were insoluble, and the coding sequences of the mutants, together with that of wild-type TM aFuc as a control, were therefore subcloned from pDEST17 into the expression vector His-pKM596, a derivative of pKM596 (24), resulting in four His-maltose-binding protein-α-L-fucosidase-pKM596 expression constructs. Expression and purification of these uncleaved fusion proteins were carried out as described above for the wild-type TM aFuc construct, and Michaelis-Menten parameters for the hydrolysis of p-nitrophenyl fucoside (pNP-Fuc) and fucosyl fluoride were obtained as described previously (22). kcat/Km values were determined from initial rates at low substrate concentrations through Lineweaver-Burk analysis.
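As an illustration of why initial rates at low substrate concentration give access to kcat/Km, the following minimal Python sketch fits a straight line through the origin to rate data. It relies on the low-substrate limit of the Michaelis-Menten equation, v0 ≈ (kcat/Km)[E]0[S] for [S] << Km. This direct linear fit is a simplification chosen for illustration and is not the Lineweaver-Burk procedure named above; the function name, argument names, and all numerical values are hypothetical and are not data from this study.

```python
def kcat_over_km(substrate_conc, initial_rates, enzyme_conc):
    """Estimate kcat/Km from initial rates measured at [S] << Km.

    In this regime v0 = (kcat/Km) * [E]0 * [S], so the least-squares
    line through the origin has slope (kcat/Km) * [E]0.
    Concentrations are in M and rates in M/s, giving M^-1 s^-1.
    """
    if len(substrate_conc) != len(initial_rates) or not substrate_conc:
        raise ValueError("need matching, non-empty concentration and rate lists")
    if enzyme_conc <= 0:
        raise ValueError("enzyme concentration must be positive")
    # Least-squares slope for a line constrained to pass through the origin.
    sxy = sum(s * v for s, v in zip(substrate_conc, initial_rates))
    sxx = sum(s * s for s in substrate_conc)
    return sxy / sxx / enzyme_conc


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    s = [1e-6, 2e-6, 4e-6]           # substrate concentrations (M)
    v = [5e-11, 1.0e-10, 2.0e-10]    # initial rates (M/s)
    print(kcat_over_km(s, v, enzyme_conc=1e-9))
```

The fit is only meaningful while the rate remains proportional to substrate concentration, which is the situation described for a mutant that shows no saturation.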
Crystallization and Data Collection-Crystals were grown by the hanging-drop vapor diffusion method at 20°C from protein solution at 5.0 mg/ml mixed with 18% (w/v) polyethylene glycol 6000, 100 mM Tris-HCl (pH 8.0), and 5% Jeffamine M-600. Crystals belong to space group H32 and contain two molecules/asymmetric unit. Crystals of the fucose complex and the inhibitor complex were obtained by short soaks in the crystallization solution supplemented with 0.5 M fucose or a small amount of inhibitor powder, respectively. All data sets were collected at 100 K on flash-frozen crystals. Cryosolutions were of the same composition as the crystallization/harvesting solutions with the addition of increasing amounts of Jeffamine M-600 and supplemented with 5% (v/v) glycerol in the case of native, selenomethionylated, and inhibitor complex crystals and with the addition of 0.5 M fucose in the case of the fucose complex. Data were indexed and integrated with DENZO (25), and all further computing was carried out with the CCP4 program suite (26), unless otherwise stated. Data collection statistics are summarized in Table I.
Structure Solution and Refinement-The structure of α-fucosidase was solved by the selenomethionine three-wavelength multiple anomalous dispersion (MAD) method using the program SOLVE (27). The initial MAD phases had mean figures of merit of 0.45 for data to 2.8-Å resolution and of 0.62 after 10 cycles of density modification using DM, implementing solvent flattening and histogram matching (28). Despite the fact that only 6 selenium atoms of 12 could be located, with 6 selenomethionine residues being located either at the N terminus or in disordered loop regions, the resulting experimental map was of good quality and could be further improved by NCS averaging and phase combination techniques. Both protein chains were manually traced and fitted with the program TURBO-FRODO (29). Refinement was carried out against data of the native crystal with CNS (30) and REFMAC (31) using the maximum likelihood approach and incorporating bulk solvent corrections, anisotropic Fo versus Fc scaling, NCS restraints, and TLS refinement. Weak data in the outermost resolution shell were included in the refinement, as they considerably improved the sigmaA-weighted electron density maps. A random 7% (2642) of reflections were set aside for cross-validation purposes. BUSTER (32) was used to determine ill-defined regions of the model. Automated solvent building was performed with ARP/wARP (33). The two complex structures were subjected to rigid body refinement with REFMAC (31), and further refinement was carried out as described above. Ligands were included only toward the end of refinement, and no torsion angle restraints were applied to avoid imposition of a given pyranose ring conformation.
The two molecules present in the asymmetric unit encompass Arg7-Pro46, Asp56-His268, and Asp274-Val447. Six N-terminal residues as well as the N-terminal His6 tag and the linker used for cloning and purification, 2 C-terminal residues, and surface loops Thr47-Met55, Val269-Gly273, and Gly297-His300 in chain B could not be built due to either diffuse or totally absent electron density. These surface regions, devoid of crystal-packing interactions, were also missing in the structures of the L-fucose complex and the glycosyl-enzyme intermediate. Two solvent-exposed Cys residues, Cys364 and Cys365, form a rare disulfide bridge between consecutive residues. The linked L-fucosyl moiety and the product refined to occupancies of 0.4 and 0.6, respectively, in one molecule of the asymmetric unit and 0.6 and 0.4, respectively, in the second molecule. The stereochemistry of the final models was verified with the program PROCHECK (34). Refinement and structure quality statistics are listed in Table I. The global B-factors given in the Protein Data Bank files were calculated from TLS tensors and the residual B-factors after refinement with REFMAC (31).
Model Building-The structure of the catalytic domain of human α-L-fucosidase encompassing Pro6-Trp347 was modeled with the protein homology modeling SWISS-MODEL server using the crystal structure of apo-TM aFuc as template. Given the difficulties in producing an accurate sequence alignment of some particular regions of members of the GH29 family, the resulting model was adjusted manually using the graphics program TURBO-FRODO (29).

RESULTS AND DISCUSSION
Overall Fold-The crystal structure of TM aFuc was determined at 2.5-Å resolution by the MAD method, taking advantage of the anomalous scattering of selenium, and the model was refined against a native data set at 2.4-Å resolution. Complexes with L-fucose, the reaction product, and fucosyl fluoride, a mechanism-based inhibitor, were obtained by soaking experiments and were refined to 2.8- and 2.25-Å resolution, respectively. Clear unbiased electron density could be observed for both L-fucose and the inhibitor prior to their incorporation in the refinement (Fig. 1, A and B).
TM aFuc is a two-domain protein with overall dimensions of 75 × 65 × 55 Å (Fig. 2A). The N-terminal domain (residues 7-359) adopts a (β/α)8-barrel-like fold, with eight parallel β-strands packed around a central axis and surrounded by six α-helices. A depression at the C-terminal ends of the β-strands hosts the active site. The helix corresponding to helix 5 in the classical triose-phosphate isomerase barrel fold is missing in TM aFuc and is replaced by a small loop region, whereas the structural equivalent of helix 6 corresponds to the disordered region Val269-Gly273, which could not be modeled. Despite being located close to the active site, this region does not become structured upon substrate/product binding, as previously observed in related enzyme systems (35,36).
The N-terminal domains of the three TM aFuc crystal structures are very similar, with a root mean square deviation for 330 C-α positions in the range of 0.3 Å. Slightly greater differences, ~0.45 Å for 85 C-α positions, can be observed for the C-terminal domains. Significant differences can also be detected at the level of the relative arrangement of the two domains, indicating that the domain junction is, in fact, a flexible hinge.
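For readers less familiar with the root mean square deviation used in such comparisons, the short Python sketch below computes it for two lists of paired C-α coordinates. It assumes that the two coordinate sets have already been optimally superposed and that positions are paired one to one; it does not perform the superposition itself, and it is not the program used in this work. The function name and the coordinates in the example are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (in the input units, e.g. Å) between two
    equally long lists of (x, y, z) tuples that are already superposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    # Sum of squared distances over all paired atoms.
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Hypothetical coordinates: every atom displaced by 0.3 Å along x.
    a = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
    b = [(0.3, 0.0, 0.0), (1.3, 1.0, 1.0)]
    print(rmsd(a, b))
```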
DALI searches (37) performed with the isolated TM aFuc catalytic domain identified several relatives, the top ranked hit being β-N-acetylhexosaminidase (Protein Data Bank code 1hp4) (223 equivalent C-α positions with a root mean square deviation of 3.7 Å). In contrast, the C-terminal domain shares only limited similarity with other proteins, such as the N-terminal domain of Serratia marcescens chitobiase (Protein Data Bank code 1qba) (70 equivalent C-α positions with a root mean square deviation of 2.9 Å) and, with lower scores, with several carbohydrate-binding modules. Whereas the DALI search identified structurally similar modules encountered in a number of glycosidases from various families (GH13, GH15, GH17, GH20, GH27, and GH38), a BLAST search conducted with the C-terminal module alone did not produce any statistically significant hit, not even within GH29 family enzymes.
Whether this module has a carbohydrate-binding function in TM aFuc remains unclear, but the lack of significant sequence similarity to other GH29 family enzymes suggests that the C-terminal domain of TM aFuc might have evolved another function. The assignment of a specific function to this domain is rendered particularly difficult by the fact that a number of proteins of diverse function, such as human leukotriene A4 hydrolase (Protein Data Bank code 1hs6), a proteolytic fragment of peptidylprolyl isomerase TLP2 (code 1tul), or the anthrax protective antigen (code 1acc), also fall in the same range of structural similarity.
Oligomeric State-In the crystal, TM aFuc assembles into a compact hexameric arrangement, with overall dimensions of 110 × 120 × 140 Å (Fig. 2B). This assembly can be divided into two trimers that stack on each other and are rotated by ~30°. Upon hexamer formation, the buried surface area (1.6-Å probe radius) per TM aFuc monomer lies between 2380 and 2520 Å², a value in the highest range compared with other multimeric edifices (38).
The interactions stabilizing the trimers are established exclusively between the catalytic domains, notably between the N-terminal residues, α-helices A3 and A4, and the loop region preceding α-helix A3. The C-terminal domains are projected toward the outer limits of the trimers. The interactions between two trimers involve the N-terminal α-helix, the surface loop regions following β-strands B5 and B6, and a very minor contact area in the C-terminal domain. No covalent bonds are formed between the monomers, even though a cysteine residue, Cys380, is located very close to its counterpart in a neighboring molecule; these two residues are too far apart to form a disulfide bridge. Size exclusion chromatography at low and high ionic strength (100 and 500 mM NaCl, respectively) gave essentially identical elution profiles corresponding to Mr values of ~300,000, indicating that the hexameric form also prevails in solution.
The significance of the hexameric state of TM aFuc is unclear at present. On the one hand, oligomerization is a frequent stabilizing feature of proteins from hyperthermophilic organisms such as T. maritima. On the other hand, it cannot be excluded that oligomerization may have a functional significance in TM aFuc, since most α-L-fucosidases so far characterized, whether from mesophilic or thermophilic organisms, are oligomeric (Footnote 3). However, the active sites of the single monomers constituting the TM aFuc multimer are well separated, by ~40 Å, and no interactions between bound substrate and neighboring subunits are detectable. The stability of the hexameric assembly is not disrupted by the introduction of maltose-binding protein at the N terminus. The resulting fusion protein also displays an enzymatic activity comparable with that of native TM aFuc. This suggests that the flexible linker is sufficiently long to allow the accommodation of maltose-binding protein on the outside of the core of the hexameric assembly.
Substrate-binding Site-The crystal structures of TM aFuc in complex with L-fucose, the reaction product, allowed us to identify the active site, located in a small pocket formed by the C-terminal ends of the central β-strands of the (β/α)8-domain. The size of the pocket allows a single L-fucose residue to bind, consistent with the exoglycosidase action of α-L-fucosidases. In the product complex, L-fucose is present in the standard ¹C₄ chair conformation, and its 1-hydroxyl group adopts a β-anomeric configuration, presumably as a result of mutarotation in solution. The sugar is tightly enveloped by the enzyme, and each functional group makes direct contacts with one or more protein side chains, emphasizing a specific substrate recognition pattern for L-fucose (Fig. 1C). The 2-hydroxyl is hydrogen-bonded to the invariant His129 and Trp67 residues. The latter is, in turn, involved in a bidentate hydrogen-bonding network with O-3, together with the conserved side chain of Glu66. The conserved His128 residue stabilizes the axial 4-hydroxyl group, together with the invariant His34 residue, whereas the 1-hydroxyl group is hydrogen-bonded to Asp224. The exocyclic C-6 methyl group is enclosed within a hydrophobic pocket formed by the side chains of the largely conserved residues Phe32, Tyr171, Trp222, and Phe290. No structured water molecules could be observed in the vicinity of the ligand.
Footnote 3: Available at www.brenda.uni-koeln.de.
No major structural rearrangements of the enzyme occur upon ligand binding. The most striking difference can be observed for Phe290, which rotates ~25° around χ2, and for Tyr64, which flips its side chain by 90° around χ2. This latter movement brings the aromatic ring close to the hydrophobic face of the sugar ring; but instead of establishing a classical "stacking" interaction, often observed in sugar-binding proteins, the plane of the ring is placed perpendicular to the hydrophobic face of the sugar, favoring only very weak van der Waals interactions at a distance of ~4.5 Å. This situation is often observed at the −1 subsite of glycosidases (subsite nomenclature of Ref. 39), which require conformational flexibility of the sugar ring during catalysis (40).
The tight interactions between L-fucose and the enzyme leave barely any space for potential modifications of, or substituents upon, the sugar hydroxyl groups, with the exception of the 2-hydroxyl, which is located near the opening of the binding pocket. This hydroxyl group could accommodate a small substituent such as a methyl group, a modification commonly encountered in type II rhamnogalacturonan (41). The tight substrate-binding pocket of TM aFuc opens up into a vast groove, and apparently, no steric restraints are exercised upon aglycone binding. This is consistent with the biological role of α-L-fucosidases of both bacterial and eukaryotic origin because their substrates, either plant cell wall components or animal glycoconjugates, are not linear homopolymers, but mostly branched glycans of very diverse composition. However, significant differences in linkage specificities (e.g. α-1,2, α-1,3, α-1,4, or α-1,6 bonds) have been documented for several α-L-fucosidases (1,42,43), and studies of complexes of oligosaccharides with inactive mutants will be necessary to shed light on the structural basis of fine substrate/linkage specificity.
FIG. 2. Overall view of TM aFuc. A, stereo ribbon diagram colored in a gradient along the TM aFuc amino acid sequence, from the N terminus (blue) to the C terminus (red). Secondary structure elements corresponding to the classical (β/α)8-fold are indicated. The catalytic nucleophile Asp224 and the acid/base Glu266 are shown in ball-and-stick presentation, with carbon atoms colored gray and oxygen atoms colored red. B, the hexameric assembly of TM aFuc, as seen from the top (left) and rotated by 90° (right). For clarity, the trimers composing the hexamer are colored magenta and white, and one trimer is highlighted by a transparent surface representation. Bound fucose molecules are shown in green.
Catalytic Mechanism-Retaining glycosidases operate through a double displacement mechanism that relies on two strategically located carboxyl groups 5.5 Å apart, either Asp or Glu. In the first step of catalysis, one of the carboxyl groups acts as a general acid, assisting aglycone departure, whereas the second carboxyl group attacks the anomeric carbon, generating a covalent glycosyl-enzyme intermediate. In the second step, the first carboxyl group acts as a base, activating an incoming water molecule (or another acceptor group) for nucleophilic attack on the anomeric carbon of the glycosyl-enzyme intermediate, resulting in either hydrolysis or transglycosylation. The two steps proceed through transition states with substantial oxocarbenium ion character and give, as the net result of two inversions of configuration, a final product with an anomeric stereochemistry identical to that of the starting substrate.
Three carboxyl side chains are present in the active site pocket of TM aFuc, Glu66, Asp224, and Glu266, of which only Asp224 is consistently found invariant in a multiple sequence alignment of all sequences available in glycosidase family GH29 (data not shown). Asp224 was recently identified as the catalytic nucleophile of TM aFuc by trapping of a covalent glycosyl-enzyme intermediate using fucosyl fluoride, followed by proteolytic digestion and peptide sequencing by tandem mass spectrometry (22). Here, we report the crystal structure of this covalent glycosyl-enzyme intermediate obtained by soaking TM aFuc crystals in a solution containing fucosyl fluoride. The crystal structure represents the average structure of two distinct species: the enzyme with bound hydrolysis product, 2-deoxy-2-fluoro-L-fucose, and the trapped covalent glycosyl-enzyme intermediate. A similar coexistence of a glycosyl-enzyme intermediate and the product complex has previously been observed in the high resolution structures of other glycosyl-enzyme intermediates (44,45). Unbiased electron density maps before incorporation of the saccharide in refinement showed clear evidence for the covalent linkage to Asp224 (Fig. 1B), and only modeling of the double species was consistent with the residual electron density maps.
Within this structure, the hydrolyzed 2-deoxy-2-fluoro-L-fucose is present in the standard ¹C₄ chair conformation and in the β-anomeric configuration, as also observed in the L-fucose complex. The covalent 2-deoxy-2-fluoro-L-fucosyl species is β-linked to Asp224 and adopts a ³S₁ conformation, resulting in an axial orientation of the glycosidic bond. This orientation would allow an in-line nucleophilic attack on the anomeric carbon of the glycosyl-enzyme by a water molecule (hydrolysis) or by an acceptor (transglycosylation). In retaining α-D-glycosidases, the distortion of the β-glycosyl-enzyme intermediate prior to the second step of catalysis is analogous to, and probably as important as, the distortion observed in the first step of catalysis by retaining β-D-glycosidases. Indeed, the recently determined structures of two trapped intermediates on an α-D-mannosidase reveal a somewhat similar distortion of the intermediate (in that case to a ¹S₅ conformation), also resulting in an axial orientation of the glycosidic bond (44). As has been observed in covalent intermediates with other retaining α-glycosidases, the carbonyl oxygen of the catalytic nucleophile is seen to interact with O-5 of the sugar ring (3.2 Å), possibly providing electrostatic stabilization at the transition state or ground state destabilization (44,46).
The oxocarbenium ion-like transition states in the hydrolysis reaction are stabilized by extensive donation of lone pair electrons from the endocyclic oxygen, resulting in a partial double bond character for the O-5-C-1 bond. This imposes co-planarity for C-5, O-5, C-1, and C-2 atoms, a geometry found only in pyranose conformations such as the ⁴H₃ and ³H₄ half-chair and the classical ²,⁵B and B₂,₅ boat conformations (47). In the pseudo-rotational itinerary of pyranoside ring interconversion, the ¹C₄ chair conformation observed in the product complex and the ³S₁ conformation of the covalent glycosyl-enzyme intermediate are found on either side of the ³H₄ half-chair conformation, which makes the latter the likely candidate for the transition state conformation.
Of the two remaining carboxylic acid residues in the active site, Glu266 appears best positioned to act as the catalytic acid/base. This residue is positioned 5.5 Å from Asp224, a distance typical for the separation of nucleophile and acid/base residues in retaining glycosidases (48). Although Glu66 is found 7.5 Å distant from Asp224, making it an unlikely candidate for the function of acid/base, such separation is not unknown for retaining glycosidases and therefore cannot be discounted (49). In the structure of the L-fucose complex, Glu266 is located 3.75 Å from the equatorial O-1 of the β-anomer. In the α-anomer, this distance would be ~2.8 Å. Glu66 makes a hydrogen bond to the O-3 hydroxyl group of L-fucose and is ~7.5 Å from the glycosidic oxygen.
Kinetic Analyses of the E66A and E266A Mutants-Because of the limited resolution of the structure of the L-fucose complex and because neither Glu266 nor Glu66 appears to be invariant in multiple sequence alignments of glycosidase family GH29, a mutational study was performed to ascertain the identity of the acid/base. Alanine mutants of both residues were prepared as maltose-binding protein fusion proteins, and their ability to catalyze the hydrolysis of pNP-Fuc and fucosyl fluoride was investigated. The E66A mutant enzyme was found to retain the ability to catalyze the hydrolysis of pNP-Fuc and fucosyl fluoride substrates with only modest drops in kcat of 10- and 5-fold, respectively. However, the Km was observed to increase dramatically (Table II). These data suggest a key role for Glu66 in substrate binding, but demonstrate it to be a non-essential residue for catalytic function. In contrast, replacement of Glu266 with alanine produced an enzyme with very low hydrolytic activity when pNP-Fuc was used as a substrate (Table II). No saturation of substrate was observed in this case, and the kcat/Km was 65,000-fold lower than that of the wild-type enzyme. In sharp contrast, however, this mutant was able to catalyze the hydrolysis of fucosyl fluoride with only a 60-fold decrease in kcat, with the Km being slightly lower than with the wild-type enzyme. Such a decrease in Km is to be expected if the glycosyl-enzyme intermediate accumulates, as is likely in this case.
That a yet lower Km value was not observed probably reflects the fact that the Km for the wild-type enzyme is already low, as deglycosylation is almost certainly rate-limiting. Indeed, it is quite probable that the Km values may reflect, in part, a transglycosylation process, thus partially reflecting binding of the acceptor sugar rather than the donor. Notwithstanding these considerations, the observation that this mutant is able to catalyze the hydrolysis of fucosyl fluoride, a substrate that does not require acid catalysis for aglycone departure, but is essentially inactive against pNP-Fuc, provides very strong evidence that this residue is the acid/base catalyst.
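The link between accumulation of the intermediate and a lowered Km can be made explicit with the textbook steady-state expressions for a two-step mechanism. The following is a sketch under stated assumptions rather than an analysis performed in this work: substrate binding is taken to be a rapid equilibrium with dissociation constant K_s, glycosylation and deglycosylation are described by first-order rate constants k_2 and k_3, and transglycosylation is ignored. The symbols K_s, k_2, and k_3 are introduced here for illustration only.

```latex
% Two-step (double displacement) scheme, assuming rapid-equilibrium binding:
%   E + S  <=>  E.S  --k_2-->  E-Fuc (+ leaving group)  --k_3-->  E + fucose
\begin{align}
  k_{\mathrm{cat}} &= \frac{k_2 k_3}{k_2 + k_3}, &
  K_m &= K_s \,\frac{k_3}{k_2 + k_3}, &
  \frac{k_{\mathrm{cat}}}{K_m} &= \frac{k_2}{K_s}.
\end{align}
% Limit in which deglycosylation is rate-limiting (k_3 << k_2):
\begin{align}
  k_{\mathrm{cat}} &\approx k_3, &
  K_m &\approx K_s \,\frac{k_3}{k_2} \ll K_s .
\end{align}
```

Under these assumptions, slowing deglycosylation (smaller k_3) lowers both kcat and Km while leaving kcat/Km unchanged, whereas slowing glycosylation (smaller k_2) lowers kcat/Km directly. This is the qualitative pattern expected for loss of the acid/base with a substrate that does not need acid catalysis and with one that does, respectively.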
For further confirmation of this assignment, chemical rescue experiments were performed with this mutant using pNP-Fuc as the substrate. It is well documented that small anions can bind in the cavities formed upon substitution of the catalytic carboxylate residues of glycosidases and in doing so promote catalytic ability by acting in their place (50). Formate was found to produce an 8-fold rate enhancement at 3 M, with the effect being linear over the range tested (0.7-3 M) (data not shown). The rescue experiments with formate were performed at pH 5, the pH optimum for this enzyme. It is therefore possible that formate, in its protonated formic acid form, can assist glycosylation by providing acid catalysis for aglycone departure, in addition to facilitating deglycosylation. This ... apparent Km of 3 mM (data not shown). This scenario mirrors that observed with the D224A mutant reported previously, a 10-fold rate enhancement and a Km of 4 mM for azide being observed in that case (22). Another explanation for this observation could be a change in the rate-determining step, with azide accelerating the deglycosylation step sufficiently for glycosylation to become at least partially rate-limiting. Similar results were also obtained when fucosyl fluoride was used as the substrate, although the effects were more modest in this case. Formate gave rise to a maximum rate enhancement of 1.5-fold at 3 M formate, with the effect again being linear across the range investigated. Azide also gave a 1.5-fold rate increase at 300 mM azide, with no further increase being observed at higher concentrations; lower concentrations were not investigated in this case.
Comparison with Family GH27-Unexpectedly, the crystal structure of TM aFuc revealed a structural similarity, beyond the sole conservation of the overall fold, with enzymes from family GH27. The three-dimensional structures for two members of this family are known, viz. chicken ␣-N-acetylgalactosaminidase (Protein Data Bank code 1ktb) (52) and rice ␣-galactosidase (code 1uas) (53). An overlap of TM aFuc and chicken ␣-N-acetylgalactosaminidase in complex with ␣-N-acetylgalactosamine (Protein Data Bank code 1ktc) shows that several elements of the active site architecture are conserved, with the nucleophile and the acid/base residues near the C-terminal ends of ␤-strands 4 and 6, respectively, and a nice overlap of the bound ligands (Fig. 3). As a consequence of their chemical structures, N-acetyl-D-␣-galactopyranose is "reversed" with respect to ␣-L-fucopyranose, with the O-5 position of one occupying the C-2 position of the other. However, the axial/equatorial distribution of the functional groups remains similar except for C-2 and O-5. Other conserved features in the active site include a structurally conserved tyrosine, TM aFuc Tyr 171 and ␣-Nacetylgalactosaminidase Tyr 103 , lining the bottom of the active site pocket and donating a hydrogen bond to the nucleophile. In   addition, TM aFuc Arg 254 , corresponding to ␣-N-acetylgalactosaminidase Arg 197 , is involved in a hydrogen bond to the general acid/base. The common overall fold, the conserved domain organization, and the similarities of the active site architecture suggest a common evolutionary origin for the two enzyme families.
Human α-L-Fucosidase-A phylogenetic tree built from the sequence alignment of all known members of family GH29 (Fig. 4) clearly indicates that, unlike the other prokaryotic α-L-fucosidases, TM aFuc is closely related to α-L-fucosidases from the animal kingdom, with sequence identities ranging from 34 to 38%. This high level of sequence identity allows for reliable modeling of the human enzyme based on the crystal structure of TM aFuc. After extensive energy minimization, the derived model of the catalytic domain of human α-L-fucosidase is highly similar to that of TM aFuc and thus reflects the high level of sequence and structural conservation between these two enzymes. The shape of the catalytic pocket is conserved between TM aFuc and the human enzyme, with 11 of the 13 side chains in contact with fucose conserved both in sequence and in position. However, drastic differences are found at the periphery of the catalytic site, where a 20-residue deletion within the surface loop following β-strand 1 (TM aFuc residues 44-63) gives wider access to the catalytic site of human aFuc (Fig. 5). The sequence similarity between TM and human aFuc is comparatively lower in the region corresponding to the aglycone-binding site, making the homology model uncertain in this area. This variable region may reflect differences in substrate specificity between the two enzymes.
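The sequence identities quoted above depend on how identity is counted over an alignment. As a minimal sketch, the following computes percent identity over the columns in which both sequences have a residue. The function name and the toy sequences are hypothetical, the denominator convention (gap-free aligned columns) is an assumption, and this is not necessarily the convention used to obtain the 34-38% values.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two rows of a pairwise or multiple alignment.

    Columns containing a gap in either sequence are skipped (an assumed
    convention; other definitions divide by the shorter sequence length
    or by the full alignment length).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # ignore gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned fragments, not real fucosidase sequences.
    print(percent_identity("ACDE-G", "ACDQFG"))
```

Because gapped columns are excluded, a large insertion or deletion such as the 20-residue surface loop difference described above would not lower the identity value under this convention, although it would under conventions that divide by the full alignment length.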
Mapping Gene Mutations: Implications for Fucosidosis-The clinical features associated with fucosidosis are divided into two subtypes (types 1 and 2). Whereas type 1 is the most severe infantile form, with a general deterioration of neurological function at 1-2 years of age, type 2 represents a milder form, and patients may survive into the second or third decade of life. The cause of this clinical variability is not yet understood, but it appears to involve other genetic (or nongenetic) factors rather than the nature of the fucosidosis mutations themselves (54).
The currently known genetic defects causing fucosidosis include premature stop codons, frameshifts, and alterations of splicing in the human FUCA1 gene (Table III). Among the ~100 patients affected by fucosidosis worldwide, the most frequent mutation is a C-to-T transition, resulting in the generation of an "in-frame" stop codon at position 422 (human aFuc numbering). The Q281R mutation is responsible for the polymorphism detected by isoelectric focusing, and L405R leads to a mild clinical phenotype. In contrast, amino acid substitutions G60D, S63L, and N329Y result in a severe form of the disease. In each of these cases, α-L-fucosidase activity is nearly absent, and cross-reacting immunological material is only 6% of the normal level.
Sequence alignment shows that the corresponding residues in eukaryotic α-L-fucosidase homologs are invariant (Gly 60, Ser 63, and Asn 329) or conserved (Leu 405). The fucosidosis mutations are distributed mostly in the catalytic domain, with only one (L405R) located in the C-terminal β-sandwich domain. None of these mutations lies in the vicinity of the active site, and they do not disturb the oligomeric assembly of α-L-fucosidase. Instead, the homology model of human α-L-fucosidase clearly shows that Gly 60 and Leu 405 are located at structurally important sites, buried in a completely hydrophobic environment within the protein core. Therefore, substitutions G60D and L405R are likely to interfere with proper folding of the mutant proteins. Likewise, Ser 63 and Asn 329, located in a more hydrophilic environment in the interior of the protein, appear to be structurally important residues, and side chain substitution at these positions may result in misfolded proteins.
In contrast, most of the gene mutations leading to a premature end of translation occur near the linker region connecting the catalytic domain to the C-terminal β-sandwich domain. Presumably, a truncated polypeptide chain resulting from such mutations would not fold properly into a single domain structure due to the exposure of too many hydrophobic residues at the domain interface. These mutations thus highlight a potential functional role of the C-terminal β-sandwich domain, which is the protection of a hydrophobic patch on the surface of the catalytic domain. Carrying an entire domain adduct through the evolutionary process was presumably less demanding than the mutation of several surface residues.
In conclusion, this study provides the first structural template for an α-L-fucosidase. The overall fold, domain organization, and catalytic machinery of TM aFuc are reminiscent of glycosidases from family GH27, suggesting a common evolutionary origin for the two enzyme families. The structure of an enzyme-product complex enabled us to locate the catalytic pocket and explains the stereochemical basis for enzyme-substrate interactions. The crystal structure of a covalent fucosyl-enzyme intermediate, in conjunction with kinetic and mutagenesis studies, allowed us to unambiguously identify the general acid/base Glu 266, an assignment that could not be derived solely from sequence alignments and thus reflects a unique feature of the GH29 family. Construction of a model of the human enzyme based on the high sequence identity between TM aFuc and mammalian α-L-fucosidases allowed us to map the gene mutations responsible for fucosidosis. The mutations are distributed throughout the protein core, suggesting that the disease is caused mainly by protein folding defects.