Identification of an archaeal α-L-fucosidase encoded by an interrupted gene. Production of a functional enzyme by mutations mimicking programmed −1 frameshifting. Vol. 278 (2003) 14622–14631

The analysis of the complete genome of the thermoacidophilic Archaeon Sulfolobus solfataricus revealed two open reading frames (ORFs), named SSO11867 and SSO3060, interrupted by a −1 frameshift and encoding the N- and the C-terminal fragments, respectively, of an α-L-fucosidase. We report here that these ORFs are actively transcribed in vivo, and we confirm the presence of the −1 frameshift between them at the cDNA level, explaining why we could not find α-fucosidase activity in S. solfataricus extracts. Detailed analysis of the region of overlap between the two ORFs revealed the presence of the consensus sequence for a programmed −1 frameshifting. Two specific mutations, mimicking this regulative frameshifting event, allow the expression, in Escherichia coli, of a fully active thermophilic and thermostable α-L-fucosidase (EC 3.2.1.51) with micromolar substrate specificity and showing transfucosylating activity. The analysis of the fucosylated products of this enzyme allows, for the first time, the assignment of a retaining reaction mechanism to family 29 of glycosyl hydrolases. The presence of an α-fucosidase putatively regulated by programmed −1 frameshifting is intriguing both with respect to the regulation of gene expression and, in the post-genomic era, for the definition of gene function in Archaea.

It has long been known that carbohydrates can serve as structural components of natural products, as energy sources, or, more interestingly, as key elements in various molecular recognition processes. In this regard, α-L-fucose is an important constituent of the carbohydrate chains of glycoconjugates involved in a variety of biological events as growth regulators and receptors in signal transduction, cell-cell interactions, and antigenic response (1).
In plants, α-L-fucosylated oligosaccharides derived from xyloglucan, a plant cell wall component that controls cell expansion, have been shown to regulate auxin- and acid pH-induced growth (2). In mammals, oligosaccharides containing fucose have been found, for instance, in human milk and in blood group substances (3), and they are reported to play important roles in fertilization (4) and in adhesion processes of viruses, bacteria, and other parasites (5). Changes in fucosylation patterns have been observed in several physiological events including pregnancy (6), programmed cell death of different cell types (7), and in a variety of pathological events including diabetes (8) and colon and liver carcinomas (9, 10). In addition, the determination of α-fucosidase activity can be used to predict the development of colorectal, ovarian, and hepatocellular carcinomas (11-13), whereas the deficiency in this enzyme causes fucosidosis, a well known lysosomal storage disorder (14). The central role of fucose derivatives in biological events explains the interest in α-L-fucosidase and fucosyltransferase activities.
Family 29 of the glycosyl hydrolase classification (GH29) groups α-fucosidases (EC 3.2.1.51) from a variety of sources, including humans and several pathogenic bacteria (15). No structural data are available about this class of enzymes, and the residues involved in catalysis are still unknown. Recently, it has been shown (16) that the α-fucosidase from Thermus sp. Y5 performs the hydrolytic reaction with retention of the anomeric configuration. The analysis of the genome of the hyperthermophilic Archaeon Sulfolobus solfataricus (17) revealed the presence of two ORFs, annotated as SSO11867 and SSO3060, separated by a −1 frameshift, that are homologous to the N- and the C-terminal fragments, respectively, of full-length bacterial and eukaryal GH29 fucosidases.
In the genome of this Archaeon several interrupted genes, conserved in distantly related Archaea, have been identified, arguing that these interruptions were not sequencing mistakes. Rather, since no multiple interruptions could be found, a selection toward a single conserved shift occurs, suggesting a conserved translational regulation mechanism. The maintenance of a correct reading frame is fundamental to the integrity of the translation process; nevertheless, an increasing number of cases have been described in which localized deviations from the standard translational rules are used to regulate the correct expression of a minority of genes (18). These events, named recoding, are used to increase the diversity in gene expression or for its regulation, and they include programmed ribosome frameshifting to a different reading frame, ribosome hopping over nucleotides, and reading of stop codons as sense codons (readthrough). Among recoding events, programmed −1 frameshifts are by far the most prevalent (19); they have been best characterized in RNA viruses, but the basic molecular mechanisms governing these events are almost identical from yeast to humans (20). In Archaea, documented recoding events are limited to readthrough (21); no evidence of programmed −1 frameshifting has been reported in this domain so far.
To test the hypothesis that the SSO11867 and SSO3060 ORFs could express a functional α-fucosidase, we have cloned and expressed these genes in Escherichia coli. Two specific mutations, designed on the basis of the programmed −1 frameshifting mechanism, allow the expression of a full-length thermophilic and thermostable α-L-fucosidase, with micromolar substrate specificity, which promotes transfucosylation reactions by following a retaining reaction mechanism. This is the first evidence of an α-fucosidase from Archaea, and our data unequivocally demonstrate that GH29 enzymes follow a retaining reaction mechanism. Furthermore, the data presented give support to the hypothesis that translational recoding could be used to regulate gene expression in Archaea.
Substrates-All commercially available substrates were purchased from Sigma.
The standard program used was as follows: hot start 5 min at 95°C, 2 min at 48°C, and 4 min at 72°C; 10 cycles at 95°C for 45 s, 48°C for 1 min, and 72°C for 4 min; 20 cycles at 95°C for 45 s, 58°C for 1 min, and 72°C for 4 min; final extension at 72°C for 10 min. The PCR products were directly sequenced by using an automatic sequencer. To prepare the total RNA, S. solfataricus P2 strain cells were grown to an A600 of 0.6 (mid-exponential phase) in the indicated culture medium; cells were lysed by three cycles of freeze-thawing (2 min at −70°C; 2 min at 37°C), and total RNA was extracted with the RNeasy Kit (Qiagen, Germany). Contaminating DNA was eliminated by digestion with RNase-free DNase I (Promega). Reverse transcriptase (RT)-PCR was performed by using the Titan One Tube RT-PCR system (Roche Molecular Biochemicals) and the same oligonucleotides shown above. The PCR program used was as follows: cDNA synthesis and pre-denaturation for 31 min at 50°C and 2 min at 94°C; amplification by 40 cycles of 15 s at 94°C, 30 s at 45°C, and 2 min at 72°C; final extension of 10 min at 72°C. The cDNA products obtained were directly sequenced as described above, with no further purification steps.
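As a rough cross-check of the standard PCR program described above, its steps can be written down as data and the programmed hold time added up. The sketch below is only illustrative: the variable and function names are hypothetical, and ramping time between temperatures, which depends on the thermal cycler, is ignored.

```python
# Standard PCR program transcribed from the text as (temperature in °C, seconds).
HOT_START = [(95, 300), (48, 120), (72, 240)]
STAGE_1 = {"cycles": 10, "steps": [(95, 45), (48, 60), (72, 240)]}
STAGE_2 = {"cycles": 20, "steps": [(95, 45), (58, 60), (72, 240)]}
FINAL_EXTENSION = [(72, 600)]


def hold_seconds(steps):
    """Sum the hold times of a list of (temperature, seconds) steps."""
    return sum(seconds for _, seconds in steps)


def total_minutes():
    """Total programmed hold time in minutes (ramping between steps is ignored)."""
    total = hold_seconds(HOT_START) + hold_seconds(FINAL_EXTENSION)
    for stage in (STAGE_1, STAGE_2):
        total += stage["cycles"] * hold_seconds(stage["steps"])
    return total / 60


print(f"Programmed hold time: {total_minutes():.1f} min")
```

Under these assumptions the program holds for a little over three hours, most of it spent in the 4-min extension steps.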
α-Fucosidase activity was sought in extracts of S. solfataricus grown in different culture media (minimal salts medium supplemented with yeast extract (0.1%) or casamino acids, glucose, fucose, or sucrose (each at 0.1%), and different combinations of these). Enzymatic assays in S. solfataricus extracts were performed by using up to about 0.65 mg of crude extract in standard conditions (see below for the standard enzymatic assay) on p-nitrophenyl-α-L-fucopyranoside (pNp-Fuc) at 75°C for up to 5 h; under these conditions the hydrolysis of the substrate was identical to that of the blank mixture without protein.
Plasmid Preparation-The SSO3060 ORF, encoding the C-terminal fragment of the α-fucosidase, was cloned by amplification of S. solfataricus, strain P2, chromosomal DNA via PCR by using the standard program (see above) and the following synthetic oligonucleotides (Genenco, Florence, Italy): α-fuc-1, 5′-GGGAATTCATATGTTCACTGGAGAGAATTGGGAACCGTA-3′; α-fuc-2, 5′-CGCGGATCCCTATCTATAATCTAGGATAACCC-3′, which introduce NdeI and BamHI sites at the 5′ end, just before the first ATG, and at the 3′ end of the ORF, respectively. The resulting DNA fragment was cloned in the pET29a plasmid (Novagen), obtaining the vector pET-3060, in which the SSO3060 ORF is under the control of the isopropyl-1-thio-β-D-galactopyranoside-inducible T7 RNA polymerase promoter that drives high expression levels in bacterial hosts. The ORF obtained after amplification was verified by DNA sequencing.
The DNA fragment containing SSO11867 plus SSO3060 was cloned by PCR by using the standard program, the synthetic oligonucleotide α-fuc-3 described above, and α-fuc-4, 5′-GAGGAAGATCTCTATCTATAATCTAGGATAACCC-3′. Both oligonucleotides introduce BglII sites, at the 5′ end, just before the first ATG of the SSO11867 ORF, and at the 3′ end of the SSO3060 ORF. The resulting DNA fragment was cloned in the pGEX-2TK plasmid (Amersham Biosciences). In the plasmid obtained, pGEX-11867/3060, GST was fused to the N terminus of the SSO11867 gene product; the fusion and the entire DNA fragment obtained after amplification were verified by DNA sequencing.
Site-directed Mutagenesis-The vector pGEX-11867/3060, described above, was used as template for site-directed mutagenesis experiments (24). The synthetic oligonucleotides used were α-fuc-3 and α-fuc-4, and the following mutagenic oligonucleotide FrameFuc, 5′-phosphate-GTTACTGGGCCGAAATTCTTTAGGTGATATTGG-3′ where the mismatched nucleotides are underlined. The DNA fragment containing the mutations was subcloned in the vector pGEX-11867/3060; the mutant clone, named pGEX-frameFuc, was identified by direct sequencing and was completely re-sequenced.
Protein Purification-E. coli BL21(DE3)/pET-3060 was grown in 2 liters of Super Broth at 37°C. Gene expression was induced by the addition of 1 mM isopropyl-1-thio-β-D-galactopyranoside when the culture reached an A600 of 1.0. Growth was allowed to proceed for 16 h, and cells were harvested by centrifugation at 5,000 × g and frozen at −20°C. The resulting cell pellet was thawed, resuspended in 2 ml g−1 of cells of 50 mM sodium phosphate buffer, pH 7.4, 150 mM NaCl, 1% (v/v) Triton X-100 (PBS-Triton buffer), and homogenized with a cell disrupter (Constant Systems Ltd., Warwick, UK). After disruption, the homogenate was diluted 1:1 with the same buffer and centrifuged for 30 min at 30,000 × g; cell debris was discarded, and the crude extract was diluted 10-fold in the same buffer; no activity on pNp-Fuc at 65°C was observed (see below for the standard enzymatic assay).
Growth of E. coli BL21(RB791)/pGEX-11867/3060 and total protein extraction were performed as described for pET-3060. The crude extract was applied to a glutathione-Sepharose 4B column (Amersham Biosciences) that had been equilibrated with the same buffer. After washing with 10 column volumes of PBS buffer (without Triton), the fusion protein was eluted from the column by the addition of 500 mM Tris-HCl, pH 8.0, supplemented with 10 mM reduced glutathione, at room temperature (22-25°C); the eluate was collected in 1.5-ml fractions and assayed for GST activity at 25°C. Active fractions were collected and stored at 4°C. The active pool was then subjected to thrombin treatment; to this aim, pooled fractions (about 30 ml) were incubated at 4°C overnight with 30 units of thrombin solution (Amersham Biosciences). This sample, analyzed by SDS-PAGE, did not reveal any band compatible with the molecular weight of the full-length α-fucosidase but showed low activity on pNp-Fuc at 65°C.
Mutant α-fucosidase was expressed from E. coli BL21(RB791)/pGEX-frameFuc and extracted as described for pET-3060. The GST binding was performed by adding 3 ml of the glutathione-Sepharose 4B matrix (Amersham Biosciences), equilibrated with the same buffer, to the crude extract and incubating overnight at 4°C. After the binding, the matrix was packed and washed with 30 column volumes of PBS buffer (without Triton); it was then resuspended in 1 volume of PBS buffer and incubated overnight at 4°C with 60 units of thrombin solution. The efficiency of thrombin cleavage was tested by loading onto SDS-PAGE an amount of the matrix slurry before and after the thrombin treatment. Thereafter, the soluble and GST-free α-fucosidase protein was recovered by washing with 5 column volumes of PBS buffer. Washes containing the α-fucosidase protein were pooled and concentrated by ultrafiltration on an Amicon YM30 membrane (cut-off 30,000 Da). After this treatment, the α-fucosidase was >95% pure by SDS-PAGE and was used for all the subsequent characterizations. The purification procedure yielded about 2 mg of pure protein from 13.7 g of wet cell pellet. The sample stored at 4°C is stable for several months. Direct sequencing of the N terminus of the purified enzyme produced the sequence: Ser-Val-Gly-Ser-Met-Ser-Gln-Asn-Ser-Tyr-Lys-Ile-Leu-Lys-, in which the underlined amino acids correspond to the N terminus of SSO11867.
Enzyme Characterization-The standard assay of α-fucosidase activity was performed at 65°C in 50 mM sodium phosphate buffer at pH 6.3, with pNp-Fuc substrate at the final concentration of 1 mM. The molar extinction coefficient of p-nitrophenol is 9340 M−1 cm−1 measured at 405 nm, at 65°C, in 50 mM sodium phosphate buffer, pH 6.5. One unit of enzyme activity was defined as the amount of enzyme catalyzing the hydrolysis of 1 µmol of substrate in 1 min under the conditions described. Spontaneous hydrolysis of the substrate (about 0.1%) was subtracted by using appropriate blank mixtures without enzyme. The kinetic constants of the α-fucosidase mutant (Ssα-fuc) on pNp-Fuc were measured at 65°C in 50 mM sodium phosphate buffer, pH 6.3, by using substrate concentrations in the range 0.005-3 mM. The protein concentration in the reaction mixture was 0.4 µg ml−1. All kinetic data were calculated as the average of at least two experiments and were plotted and refined with the program GraFit (25).
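The conversion from an absorbance change at 405 nm to enzyme units follows from the Beer-Lambert law and can be sketched as below. This is a minimal illustration, not the authors' procedure: it assumes the conventional unit of 1 µmol of product per min, and the 1-ml assay volume and 1-cm path length are assumed cuvette parameters that are not given in the text. The Michaelis-Menten function is included only to show the rate law to which kinetic constants of this kind are normally fitted.

```python
EPSILON_PNP = 9340.0  # M^-1 cm^-1 for p-nitrophenol at 405 nm (value quoted in the text)


def activity_units(delta_a405_per_min, blank_delta_a405_per_min=0.0,
                   volume_ml=1.0, path_cm=1.0):
    """Return enzyme units (µmol of p-nitrophenol released per min).

    volume_ml and path_cm are assumed assay parameters, not values from the text.
    """
    # Subtract the spontaneous hydrolysis measured in the enzyme-free blank.
    net_rate = delta_a405_per_min - blank_delta_a405_per_min
    # Beer-Lambert: concentration change per minute in mol L^-1 min^-1.
    molar_per_min = net_rate / (EPSILON_PNP * path_cm)
    # Convert to µmol min^-1 in the assay volume.
    return molar_per_min * (volume_ml / 1000.0) * 1e6


def specific_activity(units, protein_mg):
    """Specific activity in units per mg of protein."""
    return units / protein_mg


def michaelis_menten(s_mM, vmax, km_mM):
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s_mM / (km_mM + s_mM)
```

For instance, `specific_activity(activity_units(0.0934), 0.0004)` would describe a hypothetical reading of 0.0934 absorbance units per min obtained with 0.4 µg of protein in 1 ml.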
Thermal activity of Ssα-fuc was analyzed by assaying the enzyme (0.8 µg) at pNp-Fuc concentrations of 1 and 3 mM in the temperature ranges of 40-65 and 70-95°C, respectively. Thermal stability was tested by incubating pure enzyme (0.08 mg ml−1) in PBS buffer at the indicated temperatures, as reported previously (26).
Molecular mass of denatured Ssα-fuc was determined by SDS-PAGE in both reducing and non-reducing conditions. Molecular mass of native Ssα-fuc was determined by gel filtration on a Superose 6 column 26/60 HiLoad (Amersham Biosciences) run in PBS buffer at a flow rate of 0.3 ml min−1; molecular weight markers were run under the same conditions.

Identification and Sequence Analysis of the α-Fucosidase Locus-S. solfataricus, strain P2, is a hyperthermophilic Archaeon able to grow at acidic pH (pH 3-5) and at high temperatures (80-87°C). In an effort to determine the full set of glycosyl hydrolases produced by this Archaeon, we analyzed the ORFs putatively encoding these enzymes in the sequenced genome (17). Two of these ORFs, SSO11867 and SSO3060, encode polypeptides of 81 and 426 amino acids, respectively, and are homologous to the N- and C-terminal parts, respectively, of GH29 α-fucosidases (Fig. 1) (15). SSO11867 and SSO3060 are separated by a −1 frameshift in a 40-base overlap (Fig. 2); this is consistent with the observation that no α-fucosidase activity could be found in S. solfataricus extracts obtained from cells grown on different media (glucose, fucose, or sucrose as the only energy sources or in combination with yeast extract and casamino acids). To test if the frameshift was due merely to sequencing errors, DNA fragments containing the overlapping region were amplified from the genomes of S. solfataricus P2 and MT4 strains, which are closely related taxonomically (29), and directly sequenced. The sequences were identical to the one published (Fig. 3A), confirming the presence of a −1 frameshift between these two ORFs. Moreover, a cDNA fragment was amplified by RT-PCR from a total RNA preparation of S. solfataricus, strain P2, and directly sequenced (Fig. 3B). Again, the obtained sequence turned out to be identical to that available from the data bank with no ambiguities (Fig. 3C), indicating that the population of RNA amplified by RT-PCR was identical to the genomic DNA; this result excludes the possibility of RNA-editing events and demonstrates that SSO11867 and SSO3060 are co-transcribed in vivo.
Primer extension experiments on the same total RNA preparations were performed with oligonucleotides that anneal to the SSO11867 and SSO3060 ORFs (Fig. 4). The primer Fext2 showed a specific transcriptional initiation site nine nucleotides upstream from the first ATG of the SSO11867 ORF (Fig. 4); this was confirmed by primer Fext1, which, however, showed additional initiation sites (Fig. 4). The strongest signal shown by Fext1 maps to the third base of the first putative ATG of the C-terminal ORF SSO3060 (Fig. 4). However, the expression of the ORF SSO3060 in E. coli from pET-3060 did not produce any protein band of the expected molecular weight (46.5 kDa), and no detectable α-fucosidase activity on the pNp-Fuc substrate at 65°C was found. This suggests that this ORF could not express a functional enzyme independently; this is not surprising because the N-terminal ORF SSO11867 contains an amino acid sequence conserved among GH29 α-fucosidases (Fig. 1).
Potential promoter sequences were found at −26 nucleotides from the transcription initiation site upstream from SSO11867, whereas no such consensus could be identified in the region upstream from SSO3060 (Fig. 4).
Preparation of a Full-length α-L-Fucosidase-The complete S. solfataricus α-fucosidase locus, with SSO11867 and SSO3060 in different reading frames (pGEX-11867/3060), drives the expression in E. coli of trace amounts of α-fucosidase activity; after removal of GST a specific activity of 2.3 × 10−2 units mg−1 on pNp-Fuc at 65°C was found. A detailed analysis of the DNA sequence of the region of overlap between the two ORFs revealed the presence of a stretch of six adenines followed by a thymine (Fig. 2) that resembles one of the heptamers that are involved in programmed −1 frameshifting (19). Typically, the sites cis-regulating these events consist of a "slippery" heptameric sequence of the general form X-XXY-YYZ (codons are in the zero frame, and X, Y, and Z can be identical or different nucleotides) and often include an upstream Shine-Dalgarno sequence and a downstream mRNA secondary structure. In fact, it has been reported that a Shine-Dalgarno sequence along the mRNA and pseudoknots or stem-loops promote the pausing of the ribosome (19). The sequence of overlap between the ORFs SSO11867 and SSO3060 presents a similar organization. The slippery sequence A-AAA-AAT was immediately followed by a stem loop, and the rare codon CAC, which is used at low frequency by S. solfataricus (4.7 per thousand in the S. solfataricus genome), was found upstream from the slippery sequence (Fig. 2). This rare codon presumably plays the function of the Shine-Dalgarno sequence, which is rarely observed in isolated genes in this Archaeon (30), by inducing the pausing of the ribosome and increasing the frequency of the frameshifting event. These observations raised the hypothesis that a programmed −1 frameshift could promote the expression of an active enzyme in S. solfataricus; however, testing this hypothesis in vivo is hampered by the lack of molecular genetic tools for hyperthermophilic Archaea.
FIG. 2. The α-fucosidase locus in S. solfataricus. The N-terminal SSO11867 ORF is in the zero frame, and the C-terminal SSO3060 ORF, for which only a fragment is shown, is in the −1 frame. The slippery heptameric sequence is underlined; the rare codon is boxed, and the arrows indicate the stem of the putative mRNA secondary structure. The amino acids involved in the programmed −1 frameshift and the first codon translated after this event in the −1 frame are shaded.
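The X-XXY-YYZ pattern of a slippery heptamer described in the text can be searched for with a few lines of code. The sketch below is illustrative only; the function name and the `frame` parameter are hypothetical, and it checks just the heptamer itself, not the rare codon or the downstream stem loop. It reports positions where three identical bases are followed by three identical bases and any seventh base, with the XXY triplet starting a zero-frame codon.

```python
def find_slippery_heptamers(seq, frame=0):
    """Return 0-based start indices of X-XXY-YYZ heptamers in seq.

    frame is the index of the first base of any zero-frame codon. A hit
    requires XXX and YYY to be runs of one base each (X, Y, and Z may be
    the same or different), with XXY aligned to a zero-frame codon.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 6):
        heptamer = seq[i:i + 7]
        xxx, yyy = heptamer[0:3], heptamer[3:6]
        # The first X is the last base of the preceding zero-frame codon.
        in_frame = (i + 1 - frame) % 3 == 0
        if in_frame and len(set(xxx)) == 1 and len(set(yyy)) == 1:
            hits.append(i)
    return hits


# Overlap region quoted in the text, zero frame starting at the first base.
wild_type = "CTAAAAAATTCGGCC"
print(find_slippery_heptamers(wild_type))
```

On the quoted fragment the only hit starts at the last base of the CTA codon, which corresponds to the A-AAA-AAT heptamer.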
To test whether a functional α-fucosidase could be produced, a single frame between the ORFs was restored by site-directed mutagenesis. In the mechanism of programmed −1 frameshifting proposed for Eukarya and bacteria, two tRNAs, hybridized to the XXY and YYZ codons of the X-XXY-YYZ sequence, are proposed to slip simultaneously backwards on the mRNA to the −1 frame, hybridizing to the XXX and YYY codons. The AAT triplet, coding for Asn-78 in SSO11867, corresponds to the YYZ codon and is the last one decoded in the zero frame (19); after this triplet, the ribosome would shift onto the TTC codon of Phe-10 (SSO3060 numbering), continuing the translation in the −1 frame (Fig. 2). To obtain the fused gene, we performed site-directed mutagenesis in the pGEX-11867/3060 vector, in which the glutathione S-transferase (GST) enzyme was fused to the N terminus of SSO11867. On the basis of the mechanism proposed, a T was introduced in the region following the slippery heptamer; moreover, we introduced the conservative mutation AAA→AAG (encoding Lys-77 in SSO11867) to increase the translational fidelity by disrupting the heptameric sequence. These mutations changed the putative −1 frameshifting site from CTA-AAA-AAT-TCG-GCC (zero frame, slippery heptamer underlined) to CTA-AAG-AAT-TTC-GGC (the nucleotides in boldface were originally in the −1 frame; the mutations are underlined). As a result, the ORFs are in the same translation frame, producing a single polypeptide.
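The equivalence between a −1 slip after the AAT codon and the two engineered mutations can be checked by translating the sequences quoted in the text. The sketch below is a simplified model under stated assumptions: the helper names are hypothetical, the codon table is only the subset of the standard genetic code needed for this fragment, and the slip is modeled as the ribosome simply re-reading the last base of the AAT codon.

```python
# Subset of the standard genetic code covering the codons in the quoted fragment.
CODON_TABLE = {
    "CTA": "L", "AAA": "K", "AAG": "K", "AAT": "N",
    "TCG": "S", "GCC": "A", "TTC": "F", "GGC": "G",
}


def translate(seq, start=0):
    """Translate complete codons of seq beginning at index start."""
    return "".join(CODON_TABLE[seq[i:i + 3]]
                   for i in range(start, len(seq) - 2, 3))


def translate_with_minus1_shift(seq, last_zero_frame_codon_start):
    """Decode the zero frame through the given codon, then slip back one base."""
    end = last_zero_frame_codon_start + 3
    return translate(seq[:end]) + translate(seq, start=end - 1)


wild_type = "CTAAAAAATTCGGCC"  # CTA-AAA-AAT-TCG-GCC, zero frame
mutant = "CTAAAGAATTTCGGC"     # CTA-AAG-AAT-TTC-GGC after mutagenesis

print(translate(wild_type))                       # zero-frame reading
print(translate_with_minus1_shift(wild_type, 6))  # slip after AAT (index 6)
print(translate(mutant))                          # single restored frame
```

In this simplified model the frameshifted reading of the wild-type fragment and the plain reading of the mutant give the same peptide, Leu-Lys-Asn-Phe-Gly, whereas the unshifted wild-type frame continues with Ser-Ala.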
The obtained pGEX-frameFuc plasmid was used to express the mutant enzyme in E. coli. An intense protein band of molecular weight compatible with the GST-α-fucosidase fusion (84.4 kDa) was observed after affinity chromatography of the crude extracts on glutathione-Sepharose 4B (Fig. 5); the removal of GST by thrombin cleavage was performed directly on the column, producing, in a single purification step, a protein more than 95% pure (Fig. 5). The recombinant enzyme showed the expected 57-kDa molecular mass (corresponding to the predicted full-length polypeptide of 495 amino acids) and revealed a specific activity of 32.1 units mg−1 on pNp-Fuc at 65°C; it was termed Ssα-fuc.
Characterization of the α-Fucosidase-The native molecular weight of the enzyme, expressed and purified as described above, was analyzed by gel filtration under native conditions: a single peak containing thermophilic α-fucosidase activity eluted between thyroglobulin (660 kDa) and ferritin (490 kDa) for a molecular mass of about 508 ± 22 kDa, suggesting that Ssα-fuc could be a nonamer in solution (508 kDa divided by the 57-kDa subunit gives about 8.9) (Fig. 6). The enzyme has a broad pH dependence on pNp-Fuc at 65°C, showing almost the same specific activity in the range pH 3.3-6.3 in different buffer systems (data not shown). The thermal activity of Ssα-fuc is reported in Fig. 7A; the activity on pNp-Fuc increased sharply up to the optimal temperature of 95°C, the highest temperature tested. The calculated value of activation energy (Ea) for the hydrolytic reaction of this substrate was 91 ± 2 kJ mol−1.
The residual activity of Ssα-fuc after preincubation at temperatures ranging from 75 to 95°C was followed for up to 2 h on pNp-Fuc at 65°C (Fig. 7B); the enzyme displayed high stability at 75°C, showing even a 40% activation after 30 min of incubation and maintaining 60% residual activity after 2 h at 80°C. The activation energy for the inactivation reaction, calculated from the Arrhenius plot shown in the inset of Fig. 7B, turned out to be 1120 ± 167 kJ mol−1, more than 10-fold the Ea measured for the catalyzed reaction, confirming the extreme thermal stability of Ssα-fuc.
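Activation energies of this kind are obtained from the slope of an Arrhenius plot, ln(rate) against 1/T. The sketch below shows the calculation with a plain least-squares fit. It is not the authors' analysis: the function name is hypothetical, and the rates used in the example are synthetic values generated from an assumed Ea of 91 kJ mol−1 purely to show that the fit recovers it; they are not measurements from the paper.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1


def arrhenius_ea(temps_c, rates):
    """Return Ea in kJ mol^-1 from the least-squares slope of ln(rate) vs 1/T."""
    xs = [1.0 / (t + 273.15) for t in temps_c]
    ys = [math.log(k) for k in rates]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    # Arrhenius: ln k = ln A - Ea / (R T), so the slope equals -Ea / R.
    return -slope * R / 1000.0


# Synthetic, noise-free rates generated from an assumed Ea (illustration only).
temps_c = [40, 50, 60, 65]
synthetic_rates = [math.exp(-91000.0 / (R * (t + 273.15))) for t in temps_c]
recovered_ea = arrhenius_ea(temps_c, synthetic_rates)

# Ratio of the two activation energies quoted in the text.
inactivation_to_catalysis = 1120.0 / 91.0
```

With the values quoted in the text, the ratio of the inactivation to the catalytic activation energy is about 12, in line with the "more than 10-fold" statement.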
Ssα-fuc Promotes Transfucosylation Reactions-In order to test the transfucosylating activity of Ssα-fuc, we investigated the pyranosidic acceptor pNp-Glc. In the reaction in which pNp-Fuc and pNp-Glc were the donor and the acceptor substrates, respectively, in 5 h Ssα-fuc cleaved the donor, forming fucosylated products with 14% total yield with respect to pNp-Fuc. Interglycosidic linkages were determined by NMR spectroscopy (COSY and 1H-13C NMR correlation), and the signals were assigned as reported under "Experimental Procedures." In the COSY spectrum of one of the products, following the correlations through the pyranosidic protons of the α-D-Glc unit and starting from the anomeric signal at 5.74 ppm (J = 3.9 Hz), it is easy to detect the H3 proton, which correlates with carbon signal at 74. The analysis by TLC of the hydrolysis reaction mixture containing only pNp-Fuc revealed the presence of fucose, of pNp, and of a compound that, after acetylation and NMR spectroscopy, was identified as α-L-Fuc-(1-3)-α-L-Fuc-O-pNp. This disaccharide was used as substrate for enzymatic hydrolysis; interestingly, this reaction mixture revealed, by TLC analysis, the presence of fucose and pNp-Fuc products after incubation at 65°C for 3 min and the complete hydrolysis of the substrate after 10 min (Fig. 8). This indicated that the enzyme cleaved the aryl disaccharide starting from its non-reducing end.

DISCUSSION
Among the disrupted genes in S. solfataricus, we have identified two ORFs, SSO11867 and SSO3060, homologous to eukaryal and bacterial α-fucosidases from family 29 of glycosyl hydrolases; the presence of a −1 shift between these ORFs was confirmed, and we found that they are actively co-transcribed in S. solfataricus. A complete programmed −1 frameshifting regulation site has been identified, suggesting that SSO11867 and SSO3060 could be expressed in vivo by following this mechanism. Remarkably, the two ORFs express in E. coli trace amounts of α-fucosidase activity, confirming that the frameshifting cassette promotes non-regulated frameshifting during overexpression of foreign genes in E. coli (19). Primer extension analysis showed two major transcriptional initiation sites: the first mapped upstream from the two ORFs, and the second started from the G of the first ATG of the SSO3060 ORF. This ORF could not express a functional protein with α-fucosidase activity in E. coli, suggesting that post-transcriptional cleavage may occur on the longer transcript. However, the possibility that SSO3060 expresses in S. solfataricus a polypeptide with a different function, thus explaining the programmed −1 frameshifting regulation, cannot be ruled out; experiments are in progress to study the expression in vivo of these genes.
Two mutations, in key positions to mimic programmed −1 frameshifting, made it possible to drive the expression of the full-length α-fucosidase Ssα-fuc, which is optimally active at 95°C. Remarkably, the enzyme is stable at 80°C, the temperature at which S. solfataricus optimally grows; these observations are strong evidence that the enzyme produced by following the programmed −1 frameshifting could be stable and active in vivo. These results indicate that the SSO11867 and SSO3060 ORFs encode the first archaeal α-fucosidase identified so far, suggesting that they may not be pseudogenes and that programmed −1 frameshifting could be present in this living domain.
The substrate characterization revealed that Ssα-fuc catalyzes the hydrolysis of pNp-Fuc with high efficiency and specificity; in the framework of our mechanistic studies on thermophilic glycosyl hydrolases, the reaction mechanism of Ssα-fuc was analyzed in detail. Glycosyl hydrolases follow two distinct mechanisms that are termed retaining or inverting if the enzymatic cleavage of the glycosidic bond liberates a product with the same or the opposite anomeric configuration of the substrate, respectively (32). Both classes of enzymes employ a pair of carboxylic acid residues in catalysis and operate via transition states with substantial oxocarbenium ion character. The alignment of the amino acid sequences of the 24 GH29 α-fucosidases shown in Fig. 1 revealed two aspartic/glutamic acid residues that are completely conserved in this family (Asp-124 and Asp-146 in Ssα-fuc numbering) and are the best candidates to be involved in catalysis as the nucleophile or the acid/base of the reaction (not shown); experiments are currently in progress to test the functional role of these residues.
In retaining enzymes, when acceptors different from water intercept the reactive glycosyl-enzyme intermediate, they promote transglycosylation reactions (Fig. 9). The ability of Ssα-fuc to function in the transglycosylation mode allowed us to demonstrate experimentally that the enzyme catalyzes the formation of the α-(1,2) and α-(1,3) bonds between fucose and the glucose of pNp-Glc. The α-anomeric configuration of the interglycosidic linkages in the products unequivocally indicates that GH29 α-fucosidases follow a retaining reaction mechanism; this is the first time that the mechanism followed by glycosyl hydrolases from this family has been experimentally demonstrated by following a transglycosylation reaction. Moreover, the activity observed on α-L-Fuc-(1-3)-α-L-Fuc-O-pNp revealed that Ssα-fuc is an exo-glycosyl hydrolase that attacks the substrates from their non-reducing end. The disaccharide α-L-Fucp-(1-3)-α-D-Glc-O-pNp was synthesized previously (33) in 34% yield by using a mesophilic α-L-fucosidase in a reaction with the fluoro derivative of the donor and glucose as acceptor. In our case no efforts were made to optimize the reaction conditions (keeping the acceptor/donor equivalent ratio as low as possible, which is useful for synthetic applications with rare acceptors). However, the thermophilic nature of Ssα-fuc is of interest for biotechnological exploitation of its transferring capability (34).
Family 29 of glycosyl hydrolases includes enzymes from plants, vertebrates, and pathogenic microbes of plants and humans. The data presented here demonstrate that this enzymatic activity is present in all three living domains. The ORF TM0306 from the bacterium Thermotoga maritima, putatively encoding an α-fucosidase, is the only other known example from hyperthermophiles (35); recently, an α-L-fucosidase secreted by the moderate thermophile Thermus sp. Y5 has been isolated and characterized (16). Interestingly, Ssα-fuc showed the highest amino acid sequence identity (40%) with a putative α-fucosidase from the phytopathogenic bacterium Xanthomonas (36), whereas only 25% identity with the T. maritima enzyme was observed; this is surprising because higher similarity between enzymes from hyperthermophiles would be expected. Structural comparisons of Ssα-fuc with the Thermus sp. enzyme, which is a secreted tetrameric protein, are hampered by the only partial sequence available for the eubacterial enzyme; however, Ssα-fuc did not show signal peptides for secretion, and no α-fucosidase activity was found in S. solfataricus media.
α-Fucosidases in higher plants and in mammals are associated with different mechanisms of cell growth and regulation, because they are involved in the modification of fucosylated glucans (1). By contrast, at present, there are only limited data on α-fucosidases from eukaryal and bacterial microorganisms.
The presence, in S. solfataricus, of glycosylated proteins has been reported recently (37); although the composition of their carbohydrate moiety is unknown, different enzymatic activities for the synthesis, modification, and hydrolysis of these glycosidic bonds should be present in this Archaeon and regulated in some manner. The activity of Ssα-fuc (efficient hydrolysis of aryl fucosides, very high substrate specificity, transfucosylation capability) may suggest its involvement in vivo in these biological processes.
Relatively little is known about the function of the 22 glycosyl hydrolases of S. solfataricus; several genes encoding this class of enzymes, including SSO11867 and SSO3060, an α-glucosidase (SSO3051), a β-glucuronidase (SSO3036), a β-xylosidase (SSO3032), and the clustered α-xylosidase (XylS) and β-glycosidase (Ssβ-gly) (SSO3022 and SSO3019, respectively), map in the same region of about 50 kb and are likely to be involved in the degradation of sugars for energy metabolism. In particular, we reported previously that XylS and Ssβ-gly hydrolyze xyloglucan oligosaccharides in vitro in a cooperative way (22). In this regard, a similar situation is present in T. maritima, in which the α-fucosidase gene is part of a cluster of six ORFs potentially involved in xyloglucan utilization (22); remarkably, S. solfataricus, strain P2, hydrolyzes xyloglucan from tamarind seed, which is not fucosylated. The ability of Ssα-fuc to hydrolyze short fucosylated oligosaccharides is revealed by the rapid hydrolysis observed using the α-L-Fuc-(1-3)-α-L-Fuc-O-pNp substrate. The hypothesis that Ssα-fuc, XylS, and Ssβ-gly cooperate in the sequential hydrolytic steps on fucosylated xyloglucans from plant cell walls (38) is tempting and is currently under investigation, in the hope of shedding some light on a novel metabolic pathway in S. solfataricus for the utilization of this hemicellulose as a carbon source.