Cloning, Expression, Characterization, and Nucleophile Identification of Family 3, Aspergillus niger β-Glucosidase

The β-glucosidase from Aspergillus niger (CMI CC 324262) was purified, and an N-terminal sequence and two internal sequences were determined. The bgl1 genomic gene and the cDNA were cloned from a genomic library and by reverse transcriptase-polymerase chain reaction, respectively. The cDNA was successfully expressed in Saccharomyces cerevisiae and Pichia pastoris. Sequence analysis revealed that the gene encodes a 92-kDa enzyme that is a member of glycosidase family 3. ¹H NMR analysis of the reaction catalyzed by this enzyme confirmed that, in common with other family 3 glycosidases, this enzyme hydrolyzes with net retention of anomeric configuration. Accordingly, the enzyme was inactivated by 2-deoxy-2-fluoro-β-glucosyl fluoride, with kinetic parameters of k_i = 4.5 min⁻¹ and K_i = 35.4 mM, through the trapping of a covalent glycosyl enzyme intermediate. The catalytic competence of this intermediate was demonstrated by the fact that incubation with linamarin resulted in reactivation, presumably via a transglycosylation mechanism. Peptic digestion of the 2-deoxy-2-fluoroglucosyl enzyme and subsequent analysis of high pressure liquid chromatography eluates

β-Glucosidases (EC 3.2.1.21; β-D-glucoside glucohydrolase) play a number of important roles in biology, including the degradation of cellulosic biomass by fungi and bacteria, the degradation of glycolipids in mammalian lysosomes, and the cleavage of glucosylated flavonoids in plants. These enzymes are therefore of considerable industrial interest, not only as constituents of cellulose-degrading systems, but also in the food industry (2,3).
Aspergillus species are known as a useful source of β-glucosidases (4-6), and Aspergillus niger is by far the most efficient producer of β-glucosidase among the microorganisms investigated (4). Shoseyov et al. (7) have described a β-glucosidase from A. niger B1 (CMI CC 324626), which is active at low pH values as well as in the presence of high ethanol concentrations. This enzyme effectively hydrolyzes flavor compound glycosides in certain low pH products, such as wine and passion fruit juice, thereby enhancing their flavor (8-11), and is particularly attractive for use in the food industry because A. niger is considered nontoxic (3). Other A. niger β-glucosidases have also been purified (12-14); however, differences in their properties have been reported, including ranges of molecular masses (116-137 kDa), isoelectric points (pI values of 3.8-4), and pH optima (3.4-4.5). Indeed, at least two β-glucosidases with distinct substrate specificities have been identified in commercial A. niger β-glucosidase preparations (15). To clear up this confusion, and also to allow protein engineering work to be performed, it was important to clone, express, and characterize a β-glucosidase from this source. Although the cloning and expression of a functional A. niger β-glucosidase gene in Saccharomyces cerevisiae has been reported previously (16), the protein was not characterized, and the sequence was not published.
Glycosidases have been assigned to families on the basis of sequence similarities; some 77 such families, containing over 2000 different enzymes, have now been defined (17). With the exception of the glucosylceramidases (family 30), all simple β-glucosidases belong to either family 1 or family 3. Family 1 contains enzymes from bacteria, plants, and mammals, including 6-phospho-glucosidases and thioglucosidases; further, most family 1 enzymes also have significant galactosidase activity. Family 3 contains β-glucosidases and hexosaminidases of fungal, bacterial, and plant origin. Enzymes from both families hydrolyze their substrates with net retention of anomeric configuration, presumably via a two-step, double-displacement mechanism involving two key active site carboxylic acid residues (see Refs. 18-20 for reviews of mechanism). In the first step, one of the carboxylic acids (the nucleophile) attacks at the substrate anomeric center, whereas the other (the acid/base catalyst) protonates the glycosidic oxygen, thereby assisting the departure of the aglycone. This results in the formation of a covalent glycosyl enzyme intermediate. In the second step, this intermediate is hydrolyzed by general base-catalyzed attack of water at the anomeric center of the glycosyl enzyme to release the β-glucose product. Both the formation and the hydrolysis of this intermediate proceed via transition states with substantial oxocarbenium ion character.
It seemed probable that the A. niger β-glucosidase would be a member of family 3, given that this family contains fungal enzymes of similar mass, including those from other Aspergillus sp. Mechanistic information on this family is relatively sparse, the best characterized member being the glycosylated 170-kDa β-glucosidase from Aspergillus wentii studied by Legler. He showed that this enzyme carries out hydrolysis with net retention of anomeric configuration, labeled the active site with conduritol B-epoxide, and demonstrated that the residue labeled (an aspartic acid within the sequence VMSDW) was the same as that derivatized by the slow substrate D-glucal (1,21). Further, he showed that the 2-deoxyglucosyl enzyme trapped by use of D-glucal was kinetically identical to that formed during the hydrolysis of pNP-2-deoxy-β-D-glucopyranoside (22). The full gene sequence for the A. wentii enzyme has not yet been determined, but the active site peptide identified aligns well with a highly conserved region in other family 3 glycosidases. Legler et al. (23) also carried out a detailed kinetic analysis of the enzyme, including measurement of Hammett relationships, kinetic isotope effects, and studies of the binding of potent reversible inhibitors such as gluconolactone and nojirimycin. Unfortunately, no structural data are available on this enzyme. However, very recently the three-dimensional structure of another family 3 enzyme, the β-glucosidase from barley, has become available (24). This protein has an (α/β)₈ barrel fold with an active site containing several conserved residues, including candidates for the nucleophile and acid/base catalyst, which have been suggested, on the basis of this structure, to be Asp-285 and Glu-491, respectively. However, the authors expressed some doubt as to which roles were played by these residues. Further, there are in fact six subgroups within family 3, with the A. niger enzyme and the barley enzyme belonging to quite different subgroups.
This also leaves room for doubt as to whether all members do indeed use the same catalytic mechanism. There seemed, therefore, to be a need for unequivocal identification of the active site nucleophile in this enzyme. A reliable method for the identification of the active site nucleophile in retaining glycosidases has been developed that involves trapping of a 2-deoxy-2-fluoro-glycosyl enzyme intermediate by treatment of the enzyme with the appropriate 2-deoxy-2-fluoro-β-glycosyl fluoride (25). Trapping of the intermediate is a consequence of two effects: the fluorine substituent inductively destabilizes the oxocarbenium ion-like transition states, thereby slowing both the formation and the hydrolysis of the intermediate, while the excellent fluoride leaving group accelerates the first step, thereby rendering the intermediate kinetically accessible. The labeled residue can then be identified through standard protocols. This method should therefore be applicable to the A. niger enzyme, permitting the assignment of these two residues.
This manuscript describes the cloning, expression, purification, and characterization of the biotechnologically important β-glucosidase from A. niger. It also provides a direct identification of the active site nucleophile by use of the mechanism-based inactivator 2-deoxy-2-fluoro-β-glycosyl fluoride, thereby removing any residual uncertainty left upon analysis of the three-dimensional structure of the barley enzyme.

MATERIALS AND METHODS
Enzyme Assays-The plate assay was conducted as follows: 0.5 mM (final concentration) MUGlc (Sigma) was dissolved in 45°C PC buffer (50 mM phosphate, 12 mM citric acid, pH 3.4). The solution was mixed with 3% agar in water that had been boiled and cooled to 45°C. This solution (20 ml) was poured into a Petri dish, and after solidification 10-μl enzyme samples were spotted. The plate was incubated at 50°C for 1 h and then illuminated with long-wave UV light. An intense fluorescence was indicative of β-glucosidase activity. Detection of β-glucosidase in polyacrylamide gels was carried out by washing the SDS-polyacrylamide gel with 1:1 isopropanol:PC buffer to remove SDS and renature the enzyme. The gel was then washed once in PC buffer and incubated in a thin layer of a solution of 0.5 mM MUGlc. After incubation at 50°C for 1 h, the active protein band was visualized by UV light. Quantitative assays were performed using pNPGlc as a substrate according to Shoseyov et al. (7).
Purification of A. niger β-Glucosidase-A crude preparation of A. niger B1 (CMI CC 324626) β-glucosidase was obtained from Shaligal Ltd. (Tel-Aviv, Israel). A sample (10 ml) of the crude enzyme (140 units/ml) was first diafiltered on a 50-kDa cut-off Amicon membrane (Amicon Corp., Danvers, MA) with 20 mM citrate buffer, pH 5. The proteins were then separated on a fast protein liquid chromatography system equipped with a Mono-Q HR 5/5 column (Amersham Pharmacia Biotech) equilibrated with the same buffer. The enzyme was eluted with a linear gradient of 0-350 mM NaCl. Active fractions (eluting between 80 and 110 mM NaCl) were pooled. The partially purified enzyme was dialyzed against 20 mM citrate buffer, pH 3.5, applied to a Resource-S column equilibrated with the same buffer, and eluted with a gradient of 0-1 M NaCl. The purified enzyme (eluted at 155 mM NaCl) was concentrated by ultrafiltration (50-kDa cut-off membrane, Amicon).
Deglycosylation of A. niger β-Glucosidase by N-Glycosidase-F-A reaction mixture (total volume 12.5 μl) containing 0.125 μg of pure β-glucosidase (previously denatured by boiling for 3 min in 0.1% SDS and 5% β-mercaptoethanol), 0.2 units of N-glycosidase-F (Roche Molecular Biochemicals), 50 mM sodium phosphate buffer, pH 7.5, 25 mM EDTA, 1% Triton X-100, and 0.02% sodium azide was incubated for 4 h at 37°C. The reaction was stopped by the addition of PAGE sample application buffer and boiling for 3 min.
Proteolysis and N-terminal Sequences of A. niger B1 β-Glucosidase-Partial enzymatic proteolysis with Staphylococcus aureus V8 protease was carried out as described by Cleveland et al. (26), as follows. Five micrograms of fast protein liquid chromatography-purified β-glucosidase were concentrated by acetone precipitation. The protein was separated by preparative 10% SDS-PAGE. The gel was stained with Coomassie Blue to visualize the β-glucosidase band, destained, and rinsed with cold water; the protein band was then excised, and the gel slice was applied to a second SDS-PAGE gel (15% acrylamide) and overlaid with S. aureus V8 protease. Digestion was carried out within the stacking gel: when the bromphenol blue dye neared the bottom of the stacking gel, the current was turned off for 30 min and then restored. The cleavage products separated in this manner by SDS-PAGE were electroblotted to polyvinylidene difluoride membranes. The native protein was also transferred to polyvinylidene difluoride. The N-terminal sequences of the native protein and of two of the numerous cleavage products were determined by Edman degradation using a gas-phase protein microsequencer (Applied Biosystems model 475A).
Cloning of bgl1 cDNA and Genomic Gene-Total RNA was isolated from A. niger B1 in the following manner: A. niger B1 was grown in liquid culture in mineral medium containing (NH₄)₂SO₄·3H₂O (0.5 g/l), KH₂PO₄ (0.2 g/l), MgSO₄ (0.2 g/l), CaCl₂·H₂O (0.1 g/l), FeSO₄·6H₂O (0.001 g/l), ZnSO₄·7H₂O (0.001 g/l), and 2 mM citric acid, at pH 3.5, with 1% w/v bran as a carbon source. The medium was autoclaved, cooled, and inoculated with A. niger B1 (10⁶ spores/ml). Baffled flasks were used with shaking at 200 rpm at 37°C. The appearance of β-glucosidase activity was monitored by placing 5 μl of growth medium on 1% agar plates containing 0.5 mM MUGlc. Activity was detected after 15 h of shaking. The mycelium was harvested after 24 h, and medium was removed by filtering through GF/A glass microfibre filters (Whatman Inter. Ltd., Maidstone, United Kingdom). The mycelium was then frozen in liquid nitrogen and ground to a fine powder with a mortar and pestle. Total RNA was then prepared from this powder by the guanidine thiocyanate (TriReagent™) method (Molecular Research Center, Inc.).
cDNA was produced in the following manner. The reverse transcriptase reaction, using the total RNA from the extraction described above, was carried out with the Stratagene reverse transcriptase-polymerase chain reaction kit (Stratagene, La Jolla, CA). The reaction volume was 50 μl and contained 10 μg of total RNA, 1 μg of oligo(dT)₁₈, 20 units of RNase Block Ribonuclease Inhibitor, 1× buffer (50 mM Tris-HCl, pH 8.3, 75 mM KCl, 10 mM dithiothreitol, 3 mM MgCl₂), 500 μM of each dNTP, and 300 units of reverse transcriptase. The RNA was denatured at 70°C and cooled slowly at room temperature to allow the annealing of primers before it was added to the reaction mixture. The reaction mixture was incubated at 37°C for 1 h and then heated at 95°C for an additional 5 min. The cDNA from the reaction was kept at −70°C and used for a PCR reaction with degenerate primers. Degenerate primers were synthesized based on part of the N-terminal sequence and an internal sequence determined by Edman degradation. Amino acid sequence 1:
Preparation of the genomic DNA plasmid library was as follows: An A. niger B1 genomic library was constructed in the pYEUra3 yeast/Escherichia coli shuttle vector (CLONTECH Lab. Inc., Palo Alto, CA). A. niger B1 was grown in liquid culture as described above for total RNA isolation, except that the mycelium was harvested after 48 h. The mycelium, ground in liquid nitrogen, was used to produce genomic DNA by the CTAB method of Murray and Thompson (27). The library was constructed from genomic DNA partially digested with Sau3A and cloned into the BamHI site of the shuttle vector. The vector was digested with BamHI and dephosphorylated with calf intestine alkaline phosphatase (CIP) to prevent self-ligation. The partially digested genomic DNA was then ligated into the shuttle vector with T4 ligase and used to transform TOP10 E. coli electro-competent cells, which were then plated on LB-agar (50 μg/ml ampicillin). The 2.2-kilobase partial cDNA was digested with PstI to produce a 1.2-kilobase DNA fragment for use as a probe. A total of 4 × 10⁴ colonies were grown on LB-agar (50 μg/ml ampicillin) plates and then blotted to Hybond™-N membranes. The colonies were screened using the 1.2-kilobase fragment. A sample (25 ng) of the probe was labeled with [³²P]dCTP by using the random-sequence nonanucleotide Rediprime DNA labeling system (Amersham Pharmacia Biotech). Positive clones were subcloned in pUC18, and nucleotide sequences were determined at the Weizmann Institute's Department of Biological Services, Rehovot, Israel.
The isolated cDNA was digested with NcoI and BamHI and cloned into a pET3d expression vector (Novagen Inc., Madison, WI). Positive E. coli BL21(DE3) pLysS colonies containing the bgl1 cDNA were confirmed by restriction enzyme digestion and sequence analysis. Recombinant BGL1 was expressed according to the manufacturer's protocol.
Expression of bgl1 cDNA in S. cerevisiae and Pichia pastoris-The bgl1 cDNA was cloned into the HindIII/BamHI sites of the pYES2 vector (Invitrogen Inc, San Diego, CA), yielding the pYES2-bgl1 plasmid, which was used to transform S. cerevisiae by the lithium acetate method (28). BGL1 was expressed by inducing the Gal1 promoter according to the manufacturer's protocol. The S. cerevisiae strain INVSc2 (MATa, his3-Δ200, ura3-167) was used as the host. P. pastoris strain GS115 (his4 mutant) was used as the host for the shuttle and expression vector plasmid pHIL-S1 (Invitrogen Inc, San Diego, CA). The bgl1 cDNA was cloned into the EcoRI/BamHI sites of pHIL-S1, yielding the pHIL-S1-bgl1 expression and secretion vector. Expression in P. pastoris was carried out according to the manufacturer's protocol. Screening of β-glucosidase-expressing clones was facilitated by top agar containing 50 mg of 5-bromo-4-chloro-3-indolyl β-D-glucopyranoside, 30 ml of methanol, and 1% agar per liter. Blue color indicated a colony producing active β-glucosidase.
Determination of the Stereochemical Course of Hydrolysis-The method was essentially as described by Wong et al. (29). pNPGlc (10 μmol) was dissolved in 0.5 ml of 25 mM acetate buffer, pH 3.5, in D₂O in an NMR tube. β-Glucosidase was lyophilized and redissolved in 100 μl of D₂O (35 units/ml). The ¹H NMR spectrum of the substrate was recorded, then enzyme (10 μl) was added, and spectra were recorded at time intervals on a Bruker AMX400 spectrometer at 25°C.
Inactivation and Reactivation Studies-Pure enzyme (0.47 mg/ml) was incubated in the presence of various concentrations of 2-deoxy-2-fluoro-β-glucosyl fluoride (2FGlcF) (0.5-6 mM) in 30 mM citrate buffer, pH 4.8, at 50°C. Residual enzyme activity was determined at different time intervals by the addition of an aliquot (10 μl) of the inactivation mixture to a solution (830 μl) containing 30 mM citrate buffer, pH 4.8, 8 μg of bovine serum albumin, and 0.625 mM 2,4-dinitrophenyl β-D-glucopyranoside. Release of 2,4-dinitrophenol was determined spectrophotometrically by measuring the absorbance at 400 nm 1 min after the addition of the substrate. Pseudo first-order inactivation rate constants at each inactivator concentration (k_obs) were determined by fitting each curve to a first-order equation using the program GraFit (30). Kinetic parameters for inactivation were determined by fitting these pseudo first-order rate constants at each inactivator concentration to the following expression.
$$k_{obs} = \frac{k_i[\mathrm{I}]}{K_i + [\mathrm{I}]}$$
Labeling and Proteolysis-To BGL1 (25 μl, 10 mg/ml), 2FGlcF was added (166 mM, 2.5 μl), and the mixture was incubated for 15 min at room temperature. Trifluoroacetic acid/H₂O (10%, 2.5 μl) was then added, and after 4 h at room temperature aqueous sodium phosphate (45 μl, 50 mM, pH 7.0) was added to bring the pH to 2.0. Freshly prepared pepsin (25 μl, 1 mg/ml in 50 mM phosphate, pH 2.0) was then added (giving a pepsin:β-glucosidase ratio of 1:10), and the mixture was incubated overnight.
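The quantities in the labeling step can be checked with a few lines of arithmetic. The sketch below is only an illustrative check, not part of the published protocol: it assumes that volumes are additive on mixing, and the helper names (`mass_ug`, `diluted_conc`) are hypothetical.

```python
# Volumes (in microliters) and concentrations as stated in the labeling protocol.
ENZYME_VOLUME_UL = 25.0
ENZYME_CONC_MG_PER_ML = 10.0
INACTIVATOR_VOLUME_UL = 2.5
INACTIVATOR_STOCK_MM = 166.0
PEPSIN_VOLUME_UL = 25.0
PEPSIN_CONC_MG_PER_ML = 1.0


def mass_ug(volume_ul, conc_mg_per_ml):
    """Protein mass in micrograms (1 mg/ml is the same as 1 ug/ul)."""
    return volume_ul * conc_mg_per_ml


def diluted_conc(stock_conc, added_volume, initial_volume):
    """Concentration after adding a stock to a sample (assumes additive volumes)."""
    return stock_conc * added_volume / (initial_volume + added_volume)


enzyme_ug = mass_ug(ENZYME_VOLUME_UL, ENZYME_CONC_MG_PER_ML)
pepsin_ug = mass_ug(PEPSIN_VOLUME_UL, PEPSIN_CONC_MG_PER_ML)
pepsin_to_enzyme = pepsin_ug / enzyme_ug
inactivator_mm = diluted_conc(
    INACTIVATOR_STOCK_MM, INACTIVATOR_VOLUME_UL, ENZYME_VOLUME_UL
)

print(f"enzyme: {enzyme_ug:.0f} ug, pepsin: {pepsin_ug:.0f} ug")
print(f"pepsin:enzyme mass ratio = {pepsin_to_enzyme:.2f}")
print(f"2FGlcF during labeling = {inactivator_mm:.1f} mM")
```

Under these assumptions the mass ratio agrees with the stated 1:10, and the inactivator concentration during labeling comes out well above the 0.5-6 mM range used in the kinetic experiments.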
Electrospray Mass Spectrometry-Mass spectra were recorded on a PE-Sciex API 300 triple quadrupole mass spectrometer (Sciex, Thornhill, Ontario, Canada) equipped with an Ionspray ion source. Peptides were separated by reverse-phase high pressure liquid chromatography on an Ultrafast Microprotein Analyzer (Michrom BioResources Inc., Pleasanton, CA) directly interfaced with the mass spectrometer. In each of the MS experiments, the proteolytic digest was loaded onto a C18 column (Reliasil, 1 × 150 mm), then eluted with a gradient of 0-60% solvent B over 60 min followed by 100% solvent B over 2 min at a flow rate of 50 μl/min (solvent A: 0.05% trifluoroacetic acid, 2% acetonitrile in water; solvent B: 0.045% trifluoroacetic acid, 80% acetonitrile in water). Spectra were obtained in either the single quadrupole scan mode (LC/MS), the tandem MS neutral loss scan mode (LC/MS/MS), or the tandem MS product ion scan mode (MS/MS).
In the single quadrupole mode (LC/MS) the quadrupole mass analyzer was scanned over a m/z range of 300 to 2200 Da with a step size of 0.5 Da and a dwell time of 1 ms/step. The ion source voltage was set at 5 kV, and the orifice energy was 50 V.
In the neutral loss scanning mode, MS/MS spectra were obtained by searching for the mass loss of m/z 165, corresponding to the loss of the 2FGlc label from a peptide ion in the singly charged state. The settings were as follows: scan range, m/z 300-2200; step size, 0.5; dwell time, 1 ms/step; ion source voltage, 5 kV; orifice energy, 45; Q0 = −10; IQ2 = −54. To maximize the sensitivity of neutral loss detection, the resolution is normally compromised, but only to an extent that does not generate artifact neutral loss peaks.

RESULTS AND DISCUSSION
Purification of Wild Type A. niger β-Glucosidase-The A. niger β-glucosidase enzyme preparation was purified by Mono-Q fast protein liquid chromatography. Active protein samples eluted from the Mono-Q column were separated on a 10% SDS-PAGE gel, stained with Coomassie Blue, and incubated in the presence of MUGlc as described under "Materials and Methods" to demonstrate activity of the enzyme. At this stage of purification, a discrete band having an apparent molecular mass of approximately 160 kDa and β-glucosidase activity could be detected (Fig. 1B). However, the apparent mass of the denatured enzyme (boiled for 10 min in the presence of β-mercaptoethanol) was shown to be 120 kDa on 10% SDS-PAGE (Fig. 1A). The enzyme, designated BGL1, was further purified to homogeneity on a Resource-S column (Fig. 2). Deglycosylation of A. niger β-glucosidase was performed with N-glycosidase-F. SDS-PAGE analysis indicated that approximately 20 kDa of the A. niger β-glucosidase mass can be attributed to N-linked carbohydrates.
FastA analysis (31) indicated that the N-terminal sequence as well as the internal sequences have high similarity with sequences from BGL1 of the yeast Saccharomycopsis fibuligera, a β-glucosidase belonging to family 3 of the glycosyl hydrolases.
Isolation and Characterization of bgl1 cDNA and Genomic Gene-Reverse transcriptase-polymerase chain reaction was used to isolate a cDNA probe that was used to clone the genomic gene. The bgl1 cDNA and the genomic gene were successfully cloned and sequenced (Fig. 3A, GenBank™ accession no. AJ132386). The cDNA sequence perfectly matched the DNA sequence of the combined exons. The open reading frame was found to encode a polypeptide with a predicted molecular mass of 92 kDa. The genomic gene consisted of seven exons interrupted by six introns (Fig. 3B). Analysis of the DNA sequence upstream of the sequence encoding the mature protein revealed a putative leader sequence interrupted by an 82-base pair intron.
Production of rBGL1 in E. coli-rBGL1 was overexpressed in E. coli. No apparent β-glucosidase activity could be detected in the E. coli extracts; however, SDS-PAGE analysis revealed a relatively intense protein band at the expected molecular weight. Western blot analysis, using rabbit polyclonal anti-native BGL1 antibodies, positively identified the 90-kDa protein band (data not shown). Further analysis revealed that the protein accumulated in inclusion bodies. Several refolding experiments were conducted; however, these efforts to produce active protein from E. coli failed (data not shown).
Expression of rBGL1 in S. cerevisiae and P. pastoris-Recombinant BGL1 was successfully expressed in both S. cerevisiae and P. pastoris. In S. cerevisiae a relatively low level of expression was found. The recombinant protein was detected by Western blot analysis (Fig. 4A). The total protein extract of S. cerevisiae expressing bgl1 cDNA had a β-glucosidase activity of 1.9 units/mg protein. No β-glucosidase activity was detected in control S. cerevisiae transformed with the vector only under the same assay conditions. However, no protein band corresponding to rBGL1 could be detected by Coomassie Blue staining. P. pastoris transformed with bgl1 secreted relatively high levels of rBGL1 into the medium (about 0.5 g/liter), appearing as almost pure protein in the culture supernatant (Fig. 4B). This recombinant enzyme was very active (124 units/mg protein), and even without any purification its specific activity was very close to that of the pure native enzyme.
¹H NMR Determination of Stereochemical Outcome-¹H NMR spectra of a reaction mixture containing pNPGlc and BGL1 revealed that the β-anomer of glucose was formed first (H-1 δ = 4.95 ppm), with the α-anomer (H-1 δ = 5.59 ppm) appearing only later as a consequence of mutarotation (Fig. 5). BGL1 is indeed, therefore, a retaining glycosidase, as has been observed for other family members (32,33).
Inactivation and Reactivation of A. niger β-Glucosidase-Enzyme was incubated in the presence of various concentrations of 2FGlcF, and residual enzyme activity was determined at different time intervals. Enzyme activity decreased in a time-dependent manner according to pseudo first-order kinetics, allowing the determination of pseudo first-order rate constants for inactivation at each inactivator concentration. These values were then fit to the expression described under "Materials and Methods" to yield values of the inactivation rate constant (k_i = 4.5 min⁻¹) and the reversible dissociation constant (K_i = 35.4 mM) (Fig. 6, A and B).
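The two-stage analysis described above (a first-order decay at each inactivator concentration, then a saturation fit of the resulting rate constants) can be sketched as follows. This is a minimal illustration, not the authors' procedure: they used nonlinear fitting in GraFit, whereas this sketch substitutes simple linear least squares on log-transformed and double-reciprocal data, and it assumes the saturation form k_obs = k_i[I]/(K_i + [I]). The time points and activity values are synthetic, generated from the reported parameters purely to show that the procedure recovers them.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def first_order_rate(times, activities):
    """k_obs from the slope of ln(residual activity) versus time."""
    slope, _ = linear_fit(times, [math.log(a) for a in activities])
    return -slope


def double_reciprocal_fit(concs, k_obs_values):
    """Fit 1/k_obs = (K_i/k_i)(1/[I]) + 1/k_i and return (k_i, K_i)."""
    slope, intercept = linear_fit(
        [1.0 / c for c in concs], [1.0 / k for k in k_obs_values]
    )
    k_i = 1.0 / intercept
    return k_i, slope * k_i


# Synthetic, noise-free data built from the reported parameters (illustration only).
REPORTED_K_I_RATE = 4.5   # min^-1, inactivation rate constant from the text
REPORTED_K_I_DISS = 35.4  # mM, dissociation constant from the text
concs_mm = [0.5, 1.0, 2.0, 4.0, 6.0]      # within the 0.5-6 mM range used
times_min = [0.0, 0.5, 1.0, 2.0, 4.0]     # hypothetical sampling times

k_obs_values = []
for conc in concs_mm:
    k_true = REPORTED_K_I_RATE * conc / (REPORTED_K_I_DISS + conc)
    activities = [100.0 * math.exp(-k_true * t) for t in times_min]
    k_obs_values.append(first_order_rate(times_min, activities))

k_i, K_i = double_reciprocal_fit(concs_mm, k_obs_values)
print(f"k_i = {k_i:.2f} min^-1, K_i = {K_i:.1f} mM")
```

With real, noisy data a direct nonlinear fit is generally preferable, because the double-reciprocal transform weights the lowest concentrations most heavily. Note also that the highest concentration used (6 mM) is well below the reported K_i, so the parameters are extrapolated from the low-saturation part of the curve.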
Rates of reactivation of 2-deoxy-2-fluoroglucosyl-BGL1 were determined in the presence of different concentrations of linamarin by monitoring activity regain after 0, 10, 20, and 30 min (Fig. 7). The regain of activity followed a first-order process at each linamarin concentration, yielding pseudo first-order reactivation rate constants from which a reactivation rate constant of k_react = 0.1 min⁻¹ and a linamarin dissociation constant of K_react = 12.3 mM were determined. This corresponds to a half-life for reactivation of approximately 7 min at saturating linamarin concentration.
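The quoted half-life follows directly from the first-order rate constant. As a brief worked check, assuming (by analogy with the inactivation analysis; the text does not state the expression explicitly) that the observed reactivation rate constant saturates with linamarin concentration [L]:

```latex
k_{obs} = \frac{k_{react}[\mathrm{L}]}{K_{react} + [\mathrm{L}]}
\;\xrightarrow{\;[\mathrm{L}] \gg K_{react}\;}\; k_{react},
\qquad
t_{1/2} = \frac{\ln 2}{k_{react}} = \frac{0.693}{0.1\ \mathrm{min}^{-1}} \approx 6.9\ \mathrm{min}
```

At [L] = K_react = 12.3 mM the same expression gives half the maximal rate, and hence a half-life of about 14 min.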
Identification of the Labeled Active Site Peptide by Electrospray MS-Peptic hydrolysis of the unlabeled and labeled enzyme samples resulted in mixtures of peptides that were separated by reverse-phase high pressure liquid chromatography using the electrospray MS as a detector. Scanning in the normal LC/MS mode revealed a large number of peaks in the total ion chromatogram in each case (Fig. 8A). Identification of the peptide bearing the 2-fluoroglucosyl label was achieved by use of the tandem mass spectrometer in the neutral loss mode. This involves subjecting the ions introduced into the MS to limited fragmentation by collision with an inert gas in the collision cell. The ester linkage between the fluorosugar and the peptide is relatively labile and is known to fragment via homolytic bond fission with the loss of a neutral sugar of known mass. Scanning the two quadrupoles (Q1 and Q3) in a linked mode, such that only ions losing the mass of the sugar in question are detected, allows facile localization of the labeled peptide. When the spectrometer was scanned in the neutral loss mode looking for peptides that lose a mass of 165 Da, a major peak at 28 min was detected along with a number of minor peaks (Fig. 8B). This major peak was absent in a control sample derived from unlabeled enzyme but otherwise treated identically (Fig. 8C), indicating that the peak at 28 min was indeed the peak of interest. This 2-fluoroglucosylated peptide had m/z 802, as seen in Fig. 8D, and thus corresponds to an unlabeled peptide of m/z 637. A peptide of this same mass (m/z 637) was indeed seen in the LC/MS profile of the peptic digest of the unlabeled enzyme, as would be expected. Further, no peptide of m/z 802 was seen to elute around 28 min in this control digest.
Peptide Sequencing by MS/MS-Both the labeled (m/z 802) and unlabeled (m/z 637) peptides were subjected to collision-induced fragmentation and monitoring in the daughter ion mode, and similar fragmentation patterns were seen in the two cases. The daughter ion scan MS/MS spectrum of the labeled peptide (m/z 802) is shown in Fig. 9, revealing almost complete Y and B ion series. This allowed the sequence of the peptide to be readily deduced as follows.
Upon losing the sugar, the precursor ion (802) gave the fragment at 637 corresponding to the unlabeled peptide. The series of Y ion fragments at 637, 538, 407, 320, and 205 corresponds to a sequence of Val-Met-Ser-Asp-Trp as shown in Fig. 9. Similarly, the fragments at 619, 433, 318, and 231 belong to the B ion series and correspond to the same sequence as that deduced from the Y ions. Interestingly, a parallel series of Y ions deriving from the labeled peptide is also evident. Within this sequence, the only likely candidate for the nucleophile, based on chemical precedent (18-20) and on the observed relatively facile fragmentation in the mass spectrometer, is the aspartic acid residue. Confirmation of this comes from the fragments at 703, 572, and 485, which must arise from a glycosylated peptide, containing Val, Met, Ser, and Asp. In particular, the fragment at 485 can only come from the glycosylated dipeptide Asp-Trp, thereby confirming that the sugar must be on the aspartic acid residue.
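The fragment assignments above can be reproduced with a short calculation. The sketch below computes singly charged B and Y ion masses for Val-Met-Ser-Asp-Trp from standard monoisotopic residue masses and rounds them to nominal values for comparison with the reported peaks. The 165-Da shift for labeled fragments is taken directly from the neutral loss reported in the text rather than derived from the label's formula, and the function names are illustrative only.

```python
# Standard monoisotopic residue masses (Da) for the residues in VMSDW.
RESIDUE_MASS = {
    "V": 99.06841,
    "M": 131.04049,
    "S": 87.03203,
    "D": 115.02694,
    "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728
LABEL_SHIFT = 165  # Da; neutral loss of the 2FGlc label as reported in the text


def b_ions(sequence):
    """Singly charged b ions: N-terminal residue sums plus one proton."""
    masses, total = [], PROTON
    for residue in sequence:
        total += RESIDUE_MASS[residue]
        masses.append(total)
    return masses


def y_ions(sequence):
    """Singly charged y ions: C-terminal residue sums plus water and one proton."""
    masses, total = [], WATER + PROTON
    for residue in reversed(sequence):
        total += RESIDUE_MASS[residue]
        masses.append(total)
    return masses


peptide = "VMSDW"
b_nominal = [round(m) for m in b_ions(peptide)]
y_nominal = [round(m) for m in y_ions(peptide)]

# Y ions that include Asp (y2 and larger) would carry the label if Asp is modified.
labeled_y_nominal = [m + LABEL_SHIFT for m in y_nominal[1:]]

print("b2-b5:", b_nominal[1:])
print("y1-y5:", y_nominal)
print("labeled y2-y5:", labeled_y_nominal)
```

The computed series matches the B ions (231, 318, 433, 619), the Y ions (205, 320, 407, 538, 637), and the labeled fragments (485, 572, 703, and the 802 precursor) listed in the text. The smallest labeled fragment, at 485, is y2 (Asp-Trp) plus the label, which is the basis for placing the sugar on the aspartic acid.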
Amino Acid Sequence Alignment with Other Family 3 Glycosyl Hydrolases-Alignment of the region containing the proposed catalytic nucleophile with this same region from a range of family 3 β-glucosidases revealed that Asp-261 of BGL1 is fully conserved within this family (Fig. 10). Such absolute conservation is fully consistent with the key role played by the catalytic nucleophile. Beyond its role in forming the covalent glycosyl enzyme intermediate and stabilizing the oxocarbenium ion-like transition states, the nucleophile also modulates the ionization state of the acid/base catalytic residue and forms very strong hydrogen bonds to the sugar 2-hydroxyl at the transition state. Substitutions at that center in other glycosidases have been shown to result in loss of essentially all activity (34). Interestingly, a glucose moiety was found in the active site in the recently solved three-dimensional structure of the barley β-glucosidase, and in fact the equivalent residue in that enzyme (Asp-285) was found in close proximity to the anomeric carbon on the α-face of the sugar and formed a hydrogen bond to the sugar 2-hydroxyl (24).
The agreement found with the much earlier studies of Bause and Legler (1) on the A. wentii enzyme is extremely gratifying given the fact that the reagent employed initially in that case, a conduritol epoxide, has in several other cases tagged the wrong active site residue (25). However, in this case such an outcome seemed unlikely given that this same amino acid residue had been labeled using D-glucal, a substrate analogue that can form a stable glycosyl enzyme intermediate in much the same way as do the 2-deoxy-2-fluoro glycosides. Indeed, this early work of Bause and Legler (1) provided the inspiration for the development of the 2-fluorosugar approach.
Conclusion-We describe the purification, characterization, cloning, and overexpression in P. pastoris of the biotechnologically important β-glucosidase from A. niger. We also provide a direct identification of the active site nucleophile in this enzyme by use of the mechanism-based inhibitor 2-deoxy-2-fluoro-β-D-glucosyl fluoride, thereby removing any residual uncertainty left upon analysis of the three-dimensional structure of the family 3 barley enzyme. This also opens the way to generation of mutants modified at this position that may function as "glycosynthases" for the synthesis of oligosaccharides (35).