Mechanistic and Structural Analysis of a Family 31 α-Glycosidase and Its Glycosyl-enzyme Intermediate

We have determined the first structure of a family 31 α-glycosidase, that of YicI from Escherichia coli, both free and trapped as a 5-fluoroxylopyranosyl-enzyme intermediate via reaction with 5-fluoro-D-xylopyranosyl fluoride. Our 2.2-Å resolution structure shows an intimately associated hexamer with structural elements from several monomers converging at each of the six active sites. Our kinetic and mass spectrometry analyses verified several of the features observed in our structural data, including a covalent linkage from the carboxylate side chain of the identified nucleophile Asp to C-1 of the sugar ring. Structure-based sequence comparison of YicI with the mammalian α-glucosidases lysosomal α-glucosidase and sucrase-isomaltase predicts a high level of structural similarity and provides a foundation for understanding the various mutations of these enzymes that elicit human disease.


INTRODUCTION
transferases and the mechanistically interesting α-glucan lyases, which carry out an elimination reaction rather than hydrolysis.
Despite the importance of the family, mechanistic insights are limited. The enzymes are known to be retaining α-glycosidases, which hydrolyse the glycosidic bond with net retention of anomeric configuration via an acid/base-catalysed mechanism involving a covalent glycosyl-enzyme intermediate. Through a range of studies involving affinity labeling, trapping of reaction intermediates and site-directed mutagenesis, the catalytic nucleophile has been identified as an aspartic acid within the consensus sequence WIDMNE (D224 for the A. niger α-glucosidase) (6-14). On the basis of sequence comparisons and kinetic analysis of mutants, the acid/base catalyst has been tentatively assigned as an aspartic acid residue (D647 in SPGase, the Schizosaccharomyces pombe α-glucosidase (14)). Highly oxocarbenium ion-like transition states are suggested by the large α-secondary kinetic isotope effects seen for both α-glucosidases and the α-glucan lyase (15-17), as well as by the tight binding observed for azasugar transition state analogue inhibitors such as nojirimycin and acarbose (17-21). However, a key missing component, which would be particularly important for inhibitor design, is the three-dimensional structure of any member of this family.
Kinetic parameters for α-D-xylopyranosyl fluoride were determined by monitoring the release of fluoride at a range of substrate concentrations using an Orion 96-09 combination fluoride ion electrode interfaced to a computer running the LoggerPro software (Vernier Software Ltd.). Initial rates were used for the determination of kinetic parameters, which were obtained by direct fit of the data to the Michaelis-Menten equation using GraFit 4.0. Initial rates of hydrolysis of ax-5F XF and eq-5F XF by YicI were identical at all concentrations of ax-5F XF (0.1-2 mM) and eq-5F XF (0.1-1.2 mM) assayed. This rate was taken as the Vmax value for each substrate, and each kcat value was calculated by dividing Vmax by the corresponding enzyme concentration.
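The relationship between Vmax, Km and kcat used above can be illustrated with a short sketch. All numerical values below are hypothetical and chosen only to show why a substrate with a very low Km gives the same initial rate at every concentration in the 0.1-2 mM range; they are not the measured values for YicI.

```python
def michaelis_menten(s, v_max, k_m):
    """Initial rate for substrate concentration s (same units as k_m)."""
    return v_max * s / (k_m + s)


def k_cat(v_max, enzyme_conc):
    """Turnover number: V_max divided by the total enzyme concentration."""
    return v_max / enzyme_conc


# Hypothetical values, for illustration only (not taken from the paper).
V_MAX = 2.0e-9    # M s^-1
K_M = 1.0e-6      # M, assumed to lie far below the assayed range
ENZYME = 1.0e-6   # M

for s in (0.1e-3, 1.0e-3, 2.0e-3):
    fraction = michaelis_menten(s, V_MAX, K_M) / V_MAX
    print(f"[S] = {s * 1e3:.1f} mM  v/V_max = {fraction:.3f}")

print(f"k_cat = {k_cat(V_MAX, ENZYME):.1e} s^-1")
```

Under these assumed values the rate is within about 1% of Vmax across the whole range, which is the situation in which the observed rate can be taken directly as Vmax.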

pH-Dependence Studies
Measurement of the pH-dependent activity of YicI was carried out using α-D-xylopyranosyl fluoride as substrate and monitoring fluoride ion release. The following buffers were used: sodium acetate (pH 4.0-5.5), MES (pH 6.0), MOPS (pH 6.5-7.5) and Gly-Gly (pH 8.0-8.5). All buffers were 0.05 M in strength and contained 0.05 M NaCl. The substrate-depletion method was employed as follows. A solution of substrate (0.02 mM, ~50-fold lower than the Km value) was preincubated at 37 °C, then the release of fluoride ion after addition of the enzyme was monitored using an Orion 96-09 combination fluoride ion electrode interfaced to a computer running the LoggerPro software (Vernier Software Ltd.) until at least 80% depletion of substrate had occurred.
Appropriate controls confirmed that, at 37 °C, YicI is stable over the reaction time at each pH value studied. Fitting of the data to a first order rate equation (GraFit 4.0) yielded an apparent rate constant for the reaction, from which the kcat/Km value for each substrate was calculated by dividing that rate constant by the concentration of YicI.
The kcat/Km values obtained were then plotted versus pH and fitted to the appropriate curve using GraFit 4.0, thereby yielding apparent pKa values.
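The substrate-depletion method rests on the fact that, when [S] is far below Km, the Michaelis-Menten equation reduces to v = (kcat/Km)[E][S], so substrate disappears with a first-order rate constant equal to (kcat/Km)[E]. The sketch below simulates such a time course and recovers kcat/Km from it. The enzyme concentration, rate constant and time points are hypothetical, and the log-linear fit is a simple stand-in for the non-linear fit performed in GraFit.

```python
import math


def depletion_curve(times, s0, k_obs):
    """Fluoride released at each time for first-order substrate decay."""
    return [s0 * (1.0 - math.exp(-k_obs * t)) for t in times]


def fit_first_order(times, product, s0):
    """Estimate k_obs from ln(1 - P/S0) = -k_obs * t (line through the origin)."""
    xs, ys = [], []
    for t, p in zip(times, product):
        remaining = 1.0 - p / s0
        if remaining > 0.0:
            xs.append(t)
            ys.append(math.log(remaining))
    return -sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# Hypothetical values, for illustration only.
S0 = 2.0e-5          # M, initial substrate (0.02 mM as in the text)
ENZYME = 1.0e-8      # M, assumed enzyme concentration
K_OBS_TRUE = 1.0e-3  # s^-1, assumed apparent rate constant

times = [200.0 * n for n in range(11)]      # 0 to 2000 s (>80% depletion)
product = depletion_curve(times, S0, K_OBS_TRUE)
k_fit = fit_first_order(times, product, S0)
print(f"k_obs = {k_fit:.3e} s^-1, kcat/Km = {k_fit / ENZYME:.3e} M^-1 s^-1")
```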

Inhibition Studies
Inhibition studies were performed by measuring enzyme activity in the presence of various concentrations of each inhibitor, using PNP Xyl as substrate. YicI, preincubated at 37 °C, was added to 200 µL of buffer solution containing PNP Xyl and varying amounts of inhibitor, also preincubated at the same temperature. The release of p-nitrophenol was monitored spectrophotometrically at 400 nm. The experiments were repeated at different concentrations of PNP Xyl. In a Dixon plot of 1/v versus inhibitor concentration, the lines obtained at the different substrate concentrations intersect on the line 1/v = 1/Vmax at an inhibitor concentration equal to -Ki.
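The Dixon construction can be checked numerically. The sketch below assumes simple competitive inhibition, v = Vmax[S]/(Km(1 + [I]/Ki) + [S]), and uses hypothetical parameter values; it builds the 1/v versus [I] lines for two substrate concentrations and locates their intersection.

```python
def rate(s, i, v_max, k_m, k_i):
    """Competitive inhibition: the inhibitor raises the apparent K_m."""
    return v_max * s / (k_m * (1.0 + i / k_i) + s)


def dixon_line(s, v_max, k_m, k_i, i_ref=1.0e-5):
    """Return (slope, intercept) of 1/v versus [I] at a fixed [S]."""
    y0 = 1.0 / rate(s, 0.0, v_max, k_m, k_i)
    y1 = 1.0 / rate(s, i_ref, v_max, k_m, k_i)
    return (y1 - y0) / i_ref, y0


def intersection(line_a, line_b):
    """Intersection point (x, y) of two lines given as (slope, intercept)."""
    (m1, c1), (m2, c2) = line_a, line_b
    x = (c2 - c1) / (m1 - m2)
    return x, m1 * x + c1


# Hypothetical parameters, for illustration only.
V_MAX, K_M, K_I = 1.0e-6, 1.0e-3, 1.0e-5   # M s^-1, M, M

x, y = intersection(dixon_line(0.5e-3, V_MAX, K_M, K_I),
                    dixon_line(2.0e-3, V_MAX, K_M, K_I))
print(f"[I] at intersection = {x:.3e} M (compare -Ki = {-K_I:.3e} M)")
print(f"1/v at intersection = {y:.3e} (compare 1/Vmax = {1.0 / V_MAX:.3e})")
```

For a purely competitive inhibitor the intersection falls at [I] = -Ki and 1/v = 1/Vmax regardless of which pair of substrate concentrations is chosen.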

Inactivation Kinetics
For analysis of inactivation kinetics, the enzyme (final concentration 0.2-3.6 mg/ml) was pre-incubated with a range of concentrations of ax-5F XF or eq-5F XF at 37 °C or 10 °C. Aliquots (10 µL) of the sample were withdrawn at time intervals and added to 500 µL of 3 mM PNP Xyl pre-equilibrated at the appropriate temperature in the UV/visible spectrometer. The residual enzyme activity at each time interval at each concentration of inactivator was measured in this way. Pseudo-first order rate constants at each inactivator concentration (kobs) were determined by fitting each curve to a first order rate equation.
Values for the inactivation rate constant (ki) and the inactivator dissociation constant (Ki) were determined by fitting kobs values and inactivator concentrations to the following equation: kobs = ki[I]/(Ki + [I]).
Reactivation of the inactivated enzyme was carried out as follows. Enzyme (100 µL, 0.30 mg/mL) fully inactivated by the inactivator at 10 °C was concentrated using 10 kDa nominal cut-off centrifugal concentrators (Amicon Corp., Danvers, MD) to a volume of approximately 40 µL and diluted with 1000 µL of buffer. This was repeated twice, and the retentate was diluted to a final volume of 100 µL. The inactivated enzyme was then incubated at 37 °C, and reactivation was monitored by removing aliquots (10 µL) at appropriate time intervals and assaying them as described above. Measured activities were corrected for decreases in activity due to denaturation over this time course using data for non-inactivated control samples. The reactivation rate constant, kreact, was determined by fitting the data to a first order rate equation, as described above.
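The determination of the inactivation parameters can be sketched as follows, assuming the standard saturation model kobs = ki[I]/(Ki + [I]) for a mechanism-based inactivator. The data are synthetic, the parameter values are hypothetical, and the double-reciprocal linear fit is used here only because it needs no external libraries; a direct non-linear fit would normally be preferred for real data.

```python
def k_obs(conc, k_inact, k_diss):
    """Pseudo-first-order inactivation rate constant at inactivator conc."""
    return k_inact * conc / (k_diss + conc)


def linear_fit(xs, ys):
    """Ordinary least squares; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_inactivation(concs, rates):
    """Fit 1/k_obs = (K_i/k_i)(1/[I]) + 1/k_i; returns (k_i, K_i)."""
    slope, intercept = linear_fit([1.0 / c for c in concs],
                                  [1.0 / r for r in rates])
    k_inact = 1.0 / intercept
    return k_inact, slope * k_inact


# Synthetic, noise-free data from hypothetical parameters.
TRUE_K_INACT, TRUE_K_DISS = 1.0e-2, 5.0e-6      # s^-1, M
concs = [1.0e-6, 2.0e-6, 5.0e-6, 1.0e-5, 2.0e-5]
rates = [k_obs(c, TRUE_K_INACT, TRUE_K_DISS) for c in concs]

fitted_k_inact, fitted_k_diss = fit_inactivation(concs, rates)
print(f"k_i = {fitted_k_inact:.3e} s^-1, K_i = {fitted_k_diss:.3e} M")
```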

Crystallization, Data Collection and Processing
All YicI crystals used in this study were grown using a protein concentration of 10  Tables 2 and 3. Crystals of the YicI:eq-5F XF complex were obtained by a short (3 min) soak of crystal form 1 in cryo-buffer containing 0.5 mM eq-5F XF.

Structure Phasing and Refinement
The structure was solved using both MAD and MIR phases with the COMBINE procedure in SOLVE (30). Briefly, MAD phases from the Ho derivative were combined with MIR phases obtained using the highly redundant La derivative as an anomalously scattering "native" dataset and the PCMB crystal soak as an isomorphous derivative. The mean figure of merit following this procedure was 0.39. Density modification using DM (29) improved the map sufficiently to assign a rough Cα trace for one monomer, which was then used to find the NCS operators. Phase extension to 2.2 Å with 6-fold averaging was performed using RESOLVE (30). The model output from RESOLVE contained approximately 85% of the structure, with matched sequence assignment for 60% of side chains. The remainder of the model was built manually using XFIT (31). Crystal form 2 was solved by molecular replacement with MOLREP (32), using the crystal form 1 hexamer as the search model. Refinement of all datasets was achieved using CNS (33) and REFMAC (34), with full statistics provided in Table 3.
Ligand topologies and restraints were generated using the PRODRG server (35).

Sequence Alignment Analysis
The hypothetical protein encoded by the yicI gene from E. coli shows sequence similarity of 50% or more with hypothetical proteins from a range of organisms as well as with known glycosyl hydrolase (GH) family 31 enzymes (Figure 1). It also shows 20-29% similarity with mammalian, plant and fungal α-glucosidases and plant α-xylosidases, and 20% similarity with α-1,4-glucan lyases (GLases) from family 31. The consensus sequence surrounding the catalytic aspartate nucleophile in plant, mammalian and fungal enzymes, including α-glucosidases and α-xylosidases, is WiDMnE, with a slight variation for GLases to WiDMnX (X = V or T). The bacterial proteins, however, have the sequence KTDFGE in this region, and this sequence is absolutely invariant among them. These clear distinctions between the amino acid sequences of family 31 enzymes of higher organisms and bacteria suggest that the bacterial proteins took a different path early in the evolution of GH family 31 (36,37).
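The distinction between the two nucleophile motifs can be expressed as a simple pattern search. In the sketch below the lower-case positions of WiDMnE are treated as "any residue", which is an assumption made for illustration (they denote less strictly conserved positions), and the test sequences are invented fragments, not real GH31 sequences.

```python
import re

# Assumption: lower-case consensus positions are matched by any residue.
# The final position allows E (hydrolases) or V/T (glucan lyases).
HIGHER_ORGANISM_MOTIF = re.compile(r"W[A-Z]DM[A-Z][EVT]")
BACTERIAL_MOTIF = re.compile(r"KTDFGE")


def classify_nucleophile_motif(sequence):
    """Report which nucleophile-region motif, if any, a sequence contains."""
    seq = sequence.upper()
    if BACTERIAL_MOTIF.search(seq):
        return "bacterial-type (KTDFGE)"
    if HIGHER_ORGANISM_MOTIF.search(seq):
        return "higher-organism-type (WiDMnE / WiDMnX)"
    return "no motif found"


# Invented fragments, for illustration only.
for fragment in ("GSAWIDMNEPAA", "LLKTDFGERV", "AAWLDMSTGG", "MKLVPQQ"):
    print(fragment, "->", classify_nucleophile_motif(fragment))
```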

Substrate Specificity
To investigate the enzymatic activity of the YicI protein (see methods for cloning, overexpression and purification), 12 different glycoside substrates were incubated with the protein in 0.05 M phosphate buffer, pH 7.0, at 37 °C. As expected from the high sequence similarity with the GH family 31 α-xylosidase from L. pentosus, the YicI protein rapidly hydrolyses p-nitrophenyl α-D-xylopyranoside (PNP Xyl) and α-D-xylopyranosyl fluoride ( XylF). Lower, but significant, activity was observed with p-nitrophenyl α-D-glucopyranoside, while neither the aryl α-glycosides of sugars such as galactose, mannose and arabinose nor any β-glucosides were hydrolyzed at all (Table 1). Similarly, neither maltose nor maltotriose was cleaved, though the recent paper of Okuyama et al. (22) showed that isoprimeverose (Xyl-α(1,6)-Glc) is an excellent substrate, consistent with the expected role of this enzyme in xyloglucan degradation. pentosus (38,39) When YicI was incubated with either PNP Xyl or PNP Glc and the products were checked by TLC, only the hydrolysis products, xylose and glucose, were detected, indicating the absence of lyase activity. As a result, the YicI protein has been designated as an α-xylosidase, and very similar conclusions were reached in the aforementioned work by Okuyama et al.

The Effects of Temperature, Metal Ions and pH on Activity
Hydrolysis of PNP Xyl was measured at various temperatures (data not shown), with activity increasing up to 50 °C and being rapidly lost at higher temperatures. YicI stored in 0.05 M phosphate buffer, pH 7.0, retains full activity for 48 hours at 37 °C and for several months at 4 °C. The effects of various metal ions, including Ca2+, Mg2+, Mn2+, Zn2+, Ni2+, Cu2+ and Co2+, on the activity of YicI were examined, and none of these affected the rate of hydrolysis of PNP Xyl. Measurement of the pH-dependent activity of YicI was performed using α-D-xylopyranosyl fluoride ( XylF) as a substrate. A classical bell-shaped dependence of kcat/Km upon pH was observed, as expected, indicating at least two essential ionizable groups in the free enzyme (Figure 2). The two apparent pKa values are pKa1 = 4.9 ± 0.2 and pKa2 = 7.9 ± 0.1, with an estimated pH optimum of pH 6.4.
These presumably correspond to the nucleophile, D416, and the acid/base catalyst, D482, respectively.
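The link between the two apparent pKa values and the pH optimum can be made explicit with the usual two-ionization model for a bell-shaped profile, kcat/Km = (kcat/Km)max / (1 + 10^(pKa1 - pH) + 10^(pH - pKa2)). This model is assumed here as the "appropriate curve"; the source does not state the exact equation used. For such a curve the maximum lies midway between the two pKa values.

```python
def bell_curve(ph, limit, pka1, pka2):
    """Two-ionization bell-shaped pH profile (assumed model)."""
    return limit / (1.0 + 10.0 ** (pka1 - ph) + 10.0 ** (ph - pka2))


def ph_optimum(pka1, pka2):
    """Analytical maximum of the bell curve: the mean of the two pKa values."""
    return (pka1 + pka2) / 2.0


PKA1, PKA2 = 4.9, 7.9          # apparent pKa values reported in the text

# Numerical check over the experimental pH range (4.0 to 8.5).
grid = [4.0 + n / 100.0 for n in range(451)]
best = max(grid, key=lambda ph: bell_curve(ph, 1.0, PKA1, PKA2))

print(f"analytical optimum: pH {ph_optimum(PKA1, PKA2):.1f}")
print(f"numerical optimum:  pH {best:.2f}")
print(f"fraction of limiting kcat/Km at optimum: "
      f"{bell_curve(best, 1.0, PKA1, PKA2):.3f}")
```

With pKa values of 4.9 and 7.9 the midpoint is pH 6.4, in agreement with the estimated optimum quoted above.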

The Inactivation of YicI with Mechanism-Based Inactivators, ax- and eq-5-Fluoro-α-D-Xylopyranosyl Fluorides
Two new mechanism-based inactivators, ax-5F XF and eq-5F XF, were designed, synthesized and tested (Figure 3). Compounds of this class are designed to react as substrates but to form a relatively stable 5-fluoroxylosyl-enzyme intermediate that hydrolyses only slowly; they therefore often behave as time-dependent inactivators. However, no time-dependent inactivation was observed when ax-5F XF was incubated with YicI at 37 °C and aliquots were removed for assay of activity. Instead, the activity measured was lower than that in the control reaction containing no inhibitor, and the decrease was concentration-dependent. The inhibition observed must be due to ax-5F XF carried over in the aliquot for assay, suggesting that ax-5F XF is acting as a tight-binding, reversible inhibitor. This behavior is reminiscent of that seen upon the reaction of 5-fluoro-glycosyl fluorides with α-glycosidases, where apparent tight binding was also observed (12,41-43) but was found to be due to the accumulation of a glycosyl-enzyme intermediate that turns over on a timescale shorter than the assay time. Thus, ax-5F XF was tested as an apparent competitive reversible inhibitor, and the (apparent) Ki' value was determined to be 9.8 µM. The equatorial epimer eq-5F XF behaved likewise, with a Ki' value of 0.45 µM revealing even tighter apparent binding.
To confirm that this rapid inactivation is due to the accumulation of the covalent intermediate, reaction mixtures of both samples were subjected to analysis by electrospray ionization mass spectrometry, which revealed that the mass of the protein (89,094 ± 8) increased by 157 (reaction with ax-5F XF) and 158 (reaction with eq-5F XF). This is in agreement, within error, with the expected difference for addition of a 5-fluoro-xylosyl unit (mass of 151) from ax-5F XF and eq-5F XF. The labeling of YicI by these two reagents indicates that the inhibition is indeed due to the accumulation of the covalent intermediate.
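The expected mass increment can be reproduced from the elemental composition of the label. The sketch below assumes that the 5-fluoro-xylosyl unit has the formula C5H8FO4 (a xylosyl residue, C5H9O4, with one hydrogen replaced by fluorine); this formula is an inference made for illustration and is not stated in the text. Standard average atomic masses are used, and the ±8 uncertainty is the one quoted for the protein mass.

```python
# Standard average atomic masses (Da).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "F": 18.998, "O": 15.999}

# Assumed composition of the 5-fluoro-xylosyl unit (illustrative inference).
FLUOROXYLOSYL = {"C": 5, "H": 8, "F": 1, "O": 4}


def formula_mass(composition):
    """Average mass of a species given its element counts."""
    return sum(ATOMIC_MASS[element] * count
               for element, count in composition.items())


def within_error(observed_shift, expected_shift, uncertainty):
    """True if the observed mass shift matches the expected one within error."""
    return abs(observed_shift - expected_shift) <= uncertainty


expected = formula_mass(FLUOROXYLOSYL)
print(f"expected label mass: {expected:.2f} Da")
for name, shift in (("ax", 157.0), ("eq", 158.0)):
    print(f"{name}: observed +{shift:.0f} Da, "
          f"consistent within +/-8 Da: {within_error(shift, expected, 8.0)}")
```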
Time-dependent inactivation was indeed measured when reactions were carried out at a lower temperature, where turnover was slowed (Figure 4). Even though the inactivation was not complete at low concentrations of inactivators, the initial phase of inactivation followed pseudo-first order kinetics in both cases. Inactivation kinetics were fitted to the simple model: The apparent first order rate constants, kobs, at each concentration of each inactivator were for ax-5F XF and 3.2 (± 0.1) × 10⁻⁴ s⁻¹ for eq-5F XF at 37 °C, proving that inactivation occurred through the normal enzymatic reaction mechanism.
Rates of enzyme-catalysed hydrolysis of each 5F XF substrate at 37 °C were measured using a fluoride electrode at a series of substrate concentrations. In each case the enzymatic reaction continued at a constant rate until almost all the substrate had been consumed, indicating very low Km values for each compound. These observations are reminiscent of what was seen during the reaction of α-glycosidases with 5-fluoro-α-glycosyl fluorides (12,41-43) and are consistent with the low Ki' values measured, since the Km value of each compound as substrate should be equal to its Ki' value as inhibitor. The turnover number, kcat, was determined from the slope of these plots of fluoride release versus time, yielding kcat = 2.16 (± 0.02) × 10⁻³ s⁻¹ for ax-5F XF and 1.22 (± 0.08) × 10⁻⁴ s⁻¹ for eq-5F XF. The similarity of these values to the reactivation rate constants confirms that the same process is being monitored in the two cases.
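To put these rate constants on a more intuitive scale, they can be converted to half-lives using t1/2 = ln 2 / k. The sketch below assumes that turnover of the covalent intermediate is rate-limiting, so that kcat approximates the first-order decay constant of the intermediate; this is consistent with the comparison to the reactivation rate constants but is stated here as an assumption.

```python
import math


def half_life(rate_constant):
    """Half-life (s) of a first-order process with the given rate constant."""
    return math.log(2.0) / rate_constant


# kcat values reported in the text (s^-1).
K_CAT = {"ax-5F XF": 2.16e-3, "eq-5F XF": 1.22e-4}

for compound, k in K_CAT.items():
    t_half = half_life(k)
    print(f"{compound}: t1/2 = {t_half:.0f} s ({t_half / 60.0:.1f} min)")
```

On this basis the intermediate formed from the axial epimer would persist for minutes and that from the equatorial epimer for more than an hour at 37 °C, which is in line with the need to lower the temperature to observe time-dependent inactivation.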

Overall architecture and oligomerization of YicI
Crystallization trials yielded two different crystal forms of YicI, each possessing six molecules of YicI per asymmetric unit, related by point group 32 non-crystallographic symmetry. The hexamer is formed from two trimeric structures stacked slightly out of register, as is seen from the two views in Figure 5. This observation of a hexameric YicI is in agreement with results from native PAGE and dynamic light scattering analysis (data not shown), and is consistent with the results of Okuyama et al. based on gel filtration chromatography (22). Structure refinement statistics are listed in Table 3. The rms deviation between the form 1 and form 2 hexamers is 0.46 Å for 18,552 common backbone atoms (0.29 Å between both form 1 structures), as calculated using the "magic fit" and "refine fit" procedures in Swiss-PdbViewer (44). Using an identical procedure, the rms deviations between chain A of form 2 and the other chains of the hexamer range between 0.2 and 0.28 Å for backbone atoms. These values are consistent with YicI adopting a highly similar conformation in all the protomers in this study, the only significant exception being a partial disorder of some active site loops (induced by crystal packing) in monomer D of crystal form 1. The total buried surface area per monomer upon oligomerization was calculated to be 3913 Å² using CNS (33), a high value in keeping with a physiologically relevant oligomer. Remarkably, although YicI contains 13 Cys residues per monomer, the structure has no disulphide bonds (although a distance of 3.8 Å between the SG atoms of C343 and C412 suggests that one may be possible in structurally related homologues).
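For reference, the rms deviation quoted above is the square root of the mean squared distance between equivalent atoms. The sketch below computes it for two coordinate sets that are assumed to be already superposed and listed in the same atom order; it does not perform the fitting step carried out by the structure viewer, and the coordinates are invented.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered coordinate sets.

    The result has the same length unit as the input coordinates.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(coords_a, coords_b):
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(coords_a))


# Invented coordinates (in Å): the second set is shifted by 0.3 Å along x.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
set_b = [(x + 0.3, y, z) for x, y, z in set_a]
print(f"rmsd = {rmsd(set_a, set_b):.2f} Å")
```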
The structure of a YicI monomer can be divided into five distinct domains: the N-   (46)) and chondroitin AC lyase (PDB 1rw9, (47)). These structurally similar domains have been proposed to play a role in carbohydrate binding (48). An interesting feature of these homologous domains is that they are placed at different points in the protein sequence, preceding the catalytic domain in glucoamylase, and following it in AC lyase. This domain is also found in other members of GH13 and, like YicI, can be seen to associate with the active site of an opposing monomer when present in a multimeric structure (49). Any role of domain N in substrate binding may come from direct effects of loop residues contributing to the various active sites of the hexamer, or more indirect structural effects due to its close interaction with regions of the catalytic domain.

The Catalytic Domain
The catalytic domain is composed of a β8α8 barrel, very similar to that of the GH13 family members, with an additional mixed α + β domain inserted between β3 and α3. The secondary structure of the catalytic domain, and a sequence alignment with selected GH31 family members, is provided in Figure 1. CAZY clans GH-A, D, H and K have also been shown to contain β8α8 catalytic domains (1), all of which place the active site nucleophile on β4, and YicI appears to be consistent with this (see below). The β8α8 barrel is typical of this fold, although α5 is replaced by a loop that contacts domain N, as in some members of GH13 (50). Helix α8 continues into another α-helix, with the conserved residue P570 forming a ~60° kink between the two helices. In keeping with the YicI substrate specificity for α-glycosidic bonds, the catalytic domain structure matches best with GH13 α-amylase II (PDB 1bvz, (49)), with a DALI score of 17.3 for a 253 amino acid overlap (12% identity).
Both YicI and α-amylase II possess an inserted domain between β3 and α3 (residues 349-387 in YicI), with the β-sheets of both inserts overlaying well but showing structural differences in the loop regions. This domain is known to vary with substrate specificity (51). As well as contributing residues to the active site of YicI, this insert is also involved in monomer:monomer packing and forms part of the central "pore" of the hexamer.
Conserved sequence regions I to VII of GH31 family members (52) are shown mapped to the YicI structure in Figure 1; one of these, region II, is in the β3-α3 insert. Region I and regions III-VI are located in the main β8α8 fold and contribute some of the active site residues. Region VII is located on the α8/continuing helix interface and is involved in packing of the catalytic domain against the C-terminal domains. identity, (53)) and 5.3 with α-amylase II (PDB 1bvz, 62 AA overlap, 10% identity, (49)).

The C-terminal Domains
This domain plays a role in forming the YicI hexamer, associating with the same domain from the opposing monomer in the dimeric ring. The homologous domain from α-amylase II is not responsible for dimer formation, and is solvent-exposed.
The distal C-terminal domain is located at the points of the triangle-shaped hexamer, and makes its major interaction with the rest of the structure at the β8α8 conserved region VII, as noted above. This domain plays no role in hexamer formation, and may be responsible for binding carbohydrates, given its similarity to carbohydrate binding domains. The fold consists of an antiparallel β-sandwich comprising 2 sheets of 5 β-strands each, with strands 2 and 6 running parallel to each other. The highest DALI scores obtained were 4.8 for an -with all ten β-strands matching, and also shows 2 parallel strands in the same location as those of YicI. Carbohydrate binding domains often bind divalent cations for structural stability, and the location of the Ho binding site in this domain in our heavy atom soaking trials may be indicative of the ancestry of the distal C-terminal domain, although it appears that only water is bound in this pocket in the native electron density maps.

Structure of the Active Site
As is typical of a 8  Indeed, the proposed nucleophile in YicI is residue D416, located on β4. Although the catalytic domain of YicI matches well with those from GH13, and the reaction catalysed is similar (α-retaining), the acid/base in YicI is residue D482 on β-strand 6, whereas for GH13 members the acid/base is a glutamic acid and is located on β5. This placement of the acid/base on β6 is more similar to that of enzymes from GH18 (chitinases), which catalyse a β-retaining reaction. However, the important functional characteristic is the distance between the two catalytic carboxylic acid residue oxygens: ~6 Å in YicI and ~6 Å in GH13 (PDB 1UH2, (48)). The longer distance of at least 7 Å in GH18 (PDB 1CTN, Interestingly, approximately half of these active site hydrophobic residues originate from elements outside the β8α8 fold, a fact not immediately obvious when looking at sequence alignments containing the seven classical GH31 motifs (52). This may have substantial consequences for specificity within homologues of different oligomeric structure, such as human sucrase-isomaltase, which is predicted to be a dimer (59).

Binding of eq-5F XF
Soaking of the form 1 crystals in 0.5 mM eq-5F XF for 3 min resulted in clear ring- The presence of the bulky F277 near C5 of the adduct may provide an explanation for the very low α-glucosidase activity of YicI. This phenylalanine residue is situated at the end of 1, and appears to be present in an insert that is not found in the GH31 family members that possess α-glucosidase activity (Figure 1). This insertion may sterically crowd this region of the active site, preventing the extra CH2OH group of glucose (cf. xylose) from binding. Other residues contributing to the overall shape of the sugar-binding site include W345, which may assist in distortion of the substrate, and F515, which is positioned over the hydroxyl groups O3 and O4. An active site tryptophan is commonly observed in glycosidases and has been postulated to serve as a platform for substrate distortion (65).
Of particular interest is the orientation of the catalytic nucleophile in this structure. In the other α-glycosidases for which the structure of a glycosyl-enzyme intermediate has been determined, the carbonyl oxygen of the nucleophile is located in close proximity to the sugar ring O5. This led to the suggestion that O:O interactions may play an important catalytic role, via either ground state destabilization or transition state stabilization. In YicI the nucleophile is twisted substantially away from this region, with the carbonyl oxygen located in close proximity to the sugar H2. While this may not be catalytically relevant for YicI, it does suggest that, in the structurally related α-glucan lyases, the C2 proton may well be abstracted by the departing carboxylate, as had been previously suggested (17).
An additional interesting interaction network between the substrate and the enzyme is that involving the sugar 2-hydroxyl, which interacts with E419 via R466. Previous kinetic studies on SPGase and its mutants (13) had shown that mutation of the equivalent conserved Glu (E484 in SPGase and E419 of YicI) eliminated its ability to hydrolyse maltose, but had lesser effects on rates of hydration of D-glucal, a substrate that has no 2-hydroxyl. The structure of YicI is therefore completely consistent with this finding, and suggests that E419 orients R466 for optimal interaction and transition state stabilization.
Interestingly, the glucan lyases, which utilize an elimination mechanism rather than hydrolysis for enzymic deglycosylation (17), have Val or Thr at the corresponding position. This implies that the E419-mediated hydrogen bonding network may play a more important role in effecting deglycosylation through hydrolysis than through elimination.

Binding of Tris and ordered water at the active site: implications for possible multivalent sugar binding
Following molecular replacement of form 2 using form 1, it was apparent that there was some unexplained electron density at the active site. This was accounted for by modeling in two molecules of Tris per active site (denoted TRS1 and 2; Figure 7b), Tris being a well-known glycosidase inhibitor and a common ligand in many glycosyl hydrolase structures. Indeed, Tris has been shown to inhibit the sucrase-isomaltase of GH31 (66).

Importance of the YicI Structure in Relation to the Human GH31 Family Members
The YicI structure provides a strong foundation for the modeling of the catalytic β8α8 domain of the human GH31 enzymes (lysosomal α-glucosidase and sucrase-isomaltase) (Figure 1). Intriguingly, several clinically relevant isolates of the lysosomal α-glucosidase encode for mutations that lie directly within the 8

Labeling and Proteolysis for Electrospray Mass Spectrometry
A stock solution of the enzyme (20 µL, 8.8 mg/ml) was incubated with ax- or eq-5F XylF (20 µL, 10 mM) at 37 °C for 30 min. The sample was diluted with 0.05 M phosphate buffer (pH 2.0, 90 µL) and incubated with pepsin (15 µL, 1 mg/ml) for 15 min at room temperature. The sample was then rapidly frozen and analyzed immediately upon thawing. A control sample was prepared according to the same procedure, except that no inactivator was added.

Mass Spectrometry
Mass spectra were recorded using an ABI MDS-SCIEX API QSTAR Pulsar i mass spectrometer (Sciex, Thornhill, ON). Peptides were separated on a reverse phase C18 column using an Ultimate Capillary HPLC system (LC Packings, Amsterdam, Netherlands) interfaced with the mass spectrometer. A post-column splitter was used in all experiments, splitting off 85% of the sample into a fraction collector and sending 15% into the mass spectrometer. Spectra were obtained in either the single-quadrupole scan mode (LC/MS) or the tandem MS daughter ion scan mode (MS/MS).
In LC/MS experiments, proteolytic digests of the protein were loaded onto a C18 column (300 µm × 150 mm) and eluted with a gradient of 2 to 40% eluting solvent (0.1% formic acid and 85% acetonitrile in water). The mass analyzer was scanned over a mass-to-charge ratio range of 300-2400 amu, with a step size of 0.1 amu and a scan time of 1 second. The ion source potential was set at 5 kV; the orifice energy was 50 V. After the LC/MS experiment, total ion chromatograms of the labeled and unlabeled enzyme digests were compared to find the fraction containing the labeled peptide fragments. Samples of the labeled peptide were collected from the post-column flow splitter and lyophilized.
The concentrated sample was then sequenced via tandem MS fragmentation analysis.
To determine amino acid sequences of peptides derived from YicI, the mass spectrometer was operated in an IDA (information dependent acquisition) MS/MS mode, where the precursor ion is selected "on the fly" from the previous scan. The m/z ratio (902 in this case) of the ion that had been selected for fragmentation was placed in a list. The peptides including m/z 902 (doubly charged), previously fractionated by HPLC, were introduced into the mass spectrometer via a nanospray ion source (Protana, Staermosegaardvej, Denmark). Following mass selection in the first quadrupole (Q1), the peptide of interest was fragmented by collision with nitrogen gas in the second quadrupole (Q2), and the resulting product ions were analyzed in the TOF mass analyzer. The following settings were used: TOF scan range of m/z 100-1820 amu, step size of 0.1 amu, scan time of 1 second, Q2 potential of -42 V and source voltage of 1000 V.

Experimental Identification of the Catalytic Nucleophile of YicI
The accumulation of the covalent glycosyl-enzyme intermediate provided an opportunity to directly identify the amino acid labeled. Fully inactivated enzyme was prepared with both inactivators, separately, along with a control sample containing no inactivator. These were subjected to peptic digestion, followed by LC/ESI MS comparative mapping. A comparison of the masses of all the peptides in the inactivated and control samples revealed that the only significant difference was a peptide fragment corresponding to m/z 601 (triply charged) and 902 (doubly charged) which was detected in the two inactivated samples while no such peptide was detected in the control sample (Supp. Figure I). If these are the labeled peptides of interest, then a peptide of mass ~550 (triply charged) or ~827 (doubly charged) might be expected in the unlabeled sample, this being the mass difference between the peptide of mass 1011 and the 5-fluoro-xylosyl label of mass 51 (triply charged) or 75 (doubly charged). Unfortunately, no such peptide was observed, possibly indicating that the unlabeled peptides are susceptible to further peptic digestion.
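The arithmetic behind the expected m/z values of the unlabeled peptide can be sketched as follows. A label of mass 151 shifts the m/z of an ion of charge z by 151/z, so the unlabeled counterpart is expected at the labeled m/z minus that amount. The neutral-mass calculation assumes that each charge is carried by a proton (1.00728 Da), which is the usual case for positive-ion electrospray but is an assumption here.

```python
PROTON_MASS = 1.007276   # Da
LABEL_MASS = 151.0       # Da, 5-fluoro-xylosyl unit as quoted in the text


def expected_unlabeled_mz(labeled_mz, charge, label_mass=LABEL_MASS):
    """m/z expected for the same peptide, same charge, without the label."""
    return labeled_mz - label_mass / charge


def neutral_mass(mz, charge):
    """Neutral mass of an ion observed at mz with the given number of protons."""
    return charge * (mz - PROTON_MASS)


for mz, charge in ((601.0, 3), (902.0, 2)):
    print(f"z = {charge}: labeled m/z {mz:.0f}, "
          f"label shift {LABEL_MASS / charge:.1f}, "
          f"expected unlabeled m/z {expected_unlabeled_mz(mz, charge):.1f}, "
          f"neutral mass {neutral_mass(mz, charge):.0f}")
```

The two charge states give neutral masses within about 2 Da of each other, as expected for a single labeled peptide given that the m/z values are quoted to the nearest integer.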
Differences in proteolytic cleavage as a consequence of the presence of a sugar residue are not rare (1,2). Thus, the labeled fragment was isolated from both inactivated samples by HPLC and sequenced by ESI tandem mass spectrometry (Supp. Figure II