Structure and Mechanism of Helicobacter pylori Fucosyltransferase

Helicobacter pylori α1,3-fucosyltransferase (FucT) is involved in catalysis to produce the Lewis x trisaccharide, the major component of the bacteria's lipopolysaccharides, which has been suggested to mimic the surface sugars in gastric epithelium to escape host immune surveillance. We report here three x-ray crystal structures of FucT, including the FucT·GDP-fucose and FucT·GDP complexes. The protein structure is typical of the glycosyltransferase-B family despite little sequence homology. We identified a number of catalytically important residues, including Glu-95, which serves as the general base, and Glu-249, which stabilizes the developing oxonium ion during catalysis. The residues Arg-195, Tyr-246, Glu-249, and Lys-250 serve to interact with the donor substrate, GDP-fucose. Variations in the protein and ligand conformations, as well as a possible FucT dimer, were also observed. We propose a catalytic mechanism and a model of polysaccharide binding not only to explain the observed variations in H. pylori lipopolysaccharides, but also to facilitate the development of potent inhibitors.

Helicobacter pylori is a serious human pathogen that causes gastritis and both peptic and duodenal ulcers (1,2). The pathogen is associated with an increased risk for development of both gastric adenocarcinoma (3) and mucosa-associated lymphoma (4). H. pylori is highly adapted to colonize in human gastric mucosa, where it may remain for decades or even a lifetime. The specific infection in the stomach is caused by the attachment of adhesion proteins, such as BabA and SabA, to specific glycoconjugates on the gastric epithelial cell surface (5,6).
Although H. pylori elicits local as well as systemic antibody response (7), location at such a specific niche permits it to escape elimination by the host immune response.
The lipopolysaccharides (LPSs) of H. pylori contain fucosylated oligosaccharides, predominantly the type II blood group antigens Lewis x and Lewis y (8) in addition to the minor type I Lewis a and Lewis b (9). H. pylori fucosyltransferases (FucTs) are responsible for the last steps in the biosynthesis of Lewis antigens. The molecular mimicry of host cell surface antigens has been suggested to mask the pathogen from host immune surveillance and thus plays an important role in colonization and long-term infection in the stomach (10). Furthermore, H. pylori continuously alters the expression of Lewis antigens, a process known as phase variation, to generate several LPS variants in one bacterial population and to display structural heterogeneity. The dynamic variation in cell surface antigens is thus a survival advantage for the bacteria and an essential feature to interact with host cells (11).
In H. pylori genomes, there exist two homologous α1,3/4-FucT genes, futA and futB, and one gene, futC, for α1,2-FucT (12). These three fut genes do not always encode functional proteins. For instance, futA, but not futB, encodes an active α1,3/4-FucT in H. pylori strains NCTC11639 and UA948 (13,14). The futC gene is functional in strain NCTC11639 (12) but not in strain UA948 (14). The on/off status of fut genes and the various levels of FucT activities present in different H. pylori strains determine the Lewis antigen expression patterns of H. pylori LPS. FutA and FutB contain a variable number of DD(or N)LRV(or I)NY tandem repeats in the C terminus.
FucTs are inverting glycosyltransferases (GTs), i.e. the enzymes catalyze the fucosyl transfer from the donor GDP-β-L-fucose to form an α-glycosidic linkage. On the basis of the CAZy classification, FucTs are categorized into family GT10. Furthermore, FucTs also fall into three sub-families, namely α1,2-, α1,3/4-, and α1,6-FucTs, according to the position of the glycosidic bond each enzyme forms. H. pylori α1,3/4-FucTs are composed of an N-terminal catalytic domain, 2-10 heptad repeats, and a C-terminal tail rich in both basic and hydrophobic residues. The heptad repeats, previously proposed to form a leucine zipper, are essential for homodimerization (15). The positively charged yet hydrophobic tail is required for association with the cell membrane. In contrast, mammalian α1,3/4-FucTs have the typical structure of type II transmembrane proteins, consisting of a short N-terminal cytoplasmic tail, a transmembrane domain, and a stem region followed by a large C-terminal catalytic domain. The primary sequences of H. pylori and mammalian α1,3/4-FucTs share homology mainly in the nucleotide-binding site. For instance, comparison of the H. pylori FucT (NCTC11639) polypeptide sequence with mammalian FucTs, including human FucT III to VII, bovine FucT III, and mouse FucT IV, reveals sequence similarity (40-45% identity) within a stretch of 69 amino acid residues (corresponding to Leu-229 to Leu-297 in H. pylori α1,3-FucT). Therefore, the overall sequence difference makes the H. pylori enzyme an attractive target for therapeutic intervention. Despite the biological importance of human and H. pylori FucTs, further understanding is impeded by the lack of an x-ray structure.
Recently we and others demonstrated that systematic deletion of the C terminus of H. pylori α1,3-FucT greatly improved the marginal solubility of the full-length protein (15,16). Up to 80 residues, including the tail containing hydrophobic and positively charged residues (sequence 434-478) and five of the ten heptad repeats (sequence 399-433), can be removed without significant change in structure and catalysis. Several biophysical studies indicate that half of the heptad repeats are essential for the secondary and native quaternary structures (15). We herein present for the first time x-ray crystallographic structures of FucT. The results may suggest a reaction mechanism of FucT and provide a basis for both LPS variation and the design of inhibitors.
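The residue bookkeeping behind these truncations can be checked with a few lines of arithmetic. The sketch below is illustrative only; it assumes a full-length protein of 478 residues, which is inferred from the quoted tail range (434-478) rather than stated outright, and the helper names are hypothetical.

```python
# Residue arithmetic for the C-terminal truncations discussed in the text.
# FULL_LENGTH is an assumption inferred from the tail range 434-478.
FULL_LENGTH = 478


def span(first, last):
    """Number of residues in an inclusive residue range."""
    return last - first + 1


def last_residue_after_truncation(n_removed, full_length=FULL_LENGTH):
    """Index of the final residue once n_removed C-terminal residues are deleted."""
    return full_length - n_removed


tail = span(434, 478)             # hydrophobic, positively charged tail
heptads_removed = span(399, 433)  # five heptad repeats of seven residues each

print("tail residues:", tail)
print("heptad residues removed:", heptads_removed)
print("total removable:", tail + heptads_removed)
print("115-residue truncation ends at:", last_residue_after_truncation(115))
print("45-residue truncation ends at:", last_residue_after_truncation(45))
```

Under that assumption the two ranges add up to the 80 residues quoted above, and a 45-residue truncation removes exactly the tail while leaving all heptad repeats in place.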

EXPERIMENTAL PROCEDURES
Protein Expression and Purification of the Truncated, Selenomethionine-labeled, and Mutant FucTs-The α1,3-FucT from H. pylori strain NCTC11639 was prepared according to the published procedure (15). Protein crystallization was successful with deletion of the C-terminal 115 residues. Approximately 160 mg was obtained for crystallization studies, with a purity of >95% as determined by SDS-PAGE. The C-terminal truncated enzyme has only four methionine residues, and their full substitution with selenomethionine was not sufficient for phase angle calculations. We thus mutated Leu-202, which is located in the middle of a long α-helix based on the secondary structure prediction, to methionine to increase the anomalous difference. The selenomethionine-labeled L202M FucT was produced in Escherichia coli by using a non-auxotrophic protocol and purified in a similar manner. The molecular weight was confirmed by electrospray ionization-mass spectrometry. Enzyme activity was measured by coupling GDP formation to the pyruvate kinase/lactate dehydrogenase assay and monitoring NADH consumption by fluorescence (excitation at 340 nm, emission at 460 nm) (15).
Crystallization, Data Collection, and Structure Determination-Purified enzyme was concentrated to 18-20 mg/ml and crystallized at room temperature by the hanging drop vapor diffusion method. Orthorhombic crystals for FucT, selenomethionine-labeled, and mutant enzymes were grown by using equal volumes of the protein solution and the reservoir that contained 4-5% polyethylene glycol 3350, 0.2-0.4 M ammonium sulfate, and 100 mM Bis-Tris (pH 5.5). The substrate-bound FucT complexes were obtained by soaking the apo crystals prior to data collection with a 2-10 mM ligand solution in the crystallization buffer as detailed in Table 1. Before data collection, all crystals were briefly soaked in a crystallization buffer containing 25-30% (v/v) ethylene glycol and were flash frozen in a stream of cold nitrogen gas. The diffraction data were processed and scaled by using HKL/HKL2000 packages (17). Statistics are shown in Table 1. The crystal belongs to the orthorhombic space group P2₁2₁2, with unit cell dimensions a = 104 Å, b = 136 Å, and c = 96 Å, in which an asymmetric unit comprises three FucT molecules.
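As a plausibility check on three molecules per asymmetric unit, the Matthews coefficient can be estimated from the quoted cell dimensions. This is a rough sketch, not the authors' calculation: the protomer mass is an assumption (363 residues, i.e. an inferred 478-residue protein minus 115, at an assumed average of 110 Da per residue), and the 1.23/V_M solvent rule is the usual approximation for proteins.

```python
# Rough Matthews coefficient estimate for the reported orthorhombic cell.
# Assumptions (not from the paper): 110 Da average residue mass and a
# 363-residue construct; both are labeled here because they drive the result.


def matthews_coefficient(a, b, c, z, n_per_asu, mass_da):
    """Return V_M in cubic angstroms per dalton for an orthogonal cell."""
    cell_volume = a * b * c  # valid only when all cell angles are 90 degrees
    return cell_volume / (z * n_per_asu * mass_da)


def solvent_fraction(v_m):
    """Approximate solvent fraction from the common 1.23 / V_M rule."""
    return 1.0 - 1.23 / v_m


A, B, C = 104.0, 136.0, 96.0  # unit cell edges in angstroms (from the text)
Z = 4                          # general positions in space group P2(1)2(1)2
N_ASU = 3                      # FucT molecules per asymmetric unit (from the text)
MASS = 363 * 110.0             # assumed protomer mass in daltons

v_m = matthews_coefficient(A, B, C, Z, N_ASU, MASS)
print(f"V_M = {v_m:.2f} A^3/Da, solvent fraction ~ {solvent_fraction(v_m):.0%}")
```

With these assumed inputs the estimate falls within the range commonly observed for protein crystals, which is consistent with a trimer in the asymmetric unit; a different mass estimate would shift the value accordingly.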
Using the multiple anomalous dispersion data sets of L202M at 2.5-Å resolution, the selenium atoms were located, and initial phase angles were calculated by using the program SOLVE (18). Different trials with various parameters in RESOLVE (19,20) allowed automatic tracings by computer for 60-70% of the amino acid sequence. Statistics for the multiple anomalous dispersion phasing are shown in supplemental Table S1. Because there are three FucT molecules in an asymmetric unit, the completeness of the protein model was improved by cross-referencing of the auto-built polypeptide fragments. The electron densities were also sufficiently clear to allow manual fitting of most missing residues. Using the program O (21), a protomer containing residues 1-348 was thus constructed and placed in all three positions. The model yielded an initial R-value of 0.45 at 2.5-Å resolution. Further refinement employed the native data set and the program CNS (22), in which 5% of the reflections were set aside for Rfree calculation (23). With strong non-crystallographic symmetry restraints, the model gave R and Rfree of 0.30 and 0.32 at 1.9-Å resolution.
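The R and Rfree statistics quoted above follow the conventional definition, R = Σ||Fo| − |Fc|| / Σ|Fo|, evaluated over the working reflections and over the 5% test set, respectively. The sketch below illustrates only that bookkeeping; it is not how CNS is invoked, it omits the scale factor between observed and calculated amplitudes, and the amplitudes are synthetic numbers generated for illustration.

```python
import random


def r_factor(f_obs, f_calc):
    """Conventional R = sum(| |Fo| - |Fc| |) / sum(|Fo|), without scaling."""
    num = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    den = sum(abs(o) for o in f_obs)
    return num / den


def split_free_set(n_reflections, fraction=0.05, seed=0):
    """Randomly flag a fraction of reflection indices as the test (free) set."""
    rng = random.Random(seed)
    n_free = max(1, round(n_reflections * fraction))
    free = set(rng.sample(range(n_reflections), n_free))
    work = [i for i in range(n_reflections) if i not in free]
    return work, sorted(free)


# Synthetic amplitudes for illustration only (not data from the paper).
rng = random.Random(1)
f_obs = [rng.uniform(10.0, 100.0) for _ in range(2000)]
f_calc = [f * rng.uniform(0.7, 1.3) for f in f_obs]

work, free = split_free_set(len(f_obs))
r_work = r_factor([f_obs[i] for i in work], [f_calc[i] for i in work])
r_free = r_factor([f_obs[i] for i in free], [f_calc[i] for i in free])
print(f"R_work = {r_work:.3f}, R_free = {r_free:.3f} ({len(free)} free reflections)")
```

In an actual refinement the test-set reflections are excluded from the target function, so a gap between R and Rfree indicates overfitting; here both sets share the same synthetic error model, so the two values simply track each other.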
The electron density map revealed some regions with significant deviations among the three monomers (A, B, and C), which were then excluded from the non-crystallographic symmetry restraints in subsequent refinement. Helix α4 did not have strong densities in monomer A and was not seen in monomer B. Water molecules and sulfate ions were included at later stages. The FucT crystals in complex with GDP and GDP-fucose were isomorphous to the apo crystal. Direct use of a previous mid-stage model in rigid-body refinement yielded R values of 0.274 and 0.276, respectively, for the FucT·GDP and FucT·GDP-fucose crystals. The initial difference Fourier maps clearly indicated that the pyrophosphate group of GDP replaced the bound sulfate ion in the active site. The sugar moiety of the bound GDP-fucose was visible in all three monomers, although the densities were weaker for monomer A. The FucT·GDP-fucose crystal also allowed slight extension of the C termini, but the amino acids beyond 352 were still invisible.
Synthesis of Enzyme Inhibitors-Preparation of compounds 1 and 2 (see Fig. 5C) followed the procedure reported by Wong and his coworkers (24). Compounds 1 and 2 were individually coupled with 80 different acids (supplemental Fig. S1) to generate the amide derivatives in microplates in the presence of (1H-benzotriazole-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate (1 eq) and diisopropyl ethylamine (2 eq) in dimethylformamide. Without purification, the mixtures were subjected to the enzyme activity assay. Compound 3 was synthesized by 1,3-dipolar cyclization (known as "click chemistry") of propargyl-GDP (the alkyne) and 5-azidopentanoic acid (biphenyl-4-ylmethyl)amide (the azide) in the presence of CuSO4 and copper wire at room temperature. The solvent mixture was composed of H2O, EtOH, and t-BuOH (3:2:5) (25). After purification, the resulting product gave 1H and 13C NMR spectra that were consistent with the reported data. These prepared compounds were studied for inhibition in accordance with the aforementioned procedure (15).

Structure of FucT and Its Complexes with GDP/GDP-Fucose-FucT crystallization was only successful upon deletion of the C-terminal 115 residues; this truncated form retained 20% of the native enzyme activity (15). The refined FucT structures all contain three protein monomers (denoted A, B, and C) in the asymmetric unit, starting at Met-1 and terminating at residues 348-352. Each monomer is composed of two similar domains, both having the parallel α/β topology of the Rossmann fold (Fig. 1A). The N- and the C-terminal domains encompass residues 20-150 and 160-320, respectively. The first helix α1 (residues 2-13) interacts with the C-terminal domain, whereas the last helix α12 (residues 328-340) interacts with the N-terminal domain. Refinement statistics for the crystals of FucT and two complexes (one with GDP and the other with GDP-fucose) are shown in Table 1.
In the apo FucT crystal, there are three bound sulfate ions, two of which are at identical sites in monomers B and C, and adjacent to the side chain of Arg-195, an essential residue for enzymatic activity (see below). The third sulfate is bound to a distal loop of monomer C. Despite sulfate ions existing in the mother liquor, when the apo crystals were soaked with GDP, the sulfate was readily replaced by GDP. The crystalline FucT retained enzymatic activity to hydrolyze the donor substrate, GDP-fucose, in the absence of acceptor. Soaking the FucT crystal with GDP-fucose left only the product GDP in the active site, unless the soaking time was kept to a minimum.
Interactions between the enzyme and the diphosphate are more precisely defined in the GDP-fucose complex of FucT (Fig. 1B) than those for the enzyme and sulfate ion. Both side chains of Arg-195 and Lys-250 provide the neutralizing charges. Among the 18 well defined direct hydrogen bonds (H-bonds) with GDP-fucose (Table 2), there are seven H-bonds between FucT and the nucleoside moiety of GDP, whereas the fucosyl group forms five specific H-bonds with the enzyme. Thus, FucT recognizes its substrate GDP-fucose with very high specificity. GDP alone is bound to the enzyme with a similar repertoire involving 12 direct H-bonds (Table 2), and the diphosphate group is rotated outward (Fig. 1C). In the FucT·GDP-fucose complex structure, one of the two H-bonds of Arg-195 is formed with the α-phosphate and another with the β-phosphate (Fig. 1B). In contrast, both H-bonds from Arg-195 are redirected to the β-phosphate in the FucT·GDP complex. Thus, after the fucose transfer, an additional positive charge is arranged to neutralize the leaving group.
Comparison with Other FucTs-The GDP-fucose binding sites of H. pylori and human α1,3-FucTs have comparatively higher sequence similarity than other regions (Fig. 1D). This region, including the invariant residues Asn-240, Tyr-246, Glu-249, and Lys-250, provides a molecular basis accounting for the same donor substrate specificity. These residues were studied by site-directed mutagenesis and kinetic analysis to assess their roles in enzyme function (see below).

TABLE 3 Deviations between FucT and other GT-B models
BGT stands for phage β-glucosyltransferase, MurG denotes E. coli N-acetylglucosaminyltransferase, and GtfD represents TDP-vancosaminyltransferase.
Model (PDB code) | N-terminal domain | C-terminal domain | Domain rotation angle

... about a uniform axis, yet the overall conformation of FucT is apparently more open than all other models. In general, the C-terminal domains are more conserved than the N-terminal domains (Table 3 and Fig. 2). In particular, the C-terminal domains of FucT and BGT have 130 matched pairs of Cα atoms, including 17 identical residues. There are also 10 identities in the N-terminal domain. These correspond to <10% of the entire sequence. The other two models of MurG and GtfD have more variations.
Catalytic Mechanism-BGT, being a member of the glycosyltransferase-B family, uses Asp-100 as the general base. Interestingly, this residue is equivalent to Glu-95 of FucT. Glu-95 is located in the N-terminal domain that presumably associates with the acceptor substrate. The side chain of Glu-95 is positioned immediately adjacent to the anomeric carbon of fucose in the GDP-fucose complex (Fig. 3), which supports its suggested role as a general base in catalysis.
A model of the acceptor N-acetyllactosamine (LacNAc) can be constructed based on the locations of the catalytic base and the donor substrate, as well as the shape and orientation of the active site cleft. The shape of LacNAc is complementary to the cleft (Fig. 3A). The 3-OH of N-acetylglucosamine (GlcNAc) is close to the anomeric carbon of the fucose residue, as well as the carboxylate of Glu-95 (Fig. 3B). The methyl group of the N-acetyl group, trans to the C2 of GlcNAc, also makes hydrophobic contact with the side chain of Leu-124. On the distal side, other OH groups may form additional H-bonds with the side chains of the first α-β-α motif of the N-terminal domain, such as Trp-33, Trp-34, and Glu-41 (Fig. 3B).
The structures of the substrate- and product-bound forms of FucT allow us to suggest a catalytic mechanism that is similar to that of human FucT (38). LacNAc is bound to the pocket in the N-terminal domain, which has a highly negative electrostatic potential (Fig. 3A) that reduces the pKa of the C3-OH group of GlcNAc to favor nucleophilic attack. Upon deprotonation of the C3-OH group by Glu-95 (Fig. 4A), the acceptor nucleophile can attack the anomeric position of GDP-fucose to form a new glycosidic bond with an inverted configuration (Fig. 4A). The side product GDP dissociates at the same time. Because the anomeric carbon is located between the acceptor nucleophile, the C3-OH of GlcNAc, and the leaving group GDP (Fig. 4B), the geometry is consistent with the in-line displacement mechanism (39).
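One simple way to quantify "in-line" geometry is the angle at the anomeric carbon between the attacking oxygen and the departing glycosidic oxygen, which approaches 180° for an ideal in-line displacement. The following is a minimal sketch of that check; the coordinates are hypothetical placeholders chosen for illustration and are not taken from the deposited structures or the acceptor model.

```python
import math


def angle_deg(p1, vertex, p2):
    """Angle p1-vertex-p2 in degrees for 3D points given as (x, y, z)."""
    v1 = [a - b for a, b in zip(p1, vertex)]
    v2 = [a - b for a, b in zip(p2, vertex)]
    dot = sum(a * b for a, b in zip(v1, v2))
    norm = math.sqrt(sum(a * a for a in v1)) * math.sqrt(sum(b * b for b in v2))
    cos_theta = max(-1.0, min(1.0, dot / norm))  # clamp floating-point drift
    return math.degrees(math.acos(cos_theta))


# Hypothetical coordinates in angstroms, used only to illustrate the check.
acceptor_o3 = (0.0, 0.0, 3.0)    # attacking 3-OH oxygen of GlcNAc
fucose_c1 = (0.0, 0.0, 0.0)      # anomeric carbon of fucose
phosphate_o = (0.3, 0.0, -1.5)   # glycosidic oxygen of the leaving GDP

theta = angle_deg(acceptor_o3, fucose_c1, phosphate_o)
print(f"nucleophile-C1-leaving group angle: {theta:.1f} degrees")
```

Applied to real coordinates, values close to 180° would support the in-line description, whereas angles near 90° would point to a front-side arrangement instead.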
Meanwhile, it is intriguing that two neighboring regions of opposite charges are also observed. The positively charged pocket, in close association with GDP-fucose, facilitates glycosidic bond cleavage by neutralizing the negative charges on GDP. Lys-250 and Arg-195 are the two critical residues (Fig. 4B). A major conformational change of the Arg-195-containing helix α7 is induced by the binding of GDP or GDP-fucose (Fig. 5A). The negatively charged surface flanking the fucose moiety stabilizes the developing oxonium ion in the transition state. In particular, Glu-249 is the key residue near C1 and O5 of fucose, at distances of 4.9 and 3.9 Å, respectively (Fig. 4B). On the other hand, a considerable distance (8 Å) exists between the donor and acceptor. Thus, the two domains must move, triggered by the binding of both substrates, to yield a more closed conformation of the protein and facilitate additional cross-domain interactions.

FIGURE 5. Acceptor substrate model, proposed reaction mechanism, and potential FucT inhibitors. A, the Cα tracings of the FucT monomers in different crystals (in stereo view). Monomers A, B, and C are colored blue, cyan, and magenta, respectively. The bound GDP-fucose is shown as a stick model with the carbon backbone colored in gray. Two regions of α2-β2 and β8-α7, which undergo conformation changes, are colored red. B, the molecular surface of the FucT is colored from red to blue according to the electrostatic potential from −15 kBT (acidic) to +15 kBT (basic) using the program GRASP. The observed GDP-fucose and the modeled (LacNAc)4-octasaccharide (β1,3-linkage between each LacNAc unit) are shown in gray and green, respectively, for the carbon-carbon bonds. The octasaccharide lies in the crevice between the two binding domains and makes primary contact with the α2 helix. C, structures of the potential FucT inhibitors. Compounds 1 and 2 were coupled with 80 different carboxylic acids (X-COOH) by amide-forming reactions to give GDP derivatives that were unpurified and directly subjected to the FucT activity assay. Compound 3 was found to be an inhibitor against H. pylori FucT (Ki = 0.59 μM).

TABLE 4 Kinetic analysis of FucTs and mutants
FucTΔC45 denotes FucT lacking the C-terminal 45 residues, and FucTΔC115 denotes FucT lacking the C-terminal 115 residues. Entries marked not detected had no or very low activity (<0.5%), so the kinetic parameters could not be obtained.
FucT or mutant | Relative activity | GDP-fucose | LacNAc

Essential Catalytic Residues Supported by Site-directed Mutagenesis-The binding of donor substrate in the absence of acceptor suggests that FucT catalyzes the sugar transfer reaction by a sequential mechanism, as do most other GTs (26). Site-directed mutants of FucT were thus prepared to support the roles of the key residues identified above. The enzyme activity was measured based on the formation of the side product GDP (15). Although the enzyme is able to hydrolyze GDP-fucose as aforementioned, our isotope-based TLC analysis indicated that the rate of hydrolysis is at least 340-fold lower than that of the enzymatic fucosylation (data not shown). The truncated FucTΔC115 (the protein lacking the C-terminal 115 residues) has much less activity than the larger FucTΔC45 (lacking the C-terminal 45 residues), which contains an additional leucine zipper in the C terminus and forms a dimer in solution, as does the full-length FucT (15). All the mutations are based on FucTΔC45, as shown in Table 4. Substitution of Glu-95 with alanine (E95A) or aspartic acid (E95D) led to complete activity loss, in agreement with the suggested role as a general base.
Alanine mutants of Arg-195 or Lys-250 resulted in no detectable activity, supporting the idea that the two residues provide positive charges to interact with negatively charged GDP-fucose (Figs. 1B and 4B). Glu-249 functions to stabilize the proposed oxonium cation, as well as to form two H-bonds with both the ribose and the fucose residues of GDP-fucose (Figs. 1B and 4B). The importance of Glu-249 is further supported by the undetectable activities of the mutants E249A, E249D, and E249Q. Tyr-246 is involved in the binding of GDP-fucose (Fig. 1B), as suggested by the reduced activity of Y246A and its greatly increased Km for GDP-fucose (>800 μM). The mutant N240A retained 10% of the activity of the wild-type enzyme. Surprisingly, the Km for LacNAc of N240A was 12.8 mM, 18-fold higher than that of the wild-type FucT, whereas the Km for GDP-fucose was not affected. Despite the sequence identity of Asn-240 among various α1,3/4-FucTs (data not shown), its H-bond to fucose (Fig. 1B) is possibly not crucial for donor substrate binding. By contrast, Asn-240 is implicated in binding of the acceptor substrate. The interaction likely occurs either after the conformation change induced by the occupancy of both donor and acceptor substrates, or in the larger proteins (with less C-terminal truncation).
LPS Model and Inhibitor Design-Because FucT catalyzes the fucosyl transfer to oligosaccharyl/polysaccharyl LacNAc-containing LPS in vivo, a model of the (LacNAc)4-octasaccharide (β1,3-linkage between each LacNAc unit) was constructed (Fig. 5B) based on the aforementioned considerations. The modeled structure indicates that the octasaccharide lies in the crevice between the two binding domains, in proximity to the α2 helix (Fig. 5). By coincidence, helix α2 adopts different conformations among the three structures (Fig. 5A), which may provide better contact with the LacNAc acceptor. Meanwhile, at least 4-5 LacNAc units are encompassed by the enzyme (Fig. 5), in agreement with the need to accommodate a polysaccharide acceptor for fucosylation.
Because FucT has more interactions with GDP than it does with fucose, it would be necessary to mimic the former interactions when developing FucT inhibitors. A variety of molecules were prepared by coupling of GDP-hexylamine (compound 1, Fig. 5C) with 80 different carboxylic acids (supplemental Fig. S1). The inhibitory activity of the resulting unpurified products was assessed using the FucT enzyme assay. Most of the molecules had IC50 values of 10-100 μM, comparable to that of GDP. Similar results were obtained using the molecules derived from the reactions of GDP-propylamine (compound 2) and carboxylic acids. Compound 3, reported previously as a potent inhibitor of human FucT VI (24), effectively inhibited H. pylori FucT (Ki = 0.59 μM). This lead compound represents a 150-fold enhanced affinity compared with GDP-fucose (Km = 88 μM). Because there is considerable distance between the donor and acceptor sites, the improved inhibition is likely due to the substantial molecular size of the compound, which makes it possible to gain multiple additional interactions with the acceptor binding site.
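The quoted 150-fold enhancement follows directly from the ratio of the two constants. The short sketch below reproduces that arithmetic, assuming both values are expressed in the same concentration unit (micromolar, as the magnitudes suggest); note that it compares a Km with a Ki exactly as the text does, which is only an approximate comparison of binding affinities.

```python
def fold_enhancement(reference, inhibitor):
    """Ratio of two constants expressed in the same concentration unit."""
    return reference / inhibitor


KM_GDP_FUCOSE = 88.0  # Km of GDP-fucose quoted in the text
KI_COMPOUND_3 = 0.59  # Ki of compound 3 quoted in the text

ratio = fold_enhancement(KM_GDP_FUCOSE, KI_COMPOUND_3)
print(f"Compound 3 vs GDP-fucose: about {ratio:.0f}-fold")
```

The ratio comes out just under 150, consistent with the rounded figure given above.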
Possible Ramifications of an Intertwining Dimer-One asymmetric unit of the FucT crystal contains three molecules of the protein. Monomers A and B are related by a non-crystallographic 2-fold axis and constitute a possible dimer. A similar dimer is formed by monomer C and the adjacent monomer C′ related by a crystallographic dyad. Additionally, the remaining C termini of both monomers are oriented in the same direction (Fig. 6A) and would probably be followed by the leucine zipper motifs (truncated in our experimental protein) in the full-length protein. As a result, the dimer observed is consistent with the proposed model (15) in which the C-terminal heptads of each subunit interact with those of the other subunit in parallel to yield a dimer. Although the studied enzyme lacks the C-terminal heptad repeats plus a membrane anchor and appears to be a monomer in solution (15), the structural information suggests that the homodimer in our crystal is not entirely caused by crystal packing. Rather, it may represent a native form of the enzyme.
In the dimeric FucT structure, the C terminus of the counter subunit is 15-18 Å away from the predicted catalytic base Glu-95, whereas the residue is much farther away (32-33 Å) from that of the same subunit (Fig. 6B). Owing to the important role of the C terminus in the acceptor binding and specificity (40), the C terminus (residues 350-363) that is missing in the resolved x-ray structures likely crosses over the interface to approach the active site of the counter monomer and thus participates in acceptor binding. This hypothesis is supported by the observation that the truncation of the heptad repeats has greater impact on acceptor binding than donor binding (Table 4).
A molecular ruler mechanism of FucT was proposed for how H. pylori varies its LPS fucosylation pattern (41). The formation of a homo- and heterodimeric FucT, due to the expression of the genes futA and futB, is putatively linked to LPS variation. Because the region of heptad repeats mediates the formation of homo- and heterodimers, a number of different dimers, depending on the number and identity of the heptads involved in dimerization, may affect the sizes of O-antigen fucosylation units. This mechanism thus suggests that a homodimer preferentially fucosylates one LacNAc unit. Nevertheless, our preliminary study on the fucosylation of oligomeric LacNAc indicates that the FucT homodimer is able to carry out multiple fucosylations (data not shown), suggesting that the ruler mechanism requires re-evaluation. In accordance with our structures, the enzyme not only accommodates four to five LacNAc units but also exists as an intertwining dimer, the features of which are likely applicable to the formation of heterodimers: only one or two isoforms can be generated due to head-to-head interactions of the two catalytic domains.
In summary, the first x-ray structures of FucT suggest a reaction mechanism at the molecular level. This mechanism is helpful in the design of potential lead compounds for enzyme inhibition. Further optimization of inhibitors and crystallization of FucT/inhibitor complexes will potentiate the development of a new treatment for H. pylori infection.