Molecular Basis for Recognition of the Cancer Glycobiomarker, LacdiNAc (GalNAc[β1→4]GlcNAc), by Wisteria floribunda Agglutinin*

Aberrant glycosylation and the overexpression of specific carbohydrate epitopes are hallmarks of many cancers, and tumor-associated oligosaccharides are actively investigated as targets for immunotherapy and diagnostics. Wisteria floribunda agglutinin (WFA) is a legume lectin that recognizes terminal N-acetylgalactosaminides with high affinity. WFA preferentially binds the disaccharide LacdiNAc (β-D-GalNAc-[1→4]-D-GlcNAc), which is associated with tumor malignancy in leukemia, prostate, pancreatic, ovarian, and liver cancers and has shown promise in cancer glycobiomarker detection. The mechanism of specificity for WFA recognition of LacdiNAc is not fully understood. To address this problem, we have determined the affinities and structures of WFA in complex with GalNAc and LacdiNAc. Affinities toward Gal, GalNAc, and LacdiNAc were measured via surface plasmon resonance, yielding KD values of 4.67 × 10⁻⁴ M, 9.24 × 10⁻⁵ M, and 5.45 × 10⁻⁶ M, respectively. Structures of WFA in complex with LacdiNAc and GalNAc have been determined to 1.80–2.32 Å resolution. These high resolution structures revealed a hydrophobic groove complementary to the GalNAc and, to a minor extent, to the back-face of the GlcNAc sugar ring. Remarkably, the contribution of this small hydrophobic surface significantly increases the observed affinity for LacdiNAc over GalNAc. Tandem MS sequencing confirmed the presence of two isolectin forms in commercially available WFA differing only in the identities of two amino acids. Finally, the WFA carbohydrate binding site is similar to that of a homologous lectin isolated from Vatairea macrocarpa in complex with GalNAc, which, unlike WFA, binds not only αGalNAc but also terminal Ser/Thr O-linked αGalNAc (Tn antigen).

Neoplastic transformation often results in cells with unusual glycosylation patterns specific to the type and stage of different cancers, providing numerous avenues for therapeutic and diagnostic reagents (1, 2). Lectins are non-catalytic proteins with strict specificity for individual mono- or oligosaccharides, which can make them powerful tools for detecting changes in the carbohydrate structure in glycoproteins and glycolipids. Several lectins are currently used in histochemical analyses to identify malignant or premalignant cells (3, 4). Wisteria floribunda agglutinin (WFA) is a legume lectin that binds N-glycans terminating in β-linked N-acetylgalactosaminides, particularly ones with LacdiNAc (β-D-GalNAc-[1→4]-D-GlcNAc) (Fig. 1) termini, and binds terminal galactose residues with lower avidity (5–11). WFA's biological activities in vitro have long been known to include inducing T lymphocyte activation (12) and hemagglutination (13). As with most legume lectins, the biological role of WFA is poorly understood; it may function as a mediator of symbiosis between nitrogen fixing bacteria and the plant's roots or as a plant defense mechanism (14).
Lectins generally bind di- and oligosaccharides with higher affinity than monosaccharides (14). The detailed specificities of lectins can now be determined by using glycan arrays (15, 16), and array data obtained for WFA confirmed that it preferentially binds glycans with LacdiNAc termini (Consortium for Functional Glycomics, plant lectin dataset #2342). At the lowest concentration of added lectin, the top six glycans recognized all had LacdiNAc termini and were followed by GalNAc α and β ligands that showed from 28% to 20% of the binding of the best ligand.
The LacdiNAc structure is abundant in invertebrates (5, 17–19), and some mammalian glycoproteins and lipids contain N- and O-glycans terminating in LacdiNAc, particularly hormones (20). The GalNAc transferase β4GalNAc-T3 is primarily responsible for its biosynthesis (21). Significantly, glycoproteins expressing terminal LacdiNAc appear to become elevated in a variety of human cancers, including prostate, lung, ovarian, colon, and liver cancers (6, 8, 9, 22–25). The tissue-specific expression of LacdiNAc is a potent diagnostic marker for specific human cancers (23), and several recent studies have promoted the use of WFA for detection of LacdiNAc or terminal GalNAc overexpressed during cancer progression and growth (26–33). However, the molecular basis for WFA recognition of and specificity toward LacdiNAc is not known.
To date, there is one high resolution structure of a mushroom lectin (Clitocybe nebularis (CNL)) in complex with LacdiNAc (34). However, a hemagglutinin inhibition assay of the CNL protein showed broad specificity to lactose, galactose, glucose, and sucrose, ranging from high to low inhibition concentration, respectively (34). Thus, its cross-reactive potential severely limits its use for cancer biomarker detection. WFA is currently considered the most prominent diagnostic lectin against cholangiocarcinoma when compared with other homologous lectins (26, 35) and the most discriminatory against altered N-glycans on Mac2-binding protein, a secretory N-glycoprotein with elevated expression levels during viral hepatitis-induced liver cirrhosis (36, 37).
Recently, Narimatsu (38) successfully produced recombinant WFA and reported its full-length primary sequence along with those of closely related lectins. Despite high sequence homology, the lectins derived from Wisteria japonica and Wisteria brachybotrys have unique sugar binding activities (9). For instance, W. japonica agglutinin only binds terminal α- and β-linked N-acetylgalactosamine.
There is some confusion surrounding the nomenclature and quaternary structure of WFA. It was initially reported as a dimer with a monomer molecular mass ranging from 28 to 35 kDa (9–12), whereas the WFA sold commercially is reported as a tetramer with a molecular mass of 116 kDa in an oxidizing environment. An earlier publication also reported a tetrameric hemagglutinin, purified from seeds of W. floribunda, termed WFH (39). The tetrameric WFH does not have mitogenic activity but displays strong hemagglutination and leukoagglutination activity, whereas the dimeric WFA shows mitogenic and hemagglutination activity (38). The dimeric lectin is referred to as WFM for its mitogenic capability. It is currently unclear whether these lectins represent different isoforms or entirely different lectins.
A better understanding of WFA and the molecular basis for LacdiNAc-lectin recognition is of significant biomedical and immunological interest. Here we report the high resolution crystal structures of the tetrameric form of WFA from Vector Laboratories Ltd (Burlington, ON, Canada) in complex with GalNAc and LacdiNAc together with binding data.

Experimental Procedures
Synthesis of LacdiNAcβ-pNP-The bacterial β1,4-galactosyltransferase (gene HP0826, construct HP-21) from Helicobacter pylori and the UDP-GlcNAc/Glc 4-epimerase (gene Cj1131c, construct CPG-13) from Campylobacter jejuni were expressed and purified as reported in Namdjou et al. (40) and Bernatchez et al. (41), respectively. The synthesis reaction mix included 5 mM GlcNAcβ-pNP (15 mg), 10 mM UDP-GlcNAc, 10 mM MnCl2, 50 mM MES, pH 6.5, 39 units of HP-21, and 46 units of CPG-13. The reaction was incubated at 37 °C. Additions of 12 units of HP-21 and 14 units of CPG-13 were made after 4 h, and an addition of 3 mM UDP-GlcNAc was made after 6 h. Capillary electrophoresis analysis (P/ACE MDQ system equipped with diode array detection, Beckman Coulter, Fullerton, CA) showed that the reaction was 85% complete after 23 h. The reaction mix was applied to a solid phase extraction cartridge (SepPak C18). Unbound material was removed by washing with water and then 20% methanol. The mix of GlcNAcβ-pNP/LacdiNAcβ-pNP was eluted with 50% methanol. Incubation at 4 °C in 50% methanol resulted in the selective precipitation of the LacdiNAcβ-pNP. The LacdiNAcβ-pNP was recovered by centrifugation, washed with 100% methanol, and solubilized in water.
Surface Plasmon Resonance-The interactions of sugar ligands with immobilized WFA were measured by SPR using a Biacore T200 instrument (GE Healthcare). Immobilizations were carried out on research grade CM5 sensorchips at a protein concentration of 50 µg/ml in 10 mM sodium acetate buffer, pH 4.0, using the manufacturer's amine coupling kit. The amount of immobilized WFA was 3928 resonance units (RU), and the reference surface was blocked with ethanolamine alone. The binding analyses were carried out at 25 °C in 20 mM HEPES buffer, pH 7.5, containing 50 mM NaCl, 0.1 mM CaCl2, and 0.1 mM MnCl2 at a flow rate of 20 µl/min. The data were analyzed with the T200 evaluation software, version 3.0.
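Steady-state SPR affinities of the kind reported in Results are commonly derived from a 1:1 Langmuir isotherm, Req = Rmax·C/(KD + C). The sketch below illustrates a fit under that assumption only; it is not the algorithm of the T200 evaluation software, and the function names, the grid-search approach, and the synthetic concentrations and Rmax value are hypothetical.

```python
def steady_state_response(conc, kd, rmax):
    """Equilibrium response for 1:1 binding: Req = Rmax * C / (KD + C)."""
    return rmax * conc / (kd + conc)


def fit_kd(concs, responses, kd_min=1e-8, kd_max=1e-2, steps=2000):
    """Scan KD on a log grid; for each KD the best Rmax has a closed form.

    Returns the (KD, Rmax) pair with the smallest sum of squared residuals.
    """
    best = None
    for i in range(steps + 1):
        kd = kd_min * (kd_max / kd_min) ** (i / steps)
        occupancy = [c / (kd + c) for c in concs]
        # Linear least squares for Rmax at fixed KD.
        rmax = (sum(f * r for f, r in zip(occupancy, responses))
                / sum(f * f for f in occupancy))
        sse = sum((r - rmax * f) ** 2 for f, r in zip(occupancy, responses))
        if best is None or sse < best[0]:
            best = (sse, kd, rmax)
    return best[1], best[2]


# Synthetic, noise-free data (assumed values, for illustration only).
true_kd, true_rmax = 5.45e-6, 100.0
concs = [true_kd * x for x in (0.1, 0.3, 1.0, 3.0, 10.0)]
responses = [steady_state_response(c, true_kd, true_rmax) for c in concs]
kd_fit, rmax_fit = fit_kd(concs, responses)
```

Spanning ligand concentrations from well below to well above the expected KD, as in the synthetic series here, is what allows both KD and Rmax to be constrained.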
Purification and Crystallization-Lyophilized WFA (Vector Laboratories) was diluted to 1 mg/ml in 20 mM Tris-HCl, pH 8.0, 0.1 mM Ca2+, and 0.1 mM Mn2+ and further purified by size exclusion chromatography on a BioSep SEC-s3000 column (Phenomenex, Torrance, CA) with the same buffer. Samples were concentrated to 15 mg/ml in the presence of either 5 mM LacdiNAc or 10 mM GalNAc for crystallization trials. Sitting drops were set up in a 16 °C room with 96-well plates using a Gryphon Xtallization Robot (Art Robbins Instruments, San Jose, CA). Crystals of WFA in the presence of LacdiNAc appeared in >10% of all screens tested.
The best crystal conditions were obtained in conditions 22, 9, and 6 of the PEG I, PEG II, and JCSG+ crystal screens (Qiagen, Toronto, ON, Canada) with the following formulas, respectively: 0.1 M Tris-HCl, pH 8.5, and 25% (v/v) PEG 550 MME: 0.
Data Collection, Molecular Replacement, and Structure Refinement-All crystals of WFA in the presence of GalNAc and LacdiNAc contained appropriate amounts of cryoprotectant and were flash-frozen directly in a liquid nitrogen bath. X-ray diffraction data sets were collected at the Canadian Macromolecular Crystallography Facility on beamline 08ID-1 (CMCF-ID) of the Canadian Light Source (Saskatoon, Saskatchewan, Canada) at 0.979 Å wavelength with a MarMosaic CCD300 detector and processed using HKL2000 (HKL Research Inc., Charlottesville, VA).
The structure of WFA in complex with GalNAc was solved by molecular replacement using Phaser (42) with a monomer of a seed lectin from Vatairea macrocarpa (43) (PDB code 4XTM) as a search model. Subsequently, structures I, II, and III in complex with LacdiNAcβ-pNP were solved using the WFA-GalNAc structure as a search model. Manual fitting of σA-weighted Fo-Fc and 2Fo-Fc electron density maps was carried out with Coot (44). Restrained refinement was carried out using REFMAC5 (45). All stereo figures and r.m.s.d. (root mean square deviation) calculations presented in this paper were made using SetoRibbon (available upon request). Electrostatic surface potential figures were made using Chimera molecular visualization software (46). Marvin version 5.7.0 from ChemAxon was used for drawing chemical structures and making models for the LacdiNAcβ-pNP used in this study. Geometric restraints and final model generation were obtained from the PRODRG server (47). Buried surface area was calculated with AreaIMol (48) in the CCP4 suite (49) using a 1.4 Å probe radius and standard van der Waals radii. All solvent molecules were excluded from the calculations. Calculations were averaged to account for small differences between the buried surfaces in each monomer.
Tryptic Tandem Mass Spectrometric Analysis-Purified WFA was diluted to 1 mg/ml in 20 mM Tris-HCl before MS analysis. WFA was first digested using trypsin (Promega, Madison, WI) for 18 h at room temperature and subsequently acidified with formic acid and stored at −80 °C. The sample was subsequently eluted through a zip tip into an Eppendorf tube with 5 µl of 0.1% TFA and 50% acetonitrile. Tryptic peptides were subjected to mass spectrometric analysis using a nano-HPLC system (Easy-nLC II, Thermo Fisher Scientific, Mississauga, ON, Canada) coupled to the electrospray ionization source of an LTQ Orbitrap Velos (Thermo Fisher Scientific) using conditions described in Garcia-Pino et al. (50).
Peptic Tandem Mass Spectrometric Analysis-WFA (1 mg/ml) in Tris buffer was reduced with 10 mM tris(2-carboxyethyl)phosphine for 4 h at room temperature. 50 pmol of WFA was subsequently diluted into 10 mM HEPES, pH 7.5, and 50 mM NaCl and acidified to give a concentration of 0.6 M guanidine-HCl, 0.8% formic acid. Samples were rapidly frozen in liquid nitrogen and stored at −80 °C until mass analysis. Protein samples were rapidly thawed and injected onto a UPLC (ultra performance liquid chromatography) system kept in a cold box at 2 °C. The sample was run over two porosyme immobilized pepsin columns (Applied Biosystems; porosyme, 2-3131-00) in series, one at 10 °C and the other at 2 °C, using 200 µl/min for 3 min, and peptides were collected onto a VanGuard precolumn trap (Waters Ltd., Mississauga, ON, Canada). The trap was subsequently eluted in line with an Acquity 1.7-µm particle, 100 × 1-mm² C18 ultra performance liquid chromatography column (Waters) using a gradient of 5–36% B (buffer A: 0.1% formic acid; buffer B: 100% acetonitrile) for >20 min. Mass spectrometry experiments were performed on an Impact II TOF (Bruker) acquiring over a mass range from 350 to 1500 m/z for 30 min using an electrospray ionization source operated at 200 °C and a spray voltage of 4.5 kV (Impact).
MS/MS was run in a data-dependent acquisition mode with a 0.5-s precursor scan from 200 to 2000 m/z followed by 12 fragment scans from 150 to 2000 m/z of 0.25 s (Impact). The resulting MS/MS datasets were analyzed using PEAKS7 (Bioinformatics Solutions). A database containing a number of possible WFA sequences and a known set of possible contaminants, including pepsin, was used. The false discovery rate was set at 0.1% with a cut-off −10log10(p value) score of 29.9.
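For readers less familiar with the two peptide-match metrics quoted in this paper, the −10log10(p value) score and the mass error in ppm, the following minimal sketch converts between the quantities using their standard definitions. The function names and the example m/z values are hypothetical and are not taken from PEAKS7.

```python
def score_to_p_value(score):
    """Invert score = -10 * log10(p) to recover the p value."""
    return 10 ** (-score / 10)


def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


# The 29.9 cut-off corresponds to a p value of roughly one in a thousand.
cutoff_p = score_to_p_value(29.9)

# Hypothetical m/z pair chosen to give an error of about +1 ppm.
example_ppm = ppm_error(1000.001, 1000.000)
```

Higher scores therefore correspond to smaller p values, so the fragment scores of 48.5 and 57.4 reported in Results lie well beyond the 29.9 threshold.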

Results
Synthesis of LacdiNAcβ-pNP-The H. pylori β-1,4-galactosyltransferase (HP0826) was able to use UDP-GalNAc as a donor, although its natural donor is UDP-Gal. Analytical test syntheses showed that HP0826 could achieve >99% conversion of a fluorescent GlcNAcβ derivative to LacdiNAc when using a large excess (5×) of UDP-GalNAc and extended incubation times (data not shown). The synthesis of LacdiNAc could be performed at a preparative scale by coupling the reaction with the UDP-GlcNAc/Glc 4-epimerase (Cj1131c) from Campylobacter jejuni in order to use UDP-GlcNAc as a precursor. Starting with 15 mg of p-nitrophenyl GlcNAc, we synthesized 20 mg of LacdiNAcβ-pNP with a recovery yield of >85%. The structure and purity of the product were confirmed by thin layer chromatography (data not shown), capillary electrophoresis (data not shown), mass spectrometry (data not shown), and NMR (supplemental Table S1 and Fig. S1).
Surface Plasmon Resonance-On the basis of the glycan array data, three compounds were chosen for SPR analyses (supplemental Fig. S2), representing high, medium, and low affinity structures (Table 2). In steady state analyses, the KD of WFA for LacdiNAcβ-pNP was 5.
X-ray Diffraction Data and Overall Structure-Data collection and refinement statistics for WFA structures are given in Table 1. Data for WFA crystals in complex with GalNAc were collected to 2.34 Å, and the structure was solved in space group P2₁2₁2 with an Rpim of 3.50%. The structure contained one tetrameric lectin in the asymmetric unit, displaying the characteristic legume fold (Fig. 2A). Like other L-type lectins, WFA possesses a conserved metal binding site for the divalent Ca2+ and Mn2+ ions. Their presence allows for the formation of a cis peptide bond between Ala-116 and Asp-117 (numbering scheme shown in Table 3), a common feature of legume lectins, which is essential for proper folding and carbohydrate binding (51).
Excellent electron density is observed for the GalNAc (Fig. 2B) in all four binding sites and for the polypeptide chain of all monomers, with the exception of solvent-exposed residues 68–71, 91–94, and 141–148. C-terminal residues 269–273, including the Cys residue, were disordered and, thus, excluded from the final model.
Data for WFA in complex with LacdiNAcβ-pNP were collected to 1.80, 1.95, and 2.09 Å resolution for conditions I (pH 8.5), II (pH 6.5), and III (pH 4.2), respectively. All three structures contained one homotetramer in the asymmetric unit and were solved in space group P2₁2₁2₁ with an Rpim of 4.30%, 8.90%, and 5.70%, respectively. Excellent electron density was observed for the LacdiNAcβ-pNP (Fig. 2, C-E) and the polypeptides in all three structures, with the exception of solvent-exposed residues in the same regions as in the WFA-GalNAc complex. Condition III displayed poor density and high B-factors for the nitrophenyl moiety of LacdiNAcβ-pNP (Fig. 2E).
All structures also showed good density for N-linked GlcNAc on Asn-146, whereas a few of the monomers showed unambiguous density for a biantennary N-glycan with up to seven ordered carbohydrate residues, including a β(1→2)Xyl residue typically found in plant N-glycans. The most ordered N-glycan structure (lowest mean isotropic temperature factor) was observed for one of the monomers in condition III and was stabilized by crystal contacts via terminal α(1→3)Man (Fig. 2F).
WFA-GalNAc Interactions and Buried Surface Area-There are seven hydrogen bonds formed between WFA and GalNAc (Fig. 2G), summarized in Table 4. The total buried surface area for the GalNAc residue is ~146 Å². There are multiple hydrophobic interactions between the hydrophobic face of the sugar ring and Phe-159, in addition to van der Waals contacts from the β carbon and C4′ of the His-249 side chain to the C6′ carbon of GalNAc (Fig. 2G). WFA forms a hydrophobic groove via Gly-134, Gly-135, Trp-163, and Leu-245 surrounding the acetyl group of GalNAc. There is also an indirect hydrogen bond formed via a water molecule and the carbonyl group of Pro-133. The carbonyl group of the acetamide moiety was oriented toward the binding site, forming a weak hydrogen bond (3.00–3.15 Å) with the amide atom of Gly-134 (Fig. 2G). The methyl moiety of GalNAc formed van der Waals contacts (3.8–4.0 Å) with the Trp-163 residue.

WFA-LacdiNAc Interactions and Buried Surface Area-In the three structures of WFA in complex with LacdiNAc, all direct hydrogen bonds were directed to the GalNAc moiety (Fig. 2G). An additional water-mediated hydrogen bond was observed between the amido group of GlcNAc and the carbonyl group of Leu-245 (Fig. 2H). There were also hydrophobic interactions between Leu-245 and the hydrophobic face of the GlcNAc residue (Fig. 2H). The additional buried surface area for GlcNAc was ~37 Å², of which 23 Å² and 14 Å² were contributions of Leu-245 and Ser-246, respectively.
Tandem MS Sequencing and Alignment-During refinement and modeling, we discovered significant differences in the 2Fo-Fc and Fo-Fc electron density maps around Thr-183 and Gly-190 (Fig. 3A) in all structures when compared with the sequence published in the patent by Sato et al. (38). Based on their electron density and surrounding environment, we predicted that Thr-183 should be an Ile and that Gly-190 is either an Asn or an Asp (Fig. 3B). To confirm the sequence and the presence of isolectins in the sample, we performed tandem mass spectrometric analysis on WFA. We achieved 91 and 94% protein coverage by cleavage with trypsin (Fig. 3C) in native condition and pepsin (Fig. 3D) in reduced condition, respectively. Combined, we obtained sequence coverage of 98%, excluding the N- and C-terminal residues that are cleaved post-translationally. Residues ranging from 181 to 204 could not be confirmed using the trypsin digest alone but were confirmed using the pepsin digest in conjunction with the electron density. The presence of Ile-183 and Asp-190 was confirmed via multiple fragments from the pepsin digest. The best one showed a −10log10(p value) score of 48.5 and a ppm error of 0.7, corresponding to sequence (S)IVSRKTISWDLEND(E) (Fig. 3E). Interestingly, the original residues Thr-183 and Gly-190 (not observed in the electron density) were observed in one of the fragments, of sequence (S)IVSRKTTSWDLENGEVAN(V), with a −10log10(p value) score of 57.4 and a ppm error of −2.3 (Fig. 3F). Only five residues could not be confirmed using either method; they were Arg-144, His-145, Lys-146, Trp-258, and Thr-259. The sequence obtained from the mass spectrometry analysis together with the electron density analysis is shown in Table 5, along with related plant lectins.

Sequence of W. floribunda agglutinin as published in the patent by Sato et al. (38)
Grey highlighted residues were originally assigned as Thr-183 and Gly-190, but our data show that Ile-183 and Asp-190 are more abundant in solution. Asn-146 is glycosylated, and Cys-272 forms a disulfide bond between monomers of the tetramer. Underlined residues are processed and are not part of the secreted lectin. Residues in contact with LacdiNAc and GalNAc either through water bridges or direct hydrogen bond/van der Waals interactions are shown in bold. Leu-245 is the only residue involved in recognition of LacdiNAc.

TABLE 4 H-bond interactions between W. floribunda agglutinin residues and carbohydrate ligand in GalNAc and GalNAc(β1→4)GlcNAc-p-nitrophenyl (LacdiNAc-pNP) in conditions I, II, and III complexed structures
The distance cut-off for hydrogen bond assignment was 3.3 Å. Hydrogen bond distances are averaged over the four molecules in the tetramer in the asymmetric unit. No hydrogen bonds were observed between WFA and GlcNAc in the LacdiNAc ligand.

NOVEMBER 11, 2016 • VOLUME 291 • NUMBER 46

JOURNAL OF BIOLOGICAL CHEMISTRY 24091
Namdjou et al. (40) reported that HP0826 can be expressed in high yields in Escherichia coli and is an efficient biocatalyst to synthesize compounds with terminal β1,4-linked Gal residues. Similar to bovine B4GalT1, HP0826 could achieve >99% conversion of GlcNAcβ derivatives to LacdiNAc derivatives when using a large excess of UDP-GalNAc and extended incubation times. This work reports an efficient protocol that should be applicable to the synthesis of other LacdiNAc derivatives from GlcNAc precursors using bacterial enzymes that are easy to produce at large scale.
WFA Recognition of LacdiNAc and GalNAc-WFA displays one of the highest affinities observed for a legume lectin to a disaccharide, with a KD of 5.45 µM against LacdiNAc. The 17-fold difference in affinity between LacdiNAc and GalNAc is likely due to the displacement of water around Leu-245 and hydrophobic interactions between Leu-245 and the hydrophobic face of GlcNAc (Fig. 2H). In contrast, the 5-fold higher affinity for GalNAc compared with Gal is likely due to a combination of an additional hydrogen bond to the acetamido group and hydrophobic interactions with the methyl group. Hydrophobic residues surrounding acetamides are common in carbohydrate-binding proteins and other homologous legume lectins.
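The fold differences quoted above follow directly from the measured KD values, and they can be restated as differences in binding free energy. The sketch below is an illustration only: it assumes the standard relation ΔG° = RT ln KD and a temperature of 298.15 K (the 25 °C used for the SPR measurements); the names and the energy conversion are not part of the original analysis.

```python
import math

GAS_CONSTANT = 8.314462618  # J / (mol K)
TEMPERATURE = 298.15        # K, assumed from the 25 °C SPR experiments

# Dissociation constants reported in this paper, in mol/l.
KD = {"Gal": 4.67e-4, "GalNAc": 9.24e-5, "LacdiNAc": 5.45e-6}


def fold_difference(weaker, tighter):
    """Ratio of dissociation constants (weaker binder / tighter binder)."""
    return KD[weaker] / KD[tighter]


def delta_delta_g_kj(weaker, tighter):
    """Free energy gained by the tighter binder, in kJ/mol (negative)."""
    ratio = fold_difference(weaker, tighter)
    return -GAS_CONSTANT * TEMPERATURE * math.log(ratio) / 1000.0


fold_lacdinac = fold_difference("GalNAc", "LacdiNAc")  # about 17
fold_galnac = fold_difference("Gal", "GalNAc")         # about 5
ddg_lacdinac = delta_delta_g_kj("GalNAc", "LacdiNAc")
ddg_galnac = delta_delta_g_kj("Gal", "GalNAc")
```

Under these assumptions the GlcNAc residue contributes on the order of −7 kJ/mol and the N-acetyl group of GalNAc on the order of −4 kJ/mol, which is consistent with the modest additional buried surface area described in Results.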
Comparison to Other Legume Lectins-Most leguminous lectin monomers adopt the "jelly-roll" (a type of β-sandwich) topology characterized by three β-sheets (a six-stranded back sheet, a small top sheet, and a seven-stranded curved front sheet) and a series of β turns and loops that hold the two large sheets together (54, 55). The front sheet together with a few of the loops forms the carbohydrate binding site. Despite the observed structural conservation between leguminous lectins, their specificities, loops, and quaternary structures vary widely (54–56). There are many plant lectins in the literature with sequence identity of 60–70% to WFA, with a subset of these shown in Table 5. The structures of lectins from Robinia pseudoacacia and V. macrocarpa have been determined in complex with GalNAc (43, 57) (PDB codes 1FNZ and 4XTM, respectively) and with Tn antigen in the latter case (PDB code 4XTP).
Like WFA, these lectins form a stable tetramer even over a wide range of pH values; however, WFA oligomerization is further facilitated via disulfide bonds between the C-terminal ends of each monomer (9). This finding was the basis for a patent (38) where a mutation of the C-terminal Cys to an Ala residue produced monomeric recombinant WFA. Cys residues in legume lectins are rare; one other GalNAc-specific lectin isolated from Sophora japonica tree bark has disulfide bonds cross-linking monomers (58). The C-terminal residues were disordered in all of the structures of WFA and excluded from the final model.
Structures of the R. pseudoacacia and V. macrocarpa lectins have been determined in complex with GalNAc with PDB codes 1FNZ and 4XTM, respectively (43 and 57). Residues in contact with GalNAc either directly or via water bridges are shown in bold, and residues involved in metal coordination are underlined.
Many of these lectins use the same amino acid residues in binding and metal coordination (Table 5). The V. macrocarpa lectin shares the highest sequence homology, and all residues that are observed to make contact with GalNAc and Tn antigen in those structures are also found in WFA. Secondary structure alignment of WFA with V. macrocarpa lectin shows a near perfect overlap between the bound GalNAc and the binding site residues (including the underlying residues) with an r.m.s.d. of 0.35 Å. Because of the high structural and sequence homology with V. macrocarpa lectin, WFA would be expected to bind Tn-antigen, but glycan array experiments have shown that GalNAcα compounds have a much lower affinity. Furthermore, in an earlier study (8) WFA staining did not overlap with an anti-Tn antibody.
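As a reminder of what an r.m.s.d. value such as the 0.35 Å quoted above measures, the following minimal sketch computes the root mean square deviation between two sets of paired atomic coordinates. It assumes the two structures have already been superposed and that atoms are listed in matching order; the function name and the toy coordinates are hypothetical, and this is not the SetoRibbon implementation used in this paper.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between paired (x, y, z) coordinates.

    Assumes both lists are already superposed and in the same atom order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Toy example: every atom is displaced by 0.35 Å, so the r.m.s.d. is 0.35 Å.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
displaced = [(0.35, 0.0, 0.0), (1.0, 0.35, 0.0)]
toy_rmsd = rmsd(reference, displaced)
```

In practice the optimal superposition is found first (for example by a least-squares fit of the selected atoms), and the r.m.s.d. is then evaluated on the fitted coordinates.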
In contrast, the lectin isolated from R. pseudoacacia uses an Asp instead of the shorter Ser-246 (Ser-257 in Table 5), which would form steric clashes with the GlcNAc, hindering the recognition of LacdiNAc. All other residues contacting GalNAc were conservative mutations or identical between R. pseudoacacia and WFA.
To date, only one other lectin structure has been determined in complex with LacdiNAc, and like WFA, C. nebularis lectin (CNL) forms hydrogen bonds exclusively to the GalNAc moiety (34). There are six hydrogen bonds formed between the CNL and GalNAc compared with the seven of WFA, with the latter also forming a larger buried surface area to the GlcNAc moiety than CNL. This would be consistent with the use of WFA as a more sensitive vehicle for detection of low levels of LacdiNAc expressed on surfaces of cancerous cells. The C. nebularis lectin also differs significantly in its fold and quaternary structure compared with WFA. The CNL fold is similar to the β-chain of ricin (hence referred to as a "ricin-like" or "R-type" lectin), forming primarily a dimer in its native state, and exhibits low sequence and structural homology with WFA.
Tandem MS Sequencing and Presence of Isolectins-Due to the discrepancy observed between the electron density and the sequence published in the patent by Sato et al. (38), we performed both tryptic and peptic digests of WFA coupled with mass spectrometric analysis. We obtained a total of 98% sequence coverage. We detected two forms of the lectin present in solution, with multiple polypeptide fragments containing the residues Ile-183 and Asp-190, whereas one fragment confirmed the presence of an isolectin with the previously reported sequence of Thr-183 and Gly-190. However, the 2Fo-Fc and Fo-Fc electron density maps were more appropriate for Ile-183 and Asp-190 side chains, possibly due to higher abundance of this isoform.
In addition, we did not observe polypeptide fragments containing Cys-272 (Fig. 3C) under native condition, whereas pepsin digest under reducing condition (Fig. 3D) confirmed the presence of Cys-272 and its participation in disulfide bond formation. The disulfide bond between the C-terminal tails of neighboring monomers is likely involved in further stabilization of the tetramer, and its flexibility inhibits access of large molecules to the central channel, which in turn could destabilize oligomerization (43). Sato et al. (38) also postulated that the C-terminal end is processed and cleaved between Asn-274 and Asn-275 by an enzyme, and our digest under reducing condition corroborated this finding with coverage up to Asn-274. Asn-specific cleavage of C-terminal peptides of legume lectins is a common phenomenon (59,60).
Effects of pH on Structure and Binding-Three different structures of WFA were obtained in three different conditions at pH 8.5, 6.5, and 4.2 (conditions I, II, and III, respectively). Although no significant differences were observed between the main chain polypeptide structures (i.e. no conformational changes were observed), acidic pH had a destabilizing effect on the ligand. The respective mean B-factors of the ligand averaged across all four monomers were 44.0, 34.3, and 53.4 Å² for conditions I, II, and III. This is also evident in the electron density maps, with poor density observed for the GlcNAc (in some monomers) and nitrophenyl moieties in condition III (Fig. 2E).
Conclusion-Recent studies of WFA-reactive colony-stimulating factor 1 receptor and Mac-2-binding protein (27, 33) emphasize the clinical importance of this lectin as a tool for diagnosis and detection of cancers, particularly early hepatocellular carcinoma. Until now, the molecular basis for binding of WFA to LacdiNAc was unknown. Our study shows how WFA achieves high affinity for the LacdiNAc biomarker, paving the way for rational engineering of the carbohydrate binding site for further improvements. Furthermore, structures of W. floribunda lectin in complex with GalNAc and LacdiNAc in conjunction with dissociation constants provide the first molecular explanation for the preference of WFA for LacdiNAc over GalNAc. WFA achieves higher affinity for LacdiNAc via additional van der Waals interactions and buried surface area around the hydrophobic face of the GlcNAc sugar ring. Additionally, the carbohydrate binding site of WFA is identical to that of a structurally related seed lectin derived from V. macrocarpa. This lectin binds both α and β GalNAc residues, and it has been proposed as a means of detecting the Tn antigen (43). However, glycan array experiments have shown that WFA recognizes the Tn-antigen with a much lower affinity than LacdiNAc, and WFA staining did not overlap that of an anti-Tn antibody (8). Finally, our findings confirm the presence of disulfide bonds between the WFA monomers and of at least one other isolectin in the commercial source. In combination with avidity effects, this tetrameric WFA achieves the high degree of specificity toward LacdiNAc required for detection of cancerous cells coated with this biomarker.
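To give a physical sense of the ligand B-factors listed above, they can be converted to root mean square atomic displacements. The sketch below is illustrative only and assumes the standard isotropic relation B = 8π²⟨u²⟩; the function name is hypothetical, and the conversion is not part of the original analysis.

```python
import math


def rms_displacement(b_factor):
    """Root mean square displacement (Å) from an isotropic B-factor (Å²).

    Uses B = 8 * pi^2 * <u^2>, so u_rms = sqrt(B / (8 * pi^2)).
    """
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))


# Mean ligand B-factors reported for conditions I, II, and III.
ligand_b = {"I": 44.0, "II": 34.3, "III": 53.4}
ligand_u = {name: rms_displacement(b) for name, b in ligand_b.items()}
```

Under this assumption the ligand displacements all lie between roughly 0.6 and 0.9 Å, with condition III (pH 4.2) the largest, in line with the weaker density described for that structure.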