The Antineoplastic Lectin of the Common Edible Mushroom (Agaricus bisporus) Has Two Binding Sites, Each Specific for a Different Configuration at a Single Epimeric Hydroxyl

The lectin from the common mushroom Agaricus bisporus, the most popular edible species in Western countries, has potent antiproliferative effects on human epithelial cancer cells, without any apparent cytotoxicity. This property confers to it an important therapeutic potential as an antineoplastic agent. The three-dimensional structure of the lectin was determined by x-ray diffraction. The protein is a tetramer with 222 symmetry, and each monomer presents a novel fold with two β sheets connected by a helix-loop-helix motif. Selectivity was studied by examining the binding of four monosaccharides and seven disaccharides in two different crystal forms. The T-antigen disaccharide, Galβ1–3GalNAc, mediator of the antiproliferative effects of the protein, binds at a shallow depression on the surface of the molecule. The binding of N-acetylgalactosamine overlaps with that moiety of the T antigen, but surprisingly, N-acetylglucosamine, which differs from N-acetylgalactosamine only in the configuration of epimeric hydroxyl 4, binds at a totally different site on the opposite side of the helix-loop-helix motif. The lectin thus has two distinct binding sites per monomer that recognize the different configuration of a single epimeric hydroxyl. The structure of the protein and its two carbohydrate-binding sites are described in detail in this study.

Lectins are proteins that recognize specific carbohydrate structures and thereby participate in molecular recognition events of fundamental relevance in a variety of biological processes (1-3). The lectin from the common edible mushroom Agaricus bisporus (ABL) is a member of a remarkable group that shares the property of binding, selectively and with high affinity, the Thomsen-Friedenreich antigen or T antigen. The T antigen is a disaccharide, Galβ1-3GalNAc, linked to either serines or threonines on cell surface glycoproteins and hidden in healthy cells but exposed in a high percentage of human carcinomas and other neoplastic tissues (4,5). The effect of the lectins that bind the T antigen on cell proliferation can be very pronounced and very different. For instance, the peanut agglutinin (Arachis hypogaea agglutinin) (PNA) stimulates the proliferation of human intestinal epithelial cells (6), whereas jacalin has the opposite effect, i.e. strong inhibition of cell growth (7). ABL has the remarkable property of reversibly inhibiting the proliferation of malignant epithelial cell lines without any apparent cytotoxicity for normal cells (8). This effect is thought to be a consequence of the selective blocking by ABL of nuclear localization sequence-dependent protein import, which is essential for cell functioning (9,10). More recently, it was shown that the lectin binds in the cytoplasm to a truncated form of oxygen-regulated protein 150 (Orp 150) that expresses the sialylated form of the ABL ligand, sialyl-2,3-galactosyl-β1,3-N-acetylgalactosamine, and appears to be essential for nuclear localization sequence-dependent nuclear protein import (11).
ABL is a homotetramer with a molecular mass of 64,000 Da and has a sequence that does not show any significant similarities to any of the other lectins that bind the T antigen studied thus far by x-ray diffraction (12). Four different isoforms have been isolated, but the structural differences among them are still unclear (13). The structural requirements of carbohydrates and glycotopes to bind to what is believed to be a single site per monomer have also been extensively investigated, and there is evidence that ABL behaves differently from the other lectins that bind the T-antigen disaccharide (14,15). The x-ray structure of the ABL molecule described here reveals that the protein has a new fold, predicted to be present in other proteins of fungal origin, that is quite different from those of the other lectins of known three-dimensional structure that bind the T antigen. It also shows that the quaternary structure of the tetramer presents 222 (D2) point group symmetry. A most unexpected result is the discovery of a second carbohydrate-binding site that specifically binds N-acetylglucosamine. These two binding sites present in each of the four monomers of the tetramer appear to be independent from each other and are able to distinguish two monosaccharides that differ only in the configuration of a single epimeric hydroxyl.

MATERIALS AND METHODS
Protein Purification and Crystallization-ABL was purified from A. bisporus fruiting bodies by affinity chromatography in a column of human erythrocytic stroma incorporated into a polyacrylamide gel (16) as described elsewhere (17). The four isoforms were separated by preparative isoelectric focusing, and the most basic apoprotein was crystallized either by vapor diffusion in hanging drops or in microdialysis cells using as precipitant 4 M sodium formate buffered with 0.02 M Tris-HCl, pH 8.0. The crystals of the apoprotein are orthorhombic, space group C222₁, with unit cell parameters a = 91.9 Å, b = 96.7 Å, and c = 75.4 Å, and contain two monomers in the asymmetric unit (18). A second crystal form was grown in the presence of N-acetylglucosamine using the vapor diffusion method by mixing equal volumes of the protein solution and a solution of 15% polyethylene glycol 8000, 0.5 M lithium sulfate, and 5% isopropanol as the precipitant. These crystals belong to the tetragonal space group P4₃2₁2 with unit cell parameters a = 85.5 Å and c = 257.2 Å and contain four monomers in the asymmetric unit.
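As a plausibility check on the stated contents of the two asymmetric units, the Matthews coefficient can be estimated from the unit cell parameters. The sketch below is not part of the original analysis. It assumes eight general positions for each of the two space groups, takes the monomer mass of 16,053.8 Da derived from the sequence elsewhere in the paper, and uses the common approximation for solvent content (1 − 1.23/Vm). The function names are hypothetical.

```python
# Matthews coefficient sketch (illustrative only, not the authors' calculation).
# Assumptions: 8 general positions per unit cell for both space groups, and a
# monomer mass of 16,053.8 Da taken from the sequence-derived value in the paper.
MONOMER_MASS_DA = 16053.8


def matthews_coefficient(cell_volume_a3, n_sym_ops, monomers_per_asu,
                         monomer_mass_da=MONOMER_MASS_DA):
    """Return Vm in cubic angstroms per dalton."""
    return cell_volume_a3 / (n_sym_ops * monomers_per_asu * monomer_mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction for a typical protein density."""
    return 1.0 - 1.23 / vm


# Orthorhombic form: all cell angles are 90 degrees, so V = a * b * c.
ortho_volume = 91.9 * 96.7 * 75.4
# Tetragonal form: a = b, so V = a * a * c.
tetra_volume = 85.5 * 85.5 * 257.2

vm_ortho = matthews_coefficient(ortho_volume, 8, 2)
vm_tetra = matthews_coefficient(tetra_volume, 8, 4)

for name, vm in (("orthorhombic", vm_ortho), ("tetragonal", vm_tetra)):
    print(f"{name}: Vm = {vm:.2f} A^3/Da, solvent = {solvent_fraction(vm):.0%}")
```

Under these assumptions both crystal forms give values of roughly 2.6 and 3.7 Å³/Da, within the range usually seen for protein crystals, which is compatible with two and four monomers per asymmetric unit, respectively.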
Data Collection-The diffraction data were collected from crystals frozen at 100 K after a brief immersion in a mixture of 70% of the mother liquor and 30% glycerol. The data were initially obtained using copper Kα radiation from a Rigaku RU-300 rotating anode x-ray generator with either a Rigaku R-axis II or a Mar345 imaging plate area detector. The data used for the final refinement of the apoprotein were collected at the XRD1 beamline of the Elettra synchrotron in Trieste (λ = 1.00 Å). Two data sets, at high and low resolution, were collected from the same crystal using a Mar charge-coupled device area detector. The two heavy atom derivatives were prepared by overnight soaking of a crystal in mother liquor with the addition of the two compounds at a final concentration of ~1 mM. The co-crystals with the different monosaccharides and disaccharides were prepared by soaking crystals of the apoprotein in mother liquor saturated with the carbohydrates. The data for the tetragonal crystal form were collected at the ID23 beamline of the European Synchrotron Radiation Facility (Grenoble, France). The data were indexed, integrated, and reduced using the programs MOSFLM (19), AUTOMAR, and Scala (20). The diffraction data statistics of the main selected data sets are summarized in Table I.
Structure Determination-Initial phases to 2.5 Å resolution were determined by multiple isomorphous replacement with the two heavy atom derivatives of the orthorhombic form. The major osmium site was located in a difference Patterson map (21) and refined using the program MLPHARE (20). The single isomorphous derivative phases were used to locate the most significant gold site in the difference Fourier map. These two major sites (one site from each of the two derivatives) were used as input for the program autoSHARP (22), which was used to locate the minor sites of the two derivatives, and for density modification and final phasing. The electron density map thus produced was of excellent quality and could be readily interpreted. The initial model of the apoprotein was built in the high quality map at 2.5 Å resolution using the program O (23). Model building proceeded without difficulty from Thr 2 to Gly 133 following the sequence from nucleic acid available at the ExPASy server (12). At this point, two facts became evident: the first was that the electron density did not match the published sequence, and the second was that there was no electron density in the map beyond amino acid 143 (the published sequence is 154 amino acids long). Both facts could be easily explained as described below, and therefore a 142-amino acid polypeptide chain (from Thr 2 to Gly 143) was built for both monomers in the asymmetric unit. This model was initially refined using the program Crystallography and NMR System (24) without imposing non-crystallographic symmetry. Subsequent refinement using the high resolution synchrotron data was carried out with the program REFMAC (25). During the process of refinement and model building, the quality of the model was controlled with the program PROCHECK (26).
Solvent molecules were added to the model in the final stages of refinement according to hydrogen bond criteria and only if their B factors refined to reasonable values and if they improved the R free value. The final model of the apoprotein contains 2270 nonhydrogen protein atoms and has very reasonable geometry (see Table I), with 89.2% of the residues in the most favored regions of the Ramachandran plot, and the remaining 10.8% of the residues in the additionally allowed region. The monosaccharides and disaccharides in the co-crystals were modeled into difference Fourier maps phased by the refined, unliganded structure. The models of the complexes were refined with REFMAC using the same criteria followed in the refinement of the apoprotein. The final statistics of three selected orthorhombic co-crystals are given in Table I. The tetragonal crystal form, grown in the presence of N-acetylglucosamine, was solved by molecular replacement with the program MOLREP (27). The model of this form, as well as that of its co-crystals with the T antigen, was refined using the program REFMAC.

RESULTS AND DISCUSSION
Amino Acid Sequence-The 2.5 Å high quality electron density map of the orthorhombic crystals was very straightforward to interpret from Thr 2 to Gly 133 in terms of the translated sequence of the cDNA coding for the protein (12). However, when this point in the chain trace was reached, it became evident that there was no longer correspondence between electron density and amino acid sequence and also that the chain appeared to be shorter than predicted by the published sequence. The C-terminal portion of the polypeptide chain, based on the cDNA sequence and starting with amino acid number 131, was reported to be the following: 130-TEGIISRPISSSDKCFIRLPSQKS-Stop. This amino acid sequence corresponds to the following nucleotide sequence: 390-acc-gaa-ggg-ata-atc-tca-agg-cca-atc-tca-tca-tcg-gat-aag-tgc-ttt-atc-cgc-cta-ccg-tct-cag-aaa-tca-tga-[Stop].
If, after the first six nucleotides, at the point where the sequence presents three guanines, a fourth guanine is introduced, the new nucleotide sequence becomes 390-acc-gaa-ggg-gat-aat-ctc-aag-gcc-aat-ctc-atc-atc-gga-taa-[Stop], and, accordingly, the new amino acid sequence is 130-TEGDNLKANLIIG-Stop, which is 143 rather than 154 amino acids long and totally different from amino acids 134 to 143. The electron density of both chains in the asymmetric unit of the high resolution map of the apoprotein fits very well to this new sequence, and therefore, the model was built accordingly. The refinement at 1.5 Å resolution strengthened this interpretation, and the maps calculated with the phases from the final model confirmed the absence of electron density beyond amino acid 143. This result was further validated in all the co-crystals of the protein examined. The possibility that the nucleotide chromatogram should have been interpreted in terms of four rather than three guanines is not the only one that will yield the amino acid sequence compatible with the experimental electron density, but it is the most likely. Another point where there is a discrepancy is in amino acid number 64, an Ile (atc) according to the published sequence and a Ser (agc) as judged by the electron density in the maps. The predicted molecular mass of a monomer with the sequence used to fit the electron density maps from Thr 2 to Gly 143 is 16,053.8 Da, a value in very good agreement with the experimental results of mass spectrometry measurements (see the supplemental data). The sequence similarity with other members of the fungal saline soluble lectin family is also preserved in the C-terminal portion of the polypeptide chain used to fit the electron density maps (see Fig. 5).
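The proposed frameshift can be checked by translating the two nucleotide sequences quoted above. The following is a minimal sketch, assuming the standard genetic code. The helper name `translate` and the variable names are illustrative and not taken from the original work.

```python
# Translate the published and the corrected C-terminal coding sequences.
# Assumption: standard genetic code, written in the conventional TCAG ordering.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nucleotides):
    """Translate in frame from the first base and stop at the first stop codon."""
    peptide = []
    for pos in range(0, len(nucleotides) - 2, 3):
        residue = CODON_TABLE[nucleotides[pos:pos + 3]]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


# Published sequence, starting at the codon for amino acid 131.
published = (
    "acc-gaa-ggg-ata-atc-tca-agg-cca-atc-tca-tca-tcg-gat-"
    "aag-tgc-ttt-atc-cgc-cta-ccg-tct-cag-aaa-tca-tga"
).replace("-", "")

# Insert one extra guanine into the run of three guanines (after base 9).
corrected = published[:9] + "g" + published[9:]

first_residue = 131
published_peptide = translate(published)
corrected_peptide = translate(corrected)

print(published_peptide, first_residue + len(published_peptide) - 1)
print(corrected_peptide, first_residue + len(corrected_peptide) - 1)
```

The first line reproduces the published C terminus ending at residue 154, and the second gives the shorter chain ending at Gly 143. Because the insertion falls within a run of identical bases, its exact position within the run does not affect the result.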
Overall Structure of the Monomer-The final model of the apoprotein comprises 142 amino acid residues for each of the two monomers present in the asymmetric unit of the orthorhombic crystal form (see Table I). The maps do not show electron density for Met 1. The two molecules in the asymmetric unit are related by a non-crystallographic dyad with an r.m.s.d. of 0.25 Å calculated over 142 Cα pairs of equivalent residues. The most important difference between the monomers is the presence in monomer A of a 3₁₀ helix spanning residues 28-30, in a region where the other monomer presents a β turn. A monomer of ABL fits into a box with the approximate dimensions 45 × 40 × 25 Å. The maps do not show electron density other than that of the amino acid side chains in any of the three potential O-glycosylation sites or the N-glycosylation site that could be present according to the amino acid sequence.
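For readers unfamiliar with the r.m.s.d. figure quoted for the two monomers, the quantity is the square root of the mean squared distance between equivalent atoms after superposition. The sketch below assumes the two coordinate sets are already superimposed and paired. The coordinates are toy values, not ABL data, and the function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: every atom of the second set is displaced by 0.25 A along z.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
toy_b = [(0.0, 0.0, 0.25), (1.0, 0.0, 0.25), (0.0, 1.0, 0.25)]
toy_rmsd = rmsd(toy_a, toy_b)
print(f"{toy_rmsd:.2f}")
```

In practice the optimal superposition is computed first (for example by a least-squares fit); that step is omitted here.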
The ABL monomer is a single domain structure organized as a β sandwich (Fig. 1A) with six strands of β chain in the first sheet (strands H, I, J, A, D, and C) and four strands in the second sheet (strands B, E, F, and G). The first sheet is of the mixed type, and the second sheet is antiparallel. The topology of the first sheet is −2X 1 4 −1 −1, and that of the second sheet is −1 −1 −1 (Fig. 1B). A helix-loop-helix motif (spanned by residues 90-108), packed against the second sheet and found between strands G and H, connects the two sheets. The space between the two sheets is filled mostly with hydrophobic side chains. The external surface of the second β sheet and the helix-loop-helix motif form a pocket filled with the side chains of Trp 26, Trp 77, Tyr 28, Tyr 98, His 72, Asn 94, and Arg 107. The T-antigen disaccharide binds at the edge of this pocket, interacting with the residues of the chain connecting the two sheets through the loops present between strands B and C and D and E and the β turn between strands F and G of the second sheet. A hairpin at the other edge of the first sheet, which is formed by strands C and D, is found protruding away from the body of the monomer and the second sheet. A very clear salt bridge is formed at the interface between the two sheets between Glu 36 of strand C of the first sheet and Arg 23 of strand B of the second sheet. The overall shape of the monomer is thus that of a central roughly cylindrical body with a hairpin protruding in one direction and the helix-loop-helix motif protruding in the opposite direction. The only Cys present in the sequence, Cys 78, is found in strand G of the second sheet, and its side chain does not point toward the pocket but toward the space between the two sheets.
Quaternary Structure-The asymmetric unit of the orthorhombic crystals of ABL is a dimer in which each monomer buries 1178 Å² of solvent-accessible area, 16% of 7369 Å², the total surface area of a monomer. This value is consistent with a physiological role for the dimer (30,31). The contacts between the two monomers in the crystallographic asymmetric unit are established through the two strands B present at the edge of the second β sheet. The two four-stranded sheets, although in contact with each other, do not form a single eight-stranded sheet in the dimer in the manner that has been observed quite frequently in legume lectins and many other oligomeric proteins (28). Two hydrogen bonds are established between the nitrogen and the carbonyl of Arg 23 of one monomer and those of the other monomer related to it by the non-crystallographic dyad. In addition to these hydrogen bonds between the two strands, another contact is established between the side chains of Glu 22 of one chain and Asn 25 of the other. Another contact is established between strand B of one monomer and strand C of the other sheet in the other monomer: the side chains of Arg 19 in one monomer and Asp 35 of the other monomer are in contact with each other. These contacts between monomers force the two Arg 23 residues into positions that are quite close to one another (and also very close to the non-crystallographic dyad), but the positive charge of each arginine is neutralized by the negative charge of Glu 36 of the same monomer.
All the biochemical evidence available is consistent with the existence of an ABL tetramer under physiological conditions (29). The solvent-accessible area of the asymmetric unit and those of the symmetry-related dimers were calculated and used to identify the other dimer in the physiological tetramer. The solvent-accessible area of the asymmetric unit is 12,381 Å², which reduces to 10,587 Å² upon tetramer formation. Thus, the area of each dimer that becomes buried upon tetramer formation is 1794 Å², about 14.5% of the total surface and a value that is perfectly in line with those observed for other physiological oligomers (30,31). A tetrameric assembly with an open structure in which there are two dyads that form an angle of 73° and −73° has been described for PNA, another lectin that binds the T antigen (32). As shown in Fig. 1, C and D, the ABL tetramer is an object with 222 (D2) symmetry.
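The buried-surface figures quoted for the dimer and the tetramer can be cross-checked with simple arithmetic. The sketch below only restates the numbers given in the text; the variable names are illustrative.

```python
# Solvent-accessible areas quoted in the text, in square angstroms.
monomer_area = 7369.0
buried_per_monomer_in_dimer = 1178.0
dimer_area = 12381.0
dimer_area_in_tetramer = 10587.0

# Fraction of a monomer surface buried on dimer formation.
dimer_fraction = buried_per_monomer_in_dimer / monomer_area

# Area of each dimer buried on tetramer formation, and its fraction.
buried_per_dimer = dimer_area - dimer_area_in_tetramer
tetramer_fraction = buried_per_dimer / dimer_area

# Consistency check: two monomers, each losing its buried area, should
# approximately give the accessible area of the dimer.
expected_dimer_area = 2 * (monomer_area - buried_per_monomer_in_dimer)

print(f"dimer interface: {dimer_fraction:.1%} of a monomer")
print(f"tetramer interface: {buried_per_dimer:.0f} A^2, {tetramer_fraction:.1%} of a dimer")
print(f"dimer area from monomers: {expected_dimer_area:.0f} A^2")
```

The dimer area reconstructed from the monomer values agrees with the quoted 12,381 Å² to within rounding.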
The interaction between dimers in the tetramer is characterized by the contacts established between each monomer and both members of the other dimer. This fact reinforces the contacts between dimers, which are not particularly extensive in each of the two other interfaces. The point of closest contact at a second interface between monomers is found between the two strands A of the first β sheet: the nitrogen of Tyr 10 is in contact with the carbonyl of Gln 11 of the other monomer, and the nitrogen of Gln 11 is in contact with the carbonyl of Tyr 10 of the other chain. The corresponding side chains are, however, almost 5 Å from each other. The other relevant contacts between monomers in this region are hydrophobic.
The interface with the other monomer in the second dimer is at the point where all the hairpins formed by strands C and D of the first sheet are close to each other in the center of the molecule (Fig. 1C). In particular, the carbonyl of Glu 36 is in contact with the nitrogen of Arg 38 of the other monomer and vice versa. Another important contact is established between the side chains of Arg 38 and Asp 35 of the other monomer.
Overall, these interactions appear to be less extensive than those observed within the dimer in the asymmetric unit (Fig. 1D), but they are strengthened by the fact that they occur repeatedly. There are no reports concerning the existence of the dimer as a stable unit.
In conclusion, the quaternary structure of the ABL molecule can be described as a dimer of dimers with 222 symmetry. Fig. 1C is a stereodiagram of the tetramer viewed looking down one of the crystallographic dyads, and Fig. 1D shows it rotated 90°, i.e. looking down the other crystallographic dyad.
T-antigen Binding Site-The remarkable antineoplastic properties of ABL are due to the selectivity with which it binds the T-antigen disaccharide moiety, i.e. Galβ1-3GalNAc. Therefore the T-antigen and benzyl T-antigen disaccharide were the first two molecules to be tested by soaking the orthorhombic crystals of the apoprotein in the original mother liquor asymmetric unit, but careful analysis of the maps showed the presence of extra electron density in one of the two binding sites that could not be ascribed to the ligands. Several alternatives to the original mother liquor as well as to the cryoprotectant were tested, and the extra density was identified as a Tris molecule. To eliminate the possible interference of one or more ingredients of the mother liquor (18), for all subsequent soaking experiments, the orthorhombic crystals were first transferred to a solution of 30% polyethylene glycol 4000, 0.1 M sodium acetate buffered with 0.1 M borate, pH 8.5. A test with the two disaccharides mentioned above showed that ligand binding in this solution was identical to that observed in the original mother liquor, but the Tris molecule was not present in the electron density maps.
Four monosaccharides and seven disaccharides were tested by preparing co-crystals and examining electron density maps of the binding site. The only monosaccharide that showed very clear electron density in the T-antigen binding site was N-acetylgalactosamine. Crystals soaked in galactose or glucose did not show any electron density anywhere in the maps, and crystals soaked in N-acetylglucosamine did not diffract at all after a soaking time of longer than ~2 h, but shorter soaking times (of about 0.5 h) revealed that N-acetylglucosamine binds at a totally different site (see below). In addition to the T-antigen and benzyl T-antigen disaccharides, three other disaccharides displayed very clear electron densities at this binding site: the closely related p-nitrophenyl T-antigen disaccharide, the T antigen (Galβ1-3GalNAc-α-O-Ser), and lacto-N-biose (Galβ1-3GlcNAc). Lactose and N-acetyllactosamine did not appear to bind in the crystals under these conditions. The structure determination statistics of the co-crystals with the T antigen and lacto-N-biose (Galβ1-3GlcNAc), representative of the two classes of disaccharide that were found to bind at this site, are listed in Table I.
The binding of the four variants of the T-antigen disaccharide was identical for the Galβ1-3GalNAc moiety present in all of them. The only monosaccharide that binds at this site, GalNAc, does so in a position that overlaps completely with that moiety in the disaccharides, and the other disaccharide that binds in the crystals, lacto-N-biose (Galβ1-3GlcNAc), overlaps with the T-antigen variants with the exception of the only different epimeric hydroxyl, OH4. Therefore the main interactions at this binding site can be illustrated by discussing the contacts established by the T antigen (Galβ1-3GalNAc-α-O-Ser) and the protein (Fig. 2A).
The T-antigen binding site is a shallow depression delimited by the loops connecting strands B→C, D→E, and F→G on one side and the helix-loop-helix motif on the other. Following a standard convention, we will use the numbering of GalNAc, Gal, and Ser to describe the interactions of the T antigen with the protein. The two most important interactions of the disaccharide are with Ser 48 and Asn 73. The OG of Ser 48 interacts with O7 of GalNAc and its carbonyl with O2 of Gal, whereas the OD of Asn 73 interacts with N2 and its N with O7 of GalNAc. Single interactions are observed between the carbonyl of Gly 49 and O4 of GalNAc and between the nitrogen of Ala 29 and O4 of Gal.
The ND1 of His 72 and NH1 of Arg 107, positioned at the end of strand F and in the middle of the first helix, interact with O4 and O5 of GalNAc through a bridge with two water molecules; O2 and O3 of Gal are coordinated with another water molecule. Relevant hydrophobic contacts are established with Tyr 28 and, to a lesser extent, with Tyr 74 and Tyr 98. The Ser moiety of the T antigen does not form any hydrogen bonds with the protein and is found protruding into the solvent. The side chains that are closest to it are those of His 72 and Asn 73. The distances for the contacts described above are all very similar in all the co-crystals examined. The most important of these interactions are illustrated in Fig. 2 for two disaccharides and for N-acetylgalactosamine.
N-Acetylglucosamine Binding Site-When crystals of the orthorhombic form of ABL were soaked in mother liquor saturated with N-acetylglucosamine for periods of no longer than ~30 min and frozen immediately after that at 100 K, they almost completely retained their integrity and isomorphism. Longer soaking times caused severe deterioration of the crystalline order, and after about 2 h, the diffraction patterns were totally abolished. Difference electron density maps revealed that, in the crystals soaked for 30 min, there was extra electron density present at a site that was totally different from the T-antigen binding site. The density could be clearly interpreted in terms of a single molecule of N-acetylglucosamine bound per dimer of ABL. Crystal breakdown is caused by the interference of N-acetylglucosamine with crystal packing contacts between the second ABL monomer and a symmetry-related protein molecule. This result was observed consistently, not only with crystals of the apoprotein but also with co-crystals of the T-antigen disaccharide and with N-acetylgalactosamine. Therefore, a crystal screen was set up with solutions of ABL that contained 20 mM N-acetylglucosamine, and two new crystal forms were found. The first diffracted only to a resolution of about 3.5 Å, but the second, a tetragonal form, diffracted to 2.4 Å resolution. This second form, which contains one tetramer of ABL in the asymmetric unit, was solved by molecular replacement, and it confirmed the presence of the N-acetylglucosamine binding site in all four monomers of the asymmetric unit. The site is identical to that identified in the orthorhombic form. Soaking of these crystals in solutions containing the T antigen revealed that the presence of N-acetylglucosamine did not interfere with the binding at the other site. The last two columns of Table I list the structure determination statistics of the two crystal forms with N-acetylglucosamine in this second carbohydrate-binding site.
The N-acetylglucosamine binding site is delimited by residues from strand G of the second β sheet, strand H of the first β sheet, and the second of the two helices of the helix-loop-helix motif. The two sugar binding sites of the protein are thus situated on the two sides of the motif as shown in Fig. 1A. The main interactions of N-acetylglucosamine are with Thr 82 and Asp 79, which appear to be responsible for the specificity of the site for N-acetylglucosamine (Fig. 3A). Thr 82 makes a hydrogen bond with O7 of the N-acetyl group, which explains why glucose does not bind at this site. The two carboxylate oxygens of Asp 79 make contacts with O6 and O4, which point in the same direction in this sugar and in opposite directions in N-acetylgalactosamine, which explains why the latter does not bind at this site. Other contacts are established with Arg 103, placed at the beginning of the second helix, and Tyr 114. The former is in contact with O5 and O6, and the OH of the latter is at ~2.6 Å from O3 of the sugar and about 3.0 Å from O4. The carbonyl of Ile 80 makes a hydrogen bond with O3 of the sugar. A water molecule is seen in contact with O4 in all the co-crystals of the orthorhombic form, but not in those of the tetragonal form (fewer water molecules were included in these models because of the lower resolution). The distances for all the contacts described above are very similar in all the co-crystals examined, and no conformational changes in the binding site are detected between the free and ligand-bound protein. The main interactions of crystals that contain in every case N-acetylglucosamine in this site and selected ligands in the other site are illustrated in Fig. 3.
(38), and, more recently, the fungal galectin CGL2 (39). The T-antigen binding lectins have been classified into two groups on the basis of the monosaccharide that occupies the primary binding site (37).
According to this classification, ABL belongs to the category that has GalNAc binding at the primary site. The group also includes M. pomifera agglutinin, jacalin, and amaranthin; the first two lectins bind the disaccharide with the Gal moiety completely exposed to the solvent and interact with the carbohydrate only through the GalNAc moiety. We tested experimentally the importance of the Gal moiety in the binding of disaccharides to ABL by studying the binding of lacto-N-biose (Galβ1-3GlcNAc) to the lectin. In the case of this disaccharide, it is the Gal moiety that determines the specificity of binding to the T-antigen binding site; GlcNAc alone binds at the other site. Therefore, binding of the T antigen to ABL appears to be more similar to the binding in amaranthin than that in the other lectins of the group, and ABL is thus the not yet been studied in detail, we have found that the site does not bind glucose, galactose, or N-acetylgalactosamine in the crystalline state. Therefore, the site appears to be rather specific for N-acetylglucosamine. The architecture of the nonspecific sites that bind N-acetylglucosamine has been discussed in detail recently (41); overall, they are topologically quite different from that of the second binding site of ABL. The structure of Urtica dioica agglutinin, a small monomeric lectin that is specific for saccharides containing GlcNAc, was solved unliganded and in complexes with a trisaccharide and tetrasaccharide (42). In this case, a Ser and a Tyr are found to play central roles in binding the carbohydrates to the two slightly different sites, but there is also in both cases a strong aromatic component that seems to be totally absent in the second ABL binding site.
Structural Similarity with Other Proteins-The ABL fold has no structural similarity to other T-antigen disaccharide-binding proteins. A comparison of the amino acid sequence of these proteins with the corrected ABL sequence using the program CLUSTALW (43) reveals very low identity percentages: 18% for amaranthin (34) and M. pomifera agglutinin (44), 14% for PNA (45), 13% for jacalin (46), and 16% for LT-1. Similar results are obtained using the program ALIGN (47).
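The identity percentages reported above depend on how identity is defined over an alignment. As a minimal sketch, the function below counts identical residues over the columns in which neither sequence has a gap. This is one common convention and is an assumption here, since the exact definition used by the alignment programs is not stated in the text. The function name and the two short aligned strings are invented for illustration and are not ABL data.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical pre-aligned fragments: 7 comparable columns, 5 identical.
toy_identity = percent_identity("TEGD-NLK", "TEADQNIK")
print(f"{toy_identity:.1f}%")
```

Other conventions divide by the length of the shorter sequence or by the full alignment length, which yields somewhat different percentages for the same alignment.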
A search for three-dimensional similarity with the ABL coordinates in the Dali server (48) identifies 11 proteins that score above the significance similarity threshold. They are mostly β sandwiches and are molecules with very diverse functions. The protein with the highest score, more than twice the value of the second in the list, is the soluble form of equinatoxin II, a pore-forming toxin from the sea anemone Actinia equina (49). The equinatoxin II fold (Protein Data Bank accession code 1IAZ) is also a single domain β sandwich, but it has 12 β sheet strands, and its topology is totally different from that of ABL. The r.m.s.d. for 110 of a total of 175 Cα of the toxin compared with ABL is 2.7 Å.
A sequence similarity search in the ExPASy server identifies the members of a fungal saline-soluble lectin family as similar to ABL. The family includes the lectins from the parasitic nematode trapping fungus Arthrobotrys oligospora (50 -52), the two mushrooms Pleurotus cornucopiae (53) and Xerocomus chrysenteron (54), and the filamentous fungi Podospora anserina and Neurospora crassa used in many studies of fundamental cell biology (55).
The A. oligospora lectin was identified in the fungus as responsible for the capture of the host that takes place through its interaction with a receptor bearing the specific carbohydrate present on the nematode surface (50). This mechanism of adhesion to host surfaces is believed to be present in many other fungi, both parasitic and symbiotic. The lectin binds not only the T-antigen disaccharide but also sulfated glycoconjugates and phospholipids (51).
The gene sequence encoding two fruit body lectins of P. cornucopiae was reported recently (53). They are very similar to one another, and the proteins they code for clearly belong to the saline-soluble lectin family. Although their physiological function and ligand binding specificity are not known, they are believed to be involved in the defense mechanism of the fungus.
X. chrysenteron is an edible mushroom of the boletus family that possesses remarkable insecticidal properties. The lectin responsible for this property was suggested as a likely candidate to confer pest resistance to transgenic plants (54). The presence of genes coding for proteins of the saline-soluble lectin family in the filamentous fungi P. anserina and N. crassa gives a clear indication that this lectin family is very widely distributed among fungi (55). Fig. 5 compares the sequences of ABL with those of five members of the fungal saline-soluble lectin family. The residues involved in T-antigen binding to ABL are indicated with a T, and those involved at the N-acetylglucosamine-binding site are indicated with an N. Only one of the two P. cornucopiae lectins was included in the comparison because the two forms characterized differ in only a few amino acids. The sequence similarities in the family are evident, as is the conservation (or substitution by acceptable alternatives) of the residues involved in T-antigen disaccharide and N-acetylglucosamine binding. Therefore, not only can the ABL fold be proposed as characteristic of the entire family, but the presence of two saccharide binding sites can also be postulated for the other members of the family.
The crystal structure of ABL defines the fold for a new family of lectins and provides the basis for understanding their interactions with the family of the T-antigen-derived molecules. Although the existence of two distinct binding sites with different binding specificities was documented for another fungal lectin, Psathyrella velutina lectin (molecular mass of a monomer, 40,000 Da; specificity for heparin/pectin and N-acetylglucosamine/N-acetylneuraminic acid; Ref. 56), the occurrence of two binding sites in a single domain, which can discriminate between two monosaccharides differing only in the configuration of a single epimeric hydroxyl, is reported here for the first time. The presence in ABL of the two types of sites with different specificities explains the conflicting evidence in the literature for the binding of different saccharides based on ligand competition studies. The complex carbohydrate binding specificity reported for many other lectins might have, in some cases, an equally simple explanation. The physiological role of different sites in lectins discriminating with this level of subtlety is unknown, as is the prevalence of this phenomenon in lectins from other living organisms.