C-type Lectin-like Carbohydrate Recognition of the Hemolytic Lectin CEL-III Containing Ricin-type β-Trefoil Folds*

CEL-III is a Ca2+-dependent hemolytic lectin, isolated from the marine invertebrate Cucumaria echinata. The three-dimensional structure of CEL-III/GalNAc and CEL-III/methyl α-galactoside complexes was solved by x-ray crystallographic analysis. In these complexes, five carbohydrate molecules were found to be bound to two carbohydrate-binding domains (domains 1 and 2) located in the N-terminal 2/3 portion of the polypeptide and that contained β-trefoil folds similar to ricin B-chain. The 3-OH and 4-OH of bound carbohydrate molecules were coordinated with Ca2+ located at the subdomains 1α, 1γ, 2α, 2β, and 2γ, simultaneously forming hydrogen bond networks with nearby amino acid side chains, which is similar to carbohydrate binding in C-type lectins. The binding of carbohydrates was further stabilized by aromatic amino acid residues, such as tyrosine and tryptophan, through a stacking interaction with the hydrophobic face of carbohydrates. The importance of amino acid residues in the carbohydrate-binding sites was confirmed by the mutational analyses. The orientation of bound GalNAc and methyl α-galactoside was similar to the galactose moiety of lactose bound to the carbohydrate-binding site of the ricin B-chain, although the ricin B-chain does not require Ca2+ ions for carbohydrate binding. The binding of the carbohydrates induced local structural changes in carbohydrate-binding sites in subdomains 2α and 2β. Binding of GalNAc also induced a slight change in the main chain structure of domain 3, which could be related to the conformational change upon binding of specific carbohydrates to induce oligomerization of the protein.

CEL-III is a hemolytic lectin isolated from the sea cucumber Cucumaria echinata (1,2). This lectin binds to carbohydrates containing Gal/GalNAc at nonreducing ends in the presence of Ca2+. CEL-III exhibits the highest affinity for GalNAc, followed by lactose, lactulose, and methyl β-galactoside (Me-β-Gal) among the carbohydrates tested (3). After binding to cell surface carbohydrate chains, CEL-III oligomerizes to form membrane pores, thereby leading to colloid osmotic rupture of the cell membrane (4). In addition to hemolytic activity, this lectin exhibits strong cytotoxicity for some cultured cell lines, which is also caused by formation of oligomers in the cell membrane (5,6). Such a cell membrane-damaging action is also known for several bacterial pore-forming toxins (7), such as α-hemolysin from Staphylococcus aureus (8,9), aerolysin from Aeromonas hydrophila (10,11), and the anthrax toxin from Bacillus anthracis (12). These toxins exert their pore-forming action through conformational changes that lead to oligomerization in the target cell membranes after binding to specific cell surface receptors. Recently, some hemolytic lectins have also been reported (13-16). The importance of oligomerization in the target cell membranes is also suggested for these lectins.
From the cDNA nucleotide sequence, it was inferred that CEL-III is composed of three domains as follows: two N-terminal carbohydrate-binding domains (domains 1 and 2) and a C-terminal domain (domain 3) (17). The carbohydrate-binding domains have a relatively low but definite similarity with the carbohydrate-binding domains of ricin-type (R-type) lectins, such as the B-chains of the toxic plant lectins ricin (18) and abrin (19). On the other hand, although ricin-like plant lectins contain toxic subunits, namely the A-chains that inactivate the 60 S ribosomal subunits in eukaryotic cells, CEL-III has domain 3 that contains hydrophobic segments and may be involved in the oligomerization in the target cell membrane. In fact, we have observed that domain 3 fragments, once produced by limited digestion with trypsin, spontaneously associate to form oligomers in solution. The resulting domain 3 oligomer exhibited a marked increase in β-sheet content, as measured by circular dichroism spectra (20). This is consistent with an increase in β-sheet structure of the entire CEL-III oligomer (4). Therefore, it seems reasonable to conclude that oligomerization of CEL-III is mainly mediated by interactions between its domain 3 after binding to cell surface carbohydrates. We have solved the crystal structure of CEL-III (21), which confirmed the domain structure as suggested from the amino acid sequence; domains 1 and 2 adopt a β-trefoil structure, each consisting of three subdomains or motifs. They showed apparent similarity with the B-chains of ricin and abrin despite their relatively low sequence identity. The basic folds of the carbohydrate-binding domains of CEL-III and ricin B-chain are similar, but there is a significant difference in that CEL-III contains Ca2+ ions at its carbohydrate-binding sites, whereas the ricin B-chain does not.
Although Ca2+ is known to be essential for interaction with carbohydrates in some lectins, such as C-type lectins (22-26), β-trefoil lectins generally bind carbohydrates without Ca2+ ions. To elucidate the carbohydrate-binding mechanism and the role of bound Ca2+ ions, we determined the x-ray crystal structures of CEL-III/GalNAc and CEL-III/methyl α-galactoside (Me-α-Gal) complexes. The results revealed that CEL-III recognizes specific carbohydrates in a manner very similar to that of C-type lectins despite its structural similarity with R-type lectins. It was also observed that the binding of the carbohydrates induced local structural changes in carbohydrate-binding sites and domain 3 that could be related to the conformational change upon binding to cell surface carbohydrates, which leads to oligomerization of the protein in cell membranes (4).

EXPERIMENTAL PROCEDURES
Materials-Oligonucleotides used in this study were purchased from Exigen (Tokyo, Japan). Ex Taq™ DNA polymerase and the DNA ligation kit were obtained from Takara (Otsu, Japan) and used as recommended by the supplier. Restriction endonucleases and DNA-modifying enzymes were purchased from MBI Fermentas (Burlington, Canada). The plasmid vectors used in this work were as follows: pGEM-T Easy vector from Promega and pET-32a expression vector from Novagen. All other chemicals were of analytical grade for biochemical use.
Purification and Crystallization of CEL-III/Carbohydrate Complexes-CEL-III was purified from C. echinata according to a method reported previously (2). The proteins extracted from a homogenate of C. echinata were applied to a lactose-Cellulofine column equilibrated with 0.15 M NaCl, 10 mM Tris-HCl, pH 7.5 (TBS), containing 10 mM CaCl2. Adsorbed lectins (CEL-I, CEL-III, and CEL-IV) were eluted with TBS containing 20 mM EDTA. The lectins were then separated using a GalNAc-Cellulofine column, utilizing the differences in carbohydrate-binding specificities. After elution of CEL-III with TBS containing 0.1 M lactose, CEL-I and CEL-IV were eluted with TBS containing 20 mM EDTA. CEL-III was finally purified by gel filtration through Sephadex G-75 in TBS. Crystallization of CEL-III/carbohydrate complexes was done under conditions similar to those for native CEL-III crystals (21) in the presence of 10 mM carbohydrates. Briefly, the protein solution (7 mg/ml, 2-4 µl) in TBS containing 10 mM CaCl2 was mixed with the same amount of reservoir solution (12% (w/v) polyethylene glycol 8000, 100 mM BisTris-NaOH, pH 6.5, and 200 mM magnesium acetate) and subjected to sitting drop vapor diffusion at 20°C. Diffraction images from the crystals were collected using synchrotron radiation on beamline BL44XU at SPring-8 (Hyogo, Japan) at 100 K using an imaging plate detector, DIP6040 (MAC Science, Japan). Diffraction images were indexed and integrated using the program Mosflm (27) and processed using the CCP4 programs (28) Scala and Truncate. Data collection statistics are summarized in Table 1.
Structure Determination and Refinement-The crystal structures of CEL-III/carbohydrate complexes were solved by the molecular replacement method using native CEL-III (Protein Data Bank code 1VCL) (21) as a search model. Molecular replacement was performed using the program Phaser (29). The model was refined using the program Refmac (30) from the CCP4 suite (28). Manual fitting of the model was carried out with the programs Xfit (31) and Coot (32). The quality of the final models for native and complexed CEL-III was assessed by Ramachandran plots and analysis of the model geometry with the program Procheck (33). The refinement statistics are summarized in Table 1. Figures for protein models were drawn with PyMOL (34).
Expression of the Carbohydrate Recognition Domain 1 of CEL-III in E. coli Cells-To obtain a cDNA fragment encoding domain 1 (residues 1-156) of CEL-III, PCR was performed using CEL-III cDNA (17) (DDBJ Data Base accession number AB109017) as a template with forward primer 5′-CCATGGGACAAGTTTTGTGCACGAATCCA-3′ and reverse primer 5′-GGATCCTTAACCGTAGAACAGCTCTGGCCC-3′. The PCR products were ligated into the pGEM-T Easy vector. After confirmation of the DNA sequence, the DNA fragment was excised by digestion with NcoI and BamHI, and ligated into the expression vector pET-32a, previously digested with the same enzymes. The resulting plasmid pET-CRD1 was introduced into Escherichia coli BL21(DE3) CodonPlus RIL strain (Stratagene), and recombinant protein was induced with 1 mM isopropylthiogalactoside according to the supplier's instruction. After induction, the culture was incubated for an additional 5 h at 37°C. Cells were harvested and lysed with BugBuster protein extraction reagent (Novagen) and sonication. The lysate was centrifuged at 10,000 × g for 15 min, and the supernatant was loaded onto HisBind resin columns (Novagen). After washing of the column, recombinant protein was eluted with 20 mM Tris-HCl, pH 7.9, containing 1 M imidazole and 500 mM NaCl according to the supplier's manual. Protein solution was dialyzed against 10 mM Tris-HCl buffer, pH 7.5, containing 150 mM NaCl and 10 mM CaCl2.
The recombinant domain 1 thus obtained was investigated for its carbohydrate binding activity.
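The primer design described above can be checked with a short script. The following is a minimal sketch, not part of the original procedure: it takes the two primer sequences as printed in the text, looks for the recognition sequences of NcoI (CCATGG) and BamHI (GGATCC), and reads the reverse primer on the coding strand. The helper names (`reverse_complement`, `find_sites`) are illustrative, and reading the TAA upstream of the BamHI site as a stop codon placed after the last domain 1 codon is an interpretation made here, not a statement from the text.

```python
# Primer sequences copied from the text (written 5'->3').
FORWARD = "CCATGGGACAAGTTTTGTGCACGAATCCA"
REVERSE = "GGATCCTTAACCGTAGAACAGCTCTGGCCC"

# Recognition sequences of the two restriction enzymes named in the text.
SITES = {"NcoI": "CCATGG", "BamHI": "GGATCC"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq: str) -> dict:
    """Map each enzyme name to the 0-based index of its site in seq (-1 if absent)."""
    return {name: seq.find(site) for name, site in SITES.items()}


forward_sites = find_sites(FORWARD)
reverse_sites = find_sites(REVERSE)

# Coding-strand view of the reverse primer.
reverse_coding = reverse_complement(REVERSE)

# The three bases immediately upstream of the BamHI site on the coding strand.
# Assumption: this triplet is read here as the stop codon of the construct.
codon_before_bamhi = reverse_coding[-9:-6]

print(forward_sites, reverse_sites, reverse_coding, codon_before_bamhi)
```

Under this reading, the NcoI site at the 5′ end of the forward primer supplies the ATG, and the BamHI site of the reverse primer follows the stop codon, which is consistent with excising the fragment with NcoI and BamHI for ligation into pET-32a.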
Site-directed Mutagenesis-Site-directed mutagenesis was performed by the unique site elimination method (35) using a QuikChange site-directed mutagenesis kit (Stratagene). Nomenclature of domain 1 mutants and oligonucleotide primers used for mutagenesis are shown in Table 2. Mutations were introduced into the amplified cDNA fragment that had been subcloned into the pGEM-T Easy vector. After mutagenesis, the cDNA fragments were sequenced to verify the presence of the desired mutation. The mutant cDNA fragment was then recovered and ligated into the pET-32a expression vector. Expression and purification of all mutants were done by procedures identical to those described for the wild-type domain 1.
Examination of Carbohydrate Binding Activities of Domain 1 and Its Site-directed Mutants-Carbohydrate binding activity of recombinant domain 1 and its site-directed mutants was examined by affinity chromatography. Wild-type and mutant proteins (D23A, Y36A, D43A, Q44A, and Q45A), in which amino acid residues in the binding site of subdomain 1α were replaced by Ala residues, were applied onto a GalNAc-Cellulofine column (0.8 × 10 cm) equilibrated with TBS containing 10 mM CaCl2. After washing the column with the same buffer, adsorbed proteins were eluted with 100 mM lactose in the same buffer.

RESULTS
Crystallization and Structure Determination of CEL-III/Carbohydrate Complexes-Crystals were prepared under conditions similar to those for native CEL-III (21). We have tried to crystallize CEL-III complexed with several carbohydrates, such as GalNAc, lactose, lactulose, Me-β-Gal, Me-α-Gal, and galactose (3). Among these, CEL-III/GalNAc and CEL-III/Me-α-Gal complexes gave good diffraction patterns, although Me-α-Gal showed lower affinity than Me-β-Gal (3), and they were analyzed. Statistics of data collection and refinement parameters are summarized in Table 1. The space group of the CEL-III/Me-α-Gal crystal was P2₁ with unit cell parameters of a = 53.3 Å, b = 65.5 Å, c = 127.0 Å, β = 97.1°, and two CEL-III molecules were contained in the asymmetric unit, which is similar to native CEL-III crystals. On the other hand, CEL-III/GalNAc crystals belonged to P1 with unit cell parameters of a = 52.6 Å, b = 65.2 Å, c = 66.7 Å, α = 85.3°, β = 73.6°, γ = 89.9°, and contained two CEL-III molecules in the asymmetric unit. The structures of these CEL-III/carbohydrate complexes were solved by a molecular replacement method using native CEL-III as a search model. Five Ca2+ and two Mg2+ ions were assigned to the same positions as those for native protein, considering their electron density, coordination numbers, bond distances, and ligand atoms (21). As shown in Fig. 1 ... overall and carbohydrate-binding domain structure of the CEL-III/GalNAc complex is illustrated in Fig. 2.
(Table 2, footnote a) The primers with the reverse-complementary sequence of each primer were also used for mutagenesis.
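As a quick consistency check on the two crystal forms, the unit cell volumes can be computed from the cell parameters quoted above. The sketch below uses the general triclinic volume formula, which reduces to abc·sin β for the monoclinic cell. The function name `cell_volume` is illustrative. The count of molecules per cell combines the two molecules per asymmetric unit stated in the text with the number of general positions of each space group (two for P2₁, one for P1), which is standard crystallography rather than something stated in the text.

```python
import math


def cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit cell volume in cubic angstroms from edges (angstroms) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


# Cell parameters as quoted in the text.
v_p21 = cell_volume(53.3, 65.5, 127.0, beta=97.1)                       # CEL-III/Me-alpha-Gal, P2(1)
v_p1 = cell_volume(52.6, 65.2, 66.7, alpha=85.3, beta=73.6, gamma=89.9)  # CEL-III/GalNAc, P1

# Molecules per unit cell = general positions x molecules per asymmetric unit.
mol_p21 = 2 * 2
mol_p1 = 1 * 2

vol_per_mol_p21 = v_p21 / mol_p21
vol_per_mol_p1 = v_p1 / mol_p1

print(round(v_p21), round(v_p1), round(vol_per_mol_p21), round(vol_per_mol_p1))
```

With these inputs, the P2₁ cell comes out about twice the volume of the P1 cell, so the volume per CEL-III molecule is nearly the same in both forms (on the order of 1.1 × 10^5 Å^3), as expected if the two crystal forms pack the protein at comparable density.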
Overall Structure and Carbohydrate-binding Sites of the CEL-III/Carbohydrate Complex-In CEL-III/carbohydrate complexes, GalNAc and Me-α-Gal molecules were bound to the five subdomains, 1α, 1γ, 2α, 2β, and 2γ, in domains 1 and 2, whereas subdomain 1β did not bind carbohydrate molecules (Fig. 2 for the CEL-III/GalNAc complex). This is consistent with the fact that these subdomains, with the exception of 1β, have a very similar structure with Ca2+ ions. In fact, only subdomain 1β lacks amino acid residues directly involved in coordination with Ca2+ in other subdomains (the residues at a, b, and f positions in Fig. 3). At each carbohydrate-binding site, 3-OH and 4-OH of the bound carbohydrates are coordinated with a Ca2+ ion and simultaneously form hydrogen bonds with side chains of nearby amino acid residues. For example, in subdomain 1α, 4-OH forms two hydrogen bonds with the carboxyl side chain of Asp-23 (position a) and the amide NH of Gly-26 (position c), and 3-OH of the carbohydrates forms a hydrogen bond with Asp-39 (position e) (Fig. 4, A and B). Asp-39 also forms a hydrogen bond with a water molecule that is fixed by a coordinate bond with Ca2+ and hydrogen bonds with Asp-43 (position f). Such a carbohydrate recognition mode was essentially common in all the carbohydrate-binding sites, except for the residues at position e, which include Asp, Gln, and Glu residues (Fig. 3).
In the case of the CEL-III/GalNAc complex, the binding of both anomers of GalNAc to CEL-III was observed from their electron density map (Table 3), depending on the local environment of each binding site. For example, GalNAc bound in subdomain 1α was mostly the α-anomer, being stabilized by a hydrogen bond with the carboxamide side chain of Asn-410 of the neighboring CEL-III molecule in the crystal. Although GalNAc shows relatively high affinity for CEL-III among simple carbohydrates, there is no specific bonding between the acetamido group of GalNAc and protein atoms except for van der Waals contact. The average hydrogen bond distances for the bound carbohydrates and their anomeric state are listed in Table 3.
DECEMBER 28, 2007 • VOLUME 282 • NUMBER 52

JOURNAL OF BIOLOGICAL CHEMISTRY 37829
Besides the GalNAc molecules bound in domains 1 and 2, another GalNAc molecule was found in the pocket formed between domains 1 and 3 (Fig. 4C). Both CEL-III molecules in the asymmetric unit of the crystal contained GalNAc at this position. However, binding of this GalNAc was mediated by several hydrogen bonds with surrounding water molecules, which were fixed on the surface of the protein. Therefore, it seems reasonable to assume that this is nonspecific binding of GalNAc. In fact, no carbohydrate molecule was observed in this pocket of the CEL-III/Me-α-Gal complex. Fig. 5A shows the superimposition of the carbohydrate-binding site residues (subdomain 2γ) of CEL-III and the ricin B-chain. As seen in this figure, the orientation of GalNAc bound to CEL-III was very similar to that of the galactose moiety of lactose bound to the ricin B-chain when both proteins were superimposed based on the corresponding residues (Fig. 3). Stacking interaction of the hydrophobic face of the carbohydrates with aromatic side chains (Trp-269 in CEL-III and Tyr-248 in the ricin B-chain; position d) was also common in both proteins. However, recognition of 3-OH and 4-OH of GalNAc was essentially done by coordinate bonds with Ca2+ ions as well as hydrogen bond networks at the binding site in CEL-III, whereas recognition of 3-OH and 4-OH was achieved only by hydrogen bonds in ricin B-chain. Despite involvement of Ca2+ ions, amino acid residues involved in recognition of carbohydrates are mostly at the same positions in CEL-III and ricin B-chain. Among these, the residues at positions a, b, c, and f form coordinate bonds with Ca2+ in CEL-III. The residues at positions a and c also form hydrogen bonds with 4-OH of the carbohydrates, whereas the residue at position e forms a hydrogen bond with 3-OH of GalNAc as well as a water molecule, which is fixed by Ca2+ and the aspartate residue at position f (Fig. 6).
There are some variations in the conformation of the residues at position e (Asp, Glu, and Gln) that are not directly involved in binding of Ca2+. Such a Ca2+-mediated recognition of carbohydrates by CEL-III resembles that of C-type lectins, as was representatively shown by CEL-I from C. echinata (Fig. 5B) (25). In the CEL-I/GalNAc complex, 3-OH and 4-OH of GalNAc were coordinated with Ca2+ ions at the binding sites in addition to the hydrogen bond networks with nearby amino acid residues (Gln-101, Asp-103, Glu-109, Asn-123, and Asp-124). CEL-I also stabilized binding of GalNAc by stacking with the side chain of Trp-105. These results indicate that CEL-III recognizes specific carbohydrates in a manner similar to that of C-type lectins, although it has typical β-trefoil folds common in R-type lectins.
Mutational Analyses of the Carbohydrate-binding Sites-To examine the roles of the amino acid residues and Ca2+ ions in the carbohydrate-binding sites of CEL-III, several mutants of domain 1 (residues 1-156), in which amino acid residues in subdomain 1α had been replaced by Ala residues, were expressed in E. coli cells, and their carbohydrate binding activity was examined by affinity chromatography on GalNAc-Cellulofine columns. As shown in Fig. 7, the wild-type recombinant domain 1 exhibited affinity for the GalNAc-Cellulofine column, clearly indicating that domain 1 alone has carbohydrate-binding ability. On the other hand, mutants D23A and D43A, in which Asp-23 and Asp-43, which coordinate Ca2+ (Fig. 4A), had been replaced by Ala residues, showed almost no affinity for the column, confirming the importance of these residues. In addition to coordinate bonds with Ca2+, Asp-23 and Asp-43 also formed hydrogen bonds with 4-OH of GalNAc and a water molecule, which in turn formed a hydrogen bond with Asp-39. The Y36A mutant showed a weak affinity for the GalNAc-Cellulofine column, suggesting that the stacking interaction between Tyr-36 and GalNAc plays a supporting role in stabilizing the carbohydrate binding. The Q44A mutant showed no binding activity for the GalNAc-Cellulofine column. Although Gln-44 has no direct interaction with Ca2+ or carbohydrates, this residue is located at the first position of the QXW motif (Fig. 3), which is known to be a conserved motif in the β-trefoil domains (36) and contributes to stabilizing the 3₁₀ helix by hydrogen bonding within the carbohydrate-binding motifs (21). Therefore, it seems likely that mutation of this residue caused destabilization of the basic structure of the β-trefoil fold. In contrast to Q44A, Q45A retained a fairly high carbohydrate binding activity (Fig. 7), suggesting that the middle residue of the QXW motif contributes little to structural stabilization and carbohydrate-binding ability.
In fact, this position is not conserved, not even among the five carbohydrate-binding sites of CEL-III (Fig. 3). It should be noted that these recombinant proteins of domain 1 contained both subdomains 1α and 1γ, with the latter also having binding ability as evidenced by the crystal structure (Fig. 2). However, considering the above results, subdomain 1γ alone does not seem to provide enough affinity to bind to the affinity column.
Conformational Changes Induced by Binding of Carbohydrate-Comparison between the structures of CEL-III and its carbohydrate complexes revealed partial conformational changes in subdomains 2α and 2β that were induced by binding of carbohydrates. As shown in Fig. 8A (subdomain 2α), binding of GalNAc induced a large movement of the side chain of Glu-184 to form a hydrogen bond with 3-OH of GalNAc. Simultaneously, Tyr-181 and Asn-186 were moved aside to accommodate Glu-184, accompanied by a movement of the main chain. On the other hand, there was a large movement of Tyr-222 in the binding site of subdomain 2β induced by binding of GalNAc (Fig. 8B), leading to a stacking interaction with the hydrophobic face of GalNAc, which would contribute to stabilization of the GalNAc binding. These changes were observed in the two CEL-III/GalNAc molecules in an asymmetric unit of the crystals, confirming that these were not artificial changes due to crystal packing but were caused by specific binding of the carbohydrates. Similar structural changes were also observed in the CEL-III/Me-α-Gal complex (data not shown). Fig. 9A shows the comparison between the main chain structure of CEL-III and its GalNAc complex. As seen in this figure, their overall structure is very similar, but some local differences were found, especially in domain 3. As shown in Fig. 10, in addition to the changes around the carbohydrate-binding site in subdomain 2α (residues 165-190), relatively large deviations in the Cα atoms in CEL-III/carbohydrate complexes were observed at residues 300-325 and 350-380, corresponding to the flanking regions of the two α-helices (H8 and H9). As seen in Fig. 9B, the beginning of the α-helix H8 was located near the carbohydrate-binding site in subdomain 2α, and therefore it seems possible that movement of the residues in the carbohydrate-binding site induced by the binding of GalNAc may lead to slight conformational changes in domain 3.
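The kind of per-residue comparison summarized in Fig. 10 can be illustrated with a short script. This is a hedged sketch only: the text does not describe how the Cα deviations were computed, so the functions below (`ca_deviation`, `flag_regions`) are hypothetical helpers, the coordinates are toy values rather than data from the CEL-III structures, and the two models are assumed to have been superimposed beforehand.

```python
import math


def ca_deviation(coords_a, coords_b):
    """Per-residue distance between matched C-alpha positions.

    Both arguments map residue number -> (x, y, z) in angstroms and are
    assumed to be already superimposed in a common frame.
    """
    shared = sorted(set(coords_a) & set(coords_b))
    return {res: math.dist(coords_a[res], coords_b[res]) for res in shared}


def flag_regions(deviation, cutoff):
    """Group consecutive residues whose deviation exceeds cutoff into (start, end) ranges."""
    regions, start, prev = [], None, None
    for res in sorted(deviation):
        if deviation[res] > cutoff:
            if start is None:
                start = res
            elif res != prev + 1:
                # Previous run ended; close it and open a new one.
                regions.append((start, prev))
                start = res
            prev = res
    if start is not None:
        regions.append((start, prev))
    return regions


# Toy coordinates (hypothetical, for illustration only).
native = {1: (0, 0, 0), 2: (1, 0, 0), 3: (2, 0, 0), 4: (3, 0, 0), 5: (4, 0, 0)}
complexed = {1: (0, 0, 0), 2: (1, 0, 2), 3: (2, 0, 2), 4: (3, 0, 0), 5: (4, 3, 0)}

deviation = ca_deviation(native, complexed)
regions = flag_regions(deviation, cutoff=1.0)
print(deviation, regions)
```

Applied to real superimposed models, a grouping step like `flag_regions` would report contiguous stretches such as the residue ranges named in the text, but the cutoff is a free choice and would need to be set against the coordinate error of the structures.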

DISCUSSION
It was assumed from sequence similarity with lectins that have β-trefoil domains, such as ricin B-chain (18,37), that the N-terminal 2/3 of CEL-III contained two carbohydrate-binding domains (domains 1 and 2). The β-trefoil domains contain a characteristic sequence motif (QXW)3 and are thought to function as carbohydrate-binding modules not only in lectins but also in various enzymes, toxins, and cell surface receptors (36). The carbohydrate-binding ability of domains 1 and 2 was confirmed in the present x-ray crystal structural analysis. However, there was a remarkable difference between CEL-III and ricin B-chain because CEL-III bound carbohydrate molecules at five of six subdomains in domains 1 and 2, whereas ricin B-chain only has carbohydrate-binding sites in its subdomains 1α and 2γ. As revealed previously by the crystal structure of native CEL-III (21), the main chain structures of the subdomains in CEL-III are considerably similar with the exception of 1β, reflecting their high amino acid sequence similarity (17). In particular, the amino acid residues known to be important for carbohydrate binding in other β-trefoil lectins, such as aspartate residues (Asp-23, Asp-121, Asp-168, Asp-209, and Asp-256) and aromatic residues (Tyr-36, Tyr-134, Tyr-181, Tyr-222, and Trp-269), are highly conserved in these five subdomains. These findings strongly suggest that these five subdomains have carbohydrate-binding ability, which was proven in this study. The presence of multiple binding sites in a single protein molecule should be advantageous for substantial binding affinity by cooperative interactions with carbohydrates. For example, mannose-binding lectin has relatively limited interactions with mannose-containing carbohydrate chains at each carbohydrate-binding site, but it produces sufficient affinity for mannose oligosaccharides by forming oligomers with multiple carbohydrate-binding sites (38).
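The conservation of these residues across the five binding subdomains shows up in their numbering. The sketch below pairs the aspartate and aromatic residues quoted above and computes the sequence spacing between them. Pairing each residue with a subdomain by list order is an assumption made here for illustration; the text confirms the assignment explicitly only for some of them (for example Asp-23 and Tyr-36 in 1α, Tyr-181 in 2α, Tyr-222 in 2β, and Trp-269 in 2γ).

```python
# Residue numbers quoted in the text, in the order they are listed there.
# Assumption: the list order follows the subdomain order 1α, 1γ, 2α, 2β, 2γ.
SUBDOMAINS = ["1α", "1γ", "2α", "2β", "2γ"]
ASP = [23, 121, 168, 209, 256]
AROMATIC = [("Tyr", 36), ("Tyr", 134), ("Tyr", 181), ("Tyr", 222), ("Trp", 269)]

# Sequence spacing between the Ca2+-coordinating Asp and the stacking aromatic residue.
spacing = {
    sub: aro_num - asp_num
    for sub, asp_num, (_, aro_num) in zip(SUBDOMAINS, ASP, AROMATIC)
}

for sub in SUBDOMAINS:
    print(sub, spacing[sub])
```

In every pair the aromatic residue lies 13 positions after the aspartate, which is what one would expect if the five binding sites derive from repeats of the same motif.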
It has been suggested recently that the lipid rafts, which contain various glycolipids, function as receptors for several bacterial toxins (39,40). Because lactosyl ceramide is the most effective receptor for CEL-III in the human erythrocyte membrane (41), it seems likely that the presence of multiple carbohydrate-binding sites in a protein molecule is advantageous for recognizing glycolipid clusters in the lipid rafts on target cell membranes.
We have previously presented two putative carbohydrate-binding models for CEL-III; the first in which Ca2+ ions were directly involved in carbohydrate binding, and the second where Ca2+ ions were replaced by bound carbohydrate. This study revealed that the first model was correct, and Ca2+ ions were essential for the recognition of carbohydrates through coordinate bonds with OH groups of carbohydrates. Despite differences in the involvement of Ca2+, the orientation of the carbohydrate bound to CEL-III was very similar to that for ricin B-chain as shown in Fig. 5A. The OH groups of bound carbohydrate also formed hydrogen bond networks with nearby amino acid side chains in the carbohydrate-binding sites of CEL-III. This also resembles carbohydrate