The structures of the H(C) fragment of tetanus toxin with carbohydrate subunit complexes provide insight into ganglioside binding.

The entry of tetanus neurotoxin into neuronal cells proceeds through the initial binding of the toxin to gangliosides on the cell surface. The carboxyl-terminal fragment of the heavy chain of tetanus neurotoxin contains the ganglioside-binding site, which has not yet been fully characterized. The crystal structures of native H(C) and of H(C) soaked with carbohydrates reveal a number of binding sites and provide insight into the possible mode of ganglioside binding.

chinery. The H-chain can be cleaved into two domains: H N and H C. After internalization of the entire toxin into vesicles, the L-chain is translocated into the cytosol, a process apparently requiring the activity of the H N fragment (5,6).
The H C fragment, also termed fragment C, is the 50-kDa carboxyl-terminal fragment of TeNT (residues 865-1315) and is required for the early stages of intoxication. It has long been recognized that TeNT displays ganglioside binding activity (7). Gangliosides consist of a sialic acid-containing oligosaccharide linked to ceramide. Most have the basic form Galβ3GalNAcβ4-(NeuAcα3)Galβ4GlcβCer, to which one or more N-acetylneuraminic (sialic) acids are bound. GM1 and GD1b have mono- and disialic acid residues, respectively, attached to the internal galactose residue. GT1b and GQ1b have, in addition, mono- and disialic acids attached to the terminal galactose residue.
Binding studies of TeNT to brain membranes and to purified gangliosides have shown that these activities are for the most part both mediated by the H C fragment (9) (and reviewed in Ref. 8). Analysis of the binding efficiencies of different gangliosides to both TeNT and the H C fragment has shown an absolute requirement for the disialic acid moiety attached to the internal galactose present in GD1b, GT1b, and GQ1b (10,11). A single sialic acid residue on this internal Gal residue, as found in the monosialic ganglioside GM1, is clearly not sufficient for binding (10,11). The presence of sialic acid on the terminal Gal residue (as in GT1b and GQ1b) appears to enhance binding only slightly and cannot therefore be considered essential for binding. However, the Galβ3GalNAc disaccharide moiety is necessary for binding (10,11). Recent studies using surface plasmon resonance to study binding of purified H C fragment have extended these findings and demonstrated a preference for GD1b over GT1b and negligible binding to GQ1b gangliosides (12).
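The binding requirements summarized above can be written down as a small rule check. The following is only an illustrative sketch: the `Ganglioside` class, its field names, and the function `meets_binding_requirements` are hypothetical names introduced here, and the sialic acid counts are taken from the descriptions of GM1, GD1b, GT1b, and GQ1b given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ganglioside:
    """Hypothetical record of the features discussed in the text."""
    name: str
    internal_gal_sialic: int        # sialic acids on the internal galactose
    terminal_gal_sialic: int        # sialic acids on the terminal galactose
    has_gal_b3_galnac: bool = True  # Gal-beta-3-GalNAc disaccharide present


# Sialic acid counts as described in the text for each ganglioside.
GANGLIOSIDES = [
    Ganglioside("GM1", 1, 0),
    Ganglioside("GD1b", 2, 0),
    Ganglioside("GT1b", 2, 1),
    Ganglioside("GQ1b", 2, 2),
]


def meets_binding_requirements(g: Ganglioside) -> bool:
    """Requirements reported in Refs. 10 and 11: a disialic acid on the
    internal galactose and the Gal-beta-3-GalNAc disaccharide.
    Sialic acid on the terminal galactose is not required."""
    return g.internal_gal_sialic >= 2 and g.has_gal_b3_galnac


binders = [g.name for g in GANGLIOSIDES if meets_binding_requirements(g)]
print(binders)
```

This rule encodes only the qualitative requirements from Refs. 10 and 11. It does not capture relative affinities, such as the preference for GD1b over GT1b or the negligible binding to GQ1b observed by surface plasmon resonance (12).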
It has long been questioned whether gangliosides represent the sole receptor for TeNT, and much convincing evidence has been presented that argues that a second, protein, receptor exists (1,9). This putative receptor could determine the specificity of TeNT for certain neuronal cell types and could possibly also be involved in the retrograde transport of TeNT to higher centers of the central nervous system. A two-receptor model, invoking both ganglioside and protein receptors, has been presented (9).
The x-ray crystal structure of H C at 2.7-Å resolution reported previously (13) shows the protein to have two domains, an amino-terminal jelly roll domain and a carboxyl-terminal β-trefoil domain. A closely homologous structure has been found in the receptor-binding domain of botulinum toxin (14). The amino-terminal jelly roll domain, which is similar in structure to many lectins, is an obvious candidate for the ganglioside binding of TeNT. However, deletion mutagenesis studies (15,16) suggest that carboxyl-terminal residues, in particular residues 1306-1310 of the β-trefoil domain, are essential for cell and ganglioside binding activity. The structure of Umland et al. (13) showed that these residues are largely solvent-exposed, forming one edge of a shallow pocket. The exception is Val-1306, which is the final residue in a β-strand belonging to the β-trefoil core. A later report (17) of photoaffinity studies using derivatized ganglioside GD1b showed binding to be accompanied by photoactive modification of His-1293, which is located on the other side of the β-trefoil domain from the carboxyl-terminal residues. The apparent discrepancy in these results can be reconciled if the deletion of residues 1306-1309 causes a change in the conformation of the β-trefoil and disrupts the true ganglioside-binding site.
To investigate the structural basis of ganglioside binding, we have determined structures from crystals of native H C and from H C crystals soaked in solutions of lactose, galactose, N-acetylgalactosamine, and sialic acid. As these sugars are all subunits of ganglioside, they are expected to bind at locations corresponding to potential ganglioside-binding sites.

Expression and Purification of H C - DNA from plasmid pTETtac215 (18) was incorporated into the Escherichia coli expression vector pET16b to produce H C as a fusion protein with a 24-amino acid amino-terminal tag, including 10 His residues. Cells were grown to midlog phase, and expression was induced by the addition of isopropyl-β-D-thiogalactopyranoside to 1 mM for 4 h. Cells were then harvested and
Table I footnotes (fragment): is the mean intensity for reflection h, after rejection of outliers. Phasing power = $\sum_h |F_{H\mathrm{calc}}| / \sum_h \epsilon$, where $|F_{H\mathrm{calc}}|$ is the calculated structure factor amplitude for the heavy atom and $\epsilon$ is the residual lack of closure (of the vector triangle $|F_P|$, $|F_{PH}|$, and $|F_{H\mathrm{calc}}|$) for reflection h (calculated for acentric and centric reflections over the resolution interval 30 -
Crystallization, Data Collection, and Processing-Crystals were produced by vapor diffusion using a sitting drop containing 2 μl of a 3 mg/ml protein solution in 20 mM imidazole (pH 7.0) and 100 mM NaCl, mixed with 3 μl of a well solution consisting of 200 mM ammonium sulfate with 40% (w/v) PEG 4K and 1% (v/v) 2-methyl-2,4-pentanediol. Several data sets were collected from native, derivative, and carbohydrate-soaked crystals as detailed in Table I. Crystals were soaked in well solutions containing 100 mM carbohydrate for 24 h (lactose, galactose, sialic acid) and 15 days (N-acetylgalactosamine) before data collection. Diffraction data were collected at the Daresbury synchrotron radiation source under cryocooled conditions (100 K). The lactose- and galactose-soaked crystals used 15% (v/v) glycerol made up with the well solution as the cryoprotectant. The N-acetylgalactosamine- and sialic acid-soaked crystals used 30% (v/v) PEG 400 as the cryoprotectant. Galactose, N-acetylgalactosamine, and sialic acid were purchased from Sigma and lactose from Fluka. Data from native and galactose- and lactose-soaked crystals were processed using the HKL Suite (19).
The native protein crystallizes in space group P2₁2₁2₁ with unit cell dimensions a = 67.1, b = 70.9, c = 122.2 Å and one molecule in the asymmetric unit. This cell is about 12% (by volume) smaller than that of the crystals of H C reported previously by us (20) but is still larger than that reported by Umland et al. (13). The data from the sialic acid- and N-acetylgalactosamine-soaked crystals were processed with Mosflm (21) and Scala (22,23). Table I shows the data and phasing statistics.
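As a rough check on the crystal packing implied by these cell dimensions, the cell volume and the standard Matthews coefficient can be computed as sketched below. This is an illustrative calculation, not part of the reported analysis. It assumes a molecular mass of 50 kDa (the nominal size of the H C fragment, ignoring the 24-residue tag) and uses the conventional approximation that the solvent fraction is 1 - 1.23/V_M. The function names are hypothetical.

```python
# Unit cell dimensions reported for the native crystals (in angstroms).
A, B, C = 67.1, 70.9, 122.2

# An orthorhombic P2(1)2(1)2(1) cell contains 4 asymmetric units; the text
# reports one molecule per asymmetric unit, so 4 molecules per cell.
MOLECULES_PER_CELL = 4

# ASSUMPTION: nominal 50-kDa mass of the H C fragment, without the His tag.
MOLECULAR_MASS_DA = 50_000.0


def orthorhombic_volume(a: float, b: float, c: float) -> float:
    """Cell volume in cubic angstroms (all cell angles are 90 degrees)."""
    return a * b * c


def matthews_coefficient(volume: float, n_molecules: int, mass_da: float) -> float:
    """Matthews coefficient V_M in cubic angstroms per dalton."""
    return volume / (n_molecules * mass_da)


def solvent_fraction(v_m: float) -> float:
    """Approximate solvent fraction from the standard 1 - 1.23/V_M estimate."""
    return 1.0 - 1.23 / v_m


volume = orthorhombic_volume(A, B, C)
v_m = matthews_coefficient(volume, MOLECULES_PER_CELL, MOLECULAR_MASS_DA)
print(f"cell volume  = {volume:.0f} cubic angstroms")
print(f"V_M          = {v_m:.2f} cubic angstroms per Da")
print(f"solvent frac = {solvent_fraction(v_m):.2f}")
```

Including the mass of the amino-terminal tag would lower both V_M and the estimated solvent fraction slightly.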
Crystal Structure Determination-The initial mercury heavy atom position was solved using ShelX-90 (24). Cross-phased difference maps were used to find sites in other heavy atom derivative data sets. Initial heavy atom refinement (including anomalous data) proceeded using MLPHARE (23,25). Subsequent use of SHARP (26) located further minor heavy atom sites. After density modification and phase refinement with DM (27), O (28) was used to build the molecule into the electron density extending in resolution to 2.5 Å. The Cα-trace of the molecule in Ref. 13 facilitated the sequence assignment in the map. This native structure was not fully refined because a structure of similar quality had been published (13), and higher resolution data were obtained with carbohydrate-soaked crystals.
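The phasing power statistic quoted for the heavy atom derivatives is defined in the Table I footnotes as the summed calculated heavy-atom structure factor amplitude divided by the summed lack of closure. A minimal sketch of that ratio follows. The function name `phasing_power` and the numerical values are hypothetical toy inputs, not data from this study, and the sketch does not reproduce how MLPHARE or SHARP estimate the lack of closure.

```python
from typing import Sequence


def phasing_power(f_h_calc: Sequence[float], lack_of_closure: Sequence[float]) -> float:
    """Ratio sum_h |F_Hcalc| / sum_h epsilon over a set of reflections.

    f_h_calc        -- calculated heavy-atom structure factor amplitudes
    lack_of_closure -- residual lack of closure for the same reflections
    """
    if len(f_h_calc) != len(lack_of_closure):
        raise ValueError("both lists must cover the same reflections")
    total_closure = sum(lack_of_closure)
    if total_closure == 0:
        raise ValueError("summed lack of closure is zero")
    return sum(abs(f) for f in f_h_calc) / total_closure


# Toy values for three reflections (illustrative only).
toy_f_h_calc = [10.0, 20.0, 30.0]
toy_closure = [5.0, 10.0, 15.0]
print(phasing_power(toy_f_h_calc, toy_closure))
```

In practice the statistic is reported separately for acentric and centric reflections, which would correspond to calling the function on each subset.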
The high resolution (1.8 Å) lactose-soaked data set showed considerable lack of isomorphism with respect to the MIR data sets. Molecular replacement using AMORE (29) was used to reposition the partially refined native structure in the unit cell of the high resolution data set. Further rebuilding and refinement of the lactose-soaked structure proceeded using O, REFMAC (30), and ARP (31). The structure geometry was analyzed using PROCHECK (32). The galactose-, N-acetylgalactosamine-, and sialic acid-soaked structures were refined and validated similarly (Table II). The ligand interactions were displayed using LIGPLOT (33).

RESULTS AND DISCUSSION
The topology of TeNT H C has already been described (13). It consists of two domains, an amino-terminal jelly roll domain and a carboxyl-terminal β-trefoil domain linked by a short peptide as shown in Fig. 1. Fig. 2 shows the difference electron density observed for each of the carbohydrate-soaked crystals. Clear electron density shows single sugar-binding sites for the lactose-, galactose-, and sialic acid-soaked crystals (Fig. 2, a, b, and c) and two sites for N-acetylgalactosamine-soaked crystals (Fig. 2, d and e). The locations of these bound carbohydrate ligands on the protein surface are shown in Fig. 5. The structures of the proteins in the four complexes are very similar to each other. The most significant difference is a small change in the poorly defined loops 983-985 and 1180-1185, the latter of which moves to accommodate the galactose.
The final models of the lactose-, galactose-, sialic acid-, and N-acetylgalactosamine-soaked crystals consist of 441 residues (875 to 1315 of TeNT) and 692, 352, 226, and 506 water molecules, respectively (Table II). The geometry of the models is adequate, with 87.2% (Lac-soak), 85.5% (Gal-soak), 86.5% (NGA-soak), and 82.5% (sialic acid-soak) of the residues in the most favored areas of the Ramachandran plot as defined by PROCHECK (32). Only one residue in the NGA-soak structure and three in the sialic acid-soak structure are in the disallowed regions.
A comparison of the lactose complex with the structure published by Umland et al. (13) (Protein Data Bank accession code 1AF9) shows that the two models are largely very similar. There are, however, a number of regions around residues 943, 984, 1067, 1183, 1220, and 1293 that have tandem differences of greater than 1 Å in the Cα-positions. All these regions are on the surfaces of the domains, and the differences probably reflect high mobility. Excluding these regions, the average deviation in Cα-positions between the two models is only 0.3 Å.
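A comparison of this kind can be sketched as a per-residue Cα distance calculation followed by an average over the residues that remain after the divergent ones are set aside. The sketch below is illustrative only. It assumes the two models have already been superposed, it excludes individual residues above a cutoff instead of whole regions, and the coordinates are toy values, not taken from either structure. The function names are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Coord = Tuple[float, float, float]


def ca_deviations(model_a: Dict[int, Coord], model_b: Dict[int, Coord]) -> Dict[int, float]:
    """Distance between matching C-alpha atoms, keyed by residue number.

    Both models are assumed to be superposed in the same frame already.
    """
    common = sorted(set(model_a) & set(model_b))
    return {res: math.dist(model_a[res], model_b[res]) for res in common}


def summarize(deviations: Dict[int, float], cutoff: float = 1.0) -> Tuple[List[int], float]:
    """Return residues deviating by more than `cutoff` (in angstroms) and
    the mean deviation of the remaining residues."""
    outliers = [res for res, d in deviations.items() if d > cutoff]
    kept = [d for d in deviations.values() if d <= cutoff]
    mean_kept = sum(kept) / len(kept) if kept else float("nan")
    return outliers, mean_kept


# Toy coordinates for three residues (illustrative only).
model_a = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0), 3: (7.6, 0.0, 0.0)}
model_b = {1: (0.0, 0.0, 2.0), 2: (3.8, 0.3, 0.0), 3: (7.6, 0.0, 0.3)}

outliers, mean_kept = summarize(ca_deviations(model_a, model_b))
print(outliers, round(mean_kept, 2))
```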
Our studies with carbohydrate soaks do not provide adjacent binding sites for the different carbohydrates. The lactose-soaked crystal shows density for the lactose in the region close to His-1293 and Trp-1289 (Figs. 2a and 3). Hydrogen bonds are formed between the galactose unit and the protein through O-6 and OG Ser-1287, O-6 and OD-2 Asp-1222, and O-4 and carbonyl oxygen of Thr-1270. O-3 forms two water-mediated H-bonds with OH Tyr-1290 and the amide N of Gly-1300. The glucose unit forms two H-bonds between O-3 and O-2 with ND-2 Asn-1220 (Fig. 4). Hydrophobic ring packing interactions are formed between the galactose ring and Trp-1289 (Fig. 3). The glucose unit is oriented with respect to the H C so that a ceramide attached at the C-1 position would be directed away from the protein. In addition, the location of the C-3 of the galactose unit allows space for attached sialic acids.
In the galactose-soaked crystal there is clear density (Fig. 2b) for a galactose molecule near residues 1195, 1179, and 1180. Hydrogen bonds are formed from the galactose to the protein between O-2 and the amide NH of Phe-1195, between O-3 and both the amide NH of Tyr-1180 and the carbonyl oxygen of Phe-1195, between O-6 and NE of Arg-1179, and between O-4 and the carbonyl oxygen of Thr-1181 (Fig. 4).
The N-acetylgalactosamine-soaked map contains density corresponding to two separate carbohydrate molecules (Fig. 2,  d and e). The observed electron density is of similar quality for  (Fig. 4).
The sialic acid-soaked crystals show clear density for a single sialic acid (Fig. 2c).
When these different binding sites are positioned on a common H C molecule (Fig. 5), they are located in four distinct regions (the sialic acid-binding site and the NGA1-binding site are considered a single region). The lactose-binding site appears to be part of the ganglioside-binding site. It is located close to the His-1293 identified from photoaffinity labeling (17) and to other residues identified by mutagenesis studies (16) as a ganglioside-binding site.
Another carbohydrate-binding region has adjacent sites for both NGA and sialic acid. This suggests that this broad region is possibly capable of binding linked carbohydrate units. Crystal-soaking experiments with disaccharide units are being undertaken to investigate this possibility.
The single galactose-binding site occurs in a depression on the protein surface created by a loop from residues 1180 to 1196. The galactose unit binds (Fig. 4) in the vicinity of Arg and Asp residues, with its apolar β face parallel to an aromatic residue (Tyr). These features are characteristic of galactose-binding sites (34,35). The galactose is bound in a position where it could mimic a terminal galactose of a ganglioside such as GD1b, because the O-1 is in an open environment.
A comparison of the structures of tetanus toxin H C and botulinum toxin H C (Protein Data Bank accession code 3BTA) shows much similarity in the carbohydrate-binding regions. The strongest similarity is in the lactose-binding site where, with the exception of ND-2 Asn-1220, each of the TeNT H C atoms coordinating the carbohydrate has a structural counterpart in the BoNT H C structure (TeNT Ser-1287 with BoNT Ser-1263, Asp-1222 with Glu-1202, Thr-1270 with Phe-1251, Tyr-1290 with Tyr-1266, Gly-1300 with Gly-1278, Trp-1289 with Trp-1265). These data are in accord with studies on BoNT. Fluorescence quenching experiments by Kamata et al. (36) have been interpreted by Lacy and Stevens (37) to implicate Trp-1265 in the ganglioside-binding site of BoNT. This is supported by neutralizing antibody data from Kubota et al. (38) that implicate residues 1266-1272. Lacy and Stevens (37) find that all these residues are located in a deep, positively charged cleft, which they assign to the ganglioside-binding site in the BoNT structure. In the galactose- and NGA-binding regions the structural coincidences are fewer because of differences in the lengths and positions of loops. It remains to be seen if movement of loops is associated with carbohydrate binding.
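The residue correspondences listed for the lactose-binding site can be tabulated to show which counterparts are identical in residue type. The sketch below simply restates the pairs given in the text. The dictionary name `TENT_TO_BONT` and the variable names are hypothetical.

```python
# TeNT H C residue -> structural counterpart in BoNT H C, as listed in the
# text for the lactose-binding site. Asn-1220 is omitted because the text
# reports no BoNT counterpart for its ND-2 atom.
TENT_TO_BONT = {
    ("Ser", 1287): ("Ser", 1263),
    ("Asp", 1222): ("Glu", 1202),
    ("Thr", 1270): ("Phe", 1251),
    ("Tyr", 1290): ("Tyr", 1266),
    ("Gly", 1300): ("Gly", 1278),
    ("Trp", 1289): ("Trp", 1265),
}

# Pairs in which the residue type is the same in both toxins.
identical = [tent for tent, bont in TENT_TO_BONT.items() if tent[0] == bont[0]]
# Pairs in which the residue type differs.
substituted = [tent for tent, bont in TENT_TO_BONT.items() if tent[0] != bont[0]]

print(f"{len(identical)} of {len(TENT_TO_BONT)} counterparts are identical")
for name, number in substituted:
    bont_name, bont_number = TENT_TO_BONT[(name, number)]
    print(f"TeNT {name}-{number} corresponds to BoNT {bont_name}-{bont_number}")
```

Of the two substitutions, Thr-1270 contacts the sugar through its carbonyl oxygen, so a change of side chain at that position need not remove the interaction.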
The data presented here indicate that tetanus toxin has multiple carbohydrate-binding sites. It is recognized that many lectins use multiple binding sites to bind their cognate carbohydrates with higher affinities (34,35). Subsite multivalency refers to the presence of primary and secondary binding sites on the lectin surface, as illustrated by the lactose- and the NGA-binding sites described here. Subunit multivalency refers to the binding of a multivalent (carbohydrate) ligand by more than one lectin subunit. The results reported here do not provide direct evidence of subunit multivalency in TeNT, although as the toxin functions in a carbohydrate-rich environment, multivalency would not be unexpected. It remains to be seen whether the individual binding regions described above are separate and independent or whether they may be involved in forming cross-links between ganglioside and tetanus toxin molecules.