Crystal Structure of Clostridium botulinum Whole Hemagglutinin Reveals a Huge Triskelion-shaped Molecular Complex

Background: Botulinum HA is a component of the botulinum neurotoxin complex and dramatically increases the oral toxicity of the complex.
Results: The crystal structure of botulinum HA reveals that 12 subcomponents of HA constitute a huge triskelion-shaped molecule.
Conclusion: The complex is functionally and structurally separable into two parts.
Significance: This is the first crystal structure of the whole botulinum HA complex.

Clostridium botulinum HA is a component of the large botulinum neurotoxin complex and is critical for its oral toxicity. HA plays multiple roles in toxin penetration in the gastrointestinal tract, including protection from the digestive environment, binding to the intestinal mucosal surface, and disruption of the epithelial barrier. At least two properties of HA contribute to these roles: the sugar-binding activity and the barrier-disrupting activity that depends on E-cadherin binding of HA. HA consists of three different proteins, HA1, HA2, and HA3, whose structures have been partially solved and are made up mainly of β-strands. Here, we demonstrate structural and functional reconstitution of whole HA and present the complete structure of HA of serotype B determined by x-ray crystallography at 3.5 Å resolution. This structure reveals whole HA to be a huge triskelion-shaped molecule. Our results suggest that whole HA is functionally and structurally separable into two parts: HA1, involved in recognition of cell-surface carbohydrates, and HA2-HA3, involved in paracellular barrier disruption by E-cadherin binding.

structure of the 12 S toxin. HA can bind intestinal epithelial cells via its carbohydrate-binding activity, which facilitates efficient transport of the toxin complex across the intestinal epithelial monolayers (8-18). In addition to this adhesive activity, we recently identified a novel activity of HA: type A HA and type B HA (HA/B) disrupt the epithelial paracellular barrier by directly binding to E-cadherin, thereby facilitating paracellular transport of macromolecules, including 12 S toxin (19-22). By contrast, type C HA does not bind E-cadherin but is capable of disrupting the epithelial barrier of certain cells (20). These studies indicate that HA plays an important role in toxin absorption and pathogenesis of botulism. Additionally, these properties of HA could be exploited to facilitate the transepithelial delivery of medicines. However, the molecular details of the mechanisms by which HA elicits these multiple biological activities remain unclear, in part because functional reconstitution studies have not been performed on the whole HA complex, and detailed structural information about the complex has not been previously available (Table 1).
In this study, we achieved functional reconstitution of whole HA (HA1-HA2-HA3) using purified recombinant HA1, HA2, and HA3 of serotype B. Using this system, we obtained the first x-ray crystallographic structure of whole HA. The results of our structure-function studies suggest that HA1 proteins, which play a critical role in cell binding (9, 10, 13-18), are located in the most distal part of the complex, whereas the E-cadherin-binding site is in the inner part of the complex, the HA2-HA3 connecting region.

EXPERIMENTAL PROCEDURES
Plasmid Construction-DNA fragments encoding botulinum HA (HA1, amino acids (aa) 7-294; HA2, aa 2-146; and HA3, aa 19-626) (23) were amplified by PCR from C. botulinum type B strain Okra genomic DNA using gene-specific primers containing restriction sites at their 5′-ends. In the forward primer for HA1, an oligonucleotide encoding four consecutive aspartic acids was inserted between the restriction site and the gene-coding sequence. The HA1 and HA3 fragments were inserted into the KpnI-SalI site of the pET52b(+) expression vector (Merck Millipore). The HA2 fragment was inserted into the NheI-XhoI site of the pET28b(+) vector (Merck Millipore). The inserted regions were confirmed by DNA sequencing.
Protein Expression and Purification-To obtain high levels of the proteins at high purity, HA1, HA2, and HA3 were expressed as Strep-tag-, His-, and Strep-tag-tagged proteins, respectively. Expression vectors harboring HA genes were transformed individually into Escherichia coli strain Rosetta™ 2(DE3) (Merck Millipore). Protein expression was induced using Overnight Express™ Autoinduction System 1 (Merck Millipore). Expression of Strep-tag-HA1 and Strep-tag-HA3 was induced for 36 h at 30°C. Expression of His-HA2 was induced for 40 h at 18°C. The Strep-tag-tagged proteins and His-HA2 were purified using StrepTrap HP and HisTrap HP columns (GE Healthcare), respectively. The Strep-tag was cleaved from Strep-tag-HA1 using human rhinovirus 3C protease (Merck Millipore), which was removed by passing the digest through a HisTrap HP column. The His tag was removed from His-HA2 using thrombin (GE Healthcare), and the HA2 protein was purified using a HiTrap DEAE FF column (GE Healthcare). Protein concentrations were determined using the BCA protein assay reagent (Thermo Scientific).
For reconstitution of the HA complex, recombinant HA1, HA2, and HA3 proteins were mixed at a molar ratio of 4:4:1 in PBS (pH 7.4) and incubated for 3 h at 37°C. The HA complex was captured with a StrepTrap HP column, followed by a brief wash with PBS and incubation with human rhinovirus 3C protease on the column for 2 h at 20°C to remove Strep-tag attached to the HA3 protein. The HA complex was eluted from the column and further purified with a Superdex 200 10/300 GL column (GE Healthcare). The protein was concentrated using Amicon Ultra-0.5 100K centrifugal filter units (Merck Millipore). The homogeneity of the purified preparation was confirmed by SDS-PAGE and native PAGE. The type B 16 S toxin (B16S toxin) was purified from a bacterial culture of C. botulinum type B strain Okra as described previously (20).
Analysis of Complex Formation by Gel Filtration-Purified HA proteins or mixtures of these proteins were incubated for 3 h at 37°C prior to gel-filtration analysis. Proteins were individually loaded onto a Superdex 200 10/300 GL column previously equilibrated with PBS (pH 7.4) using an ÄKTA Pure system (GE Healthcare). The purified HA complex was analyzed using the same column equilibrated with 20 mM Tris-HCl (pH 7.4) and 200 mM NaCl. Absorbance was measured at 280 nm.
Measurement of Transepithelial Electrical Resistance (TER)-Measurement of TER was performed using a Millicell-ERS system (Merck Millipore) as described previously (20). Briefly, Caco-2 cells (derived from a human colon carcinoma) were plated onto filters (0.4-μm pore size) in Transwell chambers (Costar). After steady-state TER was achieved, HA proteins were added to the upper or lower side of the chamber. TER was measured at time points up to 24 h.
Crystallization-A 7 mg/ml solution of recombinant HA in 0.01 M Tris-HCl (pH 8.0) was employed for crystallization. Crystallization trials were set up at room temperature as sitting-drop vapor-diffusion experiments on Cryschem crystallization plates. The initial screening was performed at 295 K using the sparse matrix method (24) with commercial crystal-screening kits (Hampton Research). The crystallization droplets consisted of 1 μl of HA solution and 1 μl of reservoir solution containing 0.25 M ammonium chloride, 3% (w/v) polyethylene glycol 4000, 3% trehalose, 3% (w/v) benzamidine HCl, 3% (w/v) methylpentanediol, 3% (w/v) ethylene glycol, 0.1 M calcium chloride, and 0.1 M Tris-HCl (pH 8.0) and equilibrated with 500 μl of reservoir solution. Hexagon-shaped tabular crystals appeared within 1 week and grew to maximum dimensions of ~0.3 × 0.3 × 0.2 mm.
Data Collection, Phasing, and Refinement-X-ray diffraction data from HA crystals were collected at 100 K in a nitrogen stream after the crystals were soaked in reservoir solution containing 23% ethylene glycol and 3% glycerol as a cryoprotectant. A native data set at 3.5 Å was collected from a single crystal using a Rayonix MX300-HE CCD detector on beamline BL44XU at SPring-8 (see Table 2). The x-ray wavelength was 0.9 Å, the angle oscillation range was 0.5°, and the crystal-to-detector distance was 500 mm.
Molecular replacement (Table 2) was performed using Phaser (25). The asymmetric unit contained one HA complex. The resulting electron density maps modified by Parrot (26) allowed us to automatically trace almost all of one independent complex. Model building and inspection were based on Coot/REFMAC5 (27, 28). The atomic coordinates and structure factors for HA/B were deposited in the Protein Data Bank with accession code 3WIN. All images of the molecular structure were prepared using PyMOL.
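As a rough consistency check on the data-collection geometry described above, Bragg's law (λ = 2d sin θ) gives the scattering angle for the 3.5 Å resolution limit, and from it the distance from the beam center at which that limit falls on the detector. The sketch below is illustrative only. It assumes a flat detector normal to the beam with the beam at its center, and the function names are hypothetical rather than part of any data-processing package.

```python
import math


def bragg_two_theta_deg(wavelength_A: float, d_spacing_A: float) -> float:
    """Scattering angle 2-theta (degrees) from Bragg's law, lambda = 2 d sin(theta)."""
    sin_theta = wavelength_A / (2.0 * d_spacing_A)
    if not 0.0 < sin_theta <= 1.0:
        raise ValueError("d-spacing is not reachable at this wavelength")
    return 2.0 * math.degrees(math.asin(sin_theta))


def detector_radius_mm(wavelength_A: float, d_spacing_A: float,
                       distance_mm: float) -> float:
    """Radial position of a reflection on a flat detector normal to the beam."""
    two_theta = math.radians(bragg_two_theta_deg(wavelength_A, d_spacing_A))
    return distance_mm * math.tan(two_theta)


# Values taken from the text: 0.9 A wavelength, 3.5 A limit, 500 mm distance.
two_theta = bragg_two_theta_deg(0.9, 3.5)
radius = detector_radius_mm(0.9, 3.5, 500.0)
print(f"2-theta = {two_theta:.2f} deg, radius on detector = {radius:.1f} mm")
```

Under these assumptions the 3.5 Å limit lies at a scattering angle of roughly 15°, about 130 mm from the beam center.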

RESULTS
Reconstitution of the Functional HA Complex-To obtain a functional HA complex, we reconstituted the whole complex from the HA subcomponents in vitro. We expressed each subcomponent of HA separately in E. coli and purified them. When these three subcomponents were mixed and incubated at 37°C, a large molecular complex formed spontaneously. When the three subcomponents were mixed pairwise and analyzed by gel filtration, only the HA2-HA3 complex was observed (Fig. 1A).
We reported previously that HA compromises the barrier function of epithelial cells and that E-cadherin binding of HA is sufficient for this effect (21). We assessed the barrier-disrupting activity of each HA subcomponent, alone or in combination, by measuring TER in Caco-2 cells. Neither the single subcomponents nor the pairwise combinations HA1-HA2 and HA1-HA3 exhibited any activity. By contrast, the HA2-HA3 complex and the full complex exhibited comparable activities when these proteins were applied from the basolateral side of the cells (Fig.  1B, upper panel). These results are consistent with the previous observation that the HA2-HA3 complex is sufficient for E-cadherin binding of HA (21). Because E-cadherins reside on the basolateral surface of epithelial cells, translocation of HA from the apical to the basolateral side of cells is a prerequisite for barrier disruption when the complex is applied from the apical side (19,21). In this experimental setting, the full complex exhibited higher activity than the HA2-HA3 complex (Fig. 1B, lower panel), indicating that HA1 facilitates apical-to-basolateral translocation of the HA complex.
Our preliminary experiments showed that the stability of the whole HA complex was greatly affected by the tags attached to HA1. In particular, HA complexes readily aggregated when tag-free or His-tagged HA1 was used for reconstitution of the complex, whereas the presence of a FLAG tag (DYKDDDDK) on HA1 improved the solubility of the complex. However, the complex harboring FLAG-HA1 yielded two major peaks in gel-filtration analysis and appeared to aggregate to some extent. Therefore, we added four consecutive aspartic acid residues (a portion of the FLAG tag) to the N terminus of HA1. The resultant purified HA complex, which we used for crystallization, yielded a single peak in gel-filtration analysis and exhibited barrier-disrupting activity comparable to that of the native B16S toxin (Fig. 1, C and D). In addition to the aspartic acid cluster in HA1, the N terminus of each HA subcomponent contains vector-derived sequences that are not identical to those in the native complex. However, electron density was not observed in these regions, and we assumed that these regions do not affect the overall structure of the complex (see below).
Structure Determination-The crystallographic data are shown in Table 2. Analysis of the diffraction pattern and systematic absences allowed us to assign HA crystals to the hexagonal space group P6₃22, with unit cell parameters a = b = 324.7 Å, c = 117.6 Å, α = β = 90°, and γ = 120°. A total of 45,592 unique reflections were obtained using the HKL2000 software package (29). The intensity data in the resolution range of 50.0 to 3.5 Å were processed with an Rmerge of 8.6%. Assuming a molecular mass of 148 kDa for the expressed HA (HA1-HA2-HA3 monomer), packing density calculations indicated the Matthews coefficient (V_M) to be 6.06 Å³ Da⁻¹, with one HA arm (HA1-HA2-HA3 monomer) per asymmetric unit. This corresponds to a solvent fraction of about 79.7%, an allowed value for protein crystals.
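The packing density figures above can be reproduced approximately from the unit cell parameters. The sketch below is a minimal illustration, not the authors' calculation. It assumes 12 asymmetric units per cell (the multiplicity of the general position in P6₃22) and uses the conventional Matthews approximation, solvent fraction ≈ 1 − 1.23/V_M. The function names are hypothetical.

```python
import math


def hexagonal_cell_volume(a_A: float, c_A: float) -> float:
    """Volume (cubic angstroms) of a hexagonal cell: V = a^2 * c * sin(120 deg)."""
    return a_A * a_A * c_A * math.sin(math.radians(120.0))


def matthews_coefficient(cell_volume_A3: float, n_asu: int, mass_da: float) -> float:
    """Crystal volume per unit of protein mass (cubic angstroms per dalton)."""
    return cell_volume_A3 / (n_asu * mass_da)


def solvent_fraction(vm: float, protein_term: float = 1.23) -> float:
    """Matthews approximation; 1.23 is the conventional constant (assumption)."""
    return 1.0 - protein_term / vm


# Cell parameters and mass from the text; 12 asymmetric units assumed for P6(3)22.
volume = hexagonal_cell_volume(324.7, 117.6)
vm = matthews_coefficient(volume, n_asu=12, mass_da=148_000)
solvent = solvent_fraction(vm)
print(f"V_M = {vm:.2f} A^3/Da, solvent fraction = {solvent:.3f}")
```

Under these assumptions the result lands close to the reported 6.06 Å³ Da⁻¹ and a solvent fraction near 0.80. Any small difference would be consistent with rounding of the 148-kDa mass.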
The crystal structure of whole HA was solved by the molecular replacement technique using the structures of type B HA3 (footnote 5) and type D HA1-HA2 (Protein Data Bank ID 2E4M) and the Phaser program from CCP4. The electron density maps were of excellent quality, allowing unequivocal tracing of each subcomponent (HA1-HA2-HA3), and the model was subsequently refined at 3.5 Å resolution. The structure was refined using REFMAC (28); 5% of the unique reflections were used to monitor the free R-factor. The final values for general and free R-factors were 20.2 and 25.1%, respectively (all reflections in the 50.0-3.5 Å resolution range). The refined model consists of 10,554 atoms with 93 solvent molecules. Stereochemistry checks indicated that the refined model was in good agreement with expectations for models within this resolution range (Table 2). The asymmetric unit contains one HA1-HA2-HA3 complex (Fig. 2A); the structure of the trimer, which represents the natural form, was generated by crystallographic symmetry (Fig. 2B).
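For readers unfamiliar with the two R-factors quoted above, the conventional definition is R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|. The free R-factor uses the same formula on a test set of reflections excluded from refinement. The sketch below illustrates this with made-up amplitudes, not data from this structure. It omits the scale factor that refinement programs apply, and the random 5% selection is only one plausible way of choosing a test set; the function names are hypothetical.

```python
import random


def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|), without any scaling."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def split_free_set(n_reflections, fraction=0.05, seed=0):
    """Randomly hold out a fraction of reflection indices as the free set."""
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = round(n_reflections * fraction)
    return sorted(indices[n_free:]), sorted(indices[:n_free])


# Toy amplitudes for illustration only (not measured data).
f_obs_toy = [100.0, 200.0, 300.0, 400.0]
f_calc_toy = [110.0, 190.0, 330.0, 380.0]
r_toy = r_factor(f_obs_toy, f_calc_toy)

# 45,592 unique reflections are reported; a 5% hold-out is sketched here.
work_idx, free_idx = split_free_set(45_592)
print(f"toy R = {r_toy:.3f}; work = {len(work_idx)}, free = {len(free_idx)}")
```

A gap of a few percentage points between the working and free values, as reported here (20.2 versus 25.1%), is the usual pattern, because the free set is not used to fit the model.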
Structure of Whole HA-Whole HA has a triskelion-like fold and forms a pore in the center. Measuring from the central pore, the HA complex has a radius of 160 Å. Notably, the central pore (diameter of ~26 Å) is the same size as that of hemolysin, a pore-forming toxin produced by Staphylococcus aureus (30). The crystallographic trimeric fold is stabilized by a number of specific interactions between individual HA3 molecules, as observed in type C HA3 (31).
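The trimer described above arises from a crystallographic three-fold axis. In a hexagonal setting, a three-fold rotation about c maps fractional coordinates (x, y, z) to (−y, x − y, z) and then to (−x + y, −x, z). The sketch below applies these operators to a single point. It assumes, purely for illustration, that the axis passes through the cell origin (the position of the trimer axis in the deposited structure is not stated here) and uses an arbitrary point rather than real atomic coordinates.

```python
import math

A_CELL, C_CELL = 324.7, 117.6  # unit cell lengths (angstroms) from the text


def three_fold_copies(frac):
    """Three symmetry mates related by a 3-fold axis along c through the origin."""
    x, y, z = frac
    return [(x, y, z), (-y, x - y, z), (-x + y, -x, z)]


def frac_to_cart(frac, a=A_CELL, c=C_CELL):
    """Hexagonal cell with a along X, b at 120 degrees to a, and c along Z."""
    x, y, z = frac
    return (a * (x - 0.5 * y), a * (math.sqrt(3.0) / 2.0) * y, c * z)


def distance_from_c_axis(frac):
    """Distance (angstroms) of a point from the c axis."""
    cart_x, cart_y, _ = frac_to_cart(frac)
    return math.hypot(cart_x, cart_y)


point = (0.30, 0.10, 0.25)  # arbitrary illustrative fractional coordinate
copies = three_fold_copies(point)
radii = [distance_from_c_axis(p) for p in copies]
for p, r in zip(copies, radii):
    print(f"frac = ({p[0]:.2f}, {p[1]:.2f}, {p[2]:.2f}), radius = {r:.1f} A")
```

All three copies sit at the same distance from the axis and at the same height along c, which is the arrangement expected for three arms related by a three-fold rotation.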
Interactions between HA1 and HA2-HA1 has a dumbbell-like fold consisting of two β-trefoil domains, the N-terminal (I and I′, aa 9-146) and C-terminal (II and II′, aa 147-294) domains, connected by a short α-helix (Tyr147-Phe152). According to co-crystal studies and mutation experiments using type A and C HA1 (32-34), the C-terminal β-trefoil domain harbors major sugar-binding site(s) that determine the spectrum of carbohydrate recognition of whole HA and the 16 S toxin. No density was observed for the N-terminal residues, including the aspartic acid cluster and residues 7 and 8, implying that these residues adopt a disordered structure. The HA2 molecule consists of a single polypeptide chain with residues 5-145; no electron density was visible for the three N-terminal residues or the C-terminal Ile146. Comparison of the interface residues of type B HA1-HA2 with those of type D HA1-HA2 (Protein Data Bank ID 2E4M) using PISA revealed that the interface hydrophobic residues are well conserved between the two serotypes (Fig. 3, A and B). Accordingly, when type B HA1-HA2 and type D HA1-HA2 were overlapped using SSM from Coot, the interfacing regions of HA2 and the HA1-I and HA1-I′ domains fit well with each other; however, HA1-II and HA1-II′ of types B and D do not overlap (Fig. 4A). This differential interdomain arrangement was observed between two crystal structures of type A HA1 and between type A and C HA1 (35). The arrangement of type B HA1 resembles one of the structures of type A HA1 (Fig. 4B).
Footnote 5: S. Amatsu and K. Kitadokoro, unpublished data.
Interactions between HA2 and HA3-Among the key findings in the structure of whole HA are the details of the intramolecular interactions between HA2 and HA3. HA2 binds to the slightly lateral region of the apex of the triangle formed by the HA3 trimer. At the interface, one β-loop-β region of HA2 is inserted into a crevice formed by two β-loop-β regions of HA3-IV. Analyses of the scores of the interface performed using PISA revealed that the associations are mediated mainly by hydrophobic interactions, as well as by several salt bridges and hydrogen bonds. The HA3 β-loop-β regions from Ile565 to Gln584 interact with the hydrophobic phenylalanine cluster of HA2 (Phe7, Phe56, and Phe88) (Fig. 5A). Ile575 of HA3-IV makes a hydrophobic contact with the phenylalanine cluster of HA2, and the next residue (Asp576) forms a hydrogen bond with Arg54 of HA2 (Fig. 5A). On the other side, Phe547 of the HA3 β-loop-β regions from Tyr536 to Thr559 forms a hydrophobic interaction with Ile18, Ile92, and Ala93 (Fig. 5B), and the outside Lys550 of HA3-IV forms a hydrogen bond with Glu119 of HA2. These two specific interactions determine the relative positions of HA2 and HA3.
Structure of the Type B HA3 Trimer-HA3 is the biggest subcomponent in the whole HA complex, containing residues 19-626; no electron density was visible for the three N-terminal residues 19-21 and aa 189-207. HA3 exists naturally as a trimer, and HA3 is stable for crystallization. We have already obtained three types of type B HA3 crystals: one symmetrically monomeric, one trimeric, and one containing four molecules in the asymmetric unit (footnote 5). Comparison of the type B and C HA3 structures (31) revealed that the overall structures are similar. The type C HA3 structure was determined in the monomeric form, but the authors of that study incorrectly traced against the asymmetric unit because they lacked maps of electron density from residues 185-203. The crystal structures that we have solved with four molecules in the asymmetric unit (lacking density maps for aa 189-195) clearly indicate that the natural HA3 monomer has an elongated shape (Fig. 2A).
There are no remarkable structural differences between type B HA3 in the trimeric form and that in the whole HA complex, except in domain HA3-IV, which contains the HA2-binding regions (Fig. 5C). The HA2-binding sites of HA3-IV in whole HA adopt a closed conformation, in contrast to the corresponding region of domain IV in the trimeric HA3 structures. In particular, the two loops of HA3-IV in whole HA have moved to accommodate the HA2 molecule.
Structural Homolog of HA-A DALI search was performed on whole HA. As predicted, the results revealed that type C HA1, type C HA3, and type D HA1-HA2 complexes are similar to type B HA1, HA3, and HA1-HA2, respectively. Another structural homolog proposed for type B HA3 is CPE, a β-pore-forming enterotoxin produced by Clostridium perfringens (Protein Data Bank ID 3AM2; Z = 9.7) (Fig. 6, A and B). As described above, HA3 monomers consist primarily of four domains, HA3-I, HA3-II, HA3-III, and HA3-IV. The overall shapes of HA3-I and HA3-II are quite similar to that of the N-terminal pore-forming domain of CPE (called N-CPE), whereas HA3-III and HA3-IV resemble the C-terminal receptor-binding domain of CPE (called C-CPE), which binds to certain members of the claudin family, components of tight junctions (36).

DISCUSSION
In this study, we have provided the first crystal structure of whole HA, a multifunctional large protein complex that is critical for the high oral toxicity of the BoNT complex. Type B whole HA is a huge triskelion-shaped complex with an HA1:HA2:HA3 stoichiometry of 6:3:3. The crystal structure is consistent with the three-arm structure proposed previously for the 16 S toxin based on electron microscopy (37, 38).
Our reconstitution experiment showed that among all three pairwise combinations of the three proteins, only HA2 and HA3 form a complex. Although an HA1-HA2 complex was reported in type C HA (39), HA1 formed a stable complex with neither HA2 nor HA3 and was found in a complex only when all three subcomponents were present. These observations suggest that HA2-HA3 complex formation precedes HA1-HA2 association at least in type B. HA2 appears to play a key role in this process. Nevertheless, the detailed mechanism of HA complex formation remains to be elucidated.
As we described previously, HA/B disrupts the epithelial paracellular barrier by binding to E-cadherin via specific protein-protein interactions (21). Our TER data (Fig. 1B) show that basolaterally added HA3 alone does not possess barrier-disrupting activity; however, the HA2-HA3 complex disrupts the barrier almost as potently as whole HA and the 16 S toxin. These data indicate that HA3 has no ability to bind to E-cadherin, whereas the HA2-HA3 complex, whole HA, and the 16 S toxin possess comparable abilities to bind E-cadherin. The structural conformations of HA3 domains are highly conserved between the HA3 trimer and HA3 in whole HA, with the exception of domain HA3-IV, which contains HA2-binding regions.
These results indicate that the parts of HA that are essential for E-cadherin binding are located at the connecting region between HA2 and HA3-IV. A database search identified CPE as a structural homolog of HA3. HA3-I and HA3-II are homologous to N-CPE, whereas HA3-III and HA3-IV are homologous to C-CPE, which is the binding site for receptors (claudins). As described above, HA3-IV is presumed to be involved in E-cadherin binding. Furthermore, type C HA3-IV contains a sialic acid-binding site (31, 40), although the role of sialic acid binding is unclear. Type B HA3 has also been reported to bind sialic acid (15), and the amino acid residues involved in sialic acid binding in type C HA3 are conserved in type B. Therefore, the ligand-binding function appears to be conserved between C-CPE and HA3-IV. In contrast, the significance of the homology between the N-terminal parts of these proteins is unclear. The N terminus of CPE is responsible for β-pore formation on the host plasma membrane, in a manner analogous to that of S. aureus hemolysin (41, 42); however, the depth of the HA pore is insufficient to cross the membrane, and some of the amino acid residues conserved among β-pore-forming toxins are not present in HA3 (42). Consistent with this, HA3 has never been reported to form pores on the plasma membrane. In HA, this structure might be exploited for other purposes, e.g. NTNH binding. The N-terminally located long loop of NTNH, designated the nLoop, is involved in 16 S toxin formation (7, 38). This loop is located at the opposite side of the BoNT-interacting surface of NTNH, where it is fully accessible to HA (7). Taken together, these observations suggest that the pore might be used as the docking site for nLoop.
FIGURE 3. The calculations of interfaces between subcomponents were performed using PISA. A, the interface between HA1 and HA2 of type B. HA1 and HA2 are shown in yellow and magenta, respectively. Residues directly involved in contacts are colored blue and green for HA1 and HA2, respectively. B, the interface between HA1 and HA2 of type D (37). HA1 and HA2 are shown in blue and cyan, respectively. Residues directly involved in contacts are colored yellow and green for HA1 and HA2, respectively. The residues involved in the interface are conserved between type B and D HA1-HA2.
In nature, HA-containing BoNT complexes enter the host through the apical surface of intestinal epithelial cells. Our previous results show that HA disrupts the paracellular barrier of the epithelial monolayer via a two-step mechanism (6, 19, 21). First, HA on the apical surface translocates to the basolateral surface via transcytosis. Next, HA located on the basolateral surface binds to the extracellular domain of E-cadherin, thereby disrupting the paracellular barrier. The results of our functional reconstitution experiment show that the HA2-HA3 complex has attenuated barrier-disrupting activity relative to whole HA when these proteins are added apically, whereas it has the same potency as whole HA when applied basolaterally. These data provide evidence that HA1 plays an important role in the first step of HA action and that the HA2-HA3 complex is necessary and sufficient for the second step. Our results are consistent with the observation that the carbohydrate-binding activity of HA1 in the 16 S toxin plays an important role in the binding and transport of the toxin complex across intestinal epithelial monolayers (9, 10, 13-18). In the structure of whole HA, the orientation of the BoNT-NTNH binding side was easily inferred from studies of negative-stain electron microscopy of 16 S toxins (37, 38). On the basis of this, we propose a model for the interaction of the 16 S toxin with polarized epithelial cells (Fig. 7). First, HA1, which is located on the front of the 16 S toxin, binds to the apical cell surface via its carbohydrate-binding sites and translocates from the apical to the basolateral surface by transcytosis. Next, the HA2-HA3 connecting region interacts with the extracellular domain of E-cadherin at the basolateral surface.
Our structure shows that the deduced E-cadherin-binding site at the HA2-HA3 connecting area is located inside whole HA; however, this site would be accessible from the opposite side of the NTNH-binding region even after a complex is formed between HA and cell-surface carbohydrates. In the future, crystallization of whole HA in complex with E-cadherin will facilitate the understanding of the molecular mechanisms by which HA disrupts the paracellular barrier.
In summary, our structure of whole HA in the botulinum neurotoxin complex provides a comprehensive view of this huge multifunctional protein, which plays important roles in intestinal absorption of BoNT. This new information will generate opportunities for understanding the pathogenicity of the botulinum neurotoxin complex, for development of therapeutics against botulinum intoxication, and also for development of novel methods for transepithelial delivery of various macromolecule-based medications.
FIGURE 7. [...] (7), HA/B (this study), and a model of the type A 12 S and B16S toxins obtained by electron microscopy (38). Left panel, on the apical side of the epithelial cells, the toxin complex binds to the cell surface via the most distally located HA1 (dotted cyan circles), which allows toxin translocation across the cell. Right panel, on the basolateral side, the complex interacts with E-cadherin through the inner region of HA (dotted red circles), thereby disrupting the epithelial barrier. Red, type A BoNT; green, type A NTNH; blue, HA.