Modular Organization of α-Toxins from Scorpion Venom Mirrors Domain Structure of Their Targets, Sodium Channels*

Background: Scorpion α-toxins affect voltage-gated sodium channels in both mammals and insects. Results: We perform thorough computational analyses of α-toxin molecular architecture and structure-function relationship. Conclusion: Taxon specificity of “orphan” toxins can be predicted from a structural perspective. Significance: The proposed surface mapping technique is a new tool to analyze protein-protein complexes. To gain success in the evolutionary “arms race,” venomous animals such as scorpions produce diverse neurotoxins selected to hit targets in the nervous system of prey. Scorpion α-toxins affect insect and/or mammalian voltage-gated sodium channels (Navs) and thereby modify the excitability of muscle and nerve cells. Although more than 100 α-toxins are known and a number of them have been studied into detail, the molecular mechanism of their interaction with Navs is still poorly understood. Here, we employ extensive molecular dynamics simulations and spatial mapping of hydrophobic/hydrophilic properties distributed over the molecular surface of α-toxins. It is revealed that despite the small size and relatively rigid structure, these toxins possess modular organization from structural, functional, and evolutionary perspectives. The more conserved and rigid “core module” is supplemented with the “specificity module” (SM) that is comparatively flexible and variable and determines the taxon (mammal versus insect) specificity of α-toxin activity. We further show that SMs in mammal toxins are more flexible and hydrophilic than in insect toxins. Concomitant sequence-based analysis of the extracellular loops of Navs suggests that α-toxins recognize the channels using both modules. We propose that the core module binds to the voltage-sensing domain IV, whereas the more versatile SM interacts with the pore domain in repeat I of Navs. These findings corroborate and expand the hypothesis on different functional epitopes of toxins that has been reported previously. In effect, we propose that the modular structure in toxins evolved to match the domain architecture of Navs.

To gain success in the evolutionary "arms race," venomous animals such as scorpions produce diverse neurotoxins selected to hit targets in the nervous system of prey. Scorpion ␣-toxins affect insect and/or mammalian voltage-gated sodium channels (Na v s) and thereby modify the excitability of muscle and nerve cells. Although more than 100 ␣-toxins are known and a number of them have been studied into detail, the molecular mechanism of their interaction with Na v s is still poorly understood. Here, we employ extensive molecular dynamics simulations and spatial mapping of hydrophobic/hydrophilic properties distributed over the molecular surface of ␣-toxins. It is revealed that despite the small size and relatively rigid structure, these toxins possess modular organization from structural, functional, and evolutionary perspectives. The more conserved and rigid "core module" is supplemented with the "specificity module" (SM) that is comparatively flexible and variable and determines the taxon (mammal versus insect) specificity of ␣-toxin activity. We further show that SMs in mammal toxins are more flexible and hydrophilic than in insect toxins. Concomitant sequence-based analysis of the extracellular loops of Na v s suggests that ␣-toxins recognize the channels using both modules. We propose that the core module binds to the voltage-sensing domain IV, whereas the more versatile SM interacts with the pore domain in repeat I of Na v s. These findings corroborate and expand the hypothesis on different functional epitopes of toxins that has been reported previously. In effect, we propose that the modular structure in toxins evolved to match the domain architecture of Na v s.
Voltage-gated sodium channels (Na v s) 6 are vital components of the nervous system and muscles, playing a central role in the excitability of these tissues (1,2). Dysfunctions of Na v s cause a number of channelopathies (3), and approaches for treatment of these diseases are eagerly awaited.
The pore-forming ␣-subunit of Na v s (ϳ260 kDa) is organized in four non-identical but homologous repeats (pseudosubunits) (I-IV), each consisting of six transmembrane segments (S1-S6). Segments S1-S4 form a voltage-sensing domain (VSD) in each of the repeats (VSDs I-IV), whereas segments S5 and S6 from all four repeats contribute to the sole pore domain (PD) of Na v s (PDs I-IV will denote S5-S6 from corresponding repeats). We should note here that the homologous repeats of Na v s are more often referred to as "domains" in the literature. Although imperfect, we shall follow this conventional nomenclature.
Na v pharmacology is mostly determined by the ␣-subunit. Note that at present nine isoforms of ␣-subunit have been well characterized in humans (Na v 1.1-1.9), and just one appears to function in insects (e.g. the Para protein in Drosophila) (4,5). To date, a host of substances has been described affecting Na v s through binding to different parts of the channels, so-called receptor sites (6). One of the most prominent examples is scorpion ␣-toxins, classical and potent modulators of Na v s, which slow or inhibit channel inactivation, leading to prolongation of the action potential (7). Scorpion ␣-toxins bind to receptor site 3 of Na v s, whereas the first two sites are targeted by tetrodotoxin and batrachotoxin, respectively (8). Extensive mutagene-sis of Na v s (in particular the rat neuronal isoform Na v 1.2) allowed mapping of receptor site 3 to the extracellular surface of VSD IV of the channels with additional participation of the loops protruding from PD I (9 -11).
Besides being the armament of venomous animals, natural toxins are perfect tools for investigation of the channel structure and function. They also represent templates for generation of pharmacologically active compounds modifying selectively Na v s activity. Understanding of structure-activity and structure-selectivity relationships in toxins will serve as a firm basis for design of novel selective Na v modulators helpful in treating channelopathies in humans or, for example, generation of novel selective and safe insecticides.
Conventionally, scorpion ␣-toxins are divided into three groups based on taxon selectivity of their action: "mammal toxins" preferentially affecting mammals, "insect toxins" showing high insect toxicity, and "␣-like toxins" affecting both mammals and insects (12). This classification is not strict, however, because insect toxins, for instance, are able to kill mammals, albeit at higher doses. Scorpion ␣-toxins have been the objects of structural biology research for over 25 years (13,14), and it has been firmly established that these small proteins (ϳ65 residues) share a common ␤␣␤␤ motif, stapled by four disulfide bridges (7). Although much effort has been made to understand the structure-activity relationships in the three groups of ␣-toxins, no universal conclusion has been derived.
It is apparent that toxin selectivity is somehow encoded in the properties of its molecular surface and dynamics. A number of techniques have been developed for delineation of specific regions on molecular surfaces, which can be potentially involved in protein-protein recognition. Detailed mapping of surface hydrophobic/hydrophilic properties of two interacting molecules has been widely used for protein-ligand and proteinprotein interactions (15)(16)(17)(18), where molecular surface properties are projected onto a plane, cylinder, or sphere, aiding reduction of complexity and facilitating relatively straightforward comparison of projected "maps" rather than matching complex three-dimensional shapes.
To address the problem of ␣-toxin selectivity, we employ an original computational approach for detailed mapping and comparison of various physico-chemical properties, namely, hydrophobicity, flexibility, and electrostatic potential. This approach is based on spherical projection of the molecular properties and initially takes into account the dynamic behavior of molecules. Our results add novel atomistic description of putative mechanism of Na v -selective recognition by their peptide ligands and bring new formalism into the field of toxinology.

EXPERIMENTAL PROCEDURES
Homology Modeling-To enrich statistics, structures of several ␣-toxins with known pharmacological profile were modeled using related toxins as templates (see Table 1 for details). Alignments were produced with the ClustalW software (19) (Fig. 1). Homology modeling was performed with Modeler version 8.2 (20). The cis/trans configuration of the peptide bond between residues 8 and 9 was left "as is" in experimental structures; in models, the configuration was derived from templates, but it always got stabilized as trans for mammal and insect toxins and primarily cis for ␣-like toxins. The C termini of certain toxins were amidated where needed (see the footnotes to Table 1). 20 models were produced for each molecule; the  model with the lowest value of the energy-like Modeler objective function was selected for molecular dynamics (MD) calculations.
MD Simulations-The Gromacs version 4.0.7 package (21) and a unified protocol were used. Each toxin was placed in a water box (60 ϫ 60 ϫ 60 Å 3 ) together with the necessary amount of counterions and subjected to energy minimization, followed by heating to 300 K for 100 ps and 60 ns of unconstrained MD runs. The Gromos96 45a3 force field (22) and SPC water model (23) were used. MD simulations were carried out with a time step of 2 fs and imposed three-dimensional periodic boundary conditions in the isothermal-isobaric (NPT) (where N ϭ number of particles, P ϭ pressure, and T ϭ temperature) ensemble with an isotropic pressure of 1 bar and a constant temperature of 300 K. The temperature and the pressure were scaled using the V-scale thermostat (24) and Berendsen barostat (25) with 0.1 and 1 ps relaxation parameters, respectively. The van der Waals and electrostatic interactions were truncated using the twin range 10/12 Å spherical cut-off.
All trajectories were fitted to a single structure to eliminate rotation and translation and permit direct comparison of hydrophobic and electrostatic properties of molecules (see "Molecular Hydrophobicity Potential (MHP) Calculations"). To perform the analysis of essential motions, mass-weighted covariance matrices and their eigenvectors were extracted from MD trajectories using standard tools from the Gromacs package. The calculations were performed for protein heavy atoms and polar hydrogen atoms (as defined by the force field) using the 20 -60-ns time span of the trajectory. Root mean square fluctuation (RMSF-NM; "NM" represents "normal mode") was calculated from the trajectory filtered using the slowest mode (first eigenvector), and expressed as a measure of molecular flexibility. Fig. 2 was prepared with PyMOL (26); "intermediate" structures are the result of Gromacs normal mode analysis. Residueresidue maps of high amplitude correlated motions were constructed with an in-house Python script (Fig. 3). First, each residue in a toxin of length n was assigned a vector that describes its motion along the slowest mode. Then and n ϫ n map of dot products was calculated to reveal the high amplitude correlated motions. Second, an averaged map for each toxin group was plotted, in accordance with the sequence alignment ( Fig. 1) that relates structural elements. As a result of the alignment procedure, each map in Fig. 3A has dimensions of 69 ϫ 69 residues.
Molecular Hydrophobicity Potential (MHP) Calculations-The MHP approach assumes that each atom in a molecule possesses its "intrinsic" value of hydrophobicity (atomic hydrophobicity constant), taking the molecular topology into account (16,17). These constants have been determined from the database of experimental log P values for a large number of organic compounds (27); MHP at any given point is calculated as a superposition of contributions created by each atom, monotonically decay-ing with distance. The calculations were performed with the PLATINUM software (28).
Comparison of hydrophobic/hydrophilic properties of specificity modules (SMs) and core modules of ␣-toxins was accomplished by calculating the average dynamic hydrophobicity (͗MHP SM,Core ͘ MD ) independently for molecular surface of each module. ͗MHP SM ͘ and ͗MHP Core ͘ are the average MHP values over all points of the Connolly surface that belong to the SMs or core modules, respectively. Collecting statistics from an MD trajectory yields the average dynamic hydrophobicity and its S.D. value: ͗MHP SM,Core ͘ MD Ϯ S.D. When comparing two groups of toxins, two distributions of ͗MHP͘ MD are considered; the statistical significance of the uncovered difference is determined by the two-tailed unpaired Student's test (p value less than 0.05 was considered as significant).
Mapping Hydrophobic Properties on Molecular Surface-For more detailed and descriptive assessment of the hydrophobic/ hydrophilic properties of molecular surfaces, quantitative comparison, and recognition of similarity and distinction between molecules and groups of molecules, we employ a three-dimensional to two-dimensional mapping approach, which utilizes MHP spherical projection maps. The main idea is that comparison of similar, but still complex three-dimensional objects becomes more straightforward when data are presented in a regular form. Here, we applied transformation of molecular surfaces into spheres and then used one of the standard spherical projections to represent data in two dimensions as a rectangular grid. The main steps of the method are as follows. 1) The Connolly molecular surface was calculated. 2) MHP for each point of the surface was calculated (for these two steps, PLAT-INUM was used (28)). 3) Each point was projected onto a sphere (here, we arbitrarily chose a sphere radius of 20 Å). The center of the sphere was aligned with the geometrical center of the surface and set as the coordinate origin. Coordinates (x, y, z) could now be described by latitude ϭ arcsin(z/͉r͉) and longitude ϭ arctan(y/x), where r is the radius vector of the point, and latitude and longitude have the same meaning as in geography. 4) The MHP data were interpolated on a regular grid determined by latitude (Ϫ90°, ϩ90°) and longitude (Ϫ180°, ϩ180°), with a 1°step. 5) An equal-area Mollweide cartographic projection (29) was built. 6) The map was visualized using color contours.
Steps 3-6 were implemented in the Python script with step 4 performed with the module scipy.interpolate.griddata, step 5 performed with the module basemap, and step 6 performed with the module matplotlib. Interpolation of the data on a reg- Asterisks mark toxins that have amidated C termini (according to Uniprot annotation). b ␣-Toxins with unknown experimental three-dimensional structures. These molecules were modeled. The specified Uniprot accession numbers and Protein Data Bank (PDB) codes were used as a source of the modeled toxin sequence and its structural template, respectively (see "Experimental Procedures"). c These parameters were computed from the 20 -60-ns time span of MD simulations (the first 20 ns were considered as the equilibration period). The values are means Ϯ S.D. d Average MHP on the surface of the SMs and core modules. e Average root mean square fluctuation calculated from the trajectory, filtered using the first eigenvector (RMSF-NM) for residues of the RT loops and core domains. f Toxicity data were collected from the literature. Toxicity to insects was assayed on the cockroach B. germanica. Toxicity to mammals was assayed on mice by either intracerebroventricular (i.c.v.), subcutaneous (s.c.), or intravenous (i.v.) injections. g PD 50 , half-paralytic dose. ular grid permits not only fancier visualization of the maps but also a variety of operations like addition, subtraction, averaging, calculation of S.D., etc. MD-averaged data for each toxin under study were obtained, and Fig. 4 demonstrates group averaging. To make sure that spherical projections did not bring too much distortion into the original surface-distributed data, we additionally performed comparison of surface MHP (Fig. 4B).
Calculation of Na v Loop Hydrophobicity and Charge-An amino acid sequence alignment of seven insect (from Drosophila melanogaster, Musca domestica, Anopheles gambiae, Bombyx mori, Blattella germanica, Heliothis virescens, and Nasonia vitripennis) and 21 mammalian channels (Na v 1.1-1.7 from humans, rats, and mice) was built (not shown). For estimation of hydrophobicity, the MHP-based scale was used; each residue was assigned a log P value (P representing the water/octanol distribution coefficient), equal to the sum of atomic MHP constants (27). Alternatively, the Eisenberg whole-residue hydrophobicity scale was used (30), which yielded very similar results (not shown). For each channel loop, hydrophobicity was calculated as an average value over single residue hydrophobicities (Fig. 5A). Loop charges were calculated analogously (Fig. 5B). Because ␣and ␤-toxins have been proposed to interact with channels' VSDs at "clefts" between the S1-S2 and S3-S4 extracellular loops (10, 32), we have extended loop sequences by several residues in both directions to engage probable cleft surface.
Production of Recombinant Toxin BeM9-All steps were performed according to the published guidelines (33). DNA encoding BeM9 from the venom of the scorpion Mesobuthus eupeus (34) (Uniprot ID P09981) was assembled from synthetic oligonucleotides ( Table 2) by PCR. Because no methionine residues are found in BeM9, we introduced a methionine codon upstream of the toxin-coding sequence and used cyanogen bromide (CNBr) to liberate the toxin from the carrier protein thioredoxin (Trx; see below). The toxin DNA was cloned into the expression vector pET-32b (Novagen (Madison, WI)), which was then used to transform Escherichia coli BL21(DE3). Gene expression was induced by 0.2 mM isopropyl ␤-D-1-thiogalactopyranoside. Trx-BeM9 fusion protein was isolated by affinity chromatography on TALON Superflow Metal Affinity Resin (Clontech, Mountain View, CA) and then hydrolyzed using CNBr as suggested (35). Recombinant BeM9 was purified by reversed-phase HPLC on a Jupiter C 5 column (250 ϫ 4.6 mm; Phenomenex, Torrance, CA).
Two-electrode voltage-clamp recordings were performed at room temperature using a Geneclamp 500 amplifier (Molecular Devices, Downingtown, PA) controlled by a pClamp data acquisition system (Axon Instruments, Union City, CA). Bath solution composition was the same as oocyte incubation solution but without gentamicin. Sodium current traces were evoked from a holding potential of Ϫ90 mV by 100-ms depolarizations to V max (the voltage corresponding to maximal sodium current in control conditions) with a start-to-start interval of 0.2 Hz. All data were analyzed using pClamp Clampfit version 10.0 (Molecular Devices) and Origin version 7.5 (OriginLab, Northampton, MA) software.

RESULTS
The flowchart of the study is as follows. 1) A comprehensive database of currently available three-dimensional structures of ␣-toxins was established and supplemented by homology models of several other well characterized members of the family. 2) A series of MD simulations was carried out, and group analysis of essential motions was performed. 3) Hydrophobic/hydrophilic properties were calculated on the molecular surface using the MHP approach and averaged over MD trajectories. 4) These properties were mapped using spherical projection. 5) Hydrophobic/hydrophilic properties of the extracellular loops of Na v s were estimated based on their sequences and compared with the proposed modular organization of the toxins. 6) Activity of several ␣-toxins from M. eupeus venom was predicted based on the calculated properties and compared with experimental data.
Homology Modeling of Scorpion ␣-Toxins-To perform analysis of dynamic and physicochemical properties of ␣-toxins that might determine their phylum selectivity and toxicity, we created a database of available experimental structures of ␣-toxins and divided them into three groups, according to the published data on their toxicity: 1) "classic" mammal toxins, 2) insect toxins, and 3) ␣-like toxins that are active on both phyla. We should note that the boundaries between the three groups are not strict (see toxicity data in Table 1). It seems, however, that in mammals, the three groups show differential activity with respect to Na v isoforms. To make the statistics more robust, we extended the database by homology models of several toxins with unknown three-dimensional structure but clearly described pharmacological profile. The high conservation of ␣-toxin spatial structure makes homology modeling rather straightforward. In total, the data set included eight mammal, six insect, and 13 ␣-like toxins (Table 1 and Fig. 1).

MD Simulations Reveal Modular
Organization of Scorpion ␣-Toxins-Comparison of static structures can be biased, especially in the case of homology models. To take into account the flexibility and to study the dynamic organization of scorpion ␣-toxins, we performed an analysis of the essential motions based on 60-ns-long MD simulations in a box with explicit water. Here, the last 40 ns of each MD trajectory were used, assuming that toxin structural parameters reach equilibrium after the first 20 ns. The total MD statistics for all toxins exceeds 1 s.
Analysis of several "slow" (low frequency) modes (eigenvectors 1-5) reveals three regions in ␣-toxin structure that display relatively independent movements: 1) the N-terminal reverse turn (RT) loop (residues 8 -12; throughout, the numbering is according to Aah2) coupled with the C terminus (residues 56 -64) and constituting the so-called "RC domain" previously identified based on biochemical data (37)(38)(39)(40)(41); 2) the ␤ 2 -␤ 3 loop (residues 39 -43); and 3) the rest of the molecule (its "core"). A common feature of all ␣-toxins is the "reciprocal" motion of the ␤ 2 -␤ 3 loop and the C terminus (see below). Based on the dynamic behavior, we rationalize that the ␣-toxin structure comprises two parts, or modules: the core module and the FIGURE 1. Sequence alignment of ␣-toxins used in the study. Residues are numbered according to the Aah2 toxin. Conserved secondary structure elements and the disulfide bridges are shown below. Residues belonging to the SM are boxed. Residues are colored as follows. Red, negatively charged (Ϫ; Asp and Glu); blue, positively charged (ϩ; Arg and Lys); green, polar (ϳ; Ser, Thr, Asn, Gln, and His); orange, hydrophobic aromatic (@; Trp, Tyr, and Phe); olive, hydrophobic aliphatic (#; Ala, Val, Leu, Ile, and Met); gold background, cysteine; black, glycine and proline. A "consensus" sequence is given below each group, where symbols indicate conservation of Ն75%. Residues that are conserved within but differ between the groups are shown on a pink background and are considered as "functionally variable" (see Fig. 4A). Residues that have been hypothesized to evolve under positive selection are marked with red arrows above (66,67). SM (Fig. 2). The latter encompasses the "RC domain" and the ␤ 2 -␤ 3 loop. Interestingly, the "classic" mammal toxins have the most conformationally flexible SMs. The RT loop of these toxins demonstrates the highest amplitude of motions along the slowest mode (an example for insect and mammal toxins is shown in Fig. 2).
We applied group analysis and built averaged residue-residue maps of high amplitude correlated motions (Fig. 3A). First, these maps allow identification of the most flexible regions that are common for all toxins in the group (diagonal elements). It is seen that mammal toxins have very mobile RT loops as compared with insect toxins (Fig. 3A, large boxes). High mobility of the RT loop in mammal toxins is also detected by analysis of RMSF along the first eigenvector (RMSF-NM) (Fig. 3B). Second, the off-diagonal elements illustrate the correlated (blue regions), anti-correlated (red regions), or non-correlated (white regions) character of motions between different parts of a protein. The small boxes in Fig. 3A highlight that in mammal toxins, the RT loop moves collectively with the C terminus of the protein, whereas in insect toxins, this is not the case. The "lakelike" pattern in mammal toxins (inside the small box) mostly corresponds to the "twisting" motion of the RT loop and C terminus. The reciprocal movements of the ␤ 2 -␤ 3 loop and C terminus are found in all toxins (red region around residues 43 and 62). In mammal toxins, the RT loop and C terminus move in a concerted manner with respect to the core module (Aah2 is shown in Fig. 2). On the contrary, in insect toxins, the "RC domain" is relatively rigid, and the most mobile part is the ␤ 2 -␤ 3 loop (Fig. 2, Lqq III).
It is worth noting that mammal toxins have conserved glycine residues in positions 4 and 17 (Fig. 1). Although distant from the RT loop, these residues might affect the loop flexibility, acting like "hinges" (Fig. 2). In contrast, insect toxins most frequently have alanine or phenylalanine residues in the same positions associated with the lower amplitude of the RT loop motions in these toxins. To test the putative role of Gly-4 and Gly-17 in toxin flexibility, we introduced respective in silico mutations in a typical mammal toxin Bot 3 (G17A and G17F) and typical insect toxins Lqq III (F17G) and BmK ␣IT1 (A17G) and simulated their MD using the standard protocol. As a result, it was shown that the flexibility increased in BmK ␣IT1 (A17G) (RMSF-NM ϭ 0.11 Ϯ 0.03 versus 0.04 Ϯ 0.01 nm, mutant versus wild type) and dropped in Bot3 (G17F) (RMSF-NM ϭ 0.09 Ϯ 0.02 versus 0.13 Ϯ 0.03 nm, mutant versus wild type). For the two other mutants, no apparent effect on flexibility was found. The double mutant of Lqq III toxin (A4G/F17G) was studied to see if the introduction of both "hinge" residues increases RT loop flexibility (in this case, a single substitution F17G is not enough). Accordingly, the flexibility increased (RMSF-NM ϭ 0.10 Ϯ 0.03 versus 0.05 Ϯ 0.03 nm, mutant versus wild type).
Mammal ␣-Toxins Possess Prominently Hydrophilic Specificity Modules-To compare the average dynamic hydrophobicity in the three groups of toxins, we employed the MHP approach (16). It represents a powerful tool to calculate spatial distribution of hydrophobic/hydrophilic properties proven to be important in molecular recognition (15). Similar dynamic analysis of the surface hydrophobicity has recently been used in characterization of posttranslational modification effects in globular proteins (42,43).
In addition, we employed an original computational method for two-dimensional mapping of properties like MHP and electrostatic potential, distributed over molecular surfaces of the toxins. This method utilizes spherical projection maps and is helpful in the delineation of a possible relationship between protein surface properties and activity. The hydrophobic/hydrophilic maps are constructed as follows (see "Experimental Materials" for further details). 1) The MHP value is calculated in each point of the Connolly surface of a molecule; 2) these values are projected onto a sphere concentric with the surface; 3) spherical MHP projection is interpolated on a rectangular coordinate grid; and 4) a Mollweide equal area projection (often used for global world or sky maps) is built. A detailed assessment of this method's potential to relate structural and dynamic features of small proteins to their activity (with a set of biologically relevant examples) will be published elsewhere.
Hydrophobicity maps for individual mammal, insect, and ␣-like toxins were built (not shown). Fig. 4A shows group-and MD-averaged maps that help to spot common and distinct features between these groups of toxins. A common feature for all ␣-toxins is the interchange of hydrophobic "isles" and hydrophilic "lakes". At the same time, the large and "deep" hydrophilic "lake" is present only in mammal toxins. Surprisingly, this "lake" is located inside the SM (marked with a red border) and moreover perfectly superimposes with the "RC domain." Correspondingly, quantitative analysis of averaged dynamic hydrophobicity reveals conserved properties of the core module and the variability of SM (Fig. 4B). This result is statistically significant; ͗MHP SM ͘ MD values (see "Experimental Procedures") for mammal toxins and either insect or ␣-like toxins differ with p Ͻ 10 Ϫ4 . At the same time, the SMs present non-uniformly distributed MHP values for groups of mammal and especially ␣-like toxins (Table 1), thus suggesting that the binding sites for ␣-toxins in mammalian Na v s may differ substantially.

Sequence-based Analysis of Extracellular Loops in Sodium
Channels-To date, the three-dimensional structure of eukaryotic Na v s is unknown. The recently solved structures of bacterial sodium channels (44 -46) represent homotetramers, in contrast to single-chain quasitetrameric eukaryotic channels. No structures of complexes with toxins are available yet, although there is a wealth of biochemical and modeling data on toxin-channel interactions. It has been experimentally established that scorpion ␣-toxins interact preferentially with VSD IV of Na v s (9), with additional evidence that contacts may also form with loops in PD I (47).
We find that the observed modular organization of scorpion ␣-toxins (particularly the "asymmetry" of hydrophobic/ hydrophilic properties) is mirrored in properties of Na v extracellular loops. To assess the probable differences in Na v s from diverse animals, we calculated the average net hydrophobicities of those loops that can participate in toxin binding: S1-S2 and S3-S4 loops of repeat domain IV (VSD IV) and S5-P and P-S6 loops of repeat domains I and III (PD I and PD III), using sequences of seven insect and 21 mammalian channels. The most notable feature is the conserved hydrophobic properties of VSD IV loops (Fig. 5A). This comes in contrast to all other repeat domains (not shown in Fig. 5). Taking into account the pronounced differences in the SMs between mammal and insect ␣-toxins (see above), it is unlikely that the toxins bind to VSD IV by these modules. From the perspective of loop hydrophobicity, it seems more probable that the conserved core module is responsible for interaction with VSD IV. Instead, the SMs may interact with the adjacent repeat domain pore loops. Both S5-P and P-S6 pore loops of repeat domain I have a hydrophobicity pattern similar to the corresponding toxins; mammalian channels possess more hydrophilic loops. The P-S6 loop in PD III is much more hydrophilic in insect channels and is unlikely to be involved in the interaction with toxins.
Analogously to hydrophobicity, the average electric charge of the same set of extracellular loops of insect and mammalian channels was calculated (Fig. 5B). The net charge of the channel loops is negative to match the positive charge of ␣-toxins. The largest negative charge is carried by pore loops in repeat domains I (S5-P) and IV (P-S6). Overall, loop charge in insect and mammal channels is similar, with VSDs I and IV being most The scale is the dot product of two vectors, describing the motion of each residue in the first eigenvector-related dynamics. Note that maps are symmetrical with respect to the diagonal. White strips in the maps correspond to gaps in the alignment, presented in Fig. 1. B, mammal toxins feature the most flexible RT loops. Group-averaged mean RMSF-NM values for residues in the RT loop and core modules are plotted. p values are given to emphasize the significance of difference between groups of mammal and insect toxins. Error bars, S.D.
conserved. Moreover, there is little difference in electrostatic properties between toxin groups (not shown). It is therefore unlikely that ␣-toxin selectivity is determined by electrostatic interactions with the target channels.
Validation of the Method; Activity Prediction of Toxins from M. eupeus-Although peptides BeM9, -10, and -14 from M. eupeus scorpion venom were among the first ␣-toxins to be described (34,48) and the first to be studied structurally (13,14), they were rather poorly characterized biochemically, and no exact evidence existed that they affected insect or mammalian Na v s. Recent work by Zhu et al. has described the activity of four other ␣-toxins from this venom named MeuNaTx␣-1, -2, -4, and -5 (36). Because those results were not included in our reference list (Table 1), we put our approach to the test.
We applied the developed strategy to predict the activity of M. eupeus toxins. From the analysis of dynamic and hydrophobic properties (Fig. 6, A and B), we assign BeM9 and MeuNaTx␣-1 and -2 toxins to either the insect or ␣-like group (͗MHP SM ͘ MD ϭ 0.051 Ϯ 0.018, Ϫ0.069 Ϯ 0.022, and Ϫ0.033 Ϯ 0.039, respectively; compare with the values in Table 1; typical values for mammal toxins are ϽϪ0.1). Furthermore, BeM9 has dynamic features characteristic of insect toxins: a relatively rigid RT loop and rather flexible ␤ 2 -␤ 3 loop. Please note that our assignment disagrees with the phylogenic tree (Fig. 6C); MeuNaTx␣-1 is suggested to belong to the mammal toxin group from homology.
BeM9 was produced recombinantly in a conventional E. coli system with Trx as the fusion partner. CNBr was used for target peptide separation from Trx, and the toxin was then recovered by HPLC (Fig. 7A); the yield was 2 mg/liter of bacterial culture. BeM9 activity was then tested against a number of mammalian Na v s, and the insect Para channel expressed in X. laevis oocytes using the voltage clamp technique (Fig. 7B). Similarly to typical ␣-like toxins, BeM9 was found to affect both insect and mammalian Na v s, with the exception of Na v 1.2, the major isoform found in the central nervous system. Both MeuNaTx␣-1 and -2 were characterized as ␣-like toxins by Zhu et al. (36), supporting our assumptions.

DISCUSSION
Finding essential similarities and differences in structures of biological molecules is a major challenge. In contrast to small molecules, where pharmacophore-based analyses (49) and QSAR-approaches (50) are applied routinely to mine active molecules from computer databases of chemical compounds, there are no widespread techniques that can address this task for proteins. The protein-protein docking field is coming of age, but it still has a number of limitations (51) and can hardly be applied if the spatial structure of either partner is unknown or poorly determined. One of the most essential properties of a molecule is its surface, which represents the interface of intermolecular communication. The concept of molecular surfaces FIGURE 4. SMs are more hydrophilic in mammal than insect ␣-toxins. A, MHP spherical projection maps reveal alteration of hydrophobic and hydrophilic areas. The SMs are enclosed by a red border (this is a projection from the Aah2 toxin static structure). For clarity, projections of C ␣ atoms are shown for Aah2, Lqq III, and BmK M1 toxins on the mammal, insect, and ␣-like group maps, respectively. For mammal toxins, the "functionally variable" residues (see Fig. 1 persists in structural and computational biology for more than 40 years (52-54), but its potential has more to deliver to the field. Surface mapping and comparison is widely used in computational molecular biology and computer-aided drug design (18,(55)(56)(57)(58). For example, calculation of hydrophobic "complementarity" in the binding pockets at the protein/ligand interface uncovers important principles in molecular recognition, and this idea may be applied to improve docking results (15,28). Moreover, surface hydrophobicity is a good measure for conformational transitions (42,43) and generally "tags" areas of intermolecular recognition (16).
Surface Mapping of Scorpion ␣-Toxins-In this work, we utilize a new approach to combine several computational methods for whole-molecule mapping of various physicochemical properties on the molecular surface. We intentionally simplify the complex shape of molecules by projecting their surface onto a sphere. This operation is legitimate for relatively small molecules of approximately spherical shape (like ␣-toxins), and it can be readily adapted for ellipsoid shape. The following advantages of the new surface mapping method should be outlined. 1) It allows for easy and clear visualization of the whole molecular surface at once. The area of interest can easily be "zoomed" with less distortion. 2) Dynamic behavior of the molecules can easily be taken into account by calculating the "average map" over individual maps for MD snapshots. Even such rigid molecules as ␣-toxins have intrinsic dynamics, which may be even more important for other objects. 3) Map averaging reveals the most essential common features in groups of molecules; the less important features will be "averaged out" in the resulting map. 4) Comparison of averaged maps for groups of molecules with different activity highlights functional patches (Fig. 4). Moreover, the differential maps are easily constructed to emphasize the differences.
The main shortcoming is transformation of the spherical surface to a two-dimensional coordinate grid, because it distorts data; the other way is not to build projections but to perform comparison directly in the spheroid vertices (18). Technical details of the method and assessment of its general applicability using a set of biologically active polypeptides will be published elsewhere.
Scorpion ␣-Toxins Possess Modular Organization-The spatial structure of scorpion ␣-toxins may be described in terms of a ␤␣␤␤ scaffold (see Fig. 2). Together with plant defensins that present the same type of fold, scorpion toxins are classified into a separate superfamily of knottins by SCOP, and into the homologous superfamily 3.30.30.10 (mixed ␣-␤ class) by CATH.
Despite their relatively small size, close inspection identifies modular organization of scorpion ␣-toxins. Indeed, our results support dissection of these molecules into two modules or subdomains, the core modules and SMs (see above and Fig. 1 for localization). We note that in the conventional expressions "core domain" and "RC domain" (a part of SM), the term "domain" is used quite incorrectly and prevails due to historical reasons. It is advisable instead to use the more correct terms "part," "subdomain," or "module." The dissection arises from the following observations. 1) MD reveals that the two modules of scorpion ␣-toxins demonstrate essential motions independent from each other and can therefore be considered as "dynamic domains" (Figs. 2 and 3A). Moreover, the high mobility of SMs reliably distinguishes mammal toxins (Fig. 3B). It is important to note that analysis of available NMR and x-ray structures of ␣-toxins readily shows higher mobility of the "RC domain" as compared with the core domain in each particular structure (59 -61), yet only MD simulation in the same setup is suitable for comparison of dynamic features of different molecules. Two glycine residues (Gly-4 and Gly-17 in Aah2) are conserved in mammal toxins, and these are believed to act as "hinges" for the SMs. This conclusion is supported by considering the mobility of in silico mutated toxins. 2) Whereas virtually no difference in bulk physico-chemical properties is noted between the core modules in mammal, insect, and ␣-like toxins, the SMs of the former are significantly more hydrophilic (elegantly visualized on the spherical projection maps; Fig. 4A). Although previous studies suggested the division of ␣-toxin molecules into two parts (37,39,41,62), never before was this clearly illustrated from the MD and surface hydrophobicity. 3) Functionally, the core module seems to determine the overall ability of ␣-toxins to target Na v s, whereas the SM determines toxin specificity and eventually its classification as mammal, insect, or ␣-like. This conclusion is derived from numerous mutagenesis studies (63-65) and is illustrated by our results. Probably the most direct evidence was provided by the Gurevitz group by excision of the "RC domain" from the insect toxin Lqh ␣IT and its implantation into the mammal toxin Aah2; the resulting chimera featured activity of the "RC domain" donor Lqh ␣IT (37,62). 4) From the evolutionary point of view, we may consider the SM as a fast changing segment, whereas the core module is more conserved (for distribution of "functionally variable" residues over the toxin surface map, see Fig. 4A). The SM seems to evolve more quickly to adjust the toxin specificity to target different Na v s. Note that many of the amino acid residues that have been proposed to evolve under positive selection (66, 67) (see red arrows in Fig. 1) are located inside the SM. Two positions, 39 and 41, reside in the ␤ 2 -␤ 3 loop, thereby supporting our finding that this loop is part of the SM. It was also mentioned in the literature that the C  terminus of scorpion toxins has gone through structural rearrangements during evolution, probably to adapt toward new target sites (68).
Although the three groups of toxins may be differentiated based on the functional and structural properties, we should note that they rather form a continuum, not a discrete pattern, with several toxin species presenting "intermediate" properties that are difficult to assign to a particular group. This fact is illustrated by the limited correlation between the toxicity of ␣-toxins to mammals or insects and the hydrophobic/hydrophilic properties of their surface (Table 1).
We should also notice that current classification of Na v toxins, which is based on animal toxicity and competitive binding data, is vague (e.g. "insect toxins" are not completely insectspecific; they are often reported active on mammalian channels (also see Table 1)). More biochemically accurate classification based on individual channel isoform recordings is needed. If a sufficiently complete body of such data were available, computational analysis would result in more straightforward structure-activity relationships.
Differences between Toxins Are Mirrored in the Structure of the Target Ion Channels-To date, the three-dimensional structure of eukaryotic Na v s remains elusive. A breakthrough in the field was the recently presented crystal structures of bacterial Na v s (44 -46). The major difference between eukaryotic Na v s and their prokaryotic counterparts is that the ␣-subunits of the former are monomers, whereas the latter are homotetramers. A problem of repeat domain orientation arises for eukaryotic Na v s; both "clockwise" and "counterclockwise" orientations seem possible. Clockwise orientation (if viewed from the extracellular side) was predicted from analysis of interactions with a pore-blocking -conotoxin (69) and is currently backed by most investigators in the field. This orientation was also determined from mutant double cycle analysis of scorpion ␤-toxin Css4 and the rat Na v 1.4 channel (70). A clockwise disposition of VSDs with respect to the S5-S6 segments of the same subunit takes place (so that VSD I, for instance, is in close proximity to PD II) in bacterial Na v (44). Scorpion ␣-toxins are known to interact with the so-called receptor site 3, which is believed to locate on the extracellular surface of VSD IV and PD I of Na v s (9 -11).
Because the SMs of insect toxins, determining their taxon specificity, are significantly more hydrophobic than those of mammal toxins, we compared extracellular loops of the respective target channels (i.e. insect versus mammalian Na v s) (Fig. 5). We find that in terms of hydrophobicity, the extracellular surface of VSD IV is very similar in channels from different animals (and this is not the case for VSDs I-III). Conversely, the extracellular surface of PD I is significantly more hydrophobic in insect channels. It is therefore reasonable to conclude that the conserved core module of scorpion ␣-toxins binds to loops S1-S2 and S3-S4 in repeat domain IV, whereas the SM interacts with loops S5-P/P-S6 from repeat domain I of Na v s (this corresponds to a "clockwise" arrangement of the repeat domains; Fig.  8). This hypothesis, based on our simple consideration of toxin and receptor properties, is supported by the results of less advanced sequence analysis (71) and computer docking simulations performed by other groups (10, 11).

CONCLUSION
Our simulation results suggest that scorpion ␣-toxins possess modular organization, and individual modules interact with different parts of their target channels. We propose that the stable and conserved core module binds to the conserved VSD IV and provides toxin activity toward Na v s per se. In contrast, the mobile and variable SM interacts with the variable loops of PD I. The latter interaction probably underlies the observed taxon specificity (mammals versus insects) of ␣-toxins.
Thorough comparison of structural, hydrophobic/hydrophilic, and dynamic properties of mammal, insect, and ␣-like toxins led us to identification of molecular determinants underlying their specificity. The SM was found considerably more hydrophilic and flexible in the mammal toxins, whereas in insect toxins, the same module was much more hydrophobic and rigid. As expected, ␣-like toxins feature intermediate hydrophobicity and flexibility. We hypothesize that ␣-toxins have acquired a modular architecture in the evolutionary arms race to effectively target the multidomain Na v s.
Surface mapping serves as an alternative method to predict orphan ␣-toxin activity. We validated the approach by assigning the specificity of several toxins from M. eupeus venom. Importantly, the method has shown advantages compared with the conventional sequence-based predictions.
Our findings may aid the development of novel Na v ligands for treatment of channelopathies or fight against agricultural pests. Moreover, the proposed algorithm for mapping of physico-chemical properties on the molecular surface is of a general nature and may be utilized for detailed comparison of groups of proteins and protein-protein complexes. ; the pore is located in the center, and VSDs are located around it. Results of molecular modeling and biochemical experiments suggest that toxin core modules interact with conserved VSD IV, whereas the variable and flexible SMs may bind to pore loops from repeat domain I. The view is from the extracellular side of the membrane.