Molecular characterization of the interaction of sialic acid with the periplasmic binding protein from Haemophilus ducreyi

The primary role of bacterial periplasmic binding proteins is sequestration of essential metabolites present at a low concentration in the periplasm and making them available for active transporters that transfer these ligands into the bacterial cell. The periplasmic binding proteins (SiaPs) from the tripartite ATP-independent periplasmic (TRAP) transport system that transports mammalian host–derived sialic acids have been well studied from different pathogenic bacteria, including Haemophilus influenzae, Fusobacterium nucleatum, Pasteurella multocida, and Vibrio cholerae. SiaPs bind the sialic acid N-acetylneuraminic acid (Neu5Ac) with nanomolar affinity by forming electrostatic and hydrogen-bonding interactions. Here, we report the crystal structure of a periplasmic binding protein (SatA) of the ATP-binding cassette (ABC) transport system from the pathogenic bacterium Haemophilus ducreyi. The structure of Hd-SatA in the native form and sialic acid–bound forms (with Neu5Ac and N-glycolylneuraminic acid (Neu5Gc)), determined to 2.2, 1.5, and 2.5 Å resolutions, respectively, revealed a ligand-binding site that is very different from those of the SiaPs of the TRAP transport system. A structural comparison along with thermodynamic studies suggested that similar affinities are achieved in the two classes of proteins through distinct mechanisms, one enthalpically driven and the other entropically driven. In summary, our structural and thermodynamic characterization of Hd-SatA reveals that it binds sialic acids with nanomolar affinity and that this binding is an entropically driven process. This information is important for future structure-based drug design against this pathogen and related bacteria.

Protein-protein interactions and protein-ligand interactions play a key role in all living organisms, ranging from prokaryotes to eukaryotes (1,2). These interactions are highly specific and regulate several downstream signaling pathways and cellular metabolic processes (3,4). Understanding the molecular details of protein-ligand interactions is a key step in the structure-assisted drug discovery process. Electrostatic interactions, hydrogen-bonding interactions, van der Waals interactions, and hydrophobic interactions contribute to the binding affinity of a ligand to a protein. Thermodynamically, both enthalpy (ΔH) and entropy (ΔS) contribute to the binding affinity. The difference in enthalpy and entropy between the bound and unbound forms dictates the spontaneity of the binding reaction. The enthalpy of the system is mainly derived from noncovalent interactions, such as hydrogen bonding, ionic interactions, and van der Waals contacts between polar side chains of the amino acids and ligand. In addition, hydrophobic interactions between ligands and proteins are responsible for the displacement of water, which contributes to either positive or negative net entropy and generates randomness or order within the system. In biological systems, enthalpy-entropy compensation of protein-ligand interactions dictates how spontaneous the reactions will be (5,6).
Sialic acids are nine-carbon-containing sugars, and the carboxylate group at the C1 position imparts the acidic nature to the molecule. In vertebrates and in associated symbiotic and pathogenic bacteria, these sugars are present as the outermost sugars on glycoproteins and glycolipids of the outermost cell surface (7,8). Based on structural differences, sialic acids exist in ~50 different isoforms, and the two most common sugars are N-acetylneuraminic acid (Neu5Ac) and N-glycolylneuraminic acid (Neu5Gc). Chemically, Neu5Gc differs from Neu5Ac by a single hydroxyl group at the C5 position (9-11). Most pathogenic bacteria that reside in the mucus-rich regions lack the biosynthetic pathways to make Neu5Ac, so they scavenge and transport host-derived sialic acids for their long-term survival. Sialic acids are transported across the bacterial periplasmic membrane by tripartite ATP-independent periplasmic (TRAP), ABC, or major facilitator superfamily (MFS) transport systems. After transport of sialic acid into the cytoplasm, it is incorporated as the outermost sugar on the lipooligosaccharide (LOS) by CMP-sialic acid synthetase (SiaB) and sialyltransferases (SiaA, Lic3A, Lic3B, or LsgB). Pathogenic bacteria incorporate sialic acid during lipid glycosylation to evade the host immune system by "molecular mimicry" (12). Previously, it was reported that strains carrying mutations in either SiaB or the sialyltransferase gene Lic3A, which encodes an α2,3-sialyltransferase, exhibited less resistance to the killing effects of human serum (13,14). In addition to lipid glycosylation, part of the internalized sialic acid is often converted into fructose 6-phosphate by a pathway of enzymes that includes NanA, NanK, NanE, NagA, and NagB, which enables the bacteria to use it as an energy source (15).
Gram-negative bacteria, such as Haemophilus influenzae, Pasteurella multocida, Fusobacterium nucleatum, and Vibrio cholerae, utilize the TRAP transport system for sialic acid transport (16,17). The sialic acid-binding proteins (SiaPs) of the TRAP system have been structurally and functionally characterized (18-20). These proteins contain two domains, and sialic acid binds between these two domains in a conserved binding pocket. In SiaPs, ligand binding generates conformational changes in the hinge region, which cause the two domains to close on the substrate. This mechanism is called either the Venus flytrap mechanism or the Pac-Man model (21,22). Thermodynamic studies have shown that the binding of sialic acids to SiaPs is an enthalpically favored process.
However, Haemophilus ducreyi, another Gram-negative bacterium, has been reported to use the ABC transport system for sialic acid transport, utilizing ATP hydrolysis to import sugars across the periplasmic membrane (23,24). H. ducreyi is a pathogenic Gram-negative coccobacillus belonging to the Pasteurellaceae family. It causes a sexually transmitted genital disease called chancroid (25). This disease is a major public health concern in developing and low socio-economic level countries due to an increased risk of co-transmission with HIV (26,27). The pathogenicity of H. ducreyi in humans is caused by several virulence factors, including the hemolytic toxin (28,29) and cytolethal-distending toxin (30-32). In addition, the LOS of H. ducreyi acts as a putative virulence factor that aids in bacterial adherence to human skin fibroblasts and mucosal epithelial cells (33,34). LOS terminates with N-acetyllactosamine and sialyl-N-acetyllactosamine, which is heavily sialylated (35).
In H. ducreyi, the ABC transport system that transports sialic acid is structurally encoded by the four-gene operon SatABCD. In this operon, SatA encodes a periplasmic binding protein that binds to sialic acid in the periplasm. SatBCD encodes the ABC-type integral membrane transporter that transports the delivered sugar using the energy derived from ATP hydrolysis (36). Similar to the maltose-binding protein system, it is expected that SatA bound to Neu5Ac/Neu5Gc anchors onto the transmembrane domain of the ABC transporter. This signals ATP binding, the sialic acid is then released from SatA, and the sugar is transported from the periplasm to the cytoplasm by SatBCD (37-39).
ABC transporters import and efflux a variety of different substrates across the lipid bilayer, including ions, peptides, proteins, and sugars. Although several periplasmic binding proteins of ABC transporters that transport a wide range of other substrates have been described previously, the structural and molecular mechanism of sialic acid binding to SatA in ABC transporters is not known. The sequence similarity between SiaP from TRAP transporters and SatA from ABC transporters is ~20-25%; thus, it is important to structurally and functionally characterize SatA from ABC transporters.
Here, we report the structures of the sialic acid-binding protein of H. ducreyi, which uses the ABC transporter system (Hd-SatA), in unliganded and ligand-bound forms. In addition, we have thermodynamically characterized Hd-SatA binding to its ligands Neu5Ac and Neu5Gc. Site-directed mutagenesis of residues that bind to the sugars followed by measurement of binding affinities shed light on the contribution of these residues to the binding phenomenon.

Hd-SatA unliganded, Neu5Ac-bound, and Neu5Gc-bound structures
Hd-SatA was crystallized in three different forms: unliganded, Neu5Ac-bound, and Neu5Gc-bound. Hd-SatA has low sequence identity (20-25%) with other periplasmic sugar-binding proteins, and there are no homologous structures available in the Protein Data Bank. We first determined the crystal structure of selenomethionine-derivatized Hd-SatA bound to Neu5Gc to 2.4 Å resolution by the multiple-wavelength anomalous dispersion method (Fig. 1). These crystals belong to the space group C121, and the model was refined to an R-factor of 0.17 and free R-factor of 0.25. The crystal structure shows two molecules in the asymmetric unit, and the electron density clearly shows the presence of Neu5Gc in the binding pocket. The crystals of the Hd-SatA-Neu5Ac complex diffracted to a higher resolution of 1.49 Å, and these crystals belong to the P2₁2₁2₁ space group. This crystal structure was determined by molecular replacement using the Hd-SatA-Neu5Gc structure as the search model. It was refined to an R-factor of 0.17 and a free R-factor of 0.197. In the Hd-SatA-Neu5Ac structure, the electron density map clearly shows the presence of Neu5Ac in the binding pocket (Fig. 2, inset). The crystals of the Hd-SatA unliganded form diffracted to 2.1 Å, and the structure was determined by molecular replacement using the Hd-SatA-Neu5Ac structure as the search model. These crystals belong to the space group P12₁1, and the model was refined to an R-factor of 0.20 and a free R-factor of 0.26. The crystal structures of unliganded Hd-SatA and Hd-SatA-Neu5Ac forms have one molecule in the asymmetric unit.
The details of the crystallography data and refinement statistics are presented in Table 1. The coordinates and structure factors for Hd-SatA unliganded, Hd-SatA-Neu5Ac and Hd-SatA-Neu5Gc have been deposited in the Protein Data Bank (PDB) with accession numbers 5ZA4, 5Z99, and 5YYB, respectively.
The overall structure contains three α/β domains, highlighted in Fig. 1. The structure contains an N-terminal domain that is larger than the C-terminal domain due to the presence of an extended region in the N terminus and can be divided into two domains, I and III. The N-terminal domain contains amino acids from the N terminus and C terminus. Domain I contains residues 1-38, 160-239, and 461-487 (from the C terminus), which form six β strands that are sandwiched between three α helices. Domain III is composed of residues 39-159, which form six β strands and two α helices. Domain II contains the C-terminal residues 244-456, which form six β strands enclosed by seven α helices. The N- and C-terminal domains are connected by two β strands, which are composed of residues 240-243 and 457-460, and these two strands may serve as hinge regions. The crystal structure clearly shows that the ligand is buried deep inside the binding pocket formed between domains I and II, and there are no apparent interactions of the ligand with domain III. Upon ligand binding, domains I and II come together to trap the ligand, akin to the Venus flytrap mechanism, which is similar to the pattern that is observed in other sugar-binding proteins.
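The domain boundaries listed above can be written down as a small lookup table, which makes it easy to check which domain a given binding-pocket residue belongs to. The following is a minimal sketch only: the residue ranges are taken from the text, whereas the dictionary layout and the helper name domain_of are illustrative assumptions and not part of the original analysis.

```python
# Residue ranges as listed in the text. The data layout and the helper
# name `domain_of` are illustrative assumptions.
DOMAINS = {
    "I": [(1, 38), (160, 239), (461, 487)],
    "III": [(39, 159)],
    "II": [(244, 456)],
    "hinge": [(240, 243), (457, 460)],
}


def domain_of(residue: int) -> str:
    """Return the label of the domain whose ranges contain `residue`."""
    for name, ranges in DOMAINS.items():
        if any(start <= residue <= end for start, end in ranges):
            return name
    raise ValueError(f"residue {residue} is outside the listed ranges")


if __name__ == "__main__":
    # Residues that the text describes as contacting the ligand or the hinge.
    for res in (27, 333, 362, 408, 457):
        print(res, domain_of(res))
```

Under these ranges every residue from 1 to 487 is assigned to exactly one region, so the lookup can be applied to any position in the numbering used in the text.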
The Dali server search results showed that many structures are structurally similar to Hd-SatA, but they have less than 25% sequence identity. Some of the structures with the highest matching Z scores are dipeptide-binding protein DppA from Escherichia coli (with a Z score of 40.9) (PDB code 1DPP) (40), periplasmic oligopeptide-binding protein AppA from Bacillus subtilis (PDB code 1XOC) (41), antibiotic agrocinopine A transport protein in Agrobacterium tumefaciens (PDB code 4ZE9) (42), and periplasmic lipoprotein HbpA in Haemophilus parasuis (PDB code 3M8U) (43).

The sialic acid-binding pocket of Hd-SatA
The binding of Neu5Ac in the binding pocket of Hd-SatA is depicted in Fig. 2. There are several hydrogen-bonding interactions observed between different amino acids in the binding pocket and the Neu5Ac molecule. In SiaPs of TRAP transporters, the major interactions are salt bridges between an Arg and the carboxylate group (C1 position) of sialic acid (18,19). In Hd-SatA, there are no amino acids in the binding pocket that make salt bridges with the ligand. There is only one Arg in the binding pocket, at position 408, and it forms a hydrogen bond with the hydroxyl group at the C9 position. His333 and Ser362 are the two polar residues closest to the carboxylate group at C1, and they form hydrogen bonds with the carboxylate group. Ser18 and Ser362 form a hydrogen bond with the hydroxyl group at C8, and Ser27 forms a hydrogen bond with hydroxyl groups at positions C8 and C9. Pro16 has a water-mediated hydrogen-bonding interaction with O2. Tyr15, Thr477, and Asp383 also display water-mediated hydrogen-bonding interactions with the hydroxyl group at position C4. The binding pocket of Hd-SatA-Neu5Gc is similar to that of Hd-SatA-Neu5Ac, and there are no structural changes between these two sugar-bound forms.
In summary, SiaPs from TRAP transporters have more charged residues in their binding pocket, and their interactions with sialic acids are mostly mediated by hydrogen bonds, ionic interactions, and salt bridges. However, Hd-SatA has few charged residues and more polar residues in the binding pocket, and its interaction with Neu5Ac is mainly mediated by hydrogen bonds and hydrophobic interactions.

Hinge region analysis
The chain torsion angles between Phe241 and Tyr242, between His456 and Arg457, and between Arg457 and Lys458 are primarily responsible for the hinge movement. In both cases, the largest differences are in the angle. Further structural analysis reveals that there are significant conformational changes of amino acids in the binding pocket upon ligand binding. In the open conformation, Arg457 from hinge two is oriented toward the binding pocket, and His333 and Tyr15 from domains I and II are far away from the binding pocket. Upon ligand binding, the Arg457 side chain moves away from the binding pocket to form hydrogen bonds with Ser222 and Tyr225, and amino acids His333 and Tyr15 move closer to the ligand to form hydrogen bonds with the Neu5Ac. Thus, there is a series of conformational changes upon ligand binding that lead to the opening and closing of the two domains of the protein (Fig. 3).

Thermodynamics of Neu5Ac and Neu5Gc binding to Hd-SatA and its mutants
Isothermal titration calorimetry (ITC) was used to measure the thermodynamics of the interactions between Hd-SatA and its ligands Neu5Ac and Neu5Gc. Fitting the data using single-site binding mode analysis indicates that Hd-SatA binds to Neu5Ac and Neu5Gc in a 1:1 stoichiometric ratio with nanomolar binding affinity. Further binding studies showed that, for Neu5Ac, the entropy contribution has a larger effect on the overall free energy of the system than the enthalpy (Kd = 133 ± 22 nM, ΔH = −2.21 ± 0.023 kcal/mol, and −TΔS = −7.172 kcal/mol). These results are similar to those for Hd-SatA binding to Neu5Gc (Kd = 277 ± 58 nM, ΔH = −2.986 ± 0.056 kcal/mol, and −TΔS = −5.96 kcal/mol) (Fig. 4). Therefore, the interaction between Hd-SatA and sialic acid is an entropically driven process. Based on the determined structures and thermodynamic properties of SiaPs from TRAP transporters, we anticipated that the enthalpic contribution to the free energy of binding in Hd-SatA could be increased by substituting a few polar residues with charged residues in the binding pocket. To test this, we made a total of 10 substitutions in the binding pocket of Hd-SatA (S27N, S27K, S362K, S362R, S362A, H333K, H333R, H333A, R408A, and R408K), and their thermodynamic properties were probed using ITC. Ser27, which forms hydrogen bonds with the hydroxyl groups at positions C8 and C9, was modified to Lys and Asn. ITC binding studies showed that S27K had no effect on the interaction, but S27N bound endothermically to Neu5Ac (Fig. S1) and exothermically to Neu5Gc. Ser362 and His333, which form hydrogen bonds with the C1 carboxylate of Neu5Ac, were substituted to give S362K, S362R, and S362A and H333K, H333R, and H333A, respectively. Ser362 and His333 were individually mutated to Arg and Lys to investigate whether an ABC transporter binding protein can be made to have binding properties similar to those of periplasmic binding proteins in TRAP transporters via salt bridge formation (18).
However, these substitutions had binding properties similar to the WT protein, although there was no measurable binding affinity with the H333K substitution. The R408A and R408K substitutions did not show any measurable binding to Neu5Ac. The binding affinities, including the enthalpic and entropic contributions of sialic acids binding to Hd-SatA WT and its mutants, are presented in Table 2. The thermodynamic studies show that site-directed mutagenesis produced only a few minor changes in the thermodynamic properties of the protein.
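As a consistency check on the WT values reported above, the free energy obtained from the dissociation constant, ΔG = RT ln(Kd), can be compared with the sum ΔH + (−TΔS), and the share of ΔG carried by the entropic term can be computed. The following is a minimal sketch, assuming a 1 M standard state and the titration temperature of 25°C; the function names are illustrative and are not taken from the analysis software used in this study.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)
T_KELVIN = 298.15     # 25 degrees C, the titration temperature


def dg_from_kd(kd_molar: float, temperature: float = T_KELVIN) -> float:
    """Binding free energy in kcal/mol from Kd (1 M standard state assumed)."""
    return R_KCAL * temperature * math.log(kd_molar)


def entropic_fraction(dh: float, minus_tds: float) -> float:
    """Fraction of the total free energy contributed by the -T*dS term."""
    return minus_tds / (dh + minus_tds)


# WT values as reported in the text (kcal/mol; Kd in mol/l).
LIGANDS = {
    "Neu5Ac": {"kd": 133e-9, "dh": -2.21, "minus_tds": -7.172},
    "Neu5Gc": {"kd": 277e-9, "dh": -2.986, "minus_tds": -5.96},
}

if __name__ == "__main__":
    for name, v in LIGANDS.items():
        dg_kd = dg_from_kd(v["kd"])
        dg_sum = v["dh"] + v["minus_tds"]
        frac = entropic_fraction(v["dh"], v["minus_tds"])
        print(f"{name}: dG(Kd) = {dg_kd:.2f}, dH - TdS = {dg_sum:.2f}, "
              f"entropic share = {frac:.0%}")
```

For both ligands the two routes to ΔG agree to within a few hundredths of a kcal/mol, and the −TΔS term accounts for well over half of the binding free energy.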

Database search to identify proteins binding to the same ligand with different binding modes
In a PDB database search, we found two proteins, a maltose transporter (PDB code 3PUY) and a bacterial maltosyltransferase (PDB code 4U33), that bound to the same ligand (maltose) with different modes of binding. Previously, it was reported that maltose binding to an MBP-maltose transporter is an entropically driven process (44). The MBP-maltose transporter has a large hydrophobic binding pocket; hence, the displacement of water molecules might be the reason for the entropically favored binding process. However, unlike the data presented here, no experimental evidence is available to establish whether the thermodynamics of binding is different in the maltose-binding protein of bacterial maltosyltransferase.
Table footnote a: This data set is also the data set from the peak wavelength of selenomethionine protein used to determine the structure.

Discussion
Pathogenic bacteria have evolved different mechanisms to evade the host immune system, including coating their outer membranes with host nine-carbon sugars, known as sialic acids, that are scavenged from the hosts using TRAP/ABC/MFS transporter systems. Pathogenic bacteria display these sugars as the outermost moiety to mimic eukaryotic outer membrane composition, which aids in eluding the host immune response. Although several studies related to sialic acid scavenging proteins and sialic acid incorporation have been reported in the last few decades, structural information at the atomic level on sialic acid binding modes, as well as on the amino acid environment of the different transport systems, is just beginning to be revealed. Here, we have structurally and functionally characterized the sialic acid-binding protein from H. ducreyi that belongs to the ABC transport system. The results from structural and thermodynamic studies support the conclusion that Hd-SatA binds to sialic acids with nanomolar affinity, and this process is entropically driven.
Hd-SatA is a three-domain protein, where sialic acid binds between domains I and II. Domain III is a small part of the N-terminal domain, which does not interact with the ligand. Initially, Fukami-Kobayashi et al. (45) classified substrate-binding proteins based on their β sheet topology as Class I and Class II proteins. After this initial classification, many more proteins were structurally and functionally characterized with a wide range of ligands. Nearly a decade later, Berntsson et al. (46) further classified these substrate-binding proteins into a total of six groups (Cluster A to Cluster F) based on their structural similarity. In this classification, SiaPs from TRAP transporters were categorized in Cluster E. All other clusters have proteins that belong to ABC transport systems. SatA from H. ducreyi can be categorized into Cluster C because it has an extra N-terminal domain. Cluster C has other proteins that belong to Class II ABC transporters, where they bind to a number of different ligands, including di- and oligopeptides, nickel, and arginine.
Based on previous structural reports, it was hypothesized that proteins like AppA from B. subtilis and OppA from Lactococcus lactis have an extra domain to accommodate large ligands, such as oligopeptides (41,47); however, in the case of Hd-SatA, the reason and necessity for the extra domain is still unknown.
SiaPs from H. influenzae, P. multocida, V. cholerae, and F. nucleatum, which utilize the TRAP transport system, have been structurally and functionally characterized (18-20). The common features of sialic acid-binding proteins from ABC and TRAP transport systems are as follows: (i) the ligand binds between two globular domains, and (ii) the amino acids in the hinge regions are responsible for opening and closing the two domains upon ligand binding. However, the salient features of SiaPs from TRAP transporters are as follows: (i) two arginine residues (Arg127 and Arg147; numbers correspond to H. influenzae SiaP) in the binding pocket form salt bridges with the C1 carboxylate group of the sialic acid, and (ii) Glu and His from the hinge region form and break a series of hydrogen bonds upon ligand binding (18). These features show that the binding pocket has charged residues and can form salt bridges and hydrogen bonds with the negatively charged ligand. In the case of Hd-SatA, there are only two charged residues in the binding pocket, Arg408 and Asp383. The other residues in the binding pocket are either polar or hydrophobic (Tyr15, Ser27, Gly28, Ala29, Leu105, Pro381, His333, Asp383, and Trp397). This is an important difference between SiaP and SatA, where one ligand (sialic acid) binds to two different proteins (SiaP and SatA) in two different amino acid environments. Results from thermodynamic studies confirmed that the binding of sialic acid in Hd-SatA is a more entropically favored process, whereas binding is a more enthalpically favored process in the SiaP of the TRAP transport system (Table S1). The structures provide a molecular explanation for the observed thermodynamic properties.
We calculated the binding pocket volumes in the sialic acid-binding proteins of the TRAP and ABC transport systems using the CASTp (Computed Atlas of Surface Topography of proteins) server (48). The binding pocket volumes of H. influenzae SiaP in the open and closed conformations are 834.5 and 191.9 Å³, respectively. This shows that upon ligand binding, there is a modest change in the volume of 643 Å³. In Hd-SatA, the binding pocket volumes of the open and closed conformations are 1490.4 and 242 Å³, respectively. Thus, upon ligand binding, there is a large volume change of approximately 1200 Å³ (Fig. S2). This large volume change upon ligand binding may displace many water molecules, which may account for the increased entropic contribution in ABC transporters. This analysis shows that the binding mechanism of sialic acid-binding proteins is determined by the amino acid composition in the binding pocket and the cavity size in the protein.
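The volume changes quoted above follow directly from the open and closed pocket volumes. The short sketch below reproduces that arithmetic from the CASTp values given in the text; the variable and function names are illustrative assumptions, and no CASTp calculation is performed here.

```python
# Open and closed pocket volumes in cubic angstroms, as reported in the text.
POCKET_VOLUMES = {
    "H. influenzae SiaP": (834.5, 191.9),
    "Hd-SatA": (1490.4, 242.0),
}


def volume_change(open_volume: float, closed_volume: float) -> float:
    """Pocket volume lost on going from the open to the closed conformation."""
    return open_volume - closed_volume


if __name__ == "__main__":
    changes = {
        name: volume_change(open_v, closed_v)
        for name, (open_v, closed_v) in POCKET_VOLUMES.items()
    }
    for name, delta in changes.items():
        print(f"{name}: {delta:.1f} cubic angstroms")
    ratio = changes["Hd-SatA"] / changes["H. influenzae SiaP"]
    print(f"Hd-SatA / SiaP ratio: {ratio:.2f}")
```

With these inputs, the pocket of Hd-SatA loses roughly twice the volume that the pocket of SiaP loses on closure.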
These preliminary structural and protein homology studies suggest that the binding of the sialic acid by heterologous peptides is the result of convergent evolution; the sialic acid-binding proteins of the ABC transport system and TRAP transport system evolved independently to accomplish similar sialic acid binding functions. However, this hypothesis requires further testing.
Table 2. Binding affinities and thermodynamic parameters of native protein and proteins with site-specific amino acid substitutions. ITC with a Microcal instrument was used for thermodynamic measurements, and data were analyzed using SEDPHAT (for Hd-SatA WT) and Origin software (for Hd-SatA mutants). NB, no binding.
This interesting result persuaded us to search the PDB database to explore whether there are other proteins that bind to the same ligand with a different motif of the binding pocket. Our database search shows that the maltose transporter from E. coli and bacterial maltosyltransferase from Mycobacterium tuberculosis both bind maltose but utilize two distinct binding pockets. The binding pocket of the maltose transporter (PDB code 3PUY) is mainly composed of polar and hydrophobic residues, such as Tyr, Phe, Gly, Leu, Asn, Phe, and Ser. This binding pocket amino acid composition is similar to the binding pocket of Hd-SatA. It was previously reported that binding of maltose to the MBP-maltose transporter is an entropically driven process. This supports the concept that, in both the MBP-maltose transporter and Hd-SatA, the hydrophobic binding pocket is responsible for entropic binding to their respective ligands. In the bacterial maltosyltransferase, the binding pocket has many charged residues, including Arg, Lys, Asp, and Glu, which is similar to TRAP transporter SiaPs. Based on the previous structural and thermodynamic studies of SiaPs from the TRAP system, we hypothesize that binding of maltose to bacterial maltosyltransferase might be an enthalpically driven process (Fig. 5).
In summary, pathogenic bacteria have evolved to bind sialic acids with similar binding affinities, using different binding topologies. The topological and chemical properties of the binding pocket lead to different energetic processes that govern binding. This study of SatA of H. ducreyi from the ABC transport system and analysis of previously reported SiaPs from the TRAP transport system together provide important and divergent structural details about sialic acid-binding proteins in two different transport systems. This information is important for future structure-based drug design against these pathogenic bacteria.

Cloning and mutagenesis of Hd-SatA WT
The DNA encoding Neu5Ac-binding protein (SatA) from H. ducreyi in PET101D-TOPO was amplified by PCR and subcloned into the BamHI and XhoI sites of a pET21a (Novagen) vector as a fusion with a C-terminal His tag (NCBI accession number WP_010945473). The gene sequence was verified by DNA sequencing and designated Hd1669 (FSS-End) pET21a. Site-specific substitutions S362R, S362A, S362K, S27K, S27N, H333R, H333K, H333A, R408A, and R408K were generated in Hd-SatA using site-directed mutagenesis with corresponding mutagenic primers. These substitutions were confirmed by sequencing.

Protein expression and purification
The recombinant plasmids were expressed in E. coli BL21(DE3) star cells. The cells were transformed with the plasmid and grown overnight at 37°C in lysogeny broth medium containing 100 µg/ml ampicillin. Fresh lysogeny broth medium containing ampicillin was then inoculated with 1% overnight culture. These cells were cultured at 37°C until the absorbance reached 0.6, and the cultures were induced with 100 µM isopropyl β-D-thiogalactopyranoside. The cells were incubated at 25°C for 4 h and then harvested by centrifugation at 13,000 rpm for 30 min. For protein purification, each 1-liter pellet was resuspended in 25 ml of resuspension buffer containing 20 mM HEPES, 150 mM NaCl, 5 mM imidazole (pH 8.0), which was further supplemented with a protease inhibitor mixture tablet lacking EDTA (Roche Applied Science). The resuspended pellet was treated with lysozyme and DNase and incubated on ice for 30 min, followed by lysis using an Emulsiflex C3 from Avestin at 15,000 p.s.i. The lysate was centrifuged at 13,000 rpm for 30 min and further purified on a Talon affinity column (Bio-Rad) using Profinia (Bio-Rad). The column was first equilibrated with resuspension buffer, and the lysate was loaded onto the column. Following a wash with 10 column volumes of wash buffer (20 mM HEPES, 500 mM NaCl, 5 mM imidazole, pH 8.0), the protein was eluted with four column volumes of elution buffer containing 20 mM HEPES, 150 mM NaCl, and 500 mM imidazole, pH 8.0. The eluted protein was dialyzed overnight in 20 mM HEPES and 10 mM NaCl (pH 8.0). The protein was further purified by anion-exchange chromatography (column Hi Trap Q FF, GE Healthcare). The column was first equilibrated with Buffer A (20 mM HEPES, 10 mM NaCl, pH 8.0) and loaded with the dialyzed protein. The protein was eluted with a gradient of Buffer A to the same buffer containing 1 M NaCl. The protein eluted at a salt concentration of 150 mM.
The protein fractions were pooled and purified using size-exclusion chromatography (Sepharose 200 column, GE Healthcare) using 20 mM Tris-Cl, 100 mM NaCl (pH 8.0). The samples corresponding to the protein peak were pooled, and concentration was measured using the Bradford assay (Bio-Rad). To express selenomethionine-derivatized protein of Hd-SatA, a protocol similar to that of Doublié et al. (49) was used. Further, the protein was purified using the same method as the native protein.

Crystallization, structure determination, and refinement
Hd-SatA bound with Neu5Gc. To obtain crystals of the protein-ligand complex, selenomethionine-derivatized Hd-SatA protein at a concentration of 50 mg/ml was mixed with Neu5Gc at a molar ratio of 1:10. The hanging-drop vapor diffusion method was used for crystallization trials. Crystals were obtained in a buffer containing 20 mM MES (pH 6.2), 60 mM NaCl with 30% PEG 1500 as a precipitant. The Hd-SatA-Neu5Gc crystals were mounted in loops, and X-ray data were collected at three different wavelengths (Molecular Biology Consortium, Advanced Light Source, using a NOIR1 detector). For phasing, data were collected from selenomethionine-incorporated crystals of Neu5Gc-bound Hd-SatA at the selenium peak wavelength (0.9778 Å). Structure determination for the selenomethionine-substituted protein was carried out using the automated SOLVE-RESOLVE pipeline (50). Further refinements were done using Phenix, and model building was carried out using COOT (51). The coordinates and structure factors for the Hd-SatA-Neu5Gc complex have been deposited in the PDB with accession number 5YYB.
Hd-SatA bound with Neu5Ac. Hd-SatA protein at a concentration of 50 mg/ml was mixed with Neu5Ac at a molar ratio of 1:10. The crystals were obtained in buffer containing 20 mM MES, pH 6.2, 60 mM NaCl with 30% PEG 1500 as a precipitant by the hanging-drop vapor diffusion method. The Hd-SatA-Neu5Ac complex crystals were mounted in loops, and X-ray data were collected at the IMCA-CAT beamline using the MAR-CCD detector at the Advanced Photon Source (Argonne, IL). The data were indexed, integrated, and scaled using HKL2000, and molecular replacement was carried out using the Hd-SatA-Neu5Gc structure as a search model. Manual model building was performed using COOT, and further refinements were performed using Phenix. The coordinates and structure factors for the Hd-SatA-Neu5Ac complex have been deposited in the PDB with accession number 5Z99.
Hd-SatA without ligand. The crystals of Hd-SatA were grown by the hanging-drop vapor diffusion method by mixing 1 µl of protein at 40 mg/ml concentration with 1 µl of the reservoir solution. Crystals were obtained from the reservoir solution containing 0.2 M sodium iodide and 20% (w/v) PEG 3350. These crystals were mounted in loops, and X-ray data were collected from the IMCA-CAT beamline using the MAR-CCD detector at the Advanced Photon Source (Argonne, IL). Molecular replacement was performed using Neu5Ac-bound Hd-SatA as a search model. Manual model building was performed using COOT, and further refinements were performed using Phenix. The coordinates and structure factors for Hd-SatA have been deposited in the PDB with accession number 5ZA4.

Thermodynamic studies of Hd-SatA
Thermodynamic studies were performed using isothermal titration calorimetry (ITC) (MicroCal ITC microcalorimeter, Malvern, GE Healthcare). All titrations were carried out at 25°C, and the protein samples were in a buffer containing 20 mM HEPES and 10 mM NaCl (pH 8.0). Protein concentrations were measured using the Bradford assay (Bio-Rad). The concentrations of Hd-SatA WT and of proteins with site-specific amino acid substitutions were varied in different titrations. ITC titrations of Hd-SatA WT with Neu5Ac and Neu5Gc were carried out on three independent preparations, with technical triplicates for each preparation. All ITC data for the Hd-SatA mutant proteins were technical replicates. Each experiment consisted of 20 injections, the first of 1 µl and the rest of 2 µl, with 180 s between injections. The nonspecific heats of dilution liberated during the titration of the sugars and protein were calculated by averaging the heat liberated during the last 3-5 injections after saturation. The heat released during each injection was corrected by subtracting these heats of dilution from the raw heat.
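The injection schedule and the heat-of-dilution correction described above can be illustrated with a minimal Python sketch. The function name, the choice of four tail injections (within the stated range of 3-5), and the heat values are hypothetical and are not data from this study; the instrument software performs the actual integration.

```python
from statistics import mean

# Injection schedule from the text: 20 injections, first 1 µl, remaining 2 µl.
injection_volumes_ul = [1.0] + [2.0] * 19
total_injected_ul = sum(injection_volumes_ul)


def subtract_dilution_heat(injection_heats, n_tail=4):
    """Subtract the mean heat of the last n_tail injections from every injection.

    The last injections (after saturation) are assumed to report only the
    nonspecific heat of dilution.
    """
    if not 0 < n_tail <= len(injection_heats):
        raise ValueError("n_tail must be between 1 and the number of injections")
    dilution_heat = mean(injection_heats[-n_tail:])
    return [q - dilution_heat for q in injection_heats]


# Hypothetical integrated heats (µcal per injection), for illustration only.
raw_heats = [-12.0, -11.5, -10.0, -6.0, -2.5, -1.0, -1.0, -1.0, -1.0]
corrected_heats = subtract_dilution_heat(raw_heats, n_tail=4)
```

After correction, the post-saturation injections sit at zero, so the remaining signal reflects binding only.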
Global fit analysis of the technical replicates was done for Hd-SatA WT using SEDPHAT (52). The analysis was repeated with three different protein preparations, and all global fit analyses led to similar values. All of the mutant protein data were analyzed using the ORIGIN ITC software package (MicroCal) with a single-site binding model, and the reported errors are those of the curve fits. In the ORIGIN software, nonlinear least-squares analysis was used to calculate the values of stoichiometry, affinity, change in enthalpy (ΔH), and change in entropy (ΔS).
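The fitted parameters are related through the standard expressions ΔG = RT ln(Kd) = ΔH − TΔS. The Python sketch below shows how the entropic term follows from a dissociation constant and an enthalpy change at 25°C. The function name and the input values (Kd of 100 nM, ΔH of +2.0 kcal/mol) are hypothetical and are not results from this study, and comparing the magnitudes of ΔH and −TΔS is only a simple heuristic for calling a reaction entropically driven.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_thermodynamics(kd_molar, delta_h_kcal, temp_c=25.0):
    """Return (dG, -T*dS) in kcal/mol and dS in cal/(mol*K).

    Uses dG = R*T*ln(Kd) (equivalently -R*T*ln(Ka)) and dG = dH - T*dS.
    """
    temp_k = temp_c + 273.15
    delta_g = R_KCAL * temp_k * math.log(kd_molar)
    minus_t_delta_s = delta_g - delta_h_kcal
    delta_s_cal = -minus_t_delta_s / temp_k * 1000.0
    return delta_g, minus_t_delta_s, delta_s_cal


# Hypothetical inputs, for illustration only.
dG, minus_TdS, dS = binding_thermodynamics(kd_molar=100e-9, delta_h_kcal=2.0)

# Simple heuristic (an assumption): the entropic term is favorable and larger
# in magnitude than the enthalpic term.
entropically_driven = minus_TdS < 0 and abs(minus_TdS) > abs(2.0)
```

With an unfavorable (positive) ΔH and nanomolar affinity, the whole of the favorable ΔG has to come from the −TΔS term, which is the signature of an entropically driven interaction.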

Database search to identify proteins that bind to the same ligand with different binding modes
To identify proteins in which the same ligand is bound by completely different amino acids, the complete database of 121,700 PDB structures was downloaded. Of these structures, 98,011 had no ligand bound to them and were eliminated. From the remaining structures, ligands were extracted, and a directory was created for each ligand. Approximately 23,700 structures were sorted into groups based on their ligand specificity. As we were interested in finding two proteins with dissimilar binding sites, only the ligands to which more than one protein bound were retained. This process resulted in the sorting of 6015 ligands. The majority of current methods use binding pocket geometry rather than the physicochemical properties or the microenvironment of the binding site; therefore, they may report similar binding sites as dissimilar. For example, a replacement of Arg by Lys may be considered a dissimilarity. We therefore chose the pocket feature algorithm, which considers the similarity of binding sites as a function of shared microenvironments rather than the geometric orientations of the amino acids in the binding site. "Microenvironment" refers to the residues around a given center within a radius of 7.5 Å. The algorithm calculates the microenvironments in proteins A and B, and the Tanimoto coefficient is then calculated based on similar properties (53). A score of 7 or greater signifies similar binding pockets, whereas scores between 0 and 1 signify dissimilar binding sites. We selected a score of 0 to identify completely dissimilar sites. All ligands with a molecular mass less than 110 Da were deleted, along with the proteins that bound to these molecules. Among the remaining entries, ligands bound to sites other than the active site were also ignored. The resulting structures were then visually examined.
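The sorting and filtering steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline used in this work: the record layout, the function names, and the example entries are hypothetical, and the Tanimoto function operates on plain sets of property labels rather than reproducing the microenvironment scoring of the pocket feature algorithm (53).

```python
from collections import defaultdict


def group_by_ligand(records, min_mass_da=110.0):
    """Group structures by ligand and keep ligands bound by more than one protein.

    records: iterable of (pdb_id, protein, ligand_id, ligand_mass_da); a
    ligand_id of None marks a structure with no bound ligand.
    """
    groups = defaultdict(set)
    for pdb_id, protein, ligand_id, mass_da in records:
        # Drop unliganded structures and ligands lighter than the mass cutoff.
        if ligand_id is None or mass_da < min_mass_da:
            continue
        groups[ligand_id].add((protein, pdb_id))
    return {
        ligand: sorted(members)
        for ligand, members in groups.items()
        if len({protein for protein, _ in members}) > 1
    }


def tanimoto(properties_a, properties_b):
    """Tanimoto coefficient of two property sets: |A & B| / |A | B|."""
    a, b = set(properties_a), set(properties_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical entries with placeholder identifiers, for illustration only.
records = [
    ("1AAA", "proteinA", "LIG1", 309.3),
    ("1BBB", "proteinB", "LIG1", 309.3),
    ("1CCC", "proteinC", "LIG2", 35.5),   # below the 110 Da cutoff
    ("1DDD", "proteinD", None, 0.0),      # no ligand bound
    ("1EEE", "proteinA", "LIG3", 200.0),  # bound by a single protein only
]
candidates = group_by_ligand(records)
shared = tanimoto({"positive", "hbond_donor"}, {"positive", "aromatic"})
```

Only ligands that survive both filters are passed on to the binding-site comparison, which keeps the pairwise scoring restricted to cases where two different proteins actually recognize the same molecule.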

Accession numbers
Structure factors and coordinates of the unliganded Hd-SatA, Hd-SatA-Neu5Ac, and Hd-SatA-Neu5Gc structures have been deposited in the Protein Data Bank with accession codes 5ZA4, 5Z99, and 5YYB, respectively.