Using a Galactose Library for Exploration of a Novel Hydrophobic Pocket in the Receptor Binding Site of the Escherichia coli Heat-labile Enterotoxin

The binding of the B subunits of Escherichia coli heat-labile enterotoxin (LT) to epithelial cells lining the intestines is a critical step for the toxin to invade the host. This mechanism suggests that molecules which possess high affinity to the receptor binding site of the toxin would be good leads for the development of therapeutics against LT. The natural receptor for LT is the complex ganglioside GM1, which has galactose as its terminal sugar. A chemical library targeting a novel hydrophobic pocket in the receptor binding site of LT was constructed based on galactose derivatives and screened for high affinity to the receptor binding site of LT. This screening identified compounds that have 2–3 orders of magnitude higher affinity toward the receptor binding site of LT than the parent compound, galactose. The present findings will pave the way for developing simple and easily synthesizable molecules, instead of complex oligosaccharides, as drugs and/or prophylactics against LT-caused disease.

Escherichia coli heat-labile enterotoxin (LT) is the causative agent of traveler's diarrhea and is also responsible for the death of hundreds of thousands of children each year in developing countries (1,2). There is currently no prophylaxis and only a very labor-intensive therapy against diarrhea caused by LT. Therefore, the discovery of potent inhibitors which can block the function of LT is very important for the development of effective drugs for the prevention and treatment of LT-caused diarrhea. The toxin is a heterohexamer and is assembled in the periplasm of the bacterium with one A subunit which contains its enzymatic active site and five identical B subunits which are responsible for target cell recognition. The mechanism of action by LT can be separated into several stages. After release from the bacterium, LT first binds to epithelial cells lining the intestines through its B pentameric subunits. The A subunit of LT is then translocated across the cell membrane. Inside the intestinal cell, the A subunit of the toxin modifies the α subunit of the trimeric protein Gs so that Gsα loses its GTPase activity and remains constitutively in its GTP-bound state (3). This in turn causes a continuous stimulation of adenylate cyclase. The resulting elevated levels of cyclic AMP in the cell lead to massive loss of fluid and ions from the cell, the characteristic pathology of enterotoxigenic disease. Based on this mechanism, one approach of developing drugs against LT would be to create molecules which have high affinity for the receptor binding site of LT and can therefore block the binding of LT to the epithelial cells. The natural cell surface receptor for LT is ganglioside GM1 (Gal-β1,3-GalNAc-β1,4-(NeuAc-α2,3)-Gal-β1,4-Glc-β1,1-ceramide). The oligosaccharide part of GM1 (GM1-OS) is responsible for binding to LT.
Because of the high cost of synthesizing large amounts of complex oligosaccharides such as GM1-OS, the use of GM1-OS itself or its close derivatives as drugs against LT would not be economically feasible. Therefore, the discovery of simple and easily synthesizable small molecules which can effectively compete with GM1-OS for binding to the receptor binding site of LT would be of great value.
The three-dimensional structure of LT in its apo state is well known (4). In addition, the crystal structure of the complex between GM1-OS and cholera toxin (CT), a closely related toxin with 80% sequence identity to LT, is also known (5). Structurally, both LT and CT have essentially identical binding sites for their natural receptor GM1 (5). Therefore, the two structures (LT and CT) can be used interchangeably in the design of easily synthesizable small molecules which can inhibit the receptor binding process of LT. The crystallographically determined structure of cholera toxin bound to its natural receptor (CT·GM1-OS (5)) reveals which regions in the toxin can be exploited in drug design. The two terminal sugars, galactose and sialic acid, make the most contributions to the binding of GM1 to the toxin, and therefore their binding pockets are the primary targets. Binding of the terminal galactose group is very specific, because all hydroxyls are involved in multiple hydrogen bonds with the protein, and substituting this galactose moiety may be extremely difficult. The sialic acid moiety on the other hand is less intimately bound by the protein. The sugar ring of the sialic acid makes hydrophobic interactions with Tyr-12, whereas the carboxylic acid, the hydroxyl, and the N-acetyl substituents form hydrogen bonds with the protein backbone. The glycerol tail is only involved in water-mediated hydrogen bonding. However, close reexamination of the structure showed that the glycerol tail is very close to a hydrophobic pocket formed by Ile-58 and Lys-34 of the neighboring B subunit (see Fig. 1A), although no significant hydrophobic interactions are present. Subsequently, we used the program GRID with a hydrophobic probe (6) to explore hydrophobic binding sites in LT. The results suggest that the proposed pocket is indeed favorable for hydrophobic binding determinants (Fig. 1B) in CT as well as in LT.
The proposed hydrophobic target region formed by Ile-58 and Lys-34 has never been occupied in any of the other toxin structures complexed to different galactose derivatives (7-9). Therefore, we wanted to explore this region by screening ligands composed of galactose, a linker, and a hydrophobic group for their ability to inhibit toxin binding to GM1. Galactose was used as an anchor to direct binding to the GM1 pocket, which is needed to prevent possible nonspecific binding of the hydrophobic group to other hydrophobic regions on the protein. The linker was necessary to bridge the distance between the galactose C1 atom and the target site for the hydrophobic group, the Ile-58/Lys-34 pocket.
A library of compounds was synthesized to obtain a large and diverse selection of ligands with a hydrophobic group linked to the galactose anchor. For this library three different anchors were used, each of which had a different substituent on C1 of galactose (see Fig. 2). For the linker, we decided to use polymethylene. It was realized that conformational freezing of this very flexible linker upon binding would lead to loss of entropy. However, the rationale was that the hydrophobic group has to bind rather tightly to compensate for this entropic loss in order for the ligand to show reasonable inhibition. In that way we would only select very tight binding hydrophobic groups for which we can improve the linker in later stages of the project. For the hydrophobic group, we opted for considerable freedom with respect to types of ring, which were allowed to have any kind of heteroatoms, bonding, and substituents. Using a solution phase library synthesis protocol we recently developed (see Footnote 2), 72 compounds were synthesized. These library compounds were tested for their ability to inhibit binding of LT to gangliosides. Two of these compounds appeared to be the galactose derivatives with the highest affinity for the receptor binding site observed so far.

EXPERIMENTAL PROCEDURES
GRID-The program GRID, Version 16 (6), was used to indicate energetically favorable binding pockets in the receptor binding site of LT for a hydrophobic group. To find such favorable binding pockets, a hydrophobic probe was used with the default parameters to determine the interaction energies with the protein on a grid with 0.5-Å spacing. The grid was placed over the 2.2-Å resolution structure of LT·galactose (11) with the waters and galactose omitted; the grid covered a volume of 23 × 20 × 20 Å.
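To give a sense of the scale of this calculation, the short sketch below counts the grid points implied by the stated box dimensions and spacing. It assumes that grid points lie on both faces of the box along each axis; the exact convention used by GRID is not stated here, so the count is illustrative only, and the function names are hypothetical.

```python
import math

def points_per_axis(length_angstrom: float, spacing_angstrom: float) -> int:
    """Number of grid points along one axis, counting both end faces (assumed)."""
    # Small epsilon guards against floating point error in the division.
    return int(math.floor(length_angstrom / spacing_angstrom + 1e-9)) + 1

def total_grid_points(box_angstrom, spacing_angstrom: float) -> int:
    """Total number of grid points in a rectangular box."""
    total = 1
    for length in box_angstrom:
        total *= points_per_axis(length, spacing_angstrom)
    return total

# Box dimensions and spacing as stated in the text.
box = (23.0, 20.0, 20.0)
spacing = 0.5
n_points = total_grid_points(box, spacing)
print(n_points)  # 47 * 41 * 41 = 79007 under the stated assumption
```

Under this assumption each probe calculation evaluates roughly 8 × 10^4 interaction energies.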
Modeling the Linker Length-Gal-βNHCO-(CH2)n-Ph, Gal-αO-(CH2)2-NHCO-(CH2)n-Ph, and Gal-βO-(CH2)2-NHCO-(CH2)n-Ph were built in InsightII (Version 97.0, Biosym/MSI) to represent, respectively, libraries I, II, and III. The galactose in binding site D of the LT·galactose structure (11) was used as the starting point from which the ligands were extended. The dihedrals of these compounds were adjusted manually to reach the Ile-58/Lys-34 pocket, leaving the galactose rigid including the C1-N or C1-O dihedral in its optimal position. For library I, this optimal position has an H1-C1-N1-C7′ dihedral angle of 20°, because in the Cambridge Structural Data Base (12) five structures of pyranoses which are N-acetylated at C1 have an average dihedral of 20° ± 15°. For libraries II and III, the glycosidic torsion prefers the exo-anomeric conformation; H1-C1-O1-C7′ = −60° ± 60° for α-D-sugars, and 60° ± 60° for β-D-sugars (13). The ligands were energy-minimized using a conjugate gradient algorithm and the CFF91 forcefield until the r.m.s. derivative was smaller than 0.001 kcal mol⁻¹ Å⁻¹. The library compounds were built starting from n = 1, until n was large enough for the phenyl group to be within 4 Å of either Ile-58 or Lys-34.
ACD3D Data Base Computer Screening-A two-dimensional substructure search was conducted in the Available Chemicals Directory, ACD3D 97.2 (Molecular Design Ltd., San Leandro, CA). The software used for searching was the program ISIS (Molecular Design Ltd.) (14).
Docking III3J Using SAS-For flexible docking of III3J to the rigid target site in pLT, the program SAS (25) (Stochastic Approximation with Smoothing) was used. The protein coordinates used were prepared as follows: (i) the coordinates of pLT complexed to galactose (11) were taken from the Protein Data Bank, (ii) galactose was deleted from the coordinate set, (iii) the positions of the hydrogens on hetero atoms were determined using the HB2NET option in the program WHATIF (16), (iv) other hydrogens were added with InsightII, and (v) charges were assigned using the CFF91 forcefield. Affinity grids of the protein were calculated using the AUTOGRID program supplied with the AUTODOCK 2.2 package (17).
FIG. 1. A, GM1 in its binding site in CT as observed in the 1.25-Å crystal structure (5). The solid hexagons represent the following sugar moieties of GM1: Gal, GalNAc, Glc, and Gal. The sialic acid moiety from GM1 is represented by sticks. B, results from GRID using a hydrophobic probe on a grid with 0.5-Å spacing over the LT·galactose structure (11). The green and turquoise balls represent grid points with GRID energies lower than −0.5 kcal/mol. The turquoise balls highlight the hydrophobic site formed by Ile-58 and Lys-34.
Footnote 2: F. Hong and E. Fan, submitted for publication.
The ligand was built in InsightII, subsequently energy-minimized using a conjugate gradient algorithm and the CFF91 forcefield until the r.m.s. derivative was smaller than 0.001 kcal mol⁻¹ Å⁻¹, and prepared for flexible docking using the program add_hydrogens supplied with the SAS package (25).
The default parameters were used for running the docking program SAS. The ligand was docked 50 times, each time doing 150 cycles of 6000 iterations. After running SAS, the solutions were clustered in groups with r.m.s. deviations lower than 1.0 Å. The clusters were ranked by the lowest energy representative of each cluster.
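The clustering and ranking step can be illustrated with a minimal sketch. This is not the procedure implemented in SAS, whose clustering algorithm is not described here; it is a simple greedy scheme, assumed for illustration, in which poses are visited in order of increasing energy and each pose joins the first cluster whose lowest-energy representative lies within the r.m.s. deviation cutoff. Poses are assumed to share the receptor coordinate frame and atom ordering, so no superposition or symmetry correction is applied. All names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Coords = Sequence[Tuple[float, float, float]]
Pose = Tuple[float, Coords]  # (docking energy, atom coordinates)

def rmsd(a: Coords, b: Coords) -> float:
    """Root mean square deviation between two poses with identical atom order."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("poses must have the same, nonzero number of atoms")
    squared = sum(
        (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
        for (x1, y1, z1), (x2, y2, z2) in zip(a, b)
    )
    return math.sqrt(squared / len(a))

def cluster_poses(poses: List[Pose], cutoff: float = 1.0) -> List[List[Pose]]:
    """Greedy clustering; clusters come out ranked by their lowest-energy member."""
    clusters: List[List[Pose]] = []
    # Visiting poses by increasing energy makes the first member of every
    # cluster its lowest-energy representative.
    for pose in sorted(poses, key=lambda p: p[0]):
        for cluster in clusters:
            if rmsd(pose[1], cluster[0][1]) < cutoff:
                cluster.append(pose)
                break
        else:
            clusters.append([pose])
    return clusters
```

Because the poses are processed in energy order, the returned list is already sorted by the energy of each cluster representative, which matches the ranking criterion described above.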
Protein and Chemicals-Porcine LT holotoxin was expressed from plasmid pROFIT-LT in E. coli strain MC1061 (18). The protein was purified using the protocol of Uesaka et al. (19), except that the clarified cell lysate was subjected to a 30% ammonium sulfate precipitation in 20 mM Tris-HCl buffer (pH 7.5) prior to affinity chromatography on immobilized D-galactose. The purity was estimated to be at least 95% on the basis of silver-stained SDS-polyacrylamide gel electrophoresis gels. Anti-LT B monoclonal antibody mAb 118-87 was a kind gift from Dr. T. Hirst (University of Bristol, Bristol, UK). Commercially obtained assay materials were: IgG horseradish peroxidase conjugate (Roche Molecular Biochemicals) and GD1b (Matreya, Pleasant Gap, PA). Commercially obtained acids for library synthesis were: 6-maleimidocaproic acid, 3-cyclohexylpropionic acid, and cyclohexanebutyric acid (Fluka, Milwaukee, WI); 3-phthalimidopropionic acid and 7-phenylheptanoic acid (Lancaster, Windham, NH); phenytoin-N-butyric acid (Calbiochem, La Jolla, CA); β-N-indolepropionic acid (Pfaltz & Bauer Inc., Waterbury, CT); and all other acids were from Aldrich.
Synthesis-Fig. 2A depicts a short schematic of the synthesis, which uses trialkylphosphine-mediated amide formation (20). The detailed library synthesis protocol will be published elsewhere (see Footnote 2). All library compounds screened gave satisfactory mass spectroscopy results and were judged to be >95% pure by thin layer chromatography or high performance liquid chromatography (with a small amount of the corresponding acid starting material as the contaminant, which was tested to have no adverse effect on the screening assay). Compounds selected for IC50 measurements were synthesized in larger scale following a similar protocol as the library synthesis and were all purified by high performance liquid chromatography before use.
LT GD1b ELISA-The LT GD1b ELISA was performed as described by Minke et al. (21). Test samples consisted of 0.2 μg/ml porcine LT toxin preincubated with the library compound for 2 h at room temperature. For initial screening, the test samples were diluted in a 0.1% (w/v) bovine serum albumin/phosphate-buffered saline solution (saline solution containing 150 mM NaCl and 10 mM potassium phosphate at pH 7.2), and subsequently the solutions were filtered over a 0.22-μm membrane. For determination of the IC50 values, the test samples were prepared in 10% dimethyl sulfoxide (Me2SO), 0.1% bovine serum albumin/phosphate-buffered saline.
All experiments were carried out in quadruplicate and validated against a concentration gradient of 0, 0.1, 0.2, and 0.3 μg/ml toxin. IC50 values were calculated from at least five different concentrations of competitive ligand by nonlinear regression as described previously (22) with the statistical package S-PLUS (Mathsoft, Inc., Cambridge, MA). For the final IC50 values, the IC50 was determined at least two times in independent experiments. The reported IC50 of a ligand is the weighted average (weight = 1/(estimated S.D.)^2) of the IC50 values from its different determinations.
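The weighting scheme can be written out as a short sketch. The weighted mean follows the definition given above (weight = 1/S.D.^2). The uncertainty returned alongside it, sqrt(1/sum of weights), is the standard expression for an inverse-variance weighted mean and is an assumption here, since the text does not state how the reported error was obtained. The input values in the usage example are arbitrary illustrative numbers, not measured data.

```python
import math
from typing import List, Tuple

def weighted_mean_ic50(estimates: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Combine independent (IC50, S.D.) determinations by inverse-variance weighting.

    Returns (weighted mean, S.D. of the weighted mean); the second value is an
    assumed, standard formula rather than one stated in the text.
    """
    if not estimates:
        raise ValueError("at least one determination is required")
    weights = [1.0 / sd ** 2 for _, sd in estimates]
    total = sum(weights)
    mean = sum(w * value for w, (value, _) in zip(weights, estimates)) / total
    return mean, math.sqrt(1.0 / total)

# Arbitrary illustrative values (same units for IC50 and S.D.), not real data.
mean, sd = weighted_mean_ic50([(10.0, 1.0), (12.0, 2.0)])
print(round(mean, 3), round(sd, 3))  # 10.4 0.894
```

The more precise determination dominates: in the example the first value carries four times the weight of the second.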

RESULTS AND DISCUSSION
The binding pocket of the glycerol tail of GM1 suggested the presence of a hydrophobic site in LT and CT formed by Ile-58 and Lys-34, which was supported by the fact that a hydrophobic probe had very high affinity in this region as determined by the program GRID (Fig. 1). To explore this binding pocket, we decided to construct a library from galactose and carboxylate building blocks (Fig. 2A; see Footnote 2). Manual docking procedures were used to determine the minimal length of the methylene linker of the carboxylate building block: a phenyl ring was positioned in the Ile-58/Lys-34 pocket of LT while the phenyl was attached to galactose in the galactose binding site through a methylene linker joined by an amide bond. For sub-library I (Gal-βNHCO-(CH2)n-Ph) at least three methylene units were needed to bridge the distance between the galactose-C1-NH and the hydrophobic pocket. For sub-libraries II and III (Gal-O-(CH2)2-NHCO-(CH2)n-Ph, α for sub-library II and β for sub-library III), at least three and two methylene units were needed, respectively, to reach the Ile-58/Lys-34 pocket. Therefore, it was decided to use carboxylates with two to six methylene units attached to a hydrophobic group as building blocks for a galactose library to screen for ligands that inhibit LT binding to GD1b by making use of the hydrophobic pocket.
FIG. 3. Inhibition of toxin binding to GD1b-coated microtiter plates by the galactose library compounds. Inhibition was determined using the LT GD1b ELISA as described under "Experimental Procedures." The LT concentration was 0.2 μg/ml. The ligand concentration was 5 mM unless the result is marked by a star, which refers to a lower and unknown ligand concentration because of problems during synthesis or dissolution of the library compound. The arrows indicate the compounds that were tested further in Fig. 4.
FIG. 4. Inhibition of toxin binding to GD1b-coated microtiter plates by II3J (A) and III3J (B). Inhibition was determined using the LT GD1b ELISA as described under "Experimental Procedures." The LT concentration was 0.2 μg/ml. For II3J, the experiment was performed in the absence and presence of 10% Me2SO (DMSO). For III3J, the experiment was only done in the presence of 10% Me2SO.
To select appropriate building blocks for library synthesis, the Available Chemicals Directory was searched for the substructure depicted in Fig. 2B, which yielded 485 carboxylate-containing molecules. These compounds were screened for chemical properties, and compounds carrying the following substituents were excluded: additional carboxylates, free alcohols, free amines/imines, free thiols, aldehydes, borines, seleniums, furans, thienyls, and radicals. In addition, the following groups were rejected: porphyrins, steroids, and microperoxidases. From the remaining 242 compounds we removed duplicates, treating methyl and halogen substituents as equivalent. We also discarded compounds without a price listing and compounds that cost more than $100. This led to a group of 58 compounds, of which a subgroup of 33 compounds was selected based on diversity. Of these 33 acids that were ordered, only 24 were actually available and used in the library synthesis. The different linker lengths and different substituents of this final group of acids are depicted in Fig. 2C.
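The selection procedure amounts to a sequence of filters, which can be sketched as follows. This is an illustration only and not the ISIS query that was actually run: the record layout, the `scaffold` key (assumed to be a structure identifier in which methyls and halogens have already been treated as equivalent), and all names are hypothetical. The final diversity-based selection is a manual judgment and is not modeled.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

# Substituents and compound classes named in the text as grounds for exclusion.
EXCLUDED = frozenset({
    "additional carboxylate", "free alcohol", "free amine/imine", "free thiol",
    "aldehyde", "borine", "selenium", "furan", "thienyl", "radical",
    "porphyrin", "steroid", "microperoxidase",
})

@dataclass(frozen=True)
class Candidate:
    name: str
    scaffold: str                # hypothetical key; methyl/halogen variants share it
    groups: FrozenSet[str]       # annotated substituents and compound classes
    price_usd: Optional[float]   # None when no price is listed

def filter_candidates(candidates: List[Candidate], max_price: float = 100.0) -> List[Candidate]:
    # Stage 1: drop compounds carrying any excluded substituent or class.
    stage1 = [c for c in candidates if not (c.groups & EXCLUDED)]
    # Stage 2: keep the first occurrence of each scaffold.
    seen = set()
    stage2 = []
    for c in stage1:
        if c.scaffold not in seen:
            seen.add(c.scaffold)
            stage2.append(c)
    # Stage 3: require a listed price not exceeding the limit.
    return [c for c in stage2 if c.price_usd is not None and c.price_usd <= max_price]
```

The stages are applied in the order given in the text, so a duplicate is removed before prices are compared.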
Following the protocol developed by us (see Footnote 2), 72 library compounds were constructed with the general structure as drawn in Fig. 2A and the different substituents (R groups) as drawn in Fig. 2C. The library compounds were tested at 5 mM for inhibition of LT binding to GD1b, and the results are depicted in Fig. 3. GD1b was chosen as the appropriate ganglioside because the toxin binds 11-fold more weakly to GD1b than to GM1 (23), which makes it possible to screen for lower affinity inhibitors. We would like to point out that a few compounds were tested at a concentration below 5 mM as a result of a low yield in the synthesis or because of solubility problems; these compounds are marked with an asterisk in Fig. 3. The following compounds showed more than 50% inhibition and were subsequently tested at 0.5 mM: I2J (I2J denotes the sub-library I compound with a linker of two methylene units and substituent 2J), I3J, I4C, II2B, II2C, II2F, II2G, II3J, II4B, II4C, and II6A. At 0.5 mM only galactose library compounds II3J and II6A showed significant inhibition in the LT ELISA, 100% and 60%, respectively. Subsequently these compounds were synthesized at larger scale to confirm their inhibitory power and to determine the IC50. Unfortunately, the results obtained with compound II6A from the larger scale synthesis did not confirm the earlier screening results: the IC50 is only 3.2 ± 0.1 mM. The better result obtained during the initial screening may have been because of impurities in the smaller scale library synthesis. However, compound II3J is indeed an extremely good inhibitor with an IC50 of 0.23 ± 0.04 mM (Fig. 4A), which means that the affinity is 300-fold higher than that of galactose (21). Because II3J is such a good inhibitor and compound III3J was only tested at an unknown and very low concentration, we also decided to resynthesize library compound III3J at a larger scale and test it in an appropriate solvent.
The presence of 10% Me2SO in the II3J sample solutions did not have any effect on the IC50 of II3J (Fig. 4A), and in addition, III3J dissolved completely in the presence of 10% Me2SO. Therefore, the IC50 of III3J was determined in the presence of 10% Me2SO and turned out to be 40 ± 2 μM (Fig. 4B). The IC50 of III3J is thus 5-fold better than that of its α-isomer in library II; in other words, it is 1500-fold better than that of galactose.
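The relative potencies can be checked with a few lines of arithmetic. The sketch below uses only the two IC50 values and the 300-fold figure quoted above; the variable and function names are hypothetical. The unrounded ratio between II3J and III3J is close to 5.75, so the 5-fold and 1500-fold figures quoted in the text appear to correspond to a conservative rounding of this ratio (300 × 5 = 1500).

```python
def fold_improvement(reference_ic50: float, ligand_ic50: float) -> float:
    """How many times lower the ligand IC50 is than the reference IC50."""
    if reference_ic50 <= 0 or ligand_ic50 <= 0:
        raise ValueError("IC50 values must be positive")
    return reference_ic50 / ligand_ic50

# IC50 values quoted in the text, both expressed in mM.
ic50_ii3j_mM = 0.23    # II3J
ic50_iii3j_mM = 0.040  # III3J (40 uM)

ratio = fold_improvement(ic50_ii3j_mM, ic50_iii3j_mM)
print(round(ratio, 2))  # 5.75

# II3J is quoted as 300-fold better than galactose; with the ratio rounded
# down to 5, III3J is 300 * 5 = 1500-fold better than galactose.
fold_vs_galactose_rounded = 300 * 5
```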
Besides identifying several tight binding inhibitors of LT binding to GD1b, another motivation for designing, synthesizing, and screening a galactose library was to analyze characteristics of this library for the design of future inhibitors. First of all, the effect of the linker length should be discussed. As was predicted, a linker of two methylene units is too short for libraries I and III (compounds I2J and III2J being exceptions, probably because of the presence of a second phenyl group). However, for the compounds with longer linkers in libraries I and III and for library II, there is not a clear correlation between linker length and inhibition of LT binding to GD1b. Second, library II distinctly contains better inhibitors (Fig. 3), although the best compound is from library III. This confirms our earlier findings that α anomers of galactose derivatives in general are better inhibitors than their β anomers (21). To conclude, library II contains relatively the most potent inhibitors, and even compounds with the smallest linker in this library perform well in the ELISA. Therefore, it is probably wise to expand this library in the future and also include compounds in library II with only one methylene group as the linker.
FIG. 5. Several binding modes of III3J in LT as suggested by the docking program SAS (25). The second solution had a score of −84.28 kcal/mol with two members in its cluster. The fifth solution had a score of −79.08 kcal/mol with three members in its cluster. The third solution had a score of −71.86 kcal/mol with one member in its cluster. These solutions were selected to demonstrate the diversity in the proposed binding modes. For completeness, the first ranked solution had a score of −89.92 kcal/mol and had two members in its cluster. The turquoise spheres are identical to the ones in Fig. 1B and represent the hydrophobic binding site.
Of course our immediate goal is to follow up the hits of this successful library screen. To design the next round of inhibitors based on the very potent inhibitor III3J, we need to know its exact binding mode. Initially, we tried to expedite the design process by predicting the binding mode of III3J in LT using the docking program SAS. However, as shown in Fig. 5, no single binding mode was preferred. Instead, of the 50 solutions suggested by the program, only 7 were within 1 Å of another solution. Therefore, we are in the process of crystallizing the LT·III3J complex to obtain a better model for the binding mode of III3J in LT.