Structural and Functional Insight into the Carbohydrate Receptor Binding of F4 Fimbriae-producing Enterotoxigenic Escherichia coli

Background: F4 fimbriae produced by enterotoxigenic Escherichia coli mediate attachment to eukaryotic host receptors.
Results: The structure of lactose bound to the F4 fimbrial adhesin FaeGad was elucidated.
Conclusion: Lactose interacts at a subdomain grafted on the FaeGad core domain.
Significance: The co-complex structure explains the finely tuned receptor specificity of F4ad fimbriae; additionally, the carbohydrate binding site differs among FaeG variants.

Enterotoxigenic Escherichia coli (ETEC) strains are important causes of intestinal disease in humans and lead to severe production losses in animal farming. A range of fimbrial adhesins in ETEC strains determines host and tissue tropism. ETEC strains expressing F4 fimbriae are associated with neonatal and post-weaning diarrhea in piglets. Three naturally occurring variants of F4 fimbriae (F4ab, F4ac, and F4ad) exist that differ in the primary sequence of their major adhesive subunit FaeG, and each features a related yet distinct receptor binding profile. Here the x-ray structure of FaeGad bound to lactose provides the first structural insight into the receptor specificity and mode of binding by the poly-adhesive F4 fimbriae. A small D′-D″-α1-α2 subdomain grafted on the immunoglobulin-like core of FaeG hosts the carbohydrate binding site. Two short amino acid stretches, Phe150–Glu152 and Val166–Glu170, of FaeGad bind the terminal galactose in the lactosyl unit and provide affinity and specificity to the interaction. A hemagglutination-based assay with E. coli expressing mutant F4ad fimbriae confirmed the elucidated co-complex structure. Interestingly, the crucial D′-α1 loop that borders the FaeGad binding site adopts a different conformation in the two other FaeG variants, which hints at a heterogeneous binding pocket among the FaeG serotypes.

Infections with ETEC strains affect hundreds of millions of people worldwide annually, mainly travelers and young children in developing countries (3). ETEC that are taken up through the consumption of contaminated water or food colonize the small intestine, where they secrete heat-labile and/or heat-stable enterotoxins, thereby causing diarrhea (4). As an initial and crucial step in pathogenesis, ETEC express fimbrial colonization factors that steer species-specific adherence and colonization of the intestinal tract. Human ETEC possess over 20 distinct colonization factors (5). In porcine ETEC, F4 fimbriae (previously termed K88 fimbriae) form the most common colonization factor associated with intestinal infection of newborn, suckling, and newly weaned piglets, leading to significant morbidity and mortality (6).
Most fimbriae belong to a class of cell surface organelles that are assembled by the conserved chaperone/usher pathway, named after two essential components: a periplasmic chaperone protein and an outer membrane pilus assembly platform, termed the usher (7). The chaperone stabilizes and delivers pilus subunits to the usher, where subunits are added to the base of a non-covalent polymer by a fold complementation mechanism (8,9). F4 fimbriae are encoded by the fae operon, which comprises genes coding for the regulatory proteins FaeA and FaeB (10), the distal tip protein FaeC (11), the usher FaeD (12), the periplasmic chaperone FaeE (13), the minor fimbrial shaft subunits FaeF and FaeH (14), the major subunit FaeG (15), and the minor subunits FaeI and FaeJ, whose roles have so far remained obscure. FaeG forms the major structural component of the F4 fimbriae and also carries the adhesive properties of the fibers. Therefore, in contrast to the best studied chaperone/usher pili, where receptor binding is determined by a single two-domain adhesin incorporated at the fimbrial tip, the binding characteristics of F4 fimbriae reside in the major subunit and are displayed throughout the length of the fiber (15,16). Three naturally occurring serological variants of F4 fimbriae (F4 ab , F4 ac , and F4 ad ) each feature a related yet distinct binding and hemagglutination profile (17,18). Each of the three variants has its own binding specificity, and these specificities could be interchanged by replacing a residue stretch in the FaeG protein (19). Based on the receptor profiles in the brush borders, six different pig phenotypes can be distinguished that differ in susceptibility to specific FaeG variants (20). An extensive body of literature describes studies seeking to unravel the F4 receptor specificities.
Both specific piglet intestinal glycoproteins and glycolipids have been identified as F4 receptors, and a three-receptor model was proposed, which was later adapted to include five putative receptors (21)(22)(23)(24). A subset of brush border glycoproteins ranging in molecular mass from 45 to 70 kDa was demonstrated to interact with all three types of F4 fimbriae (25). Two intestinal mucin-type sialoglycoproteins (IMTGP-1 and IMTGP-2) were identified as receptors for F4 ab and F4 ac fimbriae (22,26,27), and later a β-linked galactose was shown to constitute an important component in their recognition by F4 ac fimbriae (28). Intestinal transferrin (GP74) was shown to be an F4ab-specific receptor (29), with exoglycosidase treatment pointing toward the pivotal role of GlcNAc residues in the core of the N-glycan chain on GP74. An F4 ad -specific receptor was identified as an intestinal neutral glycosphingolipid (IGLad) (24). Grange et al. (24) demonstrated terminal β-linked galactose to be an essential component of IGLad, and furthermore their results strongly indicated that neolactotetraosylceramide (Galβ1-4GlcNAcβ1-3Galβ1-4Glcβ1Cer) was the IGLad receptor. The important role of galactose and/or N-acetyl galactosamine residues in the F4 receptor structure was highlighted in different studies; most often these residues were present at the non-reducing end in β-linkage (24, 28, 30–36). In other studies, the interaction of all F4 variants with different glycosphingolipids was demonstrated, such as lactosylceramide, gangliotriaosylceramide, gangliotetraosylceramide, globotriaosylceramide, and lactotetraosylceramide (24,30,32,37). A more recent study investigated the glycosphingolipid recognition by the different F4 variants and found that the F4 ab and F4 ac variants resembled each other more closely in their glycosphingolipid recognition patterns than either resembled the F4 ad variant (30).
For example, F4 ab and F4 ac fimbriae interacted with both sulfatide and galactosylceramide, whereas F4 ad fimbriae did not (30). In contrast, F4 ad fimbriae bound to gangliotriaosyl- and gangliotetraosylceramide, unlike the ab or ac variants.
Previously, we determined the crystal structure of monomeric N-terminally deleted, donor strand complemented FaeG (FaeG ntd/dsc ) of variant ad (38). Equivalent constructs for the ab, ac, and ad variants are employed in this study in co-crystallization trials with the above described carbohydrate receptors, resulting in the first structural insight into ligand binding by the poly-adhesive F4 fimbriae. The co-complex structure between FaeG ad and lactose allowed us to explain the finely tuned receptor specificity of F4 ad fimbriae. We also obtained the apo structure of FaeG ntd/dsc variant ab, and together these results hinted at the non-conserved nature of the lactose binding pocket on the FaeG surface, as the crucial D′-α1 loop differs significantly among the FaeG variants.

EXPERIMENTAL PROCEDURES
Expression and Purification of the FaeG ntd/dsc Variants-The different constructs of N-terminally deleted, donor strand complemented FaeG (FaeG ntd/dsc ) were expressed and purified as described previously (38). In short, E. coli C43 (DE3) cells containing a plasmid encoding the faeG ntd/dsc construct were grown in LB medium while shaking at 37°C and, upon reaching an A 600 of 1, were induced with 1 mM isopropyl β-D-1-thiogalactopyranoside for overnight expression at 37°C. The cytoplasmic content was collected by breaking the cells using a Constant Systems cell cracker at 20,000 p.s.i. at 4°C and removing cell debris by centrifugation at 18,000 × g for 15 min. Initially, the cytoplasmic extract was subjected to a nickel affinity purification step, and FaeG ntd/dsc was eluted by increasing the concentration of imidazole. FaeG ntd/dsc -containing fractions were pooled and dialyzed against 20 mM formic acid, pH 3.8, after which many contaminants readily precipitated, and these were removed by centrifugation. FaeG ntd/dsc was further purified using cation exchange chromatography (HiTrap SP FF, GE Healthcare Life Sciences) at pH 3.8 and eluted by increasing the concentration of NaCl to 1 M. Afterward, depending on the purity, FaeG ntd/dsc -containing fractions were subjected to anion exchange chromatography (HiTrap Q FF, GE Healthcare Life Sciences) using 20 mM HEPES, pH 8.0, to equilibrate the column and load the protein, and using 20 mM HEPES, pH 8.0, 1 M NaCl for elution. As a final step, protein solutions were subjected to gel filtration chromatography (Sephacryl S-100 HiPrep 26/60 column, GE Healthcare Life Sciences). The runs were performed in 20 mM Tris, pH 8.0, 150 mM NaCl.
Crystallization, Data Collection, Structure Determination, and Refinement-Apo FaeG ntd/dsc variant ab (23 mg/ml) and the complex between FaeG ntd/dsc variant ad and lactose (20 mg/ml of protein, 25 mM lactose) were crystallized against a solution containing 18 mM Na/K-phosphate and 18% PEG 3350 and in condition B12 of the Morpheus crystallization screen (Molecular Dimensions; 100 mM Trizma (Tris base)/Bicine, pH 8.5, 0.03 M of each halogen (sodium fluoride, sodium bromide, sodium iodide), 37.5% v/v MPD_P1K_P3350 (2-methyl-2,4-pentanediol, PEG1000, PEG3350)), respectively. The crystals were flash-cooled to 100 K in their crystallization solution supplemented with 15% glycerol for data collection. Single wavelength data were collected at 0.98 Å wavelength, and data were processed with XDS and XSCALE (39) and Truncate from the CCP4 suite (40). Data were phased by molecular replacement with Phaser (40) using coordinates of the previously determined FaeG ntd/dsc variant ad structure (Protein Data Bank (PDB) code 3GGH) (38). Additional electron density was observed for a lactose ligand. Galactose and glucose monosaccharides were obtained from the HIC-Up database (41), positioned into the electron density map, and restrained by the monomer library of Refmac. The models were further improved using the graphics program COOT (42) and refined by maximum likelihood using Refmac5.5 (40) against the native dataset. Crystal parameters and data processing statistics for all structures are summarized in Table 1.
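The refinement statistics referred to here follow the definitions given with Table 1: R work is the summed absolute difference between observed and calculated structure factor amplitudes divided by the summed observed amplitudes, and R free is the same quantity evaluated on 5% of randomly chosen reflections excluded from refinement. A minimal sketch of that arithmetic is shown below; the function names and the toy amplitudes are illustrative assumptions and are not taken from Refmac or from the deposited data.

```python
import random


def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over the given reflections."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def pick_free_set(n_reflections, fraction=0.05, seed=0):
    """Randomly flag a fraction of reflection indices as the test (free) set.

    The seed is a hypothetical choice made only to keep the sketch reproducible.
    """
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    return set(indices[: round(n_reflections * fraction)])


# Toy amplitudes (made up for illustration, not experimental values).
f_obs = [100.0, 50.0, 25.0, 25.0]
f_calc = [90.0, 55.0, 25.0, 30.0]
r_all = r_factor(f_obs, f_calc)  # (10 + 5 + 0 + 5) / 200

free = pick_free_set(200)  # 5% of 200 reflections
```

In practice R work would be computed on the reflections outside the free set and R free on those inside it, using the same `r_factor` function.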
Design of Mutant F4 Fimbriae-Full-length faeG ad was amplified using primers K88-64/K88-65, which contain the attB1 and attB2 sites, from total genomic DNA of the F4 ad -positive E. coli isolate C1360-79 and subsequently introduced into the Gateway entry vector pDONR221 (Invitrogen), yielding pENT39. The site-specific mutations in faeG ad were introduced by overlap PCR (primers are given in Table 2; the first PCR reactions were performed with inward primers SeqLA1 and SeqLB and mutation-specific outward primers, and the second overlap PCR reaction was done with primers K88-152 and K88-153) and subsequent reintroduction of the PCR fragments in the unique EcoRI and Eco47III sites of pENT39 using the In-Fusion Advantage PCR cloning kit (Clontech).
Sequences of all mutants were confirmed by nucleotide sequencing.
The F4 ab gene cluster comprising all genes (faeCDEFGHI) except the regulatory genes faeA and faeB and the minor subunit faeJ was amplified using primers K88-156 and K88-157 (containing the attB1 and attB2 sites) from total genomic DNA of the F4 ab -positive E. coli C1289-78 isolate and introduced into the pGV5558 plasmid, which is a derivative of the pTrc99A vector (43) containing the pDONR221 Gateway cassette in its unique SmaI site. Transformed E. coli DH5α were assayed for expression of the F4 fimbriae by colony blotting and agglutination with polyclonal antibodies directed against the FaeG major subunit. To delete the faeG ab gene, the plasmid pHD935 carrying the F4 ab encoding gene cluster was digested with the restriction enzyme Eco47III, whose sites are only present in the faeF and faeG genes. To delete the faeG gene and to restore the faeF gene, a PCR fragment obtained with primers K88-164 and K88-165 and containing the missing part of faeF as well as a unique PmeI site was introduced with In-Fusion cloning (Clontech) in the Eco47III-digested pHD935. The resulting plasmid pHD936 carries the intact faeCDEFHI genes and a deleted faeG gene. The Eco47III site in the faeF gene was mutated without changing the faeF reading frame, whereas the Eco47III site at the end of the faeG gene was unaffected. This pHD936 plasmid was used to introduce the wild type or mutant faeG ad sequences between the PmeI and Eco47III sites with the In-Fusion technology. To this end, we amplified the wild type or mutant faeG ad sequences using primers K88-168 and K88-169. The resulting plasmids carry a complete F4 ab encoding gene cluster with mutant or wild type faeG ad genes. The capability of the different clones to produce F4 ad fimbriae was subsequently analyzed.
Preparation of Heat-sheared F4 Fimbriae-Bacterial strains were grown in 100 ml of LB medium supplemented with ampicillin (100 µg/ml) at 37°C overnight with gentle shaking. Cells were washed twice by centrifugation for 25 min at 2000 rpm in a table-top centrifuge (Heraeus Megafuge 1.0R) and resuspended in 600 µl of PBS. The bacterial suspension was heated to 60°C for 20 min to detach the F4 fimbriae from the cell surface. After centrifugation of the bacterial cells, the released fimbriae were present in the supernatant. The presence of FaeG ad in the F4 ad fimbriae was detected by Western blot analysis with rabbit anti-FaeG polyclonal antiserum.
Hemagglutination Assay-Strains expressing mutant and wild type F4 ad fimbriae were grown overnight in LB medium at low shaking speed to ensure that the F4 ad fimbriae were not detached. Bacterial cells were washed two times in PBS. Finally, the cell pellet was resuspended in ice-cold PBS (in a volume 50× smaller than the initial volume), and the A 600 was determined for each strain, after which the suspensions were diluted with PBS until an identical optical density was reached (A 600 of 50). Guinea pig RBC were employed in the hemagglutination assay as F4 ad fimbriae are able to specifically agglutinate these. For the washing step, 3 ml of RBC was added to 30 ml of PBS and subsequently centrifuged at 380 rcf in a fixed angle rotor for 10 min to spin down the RBC. The washing step was repeated at least three times and as often as needed until the supernatant was clear and free of lysed RBC. After the final washing step, 60 ml of PBS was added to obtain 5% RBC. A 2-fold dilution series of 25 µl of bacteria (highest concentration A 600 = 50) was added to a 96-well plate (BD Biosciences) on ice. To each well, 25 µl of PBS and 50 µl of washed RBC were added. The first well of each row was assigned as a negative control with only 50 µl of PBS and 50 µl of RBC added. The 96-well plate was incubated for at least 1 h, after which the hemagglutination pattern became visible and the titer was determined.

TABLE 1 (table body not recovered; PDB entries 4WE2 and 4WEI). a Statistics for the outer resolution shell are given in parentheses. b $R_{\mathrm{meas}} = \sum_h (n_h/(n_h - 1)) \sum_l |I_{hl} - \langle I_h \rangle| / \sum_h \sum_l \langle I_h \rangle$, where $n_h$ = the number of observations for reflection $h$, $I_{hl}$ = the intensity for observation $l$ of reflection $h$, and $\langle I_h \rangle$ = the average intensity for reflection $h$. c $R_{\mathrm{work}} = \sum_{hkl} \bigl| |F_{\mathrm{obs}}| - |F_{\mathrm{calc}}| \bigr| / \sum_{hkl} |F_{\mathrm{obs}}|$. d R free is defined as above but calculated for 5% of randomly chosen reflections that were excluded from the refinement. e r.m.s., root mean square. f NA, not applicable.

TABLE 2 (table body not recovered). Column headers: Primer name; Primers (5′ → 3′); Amino acid substitution.
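The dilution scheme of the hemagglutination assay can be sketched as follows. The snippet generates the bacterial densities of a 2-fold series starting at A 600 = 50 and reads a titer from a row of agglutination calls. The convention used for the titer (the reciprocal dilution of the last well in an unbroken run of positive wells) is an assumption made for illustration, as are the function names; the text does not spell out how the titer was scored.

```python
def twofold_series(start_density, n_wells):
    """Density of the bacterial stock added to each well of a 2-fold dilution series."""
    return [start_density / (2 ** i) for i in range(n_wells)]


def titer(agglutinated):
    """Reciprocal dilution of the last positive well, counted from the most
    concentrated well and stopping at the first negative one (assumed convention).
    Returns 0 when even the most concentrated well is negative."""
    last_positive = -1
    for i, positive in enumerate(agglutinated):
        if not positive:
            break
        last_positive = i
    return 0 if last_positive < 0 else 2 ** last_positive


series = twofold_series(50, 4)            # A600 of the stock per well
example = titer([True, True, True, False])  # hypothetical row of well readings
```

Note that the densities refer to the 25 µl of bacterial suspension pipetted into each well; the 25 µl of PBS and 50 µl of RBC added afterwards dilute every well by the same factor and therefore do not change the titer.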

Structural Insight into Receptor Binding by the F4 Fimbrial Adhesin FaeG Variant ad-We co-crystallized a stable, self-complemented form of the FaeG subunit of variant ad (FaeG ntd/dsc, ad ) (38) with the disaccharide lactose (Galβ1-4Glc). The crystals diffracted to a resolution of 2.3 Å, and crystal parameters and data processing statistics are summarized in Table 1. Lactose interacts at the side of the FaeG fold at a shallow groove present on a small subdomain (Fig. 1A). This binding domain is inserted between strands D and E of the immunoglobulin-like fold and comprises two short β-strands (D′ and D″) and two short helices (α1 and α2) (hence termed the D′-D″-α1-α2 subdomain). The D′-D″-α1-α2 subdomain was earlier speculated to be involved in receptor binding (44). FaeG ntd/dsc, ad -lactose interactions are restricted to the terminal galactose residue and involve two short amino acid stretches, Phe 150 -Glu 152 and Val 166 -Glu 170 , located on the D′-α1 loop (Fig. 1, A and B). The sugar ring of galactose is sandwiched between the side chains of Phe 150 and Lys 167 . The presence of an aromatic residue facing the non-polar surface of the galactopyranose is a recurrent feature of galactose-binding proteins (45).

FIGURE 1. Co-complex structure of the adhesive major subunit FaeG ntd/dsc variant ad with lactose. A, graphic representation of FaeG ntd/dsc variant ad bound to lactose (depicted in stick model). The lactose ligand is bound on the side of the FaeG major subunit by the D′-D″-α1-α2 subdomain (colored in green) grafted on the Ig-like core between strands D and E1. In particular, residues located on the D′-α1 loop are involved in complex formation. Strands are colored cyan, loops are in gray, and a short unassigned α-helix is in red. B, close-up of the FaeG ntd/dsc variant ad binding site in complex with lactose. Two short amino acid stretches of the D′-α1 loop are involved in receptor binding. Only the galactose monosaccharide of the lactose ligand interacts with binding site residues. Both side chain and main chain groups of residues on both stretches coordinate a stabilizing water molecule (depicted as a red sphere). The protein backbone is in ribbon representation (gray), hydrogen bonds are depicted as dashed green lines, and ligand residues are displayed in stick representation with carbon and oxygen atoms in yellow and red, respectively. C, electron density map contoured at 1.3 σ and displayed around the lactose ligand (depicted in stick model). Carbon and oxygen atoms are colored in yellow and red, respectively.
Specificity for the galactose moiety is further provided by hydrogen bond formation with residues on both continuous stretches. The carboxyl group of Glu 152 interacts with the C6 hydroxyl group. Glu 152 is part of a dyad in which the side chain of His 155 orients and stabilizes Glu 152 in its interaction with the lactose ligand. A buried water molecule is stabilized and coordinated by the Ser 169 hydroxyl, the Glu 170 carboxyl, and the main chain carbonyl group of Val 166 . Both side chains of Ser 169 and Glu 170 are in turn stabilized and oriented by Asn 97 and Ser 212 , respectively. These combined interactions position the buried water molecule to interact with the C4 hydroxyl group of galactose. In addition, the side chains of Asn 168 and Glu 170 form hydrogen bonds with the C2 and C3 hydroxyl groups, respectively, and the main chain amide group of Asn 168 is involved in a hydrogen bond with the C3 hydroxyl group. In contrast to the extensive interaction network with the terminal galactose, the lactose Glc moiety does not appear to contribute direct binding interactions. Of note, although positioned in the vicinity of the glucose C6 hydroxyl, the side chain amine of Lys 167 does not show clear electron density, indicating that it is not involved in a stable hydrogen bond interaction with the sugar.
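A hydrogen bond network such as the one described above is typically listed by scanning for polar protein and ligand atoms that lie within a distance cutoff of each other. The sketch below illustrates that idea only. The 3.5 Å cutoff is a commonly used heuristic chosen here as an assumption, the atom labels are hypothetical, and the coordinates are invented toy values rather than coordinates from the FaeG ad -lactose structure.

```python
import math


def contact_pairs(protein_atoms, ligand_atoms, cutoff=3.5):
    """Return (protein atom, ligand atom, distance in Å) for every pair of
    atoms closer than the cutoff, sorted by increasing distance."""
    pairs = []
    for p_name, p_xyz in protein_atoms.items():
        for l_name, l_xyz in ligand_atoms.items():
            d = math.dist(p_xyz, l_xyz)
            if d <= cutoff:
                pairs.append((p_name, l_name, round(d, 2)))
    return sorted(pairs, key=lambda pair: pair[2])


# Invented coordinates (Å), for illustration only.
protein = {
    "Glu152 OE1": (0.0, 0.0, 0.0),
    "Asn168 ND2": (8.0, 0.0, 0.0),
}
ligand = {
    "Gal O6": (2.7, 0.0, 0.0),
    "Gal O2": (8.0, 2.9, 0.0),
}

contacts = contact_pairs(protein, ligand)
```

A distance criterion alone only yields candidates; a fuller analysis would also check donor and acceptor identity and bond geometry.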
An overlay of the apo- and lactose-bound FaeG ntd/dsc structures (0.75 Å root mean square deviation for 239 equivalent Cα atoms) shows the absence of conformational changes in the receptor site upon binding the lactose ligand. Similarly, the structures of FimH, PapG, FedF, and F17-G do not differ upon binding of their respective glycan ligand (46–49). For FimH, however, shear-enhanced conformational changes occur in the receptor binding site (50).
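The root mean square deviation quoted for the overlay is the square root of the mean squared distance between equivalent atoms. A minimal sketch is given below, assuming the two coordinate sets have already been optimally superimposed and are listed in the same atom order; the function name and the toy coordinates are illustrative and do not come from the deposited structures.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (same units as the input) between two
    equally long lists of (x, y, z) positions of equivalent atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Two toy "Cα traces" that differ by a 1 Å shift along z.
apo = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
bound = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0)]
deviation = rmsd(apo, bound)
```

The superposition step itself (for example a least-squares fit of one structure onto the other) is deliberately left out of this sketch.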
FaeG ad Is Responsible for F4 ad Fimbriae-mediated Attachment-To determine the physiological relevance of the lactose binding site identified in the FaeG ntd/dsc, ad structure, a mutational analysis was performed. Residues of FaeG ad that interact directly with the carbohydrate receptor (Phe 150 , Asn 168 , and Glu 170 ) were individually substituted by alanine, and the resulting mutants were analyzed for binding capacity. WT and mutant faeG ad alleles were inserted in a cloned fae gene cluster comprising faeC, faeD, faeE, faeF, faeH, and faeI (i.e. lacking the regulatory genes faeA and faeB) under the control of the lac promoter. To verify production of WT and mutant F4 fimbriae, fimbrial material was heat-sheared from E. coli DH5α cells transformed with the respective plasmids (pHD949 WT , pHD945 F150A , pHD944 N168A , or pHD948 E170A ) and analyzed by anti-F4 immunoblot. For all mutants, the levels of surface-exposed FaeG in mutant F4 fimbriae were found to be similar to wild type (Fig. 2B), demonstrating that the introduced mutations do not affect the stability of FaeG and exposure of the F4 fimbriae on the cell surface. Subsequently, to assess the role of residues Phe 150 , Asn 168 , and Glu 170 in F4 ad -mediated adherence, cultures expressing WT and mutant FaeG were used to determine the hemagglutination titer of guinea pig red blood cells (Fig. 2A). All of the mutations introduced into FaeG ad reduced binding when compared with wild type FaeG ad (Fig. 2A). Mutating Asn 168 to alanine resulted in a 2-fold difference in titer when compared with the wild type, indicating only a partial loss of lactose binding. In contrast, F150A and E170A almost fully abolished the interaction with carbohydrate receptors on the guinea pig red blood cells (8–16-fold difference when compared with wild type F4 ad ). The hemagglutination assay thus validates the galactose binding site captured in the FaeG ad -lactose co-complex crystal structure.
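The fold differences quoted above are ratios of the wild type titer to the mutant titer. The short sketch below shows that calculation. The titer values are hypothetical placeholders chosen only so that the ratios fall in the reported ranges (2-fold for N168A, 8–16-fold for F150A and E170A); they are not read from Fig. 2A, and the assignment of 8 versus 16 to a particular mutant is arbitrary.

```python
def fold_difference(wild_type_titer, mutant_titer):
    """Ratio of wild type to mutant hemagglutination titer.

    A mutant without detectable agglutination (titer 0) is reported as infinity.
    """
    if mutant_titer <= 0:
        return float("inf")
    return wild_type_titer / mutant_titer


# Hypothetical titers for illustration only.
titers = {"WT": 64, "N168A": 32, "F150A": 8, "E170A": 4}

folds = {
    name: fold_difference(titers["WT"], value)
    for name, value in titers.items()
    if name != "WT"
}
```

Because titers from a 2-fold dilution series are powers of two, fold differences obtained this way are themselves powers of two, which is why the effect is reported as a range such as 8–16-fold.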
F4 ad Fimbriae Exhibit a Finely Tuned Specificity-In our crystal structure, FaeG ad binds the terminal galactose in the lactosyl epitope, and the structure does not indicate an active involvement of the reducing end glucose in the binding interaction. Nevertheless, using TLC overlay assays, Coddens et al. (30) observed that F4 ad fimbriated bacteria bound lactosylceramide but not galactosylceramide. To further address the role of glucose in F4 ad receptor binding, we overlaid available experimentally determined conformers of galactosylceramide (4F7E, 3SDA, 3TNO, 3SDX, 1ZT4, 4EN3, 3HE6, and 3ARG) and lactosylceramide (3SDD, 1SX6, 2EUM, and 2EUD) found in the Protein Data Bank with the FaeG ad -lactose structure by superimposition of the terminal galactose residue. In all analyzed galactosylceramide conformers, a kink is found that positions the ceramide group lateral to the FaeG surface. In the context of a lipid bilayer, the galactose epitope in these galactosylceramide conformers would be sterically hindered for FaeG ad binding (Fig. 3). Lactosylceramide, on the other hand, features an extended conformation that would not lead to a sterically hindered display of the terminal lactose when embedded in a bilayer (Fig. 3).

FIGURE 2. Hemagglutination assay with E. coli expressing F4 fimbriae that incorporate either wild type or mutant FaeG ad confirms co-complex structure between FaeG ad and lactose. A, summary of hemagglutination assay experiments of guinea pig red blood cells with E. coli expressing F4 fimbriae that incorporate either wild type or mutant FaeG ad . The bar graph represents the titer up to which the corresponding strain is still able to agglutinate the red blood cells. The higher the titer, the better the F4 ad fimbriae recognize receptors present on the red blood cells. The ΔfaeG strain did not demonstrate any hemagglutination capacity. B, Western blot of heat shock-detached F4 ad wild type and mutant fimbriae to evaluate levels of FaeG ad expression. The intensity of the bands reflects the amount of FaeG ad incorporated in the F4 ad fimbriae. The molecular masses of the marker proteins are indicated in kDa.
The FaeG ad -lactose co-complex structure further reveals that substitutions on either the C3 or the C4 position of galactose would result in steric clashes with FaeG. Indeed, sulfatide and sulf-lactosylceramide both have a sulfate group on the C3 position and do not interact with F4 ad fimbriae (30). The requirement for unsubstituted C3 and C4 positions also dictates that the galactose receptor residue involved in FaeG ad binding should be located at the non-reducing end and not internally in the glycan chain. Binding studies with chicken erythrocytes and reference glycosphingolipids identified glycans with an N-acetyl galactosamine at the non-reducing end to be accepted as F4 ad fimbrial receptors (30). Based on the FaeG ad -lactose structure, an N-acetyl group in the C2 position is predicted to come into close proximity with the side chain of Asn 168 (1.8 Å distance with the carbon atom of the carbonyl group), although a slight reorientation of the latter would enable the preservation of the hydrogen bond between the C2 substituent and the Asn 168 side chain. Coddens et al. (30) reported that lactosyl moieties with an added Gal/GalNAc in α1-3 linkage, but not α1-4 linkage, maintained F4 ad binding. When modeled into the FaeG ad binding pocket, the α1-4-linked Gal in globotriaose (Galα1-4Galβ1-4Glc) results in a kink in the glycan chain such that when present in a bilayer, globotriaosylceramide is expected to be sterically hindered for binding of FaeG ad (Fig. 4C). In the case of isoglobotriaose (Galα1-3Galβ1-4Glc), the α1-3 linkage projects the reducing end glucose further away from the FaeG surface and positions its C1 hydroxyl group perpendicular to the fimbrial shaft (Fig. 4, A and B) so that polymerized FaeG subunits, as found in the F4 ad fimbriae, would not clash with the membrane (Fig. 4C). Coddens et al. (30) observed enhanced binding interaction when the α1-3 substituting glycan chain was further elongated, for example, as found for the newly discovered glycosphingolipids GalNAcα1-3GalNAcβ1-3Galβ1-4Glcβ1Cer and GalNAcα1-3GalNAcβ1-3Galβ1-4GlcNAcβ1-3Galβ1-4Glcβ1Cer. Based on our structural data, we propose that the more efficient interaction in these extended glycosphingolipids results from the additional glycan units that project the interacting non-reducing end residue further away from the host cell surface.
FaeG Employs a Variable Domain to Diversify Receptor Recognition-Apart from the ad isotype crystallized in complex with lactose (see above), two additional FaeG serotypes are known (ab and ac) that differ in receptor binding profile (17,18). Co-crystallization of donor strand complemented FaeG ab and FaeG ac in the presence of a selection of previously identified carbohydrate ligands (galactose, lactose, globotriaose (Galα1-4Galβ1-4Glc), galabiose (Galα1-4Gal), N-acetyl galactosamine) (30) did not yield x-ray structures of the binding complexes. Also, soaking experiments with apo crystals of FaeG ab or FaeG ac and the above mentioned sugars did not reveal additional electron density corresponding to the reported ligands. However, these crystals did allow the structure elucidation of apo FaeG serotype ab (FaeG ab ), which together with the previously published apo FaeG ac structure (PDB identifier 2J6R) (44) (Table 1) enabled us to compare the galactose binding site observed in the FaeG ad -lactose structure or its equivalent region among the three FaeG serotypes. In FaeG ad , the lactose ligand is bound by residues located on the D′-α1 loop. Strikingly, the different FaeG variants show large conformational differences in the equivalent loop regions. FaeG ac has a deletion of three residues in and near the amino acid stretch Val 166 -Glu 170 (Fig. 5A), which significantly shortens the lower D′-α1 loop (Fig. 5B). In the FaeG ad -lactose structure, the galactose unit forms direct interactions with the main chain amide group and side chain of residue Asn 168 . Due to the shortening of the lower D′-α1 loop in FaeG ac , galactose would not be able to bind the D′-α1 loop in an equivalent orientation. The shortened D′-α1 loop in FaeG ac also positions Arg 165 in overlap with the galactose binding pocket as observed in FaeG ad . Furthermore, residue Glu 170 , which in the ad serotype structure interacts with the C3 hydroxyl group of galactose, is replaced by an alanine residue in FaeG ac (Figs. 1 and 5A). Together, these amino acid changes in the D′-D″-α1-α2 subdomain of the ac variant abolish the galactose binding site observed in the FaeG ad -lactose complex. In the FaeG ab structure, the lower D′-α1 loop is shifted upwards when compared with the FaeG ad structure and overlaps with the lactose binding site seen for FaeG ad (Fig. 5C). Clashes between the upper D′-α1 loop and the ligand therefore prevent the accommodation and binding of a ligand in a cleft created by the upper and lower D′-α1 loop of FaeG ab . On the edge of the lower D′-α1 loop, a substitution of Thr 163 in FaeG ab for proline in the FaeG ad structure seems to be responsible for repositioning the D′-α1 loop from an upward to a downward position (Fig. 5C). The downward orientation in FaeG ad is further stabilized by the formation of two hydrogen bonds between the side chain of Asn 165 and the main chain amide and carbonyl groups of Asn 97 . When comparing the sequences of the different variants, the interacting residues located in the upper D′-α1 loop are conserved (Phe 150 is conserved in all variants; Glu 152 in FaeG ad is conservatively replaced by an Asp residue in FaeG ab/ac ). However, multiple sequence variations accumulated in the lower D′-α1 loop, including the residues Asn 168 , Ser 169 , and Glu 170 of the FaeG ad sequence (Figs. 1 and 5A). In both the FaeG ab and the FaeG ac variants, the FaeG ad binding site is significantly distorted, and it seems unlikely that these sites can accommodate galactose or different host glycan receptors unless large conformational changes occur upon binding.

DISCUSSION
Pathogenic E. coli employ a vast array of cell surface adhesins to establish adherence to host tissue. These adhesins are often presented to host cell receptors by hair-like fimbriae or pili. The majority of fimbriae extend a two-domain tip adhesin away from the bacterial cell surface, allowing pathogens to interact with host receptors present at otherwise inaccessible locations. In some fimbriae, the major subunit that builds up the fimbrial shaft has evolved into the adhesive subunit, with multiple receptor binding sites presented along the length of the fiber. Prototypical examples of this class are the F4 fimbriae, presented on the surface of enterotoxigenic E. coli. FaeG, the single domain adhesin of F4 fimbriae, has a dual role combining structural and adhesive properties. Interestingly, FaeG naturally occurs in three antigenic variants, each featuring a distinct hemagglutination profile and interaction pattern with porcine intestinal receptors. Throughout the past two decades, a multitude of studies tried to identify the putative F4 receptor(s), and evidence suggested the existence of both glycoprotein and glycolipid host receptors with a critical role for the carbohydrate part of these glycoconjugates.

FIGURE 4. Binding specificity of FaeG ad . Globotriaose (A) and isoglobotriaose (B) ligands were docked into the FaeG ad carbohydrate binding pocket. Terminal galactose residues substituted with a glycan chain in α1-4 linkage are not functional as FaeG ad receptors, whereas those in α1-3 linkage are able to interact with the FaeG ad major adhesive subunit (30). The α1-4 linkage of globotriaose positions the glycan chain lateral to the long axis of the FaeG ad subunit. A red arrow depicts the C1 atom, which connects the glycan with the sphingolipid part. When the glycosphingolipid is embedded in a membrane, the positioning of the reducing end glycan will result in steric hindrance and abolish the interaction between globotriaose and FaeG ad . The α1-3 linkage of isoglobotriaose, on the other hand, positions the C1 group more at an angle to the subunit's long axis, allowing interaction with membrane-embedded glycosphingolipids without steric hindrance (B and C). A longer glycan chain extends the lipid part further away from the adhesin surface and enhances binding. In the FaeG ad -lactose complex, the β1-4 linkage projects the glucose residue perpendicular to the fimbrial shaft and allows the interaction of membrane-embedded lactosylceramide with F4 ad fimbriae without steric hindrance (C). In panel C, the orientation of the reducing end glucose monomers of the three glycans is identical. Glycan structures were generated using the GLYCAM web tool. FaeG ad is depicted as molecular surface and colored gray. Glycan structures are depicted as stick models with carbon, oxygen, and nitrogen atoms colored yellow, red, and blue, respectively.
In this study, the co-complex structure of FaeGad with its lactose ligand is presented. This structure represents the first structural insight into carbohydrate binding by poly-adhesive F4 fimbriae. FaeGad interacts via its D′-D″-α1-α2 binding domain with the minimal galactose binding epitope. The binding domain is grafted onto the immunoglobulin-like core and comprises most of the naturally occurring variations between FaeG variants on two elongated, conformationally stable loops, D′-α1 and D″-E1. The adhesin-ligand interaction is attained primarily via hydrogen bond formation with amino acid residues located on the D′-α1 loop, but also via hydrophobic contacts between Phe150 and the nonpolar side of the glycan ring. The involvement of a hydrophobic interaction with an aromatic side chain is a common theme in ligand recognition by fimbrial adhesins. FimH, PsaA, and FedF also make use of a tyrosine residue in their binding pocket to interact with mannose, galactose, and fucose residues, respectively (48, 51, 52). PapG and F17-G employ the hydrophobic interaction of a tryptophan side chain against the carbon atoms of the sugar (47, 49). So far, most co-complex structures that have been determined were between two-domain tip adhesins and their glycan receptors (46-49, 51). The only structural insight into binding by single-domain adhesins thus far comes from the crystal structure of the homopolymeric PsaA adhesin of pH 6 antigen fimbriae of Yersinia pestis (52). PsaA features dual receptor recognition of both a β-1-linked galactose and a phosphocholine head group, thereby redirecting binding toward the surface of host tissue.
F4 fimbriae phylogenetically belong to the κ-fimbrial clade of chaperone-usher systems, whose members typically exhibit a flexible fibrillar structure (54). The location of the binding site on the side of the FaeGad surface requires the fimbrial long axis to be arranged parallel to the supporting surface (either membrane or glycoprotein) on which the carbohydrate receptor is located. A flexible fimbrial structure allows the simultaneous interaction of multiple FaeG subunits of a single F4 fimbrium with surface-located receptors. Poly-adhesive F4ad fimbriae have a stringent constraint regarding the orientation of their carbohydrate host receptors. The FaeG subunits should be able to approach the binding epitope perpendicular to the supporting surface to allow interaction and to avoid steric hindrance. Host carbohydrate receptors positioned parallel to the host membrane (by insertion of an α1-4 linkage) would be, independent of the length of the connecting glycan chain, inaccessible for interaction by the elongated F4ad fimbriae due to steric clashes between the host membrane and surrounding FaeG subunits. Two-domain tip adhesins are less sensitive to host receptor orientation, as most of them present their carbohydrate-interacting adhesins on the outermost end of a flexible tip or fimbrial structure, which allows them to access a broader array of oriented glycan structures without the concern of steric hindrance with the fimbrial shaft (46, 47).
We were unable to quantify the affinity of the FaeG-lactose interaction using SPR with a surface-bound glycoconjugate or by isothermal titration calorimetry with the soluble carbohydrate receptor (results not shown), most likely because the detectable affinity is limited to the upper μM/low mM range, as the concentration of FaeGntd/dsc can only be raised to a certain value. The interaction between two-domain tip adhesins (TDA) and their receptors has been quantified on multiple occasions (16). F17-G, the TDA of F17 fimbriae, has an observed KD of ~1.2 mM for its monosaccharide N-acetylglucosamine ligand, reflecting the shallow F17-G binding groove (49). FimH, the TDA of type 1 pili, has a much higher affinity for its cellular receptors, ranging from 2.3 μM for mannose to up to 20 nM for the oligomannose-3 ligand (55, 56), reflecting the deeply buried mannose binding pocket. PapGII features a KD value of 540 μM for the disaccharide galabiose and 84 μM for globotetraose ligands (57). However, until now, little information was available on the affinities with which single-domain adhesins recognize their carbohydrate receptors. The interaction of galactose with PsaA has an IC50 value of ~40 mM (58), indicating that the single-domain adhesins of poly-adhesive fimbriae are likely to bind with lower affinity to their target receptors. Upon colonization of the intestinal tract, avidity effects come into play: multiple FaeG subunits of F4 fimbriae interact simultaneously, resulting in a firm interaction with host tissue. A major advantage of the above described binding mode is to avoid clearance by secreted glycoproteins. The latter are secreted in the intestinal tract, and a high affinity of FaeG for glycoproteins with galactosyl residues in the terminal position could saturate the binding sites of F4 fimbriae. 
However, the observed low affinity enables the F4 fimbriae to shield their binding epitopes in the intestinal lumen from unwanted interactions and drives attachment to cell-attached glycan receptors by avidity effects. The immediate evolutionary advantage for the pathogen to develop such a mechanism is obvious. Also, the requirement of multiple concurrent interactions can only be fulfilled if ligands are correctly presented on the host tissue. The spatial distribution and nature of the immobilization surface therefore likely play a crucial role in determining the susceptibility to infections by ETEC expressing F4 fimbriae, which could explain the existence of specific glycoprotein and glycolipid F4 receptors. A recurrent theme in the interaction of chaperone/usher pili with host tissue is that there is more than meets the eye. The F18 fimbrial adhesin FedF features a coincidence binding mechanism that targets FedF to glycosphingolipid receptors embedded in the cellular membrane (48). Type 1 pili and Cfa/I employ shear-enhanced adhesion to withstand the hydrodynamic forces acting on biological surfaces and direct binding to surface-bound receptors rather than toward soluble but otherwise identical receptors or soluble inhibitors (53, 59-61).
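The avidity argument above can be illustrated with a back-of-the-envelope calculation. The sketch below is not part of this study. It assumes simple 1:1 binding at each FaeG site and treats the sites as independent, which ignores the rise in effective local receptor concentration after a first attachment and therefore probably understates real avidity. The KD of 10 mM and the receptor concentration of 1 mM are hypothetical values chosen only to fall in the low-affinity range discussed; the FaeG-lactose affinity itself was not measured.

```python
def fraction_bound(ligand_molar: float, kd_molar: float) -> float:
    """Occupancy of one site under simple 1:1 binding: [L] / (K_D + [L])."""
    return ligand_molar / (kd_molar + ligand_molar)


def prob_at_least_one_bound(p_single: float, n_sites: int) -> float:
    """Probability that at least one of n independent sites is occupied."""
    return 1.0 - (1.0 - p_single) ** n_sites


if __name__ == "__main__":
    kd = 10e-3      # hypothetical K_D of a single FaeG site (10 mM), not measured
    ligand = 1e-3   # hypothetical accessible receptor concentration (1 mM)

    p = fraction_bound(ligand, kd)
    print(f"single-site occupancy: {p:.3f}")

    # More subunits along one fimbria raise the chance of holding on.
    for n in (1, 10, 100):
        print(f"{n:>3} sites: P(at least one bound) = "
              f"{prob_at_least_one_bound(p, n):.3f}")
```

Under these assumptions a single weak site is mostly empty, whereas a fimbria presenting many sites to a receptor-rich surface is almost always attached through at least one of them. A dilute soluble glycoprotein, by contrast, occupies only a small fraction of the sites.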
Zhang et al. (19) exchanged variant-specific residues between the different FaeG variants. Mutating Glu152 to Asn abolished binding of F4ad fimbriae to the target tissue, whereas mutants of residues Ala133, Met147, Ser154, and His155 exhibited wild-type binding (19). These results are in agreement with our co-complex structure of FaeGad and lactose. Glu152 is involved in a hydrogen bond interaction with the C6 hydroxyl group, whereas none of the other mutated residues are in close proximity to the carbohydrate ligand. In contrast, mutating residues Asn152, Leu154, and Ser155 results in loss of binding of F4ac fimbriae, and only the substitution of residue Val133 did not alter the binding capabilities. Based on these results, the primary (or secondary) binding sites of F4ac and F4ad fimbriae are expected to overlap, but are not identical.
Comparison of the D′-α1 loop between the three naturally occurring FaeG variants revealed important conformational differences between them and suggested either that the D′-α1 loop of FaeGab and FaeGac undergoes a significant conformational change upon interaction with the ligand or that a different binding site is present on the surface of both variants. The first possibility seems less likely, as the apo and bound structures of fimbrial adhesins determined thus far did not reveal large structural changes in and around the binding site. Structural changes would require additional energy input and would reduce the affinity between protein and receptor even further. Moreover, key residues in the galactose-FaeGad interaction are lacking in the ab and ac variants. We assume the second possibility to be more likely, as a variable surface is present on the FaeG subunit, which allows the accommodation of a distinct binding site. Variants ab and ac share a higher sequence identity and ligand specificity when compared with variant ad. For instance, the D″-E1 loops are almost conserved between the ab and ac variants and would make an ideal candidate to look for the variant-specific binding pocket. The generation of a hypervariable surface with multiple binding sites with similar, although slightly different, binding specificities would allow ETEC strains to attach to and colonize a much wider portion of the piglet population. For example, F4ad fimbriae are unable to interact with galactose residues carrying a sulfate substituent in the C3 position, whereas F4ab fimbriae efficiently interact with sulfated glycans. Also, F4ad fimbriae cannot bind to glycan chains that contain an α1-4 linkage; however, F4ab fimbriae demonstrate high binding specificity in TLC overlay assays against galabiosylceramide (Galα1-4Galβ1Cer) and globotriaosylceramide (Galα1-4Galβ1-4Glcβ1Cer) (30).
In summary, we provided the first structural insight into carbohydrate binding by poly-adhesive F4 fimbriae, which allowed us to explain their carbohydrate specificity. The crucial D′-α1 loop adopts remarkably different conformations between FaeG variants, which led to the hypothesis that either conformational changes occur upon ligand binding or different variant-specific binding pockets exist on the FaeG surface.