Functional reprogramming of Candida glabrata epithelial adhesins: the role of conserved and variable structural motifs in ligand binding

For host-cell interaction, the human fungal pathogen Candida glabrata harbors a large family of more than 20 cell wall-attached epithelial adhesins (Epas). Epa family members are lectins with binding pockets containing several conserved and variable structural hot spots, which were implicated in mediating functional diversity. In this study, we have performed an elaborate structure-based mutational analysis of numerous Epa paralogs to generally determine the role of diverse structural hot spots in conferring host cell binding and ligand binding specificity. Our study reveals that several conserved structural motifs contribute to efficient host cell binding. Moreover, our directed motif exchange experiments reveal that the variable loop CBL2 is key for programming ligand binding specificity, albeit with limited predictability. In contrast, we find that the variable loop L1 affects host cell binding without significantly influencing the specificity of ligand binding. Our data strongly suggest that variation of numerous structural hot spots in the ligand binding pocket of Epa proteins is a main driver of their functional diversification and evolution.

Human-pathogenic fungi often possess large repertoires of cell wall-associated adhesins for successful host recognition, invasion, and colonization (1-4). Most known fungal adhesins have a common modular architecture and consist of an N-terminal domain for adhesion (A domain), followed by a large segment that contains variable numbers of glycosylated and serine- and threonine-rich repeats (B domain) and a C-terminal domain with a GPI (glycosylphosphatidylinositol) anchor for cell wall attachment (5). Thus, fungal adhesins are also referred to as GPI-CWP (GPI-anchored cell wall-associated proteins) adhesins (6). Functional and structural analysis of a number of fungal adhesins from Candida glabrata and Candida albicans, two major agents of human fungal infections, revealed that A domains recognize host surfaces by specific binding of glycan structures (7,8) or host peptides (9,10). These studies have uncovered specific regions or structural hot spots involved in ligand binding (11-15), but in most cases these motifs have not been further analyzed by detailed structure-based functional studies. Pinpointing the precise function of these structural motifs in determining ligand binding affinity and specificity, however, not only contributes to a better understanding of the molecular evolution of large protein families in general but also provides the structural basis for engineering synthetic adhesins with novel ligand binding patterns.
The opportunistic human fungal pathogen C. glabrata is a commensal of the human microbiome that accounts for up to 29% of total life-threatening Candida bloodstream infections (16). Because of its resistance to several antimicrobial agents and in combination with the more virulent C. albicans, it poses an increasing threat to human health, especially to patients with a compromised immune system (17,18). For adherence to human epithelial cells, C. glabrata carries more than 80 proteins with architectures of typical GPI-CWP adhesins (19,20). Among these, the Epa (epithelial adhesin) proteins form the largest family, with more than 20 members, depending on the strain background. Moreover, comparative genomic studies have shown that Epa-like adhesins can be found in diverse species of the Nakaseomyces genus, albeit with variable numbers often correlating with known pathogenicity (21).
Several functional studies using glycan array screening and host cell binding assays have been performed to uncover the ligand binding specificities of Epa cell wall proteins (8,12,13,15,22). These functional studies have employed either purified EpaA domains or EpaA-presenting surface display systems like the nonpathogenic yeast Saccharomyces cerevisiae. These binding studies show that roughly 70% of Epa family members recognize a wide variety of α- and β-linked galactosides and that the rest bind to several nongalactosidic terminal glycans. Moreover, host cell binding studies have revealed that about 20% of Epa adhesins confer very efficient human epithelial cell adhesion and 35% mediate moderate binding, whereas the rest adhere only weakly to epithelial cells. Whether Epa family members are competent to bind to further glycans or host cells is currently not known.
A number of high-resolution crystal structures of the adhesion domains of Epa1 and Epa6 in complex with cognate glycan ligands have revealed several structural features of Epa family members (12,15,23). These studies revealed that Epa proteins generally possess a highly conserved PA14/Flo5-like core domain as initially found in the structures of the anthrax-protective antigen and the adhesion domains of the S. cerevisiae flocculin Flo5 (24,25). In addition, further peripheral structural elements have been identified that appear to contribute to ligand binding affinity and specificity (Fig. 1 and Fig. S1). These structural features include two calcium binding loops, CBL1 and CBL2, which form the inner binding pocket of Epa proteins, and three further loops, L1-L3, that form the outer pocket. Moreover, these structural studies show that the surface composition of EpaA domains is variable but that certain residues of the ligand binding pockets are highly conserved. These include a DcisD motif (25) in the inner pocket and a tryptophan residue in loop L3, as well as a disulfide bond linking loops L1 and L2 in the outer pocket (Fig. 1). It has been suggested that these conserved surface elements are required for high-affinity ligand binding (12), but a detailed mutational analysis of these elements is still lacking. In addition, the binding pockets of EpaA domains contain highly variable residues that are located within CBL2 and the loops L1 and L2. Specifically, three of the residues of CBL2, positions II-IV, are highly variable. Interestingly, the sequence of these positions often correlates with the binding of certain galactosides, indicating that they are involved in conferring ligand binding specificity. In certain cases, CBL2 composition and ligand binding patterns do not correlate, suggesting that other variable elements, e.g. loops L1 and L2, contribute to ligand recognition (15).
However, the involvement of these variable structural elements in determining ligand binding affinity and specificity has not been comprehensively addressed by structure-based mutational analysis to obtain a more general picture of programming ligand binding properties within the Epa family.
In this study, we have performed a comprehensive mutational analysis of diverse EpaA domains to determine the precise function of both conserved and variable structural motifs.
Specifically, we created EpaA variants with mutations in the highly conserved DcisD motif, the tryptophan residue in loop L3, the disulfide bond linking loops L1 and L2, exchanges in the highly variable CBL2 positions II-IV, and the variable loop L1. Mutated EpaA variants were then functionally characterized by in vivo studies using an S. cerevisiae expression system and human epithelial cells and complementary in vitro analyses employing glycan array assays and crystal structure analysis. Our data indicate that (i) the conserved structural motifs are crucial for efficient host cell binding, (ii) the variable loop CBL2 plays a central role in programming ligand binding specificity, and (iii) the variable loop L1 appears to affect host cell binding by influencing the affinity rather than the specificity of ligand binding. In summary, our study suggests that core functionality and diversity of the Epa family has been achieved by coevolution of crucial conserved and variable structural hot spots.

Functional analysis of conserved structural elements
We first assessed the role of three structural elements that are conserved in all known EpaA variants, specifically the DcisD motif, the tryptophan residue in loop L3, and the disulfide bond linking loops L1 and L2 (Fig. 1). For this purpose, we tested the functionality of these elements using Epa1A as an example by constructing the respective mutants. These included Epa1A D165A carrying an AcisD motif, Epa1A W198A carrying an alanine instead of a tryptophan residue in the L3 loop, and Epa1A C78S breaking the disulfide bond between loops L1 and L2. These variants were tested for their in vivo adhesion activity to epithelial cells by using an S. cerevisiae expression system and human colorectal Caco-2 cells (Fig. 2, Fig. S2, Table 1). As shown in Fig. 2, all three mutations significantly suppress adhesion activity in vivo (Table S1), indicating that the respective conserved structural elements are indeed crucial for functionality.
Fig. 1. A structural model of Epa1A is shown on the left depicting conserved and variable surface residues when comparing the 17 Epa paralogs present in C. glabrata strain CBS138. The degree of conservation/variability is color coded and was obtained by using the ConSurf server (46,47) and the multiple-sequence alignment shown in Fig. S1. The ligand binding pocket is presented on the right and shows highly conserved residues of the inner binding pocket, including the DcisD motif of CBL1 and an asparagine of CBL2, both conferring coordination of a Ca2+ ion, a tryptophan residue of loop L3, and an arginine residue at position I of CBL2. In contrast, residues of the inner binding pocket at positions III and IV of CBL2 are highly variable. The two conserved cysteines forming a disulfide bond between loops L1 and L2 are indicated.
Our previous structural and biochemical studies with Epa1A have shown that the indole side chain of W198 not only coaligns with the pyranose moiety of the terminal carbohydrate of bound ligands but also is crucial for efficient ligand binding in vitro (12). Therefore, we tested whether, apart from indole, other aromatic side chains fulfill this function. For this purpose, Epa1A W198Y and Epa1A W198H mutants were constructed and tested for epithelial cell binding, revealing that both variants confer efficient adhesion. Thus, functionality of the residue at position 198 is not restricted to indole and can be fulfilled by other aromatic side chains.

Reprogramming of ligand binding specificity by exchange of CBL2 loop motifs
Previous studies with Epa1A have shown that either exchange of its CBL2 motif to corresponding regions of other EpaA variants or random mutagenesis of positions II or III in CBL2 is sufficient to alter ligand binding specificity (12,13). Although these studies suggest that the CBL2 loop is crucial for programming ligand binding specificity within the whole Epa family, the functional role of CBL2 motifs of other natural EpaA variants has not been experimentally tested so far. Here, we analyzed the functionality of the CBL2 loop motifs of Epa1A and five additional EpaA variants, Epa2A, Epa3A, Epa6A, Epa9A, and Epa10A, respectively. Specifically, we exchanged several CBL2 regions by using one variant as a donor and a second variant as a recipient. Subsequently, the functionality of the resulting chimeric variants was analyzed by measuring epithelial cell binding in vivo and by determining the corresponding ligand binding profiles in vitro using mammalian glycan arrays from the Consortium for Functional Glycomics (CFG; www.functionalglycomics.org). In principle, we expected that the reprogrammed chimeric variants would exhibit the functionality of either the recipient or the donor variant or would have novel properties not observed for the donor/recipient variants.
Fig. 2. Adhesion of different EpaA domains to human epithelial Caco-2 cells was determined by expression and surface presentation of EpaA domains in S. cerevisiae strain BY4741 carrying appropriate plasmids without any adhesin (control) or with an expression cassette either lacking an EpaA domain (no A) or harboring different EpaA variants as indicated. Yeast strains were grown to logarithmic phase and uniformly labeled by fluorescence before incubation with a monolayer of Caco-2 cells for 2 h. After incubation, nonadhesive yeast cells were removed by washing, and fluorescence of adhesive cells was quantified along with a control accounting for the fluorescence of the total amount of yeast cells used in the assay. Relative adhesion values were calculated as the ratio of the fluorescence values measured for the adhesive yeast cells to the fluorescence values measured for the total amount of yeast cells used. Black bars indicate average adhesion values, and error bars indicate the standard deviation of at least three independent measurements (yellow dots). Magenta bars indicate the average adhesion values corrected for the amount of EpaA domains present at the yeast cell surface as determined in Fig. S2. The gray band indicates adhesion resulting from the carrier domain (no A). Statistical significance of the differences measured between different EpaA domains was calculated by an unpaired t test, and corresponding P values used for classification are listed in Table S1.
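The quantities described in the Fig. 2 legend (relative adhesion, mean, standard deviation, and an unpaired t test) can be illustrated with a minimal Python sketch. The function names and the fluorescence readings are hypothetical, and the pooled-variance (Student) form of the t statistic is an assumption, because the text only states that an unpaired t test was used. The P value would be obtained from the t statistic and the degrees of freedom with a statistics package.

```python
from math import sqrt
from statistics import mean, stdev


def relative_adhesion(adherent_fluorescence, total_fluorescence):
    """Fluorescence of adherent yeast divided by fluorescence of all yeast used."""
    if total_fluorescence <= 0:
        raise ValueError("total fluorescence must be positive")
    return adherent_fluorescence / total_fluorescence


def summarize(values):
    """Mean and sample standard deviation of replicate relative adhesion values."""
    return mean(values), stdev(values)


def unpaired_t(a, b):
    """Student's t statistic and degrees of freedom for two independent samples.

    Assumption: equal variances (pooled estimate); the source does not say
    which variant of the unpaired t test was applied.
    """
    na, nb = len(a), len(b)
    var_a, var_b = stdev(a) ** 2, stdev(b) ** 2
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


# Hypothetical (adherent, total) fluorescence readings, three replicates each.
wild_type = [relative_adhesion(a, t) for a, t in [(420, 1000), (450, 1000), (390, 1000)]]
mutant = [relative_adhesion(a, t) for a, t in [(60, 1000), (80, 1000), (70, 1000)]]
t_value, dof = unpaired_t(wild_type, mutant)
```

Working with the ratio rather than the raw fluorescence normalizes each measurement to the amount of labeled yeast that was actually applied to the Caco-2 monolayer.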
For epithelial cell binding, eight novel and three previously described Epa1A exchange variants (12) were expressed in S. cerevisiae and analyzed for conferring adhesion to Caco-2 cells. As shown in Fig. 2, this analysis revealed that four chimeric variants retained the binding behavior of the recipient, including Epa1A CBL2Epa9, Epa2A CBL2Epa1, Epa3A CBL2Epa1, and Epa9A CBL2Epa1. In these four cases, transfer of a foreign CBL2 motif affected neither the high-efficiency (Epa1A) nor the low-efficiency (Epa2A, Epa3A, and Epa9A) epithelial cell binding of the recipients (Table S1). In one case, we found that the chimera exhibited the binding behavior of the donor variant, namely, Epa1A CBL2Epa3. Here, the epithelial binding efficiency of Epa1A was reduced to the level of the donor variant Epa3. In the next three cases, Epa1A CBL2Epa2, Epa6A CBL2Epa1, and Epa10A CBL2Epa1, epithelial cell binding efficiency was comparable with neither the donor nor the recipient but exhibited a novel binding pattern (Table S1). Finally, the three variants Epa1A CBL2Epa6, Epa2A CBL2Epa3, and Epa3A CBL2Epa2 displayed the binding efficiency observed for both donor and recipient but did not lead to a novel binding behavior. In summary, these data show that the exchange of the CBL2 motif does not lead to a uniform reprogramming of target cell binding, because chimeric variants exhibiting either recipient, donor, or novel functionality can be created to comparable extents.
[Table residue, only partly recoverable: EpaA constructs EPA1, EPA2, EPA3, EPA6, EPA9, and EPA10 in pET-28(a)+ (reference 15), listed together with the plasmid identifiers BHUM1784, BHUM1788, BHUM1790, BHUM1886, BHUM2495, and BHUM1804; further entries, including Epa1A CBL2Epa2 and the identifiers BHUM1760 and BHUM2020, are truncated.]
Next, we determined and compared the carbohydrate specificity profiles of seven chimeras (Fig. S3) together with the respective donor/recipient variants. For this purpose, the respective recombinant adhesion domains were fluorescently labeled and analyzed by glycan array screening as previously described (12,15). We then compared the different ligand binding patterns according to both the best-bound glycan (best binder) and the best-bound terminal sugars (at least 50% binding efficiency relative to the best binder) in combination with the linkage type to the penultimate carbohydrate moiety. As shown in Fig. 3, this analysis revealed that for four chimeras, novel best binders were found that differ from the best-bound glycans of recipients or donors. Specifically, novel best binders were α1-4-linked galactosides (Epa1A CBL2Epa2 and Epa1A CBL2Epa3) or β1-4-linked sulfated galactosides (Epa1A CBL2Epa9 and Epa3A CBL2Epa1) instead of β1-3-linked galactosides (Epa1A, Epa3A, and Epa9A) or nongalactosides (Epa2A). Similarly, novel ligand binding patterns were found for these chimeras when analyzing the best-bound terminal disaccharide types. For instance, Epa1A CBL2Epa2 and Epa1A CBL2Epa3 were both able to efficiently bind to α1-4-linked galactosides, ligands bound by neither recipient nor donor domains. Likewise, Epa1A CBL2Epa2, Epa1A CBL2Epa3, and Epa1A CBL2Epa9 have a novel ligand binding specificity toward sulfated galactosides. In the case of the three other chimeras, the best-bound ligands were found to be either identical or highly similar to the best binder of either the recipient (Epa9A CBL2Epa1), the recipient/donor (Epa1A CBL2Epa6), or the donor (Epa3A CBL2Epa2), respectively. The ligand binding patterns of these chimeras were either highly similar to (Epa9A CBL2Epa1) or significantly shifted toward the patterns of the respective donors (Epa1A CBL2Epa6 and Epa3A CBL2Epa2).
In summary, these data indicate that the exchange of the CBL2 motif preferably favors the creation of novel ligand binding patterns instead of simply transferring ligand binding specificities from one EpaA domain to another.
Fig. 3. [...] (Table S2). Pie charts show the glycan types that are bound by the different EpaA domains with a strength of at least 50% relative to the best bound glycan (chip v5.4 ID numbers below) and for which terminal disaccharide types are unambiguous. Different classes of EpaA variants (natural variants, CBL2 exchange variants, and L1 exchange variants) are indicated on the left. Profiles of natural variants and of Epa1A CBL2Epa2, Epa1A CBL2Epa2, and Epa1A CBL2Epa2 were previously described (12,15) and were adjusted to match the data of CBL2 and L1 exchange variants obtained in this study. Glycan types shown are color coded according to the terminal (primary) carbohydrate moiety and the linkage type to the following (second) carbohydrate unit as indicated on the lower left. The areas shown correspond to the logarithm of the fluorescence signal over noise for the best binder, as exemplified by the two black circles on the lower right. B, structures of best bound glycans listed in panel A according to the CFG chip v5.4 ID number. Symbols for carbohydrate moieties are indicated on the right.
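The selection rule used for comparing glycan array profiles (best binder plus all glycans reaching at least 50% of its signal) and the logarithmic signal-over-noise scaling mentioned in the Fig. 3 legend can be sketched as follows. This is only an illustration: the function names, the glycan labels, and the signal values are hypothetical and are not taken from the array data.

```python
from math import log10


def strong_binders(signals, threshold=0.5):
    """Return the best binder and all glycans at or above the relative threshold.

    signals maps a glycan label to its fluorescence signal; the result maps
    each retained glycan to its signal relative to the best binder.
    """
    if not signals:
        raise ValueError("no signals given")
    best_id = max(signals, key=signals.get)
    best = signals[best_id]
    if best <= 0:
        raise ValueError("best signal must be positive")
    kept = {g: s / best for g, s in signals.items() if s / best >= threshold}
    return best_id, kept


def log_signal_over_noise(signal, noise):
    """Base-10 logarithm of signal over noise (the base is an assumption)."""
    return log10(signal / noise)


# Hypothetical readout: glycan label -> fluorescence signal (arbitrary units).
example = {"glycan_A": 8000.0, "glycan_B": 4000.0, "glycan_C": 1500.0}
best_id, kept = strong_binders(example)
```

In this toy readout, glycan_A is the best binder, glycan_B is retained at exactly the 50% cutoff, and glycan_C is excluded.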

Structural basis for CBL2-mediated ligand discrimination by Epa1A, Epa6A, and Epa9A
Next, we wanted to understand how the structure of the CBL2 loop controls the ligand binding specificity in the Epa family. So far, only two structures are available, Epa1A and Epa6A. Here, we aimed at extending this list by solving the crystal structure of Epa9A. Epa9A was chosen because it has a binding profile that stands between that of Epa1A, which prefers glycans with terminal Galβ1-3-linked disaccharides, and Epa6A, which has a high preference for terminal Galα1-4-linked disaccharides (15). In addition, Epa9A functionally differs from Epa1A by its preference for long and branched glycan structures, as revealed by the best binding glycans found by the array analysis. Moreover, Epa9A is characterized by an unusually long L1 loop, whose function has not been studied yet.
Here, we were able to solve the crystal structures of Epa9A in complex with lactose (Galβ1-4Glc), lacto-N-biose (Galβ1-3GlcNAc), and N-acetyl-D-lactosamine (Galβ1-4GlcNAc) (Fig. 4 and Table 2). The comparison with corresponding cocrystals of Epa1A and Epa6A reveals that the structural core of Epa9A is very similar to that of Epa1A and Epa6A (Fig. S4), as the overall root mean square deviation (r.m.s.d.) values between Epa9A and Epa1A/Epa6A are only 0.64/0.62 Å for 144/156 Cα atoms (Fig. 4B). Interestingly, the unusually long L1 region of Epa9A (Y70-G128) lacks defined electron density for the part of its loop (N77-G107) that protrudes over the glycan binding site, indicating that this elongated loop is very flexible and, therefore, has an inhomogeneous orientation in the crystals. With respect to the ligand binding modes, the orientation of the terminal galactose moiety in the different complexes of Epa6A and Epa9A is comparable. However, interactions with this hexose vary because of different residues in the CBL2 region at positions III and IV, which are N228/D229 in Epa6A and Y259/H260 in Epa9A, respectively. Furthermore, Epa6A and Epa9A show clearly distinct binding modes for the recognition of the secondary hexose of the different disaccharide ligands. In the case of lacto-N-biose (Galβ1-3GlcNAc), the secondary GlcNAc residue is rotated by about 90° between Epa6A and Epa9A complexes (Fig. 4C). Direct interaction between GlcNAc and Epa6A is mediated by D227 and N228 (positions II and III) but only by D258 (position II) in the case of Epa9A. The conformation of the N-acetyl group of GlcNAc in Epa9A also mimics that of the C4-OH group of GlcNAc in the Epa6A complex. In the case of N-acetyl-D-lactosamine (Galβ1-4GlcNAc), the ring moiety of the secondary GlcNAc is off-rotated by ~180° between the respective Epa6A and Epa9A complexes (Fig. 4D).
This striking difference for the β1-4 linkage might be explained by differences in the number and type of ligand interactions as well as the distinct steric claims of the two binding pockets. Whereas Epa6A interacts with the 6-OH group of GlcNAc via CBL2 position III (N228), Epa9A contacts the same 6-OH group (3.0 Å) via CBL2 position II (D258) and the disulfide bridge (Fig. 4D). In the Epa9A complex, the CBL2 position III (Y259) is close to the N-acetyl group of the ligand (4.1 Å), which occupies the position of the 6-OH group found in the Epa6A complex. As a consequence, the ligand interactions found in Epa6A allow the sterically preferred distorted orientation of the secondary GlcNAc moiety, whereas in Epa9A both hexoses are almost coplanar. Apparently, the steric claim of the two voluminous residues at CBL2 positions III/IV of Epa9A enforces a coplanar arrangement of the two hexoses, whereas the smaller residues at the corresponding CBL2 positions in Epa6A appear to allow more flexibility in the orientation of the secondary sugar.
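For readers less familiar with the structural metrics quoted above, the following minimal Python sketch shows how an r.m.s.d. over matched Cα atoms and an interatomic distance are computed from Cartesian coordinates in Å. It assumes that the two structures have already been superimposed and that the atoms are listed in matching order; it does not perform the superposition itself, and it is not the software used for the analysis in this study. The coordinates are hypothetical.

```python
from math import dist, sqrt


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two matched lists of (x, y, z) points.

    Assumption: both coordinate sets are already superimposed and ordered so
    that coords_a[i] corresponds to coords_b[i].
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared = sum(dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return sqrt(squared / len(coords_a))


def contact_distance(atom_a, atom_b):
    """Euclidean distance between two atoms, e.g. a ligand hydroxyl oxygen and a side-chain atom."""
    return dist(atom_a, atom_b)


# Hypothetical coordinates: two atoms, each displaced by 1 Å along z.
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
example_rmsd = rmsd(model_a, model_b)
```

Because every atom pair in this toy example is displaced by the same amount, the r.m.s.d. equals that displacement.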
To further determine how the structure of CBL2 controls ligand binding specificity, we also aimed at solving the crystal structures of selected CBL2 exchange variants. To this end, we were able to solve the structures of Epa1A CBL2Epa9 and Epa9A CBL2Epa1 in complex with lactose (Table 2), enabling us to compare them with the corresponding complexes of Epa1A (23) and Epa9A. This comparison reveals that while the structures of the different binding pockets as well as the position of the terminal galactose moiety of the ligand are highly similar, the position of the secondary glucose moiety differs significantly (Fig. 5). The comparison of the Epa9A and Epa9A CBL2Epa1 complexes further reveals significant differences in the distances between the 6-OH group of glucose and the interacting residues (D or E, respectively) at CBL2 position II. This distance corresponds to 2.6 Å in the Epa9A complex but 3.4 Å in Epa9A CBL2Epa1, indicating a less efficient binding of β1-4-linked galactosides in the CBL2 exchange variant. This finding agrees with our glycan profiling, which demonstrates that in contrast to Epa9A, Epa9A CBL2Epa1 is unable to efficiently bind to β1-4-linked galactosides and further explains the reprogramming of the ligand binding pattern of Epa9A CBL2Epa1 toward that of Epa1A (Fig. 3). Finally, a direct comparison of the secondary glucose position of Epa1A with the other variants is not possible because of the lack of sufficient electron density of this ligand moiety in the Epa1A complex (23). Interestingly, our complex with Epa1A CBL2Epa9 shows a defined position of the secondary glucose, which otherwise lacks direct binding to CBL2 but makes staggering interactions with a histidine of the affinity tag (Fig. S5).

Reprogramming of ligand binding by exchange of L1 loops
Our previous data indicate that variable regions other than CBL2 must be involved in ligand binding, because we identified four pairs of Epa domains with identical CBL2 motifs but distinct ligand binding patterns (15). Here, we examined the role of the L1 loop in determining target cell adhesion and glycan binding specificity, because this region is highly variable not only in sequence composition but also in length (Fig. 4 and Fig. S1). For this purpose, we constructed six chimeras in which L1 loops were exchanged between Epa1A, Epa6A, Epa9A, and Epa10A. These variants were chosen because (i) Epa9A and Epa10A have identical CBL2 motifs but exhibit significantly distinct glycan binding patterns, (ii) Epa9A and Epa10A differ from most other EpaA variants by carrying an extended L1 loop with an additional 29 to 30 amino acids in their A domains, and (iii) Epa1A, Epa6A, and Epa9A are the structurally best-characterized Epa adhesion domains, enabling a solid design of chimeras. Resulting L1 chimeras were then characterized by epithelial cell adhesion and glycan profiling analogous to the CBL2 exchange variants.
As shown in Fig. 2, epithelial cell binding revealed that two L1 chimeric variants, Epa9A L1Epa10 and Epa10A L1Epa9, did not display an altered binding behavior (Table S1) and displayed the low binding efficiency observed for both donor and recipient. One variant (Epa1A L1Epa9) displayed the binding behavior of the recipient, because transfer of the L1 loop did not significantly reduce cell adhesion. In the case of three L1 chimeras (Epa6A L1Epa9, Epa9A L1Epa1, and Epa10A L1Epa1), epithelial cell binding was comparable with the binding behavior of the donor. The two chimeras Epa9A L1Epa1 and Epa10A L1Epa1 are especially striking, because in these cases transfer of the L1 loop from the strong binding Epa1A donor is sufficient to convert the much weaker binding Epa9A and Epa10A variants into strong epithelial cell binders. As such, Epa9A L1Epa1 and Epa10A L1Epa1 clearly differ from the Epa9A CBL2Epa1 and Epa10A CBL2Epa1 variants tested above (Table S1), which display the weak target cell behavior of their respective recipients. To test which of the two structural motifs CBL2 or L1 plays a dominant role, we also created two double mutants, Epa9A CBL2+L1Epa1 and Epa10A CBL2+L1Epa1. Here, we found that both double mutants behaved like the CBL2 exchange variants (Table S1), indicating a dominant role for CBL2. In addition, we also analyzed the Epa1A CBL2+L1Epa9 double exchange variant, because the Epa1A CBL2Epa9 and Epa1A L1Epa9 single exchange variants both behave like the Epa1A recipient. Here, we found that Epa1A CBL2+L1Epa9 confers the weak adhesion behavior of the donor, indicating that both motifs must be exchanged to alter functionality. In summary, the results of this section argue for an important functional role of the L1 loop in target cell binding and further demonstrate that this functionality can be fully transferred from one EpaA variant to another by exchange of the L1 region.
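The recipient/donor/novel categories applied to the CBL2 and L1 exchange variants can be read as a decision on two pairwise comparisons, chimera versus recipient and chimera versus donor. The sketch below illustrates this logic in Python. The function name, the category labels, and the significance cutoff of 0.05 are assumptions made for illustration; the P values actually used for classification are those listed in Table S1.

```python
def classify_chimera(p_vs_recipient, p_vs_donor, alpha=0.05):
    """Classify a chimera from two P values of pairwise adhesion comparisons.

    alpha is a hypothetical significance cutoff, not a value taken from the study.
    """
    differs_from_recipient = p_vs_recipient < alpha
    differs_from_donor = p_vs_donor < alpha
    if differs_from_recipient and differs_from_donor:
        return "novel"  # comparable with neither parent
    if differs_from_recipient:
        return "donor-like"  # changed relative to recipient, matches donor
    if differs_from_donor:
        return "recipient-like"  # unchanged relative to recipient
    return "donor/recipient-like"  # indistinguishable from both parents


# Hypothetical P values for four imaginary chimeras.
examples = {
    "chimera_1": classify_chimera(0.40, 0.001),
    "chimera_2": classify_chimera(0.002, 0.30),
    "chimera_3": classify_chimera(0.001, 0.003),
    "chimera_4": classify_chimera(0.50, 0.60),
}
```

The last category corresponds to cases in which donor and recipient themselves bind with similar efficiency, so that an exchange cannot produce a detectable shift toward either parent.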
To further analyze functionality of the L1 loop, we also determined the carbohydrate specificity profiles of five L1 chimeras. Here, we found that in the case of three chimeras (Epa1A L1Epa9, Epa6A L1Epa9, and Epa9A L1Epa10), glycan profiles were comparable with the profiles of the recipient domains. Moreover, best-bound ligands were also comparable with the recipient's best binders with β1-3- or β1-4-linked terminal galactosides. In the cases of Epa9A L1Epa1 and Epa10A L1Epa9, respectively, a pattern was found that was similar to the recipient patterns but included additional sulfated galactosides (Epa9A L1Epa1) or α1-4- and β1-4-linked galactosides (Epa10A L1Epa9). Therefore, compared with CBL2, exchange of the L1 loop does not appear to have a major impact on ligand binding specificity.

Discussion
In this study, we have performed a comprehensive structure-based mutational analysis of both conserved and variable motifs of functionally diverse members of the family of epithelial adhesins of C. glabrata to contribute to a deeper understanding of the molecular evolution of protein families with large numbers of closely related members and to provide a better structural basis for the engineering of synthetic lectins with novel properties. Our study reveals that three highly conserved motifs, (i) the DcisD motif, (ii) the aromatic side chain of the tryptophan residue in loop L3, and (iii) the disulfide bond linking loops L1 and L2, are essential for efficient host cell binding. The DcisD calcium binding motif is a hallmark of the large group of PA14/Flo5-like adhesin domains and is part of the DD-N signature that can be found in over 85% of the known fungal EpaA-like domains (15,26). Our previous study with the Flo5 adhesion domain from S. cerevisiae has shown that concomitant mutation of both aspartates of this motif abrogates the in vivo flocculation activity of Flo5 (25), but mutation of DcisD motifs in other PA14/Flo5-like domains has not been performed. Our analysis with Epa1A shows that mutation of a single aspartate is sufficient to reduce in vivo binding of this adhesion domain to epithelial cells and bolsters the functional importance of this highly conserved Ca2+-binding motif. With respect to the conserved tryptophan in loop L3, we have previously shown that this aromatic residue is crucial for efficient binding of terminal galactose moieties in vitro and for efficient host cell binding in vivo (12). Our present mutational analysis suggests that the proposed stacking function of the indole side chain of W198 of Epa1A can be substituted by other aromatic groups such as imidazole or phenol, given that the corresponding mutational variants are fully functional in vivo (Fig. 2).
However, functionally analogous tryptophans are highly conserved in the binding pockets of PA14/Flo5 adhesins of the glabrata group of Nakaseomyces yeasts and in structurally unrelated carbohydrate binding sites of galectins, which typically bind galactosides with high specificity (27). Replacements with other aromatics, like tyrosine, are only found for a few Epa1-like adhesins, e.g. in the yeast Kluyveromyces marxianus (GenBank entry BAP73700).
In contrast, fucose-specific F-type lectins prefer a histidine or phenylalanine residue in their glycan binding sites for stacking interactions (28). Thus, although our mutational analysis indicates that the different aromatic side chains are generally interchangeable, there may be subtle differences with respect to their stacking interactions with specific hexose ligands, which led to the apparent correlation between tryptophans and galactosides. Whereas the interaction energies between galactose and Trp, Tyr, or Phe are comparable, Trp offers more positional interaction options because of its size (29). Finally, our finding that the disulfide bond linking loops L1 and L2 of Epa1A is essential for efficient host cell binding indicates its crucial role in conferring stability and/or proper folding of the outer binding pocket, whose functional role in Epa-related adhesins has not been addressed so far by mutational analysis.
A central aspect of this study concerns the structure-based functional analysis of the CBL2 motif in EpaA domains, because previous work has revealed significant correlations between CBL2 sequences and binding specificity (12,13,15). As mutational studies were performed only for Epa1 so far, here we analyzed the functional role of CBL2 in five further adhesion domains from Epa2, Epa3, Epa6, Epa9, and Epa10. Our mutational data support a general functional role of this motif in EpaA domains, given the fact that most CBL2 exchange variants exhibit altered host cell binding. Our glycan array analysis also reveals that most CBL2 exchange variants show specific ligand binding patterns and retain galactosides as best binders. However, the binding profiles of most CBL2 exchange variants resemble the recipient and/or donor profiles only in part and include novel specificities. Therefore, it appears that exchange of whole CBL2 motifs alone is insufficient to completely transfer a given specificity profile, indicating that in general a given CBL2 motif must be combined with other structural features to reprogram Epa domains for specificity.
Here, we also provide novel high-resolution insights into the programming of ligand binding by structural hot spots. Our previous structural studies with Epa1A and Epa6A have shown in detail how positions II and III of CBL2 can confer discrimination between α- and β-linked galactosides (12,15). In the present study, we now provide high-resolution structures of not only an additional natural variant, Epa9A, but also of two CBL2 exchange variants, Epa1A CBL2Epa9 and Epa9A CBL2Epa1. Importantly, these structures allow us to address the question of to what extent the spatial orientation of a given CBL2 motif varies in different EpaA domains and programs ligand binding. Here, a comparison of the cocrystal structures of Epa9A and Epa1A CBL2Epa9 bound to lactose reveals that the spatial orientation of the CBL2 residues II and III is identical in both variants, but that the structures exhibit significant differences with respect to the binding mode of the secondary sugar moiety (Fig. 5). This finding again supports the view that CBL2 motifs must generally be combined with other structural features to unambiguously program ligand binding specificity. Our structural data might also explain the finding that the binding of Epa9A to α-linked galactosides cannot be efficiently transferred to Epa1A by exchange of the corresponding CBL2 motif. However, we have not been able to obtain cocrystals of Epa9A or Epa1A CBL2Epa9 bound to α-linked galactosides to address this issue. Because of the unavailability of appropriate glycan ligands for cocrystallization, we could not further investigate our finding that Epa1A CBL2Epa9, compared with Epa9A, preferentially binds sulfated galactosides. Our data allow a further structural comparison of lactose-bound Epa1A and Epa9A CBL2Epa1, revealing that in this example both the spatial orientation of CBL2 residues II and III and the corresponding glycan profiles are highly similar.
This finding indicates that the exchange of the CBL2 motif alone is sufficient to transfer ligand binding specificity in this case. However, because the in vivo and the in vitro binding efficiencies of Epa9A CBL2Epa1 are significantly reduced, one might infer that this variant is largely nonfunctional. Finally, our data allow us to determine the effects of various CBL2 motifs on the binding pocket conformation and the ligand binding profiles of a given EpaA domain. Here, the comparison of Epa1A and Epa1A CBL2Epa9 reveals that the overall conformations of the two binding pockets are highly similar, but that there are significant differences regarding ligand binding specificity. The highly improved binding of Epa1A CBL2Epa9 toward sulfated galactosides might be explained by the exchange of CBL2 position II from glutamate to a sterically less demanding aspartate (E227D). This finding is in agreement with a previous study showing that the Epa1A E227A variant also displays improved binding to sulfated glycans (13).
A further important outcome of our study concerns the variable loop L1 of EpaA domains, whose function has not been addressed experimentally in previous studies. Our data with a number of different exchange variants demonstrate that L1 loops can have important functional roles with respect to host cell binding, as exemplified by the two gain-of-function variants Epa9A L1Epa1 and Epa10A L1Epa1 and the loss-of-function variant Epa6A L1Epa9. In contrast to the CBL2 motif, however, the loop L1 does not appear to have a major impact on ligand specificity, as exemplified by our glycan array analysis. Nevertheless, the ligand binding profile of the gain-of-function variant Epa9A L1Epa1 reveals an increase in specificity toward sulfated galactosides, which might be attributed to enhanced host cell binding. What could the structural basis for the functionality of L1 be? Our structure of Epa9A does not directly contribute to answering this question, because its elongated loop L1 is not defined by electron density. However, L1 of Epa9A appears to be very flexible, indicating that this elongated loop adopts different conformations and might, e.g., act as a regulatory lid for the inner binding pocket. This possibility is supported by the altered functionality of the three L1 exchange variants discussed above. Importantly, exchange of an elongated loop for a shorter version causes increased host cell binding (Epa9A L1Epa1 and Epa10A L1Epa1), whereas the opposite exchange leads to decreased binding efficiency (Epa6A L1Epa9). Moreover, our data indicate that such a regulatory function of L1 would require a matching combination of EpaA binding pocket and L1 sequence, because the transfer of the elongated loop L1 from Epa9A to Epa1A does not affect host cell binding or ligand specificity.
A further interesting observation with regard to a possible function of L1 concerns the binding specificity toward more complex glycans, as measured by in vitro glycan profiling using purified proteins. Here, we have noticed that Epa9A and Epa10A, both carrying an elongated L1 loop, bind to branched glycans consisting of more than eight sugar moieties with a roughly 1.8-fold higher frequency than Epa1A, Epa3A, and Epa6A, all carrying short L1 loops. This indicates that longer L1 loops could act as regulatory lids of the inner binding pocket, and, when adopting an open conformation, might confer additional binding subsites for the outer moieties of longer and branched glycan ligands. Obviously, our data indicate that conditions used for in vitro glycan array analysis favor an open L1 conformation, whereas results from in vivo host cell binding appear to better reflect a closed state. Finally, our data with Epa9A L1Epa1 carrying a short loop suggest that the length or composition of L1 affects the ligand binding specificity by long-range fine-tuning of the binding pocket, given the additional binding of sulfogalactosides by this variant compared with Epa9A.
In general, our study underscores the view that deciphering the sugar code and successful lectin engineering are challenging tasks that crucially depend on highly detailed structural and functional studies on glycan-lectin interactions (30,31). Our study clearly shows that reaching these goals not only depends on uncovering novel lectin folds but also requires extensive functional analysis. Here, we used a directed mutational approach to directly address the evolution of ligand binding specificity in the Epa family of C. glabrata. Whereas we have found that reprogramming of Epa adhesins by exchange of structural hotspots is generally feasible, our study also demonstrates that programming of a defined specificity remains a highly challenging task. Here, a random mutagenesis and selection scheme of the CBL2 motif has been found to be a promising approach (13). Thus, future approaches using Epa-like adhesins and employing random-sequence library-directed evolution of structural hotspots (32,33) might allow the creation of lectins with defined specificities. Finally, our study provides a significant number of novel Epa proteins with well-characterized binding patterns that might be useful tools not only for glycan profiling of biological surfaces but also for medical diagnostics. Specifically, future application of these novel lectins and further Epa variants created by directed or random mutation of functionally relevant structural hotspots might include their use in the generation of microarray devices for high-throughput analysis of glycosylation patterns of single proteins, for diagnostic detection of complex carbohydrate structures on mammalian cells, e.g. in tumors, or for the analysis of glycans present on microbial cell surfaces (34). Further applications might include lectin affinity chromatography, histo- and cytochemical approaches, lectin blotting, and biosensor technologies (35).

Yeast strains
In vivo adhesion assays were performed with the nonadhesive S. cerevisiae strain BY4741 (European Saccharomyces cerevisiae Archive for Functional Analysis) carrying appropriate plasmids (Table 1). Standard methods for yeast culture medium and transformation were used as described previously (36).

Plasmids
All plasmids used in this study are listed in Table 1. Amino acid residues are numbered according to sequences described in the UniProt database (www.uniprot.org). For expression in S. cerevisiae, a YCplac33-based expression system was used carrying (i) the PGK1 promoter, (ii) the FLO11 secretion signal, (iii) a 3-fold hemagglutinin tag, (iv) the FLO11BC domain containing amino acids 214 to 1360, and (v) the FLO11 terminator. Mutation of CBL2 was performed by site-directed mutagenesis. EPAA variants with switched L1 regions were generated by synthesis of DNA fragments (Thermo Fisher Scientific GeneArt, Regensburg, Germany). These fragments were then inserted into the S. cerevisiae expression plasmids via SacII/SacI restriction sites. For in vitro studies, the EpaA variants were produced in E. coli by using the pET-28a(+) expression system. These expression plasmids were constructed by amplifying each EPAA domain with primers carrying NdeI/XhoI or NheI/XhoI restriction sites, using the YCplac33-based expression plasmids described above as a template. The resulting fragments were inserted into the pET-28a(+) vector by restriction and ligation.
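When planning restriction-based cloning of the kind described above, it can be useful to check that an insert carries no internal recognition sites for the enzymes used at its ends. The following Python sketch illustrates such a check. It is not part of the published procedure: the recognition sequences are the standard ones for the enzymes named in the text, whereas the function name and the example sequence are hypothetical.

```python
# Standard recognition sequences of the restriction enzymes named in the text.
# All five sites are palindromic, so scanning one strand is sufficient.
RECOGNITION_SITES = {
    "NdeI": "CATATG",
    "NheI": "GCTAGC",
    "XhoI": "CTCGAG",
    "SacI": "GAGCTC",
    "SacII": "CCGCGG",
}


def find_internal_sites(insert_seq, enzymes):
    """Return {enzyme: [0-based start positions]} for sites found in insert_seq."""
    seq = insert_seq.upper()
    hits = {}
    for enzyme in enzymes:
        site = RECOGNITION_SITES[enzyme]
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            # Continue one base further so overlapping sites are also reported.
            start = seq.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    # Hypothetical sequence for illustration only, not an EPAA fragment.
    example_insert = "ATGGCTAGCAAACCGCTCGAGTTT"
    print(find_internal_sites(example_insert, ["NdeI", "NheI", "XhoI"]))
```

An empty result for the enzymes of interest means the fragment can be cut at its flanking sites without being cleaved internally.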

Recombinant overproduction and crystallization of EpaA domains
All EpaA domains were produced using the E. coli strain SHuffle T7 express (New England Biolabs GmbH, Frankfurt, Germany) and a low-temperature protocol (25). Proteins were purified by nickel-nitrilotriacetic acid affinity chromatography (Macherey-Nagel, Düren, Germany) and subsequent size exclusion chromatography using a HiLoad Superdex 75-pg column (GE Healthcare, Munich, Germany). All steps were carried out in AML buffer (20 mM Tris-HCl, pH 8.0, 200 mM NaCl, 50 mM lactose).

Data collection, structure solution, and analysis
Datasets were recorded at the BESSY II synchrotron (Berlin), Beamline 14.1, and at the ESRF (Grenoble), Beamline ID23-1. Data integration was done with XDS (37). XSCALE (38) or SCALA (39) was used for scaling, each run within the CCP4 (40) software suite. The structures of ligand-soaked Epa9A and Epa1A CBL2Epa9 were solved via molecular replacement with PHASER (41) using a carefully trimmed model of Epa1A (PDB entry 4AF9). For structure solution of Epa9A CBL2Epa1, the structure of lactose-bound Epa9A (PDB entry 4CP0) was used as a model. Refinement was done with alternating rounds of phenix.refine (42) and Coot (43).

High-throughput glycan binding assays and analysis
Recombinant EpaA domains were fluorescently labeled using an Alexa Fluor 488 THF kit (Life Technologies, Darmstadt, Germany) and applied to glycan arrays v5.4 from the Consortium for Functional Glycomics (CFG) at a protein concentration of 50 µg/ml or 200 µg/ml. Array surfaces were repeatedly washed and remaining fluorescence was measured and quantified. For comparison, all data obtained in this study and previously published data (12,15) were adjusted to glycans present on array v5.4 (Table S2).
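One way to make datasets from different array versions comparable is to keep only the glycans present on the reference array and to express each signal relative to the strongest binder. The Python sketch below illustrates this idea under stated assumptions: the text does not specify how the adjustment was carried out, the scaling to the top signal is an illustrative choice, and all glycan identifiers and fluorescence values are hypothetical.

```python
def restrict_to_reference(dataset, reference_glycans):
    """Keep only glycans that are present on the reference array version."""
    return {glycan: signal for glycan, signal in dataset.items()
            if glycan in reference_glycans}


def relative_binding(dataset):
    """Scale signals so that the strongest binder equals 100 (assumed scheme)."""
    top = max(dataset.values())
    if top <= 0:
        raise ValueError("dataset contains no positive signal")
    return {glycan: 100.0 * signal / top for glycan, signal in dataset.items()}


if __name__ == "__main__":
    # Hypothetical glycan identifiers and fluorescence values.
    reference = {"glycan_A", "glycan_B", "glycan_C"}
    older_array = {"glycan_A": 1200.0, "glycan_B": 300.0, "glycan_X": 900.0}

    shared = restrict_to_reference(older_array, reference)
    for glycan, value in sorted(relative_binding(shared).items()):
        print(glycan, round(value, 1))
```

Restricting before scaling ensures that a strong binder absent from the reference array does not distort the relative profile.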

Cell surface presentation and quantification of EpaA domains in S. cerevisiae
For expression and cell surface presentation of EpaA domains, the S. cerevisiae strain BY4741 carrying plasmids with the appropriate P_PGK1-3HA-EPAA-FLO11BC constructs was used (Table 1). Cultures were grown in low fluorescence yeast medium to an optical density at 595 nm of 1 before cells were washed three times in PBS-1% BSA. Cells then were incubated with a monoclonal mouse anti-HA antibody (H3663; Sigma Aldrich, Munich, Germany) at a dilution of 1:1000 in PBS-1% BSA for 30 min at room temperature. After three wash steps, cells were incubated in darkness with a Cy3-conjugated secondary goat anti-mouse antibody (C2181; Sigma Aldrich, Munich, Germany) at a dilution of 1:10,000 in PBS-1% BSA for 20 min at room temperature. After three further washing steps, a Zeiss Axiovert 200 M microscope was used to visualize S. cerevisiae cells with differential interference contrast and detect EpaA domains at the cell surface using a rhodamine filter set (AHF Analysentechnik AG, Tübingen, Germany). Cells were photographed with a Hamamatsu Orca ER digital camera (Hamamatsu, Bridgewater, NJ), and pictures were processed and analyzed using the Improvision Volocity software (PerkinElmer, Hamburg, Germany). Fluorescence signals of at least 200 yeast cells were then quantified using ImageJ software (44).

Adhesion of S. cerevisiae to human epithelial cells
Adhesion assays were performed by following a modified protocol described previously (15) and by using the human epithelial cell line Caco-2 (American Type Culture Collection HTB-37) in combination with S. cerevisiae strain BY4741 carrying plasmids with the appropriate P_PGK1-3HA-EPAA-FLO11BC constructs (Table 1). Preparation of human cells was performed as described previously (15). Briefly, 48-well polystyrene plates with confluent monolayers of Caco-2 cells were prepared after removal of the culture medium and addition of 250 µl fresh prewarmed DMEM without gentamicin. S. cerevisiae strains carrying appropriate plasmids were grown in YPD medium at 30°C to an optical density at 595 nm of 1, and cultures were washed three times in PBS-1% BSA. S. cerevisiae cells were then uniformly labeled by incubation with a polyclonal rabbit anti-Candida antibody (BP1006, OriGene, Herford, Germany) for 60 min, followed by three washing steps with PBS-1% BSA. A secondary DyLight488-conjugated goat anti-rabbit antibody (CW800, LI-COR, Bad Homburg, Germany) then was added and incubated for 30 min in darkness. After three additional wash steps, 0.4 optical density units of labeled yeast cells were added to a 48-well plate containing monolayers of human epithelial cells in 250 µl fresh prewarmed DMEM without gentamicin. After incubation for 120 min, nonadherent yeast cells were carefully removed by washing with 200 µl PBS per well. Residual cells were scraped from the plate, and adhesive yeast cells were quantified by using a fluorimeter. As a control (total cell control), 0.4 optical density units of yeast cells were added to wells of the same 48-well plate that lacked mammalian cells and contained 250 µl fresh prewarmed DMEM without gentamicin, and these cells were quantified in a fluorimeter after incubation, along with the adhesive cells.
Relative adhesion values were then calculated by dividing the fluorescence values measured for the adhesive cells by the fluorescence values measured for the total cell control.
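The calculation of relative adhesion can be sketched in a few lines of Python. This is a minimal illustration of the ratio described above, not the analysis script used in this study. The function names and all fluorescence readings are hypothetical, and summarizing replicates by mean and standard deviation is an assumption.

```python
from statistics import mean, stdev


def relative_adhesion(adherent_signal, total_signal):
    """Fluorescence of adherent cells divided by that of the total cell control."""
    if total_signal <= 0:
        raise ValueError("total cell control signal must be positive")
    return adherent_signal / total_signal


def summarize_replicates(pairs):
    """Mean and sample standard deviation over (adherent, total) replicate pairs."""
    values = [relative_adhesion(adherent, total) for adherent, total in pairs]
    spread = stdev(values) if len(values) > 1 else 0.0
    return mean(values), spread


if __name__ == "__main__":
    # Hypothetical fluorimeter readings: (adherent cells, total cell control).
    replicates = [(210.0, 1000.0), (260.0, 1040.0), (240.0, 960.0)]
    avg, sd = summarize_replicates(replicates)
    print(f"relative adhesion: {avg:.3f} +/- {sd:.3f}")
```

Because each adherent signal is divided by the total cell control, the values should be comparable between strains even if labeling efficiency differs.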

Bioinformatic analysis
Figures of protein structures were generated with the Molecular Graphics Software PyMOL v2.3.0 (Schrödinger, LLC). The alignment was constructed using Clustal Omega (45). Statistical analysis of data obtained by cell adhesion tests was performed using Prism software (GraphPad Software, La Jolla, CA, USA). Statistical differences were considered significant with a p value of <0.01, as determined by an unpaired t test.
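For readers who wish to reproduce the significance criterion without Prism, the following Python sketch implements an unpaired two-sample t test using only the standard library. It is an illustration, not the software used in this study. It assumes equal variances in both groups (the pooled-variance Student form), obtains the two-sided p value by numerical integration of the t density, and uses hypothetical relative adhesion values.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t_stat, df, steps=20000):
    """Two-sided p value via Simpson integration of the density on [0, |t|]."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3
    return max(0.0, 1.0 - 2.0 * central_area)


def unpaired_t_test(sample_a, sample_b):
    """Pooled-variance (equal variance assumed) unpaired t test: (t, df, p)."""
    na, nb = len(sample_a), len(sample_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(sample_a) + (nb - 1) * variance(sample_b)) / df
    std_err = math.sqrt(pooled_var * (1 / na + 1 / nb))
    if std_err == 0:
        raise ValueError("both samples have zero variance")
    t_stat = (mean(sample_a) - mean(sample_b)) / std_err
    return t_stat, df, two_sided_p(t_stat, df)


if __name__ == "__main__":
    # Hypothetical relative adhesion values for two strains.
    control = [0.21, 0.25, 0.23, 0.24]
    variant = [0.05, 0.07, 0.04, 0.06]
    t_stat, df, p = unpaired_t_test(control, variant)
    print(f"t = {t_stat:.2f}, df = {df}, significant at p < 0.01: {p < 0.01}")
```

If the two groups cannot be assumed to have equal variances, Welch's correction would be the more appropriate variant.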

Data availability
The glycan array data obtained and/or analyzed in this publication have been deposited at the Consortium for Functional Glycomics (www.functionalglycomics.org) under the identifiers cfg_rRequests #2080, #2554, #2645, #2794, #2854, and #3502. The atomic coordinates and structure factors of EpaA domains obtained in this study have been deposited in the Protein Data