The 1.9 Å crystal structure of the extracellular matrix protein Bap1 from Vibrio cholerae provides insights into bacterial biofilm adhesion

Growth of the cholera bacterium Vibrio cholerae in a biofilm community contributes to both its pathogenicity and survival in aquatic environmental niches. The major components of V. cholerae biofilms include Vibrio polysaccharide (VPS) and the extracellular matrix proteins RbmA, RbmC, and Bap1. To further elucidate the previously observed overlapping roles of Bap1 and RbmC in biofilm architecture and surface attachment, here we investigated the structural and functional properties of Bap1. Soluble expression of Bap1 was possible only after the removal of an internal 57-amino-acid-long hydrophobic insertion sequence. The crystal structure of Bap1 at 1.9 Å resolution revealed a two-domain assembly made up of an eight-bladed β-propeller interrupted by a β-prism domain. The structure also revealed metal-binding sites within canonical calcium blade motifs, which appear to have structural rather than functional roles. Contrary to results previously observed with RbmC, the Bap1 β-prism domain did not exhibit affinity for complex N-glycans, suggesting an altered role of this domain in biofilm-surface adhesion. Native polyacrylamide gel shift analysis did suggest that Bap1 exhibits lectin activity with a preference for anionic or linear polysaccharides. Our results suggest a model for V. cholerae biofilms in which Bap1 and RbmC play dominant but differing adhesive roles in biofilms, allowing bacterial attachment to diverse environmental or host surfaces.

Vibrio cholerae, the bacterium responsible for pandemic cholera, forms three-dimensional biofilms that aid in V. cholerae transmission, pathogenicity, and environmental persistence (1-3). The biofilm structure is composed of a specialized bacterial community with distinct growth stage properties, held together by an extracellular matrix of polysaccharides, proteins, and nucleic acids (3-5). V. cholerae biofilms have been implicated in mammalian pathogenicity by increasing the infectious dose and by providing resistance to the acidic environment of the stomach (1,4). In the natural aquatic niche, biofilms aid in the persistence of V. cholerae by providing protection from environmental threats such as predation and nutrient limitation (2,5-8).
The major components of the V. cholerae biofilm matrix are Vibrio polysaccharide (VPS), nucleic acids, and the matrix proteins RbmA, RbmC, and Bap1 (3,8,9). Vibrio polysaccharide is formed by repeating units of an acetylated tetrasaccharide unique to V. cholerae, whose synthesis and export are carried out by the products of the vps I and vps II gene clusters (10-12). RbmA and RbmC are two of six proteins encoded by the Rugosity and Biofilm Modulators (rbm) gene cluster (13,14). Bap1 (Biofilm-Associated Protein 1), which shares substantial sequence identity with a large fragment of RbmC, is encoded by a single gene downstream of the vps and rbm gene clusters. Synchronized up-regulation of vps I, vps II, and rbm gene clusters, as well as bap1, has been shown to occur during the biofilm production life stage of V. cholerae (15). The high-resolution structure of RbmA uncovered a composition of tandem fibronectin type III domains and provided substantial insight into its contribution to the V. cholerae biofilm matrix (16,17). Less is understood about the structure and molecular mechanisms underlying the scaffolding roles played by Bap1 and RbmC (18).
Insights into the function of the biofilm matrix components have come from knockout mutagenesis studies of the biofilm matrix proteins (RbmA, RbmC, and Bap1) and microscopy utilizing fluorescently-labeled components of the biofilm. Investigation into V. cholerae biofilm formation and architecture often utilizes so-called rugose strains, which exhibit increased biofilm production, wrinkled colony morphologies, and the formation of a floating structure called a pellicle (19). Experiments utilizing deletion mutants of bap1, rbmC, or both genes in the rugose background suggest that Bap1 and RbmC have similar, additive, and essential roles in biofilm formation (14). Further analysis of the V. cholerae biofilm architecture using microscopy techniques provides additional insight implicating RbmA in V. cholerae cell-cell adhesion, Bap1 in surface attachment, and both Bap1 and RbmC in formation of dynamic envelopes that encase clusters of cells in the mature biofilm (9). Deletion of either bap1 or rbmC results in similar biofilm development profiles supporting their partially redundant function, whereas the double deletion results in cell clusters that are unable to attach to surfaces (10,21). In the context of in vitro growth assays on glass coverslips, the two proteins appear to be functionally redundant, as deletion of either single protein results in biofilms with apparently normal development and surface attachment (20). However, Bap1 and RbmC are not totally redundant as RbmC does not localize to the biofilm surface upon bap1 deletion (9).
Although the complete three-dimensional structures of RbmC and Bap1 are unknown, previous studies utilizing sequence-based prediction methods propose multidomain architectures for Bap1 and RbmC (14). The prediction for Bap1 includes four Vibrio, Colwellia, Bradyrhizobium, and Shewanella (VCBS) domains and four FG-GAP domains (both of which fall under the umbrella of the β-propeller structural motif), a calcium-binding EF-hand domain, and a β-prism-like lectin domain. Predictions for the structural organization of RbmC are similar to that of Bap1, but with one additional C-terminal β-prism domain and two N-terminal repeats resembling the C-terminal βγ-crystallin domain of StcE, a mucinase expressed by enterohemorrhagic Escherichia coli O157:H7 (EHEC O157:H7) (14,21,22). Whereas the core-predicted β-propeller domains of Bap1 and RbmC share ~60% sequence identity, the β-prism domains are more dissimilar with ~36% sequence identity (excluding gaps). One additional β-prism is also found in the V. cholerae genome attached to the pore-forming toxin V. cholerae cytolysin (VCC) with ~33% sequence identity to the Bap1 β-prism domain. The three β-prism domains from RbmC and VCC all exhibit low-nanomolar affinity for the central core of complex N-glycans found abundantly on eukaryotic cell surfaces (23). Sequence alignments indicate that Bap1 has a 57-amino acid insertion within the β-prism domain, not present in the RbmC or VCC β-prism domains (23).
Here, we describe the crystal structure of Bap1Δ57 (missing the 57-amino acid β-prism domain insertion) at 1.9 Å resolution. The structure of Bap1Δ57 reveals a two-domain arrangement consisting of an eight-bladed β-propeller with a β-prism domain inserted within blade 6 via a flexible linker. In addition, we show evidence of Bap1Δ57 binding to anionic polysaccharides in a manner that differs significantly from the lectin-binding activity displayed by the β-prism domains of RbmC (23). Our studies support a model of V. cholerae biofilm architecture that allows for varied roles of biofilm scaffolding and surface attachment in aquatic versus host environments. Our results provide a starting point for understanding how Bap1 and RbmC may participate in structural and adhesive roles in building the V. cholerae biofilm matrix.
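The identity values quoted above are stated as excluding gaps, that is, computed only over positions where both aligned sequences carry a residue. As a minimal sketch of that convention, assuming two sequences that have already been aligned and padded with `-`, the calculation could look as follows. The function name and the toy sequences are hypothetical and are not taken from the Bap1, RbmC, or VCC alignments.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns, skipping any column with a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns where both sequences have a residue
    matches = 0   # compared columns where the residues are identical
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # gap columns are excluded from the denominator
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy alignment (hypothetical): 5 gap-free columns, 4 of them identical.
print(percent_identity("ACD-EF", "ACDGEY"))
```

Whether gap columns count toward the denominator changes the reported value, so the convention is worth stating alongside any identity figure.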

1.9 Å crystal structure of Bap1Δ57 reveals a two-domain architecture
Initial attempts to express full-length Bap1 (predicted molecular mass of 72.6 kDa) or the isolated Bap1 β-prism domain in E. coli resulted in insoluble material. However, we were successful in expressing a Bap1 construct in which the β-prism domain was genetically excised (Bap1 β-propeller), suggesting that this domain contributes to the insolubility of the full-length Bap1 protein. Given that isolated RbmC and VCC β-prism domains express in a soluble form (23) and that the distinguishing feature of the Bap1 β-prism domain is the 57-amino acid insertion, we made Bap1Δ57 and β-prismΔ57 domain constructs with this insertion removed. To aid in expression and screening, we additionally fused a protease-cleavable GFPUV domain to the N terminus of the Bap1Δ57 and β-prismΔ57 constructs. This two-pronged approach drastically improved soluble expression, yielding material that exhibited monodisperse behavior by size-exclusion chromatography. Although the Bap1 β-propeller construct yielded crystals, these exhibited fiberlike properties and did not suitably diffract X-rays. The Bap1Δ57 construct also yielded three-dimensional crystals that diffracted to better than 2 Å resolution. X-ray data analysis indicated that crystals belonged to space group P4₁2₁2 (Table 1).
The structure of Bap1Δ57 was solved by single isomorphous replacement with anomalous scattering (SIRAS) utilizing a crystal derivative soaked in K2PtCl4. Two platinum sites contributed to phasing, yielding maps with a clear molecular outline, and subsequent density modification led to interpretable electron density maps. Refinement against a 1.9 Å native data set resulted in a final Rwork of 15.8% and Rfree of 17.3%. The overall structure reveals a β-propeller structural core tethered to an accessory β-prism domain by linkers at both the N- and C-terminal ends of the β-prism (Fig. 1A). These extended linkers had generally higher B-factors than the two structural domains and did not participate in crystal-packing interactions. This arrangement suggests that the linkers are likely flexible in solution and that the relative orientation between the two domains results from the crystal lattice, although this cannot be confirmed as only one copy of the molecule was present per asymmetric unit.

Bap1 β-propeller domain adopts a canonical β-propeller fold with distinct features
Analysis of the topology of the Bap1 fold reveals an eight-bladed β-propeller core with a β-prism accessory domain interrupting the β-propeller motif at a loop within blade 6 (Fig. 1B). The fact that the Bap1 β-propeller domain can be expressed as a soluble protein (by deleting the β-prism domain from the loop in blade 6) suggests that the β-prism domain is not necessary for folding or stability of the β-propeller structure. Similarly, removal of the 57-amino acid insertion within the β-prism does not appear to affect folding of this domain, which exhibits a fold similar to β-prism domains from RbmC and VCC, with RMSDs of 1.0 and 1.1 Å, respectively. We observed no evidence for higher-order oligomeric structures in solution or in the crystal packing lattice, suggesting that Bap1Δ57 exists as a monomer.
Each of the eight propeller blades consists of a four-stranded antiparallel β-sheet, with β-strands radiating out from their N-terminal strands (β-strand A, βA) at the center of the toroid to their C-terminal strands (β-strand D, βD) forming the outer edge of the propeller (shown in blade 1 of the β-propeller topology diagram in Fig. 1B). As is commonly found in β-propeller proteins, the Bap1 β-propeller domain contains a Velcro closure with the most N-terminal residues (Asp-31-Ser-41) acting as βD of blade 8, zipping the disc together via β-sheet hydrogen bonding (Fig. 1B). A relatively unstructured C-terminal loop extends from β-strand C (βC) of blade 8 to the center of the cation-binding face (described below) and inserts into the core of the β-propeller, acting as a cone-shaped plug in what exists as a central cavity in other β-propellers.
A phased anomalous difference density map calculated from X-ray data collected at a wavelength of 1.25 Å showed two strong peaks (>5.0 σ) in blade 1 of the β-propeller, which were interpreted as Ca2+ ions (Figs. 1A and 2A). The presence of calcium blade motifs ((D/N)X(D/N)GDGXX(D/E)) (24) between βA and βB in blades 2-5 and 7 of the Bap1 sequence, as well as the orientation of these residues in the structural model, suggested the presence of additional ions; however, no peaks were observed for these locations in the anomalous difference maps (Fig. 2B). Analysis of the Bap1Δ57 model using the CheckMyMetal server (https://csgid.org/metal_sites), which analyzes the coordination geometry of ions in refined structures (25), suggested the presence of Na+ coordinated by five of the calcium blade motifs, and thus five Na+ ions were modeled into the βA-βB loop of blades 2-5 and 7 (one Na+ per blade; Fig. 1, A and B). At this wavelength (1.25 Å), we would expect an observable anomalous signal from the Ca2+ K-edge (f″ = 0.89 e) but a much weaker signal for Na+ (f″ = 0.08 e), supporting our interpretation. Based on the similarity to other calcium blade motifs, as well as the observation that exposure of Bap1Δ57-GFPUV to EDTA causes precipitation of the protein, it is possible that the ions present in the Bap1Δ57 model play a role in structural stability, rather than a functional or enzymatic role (24).
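The calcium blade consensus quoted above, (D/N)X(D/N)GDGXX(D/E), maps directly onto a regular expression, which gives a quick way to screen a sequence for candidate cation-binding loops. The sketch below is an illustration only, assuming a one-letter protein sequence held in a Python string. The function name and the toy sequence are hypothetical, and the toy sequence is not the Bap1 sequence. A lookahead is used so that overlapping candidates would also be reported.

```python
import re

# (D/N) X (D/N) G D G X X (D/E), where X is any residue.
# Wrapped in a lookahead so that overlapping matches are not skipped.
CALCIUM_BLADE = re.compile(r"(?=([DN][A-Z][DN]GDG[A-Z]{2}[DE]))")


def find_calcium_blade_motifs(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched 9-mer) for every motif hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in CALCIUM_BLADE.finditer(seq)]


# Hypothetical toy sequence containing two motif instances.
toy = "MSTDKDGDGKADELLNANGDGTSEQQ"
for position, motif in find_calcium_blade_motifs(toy):
    print(position, motif)
```

A sequence match only flags a candidate site. As the text notes, assigning an ion still depends on the coordination geometry and on the anomalous difference density.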
Although our structure contains both Ca2+ (Fig. 2C) and Na+ (Fig. 2D) ions, it is possible that all sites are typically occupied by calcium (24) and that physiological cations were replaced by Na+ during the protein purification process. Blade 6 (which contains the β-prism insertion) and blade 8 do not appear to bind ions. Comparing the sequence of the βA-βB loop in blade 8 with the sequences of the βA-βB loops of the five Na+-coordinating blades reveals a sequence divergence from the calcium-blade motif that could plausibly abolish cation binding (Fig. 2B). All ions in the Bap1Δ57 model are present on the same face of the β-propeller, which we therefore refer to as the cation-binding face, situated on the side of the propeller where the β-prism domain is attached (Fig. 1A). The face opposite the cation-binding face of the β-propeller contains a central pocket surrounded by several aromatic residues, with ~800 Å² of exposed solvent-accessible surface area (Fig. 3, A and B, area including yellow and marine surfaces). A surface electrostatic potential map does not indicate any overwhelming charge density near this pocket, whereas the cation-binding face is acidic in nature (Fig. 3C).
A DALI search of the Bap1Δ57 structure identified β-propeller homologs of Bap1, with the lowest RMSD at 2.4 Å (26) (top 10 are shown in Fig. S1A). Proteins identified by the DALI algorithm can be organized into three major functional categories, including scaffolding proteins, enzymes, and chaperones. In addition, two homologs identified by the DALI search are lectins. The protein homologs identified by the DALI search appear to represent mainly structural similarity, rather than functional homology, due to a lack of conservation in residues critical to activity. For example, YesW and YesX are bacterial lyases identified by the DALI search with RMSDs of 3.7 and 3.8 Å, respectively (Fig. S1B). These enzymes are the only bacterial proteins identified in our search that share the calcium blade motif with Bap1 and also contain eight blades. However, it is not likely that these bacterial lyases are functional homologs of Bap1 because the lyase activity of YesW and YesX relies on a deep pocket and a calcium ion located in the central channel of the β-propeller (27), neither of which is present in the Bap1Δ57 crystal structure. In addition, several seven-bladed integrin family proteins were identified via the DALI search (RMSD 4.0 Å or higher), likely due to their coordination of calcium ions via calcium blade motifs (24).
β-Propeller domains are common in prokaryotes with diverse functions, including enzymatic activities. Several families of glycoside hydrolases (GH) exist with β-propeller folds, although mostly in five-bladed (GH families 43 and 62) and six-bladed (GH families 33, 34, 83, and 93) forms. These families typically contain active sites near the central axis of the β-propeller and utilize Tyr/Glu or Asp/Glu pairs/triads to catalyze hydrolysis (28, 29) (although some exceptions exist, including a seven-bladed rhamnosidase that utilizes a single histidine in the active site (30)). Inspection of the central cavity of Bap1Δ57 did not reveal Tyr/Glu or Asp/Glu pairs consistent with hydrolase or sialidase activities, and a structure-based sequence alignment (Fig. 3D) of the eight Bap1Δ57 blades did not indicate the presence of Asp-box motifs found in some glycoside hydrolase β-propeller families (31). Although the absence of these motifs does not preclude Bap1 exhibiting glycoside hydrolase activity, it suggests that if this were the case a noncanonical mechanism might be at play or that the active site may be located somewhere outside of the central cavity.
Figure 1. A, the protein structure is shown in cartoon representation with the β-propeller colored in marine and the β-prism colored in yellow. Two modeled Ca2+ ions and five Na+ ions are shown as green and blue spheres, respectively. The C-terminal region that forms a plug within the central β-propeller cavity is colored red. A key aspartic acid residue (Asp-348) that forms essential contacts with bound carbohydrates in homologous β-prism lectin domains from Vibrio cholerae (in RbmC and VCC) is shown in stick representation. B, schematic diagram illustrating Bap1Δ57 topology. The so-called "Velcro closure" of the β-propeller domain is represented by a zigzag line, and the location of the 57-amino acid insert that has been removed genetically is represented by an arrow in A and a dashed magenta line in B. All structural representations generated using the PyMOL Molecular Graphics System, Version 2.2.0 Schrödinger, LLC.
β-Propeller lectins, also called PropLecs, are another common utilization of the β-propeller fold that typically contain carbohydrate-binding sites at the interfaces between blades. An algorithm for detecting PropLec proteins was recently described, resulting in a database of predicted sequences based on conserved families of PropLec domains (32). Although the method did predict a number of Vibrio PropLec proteins, Bap1 was not identified by the screen. The Bap1 structural alignment (Fig. 3D) indicates that the sequence identity between blades is quite low (5-35%), and we do not observe any conserved sequences (aside from the calcium/sodium-binding sites) or pockets between the interfaces of multiple blades. Although Bap1 may not fit into the profile for a typical PropLec β-propeller protein, this does not preclude that additional carbohydrate-binding sites may be present.
Figure 3. [...] (58) as implemented in PyMOL. The putative carbohydrate-binding pocket of the β-prism domain is basic as compared with the rest of the protein and is highlighted by an arrow. D, structure-based sequence alignment of the Bap1Δ57 blades generated using Swiss-PdbViewer (59). The gray-shaded area denotes the calcium/sodium-binding motifs. The secondary structure is shown above with the location of the β-prism domain noted in blade 6. Sequence alignment figure generated by ESPript 3 (60) using the % Multalin coloring scheme and a 0.5 similarity score.
The Bap1Δ57 β-prism domain falls into the jacalin-related lectin (JRL) protein family, which consists of a pseudo-3-fold arrangement of Greek key motifs. Most JRLs bind carbohydrates in only one sugar site, located at the top of Greek key I, but three previously characterized β-prism I domains expressed by V. cholerae (two from RbmC and one from VCC) bind their carbohydrate ligands at a single site at the top of Greek key II (23,33).
The β-prism domain of Bap1Δ57 is located sequentially between βA and βB (β22 and β35* in context of the full structure) of blade 6 within the β-propeller domain. The Bap1Δ57 β-prism domain adopts a canonical β-prism I architecture, made up of 12 β-strands arranged into three antiparallel β-sheets with Greek key folds. The surface of the Bap1Δ57 β-prism domain opposite its attachment to the β-propeller makes up the region that constitutes a carbohydrate-binding site in the other three β-prisms expressed by V. cholerae: RbmC β-prism1, RbmC β-prism2, and VCC β-prism (23,33). In the Bap1Δ57 structure, this end of the β-prism features a positively charged, lysine-rich surface with a central cavity not found in the other V. cholerae β-prism I lectin domains (Fig. 4A). A bound citrate molecule is situated on one side of this lysine-rich groove (Fig. 4A). Because citrate was required to obtain the optimal crystal form and because this molecule is located in an area with extensive crystal contacts, it is possible that its interaction with the Bap1Δ57 β-prism is an artifact of crystallization. However, it is curious that the citrate molecule is found occupying the same site as the primary mannose in the RbmC β-prism1, RbmC β-prism2, and VCC β-prism structures (Fig. 4, B and C) (23,33). The citrate molecule is coordinated by hydrogen bonding through backbone amines of Gly-344, Ala-345, and Val-346, and the terminal amine of the Lys-501 side chain, and by van der Waals interactions with Asp-348 and His-500 (Fig. 4D). The citrate coordination pattern is of interest because all of these residues, except His-500 (which is not present in the other β-prisms, as described below), are structurally homologous to those identified in carbohydrate binding by the crystal structures of RbmC β-prism2 and VCC β-prism (Fig. S2) (23,33).
The citrate molecule present in the Bap1Δ57 structure binds in the carbohydrate-binding site found in the other V. cholerae β-prism domains. Further comparison of the four V. cholerae β-prism domains presents a conundrum: the Bap1Δ57 β-prism appears to be capable of binding carbohydrates in a manner similar to its homologs, yet experimental evidence suggests it does not. The most significant structural divergence of the Bap1 β-prism from the other V. cholerae β-prism lectins lies in the β11-β12 loop (β33* and β34* in context of Bap1Δ57), which contains a seven-amino acid insertion (Fig. 4, B and E). Whereas sequence alignments suggest that Trp-629/948/706 (RbmC β-prism1/RbmC β-prism2/VCC β-prism) and Leu-630/949/707 are substituted in Bap1 by Lys-495 and Gln-503 (respectively) with the seven-amino acid loop between them (Fig. 4E), structural alignment shows that Lys-501 and Gln-503 of Bap1 are present in the spatial location of the WL motif and should, in theory, be able to accommodate binding to mannose or the N-glycan pentasaccharide core, especially considering that Lys-501 forms a hydrogen bond with the citrate molecule in the Bap1Δ57 structure (Fig. 4D). An additional major difference between the Bap1Δ57 β-prism and the other V. cholerae β-prism domains is with regard to surface electrostatics. The surface of the Bap1Δ57 β-prism near the homologous carbohydrate-binding site has a lysine-rich and positively charged groove (Figs. 3C and 4A). Although this region could reasonably facilitate binding of Bap1 to anionic polysaccharides or surfaces, it cannot explain why Bap1Δ57 does not bind carbohydrates in a fashion similar to its homologs or why it does not colocalize with VPS at the surface of the biofilm (20).
Figure 4. [...] β-prism domain. Aromatic and lysine residues are shown in gray and blue stick representation, respectively. A citrate molecule from the crystallization buffer was found in close proximity to Asp-348 with putative hydrogen bonding interactions highlighted by dotted lines. A short seven-residue loop unique to Bap1 is shown in green, and the location of the 57-amino acid insertion is shown in magenta. An extended linear cavity is outlined in gray space-surface representation (determined using the cavities and pockets feature of PyMOL with a 7-Å cavity detection radius and a three-solvent radius cavity detection cutoff). B, superposition of the Bap1Δ57 (yellow) β-prism domain and RbmC2 (wheat) β-prism domain with bound mannotriose (green stick representation). Key residues involved in RbmC2/mannotriose interactions are shown as dotted lines (23). Short insertion loops unique to RbmC2 (red) and Bap1 (green) occupy similar locations. Key Bap1 residues are indicated by arrows with RbmC2 residue numbers in parentheses. C, a citrate molecule is located in a region with several crystal contacts (β-propeller domain is colored marine; β-prism domain is colored yellow; and residues involved in citrate coordination are colored dark green). D, schematic shows polar and nonpolar interactions between citrate molecule and the Bap1 β-prism domain. At the crystallization pH (5.5), the citrate molecule is expected to be at least partially protonated, allowing Asp-348 to make additional hydrogen-bonding interactions with a citrate carboxylate group. Putative hydrogen-bonding interactions are shown as green dotted lines and van der Waals interactions as red arcs. Data were generated using LigPlot+ version 2.1 (61). E, sequence alignment shows a key region of the Bap1, RbmC β-prism1, RbmC β-prism2, and VCC β-prism domains. The Bap1 7-amino acid insertion is highlighted in green. Sequence alignment was generated using the MUSCLE algorithm implemented in MEGA version 7.0 (62) and ESPript 3 (60).

57-Amino acid insertion in the Bap1 β-prism domain
Bap1 has been shown to contribute to the hydrophobicity of V. cholerae biofilms, specifically within pellicles formed at the air-water interface (34). In the absence of Bap1, V. cholerae biofilm pellicles have decreased elasticity and are more hydrophilic than WT rugose biofilms (34). Whereas Bap1 and RbmC have been shown to play somewhat redundant roles in the V. cholerae biofilm matrix (9,20,35), the hydrophobic nature of WT V. cholerae pellicles represents an instance where RbmC cannot rescue the absence of Bap1. The failure of RbmC to rescue the hydrophobic nature of the pellicle formed by a bap1 deletion mutant suggests that some component of Bap1 that is absent in RbmC contributes significantly to the hydrophobicity of the V. cholerae biofilm pellicle. The 57-amino acid insertion is the lone region of Bap1 absent in RbmC and exhibits traits that could aid in maintaining the hydrophobicity of the pellicle (Fig. S3A). Secondary structure prediction using the JPred server suggests the 57-amino acid insertion adopts mainly β-sheet structure, and multiple amyloid prediction algorithms suggest that it may also exhibit amyloidogenic propensities (Fig. 5A) (36-42). The 57-amino acid insertion occurs at the N-terminal end of the 10th β-strand in the β-prism domain (β32* in context of the full structure, Fig. 1B). Residues of this β-strand as well as the loop connecting the 9th and 10th β-strands are not involved in carbohydrate binding in any known β-prism I homologs. Therefore, the removal of this insertion from Bap1 is not expected to alter any potential lectin activity of the Bap1Δ57 β-prism domain. This is not to say, however, that removal of the insertion does not impede other functional roles played by Bap1 in the V. cholerae biofilm matrix.
Bap1, including the 57-aa insertion, can be expressed in an E. coli system as a GFPUV fusion; however, all the recombinant protein product is found in the insoluble cell lysate pellet. Conversely, removal of the insertion in the Bap1 β-prism domain in Bap1Δ57 alleviates this solubility issue and yields abundant protein in the cell lysate supernatant fraction. In an attempt to produce the 57-aa insertion peptide for further analysis, we created a cleavable GFPUV fusion (Insertion57-GFPUV) and expressed this construct in E. coli. As we observed with the full-length Bap1 construct, this new construct was completely insoluble, with only truncated GFPUV material missing the fusion insertion present in the cell supernatant (Fig. 5B). The insoluble inclusion body pellet could be solubilized in 8 M urea and purified by nickel-nitrilotriacetic acid chromatography, but attempts at refolding the fusion on the column (by slowly reducing the urea concentration), by dilution, or by dialysis failed to produce soluble material. This suggests two possibilities: the insertion renders the full-length protein (whether Bap1 or Insertion57-GFPUV) insoluble, or an interaction between the insertion and some component of Bap1 or GFPUV results in cross-linked aggregation or misfolding of the protein. Of course, Bap1 is secreted functionally from V. cholerae where the protein is presumably directed through a secretory pathway, which might also involve chaperones to assist with folding or solubility. Evaluation using Kyte-Doolittle hydropathy values indicates that 63.7% of the residues in the insertion display a hydrophobic nature (Fig. S3B) (43), which may support the former hypothesis. However, the possibility of this insertion having amyloidogenic propensities could support the latter hypothesis (Fig. 5A). It is plausible that the insertion acts to help attach the V. cholerae biofilm matrix to hydrophobic surfaces (19), such as the lipid membrane of a cell, or that amyloid formation within the insertion might aid in biofilm formation as seen in biofilms formed by other bacteria (44). A strongly hydrophobic or amyloidogenic subdomain present in Bap1 could explain its localization to the attachment surface observed in fluorescence microscopy studies on V. cholerae biofilm architecture (9,20,35).
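The Kyte-Doolittle evaluation mentioned above can be illustrated with a short calculation. The sketch below uses the published Kyte-Doolittle scale and shows two common readings of such an analysis: the fraction of residues with a positive hydropathy value, and a sliding-window mean. It rests on stated assumptions: the text does not specify the windowing or threshold behind the 63.7% figure, so this is not a reproduction of that result. The function names and toy peptides are hypothetical and are not the Bap1 insertion sequence.

```python
# Kyte-Doolittle hydropathy scale (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def fraction_hydrophobic(sequence: str) -> float:
    """Fraction of residues whose Kyte-Doolittle value is above zero."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return sum(1 for res in seq if KD[res] > 0) / len(seq)


def sliding_hydropathy(sequence: str, window: int) -> list[float]:
    """Mean Kyte-Doolittle value for each full window along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    values = [KD[res] for res in seq]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical toy peptides, for illustration only.
print(fraction_hydrophobic("AILKDE"))
print(sliding_hydropathy("AILK", 3))
```

A per-residue fraction ignores sequence context, whereas the windowed profile highlights contiguous hydrophobic stretches, which is usually the more informative view when asking whether a segment could drive aggregation or surface attachment.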

Functional analysis of Bap1 ⌬57 suggests putative lectin activity
Based on super-resolution microscopy of developing biofilms, Bap1 plays a role in cell-surface adhesion and formation of dynamic cell-encasing envelopes with VPS and RbmC (9). In our investigation of the putative sugar-binding activity of Bap1, we determined that unlike the ␤-prism domains present on RbmC and VCC (23), fluorescently-labeled Bap1 ⌬57 did not appear to bind complex mammalian sugars present on the Mammalian Glycan Screen version 5.3 (Fig. S4). Furthermore, analysis using isothermal titration calorimetry (ITC) did not reveal binding activity of Bap1 ⌬57 constructs to the complex glycosylated protein asialofetuin or monosaccharides (L-guluronic acid, D-galactose, or D-glucose) present in VPS (Fig. S5A).
To test the possibility of Bap1 ⌬57 binding to linear polysaccharides like VPS, we employed a native gel-shift experiment using substrates similar to approaches used with extracellular polysaccharide-interacting proteins from Pseudomonas aeruginosa (45). For polysaccharides, we used alginic acid, bacterial alginate, dextran, 2-hydroxyethylcellulose, and xanthan gum. As controls, we included several proteins with compatible electrophoretic mobilities and varying molecular masses, including BSA, concanavalin A, and GFP UV . Our prior experience demonstrates that the RbmC ␤-prism domain demonstrates some affinity for dextran and dextran-containing purification resins, which contain ␣1,3and ␣1,6-branches like those found in core N-glycans (23).
Native polyacrylamide gel-shift assays reveal association of Bap1Δ57 but not Bap1β-propeller to alginic acid (Fig. 5C). Polysaccharide binding by Bap1β-prism could not be analyzed by native polyacrylamide gel shift as its pI is above that of the gel buffering system. In addition, Bap1Δ57 displayed a gel shift with xanthan gum in the native polyacrylamide gel-shift assay, but not with dextran (Fig. S5B). Taken together, these data suggest that unlike RbmC, Bap1Δ57 does not bind complex mammalian N-glycans, but rather displays an affinity for anionic polysaccharides or polysaccharides with a linear backbone, as both alginic acid and xanthan gum are negatively charged. This is particularly interesting as VPS present in V. cholerae biofilms is an acetylated linear polysaccharide with a glycine adduct (with a free carboxyl group) linked to guluronic acid (11,12,46).
To test whether a linear polysaccharide could occupy the Bap1Δ57 β-prism cavity, models for alginic acid and VPS were prepared using CarbBuilder version 2.1.30 (47) and manually docked into the cavity density calculated by PyMOL. Although the exact register and orientation of the ligands may vary, the width of the polysaccharides is comparable with the size of the cavity (Fig. 5D). In RbmC, we observe flexibility in this region in the presence and absence of ligands (23), and in Bap1 this region is near a crystal contact that may influence its unliganded conformation. These considerations might provide additional flexibility in the size of the cavity, particularly when sidechain rearrangements are allowed. An attractive feature of this model is that acetyl groups on polysaccharides would be in close proximity to lysine groups surrounding the β-prism cavity, potentially forming polar interactions. The VPS glycine carboxyl on the guluronic acid moiety could additionally form a salt bridge with nearby lysine residues, further strengthening this interaction.

RbmC model based on Bap1Δ57 structure
To gain better insight into the comparative structural features of RbmC and Bap1, we constructed a model for RbmC based on available structural and sequence-based information (the comparative domain organization is shown in Fig. 6A). The β-propeller domains of Bap1 and RbmC are 67.5% identical in sequence, suggesting a high degree of structural similarity. For our model, we attached the previously determined RbmC β-prism 2 lectin domain (23) to the C terminus and tandem βγ-crystallin domains from E. coli StcE (22) to the N terminus of our Bap1Δ57 model (Fig. 6B). The linker lengths between domains were modeled using the database sequence for RbmC and are assumed to be flexible, similar to the linkers connecting the β-prism and β-propeller domains in Bap1Δ57. Based on the linker distance constraints, it is likely that both StcE domains and both β-prism domains protrude from the same cation-binding face of the β-propeller domain in RbmC. Furthermore, the carbohydrate-binding pockets of RbmC β-prism 1 and RbmC β-prism 2 are likely oriented in the same outward position on the bottom face of the RbmC molecule (Fig. 6B). Because both β-prism domains in RbmC bind similar N-glycan core structures (23) commonly found on cell-surface proteins with low nanomolar affinity, the possibility exists for strong avidity effects from polyvalent binding to cell-surface ligands. The integration of two N-glycan-binding sites, two putative StcE domain-binding targets (22), and a possible β-propeller ligand suggests that RbmC could serve to bridge multiple biofilm matrix components to target cell surfaces. Bap1, however, might form an overlapping but more limited scaffold for a different subset of matrix and surface ligands.
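For readers who wish to reproduce a sequence-identity figure such as the one quoted for the β-propeller domains, a minimal sketch is given below. It assumes the two sequences have already been aligned by an external program and are supplied as equal-length strings with "-" marking gaps. The function name and the choice of denominator (ungapped aligned columns) are assumptions; the text does not state which alignment method or denominator was used, and other conventions yield slightly different percentages.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over ungapped columns of a pairwise alignment.

    Assumption: columns where either sequence has a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment, not taken from Bap1 or RbmC.
print(percent_identity("ACDE-F", "ACDQGF"))
```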

Discussion
We report the 1.9 Å crystal structure of the V. cholerae biofilm matrix protein Bap1Δ57 and demonstrate that the Bap1Δ57 β-prism domain displays a sugar-binding profile that differs from other V. cholerae β-prism domains (RbmC and VCC), with putative specificity for anionic polysaccharides and/or polysaccharides with a linear backbone. Additionally, we suggest that a 57-amino acid insertion that was deleted for E. coli expression might modulate the solubility of Bap1 leading to the surface deposition (9) and hydrophobic properties of biofilms observed in functional studies (34). This might help to explain the observation that surface-attached Bap1 appears to grow radially outward from founder cells (9) as presumably newly secreted Bap1 molecules deposit onto the surface or grow outward through interactions mediated by the 57-aa insertion. Consistent with published microscopy studies, the β-prism domains of Bap1 and RbmC would be primarily involved in adhesive interactions between the biofilm matrix and environmental surfaces, whereas the β-propeller domains might form interactions with protein or other substrates within the biofilm matrix, including other matrix proteins and VPS. The Bap1 β-prism domain may also interact with VPS, or other charged polysaccharides found in the marine environment (like alginates). Whereas RbmC and Bap1 are both sufficient on their own to support biofilm attachment in vitro (20), specific and potentially strong adhesive interactions mediated by β-prism domains might play a more important role on diverse surfaces encountered within the V. cholerae lifecycle.
In the aquatic niche, the ability of Bap1 to bind anionic polysaccharides or abiotic surfaces (via the β-prism insertion) would provide a survival advantage by promoting attachment to a multitude of substrates, including extracellular polysaccharides found on phyto- and zooplankton or macroflora such as macroalgae. In addition, the increased elasticity provided to the biofilm matrix by Bap1 may confer increased tensile strength that aids in survival in environments where dynamic movement (such as ocean currents) is abundant (34). Furthermore, the putative hydrophobic and/or amyloidogenic nature of the 57-amino acid insertion in the Bap1 β-prism may play a role in attachment to hydrophobic or other abiotic surfaces. The putative lectin activity of Bap1Δ57 presented here and the contribution of Bap1 to the hydrophobic nature of V. cholerae biofilms (34) support our hypothesis of a dual role played by Bap1 in attaching V. cholerae biofilms to both biotic and abiotic surfaces in the aquatic niche.
In the host gut environment, we propose that RbmC plays a dominant role by binding complex N-glycans (23). Thus, in effect, ingested biofilm fragments decorated with RbmC could be "captured" by ligands on epithelial cell surfaces. This role may be aided by the N-terminal βγ-crystallin repeat domain, as the homologous domain in EHEC O157:H7 mucinase, StcE, has been shown to play a role in binding to cell surfaces (22). In support of this hypothesis, a study performed by Liu et al. (48) provides experimental evidence of interaction between mucin and V. cholerae biofilms.
Additional investigation of the unique specificity of lectin activity displayed by Bap1 and RbmC is needed. Because of the lack of hits in the mammalian glycan screen and apparent preference for linear and/or anionic polysaccharides, it is not likely that Bap1 plays a role in penetration of the mucosal layer of the mammalian gut. However, it is crucial to further investigate the polysaccharide specificity of Bap1 in the context of its interactions with other matrix components. These studies in conjunction with structures of Bap1 complexed with carbohydrate ligands will provide a framework for understanding the network of complex molecular interactions that underlie biofilm assembly and adhesion in V. cholerae.
GFPUV fusion constructs were transformed into T7Express E. coli (New England Biolabs) or BL21-CodonPlus (DE3)-RIL (Agilent) E. coli (for Insertion57-GFPUV), and overnight cultures were diluted 20-fold into fresh LB-Miller media supplemented with 50 µg/ml carbenicillin. Expression cultures were subsequently grown with shaking (210 rpm) at 37°C until reaching an A600 of 0.5 to 0.7, at which point the cells were induced with 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG). Following induction with IPTG, Bap1Δ57-GFPUV and Bap1β-propeller-GFPUV were grown for an additional 18 h at 18°C, while Bap1β-prism-GFPUV was grown for an additional 4 h at 30°C. Following expression, cells were pelleted by centrifugation in a Sorvall LYNX 6000 centrifuge using an F9-6×1000 rotor (Thermo Fisher Scientific) at 3,900 × g for 12 min at 4°C. Cell pellets were resuspended in 1× TBS (20 mM Tris, pH 7.5, 150 mM NaCl) to a final volume of ~10 ml per pellet from 1 liter of expression growth. Samples were stored at −80°C until the time of purification. Thawed cell cultures were supplemented with protease inhibitor mixture (Roche Applied Science) and 10 mM imidazole. Cell lysis was performed via passage three times through an EmulsiFlex-C5 high pressure homogenizer (Avestin, Inc.) at ~18,000 p.s.i. The cell lysate was cleared via centrifugation in a Sorvall LYNX 6000, at 40,000 × g for 25 min using an F20-12×50 rotor (Thermo Fisher Scientific).
Recombinant Bap1Δ57-GFPUV was purified from the cleared supernatant by a combination of nickel-affinity (5-ml HisTrap HF, GE Healthcare) and desalting (50-ml Bio-Scale Mini Bio-Gel, Bio-Rad) chromatography columns on a Profinia Protein Purification System (Bio-Rad). After sample loading, the nickel column was washed with 2 column volumes (CV) of 1× TBS and 5 CV of 1× TBS plus 40 mM imidazole. Bap1Δ57-GFPUV was eluted from the nickel column using 1× TBS supplemented with 250 mM imidazole and desalted into 30 ml of ion-exchange Buffer A (IEX Buffer A: 20 mM Tris, pH 7.5, 50 mM NaCl). GFPUV was cleaved from the fusion protein by incubating with a 1:1000 w/w ratio of human α-thrombin (Hematologic Technologies, Inc.) at room temperature for 1.5 h. The cleavage reaction was stopped using 1 mM 4-(2-aminoethyl)benzenesulfonyl fluoride hydrochloride. Cleaved Bap1Δ57 was separated from its His-tagged GFPUV fusion partner via anionic exchange on a 5-ml Q-column HF (GE Healthcare) equilibrated in IEX Buffer A. GFPUV was eluted from the Q-column in 20 mM Tris, pH 7.5, 200 mM NaCl, and Bap1Δ57 eluted in 60 ml of 20 mM Tris, pH 7.5, 250 mM NaCl. Eluted Bap1Δ57 was concentrated in a 30-kDa cutoff Vivaspin20 centrifugal filter (GE Healthcare) and buffer exchanged by passage over a Superdex 200 Increase 10/300 GL (s200i, GE Healthcare) gel-filtration column in 1× TBS.
Bap1β-propeller-GFPUV was purified from cleared cell lysate in a similar fashion as Bap1Δ57-GFPUV, desalted into 1× TBS, and cleaved using a 1:1000 w/w ratio of trypsin (Sigma) at room temperature for 1.5 h. Cleaved Bap1β-propeller was separated from the His-tagged GFPUV by tandem nickel-affinity and gel-filtration chromatography, with a 1-ml HisTrap HF column (GE Healthcare) attached to the top of the s200i gel-filtration column, and eluted in a buffer appropriate for downstream applications. Bap1β-prism-GFPUV was purified from the cleared cell lysate by nickel-affinity chromatography using Toyopearl 650 M AF-chelate resin (Tosoh Biosciences) charged with nickel sulfate, using the procedure described above for the HisTrap HF column. Bap1β-prism-GFPUV eluted from the Toyopearl chelate resin was cleaved from the fusion protein by a 1:500 w/w trypsin digest carried out at room temperature for 1.5 h. Bap1β-prism was separated from GFPUV by passage over an s200i gel-filtration column. Purity of all constructs was determined by SDS-PAGE.
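The protease digests above are specified as w/w ratios (1:1000 or 1:500 protease to fusion protein). The arithmetic can be sketched as follows; the function name and the 10 mg example quantity are hypothetical and serve only to show the unit conversion.

```python
def protease_mass_ug(protein_mg, ratio):
    """Mass of protease in µg for a 1:ratio (w/w) protease:protein digest."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    # Convert mg of substrate to µg, then divide by the w/w ratio.
    return protein_mg * 1000.0 / ratio


# Hypothetical example: 10 mg of fusion protein.
print(protease_mass_ug(10, 1000))  # 1:1000 w/w digest
print(protease_mass_ug(10, 500))   # 1:500 w/w digest
```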

Crystallization of Bap1 constructs
Bap1Δ57 was purified and separated from its GFPUV fusion partner as described above and concentrated to 7.5 mg/ml for crystallization studies. Crystallization screening and optimization were performed using commercial sparse matrix and grid screening by the hanging-drop method in 24-well VDX plates (Hampton Research). Native crystals were grown by pipetting a 1:1 v/v ratio of protein sample and precipitant solution (0.1 M sodium citrate and 10% PEG 3,350) onto siliconized coverslips and suspending them over 0.5 ml of reservoir solution. Crystals were harvested after approximately 2 weeks. Native crystals were cryoprotected in mother liquor containing 20% glycerol, and flash-cooled in liquid nitrogen. Heavy atom derivatization was achieved by soaking native crystals in a drop of mother liquor with 10 mM K2Cl4Pt for 3 min, followed by back soaking in mother liquor plus 20% glycerol, without the heavy atom, for ~30 s, and flash-cooling in liquid nitrogen.
Bap1 β-propeller domain crystals were grown via hanging-drop vapor diffusion with a 1:1 v/v ratio of 8.75 mg/ml protein and precipitant solution (0.2 M BisTris, pH 7.0, 15% PEG 8,000, and 20% glycerol) at either 17 or 25°C. Crystals were harvested and flash-cooled in liquid nitrogen, with no additional cryoprotectant as the precipitant solution contained 20% glycerol.

X-ray structure determination of Bap1Δ57
X-ray data were collected at the NSLS-II 17-ID-1 (AMX) beamline at Brookhaven National Laboratory equipped with a Dectris Eiger X 9M pixel-array detector. The structure of Bap1Δ57 was solved via single isomorphous replacement with anomalous signal (SIRAS) with native crystals soaked in K2Cl4Pt. Data were indexed using XDS (50) and scaled with Aimless (51). The structure was phased by SIRAS using native and K2Cl4Pt derivative data using the AutoSol module of Phenix (52). Density modification (~60% solvent) and automatic model building by Phenix using data to 1.9 Å led to the placement of 538 residues (of 608 expected residues in the construct) with a map-model correlation coefficient of 0.84 and Rwork and Rfree values of 0.25 and 0.27, respectively. Automatic model rebuilding was carried out using ARP/wARP version 7.6 (53), resulting in a new model with 600 residues and Rwork and Rfree values of 0.193 and 0.239, respectively. The final model was refined using phenix.refine (54) utilizing automatic water-picking and target weight optimization algorithms as implemented in Phenix (54). Refinement progress was monitored by tracking the Rwork/Rfree ratio (with Rfree representing 5% of total reflections). Iterative model rebuilding (into 2Fo-Fc and Fo-Fc maps) and comprehensive validation were carried out with Coot (55) and the Phenix implementation of MolProbity (56) to final Rwork and Rfree values of 0.158 and 0.173, respectively.
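For reference, the residuals quoted above follow the conventional crystallographic R-factor. The expression below is the standard textbook definition, given here as a reminder rather than as a statement of how any particular program scales the data; any scale factor between observed and calculated amplitudes is assumed to be absorbed into the calculated term.

```latex
R = \frac{\sum_{hkl} \bigl|\, |F_{\mathrm{obs}}(hkl)| - |F_{\mathrm{calc}}(hkl)| \,\bigr|}
         {\sum_{hkl} |F_{\mathrm{obs}}(hkl)|}
```

Rwork is this sum taken over the reflections used in refinement, and Rfree is the same sum taken over the test set excluded from refinement (5% of reflections here).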

Mammalian glycan array
Bap1Δ57-GFPUV was purified as described above and eluted from the s200i column in 0.1 M sodium bicarbonate, pH 8.3. Purified protein was concentrated in a 30-kDa cutoff Vivaspin6 (GE Healthcare) to 2.1 mg/ml and fluorescently labeled using a Molecular Probes AlexaFluor488 labeling kit (Life Technologies, Inc.). The labeled protein sample was sent to the Consortium for Functional Glycomics (www.functionalglycomics.org) for analysis, as described previously (23), at concentrations of 5 and 50 µg/ml.

Polysaccharide interaction by native acrylamide gel electrophoresis
Bap1Δ57 and Bap1β-propeller were purified as described, with final elution from gel filtration in 50 mM imidazole, pH 7.0, 50 mM NaCl. Lyophilized BSA and concanavalin A (Thermo Fisher Scientific) were prepared as controls by dissolving in 50 mM imidazole, pH 7.0, 50 mM NaCl. GFPUV was expressed in T7 SHuffleExpress (New England Biolabs) and purified by Ni-affinity and gel-filtration chromatography using conventional methods, with a final elution in 1× TBS. 6 µg per lane of the appropriate protein sample was used in gel-shift analysis. Gels were prepared with final concentrations of 7.5% bisacrylamide (Bio-Rad), 25 mM Tris, pH 8.3 (Thermo Fisher Scientific), 250 mM glycine (Research Products International), 0.05% ammonium persulfate (Thermo Fisher Scientific), and 0.025% TEMED (Bio-Rad). Running buffer consisted of 25 mM Tris, pH 8.5, and 192 mM glycine. Gel electrophoresis was carried out at 200 V for 120 min at room temperature. After electrophoresis, gels were fixed in 50% methanol, 10% acetic acid for 15 min, stained in 10% acetic acid, 0.02% Coomassie Brilliant Blue G-250 for 1 h or overnight, and then destained in 10% acetic acid.
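The final gel concentrations listed above can be converted into pipetting volumes with the usual dilution relation C1·V1 = C2·V2. The sketch below is illustrative only: the stock concentrations and the 10 ml gel volume are hypothetical and are not stated in the text; only the final concentrations are taken from it.

```python
def stock_volume(final_conc, stock_conc, total_volume):
    """Volume of stock needed (same units as total_volume), from C1*V1 = C2*V2."""
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least the final concentration")
    return final_conc * total_volume / stock_conc


# (final concentration, hypothetical stock concentration), in matching units.
recipe = {
    "bisacrylamide (%)": (7.5, 30.0),
    "Tris, pH 8.3 (mM)": (25.0, 1000.0),
    "glycine (mM)": (250.0, 1000.0),
    "ammonium persulfate (%)": (0.05, 10.0),
}

gel_volume_ml = 10.0  # hypothetical gel volume
for name, (final, stock) in recipe.items():
    print(f"{name}: {stock_volume(final, stock, gel_volume_ml):.3f} ml")
```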