
The streptococcal multidomain fibrillar adhesin CshA has an elongated polymeric architecture

Open Access. Published: March 30, 2020. DOI: https://doi.org/10.1074/jbc.RA119.011719
      The cell surfaces of many bacteria carry filamentous polypeptides termed adhesins that enable binding to both biotic and abiotic surfaces. Surface adherence is facilitated by the exquisite selectivity of the adhesins for their cognate ligands or receptors and is a key step in niche or host colonization and pathogenicity. Streptococcus gordonii is a primary colonizer of the human oral cavity and an opportunistic pathogen, as well as a leading cause of infective endocarditis in humans. The fibrillar adhesin CshA is an important determinant of S. gordonii adherence, forming peritrichous fibrils on its surface that bind host cells and other microorganisms. CshA possesses a distinctive multidomain architecture comprising an N-terminal target-binding region fused to 17 repeat domains (RDs) that are each ∼100 amino acids long. Here, using structural and biophysical methods, we demonstrate that the intact CshA repeat region (CshA_RD1–17, domains 1–17) forms an extended polymeric monomer in solution. We recombinantly produced a subset of CshA RDs and found that they differ in stability and unfolding behavior. The NMR structure of CshA_RD13 revealed a hitherto unreported all β-fold, flanked by disordered interdomain linkers. These findings, in tandem with complementary hydrodynamic studies of CshA_RD1–17, indicate that this polypeptide possesses a highly unusual dynamic transitory structure characterized by alternating regions of order and disorder. This architecture provides flexibility for the adhesive tip of the CshA fibril to maintain bacterial attachment that withstands shear forces within the human host. It may also help mitigate deleterious folding events between neighboring RDs that share significant structural identity without compromising mechanical stability.

      Introduction

      Bacteria occupy almost every ecological niche on Earth (
      • Green J.L.
      • Bohannan B.J.
      • Whitaker R.J.
      Microbial biogeography: from taxonomy to traits.
      ,
      • Martiny J.B.
      • Bohannan B.J.
      • Brown J.H.
      • Colwell R.K.
      • Fuhrman J.A.
      • Green J.L.
      • Horner-Devine M.C.
      • Kane M.
      • Krumins J.A.
      • Kuske C.R.
      • Morin P.J.
      • Naeem S.
      • Ovreås L.
      • Reysenbach A.L.
      • Smith V.H.
      • et al.
      Microbial biogeography: putting microorganisms on the map.
      ). Their capacity to colonize diverse environments is in part enabled by their ability to adhere to the surfaces of materials and other cells. Adherence allows anchorage and persistence within a defined environment, confers significant evolutionary advantage, and promotes bacterial infection in animals and humans (
      • Kline K.A.
      • Fälker S.
      • Dahlberg S.
      • Normark S.
      • Henriques-Normark B.
      Bacterial adhesins in host-microbe interactions.
      ,
      • Pizarro-Cerdá J.
      • Cossart P.
      Bacterial adhesion and entry into host cells.
      ). Identifying and characterizing the cellular machineries employed by bacteria to adhere and colonize is of broad fundamental interest and may inform the development of anti-infective agents, medical devices, or vaccines (
      • Wizemann T.M.
      • Adamou J.E.
      • Langermann S.
      Adhesins as targets for vaccine development.
      ,
      • Klemm P.
      • Vejborg R.M.
      • Hancock V.
      Prevention of bacterial adhesion.
      ).
      Frequently, bacteria utilize proteinaceous surface decorations termed adhesins to facilitate attachment to extracellular target molecules. Different adhesins recognize and bind different (a)biotic targets, and there is considerable diversity in the molecular architectures of these important polypeptides. Larger filamentous adhesins may be grouped into one of two categories based on their distinguishing structural features: pili and fibrils. Pili have been implicated in numerous physiological processes and are found in both Gram-positive and Gram-negative bacteria (
      • Proft T.
      • Baker E.N.
      Pili in Gram-negative and Gram-positive bacteria: structure, assembly and their role in disease.
      ,
      • Allen W.J.
      • Phan G.
      • Waksman G.
      Pilus biogenesis at the outer membrane of Gram-negative bacterial pathogens.
      ,
      • Kang H.J.
      • Coulibaly F.
      • Clow F.
      • Proft T.
      • Baker E.N.
      Stabilizing isopeptide bonds revealed in gram-positive bacterial pilus structure.
      ). Fibrillar adhesins are produced by a wide variety of bacteria. They exhibit considerable sequence diversity, and much still remains to be learned about their structures and functions. Fibrils are usually composed of a single polypeptide, which is covalently anchored to the cell wall via a C-terminal LPXTG motif (
      • Larson M.R.
      • Rajashankar K.R.
      • Patel M.H.
      • Robinette R.A.
      • Crowley P.J.
      • Michalek S.
      • Brady L.J.
      • Deivanayagam C.
      Elongated fibrillar structure of a streptococcal adhesin assembled by the high-affinity association of α- and PPII-helices.
      ,
      • Macintosh R.L.
      • Brittan J.L.
      • Bhattacharya R.
      • Jenkinson H.F.
      • Derrick J.
      • Upton M.
      • Handley P.S.
      The terminal A domain of the fibrillar accumulation-associated protein (Aap) of Staphylococcus epidermidis mediates adhesion to human corneocytes.
      ,
      • Rego S.
      • Heal T.J.
      • Pidwill G.R.
      • Till M.
      • Robson A.
      • Lamont R.J.
      • Sessions R.B.
      • Jenkinson H.F.
      • Race P.R.
      • Nobbs A.H.
      Structural and functional analysis of cell wall-anchored polypeptide adhesin BspA in Streptococcus agalactiae.
      ).
      Streptococcus species, including both commensal strains and pathogens, are prodigious producers of fibrillar adhesins (
      • Jameson M.W.
      • Jenkinson H.F.
      • Parnell K.
      • Handley P.S.
      Polypeptides associated with tufts of cell-surface fibrils in an oral Streptococcus.
      ,
      • Wu H.
      • Mintz K.P.
      • Ladha M.
      • Fives-Taylor P.M.
      Isolation and characterization of Fap1, a fimbriae-associated adhesin of Streptococcus parasanguis FW213.
      ,
      • Wu H.
      • Fives-Taylor P.M.
      Identification of dipeptide repeats and a cell wall sorting signal in the fimbriae-associated adhesin, Fap1, of Streptococcus parasanguis.
      ,
      • Froeliger E.H.
      • Fives-Taylor P.
      Streptococcus parasanguis fimbria-associated adhesin fap1 is required for biofilm formation.
      ). Streptococcus gordonii, a pioneer oral bacterium and opportunistic pathogen, employs the fibrillar adhesin CshA (cell surface hydrophobicity protein A) to enable binding to host cell surfaces and other microorganisms (
      • McNab R.
      • Forbes H.
      • Handley P.S.
      • Loach D.M.
      • Tannock G.W.
      • Jenkinson H.F.
      Cell wall-anchored CshA polypeptide (259 kilodaltons) in Streptococcus gordonii forms surface fibrils that confer hydrophobic and adhesive properties.
      ). This ∼259-kDa polypeptide shares <10% sequence identity to any protein of known structure (
      • McNab R.
      • Forbes H.
      • Handley P.S.
      • Loach D.M.
      • Tannock G.W.
      • Jenkinson H.F.
      Cell wall-anchored CshA polypeptide (259 kilodaltons) in Streptococcus gordonii forms surface fibrils that confer hydrophobic and adhesive properties.
      ,
      • McNab R.
      • Holmes A.R.
      • Clarke J.M.
      • Tannock G.W.
      • Jenkinson H.F.
      Cell surface polypeptide CshA mediates binding of Streptococcus gordonii to other oral bacteria and to immobilized fibronectin.
      ,
      • Holmes A.R.
      • McNab R.
      • Jenkinson H.F.
      Candida albicans binding to the oral bacterium Streptococcus gordonii involves multiple adhesin-receptor interactions.
      ). CshA possesses a distinctive multidomain architecture, comprising an N-terminal signal peptide (41 aa residues), a nonrepetitive target binding region (778 aa), a repetitive region composed of 17 sequentially arrayed repeat domains (RDs; ∼100 aa each), and an LPXTG anchor (see Fig. 1).
      The abbreviations used are: aa, amino acid(s); RD, repeat domain; SEC, size-exclusion chromatography; SAXS, small angle X-ray scattering; r.m.s., root mean square; EOM, ensemble optimization method; Fn, fibronectin; HSQC, heteronuclear single quantum coherence; TOCSY, total correlation spectroscopy.
      CshA forms peritrichous fibrils of ∼60 nm on the surface of S. gordonii (
      • McNab R.
      • Forbes H.
      • Handley P.S.
      • Loach D.M.
      • Tannock G.W.
      • Jenkinson H.F.
      Cell wall-anchored CshA polypeptide (259 kilodaltons) in Streptococcus gordonii forms surface fibrils that confer hydrophobic and adhesive properties.
      ), and heterologous expression of this protein on the surface of Enterococcus faecalis results in the formation of a dense furry layer comprised of multiple closely associated CshA polypeptides, which confers adhesive properties (
      • McNab R.
      • Forbes H.
      • Handley P.S.
      • Loach D.M.
      • Tannock G.W.
      • Jenkinson H.F.
      Cell wall-anchored CshA polypeptide (259 kilodaltons) in Streptococcus gordonii forms surface fibrils that confer hydrophobic and adhesive properties.
      ). Similarly, ΔcshA strains of S. gordonii show reduced binding to other oral microorganisms and host molecules, including fibronectin (Fn) (
      • McNab R.
      • Holmes A.R.
      • Clarke J.M.
      • Tannock G.W.
      • Jenkinson H.F.
      Cell surface polypeptide CshA mediates binding of Streptococcus gordonii to other oral bacteria and to immobilized fibronectin.
      ,
      • Holmes A.R.
      • McNab R.
      • Jenkinson H.F.
      Candida albicans binding to the oral bacterium Streptococcus gordonii involves multiple adhesin-receptor interactions.
      ,
      • Jakubovics N.S.
      • Brittan J.L.
      • Dutton L.C.
      • Jenkinson H.F.
      Multiple adhesin proteins on the cell surface of Streptococcus gordonii are involved in adhesion to human fibronectin.
      ). Recently, the molecular details of host Fn binding by CshA were established, with this polypeptide shown to bind Fn via a distinctive “catch-clamp” mechanism, mediated by discrete domains within the nonrepeat region of the protein (
      • Back C.R.
      • Sztukowska M.N.
      • Till M.
      • Lamont R.J.
      • Jenkinson H.F.
      • Nobbs A.H.
      • Race P.R.
      The Streptococcus gordonii adhesin CshA protein binds host fibronectin via a catch-clamp mechanism.
      ). This mode of binding involves the action of the intrinsically disordered N-terminal domain of the protein and its neighboring ligand-binding domain, which function in concert to form a robust protein–protein interaction via a readily dissociable precomplex intermediate.
      Figure 1. Schematic representation of CshA. Individual RDs are highlighted in blue with proposed interdomain linkers highlighted in red. The adhesive nonrepeat region is highlighted in green.
      In this study, using a combination of structural and biophysical methods, we show that the >175-kDa multidomain repeat region of CshA (CshA_RD1–17) adopts an elongated polymeric structure in solution, with a distinctive conformation dictated by the interplay of fully and partially ordered domains and intrinsically disordered regions. Equilibrium folding studies of individual CshA repeat domains reveal diversity in the stabilities and unfolding profiles of these proteins, despite their often considerable (>90%) sequence identities. We determined the NMR structure of CshA_RD13, which reveals a previously unreported all β-fold flanked at both termini by unstructured linker regions. Complementary analytical ultracentrifugation (AUC) and small-angle X-ray scattering (SAXS) studies of CshA_RD1–17 support the CshA repeat region adopting a transitory structure characterized by alternating regions of order and disorder. Together, our data suggest a molecular architecture within which individual repeat domains contribute additive strength to the intact polypeptide but also minimize the likelihood of domain misfolding that may arise as a consequence of high sequence and structural identity to adjacent RDs. This is enabled via the acquisition of destabilizing mutations that preclude the adoption of a fully folded state. Our work identifies a distinctive polymeric protein architecture and resolves the molecular intricacies of its structure and organization. In turn, this provides greater insight regarding the capacity for bacterial adhesins to promote colonization of sites within the host that are continuously exposed to the flow of blood, saliva, or tissue fluids.

      Results

      The intact CshA repeat region adopts an extended polymeric structure in solution

      Consistent with previous domain assignments, the repeat region of CshA was considered to comprise residues 820–2500 of the 2507-amino acid full-length CshA polypeptide (
      • Back C.R.
      • Sztukowska M.N.
      • Till M.
      • Lamont R.J.
      • Jenkinson H.F.
      • Nobbs A.H.
      • Race P.R.
      The Streptococcus gordonii adhesin CshA protein binds host fibronectin via a catch-clamp mechanism.
      ) (Fig. 1). The intact CshA repeat region, from here on referred to as CshA_RD1–17, was amplified from S. gordonii DL1 (
      • Pakula R.
      • Walczak W.
      On the nature of competence of transformable streptococci.
      ) chromosomal DNA and cloned into the pOPINF expression vector (
      • Berrow N.S.
      • Alderton D.
      • Sainsbury S.
      • Nettleship J.
      • Assenberg R.
      • Rahman N.
      • Stuart D.I.
      • Owens R.J.
      A versatile ligation-independent cloning method suitable for high-throughput expression screening applications.
      ) (Table S1). The resulting construct was used to facilitate overexpression of an N-terminally hexahistidine-tagged variant of CshA_RD1–17 in Escherichia coli, and the resulting recombinant material was purified to homogeneity using a two-step process. CshA_RD1–17 was found to be a homogeneous, monodisperse species in solution, of >95% purity. Analysis of CshA_RD1–17 using CD spectroscopy, followed by deconvolution of the resulting spectrum into secondary structural elements, revealed the protein to be predominantly β-sheet (∼45%), with a significant disorder content (∼39%; Fig. 2A). Sedimentation velocity analytical ultracentrifugation confirmed that the polypeptide is monomeric and adopts an extended configuration in solution with an f/f0 value of 2.84 (Fig. 2B and Table S2). Complementary SAXS analysis (Fig. 2C and Table S3) provided further evidence that CshA adopts an elongated structure, with a radius of gyration of 120 Å and maximum diameter of 408 Å, as derived from the pair distance distribution (P(r)) function (Fig. 2D). Structural disorder is apparent from the Kratky plot, which diverges from the baseline at high q, and the Porod exponent, which is lower than observed for a well-folded globular protein (Fig. 2E). The structural disorder evident from CD and SAXS analysis suggests a flexible dynamic structure, in keeping with the biological role of CshA. The measured scattering data are well-described by the flexible cylinder model (Fig. 2F and Table S4), in which CshA is characterized by a higher Kuhn length and lower contour length than that expected for a random coil. The large deviation from random coil behavior is consistent with a significant proportion of folded regions in the solution structure. These data imply that the polypeptide adopts an elongated, flexible ultrastructure in solution that occupies an ensemble of configurations.
      Figure 2. The intact CshA repeat region adopts an extended polymeric structure in solution. A, far-UV CD spectrum of CshA_RD1–17. B, sedimentation velocity AUC of CshA_RD1–17. C, SAXS profile for CshA_RD1–17 (black circles) and inverse Fourier transform fit for P(r) distribution (red line). D, P(r) distribution for CshA_RD1–17 derived from the scattering profile shown in C. E, Kratky plot derived from the scattering profile shown in C. F, SAXS data for CshA_RD1–17 fitted to a flexible cylinder model.
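      As an illustrative aside on how a radius of gyration follows from a P(r) function such as that shown in Fig. 2D, the sketch below applies the standard real-space relation Rg² = ∫r²P(r)dr / (2∫P(r)dr) using trapezoid integration. It is not the software used in this study. The function names and the test case (a homogeneous sphere of radius 50 Å, for which Rg = √(3/5)·R analytically) are assumptions chosen purely for illustration.

```python
import math


def rg_from_pr(r, p):
    """Radius of gyration from a tabulated pair distance distribution P(r).

    Uses Rg^2 = integral(r^2 P(r) dr) / (2 * integral(P(r) dr)),
    with both integrals evaluated by the trapezoid rule.
    """
    def trapz(y):
        return sum(
            0.5 * (y[i] + y[i + 1]) * (r[i + 1] - r[i])
            for i in range(len(r) - 1)
        )

    numerator = trapz([ri * ri * pi for ri, pi in zip(r, p)])
    denominator = 2.0 * trapz(p)
    return math.sqrt(numerator / denominator)


def sphere_pr(r, radius):
    """Unnormalized analytical P(r) of a homogeneous sphere (zero beyond 2R)."""
    x = r / radius
    if x > 2.0:
        return 0.0
    return r * r * (1.0 - 0.75 * x + x ** 3 / 16.0)


# Hypothetical sanity check: a sphere of radius 50 A (not CshA data).
radius = 50.0
n_points = 2000
r_grid = [2.0 * radius * i / n_points for i in range(n_points + 1)]
p_grid = [sphere_pr(ri, radius) for ri in r_grid]
rg_sphere = rg_from_pr(r_grid, p_grid)
```

      Applied to an experimental P(r) tabulated on a grid of r values, the same function returns the real-space Rg; the largest r at which P(r) is nonzero corresponds to the maximum dimension.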

      Individual CshA repeat domains exhibit varying stabilities and unfolding behaviors

      Having established the solution ultrastructure of CshA_RD1–17, we next sought to investigate the molecular origins of the polypeptide's physical properties. Comparative sequence analysis of assigned CshA repeat domains reveals considerable variation in the amino acid sequences of these regions (Fig. 3A and Fig. S1). The repeat region comprises a central core of domains with very high sequence identity (domains 3–14) punctuated by the deviant repeat domain 7. The sequence of this domain diverges significantly from those of the other 16 repeat domains that comprise the intact repeat region. Surprisingly, a significant number of adjacent domains located within the central 3–14 core exhibit high sequence identity. Domains 3 and 4, domains 5 and 6, domains 10 and 11, domains 11 and 12, and domains 12 and 13 share >90% sequence identity (Fig. 3A), an arrangement that contravenes current dogma regarding the organization of tandemly arrayed domains within multidomain proteins (
      • Borgia M.B.
      • Borgia A.
      • Best R.B.
      • Steward A.
      • Nettels D.
      • Wunderlich B.
      • Schuler B.
      • Clarke J.
      Single-molecule fluorescence reveals sequence-specific misfolding in multidomain proteins.
      ). The sequence identities of the terminal domains of CshA_RD1–17, namely 1, 15, 16, and 17, are significantly lower than those identified in the central core region. Interestingly, in addition to repeat domain 17, domains 6, 11, 12, and 13 all possess a C-terminal LPXTG cell-wall anchor motif, suggesting that evolutionary pressure to present the adhesive nonrepeat region of CshA at a maximal distance from the cell surface may have driven extension of the repeat region via gene duplication.
      Figure 3. Individual CshA repeat domains exhibit varying stabilities and unfolding behaviors. A, heat plot showing primary amino acid sequence identities between pairs of CshA repeat domains. The values correspond to the percentages of sequence identities. Numbered axes refer to individual CshA repeat domains. B, urea-induced equilibrium unfolding transitions for individual CshA repeat domains.
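      The pairwise identities summarized in a heat plot such as Fig. 3A can be illustrated with a minimal sketch that counts matching residues between pre-aligned sequences. The sequences below are invented toy strings, not CshA repeat domains, and the choice to ignore gapped columns in the denominator is an assumption; the alignment method and identity definition used for Fig. 3A are not stated in this section.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns containing a gap in either sequence are skipped (an assumption;
    other definitions divide by the full alignment length).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(seq_a, seq_b):
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(aligned):
    """All-against-all identities for a dict of name -> aligned sequence."""
    names = list(aligned)
    return {
        (first, second): percent_identity(aligned[first], aligned[second])
        for first in names
        for second in names
    }


# Hypothetical toy alignment (NOT real CshA sequences).
toy_alignment = {
    "repeat_A": "GTYKVDLPAG",
    "repeat_B": "GTYKVELPAG",
    "repeat_C": "GS-KIDLPTG",
}
matrix = identity_matrix(toy_alignment)
```

      The resulting dictionary is symmetric with 100% on the diagonal, which is the structure a pairwise identity heat plot displays.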
      In an effort to explore the structural significance of sequence variation between individual CshA repeat domains, a subset of these domains was cloned, recombinantly overexpressed in E. coli, and purified to homogeneity using the same general strategy (Table S1). Representative domains covering a breadth of amino acid sequences were selected: repeat domains 1, 3, 5, 7, and 13. Each could be readily produced in high quantities and to high purities (>95%, as judged by SDS-PAGE analysis). The stability and unfolding behavior of each of these proteins were assessed in vitro by monitoring their unfolding in the presence of increasing concentrations of the chemical denaturant urea (Fig. 3B and Table 1). Unfolding behavior was monitored by intrinsic tyrosine fluorescence, exploiting the presence of at least one such residue in each of the repeats 1, 3, 5, 7, and 13. Of the isolated domains examined, CshA_RD13 exhibited the highest overall stability (−3.42 kcal mol−1), whereas remarkably, CshA_RD5 showed no fluorescence intensity change when titrated with urea, despite the 91% sequence identity with repeat 13, including the two tyrosine residues at precisely the same positions: 52 (residue Tyr2084) and 92 (residue Tyr2123) (Fig. S1). CD spectroscopy of repeat domain 5 also indicated that this domain was largely unstructured, even in the absence of urea (data not shown). Although CshA_RD3 and CshA_RD7 are less stable than CshA_RD13, they do exhibit a mildly cooperative unfolding transition, whereas CshA_RD1 is barely stable even in the absence of urea but also exhibits a weakly cooperative unfolding transition. Complementary size-exclusion chromatography (SEC) analyses of individual CshA repeat domains provide further support for variability in the degree of foldedness of these proteins (Fig. S2). The largely unfolded CshA_RD5 elutes earlier from an SEC column than its better folded counterparts and significantly earlier than the well-folded CshA_RD13.
      Table 1. Folding data for CshA repeat domains
      Domain    ΔG(D–N, H2O) (kcal mol−1)    m(D–N) (kcal mol−1 M−1)
      RD1       −0.64                        0.66
      RD3       −1.53                        1.17
      RD5       not reported                 not reported
      RD7       −1.47                        1.24
      RD13      −3.42                        1.74
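      To make the Table 1 parameters more tangible, the sketch below evaluates a two-state linear extrapolation model, in which the unfolding free energy falls linearly with denaturant, ΔG([urea]) = ΔG(H2O) − m·[urea], so that the transition midpoint is ΔG(H2O)/m. Several points here are assumptions rather than statements from the study: the fitting model is not specified in this section, the magnitudes of the tabulated ΔG values are used as unfolding free energies (the table reports them with a negative sign), and a temperature of 25 °C is assumed. The function names are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def midpoint_urea(dg_h2o, m_value):
    """Urea concentration (M) at which half the population is unfolded."""
    return dg_h2o / m_value


def fraction_unfolded(urea, dg_h2o, m_value, temp_k=298.15):
    """Two-state fraction unfolded at a given urea concentration (M).

    dg_h2o is the unfolding free energy in water (kcal mol^-1, positive
    for a stable fold); m_value is in kcal mol^-1 M^-1.
    """
    dg = dg_h2o - m_value * urea
    return 1.0 / (1.0 + math.exp(dg / (R_KCAL * temp_k)))


# Magnitudes taken from Table 1; RD5 is omitted because no values are listed.
TABLE_1 = {
    "RD1": (0.64, 0.66),
    "RD3": (1.53, 1.17),
    "RD7": (1.47, 1.24),
    "RD13": (3.42, 1.74),
}

midpoints = {name: midpoint_urea(dg, m) for name, (dg, m) in TABLE_1.items()}
unfolded_in_water = {
    name: fraction_unfolded(0.0, dg, m) for name, (dg, m) in TABLE_1.items()
}
```

      Under these assumptions, RD13 has a midpoint near 2 M urea, whereas RD1 would be roughly one quarter unfolded even without denaturant, in line with its description as barely stable.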

      Solution structure of CshA_RD13

      In an effort to provide a structural framework for the observed biophysical properties of CshA_RD1–17, CshA_RD13, which possesses the highest cross-domain sequence identity to all other CshA repeat domains (Fig. 3A), was selected for structure elucidation. Of the five single repeat domains produced recombinantly, CshA_RD13 has the greatest tolerance to urea unfolding, suggestive of high stability (Fig. 3B). The structure of this protein was determined using solution NMR (Fig. 4 and Figs. S3 and S4). Assignment proved challenging because of repetitive sequence motifs and a high degree of mobility leading to both the absence of some signals and the doubling (or more) of others (Fig. 4A). Nonetheless, a high degree of assignment was achieved for the core region of the protein covering residues 2053–2130 (Table 2). The N-terminal region (residues 2032–2052 plus a 19-residue tag) was found to be largely unstructured with few inter-residue NOEs and no unambiguously assignable long-range NOEs. For this reason, no structural restraints were included for this part of the sequence, and the structure was only calculated and validated for residues 2053–2130. In addition to the high degree of disorder in the N-terminal part of CshA_RD13, several other regions of slow exchange (ms) were detected. Two sets of NMR signals were observed for the initial N-terminal loop comprised of residues 2053–2062, of which only the major set was used for structure calculations. A hydrogen–deuterium exchange experiment showed that the Val2059 NH group is involved in a hydrogen bond that persists for over an hour, suggesting that interconversion between these conformations is either very slow or, more likely, that they are very similar and both involve a hydrogen bond between Val2059H and Asp2056O (as determined from initial structure calculations conducted without hydrogen bond restraints). 
Multiple conformations were also observed for residues Asp2113 and Asn2115, which lie in the β5-β6 loop. The β5-β6 loop lies adjacent to the N-terminal loop, suggesting that slow exchange between these two regions may be coupled. The β4-β5 loop (Pro2096–Pro2106) and C-terminal tail (Ser2124–Val2130) are both ill-defined in the structural ensemble (Fig. 4B), which is in part due to several broad, missing, or unassigned signals and thus a low density of structural restraints that might reflect the underlying dynamics of these regions. Several residues along the outside edge of the β3 and β5 strands have NOEs that could not be assigned to residues within the globular domain. Most likely these arise from interactions with the N-terminal tail of the protein, although no unambiguous assignment to particular residues was possible.
      Figure 4. Solution structure of CshA_RD13 reveals a new protein fold. A, 1H-15N HSQC spectrum of CshA_RD13 recorded at 20 °C and 14.1 T (600 MHz). Boxed contours in green are shown at a lower contour level than the rest of the spectrum. Starred peak labels in red indicate additional minor conformations, not included in the structure calculations. Peak labels with negative peak numbers indicate peaks from the unstructured His tag. B, ensemble of 15 lowest energy structures of CshA_RD13. C, sequence of CshA_RD13 depicted in B, D, and E. D, cartoon diagram of CshA_RD13 with secondary structure elements labeled. E, structure of CshA_RD13 highlighting the composition of the hydrophobic core of the protein.
      Table 2. NMR assignment, structure calculation, and validation statistics for CshA_RD13
      Parameter    Value
      Degree of assignment (a)
       Backbone (Cα, C′, N, and HN) (%)    85.0
       Side-chain H (%)    88.4
       Side-chain non-H (%)    74.5
      Number of restraints
       Distance restraints
           Intraresidue (|i − j| = 0)    489
           Sequential (|i − j| = 1)    307
           Medium range (2 ≤ |i − j| < 5)    165
           Long range (|i − j| ≥ 5)    369
           Ambiguous    470
           Total    1800
       Hydrogen bond restraints    29
       Dihedral angle restraints (Φ/Ψ/χ1)    120 (59/59/2)
      Restraint statistics (b)
       r.m.s. of distance violations (Å)    0.018 ± 0.002
       r.m.s. of dihedral violations (°)    0.57 ± 0.09
       Violations > 0.5 Å    0
       Violations > 0.3 Å    1.3 ± 1.5
       Violations > 0.1 Å    14.2 ± 2.9
      r.m.s. from idealized covalent geometry (b)
       Bonds (Å)    0.0041 ± 0.0002
       Angles (°)    0.53 ± 0.03
       Impropers (°)    1.87 ± 0.14
      Structural quality
       Ramachandran statistics (c)
           Most favored regions (%)    80.1 (a) / 84.6 (d)
           Allowed regions (%)    18.9 (a) / 15.4 (d)
           Generously allowed regions (%)    0.3 (a) / 0.0 (d)
           Disallowed regions (%)    0.7 (a) / 0.0 (d)
       CING % ROG scores (R/O/G) (a) (Doreleijers J.F., Sousa da Silva A.W., Krieger E., Nabuurs S.B., Spronk C.A., Stevens T.J., Vranken W.F., Vriend G., Vuister G.W. CING: an integrated residue-based structure validation program suite.)    27/18/55
       Verify3D Z score (Lüthy R., Bowie J.U., Eisenberg D. Assessment of protein models with 3-dimensional profiles.)    −2.41 (a)
       Prosa II Z score (Bhattacharya A., Tejero R., Montelione G.T. Evaluating protein structures determined by structural genomics consortia.)    −0.54 (a)
       Procheck Z score (Φ/Ψ) (Laskowski R.A., Macarthur M.W., Moss D.S., Thornton J.M. Procheck: a program to check the stereochemical quality of protein structures.)    −3.38 (a) / −2.75 (d)
       Procheck Z score (all) (Laskowski R.A., Macarthur M.W., Moss D.S., Thornton J.M. Procheck: a program to check the stereochemical quality of protein structures.)    −4.97 (a) / −4.55 (d)
       MolProbity Z score (Lovell S.C., Davis I.W., Arendall 3rd, W.B., de Bakker P.I., Word J.M., Prisant M.G., Richardson J.S., Richardson D.C. Structure validation by Cα geometry: φ, ψ and Cβ deviation.)    −1.13 (a)
       No. of close contacts    11 (a), (e)
      Coordinates precision (r.m.s. deviation)
       All backbone atoms (Å)    1.2 (a) / 0.6 (d) / 0.4 (f)
       All heavy atoms (Å)    1.6 (a) / 1.0 (d) / 0.7 (f)
      Letters in parentheses refer to the table footnotes that follow.
      a Residues 2053–2130.
      b Values reported by ARIA 2.3 (
      • Rieping W.
      • Habeck M.
      • Bardiaux B.
      • Bernard A.
      • Malliavin T.E.
      • Nilges M.
      ARIA2: automated NOE assignment and data integration in NMR structure calculation.
      ).
      c Values reported by Procheck (
      • Laskowski R.A.
      • Macarthur M.W.
      • Moss D.S.
      • Thornton J.M.
      Procheck: a program to check the stereochemical quality of protein structures.
      ).
      d Ordered residues (residues 2056–2099 and 2104–2124) as calculated by PSVS 1.5 (
      • Bhattacharya A.
      • Tejero R.
      • Montelione G.T.
      Evaluating protein structures determined by structural genomics consortia.
      ).
      e Value reported by PDB validation software.
      f Residues in secondary structure as calculated by PSVS 1.5 (residues 2067–2069, 2075–2078, 2082–2087, 2091–2096, 2107–2112, and 2118–2123) (
      • Bhattacharya A.
      • Tejero R.
      • Montelione G.T.
      Evaluating protein structures determined by structural genomics consortia.
      ).
      CshA_RD13 adopts a β-sandwich fold comprising two three-stranded anti-parallel β-sheets arranged at an angle of ∼35° relative to one another (Fig. 4C). The two sheets are connected by an 11-amino acid linker that fuses β4 to β5 and forms an extended loop that wraps around the C-terminal apex of the protein. The interface between the two β-sheets is predominantly hydrophobic and forms the compact core of the protein (Fig. 4D). There is a high degree of amino acid sequence conservation in this region in other CshA repeat domains, suggesting that each domain retains this unique core fold. In addition to the highlighted hydrophobic residues, the two tyrosine residues of CshA_RD13 (Tyr2084 and Tyr2123) reside at the β-sandwich interface (Fig. 4D). This lends credence to our folding studies, because the solvation state of this pair of residues is likely to provide an accurate measure of protein unfolding. The overall shape of CshA_RD13 can be likened to a cylinder that is tapered at both termini. The terminal regions of the protein present sizeable patches of charge, suggesting that neighboring repeat domains may be able to engage in complementary charge–charge interactions with one another. Four of the five residues universally conserved in all 17 repeat domains (Gly2082, Gly2090, Gly2102, and Asp2113 in CshA_RD13) help to constrain the tight turns between individual β-strands (β2-β3, β3-β4, β4-β5, and β5-β6; SI). The fifth residue is located in the disordered N-terminal interdomain linker.

      CshA_RD1–17 adopts a transitory dynamic structure comprising alternating regions of order and disorder

      To reconcile our structural and biophysical data, we attempted to construct a pseudoatomic model describing the molecular architecture of CshA_RD1–17 in its entirety. The partial foldedness of the polypeptide implied from our SAXS and CD data, in addition to the variations in unfolding behavior of selected repeat domains and the observation of disordered linker regions at either terminus of the CshA_RD13 NMR structure, suggests that CshA may adopt a structure comprised of alternating regions of order and disorder. To verify this model, we applied an ensemble optimization method (EOM) to our SAXS data (Fig. 5 and Table 3). Homology models of each repeat domain were generated and used to formulate pseudoatomic models describing the molecular architecture of CshA_RD1–17, in which well-folded domains alternate with disordered regions approximated by a random coil. The data were well-described by a model containing all 17 RD homology structures, although the calculated ensemble Rg (99 Å) was lower than that determined experimentally (120 Å) (Fig. 5 and Table 3). The contour length of this structure is ∼660 Å, more than half of that determined from the data using the flexible polymer model. However, this model underestimates the proportion of disordered structure as measured using CD. To compensate, only repeat domains predicted to be largely ordered (1, 3–4, 7–8, 14–16) were included in the model, yielding an ensemble with an average Rg in agreement with our experimentally measured value (Fig. 5 and Table 3). These findings indicate that the ultrastructure of CshA_RD1–17 does not adhere to a standard “beads-on-a-string” configuration, wherein individual well-folded RDs are arranged in a defined sequence within the polypeptide chain, but rather a highly dynamic architecture wherein a subset of RDs fail to adopt a fully folded state, thus leading to a highly dynamic transitory structure dominated by the interplay of ordered, disordered, and partially ordered regions.
      Figure 5. CshA_RD1–17 adopts a transitory dynamic structure comprising alternating regions of order and disorder. A, Rg distributions for CshA_RD1–17 modeled with 17 globular repeat domains (blue) and 8 globular repeat domains (red). The Rg distribution of the full pool of structures generated with RANCH is shown as a solid color, and the Rg distribution of the selected ensemble is shown as a black line. B, EOM fits for CshA_RD1–17 modeled with 17 globular repeat domains (blue line) and 8 globular repeat domains (red line).
      Table 3. EOM fitting parameters for different models
      Parameter                  17-RD model    8-RD model
      Repeat domains in model    1–17           1, 3–4, 7–8, 14–16
      Ensemble Rg (Å)            99.04          120.0
      Ensemble Dmax (Å)          300.94         353.34
      Fit quality (χ2)           2.720          3.008
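      The fit quality reported for an ensemble can be illustrated with a short sketch. It assumes the reduced χ2 form commonly used in SAXS fitting, with a single least-squares scale factor applied to the averaged theoretical curve. The function names and the toy numbers are hypothetical, and this is not the EOM or GAJOE implementation.

```python
def ensemble_intensity(curves):
    """Average the theoretical scattering curves of equally weighted conformers."""
    n = len(curves)
    return [sum(values) / n for values in zip(*curves)]


def scaled_chi2(i_exp, sigma, i_calc):
    """Return (reduced chi^2, scale) for a calculated curve against measured data.

    The scale factor is the least-squares value minimizing
    sum(((i_exp - scale * i_calc) / sigma)^2).
    """
    num = sum(e * c / s ** 2 for e, c, s in zip(i_exp, i_calc, sigma))
    den = sum(c * c / s ** 2 for c, s in zip(i_calc, sigma))
    scale = num / den
    resid = sum(((e - scale * c) / s) ** 2 for e, c, s in zip(i_exp, i_calc, sigma))
    return resid / (len(i_exp) - 1), scale


if __name__ == "__main__":
    # Toy data only (hypothetical): two conformer curves and a measured curve.
    pool = [[10.0, 6.0, 2.0], [8.0, 4.0, 2.0]]
    averaged = ensemble_intensity(pool)
    chi2, scale = scaled_chi2([18.5, 9.8, 4.1], [0.5, 0.3, 0.2], averaged)
    print(f"scale = {scale:.3f}, chi2 = {chi2:.3f}")
```

      A genetic algorithm such as the one described for ensemble selection would evaluate a score of this kind for many candidate subsets of the pool and retain the subsets with the lowest values.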

      Discussion

      Fibrillar adhesins are an important family of bacterial surface proteins that make significant contributions to environmental and host colonization, biofilm formation, host tissue invasion, and pathogenicity. As virulence factors, they represent attractive targets for the development of therapeutic strategies and interventions. Although many fibrillar adhesins have been identified in commensal and pathogenic bacteria, only a small number of these proteins have been subjected to detailed molecular level characterization. Examples include SasG, M protein, and the AgI/II family polypeptides (Larson et al., Elongated fibrillar structure of a streptococcal adhesin assembled by the high-affinity association of α- and PPII-helices; Rego et al., Structural and functional analysis of cell wall-anchored polypeptide adhesin BspA in Streptococcus agalactiae; Gruszka et al., Staphylococcal biofilm-forming protein has a contiguous rod-like structure; Gruszka et al., Cooperative folding of intrinsically disordered domains drives assembly of a strong elongated protein; Gruszka et al., Disorder drives cooperative folding in a multidomain protein; Formosa-Dague et al., Zinc-dependent mechanical properties of Staphylococcus aureus biofilm-forming surface protein SasG; Troffer-Charlier et al., Crystal structure of the V-region of Streptococcus mutans antigen I/II at 2.4 A resolution suggests a sugar preformed binding site; Forsgren et al., Crystal structure of the variable domain of the Streptococcus gordonii surface protein SspB; Forsgren et al., Two intramolecular isopeptide bonds are identified in the crystal structure of the Streptococcus gordonii SspB C-terminal domain; Larson et al., Crystal structure of the C-terminal region of Streptococcus mutans antigen I/II and characterization of salivary agglutinin adherence domains). Each of these adhesins exploits a startlingly disparate molecular mechanism to facilitate the formation of fibrillar structures on the bacterial cell surface.
      The S. gordonii fibrillar adhesin CshA plays an important role in host colonization. CshA possesses a distinctive modular architecture that comprises 17 β-sandwich domains fused in series by flexible linkers. Although there is diversity in the sequences of individual repeat domains, amino acid sequence analysis suggests that each retains a conserved hydrophobic core that forms the basis of a compact protein fold. The structure of the representative repeat domain CshA_RD13 has been elucidated and provides a valuable test subject for understanding CshA repeat domain structure and function. The high degree of mobility in CshA_RD13 made assignment and structure calculation for this protein challenging; nonetheless, the core globular part of the protein is well-defined (Fig. 4). DALI analysis of CshA_RD13 failed to identify any closely related structural homologues of the protein, and technically the domain exhibits a new fold. However, the flattened β-sandwich is reminiscent of Ig domains found in many other repeat domain–containing proteins such as titin and cadherin (Smith et al., The structure of Leishmania mexicana ICP provides evidence for convergent evolution of cysteine peptidase inhibitors).
      Folding studies of individual CshA RDs reveal remarkably variable stabilities considering their high sequence identities (Fig. 3). Five of the repeat domains (domains 1, 3, 5, 7, and 13) were expressed individually and subjected to equilibrium unfolding to assess their relative stabilities. Repeat domains 3, 7, and 13 all displayed a cooperative unfolding transition with a relatively small free energy of folding, although this is not unusual for small domains (for example, see the study by Gruszka et al., Cooperative folding of intrinsically disordered domains drives assembly of a strong elongated protein). Equilibrium unfolding of CshA_RD1 revealed a weakly cooperative transition (mD-N = 0.66 kcal mol−1 M−1) and only very marginal stability (0.64 kcal mol−1), indicating that a significant proportion of the molecules are unfolded even in native conditions. Because CshA_RD1 is markedly divergent from all of the other repeat domains, it is difficult to relate differences in sequence to changes in stability. Interestingly, the sequences of the terminal repeat domains CshA_RD1 and CshA_RD17 differ considerably from those located centrally within the polypeptide. This may reflect the fact that they have coevolved to be adjacent to a nonrepeat domain and the cell wall, respectively.
      Because CshA_RD1 follows the nonrepetitive region in the overall CshA structure, it may require the presence of that region to interact with and stabilize it. No transition could be observed at all with CshA_RD5, which is surprising because it has 91% identity with CshA_RD13. An examination of the differences between the primary sequences of CshA_RD5 and CshA_RD13 with respect to the NMR structure of the latter suggests some differences that may be responsible for destabilizing CshA_RD5 relative to CshA_RD13. Thr2077 on β1b and both Pro2118 and Thr2122 on β5 in CshA_RD13 are all solvent-exposed to some degree and have been substituted with valine, leucine, and isoleucine, respectively, in CshA_RD5, leading to unfavorable exposure of hydrophobic residues to the aqueous solvent. Pro2079, which forms part of a type II β-turn between strands β1b and β2, is substituted with a serine, which statistically has a greater preference for type I β-turns.
      Mapping amino acid conservation across all 17 repeat domains onto the structure of CshA_RD13 indicates partial conservation of hydrophobic residues that reside within the hydrophobic cores of each RD. The central section of CshA_RD1–17 comprises 12 of 13 serially arrayed repeat domains that possess a high degree of sequence identity and appear closely structurally related (Fig. 3A). The sequential arrangement of high-similarity domains is at odds with the known sequence-to-folding relationships in tandemly arrayed protein domains, in which sequence disparity between neighboring domains is postulated to minimize protein misfolding (Borgia et al., Single-molecule fluorescence reveals sequence-specific misfolding in multidomain proteins). It is tempting to speculate that interdomain linker length and disorder play an important role in this process, ensuring that the spatial distance between neighboring domains is sufficient to allow each individual domain to adopt its fully folded conformation prior to translation of its succeeding neighbor. However, what is clear from our folding studies and corroborated by hydrodynamic analysis of CshA_RD1–17 is that a subset of CshA RDs do not adopt a well-folded conformation either alone or in the context of the intact CshA polypeptide. This generalized loss in foldedness appears to arise because of the acquisition of destabilizing mutations within the hydrophobic core of some repeat domains. Significantly, these mutations appear to arise in instances where there is significant amino acid sequence identity to neighboring domains (Fig. 3A and Fig. S1). This may represent a strategy to minimize the likelihood of interdomain misfolding events, thus mitigating adhesin aggregation on the bacterial cell surface. Alternatively, the solvent-exposed hydrophobic residues may help to mediate interaction between CshA polypeptides during assembly of the cell-surface adhesive layer.
      The functional significance of the dynamic transitory structure of CshA_RD1–17 is yet to be unambiguously established. However, it is unquestionable that the combination of folded and partially folded regions will confer a high degree of flexibility to the polypeptide. This may enable the optimal projection of CshA's adhesive tip from the S. gordonii cell surface and in doing so maximize the capture radius of the adhesin. In addition, the partially folded structure may provide a mechanism of force damping following fibronectin binding. This could offer a mechanical advantage by mitigating the effects of shear forces following target engagement. This would be of particular significance in the bloodstream, where it is necessary for S. gordonii to maintain an intimate association with the surface of host cells while resisting the force of blood flow. The transitory structure of CshA_RD1–17 would provide a deformable tether with the capacity to dissipate the kinetic energy of binding under flow.
      In summary, here we report the identification and characterization of an entirely new architecture for multidomain bacterial surface proteins as typified by the S. gordonii adhesin CshA. This ultrastructure is characterized by the presence of fully and partially folded repeat domains, along with regions of intrinsic disorder, which affords a dynamic yet mechanically robust polymeric structure. Our study extends the diversity of natural protein architectures that are employed to enable microbial adherence to biotic and abiotic substrata and provides new insight into the capacity for bacteria to adhere and persist at sites exposed to shear forces. Moreover, this information establishes a foundation for the development of interventions that target CshA and related polypeptides that can be applied to disease prevention and anti-biofouling strategies.

      Experimental procedures

      Gene cloning

      DNA sequences encoding CshA_RD1–17, CshA_RD1, CshA_RD3, CshA_RD5, CshA_RD7, and CshA_RD13 were amplified from S. gordonii DL1 (Pakula and Walczak, On the nature of competence of transformable streptococci) chromosomal DNA using appropriate primers (Table S1), incorporating appropriate consensus sequences for subsequent cloning into the expression vector pOPINF (Berrow et al., A versatile ligation-independent cloning method suitable for high-throughput expression screening applications), precut with HindIII and KpnI. Ligations were performed using the In-Fusion™ (Clontech) cloning system as per the manufacturer's instructions. The resulting constructs encode N-terminally hexahistidine-tagged variants of each of the proteins under investigation. The sequences of all constructs were verified by DNA sequencing before being transformed into E. coli BL21 (DE3) cells for protein expression.

      Protein expression

      For the expression of unlabeled CshA_RD1–17, CshA_RD1, CshA_RD3, CshA_RD5, CshA_RD7, and CshA_RD13, cultures of E. coli BL21 (DE3) cells harboring the respective expression plasmid were grown with shaking (200 rpm) in 1 liter of LB (Luria-Bertani) broth supplemented with carbenicillin (50 μg ml−1) at 37 °C, to A600 = 0.4–0.6. Protein expression was induced by the addition of isopropyl β-galactopyranoside to a final concentration of 1 mM, and the cell cultures were transferred to 20 °C with shaking at 200 rpm and grown for a further 16 h. For expression of 15N-labeled CshA_RD13, a culture (100 ml) of E. coli BL21 (DE3) cells harboring CshA_RD13::pOPINF was grown overnight at 37 °C with shaking at 200 rpm. The cells were harvested by centrifugation, washed in resuspension buffer (50 mM Tris-HCl, 150 mM NaCl, pH 7.5), and used to inoculate 1 liter of M9 minimal autoinduction medium (50 mM KH2PO4, 25 mM Na2HPO4, pH 6.8, 10 mM NaCl, 0.5% glycerol, 0.05% glucose, 0.2% α-lactose, 2 mM MgSO4) supplemented with carbenicillin (50 μg ml−1), trace elements (10 ml, 100×, to a final concentration of 13.4 mM EDTA, 3.1 mM FeCl3·6H2O, 0.62 mM ZnCl2, 76 μM CuCl2·2H2O, 42 μM CoCl2·6H2O, 162 μM H3BO3, 8.1 μM MnCl2·6H2O) and 1 g/liter 15NH4Cl. The cells were grown with shaking at 37 °C to A600 = 0.4–0.6 and were then grown with shaking (200 rpm) at 20 °C for a further 16 h. For expression of 15N13C-labeled CshA_RD13, a culture (100 ml) of E. coli BL21 (DE3) cells harboring CshA_RD13::pOPINF was grown overnight with shaking at 37 °C. The cells were harvested by centrifugation, washed in resuspension buffer, and used to inoculate 2 liters of M9 minimal medium (50 mM KH2PO4, 25 mM Na2HPO4, pH 6.8, 10 mM NaCl, 1 mM MgSO4, 0.3 mM CaCl2, 1 mg ml−1 biotin, 1 mg ml−1 thiamin), supplemented with carbenicillin (50 μg ml−1), trace elements (5 ml/liter, 100×), 0.5 g/liter 15NH4Cl, and 2 g/liter 13C glucose. The cells were grown to A600 = 0.8–0.9. Protein expression was induced by the addition of isopropyl β-galactopyranoside (1 mM), and the cell cultures were transferred to 25 °C with shaking at 200 rpm and grown for a further 16 h.

      Protein purification

      All recombinant proteins were purified using the same general strategy. The cells were harvested by centrifugation and lysed. Cell debris was removed by centrifugation, and the remaining supernatant liquids were applied to a HiTrap Ni2+ affinity column (GE Healthcare). The proteins were eluted with an imidazole gradient of 10–500 mM over 15 column volumes. The fractions (2 ml) found to contain the target protein of interest (as identified by SDS-PAGE analysis) were pooled and concentrated. Protein samples were subjected to further purification using SEC by passage through either a Superdex 16/60 S75 column (CshA_RD1, CshA_RD3, CshA_RD5, CshA_RD7, and CshA_RD13) or a Superdex 16/60 S200 column (CshA_RD1–17), both from GE Healthcare. For unlabeled proteins, SEC was performed in 50 mM Tris-HCl, 150 mM NaCl, pH 7.5. For labeled proteins, SEC purification was performed in 20 mM phosphate, 50 mM NaCl, pH 7.5. Protein-containing fractions were pooled, concentrated to 20 mg ml−1, and stored at 4 °C.

      Analytical ultracentrifugation

      Sedimentation velocity analytical ultracentrifugation experiments were performed using a Beckman Optima XL-I. Sedimentation of CshA_RD1–17 was monitored at 40,000 rpm and 20 °C using the UV-visible absorption system at a wavelength of 280 nm. The sample concentration was 6.22 μM in buffer (20 mM Tris-HCl, 150 mM NaCl, pH 7.5). The sedimentation profiles were fitted in SEDFIT using the continuous distribution c(s) Lamm equation model. The partial specific volume of CshA_RD1–17 (0.7279 cm3 g−1) was calculated from the primary sequence using SEDFIT. The density and viscosity of the buffer were measured using an Anton Paar rolling-ball viscometer (Lovis 2000 M/ME) and found to be 1.002921 g cm−3 and 1.0218 mPa·s, respectively.
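      To show how the measured buffer density and viscosity enter a sedimentation analysis, the sketch below computes the buoyancy term (1 − v̄ρ) and the standard factor that converts an observed sedimentation coefficient to 20 °C in water. The partial specific volume, buffer density, and buffer viscosity are the values quoted above. The water reference values are commonly used textbook numbers, the function names are hypothetical, and the partial specific volume is assumed to be the same under both conditions. This is illustrative arithmetic, not the SEDFIT procedure.

```python
import math

# Values quoted in the text for CshA_RD1-17 and its buffer.
VBAR = 0.7279           # partial specific volume, cm^3 g^-1
RHO_BUFFER = 1.002921   # buffer density, g cm^-3
ETA_BUFFER = 1.0218     # buffer viscosity, mPa s

# Commonly used reference values for water at 20 degrees C (assumed here).
RHO_WATER_20 = 0.99823  # g cm^-3
ETA_WATER_20 = 1.002    # mPa s


def buoyancy_factor(vbar, rho):
    """Return the dimensionless buoyancy term (1 - vbar * rho)."""
    return 1.0 - vbar * rho


def angular_velocity(rpm):
    """Convert rotor speed in revolutions per minute to rad s^-1."""
    return rpm * 2.0 * math.pi / 60.0


def s20w_factor(vbar=VBAR, rho_buffer=RHO_BUFFER, eta_buffer=ETA_BUFFER):
    """Return the factor multiplying an observed s-value to give s(20,w)."""
    viscosity_ratio = eta_buffer / ETA_WATER_20
    buoyancy_ratio = buoyancy_factor(vbar, RHO_WATER_20) / buoyancy_factor(vbar, rho_buffer)
    return viscosity_ratio * buoyancy_ratio


if __name__ == "__main__":
    print(f"buoyancy term in buffer: {buoyancy_factor(VBAR, RHO_BUFFER):.4f}")
    print(f"rotor speed: {angular_velocity(40000):.1f} rad/s")
    print(f"s(20,w) correction factor: {s20w_factor():.4f}")
```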

      Small angle X-ray scattering

      SAXS data for CshA_RD1–17 were collected at the Diamond Light Source synchrotron (Beamline B21) with a fixed camera length configuration (4.014 m) at 12.4 keV. Size-exclusion chromatography–coupled SAXS (SEC-SAXS) using an Agilent HPLC system was utilized to collect the data. The sample was measured at a concentration of 25.8 μM in buffer (20 mM Tris-HCl, 150 mM NaCl, 5 mM KNO3, 1% sucrose, pH 7.5). Two-dimensional scattering profiles were reduced using in-house software. The data were scaled, merged, and background-subtracted using the ScÅtter software package (Förster et al., Scatter: software for the analysis of nano- and mesoscale small-angle scattering). GNOM and BAYESAPP were used to generate pair distance distribution plots from the scattering curves. Form factor fitting was carried out with SASVIEW using a flexible cylinder model. The model describes a chain that is defined by the contour length (L) and the Kuhn length (b). The Kuhn length is defined as twice the persistence length, over which the chain can be described as rigid. Values above that expected for a random coil can be ascribed to the range of possible torsional angles between residues and to folded structural elements within the polypeptide. The contour length is the linearly extended length of the particle without stretching the backbone. For completely disordered chains behaving as a random coil, b is between 18 and 20 Å. The theoretical contour length for a fully disordered protein is 3.84 Å per residue and is defined by the number of residues and the spacing between Cα positions. The experimental data were analyzed by ensemble optimization using EOM. RANCH was used to generate a pool of 10,000 independent conformational models based on the primary sequence and homology models of folded RDs. GAJOE was used to select an ensemble of models whose combined theoretical scattering profiles best approximated the measured data using a genetic algorithm.
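      The relationship between contour length, Kuhn length, and overall chain dimensions can be sketched as follows. The per-residue spacing of 3.84 Å is taken from the text. The radius of gyration uses the closed-form expression for an ideal wormlike (Kratky–Porod) chain without excluded volume, which is a simplification of the flexible cylinder form factor and is assumed here only for illustration. The function names and example inputs are hypothetical.

```python
import math

CA_SPACING = 3.84  # Angstrom per residue for a fully disordered chain (from the text)


def contour_length(n_residues, spacing=CA_SPACING):
    """Return the contour length (Angstrom) of a fully extended disordered chain."""
    return n_residues * spacing


def wormlike_rg(contour, kuhn):
    """Return Rg (Angstrom) of an ideal wormlike chain.

    contour: contour length L in Angstrom
    kuhn:    Kuhn length b in Angstrom (twice the persistence length)
    """
    lp = kuhn / 2.0            # persistence length
    x = contour / lp           # chain length in units of persistence length
    one_minus_exp = -math.expm1(-x)   # 1 - exp(-x), computed stably for small x
    rg2 = lp * lp * (x / 3.0 - 1.0 + 2.0 / x - 2.0 * one_minus_exp / (x * x))
    return math.sqrt(rg2)


if __name__ == "__main__":
    # Hypothetical example: a 100-residue disordered segment with b = 19 Angstrom.
    length = contour_length(100)
    print(f"L = {length:.1f} A, Rg = {wormlike_rg(length, 19.0):.1f} A")
```

      In the long-chain limit this expression reduces to Rg2 = Lb/6, and in the stiff limit to Rg2 = L2/12, so a larger fitted Kuhn length at fixed contour length corresponds to a more extended particle.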

      Proteolytic His-tag cleavage

      Following nickel affinity and size-exclusion purification of recombinant CshA_RD1, CshA_RD3, CshA_RD5, CshA_RD7, and CshA_RD13, their hexahistidine tags were cleaved by 3C protease digestion (Pierce) according to the manufacturer's protocol: 3C protease (1 mg ml−1) was incubated with His-tagged CshA protein (5 mg ml−1) overnight at 4 °C with agitation. The cleaved CshA proteins were separated from the uncleaved material by passage through a HiTrap Ni2+ affinity column (GE Healthcare) equilibrated with buffer (20 mM potassium phosphate, 100 mM NaCl, pH 7.0). Cleaved protein was eluted with 5 column volumes of the same buffer. Uncleaved protein was then eluted with elution buffer (20 mM potassium phosphate, 100 mM NaCl, 1 M imidazole, pH 7.0). Cleaved protein was concentrated to 5–10 mg ml−1.

      Equilibrium unfolding studies

      Equilibrium unfolding studies were performed by monitoring the change in intrinsic tyrosine fluorescence as a consequence of increasing urea concentration. All spectra were collected using a Horiba Jobin Yvon Fluorolog. Protein concentrations of 10 μM in buffer (20 mM potassium phosphate, 100 mM NaCl, pH 7.0), plus varying concentrations of urea, were mixed, and samples were left to equilibrate for 1 h at 20 °C prior to analysis. All fluorescence experiments were performed at 23 °C. For each sample, an emission spectrum was measured over the range 290–320 nm using an excitation wavelength of 278 nm. For analysis, the fluorescence intensity at 306 nm was plotted as a function of urea concentration, and the data were fitted to a two-state equilibrium unfolding model.
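      A two-state equilibrium unfolding model can be written compactly. The sketch below assumes the usual linear extrapolation form, in which the unfolding free energy falls linearly with urea concentration, and flat native and unfolded baselines. The exact parameterization used for fitting is not stated in the text, so both are assumptions, and the function names are hypothetical. The example uses the stability (0.64 kcal mol−1) and m-value (0.66 kcal mol−1 M−1) reported for CshA_RD1 and the measurement temperature of 23 °C.

```python
import math

R_KCAL = 1.987204e-3  # gas constant, kcal mol^-1 K^-1


def fraction_unfolded(urea, dg_water, m_value, temp_c=23.0):
    """Return the unfolded fraction at a given urea concentration (M).

    dg_water: unfolding free energy in the absence of urea, kcal mol^-1
    m_value:  dependence of the free energy on urea, kcal mol^-1 M^-1
    """
    rt = R_KCAL * (temp_c + 273.15)
    dg = dg_water - m_value * urea      # linear extrapolation (assumed)
    k_unfold = math.exp(-dg / rt)       # equilibrium constant [U]/[N]
    return k_unfold / (1.0 + k_unfold)


def two_state_signal(urea, dg_water, m_value, f_native, f_unfolded, temp_c=23.0):
    """Return the population-weighted fluorescence signal with flat baselines."""
    fu = fraction_unfolded(urea, dg_water, m_value, temp_c)
    return f_native * (1.0 - fu) + f_unfolded * fu


if __name__ == "__main__":
    # Parameters reported for CshA_RD1 in the text.
    fu0 = fraction_unfolded(0.0, dg_water=0.64, m_value=0.66)
    print(f"unfolded fraction of CshA_RD1 without urea: {fu0:.2f}")
    print(f"transition midpoint: {0.64 / 0.66:.2f} M urea")
```

      Under these assumptions roughly a quarter of CshA_RD1 molecules are unfolded in the absence of denaturant, consistent with the marginal stability described in the Discussion.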

      NMR spectroscopy

      NMR data sets were collected at 20 °C, utilizing a Varian VNMRS 600-MHz spectrometer with a cryogenic cold probe. All NMR data were processed using NMRPipe (Delaglio et al., NMRpipe: a multidimensional spectral processing system based on Unix pipes). 1H-15N HSQC, HNCACB, CBCA(CO)NH, HNCA, HNHA, HN(CO)CA, HNCO, HN(CA)CO, C(CO)NH, HCCH-TOCSY, 15N-TOCSY-HSQC, 15N-NOESY-HSQC, 13C-NOESY-HSQC, and aromatic 13C-NOESY-HSQC (150-ms mixing time) experiments were collected. A hydrogen–deuterium exchange experiment was conducted by recording 1H-15N HSQC experiments at several intervals following dissolution of freeze-dried protein in D2O. Two-dimensional 1H-1H TOCSY and NOESY experiments were recorded on the fully exchanged protein sample. 15N-NOESY-HSQC and 13C-NOESY-HSQC spectra (150-ms mixing time) were also recorded at 20 °C on a Varian INOVA 900-MHz spectrometer with a cryogenic cold probe (Henry Wellcome Building for NMR, University of Birmingham). Proton chemical shifts were referenced with respect to the water signal relative to DSS. Spectra were assigned using CcpNmr Analysis 2.4 (Vranken et al., The CCPN data model for NMR spectroscopy: development of a software pipeline). Structure calculations were conducted using ARIA 2.3 (Rieping et al., ARIA2: automated NOE assignment and data integration in NMR structure calculation). Twenty structures were calculated at each iteration except iteration 8, in which 200 structures were calculated. The 20 lowest energy structures from this iteration went on to be water-refined, and the 15 lowest energy structures were chosen as a representative ensemble. Network anchoring was used during iterations 0, 1, and 2, and all iterations were corrected for spin diffusion (Linge et al., Correction of spin diffusion during iterative automated NOE assignment). Two cooling phases, each with 8000 steps, were used. Torsion angle restraints were calculated using both TALOS+ (Shen et al., TALOS plus: a hybrid method for predicting protein backbone torsion angles from NMR chemical shifts) and DANGLE (Cheung et al., DANGLE: a Bayesian inferential method for predicting protein backbone dihedral angles and secondary structure). Restraints were included for residues where both programs gave an unambiguous result in the same area of the Ramachandran plot. The restraints were based on those provided by DANGLE but extended if the TALOS+ restraints went beyond these. This process resulted in slightly fewer, looser restraints than either program on its own but aimed to reduce the number of over-restrained angles. χ1 angle restraints were introduced for Val2107 and Val2109 because the orientation of these side chains was clearly defined by their NOE pattern, although the selection of structures based on global energy scores meant that not all structures resulted in these orientations unless these restraints were introduced. The hydrogen–deuterium exchange experiment showed 28 NH groups to be protected after 8 min, including two Gln side-chain amides (see Fig. S4). In addition, NOEs were observed to a ThrHγ1 hydrogen, suggesting that this was also involved in a hydrogen bond. Initial structure calculations were conducted without hydrogen bond restraints. Hydrogen bond donors were then identified, and corresponding hydrogen bond restraints were included in later calculations. Structures were validated using the Protein Structure Validation Software suite 1.5 (Bhattacharya et al., Evaluating protein structures determined by structural genomics consortia) and CING (Doreleijers et al., CING: an integrated residue-based structure validation program suite).
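      The rule used to combine the DANGLE and TALOS+ torsion angle restraints can be sketched as below. The sketch treats "the same area of the Ramachandran plot" as overlapping angle ranges and ignores wrap-around at ±180°. Both are simplifying assumptions rather than details given in the text, and the function and type names are hypothetical.

```python
from typing import Optional, Tuple

# (lower, upper) bounds in degrees, with lower <= upper and no wrap-around.
AngleRange = Tuple[float, float]


def combine_restraints(dangle: Optional[AngleRange],
                       talos: Optional[AngleRange]) -> Optional[AngleRange]:
    """Combine per-angle restraints from two predictors.

    Returns None when either program gave no unambiguous prediction or when
    the two ranges do not overlap (assumed to mean they disagree). Otherwise
    the DANGLE range is widened wherever the TALOS+ range extends beyond it.
    """
    if dangle is None or talos is None:
        return None
    if dangle[1] < talos[0] or talos[1] < dangle[0]:
        return None
    return (min(dangle[0], talos[0]), max(dangle[1], talos[1]))


if __name__ == "__main__":
    # Hypothetical phi restraints for a single residue.
    print(combine_restraints((-80.0, -50.0), (-90.0, -60.0)))
```

      Because a restraint is kept only when both predictors agree and is never narrower than the DANGLE range, a rule of this kind yields fewer and looser restraints than either program alone, as described above.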

      Author contributions

      C. R. B., V. A. H., K. L., V. V. P., A. E. P., D. F., S. G. B., M. P. C., A. H. N., and P. R. R. formal analysis; C. R. B., K. L., and M. P. C. validation; C. R. B., V. A. H., K. L., V. V. P., A. E. P., S. G. B., and M. P. C. investigation; C. R. B., V. A. H., K. L., H. F. J., S. G. B., M. P. C., A. H. N., and P. R. R. methodology; C. R. B., V. A. H., K. L., V. V. P., A. E. P., D. F., H. F. J., S. G. B., M. P. C., A. H. N., and P. R. R. writing-original draft; C. R. B., V. A. H., K. L., V. V. P., A. E. P., D. F., H. F. J., M. P. C., A. H. N., and P. R. R. writing-review and editing; V. A. H., M. P. C., and P. R. R. data curation; D. F., H. F. J., S. G. B., M. P. C., A. H. N., and P. R. R. conceptualization; H. F. J., S. G. B., M. P. C., A. H. N., and P. R. R. supervision; H. F. J., A. H. N., and P. R. R. funding acquisition; H. F. J. and P. R. R. project administration.

      Acknowledgments

      We thank Dr. Sara Whittaker (University of Birmingham) and Dr. Roz Ellis (University of Bristol) for assistance with NMR data collection, Dr. Paul Curnow (University of Bristol) for assistance with CD data analysis, and Dr. Robert Rambo (Diamond Light Source) for assistance with SAXS data collection.

      Supplementary Material

      References

        • Green J.L.
        • Bohannan B.J.
        • Whitaker R.J.
        Microbial biogeography: from taxonomy to traits.
        Science. 2008; 320 (18497288): 1039-1043
        • Martiny J.B.
        • Bohannan B.J.
        • Brown J.H.
        • Colwell R.K.
        • Fuhrman J.A.
        • Green J.L.
        • Horner-Devine M.C.
        • Kane M.
        • Krumins J.A.
        • Kuske C.R.
        • Morin P.J.
        • Naeem S.
        • Ovreås L.
        • Reysenbach A.L.
        • Smith V.H.
        • et al.
        Microbial biogeography: putting microorganisms on the map.
        Nat. Rev. Microbiol. 2006; 4 (16415926): 102-112
        • Kline K.A.
        • Fälker S.
        • Dahlberg S.
        • Normark S.
        • Henriques-Normark B.
        Bacterial adhesins in host-microbe interactions.
        Cell Host Microbe. 2009; 5 (19527885): 580-592
        • Pizarro-Cerdá J.
        • Cossart P.
        Bacterial adhesion and entry into host cells.
        Cell. 2006; 124 (16497583): 715-727
        • Wizemann T.M.
        • Adamou J.E.
        • Langermann S.
        Adhesins as targets for vaccine development.
        Emerg. Infect. Dis. 1999; 5 (10341176): 395-403
        • Klemm P.
        • Vejborg R.M.
        • Hancock V.
        Prevention of bacterial adhesion.
        Appl. Microbiol. Biotechnol. 2010; 88 (20694794): 451-459
        • Proft T.
        • Baker E.N.
        Pili in Gram-negative and Gram-positive bacteria: structure, assembly and their role in disease.
        Cell Mol. Life Sci. 2009; 66 (18953686): 613-635
        • Allen W.J.
        • Phan G.
        • Waksman G.
        Pilus biogenesis at the outer membrane of Gram-negative bacterial pathogens.
        Curr. Opin. Struct. Biol. 2012; 22 (22402496): 500-506
        • Kang H.J.
        • Coulibaly F.
        • Clow F.
        • Proft T.
        • Baker E.N.
        Stabilizing isopeptide bonds revealed in gram-positive bacterial pilus structure.
        Science. 2007; 318 (18063798): 1625-1628
        • Larson M.R.
        • Rajashankar K.R.
        • Patel M.H.
        • Robinette R.A.
        • Crowley P.J.
        • Michalek S.
        • Brady L.J.
        • Deivanayagam C.
        Elongated fibrillar structure of a streptococcal adhesin assembled by the high-affinity association of α- and PPII-helices.
        Proc. Natl. Acad. Sci. U.S.A. 2010; 107 (20231452): 5983-5988
        • Macintosh R.L.
        • Brittan J.L.
        • Bhattacharya R.
        • Jenkinson H.F.
        • Derrick J.
        • Upton M.
        • Handley P.S.
        The terminal A domain of the fibrillar accumulation-associated protein (Aap) of Staphylococcus epidermidis mediates adhesion to human corneocytes.
        J. Bacteriol. 2009; 191 (19749046): 7007-7016
        • Rego S.
        • Heal T.J.
        • Pidwill G.R.
        • Till M.
        • Robson A.
        • Lamont R.J.
        • Sessions R.B.
        • Jenkinson H.F.
        • Race P.R.
        • Nobbs A.H.
        Structural and functional analysis of cell wall-anchored polypeptide adhesin BspA in Streptococcus agalactiae.
        J. Biol. Chem. 2016; 291 (27311712): 15985-16000
        • Jameson M.W.
        • Jenkinson H.F.
        • Parnell K.
        • Handley P.S.
        Polypeptides associated with tufts of cell-surface fibrils in an oral Streptococcus.
        Microbiology. 1995; 141 (11550706): 2729-2738
        • Wu H.
        • Mintz K.P.
        • Ladha M.
        • Fives-Taylor P.M.
        Isolation and characterization of Fap1, a fimbriae-associated adhesin of Streptococcus parasanguis FW213.
        Mol. Microbiol. 1998; 28 (9632253): 487-500
        • Wu H.
        • Fives-Taylor P.M.
        Identification of dipeptide repeats and a cell wall sorting signal in the fimbriae-associated adhesin, Fap1, of Streptococcus parasanguis.
        Mol. Microbiol. 1999; 34 (10594831): 1070-1081
        • Froeliger E.H.
        • Fives-Taylor P.
        Streptococcus parasanguis fimbria-associated adhesin fap1 is required for biofilm formation.
        Infect. Immun. 2001; 69 (11254614): 2512-2519
        • McNab R.
        • Forbes H.
        • Handley P.S.
        • Loach D.M.
        • Tannock G.W.
        • Jenkinson H.F.
        Cell wall-anchored CshA polypeptide (259 kilodaltons) in Streptococcus gordonii forms surface fibrils that confer hydrophobic and adhesive properties.
        J. Bacteriol. 1999; 181 (10322009): 3087-3095
        • McNab R.
        • Holmes A.R.
        • Clarke J.M.
        • Tannock G.W.
        • Jenkinson H.F.
        Cell surface polypeptide CshA mediates binding of Streptococcus gordonii to other oral bacteria and to immobilized fibronectin.
        Infect. Immun. 1996; 64 (8926089): 4204-4210
        • Holmes A.R.
        • McNab R.
        • Jenkinson H.F.
        Candida albicans binding to the oral bacterium Streptococcus gordonii involves multiple adhesin-receptor interactions.
        Infect. Immun. 1996; 64 (8890225): 4680-4685
        • Jakubovics N.S.
        • Brittan J.L.
        • Dutton L.C.
        • Jenkinson H.F.
        Multiple adhesin proteins on the cell surface of Streptococcus gordonii are involved in adhesion to human fibronectin.
        Microbiology. 2009; 155 (19661180): 3572-3580
        • Back C.R.
        • Sztukowska M.N.
        • Till M.
        • Lamont R.J.
        • Jenkinson H.F.
        • Nobbs A.H.
        • Race P.R.
        The Streptococcus gordonii adhesin CshA protein binds host fibronectin via a catch-clamp mechanism.
        J. Biol. Chem. 2017; 292 (27920201): 1538-1549
        • Pakula R.
        • Walczak W.
        On the nature of competence of transformable streptococci.
        J. Gen. Microbiol. 1963; 31 (13941150): 125-133
        • Berrow N.S.
        • Alderton D.
        • Sainsbury S.
        • Nettleship J.
        • Assenberg R.
        • Rahman N.
        • Stuart D.I.
        • Owens R.J.
        A versatile ligation-independent cloning method suitable for high-throughput expression screening applications.
        Nucleic Acids Res. 2007; 35 (17317681): e45
        • Borgia M.B.
        • Borgia A.
        • Best R.B.
        • Steward A.
        • Nettels D.
        • Wunderlich B.
        • Schuler B.
        • Clarke J.
        Single-molecule fluorescence reveals sequence-specific misfolding in multidomain proteins.
        Nature. 2011; 474 (21623368): 662-665
        • Smith B.O.
        • Picken N.C.
        • Westrop G.D.
        • Bromek K.
        • Mottram J.C.
        • Coombs G.H.
        The structure of Leishmania mexicana ICP provides evidence for convergent evolution of cysteine peptidase inhibitors.
        J. Biol. Chem. 2006; 281 (16407198): 5821-5828
        • Gruszka D.T.
        • Wojdyla J.A.
        • Bingham R.J.
        • Turkenburg J.P.
        • Manfield I.W.
        • Steward A.
        • Leech A.P.
        • Geoghegan J.A.
        • Foster T.J.
        • Clarke J.
        • Potts J.R.
        Staphylococcal biofilm-forming protein has a contiguous rod-like structure.
        Proc. Natl. Acad. Sci. U.S.A. 2012; 109 (22493247): E1011-E1018
        • Gruszka D.T.
        • Whelan F.
        • Farrance O.E.
        • Fung H.K.
        • Paci E.
        • Jeffries C.M.
        • Svergun D.I.
        • Baldock C.
        • Baumann C.G.
        • Brockwell D.J.
        • Potts J.R.
        • Clarke J.
        Cooperative folding of intrinsically disordered domains drives assembly of a strong elongated protein.
        Nat. Commun. 2015; 6 (26027519): 7271
        • Gruszka D.T.
        • Mendonça C.A.
        • Paci E.
        • Whelan F.
        • Hawkhead J.
        • Potts J.R.
        • Clarke J.
        Disorder drives cooperative folding in a multidomain protein.
        Proc. Natl. Acad. Sci. U.S.A. 2016; 113 (27698144): 11841-11846
        • Formosa-Dague C.
        • Speziale P.
        • Foster T.J.
        • Geoghegan J.A.
        • Dufrêne Y.F.
        Zinc-dependent mechanical properties of Staphylococcus aureus biofilm-forming surface protein SasG.
        Proc. Natl. Acad. Sci. U.S.A. 2016; 113 (26715750): 410-415
        • Troffer-Charlier N.
        • Ogier J.
        • Moras D.
        • Cavarelli J.
        Crystal structure of the V-region of Streptococcus mutans antigen I/II at 2.4 Å resolution suggests a sugar preformed binding site.
        J. Mol. Biol. 2002; 318 (12054777): 179-188
        • Forsgren N.
        • Lamont R.J.
        • Persson K.
        Crystal structure of the variable domain of the Streptococcus gordonii surface protein SspB.
        Protein Sci. 2009; 18 (19609934): 1896-1905
        • Forsgren N.
        • Lamont R.J.
        • Persson K.
        Two intramolecular isopeptide bonds are identified in the crystal structure of the Streptococcus gordonii SspB C-terminal domain.
        J. Mol. Biol. 2010; 397 (20138058): 740-751
        • Larson M.R.
        • Rajashankar K.R.
        • Crowley P.J.
        • Kelly C.
        • Mitchell T.J.
        • Brady L.J.
        • Deivanayagam C.
        Crystal structure of the C-terminal region of Streptococcus mutans antigen I/II and characterization of salivary agglutinin adherence domains.
        J. Biol. Chem. 2011; 286 (21505225): 21657-21666
        • Förster S.
        • Apostol L.
        • Bras W.
        Scatter: software for the analysis of nano- and mesoscale small-angle scattering.
        J. Appl. Crystallogr. 2010; 43: 639-646
        • Delaglio F.
        • Grzesiek S.
        • Vuister G.W.
        • Zhu G.
        • Pfeifer J.
        • Bax A.
        NMRPipe: a multidimensional spectral processing system based on UNIX pipes.
        J. Biomol. NMR. 1995; 6 (8520220): 277-293
        • Vranken W.F.
        • Boucher W.
        • Stevens T.J.
        • Fogh R.H.
        • Pajon A.
        • Llinas M.
        • Ulrich E.L.
        • Markley J.L.
        • Ionides J.
        • Laue E.D.
        The CCPN data model for NMR spectroscopy: development of a software pipeline.
        Proteins. 2005; 59 (15815974): 687-696
        • Rieping W.
        • Habeck M.
        • Bardiaux B.
        • Bernard A.
        • Malliavin T.E.
        • Nilges M.
        ARIA2: automated NOE assignment and data integration in NMR structure calculation.
        Bioinformatics. 2007; 23 (17121777): 381-382
        • Linge J.P.
        • Habeck M.
        • Rieping W.
        • Nilges M.
        Correction of spin diffusion during iterative automated NOE assignment.
        J. Magn. Reson. 2004; 167 (15040991): 334-342
        • Shen Y.
        • Delaglio F.
        • Cornilescu G.
        • Bax A.
        TALOS+: a hybrid method for predicting protein backbone torsion angles from NMR chemical shifts.
        J. Biomol. NMR. 2009; 44 (19548092): 213-223
        • Cheung M.S.
        • Maguire M.L.
        • Stevens T.J.
        • Broadhurst R.W.
        DANGLE: a Bayesian inferential method for predicting protein backbone dihedral angles and secondary structure.
        J. Magn. Reson. 2010; 202 (20015671): 223-233
        • Bhattacharya A.
        • Tejero R.
        • Montelione G.T.
        Evaluating protein structures determined by structural genomics consortia.
        Proteins. 2007; 66 (17186527): 778-795
        • Doreleijers J.F.
        • Sousa da Silva A.W.
        • Krieger E.
        • Nabuurs S.B.
        • Spronk C.A.
        • Stevens T.J.
        • Vranken W.F.
        • Vriend G.
        • Vuister G.W.
        CING: an integrated residue-based structure validation program suite.
        J. Biomol. NMR. 2012; 54 (22986687): 267-283
        • Lüthy R.
        • Bowie J.U.
        • Eisenberg D.
        Assessment of protein models with 3-dimensional profiles.
        Nature. 1992; 356 (1538787): 83-85
        • Laskowski R.A.
        • MacArthur M.W.
        • Moss D.S.
        • Thornton J.M.
        Procheck: a program to check the stereochemical quality of protein structures.
        J. Appl. Crystallogr. 1993; 26: 283-291
        • Lovell S.C.
        • Davis I.W.
        • Arendall 3rd, W.B.
        • de Bakker P.I.
        • Word J.M.
        • Prisant M.G.
        • Richardson J.S.
        • Richardson D.C.
        Structure validation by Cα geometry: φ, ψ and Cβ deviation.
        Proteins Struct. Funct. Genet. 2003; 50 (12557186): 437-450