The Extracellular Protein Factor Epf from Streptococcus pyogenes Is a Cell Surface Adhesin That Binds to Cells through an N-terminal Domain Containing a Carbohydrate-binding Module*

Background: Epf is a multidomain cell surface protein from Streptococcus pyogenes. Results: Epf mediates adhesion to human epithelial cells through an N-terminal domain comprising two common binding modules. Conclusion: Epf is an adhesin with a novel binding-domain supported on a stalk built from tandem helical repeat domains. Significance: Adhesins such as Epf are important for colonization and infection by Streptococci. Streptococcus pyogenes is an exclusively human pathogen. Streptococcal attachment to and entry into epithelial cells is a prerequisite for a successful infection of the human host and requires adhesins. Here, we demonstrate that the multidomain protein Epf from S. pyogenes serotype M49 is a streptococcal adhesin. An epf-deficient mutant showed significantly decreased adhesion to and internalization into human keratinocytes. Cell adhesion is mediated by the N-terminal domain of Epf (EpfN) and increased by the human plasma protein plasminogen. The crystal structure of EpfN, solved at 1.6 Å resolution, shows that it consists of two subdomains: a carbohydrate-binding module and a fibronectin type III domain. Both fold types commonly participate in ligand receptor and protein-protein interactions. EpfN is followed by 18 repeats of a domain classified as DUF1542 (domain of unknown function 1542) and a C-terminal cell wall sorting signal. The DUF1542 repeats are not involved in adhesion, but biophysical studies show they are predominantly α-helical and form a fiber-like stalk of tandem DUF1542 domains. Epf thus conforms with the widespread family of adhesins known as MSCRAMMs (microbial surface components recognizing adhesive matrix molecules), in which a cell wall-attached stalk enables long range interactions via its adhesive N-terminal domain.

transcriptomics study of a Δnra mutant of GAS serotype M49 (11). Ralp3 was found to be encoded together with the two virulence factors, Eno (streptococcal surface enolase) and SagA (streptolysin S), and with Epf (extracellular protein factor) in the eno ralp3 epf sagA (ERES) genomic region. Of these genes, the eno sagA block is present in all GAS serotypes, whereas ralp3 and epf were shown to be restricted to serotypes M1, M4, M12, M28, and M49 (11). Interestingly, the novel protein Epf shows many hallmarks of a potential adhesin.
Like many members of the diverse family of bacterial adhesins called MSCRAMMs (microbial surface components recognizing adhesive matrix molecules) (12,13), the 205-kDa protein Epf has a number of conserved C-terminal repeat domains and a unique N-terminal domain (see Fig. 1). It also has an N-terminal signal sequence and a C-terminal LPXTG-type sortase recognition motif, which indicates that Epf is attached to the streptococcal cell wall (11). In Epf from serotype M49, 18 C-terminal repeat domains, classified as DUF1542s (domains of unknown function 1542; PFAM accession number PF07564, amino acids 387-1841), were identified. The N-terminal domain has no homology with any protein of known structure or function (11).
An Epf orthologue from GAS serotype M1 has also been described and called LSA (14). LSA has an N-terminal domain similar to Epf and 20 DUF1542 repeat domains. Although no function has been attributed to either Epf or LSA, it was shown that LSA is essential for GAS virulence in a mouse model of infection (14). Other streptococcal species such as Streptococcus suis also produce proteins with a domain structure similar to that of Epf (15), including a unique N-terminal domain and C-terminal DUF1542 repeats.
Here, we investigated the potential function of Epf as an adhesin, analyzed the respective role of the Epf domains, solved the crystal structure of the N-terminal domain of Epf, and characterized the DUF1542 repeats. To our knowledge, this is the first study providing structural insight into the widespread group of proteins containing DUF1542 repeats.
Eukaryotic Cell Adherence and Internalization-Adherence to and internalization into epithelial cells was quantified using the antibiotic protection assay (17). 24-well plates were inoculated with 2.5 × 10⁵ cells/well in DMEM without antibiotics. The cells were allowed to grow to confluence. For the assay, the cells were washed with DMEM and infected with GAS M49 wild type and mutant strains at a multiplicity of infection of 1:10 in DMEM. Two hours after infection, the cells were washed extensively with PBS, detached from the wells by trypsin treatment, and lysed with sterile distilled water. The viable counts of GAS (colony-forming units) released from the lysed cells were determined by serial dilution in PBS and plating on THY agar. For the assessment of bacterial internalization, 2 h after infection, the cells were washed with PBS and incubated with DMEM supplemented with penicillin (50 units/ml) and streptomycin (5 mg/ml) for an additional 2 h. Subsequently, the cells were washed and lysed, and the GAS viable counts were determined as described above.
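The viable-count arithmetic behind this assay can be sketched as follows. This is a minimal illustration and not the authors' analysis: the function names, the plated volume, the dilution factor, and the colony counts are all hypothetical and were chosen only to show how a plate count is converted back to colony-forming units per ml and how a mutant is expressed relative to the wild type.

```python
def cfu_per_ml(colonies: int, dilution_factor: float, plated_volume_ml: float) -> float:
    """Viable count of the undiluted lysate, back-calculated from one countable plate."""
    if dilution_factor <= 0 or plated_volume_ml <= 0:
        raise ValueError("dilution factor and plated volume must be positive")
    return colonies * dilution_factor / plated_volume_ml


def relative_percent(mutant_cfu: float, wild_type_cfu: float) -> float:
    """Mutant recovery expressed as a percentage of the wild-type recovery."""
    return 100.0 * mutant_cfu / wild_type_cfu


# Hypothetical plate counts (NOT data from the paper):
# a 10^4-fold dilution of the lysate, 0.1 ml spread per THY agar plate.
wild_type = cfu_per_ml(colonies=120, dilution_factor=1e4, plated_volume_ml=0.1)
mutant = cfu_per_ml(colonies=60, dilution_factor=1e4, plated_volume_ml=0.1)

print(f"wild type: {wild_type:.2e} CFU/ml")
print(f"mutant:    {mutant:.2e} CFU/ml "
      f"({relative_percent(mutant, wild_type):.0f}% of wild type)")
```

The same two functions would apply to both read-outs of the assay: counts taken directly after the 2-h infection (adherent plus internalized bacteria) and counts taken after the additional antibiotic incubation (internalized bacteria only).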
To measure the direct interaction of EpfN with epithelial cell lines, latex beads (carboxylate-modified, polystyrene, fluorescent yellow-green; Sigma) were used in adherence assays. Briefly, 10⁸ yellow fluorescent latex beads (1 μm) were incubated with 50 μl of purified proteins (100 μg/ml) in PBS overnight at 4°C. After washing steps, free binding sites on the bead surface were blocked by incubation in 200 μl of bovine serum albumin (10 mg/ml) for 1 h at room temperature. The beads were then washed again and suspended in DMEM without any supplementation. Beads exclusively treated with bovine serum albumin were used in all experiments as negative control. Cell lines were infected for 2 h under 5% CO₂ atmosphere with the seeding strategy of 35 beads/cell. After incubation the cells were washed with PBS, and the number of adherent beads and cells were counted and related to each other. To elucidate the effect of plasminogen (Plg) on Epf adherence, cell lines were also pretreated with human Plg (2 μg ml⁻¹) for 30 min. Unbound Plg was removed by washing the cells with PBS, and the cells were infected with beads as described above.

FIGURE 1. Domain structure of Epf from group A streptococcus serotype M49. The sequence range of each of the principal constructs used in this analysis, EpfN, EpfDUF1-4, EpfDUF1-16, and EpfNDUF1-3, is also shown.
Purification of EpfN for Crystallization-For crystallization, EpfN was purified by affinity chromatography using Strep-Tactin Superflow High Capacity beads (IBA GmbH) followed by size exclusion chromatography in buffer C (16). Selenomethionine-substituted EpfN (SeMet-EpfN) was produced in a similar manner to native EpfN, with the exception that 15 min before induction the following amino acids were added to the expression culture: 60 mg/liter L-selenomethionine, 100 mg/liter L-lysine, 100 mg/liter L-phenylalanine, 100 mg/liter L-threonine, 80 mg/liter L-isoleucine, 80 mg/liter L-leucine, 80 mg/liter L-valine, as previously described (18). SeMet-EpfN was purified in the same way as EpfN, with the addition of β-mercaptoethanol to the initial purification buffers and of 1 mM Tris(2-carboxyethyl)phosphine hydrochloride to buffer C.
Crystallization of EpfN-EpfN was crystallized using an in situ proteolysis protocol, in which EpfN mixed with chymotrypsin at a weight ratio of 1000:1 was subjected to hanging drop vapor diffusion against 20-25% (w/v) PEG3350 and 100-400 mM KCH₃COO or KCl (16). Crystals of SeMet-EpfN were obtained using a similar mixture of SeMet-EpfN with chymotrypsin in 20-25% (w/v) PEG3350 and 100-400 mM KCH₃COO, which was streak-seeded with native EpfN crystals using a cat whisker. In all crystallization experiments, 1 μl of EpfN (19.5-20 mg/ml in buffer C) was mixed with an equal volume of the reservoir solution and equilibrated at 18°C. For flash-cooling in liquid nitrogen, the crystals were cryo-protected with buffer C containing 30% (v/v) glycerol, 17.5% (w/v) PEG3350, and 140 mM KCH₃COO (KCH₃COO condition) or a mixture of 70% (v/v) ParatoneN and 30% (v/v) paraffin (KCl condition).
Structure Determination and Refinement-All of the data sets (Table 1) were collected at Beamline MX2 (03ID1) of the Australian Synchrotron (Victoria, Australia) and processed as described previously (16). The program suite AutoSHARP (19) was used to solve the structure of EpfN in a multiple-wavelength anomalous dispersion experiment with two data sets from a single SeMet-EpfN crystal (Selenium Remote and Selenium Inflection in Table 1). All four expected selenium sites were found (correlation coefficient of 0.289) using the SHELXC/D package (20), and phase information was successfully derived using SHARP (21) with a figure of merit of 0.33/0.13 (acentric/centric). The resulting electron density map was modified using SOLOMON (22). Using Arp/Warp (23), two EpfN chains were automatically built to 94% completeness. The EpfN model was refined using diffraction data from a crystal of native EpfN grown in KCH₃COO. Refinement was performed in iterative cycles of manual building with COOT (24) and maximum-likelihood refinement in autoBUSTER (25) with 2-fold noncrystallographic symmetry constraints. Water molecules were picked manually and automatically using autoBUSTER. The final EpfN model consists of two chains, A (residues 56-357) and B (residues 57-353). Chain A was then used as a search model for molecular replacement with PHASER (26) to solve the structure of EpfN-KCl, which has 4 molecules per asymmetric unit. The EpfN-KCl model was refined as above, using 4-fold noncrystallographic symmetry constraints. Full details of both refinements are included in Table 1. Model geometry was assessed using MolProbity (27). The figures were created using PyMol (28).
Structure and Sequence Analysis-The Protein Data Bank (PDB) was searched for homologous structures using SSM (29) and DALI (30). The crystal packing was analyzed using PISA (31).
CD Spectroscopy-For CD analysis, the construct DUF1-4 was dialyzed extensively against 10 mM phosphate buffer, pH 7.7. CD spectra were recorded at a DUF1-4 concentration of 1 μM in a 1-mm quartz cuvette at 20°C. The final spectrum is the average of 10 measurements; the spectrum for the buffer was deducted. The secondary structure composition was estimated using the SOMCD algorithm (32).
Electron Microscopy-EpfDUF1-16 (5 μl of a 0.1 mg/ml solution in buffer C) was adsorbed on to continuous carbon grids for 60 s. Sample solutions were blotted with Whatman #1 filter paper (Amersham Biosciences, GE Healthcare) and stained for 60 s with 20 μl uranyl acetate solution (1% w/v; Electron Microscopy Sciences, Hatfield, PA) followed by final blotting. Low dose EM was performed using a Tecnai 12 electron microscope (FEI, Hillsboro, Oregon), and images were recorded at a nominal magnification of 52,000× using an UltraScan 2k × 2k CCD camera (Gatan, Pleasanton, CA).
Small Angle X-ray Scattering (SAXS)-SAXS analysis was performed at the SAXS/WAXS Beamline of the Australian Synchrotron. The construct EpfN-DUF1-3 in buffer C was analyzed over a concentration range of 0.25-2.0 mg ml⁻¹ at 4°C. The data were recorded using a Pilatus 1M detector at a distance of 3.5 m. For absolute scaling, intensity data were normalized to water as standard. After buffer subtraction, the scattering curves were analyzed using PRIMUS (33) and AutoRG (34). The distance distribution function was calculated using GNOM (35). Sixteen dummy atom models were reconstructed using DAMMIF (36), of which fifteen were averaged with DAMAVER (37) and superimposed with the EpfN crystal structure using SUPCOMB13 (38). Further experimental details are listed in Table 2.
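The radius of gyration that programs such as AutoRG report rests on the Guinier approximation, I(q) ≈ I(0)·exp(−q²Rg²/3), which turns the low-angle part of the curve into a straight line when ln I is plotted against q². The sketch below only illustrates that relationship with a plain least-squares line; it is not the AutoRG algorithm, which also selects the fitting range automatically. The particle size, intensity scale, and q grid are assumed values and not data from this study.

```python
import math


def guinier_fit(q, intensity):
    """Fit ln I(q) = ln I0 - (Rg^2 / 3) * q^2 by least squares; return (Rg, I0)."""
    x = [qi * qi for qi in q]
    y = [math.log(ii) for ii in intensity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("non-negative slope: these points contain no Guinier region")
    return math.sqrt(-3.0 * slope), math.exp(intercept)


# Synthetic, noise-free curve for a hypothetical particle (NOT data from the paper).
TRUE_RG = 50.0  # Angstrom, assumed
TRUE_I0 = 1.0   # arbitrary units, assumed
q_values = [0.002 * k for k in range(1, 13)]  # inverse Angstrom; q*Rg stays <= 1.2
i_values = [TRUE_I0 * math.exp(-((qv * TRUE_RG) ** 2) / 3.0) for qv in q_values]

rg, i0 = guinier_fit(q_values, i_values)
print(f"Rg = {rg:.2f} A, I(0) = {i0:.3f}")
```

For real data the fit is usually restricted to small q·Rg, and for strongly elongated particles the usable range is narrower, which is one reason the distance distribution function from GNOM is evaluated alongside the Guinier estimate.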
EpfN Binding to Immobilized Plasminogen-Native Plg was purified from human plasma as previously described (39). For the pulldown assay, purified Plg was prebound to lysine-Sepharose (GE Healthcare) in buffer D (5 mM Tris-HCl, pH 8.0, 100 mM phosphate); EpfN_b was then added to Plg at 5-fold molar excess and incubated at 4°C in the same buffer overnight with mixing. Unbound material was removed by washing with buffer D, and bound material was eluted with 25 mM ε-amino caproic acid. Plg alone and EpfN_b alone were included as controls.
Surface Plasmon Resonance-The interactions between Plg and recombinant Epf were analyzed with a BIAcore3000 system (Biosensor, La Jolla, CA) using CM5 sensor chips as described before (40). Briefly, the ligand EpfN was immobilized on the flow cell surfaces of the chip to densities up to 1500 response units using standard amine-coupling chemistry and the software tool "Application Wizard-Surface Preparation" (BIAcore 3000 Instrument Handbook). Each analyte-ligand complex was allowed to associate and dissociate for 3 and 5 min, respectively, with background subtraction using an unmodified flow cell as reference surface. To collect binding data, the analytes, i.e. plasminogen, fibronectin, fibrinogen, and collagen-I dissolved in 10 mM HEPES, 150 mM NaCl, 3 mM EDTA, 0.005% P20, pH 7.4, were flowed over the ligand and reference surfaces at concentrations of 125 nM and at a flow rate of 30 μl min⁻¹. For concentration series, plasminogen was tested at 50, 125, 250, 500, 1000, and 2000 nM. The ligand surface was regenerated with a 15-s injection of 0.1% SDS at the end of each binding cycle. Three independent replicate experiments were carried out to obtain meaningful and reliable results. The data from the BIAcore sensorgrams were fitted globally, using the one-step biomolecular association reaction model (1:1 Langmuir binding with drifting base line), which resulted in optimum mathematical fits, reflected by low χ² values (<5).
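The 1:1 Langmuir model named above treats binding as A + B ⇌ AB, so the response R obeys dR/dt = ka·C·(Rmax − R) − kd·R during injection and decays as exp(−kd·t) afterwards. The sketch below evaluates the closed-form solutions for the concentration series used here. It is an illustration only: the rate constants and Rmax are placeholders (the fitted values are not quoted in this section), the drifting-baseline term is omitted, and no fitting is performed.

```python
import math

# Placeholder parameters, NOT the fitted values from the BIAevaluation software.
KA = 1.0e4        # association rate constant, M^-1 s^-1 (assumed)
KD_RATE = 2.8e-3  # dissociation rate constant, s^-1 (assumed)
RMAX = 100.0      # response at saturation, response units (assumed)


def dissociation_constant(ka: float, kd: float) -> float:
    """Equilibrium dissociation constant in M for a 1:1 interaction."""
    return kd / ka


def association_response(t: float, conc: float, ka: float = KA,
                         kd: float = KD_RATE, rmax: float = RMAX) -> float:
    """Response after t seconds of analyte injection at molar concentration conc."""
    k_obs = ka * conc + kd
    r_eq = rmax * ka * conc / k_obs
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation_response(t: float, r_start: float, kd: float = KD_RATE) -> float:
    """Response t seconds after the injection ends, starting from r_start."""
    return r_start * math.exp(-kd * t)


# Concentration series and contact times as stated in the text (3 min on, 5 min off).
for conc_nm in (50, 125, 250, 500, 1000, 2000):
    r_end = association_response(180.0, conc_nm * 1e-9)
    r_late = dissociation_response(300.0, r_end)
    print(f"{conc_nm:>5} nM: end of injection {r_end:6.2f} RU, "
          f"after dissociation {r_late:6.2f} RU")

print(f"Kd = kd/ka = {dissociation_constant(KA, KD_RATE):.2e} M")
```

A global fit, as described in the text, would adjust one shared ka and kd so that these curves match the sensorgrams at all concentrations simultaneously.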

Requirement of Epf for GAS Adherence and Internalization-Adherence to and internalization in human epithelial cells are essential steps for successful infection of the human host by GAS. We assayed the importance of Epf for these processes by infecting four different human epithelial cell lines for 2 h with wild type GAS serotype M49 and its epf-deficient mutant (epf⁻). The epf⁻ mutant showed significantly reduced adherence to human skin keratinocytes (HaCaT cells), by ~50% (Fig. 2A). GAS expresses a number of other adhesins, and this experiment shows that even on this background, Epf plays a significant role in adhesion. Similarly, the epf⁻ mutant showed decreased internalization rates for both HaCaT cells and the gingival epithelial cell line Ca9-22 (Fig. 2B). The internalization rate for HaCaT cells was low under the conditions used but is well in the range of previous studies (41). Both processes, adherence and internalization, were unaffected in the case of Detroit562 (pharyngeal) and HEp-2 (laryngeal) cells.
Adherence of Epf Domains to Epithelial Cell Lines-To identify the domain(s) conferring the adhesin activity on Epf, the constructs EpfN, EpfDUF1-4, and EpfDUF1-16 (Fig. 1) were used to coat yellow fluorescent latex beads, which were incubated with human epithelial cell lines for 2 h. Uncoated beads were used as control and did not display any cell binding. In additional experiments, the cells were treated with Plg before infection with the Epf constructs. After the infection the cells were stained with DAPI (Roche Applied Science) and inspected by fluorescence microscopy. The results are summarized in Table 3.
EpfN-coated beads were found to bind to both Ca9-22 and HEp-2 cells directly, whereas beads coated with the constructs EpfDUF1-4 and EpfDUF1-16, which lack EpfN, showed only background levels of adherence to any of the four tested cell lines (data only shown for HEp-2 cells). Pretreatment of the tested cell lines with Plg enabled EpfN-coated beads to adhere to HaCaT cells, enhanced their adherence to HEp-2 cells, and had no influence on their adherence to Ca9-22 and Detroit562 cells.
Structure of the N-terminal Domain of Epf-The N-terminal domain of Epf (EpfN) yielded diffracting crystals in the presence of the protease chymotrypsin, which trimmed the original EpfN construct (residues 45-386) to residues 52-357 in situ (16). Two different crystal forms were obtained depending on the presence of KCH₃COO or KCl. They were in space groups P2₁2₁2₁ and P2₁, with two and four molecules/asymmetric unit, respectively (Table 1). The structure of EpfN was solved using multiple-wavelength anomalous dispersion phases from a single crystal of SeMet-EpfN in the orthorhombic space group P2₁2₁2₁. The initial model was then refined against data for both crystal forms of native EpfN. In all chains, in both crystal forms, there is clear electron density for residues 57-352, whereas residues 52-56 and 353-357 appear to be subject to varying degrees of disorder. Otherwise, no significant difference can be observed for the EpfN structures from these two space groups; the two structures show root mean square (rms) differences between the six chains of 0.3-0.7 Å for all 296 residues. In the following, we use as the reference structure EpfN chain A of the P2₁2₁2₁ crystal form (EpfN-KCH₃COO in Table 1), which was refined at a resolution of 1.6 Å (R/Rfree = 15.0/18.2%). This molecule is almost complete, comprising residues 56-357. The EpfN molecule (Fig. 3) comprises two subdomains, EpfN1 (residues 56-251) and EpfN2 (residues 252-357), which together form an elongated shape with a length of ~85 Å. Subdomain EpfN1 is folded into two twisted antiparallel β-sheets, A and B, with seven β-strands each. β-Sheet B (β-strands d, e, f, h, k, l, and n) forms half a barrel, which wraps around the almost complete barrel of β-sheet A (β-strands a, b, c, g, i, j, and m). Subdomain EpfN2 is folded as a β-sandwich of two antiparallel β-sheets, C and D, with three and four β-strands respectively (C: β-strands o, p, and s, D: β-strands q, r, t, and u). Three short α-helices decorate loops between the β-strands in both domains. The interface between the two domains EpfN1 and EpfN2 appears to be stabilized by an extensive hydrogen bond network that centers on the residues His-149, Arg-279, and Gln-334. Several ligands derived from the crystallization buffers were observed in the two EpfN structures, although most binding sites are occupied only in a single chain, e.g., by a potassium ion, glycerol, and an acetate ion.

TABLE 3 Adherence of latex beads coated with the Epf constructs EpfN, EpfDUF1-4, and EpfDUF1-16 to human epithelial cell lines
The p values were calculated using the U test. Plg, pretreatment of cells with plasminogen.

Recombinant EpfN is monomeric in solution, as judged by its retention volume in size exclusion chromatography (data not shown). The program PISA (31), which analyzes crystal packing contacts, gives a score of 0 for the likelihood that the crystal packing contacts imply oligomer formation.

Structural Relationships of the N-terminal Domain of Epf-A search of the PDB for structural homologues of the full EpfN structure, using the programs SSM (29) and DALI (30), yielded no significant hits. The use of each EpfN subdomain in isolation, however, led to the identification of a number of proteins with similar folds, despite no significant sequence identity (Table 4 and Fig. 4).
Subdomain EpfN1 is structurally similar to a number of carbohydrate-binding modules (CBMs) with rms differences of 2.7-3.4 Å over 110-134 aligned Cα positions (Table 4). In general, CBMs are noncatalytic domains of carbohydrate-modifying enzymes (42). EpfN1 is also similar to the ligand-binding domain of the human ephrin receptor (PDB entry 3NRU). Although sharing the same basic fold, EpfN1 does, however, have distinct differences from the β-sandwich fold CBMs and the ephrin receptor because of the presence in EpfN1 of additional loops and β-strands, such as β-strands c, d, and j and their associated loops. Strands c and j protrude from β-sheet A, which usually forms a flat or concave ligand-binding site in a typical CBM (Fig. 4, A and C).
The fold of the EpfN2 subdomain conforms to that of fibronectin type III domains. EpfN2 can be superimposed on the first type III domain of fibronectin (PDB 2HA1) (43) with an rms difference of 2.42 Å over 76 aligned Cα positions (Table 4 and Fig. 4B). Superpositions of the structures of fibronectin type III-type domains from other proteins such as the signaling receptor CDO or the neural cell adhesion molecule NCAM yield rms differences in a similar range (Table 4).
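The rms differences quoted for these superpositions are root mean square distances over the paired Cα positions. A minimal sketch of that final step is given below. It assumes the two coordinate sets have already been aligned residue by residue and superimposed in a common frame, which is the part that SSM and DALI actually perform and which is not reproduced here. The coordinates are hypothetical and were chosen so that the result can be checked by hand.

```python
import math


def rms_difference(coords_a, coords_b):
    """Root mean square distance (in the input units) over paired, superimposed atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Three hypothetical C-alpha pairs in Angstrom (NOT taken from any PDB entry);
# each pair is displaced by exactly 1 A along a different axis.
structure_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 1.0), (3.8, 1.0, 0.0), (8.6, 0.0, 0.0)]

rms = rms_difference(structure_a, structure_b)
print(f"rms difference over {len(structure_a)} aligned positions: {rms:.2f} A")
```

Because the value depends on how many positions are included, an rms difference is only meaningful together with the number of aligned Cα positions, which is why both are reported in Table 4.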
Structure of the DUF1542 Repeats-Although the EpfDUF1-4 construct, which comprises four DUF1542 repeats, could be expressed in soluble form, it has not yet been possible to obtain crystals. We therefore turned to CD spectroscopy. The CD spectrum of DUF1-4 proved to be typical of an α-helical protein (Fig. 5). Using the SOMCD algorithm (32), this construct is estimated to consist of more than 90% α-helical structure. In analogy to adhesins such as the M protein, we find increasing conservation of the DUF1542 repeats toward the C terminus of Epf. In contrast to the M protein, however, which forms a coiled-coil dimer, size exclusion chromatography-multiple-angle laser light scattering analysis showed that the EpfDUF1-4 construct is monomeric (data not shown).
To gain further structural information on the C-terminal DUF1542 repeat region, we used EM and SAXS analysis. The construct EpfDUF1-16, comprising 16 DUF1542 repeats, was analyzed by negative staining EM, which revealed long, flexible, very thin fibrous structures (Fig. 6). Their apparent dimensions are 50-60 nm in length and 6 nm in thickness. Within a fiber, globular domains are recognizable. Thus, the DUF1542 repeats seem to form a series of small globular domains, arranged in tandem, to give a fiber-like stalk.
... (43), in black, and human CDO (3d1m) (50), in brown. In each case, the fold is represented as a Cα trace in stereo; structures were aligned using SSM (31). C, topology diagrams of EpfN1, Epha7, and XG34. Aligned β-strands are marked in the same colors, and additional β-strands are colored black. For clarity, decorating α-helices, which are present in all three structures, were omitted, and the lengths of β-strands are not to scale.

TABLE 4 Structural homologues of the two EpfN domains EpfN1 and EpfN2
The Protein Data Bank (PDB) was searched using SSM (29), and the most closely related structures are listed here with their ligand (if reported). rmsd, root mean square deviation; Nalg, number of aligned Cα positions; SI, sequence identity; NA, not applicable.

The construct EpfN-DUF1-3, which encompasses the EpfN domain and the first three DUF1542 repeats, was analyzed using SAXS (Table 2 and Fig. 7). The distance distribution function (Fig. 7C) is typical of extended multidomain proteins. An averaged dummy atom model was reconstructed for EpfN-DUF1-3 and shows an extended structure of ~200 Å (Fig. 7D). Although this structure is mostly featureless, it is notable that there is a thicker part (~90 Å long) at one end. The EpfN crystal structure could be superimposed on this thick part of the model (Fig. 7D), leaving ~100-120 Å for the three DUF1542 repeats. Thus, by combining the SAXS model of EpfN-DUF1-3 with the EM data for DUF1-16, Epf can be visualized as a fiber-like structure with the adhesion domain EpfN at one end.

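The SAXS and EM dimensions reported for the stalk can be cross-checked with simple arithmetic. The sketch below divides the ~100-120 Å that the SAXS model leaves for three DUF1542 repeats into a per-repeat length and extrapolates it to the 16 repeats of EpfDUF1-16, for comparison with the 50-60 nm measured by EM. It assumes that every repeat contributes the same length and that the repeats are arranged end to end in a straight line; both are simplifications, since the EM images show flexible fibers.

```python
# Values quoted in the text.
SAXS_STALK_LENGTH_A = (100.0, 120.0)  # Angstrom, three DUF1542 repeats (SAXS model)
SAXS_REPEATS = 3
EM_LENGTH_NM = (50.0, 60.0)           # nm, EpfDUF1-16 fibers (negative-stain EM)
EM_REPEATS = 16


def per_repeat_length(total_range_a, repeats):
    """Length per repeat in Angstrom, assuming equal contributions."""
    return tuple(length / repeats for length in total_range_a)


def extrapolated_length_nm(per_repeat_range_a, repeats):
    """End-to-end length in nm for a straight chain of identical repeats."""
    return tuple(length * repeats / 10.0 for length in per_repeat_range_a)


def ranges_overlap(first, second):
    """True when two (low, high) intervals share at least one value."""
    return first[0] <= second[1] and second[0] <= first[1]


per_repeat = per_repeat_length(SAXS_STALK_LENGTH_A, SAXS_REPEATS)
predicted = extrapolated_length_nm(per_repeat, EM_REPEATS)

print(f"per repeat: {per_repeat[0]:.1f}-{per_repeat[1]:.1f} A")
print(f"predicted for {EM_REPEATS} repeats: {predicted[0]:.1f}-{predicted[1]:.1f} nm")
print(f"overlaps EM estimate of {EM_LENGTH_NM[0]:.0f}-{EM_LENGTH_NM[1]:.0f} nm: "
      f"{ranges_overlap(predicted, EM_LENGTH_NM)}")
```

Under these assumptions the two independent measurements give overlapping length estimates, which fits the picture of a stalk built from tandem repeats of similar size.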
Binding of EpfN to Plasminogen-Following previous reports that Epf might bind to the human plasma protein Plg (11), we further assessed this potential interaction using pulldown assays. Plg can be immobilized on lysine-Sepharose through its kringle domains and eluted with the lysine analog ε-amino caproic acid (Fig. 8A). EpfN alone did not bind to lysine-Sepharose. However, when EpfN was incubated with Plg that was immobilized on lysine-Sepharose, both proteins then co-eluted after the addition of ε-amino caproic acid (Fig. 8A). These results indicate that immobilized plasminogen is able to bind EpfN.
Quantification of the EpfN-Plasminogen Interaction-The binding strength between EpfN and Plg was quantified by surface plasmon resonance measurements employing the BIAcore system. For this real time biospecific interaction analysis, Plg was used as soluble analyte and EpfN as immobilized ligand. Increasing concentrations of the analyte were allowed to associate with EpfN, immobilized as the ligand on the CM5 chip. The sensorgram of this interaction was recorded (Fig. 8B), and the data obtained were used to calculate the dissociation constant (Kd; see also Fig. 8C). Given that typical Kd values for biologically significant interactions are in the nanomolar to low millimolar range, the determined Kd value of 0.28 μM for the EpfN-Plg interaction implies a biologically relevant interaction. Other proteins such as fibronectin, fibrinogen, and collagen-I did not bind to EpfN in the BIAcore system (data not shown).
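To put the dissociation constant in context, the fraction of immobilized EpfN occupied at equilibrium in a simple 1:1 interaction is C/(C + Kd), where C is the free analyte concentration. The sketch below evaluates this for the plasminogen concentrations used in the concentration series, taking the reported Kd of 0.28 μM (280 nM). It assumes ideal 1:1 binding at equilibrium and no depletion of the analyte; the function name is illustrative.

```python
KD_NM = 280.0  # reported dissociation constant, 0.28 micromolar expressed in nM
PLG_CONCENTRATIONS_NM = (50, 125, 250, 500, 1000, 2000)  # series used for SPR


def fraction_bound(conc_nm: float, kd_nm: float = KD_NM) -> float:
    """Equilibrium occupancy of the ligand for a 1:1 interaction."""
    if conc_nm < 0 or kd_nm <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return conc_nm / (conc_nm + kd_nm)


for conc in PLG_CONCENTRATIONS_NM:
    print(f"{conc:>5} nM Plg: {100.0 * fraction_bound(conc):5.1f}% of EpfN occupied")
```

By definition half of the ligand is occupied when the analyte concentration equals Kd, so the chosen series brackets the half-saturation point from well below to well above.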

DISCUSSION
One of the earliest events of the infection process is the attachment of bacteria to epithelial cells, mediated by surface proteins that serve as adhesins. Group A streptococci are known to produce a plethora of adhesins, which vary among different serotypes. Examples include the M-protein, which is the major cell wall-associated virulence factor (2), adhesive pili that are assembled by the action of sortase enzymes (44), and the fibronectin-binding protein FbaB (40). These are all anchored covalently to the cell wall by the action of sortases that recognize a characteristic LPXTG sorting motif. Understanding their roles and mechanisms of action is essential for understanding disease but is made more difficult by the exposure of these proteins to human immune surveillance. This typically results in wide sequence variation that makes it difficult to recognize structural and functional relationships from sequence alone.
Here, we have examined the structure and function of the putative streptococcal adhesin Epf. Epf and its homologues are found in serotypes M1, M4, M12, M28, and M49 and share several features that suggest they are adhesins. They are encoded in the so-called ERES pathogenicity island, adjacent to the transcriptional regulator ralp3 (11), they have an LPXTG sequence motif near the C terminus that implies covalent attachment to the cell wall, and they have a domain structure consisting of a unique N-terminal domain followed by a large number of conserved repeat domains, classified as DUF1542 repeats. This domain arrangement is analogous to that of many adhesins (45).
We have shown here that Epf is indeed an adhesin that is important for adhesion of GAS serotype M49 (strain 591) to human skin keratinocytes; an epf knock-out leads to a ~50% decrease in adherence to these cells (Fig. 2A). Epf also appears to promote internalization into human skin keratinocytes and into the gingival cell line Ca9-22 (Fig. 2B). Interestingly, adhesion to other epithelial cell lines of pharyngeal or laryngeal origin was unaffected in the epf knock-out. The GAS strain 591 was originally isolated from a patient with a GAS skin infection, and in this clinical context, Epf on serotype M49 GAS may play a specific role in tissue tropism by conferring adherence to human skin.
Dissection of Epf into its N-terminal domain (EpfN) and constructs comprising varying numbers of the C-terminal DUF1542 repeats identifies EpfN as the adhesin domain. Only EpfN adhered to the human cell lines tested, namely to Ca9-22 (gingival) and HEp-2 (laryngeal) cells. In contrast, the DUF1542 repeats show no evidence of any involvement in adhesion. They appear to have a structural role, and the data from EM and SAXS analysis show that they form a fiber-like structure that supports the EpfN domain at its distal end.
This observation, that Epf consists of a ligand-binding N-terminal domain supported on a stalk formed by the C-terminal DUF1542 repeats, confirms the proposed similarities in overall architecture between Epf and other cell surface adhesins. Examples include the multidomain adhesin Cna from Staphylococcus aureus, which has a collagen-binding A domain near the N terminus, supported by a series of repetitive B domains (12,13), and the pili expressed by many Gram-positive pathogens, which have their adhesin domains at their tip, supported by a polymeric shaft formed from multiple Ig-like domains (46).
The crystal structure of the EpfN domain reveals striking structural homologies that were not apparent from its sequence. EpfN is folded into two subdomains, EpfN1 and EpfN2, which, in contrast to the DUF1542 repeats, are primarily formed from β-strands. The EpfN1 subdomain shows a close structural similarity to the β-sandwich class CBMs (Table 4 and Fig. 4A), which are usually found as binding domains associated with carbohydrate-modifying enzymes such as cellulases (42). The EpfN2 subdomain has the fibronectin type III fold, found in many human proteins that mediate protein-protein interactions including fibronectin itself and the neural cell adhesion molecule (Table 4 and Fig. 4B). Thus, both EpfN1 and EpfN2 display structural homology to domains that commonly participate in receptor-ligand interactions, in full support of the role EpfN evidently plays in GAS adhesion.
How might EpfN adhere to cells? The similarity of the EpfN1 subdomain to CBM domains suggested that it could bind extracellular carbohydrates on the host cell surface. However, a glycan array screen performed by the Functional Glycomics Consortium yielded no potential ligands (data not shown). It is possible that the chosen experimental conditions prevented the identification of a ligand or that this ligand was not included in the array. We note, however, that most classic CBM domains bind carbohydrate ligands in the concave face of β-sheet A (42). In the case of EpfN1, however, this face seems to be occluded by the additional β-strands c and j from β-sheet A (Fig. 4A). Moreover, the carbohydrate binding sites of CBMs almost always include an associated metal ion (42), whereas EpfN1 contains no bound metal. Some CBMs do, however, bind ligands at the edge of β-sheets A and B, as observed for the ephrin receptor (47) and some classical CBMs (48). A potential candidate for such a binding site could be the edge of β-sheet B, where the surface-exposed residues Trp-104, Tyr-114, and Arg-125 are located; aromatic side chains are strongly associated with carbohydrate binding (42).
It is also possible that EpfN recognizes a protein ligand. Protein binding by an EpfN1-like domain has been observed for the ephrin receptor A7 (Epha7) ligand-binding domain (47), and fibronectin type III domains like EpfN2 are well known for their ability to bind protein ligands. An ability to bind protein ligands is supported by the demonstrated ability of Epf to bind to the human plasma protein Plg.
Pretreatment with Plg increases adherence of GAS to, and invasion into, HaCaT keratinocytes, through integrin-mediated pathways (10). In our present studies with recombinant EpfN, we observed that the pretreatment of human epithelial cells with Plg enhanced binding to HEp-2 cells and enabled binding to HaCaT cells by recombinant EpfN (Table 3). We further showed that lysine-immobilized Plg bound to EpfN in a pulldown assay (Fig. 8). The interaction between Epf and Plg is of medium strength with a Kd of 0.28 μM. However, this interaction was not strong enough to be detected using analytical size exclusion chromatography (no co-elution; data not shown). It seems possible that the significant effect of Plg on the adherence of recombinant EpfN to epithelial cells involves the activation of other receptors on the host cells or requires additional factors.
In conclusion, we have shown that Epf is an adhesin of group A streptococcus. Adhesion is mediated by an N-terminal domain EpfN that is built from two subdomains, both of which have folds that are typically involved in receptor-ligand interactions. These adhesin domains are projected away from the streptococcal cell wall toward the host receptor by a stalk formed from α-helical DUF1542 repeats, which form a long thin fiber-like structure. This would enable long range interactions with host receptors and ensure protrusion of the adhesin domain beyond the extracellular carbohydrate. DUF1542 repeats are found in many surface-exposed proteins from Gram-positive bacteria, and it is likely that other proteins from a variety of Gram-positive bacteria may also have an N-terminal domain similar to EpfN. However, structural data may be needed to establish their homology because of the high sequence variation typical for N-terminal domains of cell wall-attached proteins.

FIGURE 8. Interaction between EpfN and human plasminogen. A, binding of EpfN to immobilized plasminogen in a pulldown assay. Plg was immobilized on lysine-Sepharose and could be eluted with ε-amino caproic acid (lanes 4-7). EpfN alone did not bind to lysine-Sepharose (lanes 8-11) but co-eluted with plasminogen (lanes 12-15). Lanes 1 and 2 show pure plasminogen and EpfN samples. S, protein standards; SM, starting material; UB, unbound material after incubation with lysine-Sepharose; E1 and E2, fractions eluted with ε-amino caproic acid. B and C, quantification of the interaction of Plg (analyte) with immobilized EpfN (ligand) using surface plasmon resonance measurements. B, representative profile of the relative surface plasmon resonance responses for the association and dissociation of different analyte concentrations (50-2000 nM). Injection of the analyte started at time 0 and ended after 180 s, whereupon the dissociation phase was documented for at least 270 s. Shown is one representative concentration series out of three independent experiments. C, the association and dissociation data of the interaction were fitted globally using the one step biomolecular reaction model (1:1 Langmuir binding with drifting base-line model: A + B ⇌ AB), which resulted in optimum mathematical fits reflected by the lowest χ² values (0.3-5). The values for association rate (ka), dissociation rate (kd), association constant (Ka), and dissociation constant (Kd) were calculated from the binding data with the BIAevaluation software. The mean values from three independent experiments are presented in the table ± standard deviation.