A Topological Model of the Baseplate of Lactococcal Phage Tuc2009*

Phages infecting Lactococcus lactis, a Gram-positive bacterium, are a recurrent problem in the dairy industry. Despite their economic importance, knowledge of these phages, which belong mostly to the Siphoviridae, lags behind that accumulated for members of the Myoviridae. The three-dimensional structures of the receptor-binding proteins (RBPs) of three lactococcal phages have been determined recently, illustrating their modular assembly and indicating the nature of their bacterial receptor. These RBPs are attached to the baseplate, a large phage organelle located at the tip of the tail. The Tuc2009 baseplate is formed by the products of six open reading frames, including the RBP. Because phage binding to its receptor induces DNA release, it has been postulated that the baseplate might be the trigger for DNA injection. We embarked on a structural study of the lactococcal phage baseplate, ultimately to gain insight into the triggering mechanism following receptor binding. Structural features of the Tuc2009 baseplate were established using size exclusion chromatography coupled to on-line UV-visible absorbance, light scattering, and refractive index detection (MALS/UV/RI). Combining the results of this approach with literature data led us to propose a “low resolution” model of the Tuc2009 baseplate. This model will serve as a knowledge base for submitting relevant complexes to crystallization trials.

In tailed Siphoviridae bacteriophages (Caudovirales order), which infect Lactococcus lactis and belong to the P335 group (4), the RBP is positioned in an organelle at the tip of the tail, the baseplate (5-7). Baseplates of phages TP901-1 and Tuc2009 have been the subject of thorough investigations (5-7), which have led to a consensus model for baseplate assembly in these two phages. Recently, the x-ray structure of the RBP from phage TP901-1 was determined (8). Comparisons of RBP structures from phages p2 and bIL170, which belong to the so-called 936 group, led us to propose that lactococcal phage RBPs consist of three domains or modules: the shoulders, neck, and head. The modules can be exchanged among phages, which may lead to an alteration in host specificity (9,10).
Phages TP901-1 and Tuc2009 share ~78-97% amino acid identity in the individual protein components that make up their baseplate, which is formed by 5 and 6 proteins, respectively (Table 1): the tape measure protein (TMP), the distal tail protein (Dit), the tail-associated lysozyme (Tal), the upper baseplate protein (BppU), and the lower baseplate protein, which acts as the RBP (BppL). An additional protein is present in the baseplate of Tuc2009, which was designated the baseplate-associated protein (BppA). TMPs are generally the longest ORF products observed in bacteriophages. According to secondary structure predictions, these proteins exist as a long continuous helix, with the exception of a small C-terminal section that contains both α-helices and β-strands. It has been shown that TMP length governs the size of the tail of the phage (11), suggesting that its long helix is embedded in the tail and runs as a length-determining connection between baseplate and portal protein. This architecture was seen for the first time in the structure of the SPP1 phage (12). The Tal protein is composed of two domains, which can be released by self-cleavage at a consensus site (13). The N-terminal domain is believed to participate in the assembly of the baseplate structure, whereas the C-terminal domain carries a peptidoglycanase activity. In contrast, little is known about Dit, besides its suggested role in orchestrating the formation of the initiator complex (IC) of the baseplate structure (7). Together with TMP and Tal, Dit is thought to form the IC, the core of the baseplate, onto which BppU and BppL, the latter being the antireceptor, are plugged (5-7). It has been shown by transmission EM on TP901-1 that BppU forms the upper disk of the baseplate and is required for the fixation of BppL, which represents the lower disk (5,6). Therefore BppU and BppL should be expected to interact directly.
BppL is the RBP that specifically recognizes the host, as demonstrated by experiments in which BppL from TP901-1 was swapped with that of Tuc2009, resulting in host specificity exchange (11). As revealed by the x-ray structure of the trimeric molecule, the first ~30 residues at the N terminus (the shoulders) of BppL form a 3 α-helix bundle followed by an interlaced β-prism domain (the neck) (8). These N-terminal domains share ~64% sequence identity between both phages. The helix bundle (the shoulders) forms the domain probably involved in the interaction with BppU, and the sequence similarity observed between this domain in both phages explains why the swapping experiment was successful. The N-terminal domain is followed by the head domain, which recognizes the host's receptor. In TP901-1, as well as in the p2 or bIL170 phages, it is a trimeric arrangement of double Greek key domains (8-10). However, the head domains of TP901-1 and Tuc2009 share no sequence identity at all, and therefore, the three-dimensional structure of the Tuc2009 head domain remains elusive. Secondary structure predictions are compatible with the Tuc2009 RBP head also being a double Greek key domain. Finally, BppA from Tuc2009 has no counterpart in the TP901-1 baseplate. It has been proposed to be a partner of BppU and/or BppL as an "accessory" protein (7).
Recently, the tail tip of the bacteriophage SPP1, a member of the Siphoviridae infecting Bacillus subtilis, has been described in a cryo-EM study (12). This study reveals the arrangement of the TMP, Dit, and Tal proteins, aligned in this order. Interestingly, the last 260 residues of TMP forming the cap, Dit, and the Tal N terminus share significant identity (20-30%) with the corresponding proteins in Tuc2009 or TP901-1 (12). The tail-tip structure of SPP1 therefore provides interesting resemblances to the proposed initiation complex of Tuc2009 and TP901-1. In this study, all proteins of the Tuc2009 tail and baseplate (except TMP) were expressed and purified.
Determining the molecular weights of proteins or their complexes can be achieved by using different techniques such as size exclusion chromatography (SEC), native polyacrylamide gel electrophoresis, mass spectrometry, light scattering, and analytical centrifugation. Among these, SEC is a simple, fast, and non-destructive method for estimating the molecular weight of a protein in its native form, based on a calibration curve obtained with protein standards. However, the elution position depends not only on the molecular weight but also on the shape of the protein and on its tendency to interact with the column matrix. To overcome these limitations, SEC can be used in combination with on-line multiangle laser light scattering and refractometry. The molecular weight from this measurement is then independent of the elution order, permitting determination of the molecular weight of simple proteins, protein complexes, and their stoichiometry.
The on-line SEC-MALS/UV/RI used in this study is a system provided by Wyatt Technology (Santa Barbara, CA). It consists of a combination of three detectors placed in series after an SEC column: a static light scattering detector (SLS), a dynamic light scattering detector, and a refractometry detector. In addition, multiwavelength on-line UV-visible absorbance is provided by the HPLC system. SLS measures the intensity of the scattered light as a function of angle. SLS measurements can yield the molar mass and root mean squared radius (Rg). The dynamic light scattering detector measures the fluctuations of light scattering intensity due to diffusion of the molecules, thus allowing determination of their hydrodynamic radius (Rh). The refractive index detector measures the concentration of protein in solution, the dn/dc (specific refractive index increment), and the absolute refractive index of the solution. Measurement of protein concentration is essential for the absolute characterization of the molar mass (14). Using this HPLC/SEC-MALS/UV/RI system, we determined the self-association and the inter-association of the baseplate components, and we also measured their hydrodynamic radii. These data, together with some input from the literature, made it possible to propose a topological model of the baseplate of Tuc2009, as well as an evaluation of the stoichiometry of the interacting components. We propose this model as a paradigm for the P335 phage species baseplate, with possible extension to other phages infecting Gram-positive bacteria.
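To illustrate why the concentration measurement matters for the molar mass, the following minimal sketch applies the dilute-solution, zero-angle limit of the Rayleigh relation, M = R(0) / (K* c), with the optical constant K* = 4π²·n0²·(dn/dc)² / (N_A·λ⁴). It is only an illustration of the principle and does not describe the instrument software. The dn/dc of 0.185 ml/g is the value used in this study; the solvent refractive index (1.33), the laser wavelength (658 nm), the sample concentration, and all function names are assumptions chosen for the example.

```python
import math

AVOGADRO = 6.02214076e23  # mol^-1


def optical_constant(n0, dn_dc, wavelength_cm):
    """K* = 4*pi^2 * n0^2 * (dn/dc)^2 / (N_A * lambda^4), in mol cm^2 g^-2.

    n0: solvent refractive index; dn_dc in ml/g; wavelength in cm (vacuum).
    """
    return 4 * math.pi ** 2 * n0 ** 2 * dn_dc ** 2 / (AVOGADRO * wavelength_cm ** 4)


def molar_mass(rayleigh_ratio, conc_g_per_ml, k_star):
    """Molar mass (g/mol) in the dilute, zero-angle limit: M = R(0) / (K* c)."""
    return rayleigh_ratio / (k_star * conc_g_per_ml)


# Assumed optical parameters (hypothetical, for illustration only).
K_STAR = optical_constant(n0=1.33, dn_dc=0.185, wavelength_cm=658e-7)

# Round trip: the excess Rayleigh ratio that a 60-kDa species would give
# at an assumed 0.1 mg/ml, then the mass recovered from that signal.
conc = 1e-4                               # g/ml (assumed)
signal = K_STAR * conc * 60_000           # cm^-1
print(f"K* = {K_STAR:.3e} mol cm^2 g^-2")
print(f"M  = {molar_mass(signal, conc, K_STAR):.0f} g/mol")
```

Because M scales as 1/c, any error in the concentration read by the refractometer propagates directly into the calculated mass, which is why the refractive index detector is placed in series with the light scattering detectors.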

EXPERIMENTAL PROCEDURES
Cloning, Expression, and Purification-Cloning of full-length ORF49, ORF52, and ORF53 in pQE60 was previously described by Mc Grath et al. (7). The DNA regions that represent the N-terminal portion of ORF50 (residues 1-590), the head + neck section of ORF53 (residues 17-172), and the head-only fragment of ORF53 (residues 58-172) were amplified using PCR from the Tuc2009 bacteriophage genome using specific Gateway™ primers containing attB sequences at both ends, a ribosome-binding site, and a C-terminal His6 tag coding sequence. A bicistronic ORF51-ORF53-encoding construct was obtained using a two-step PCR: first, the coding regions of ORF51 and ORF53 were separately amplified using primers P1/INT-R and INT-F/P2, respectively (primer sequences are detailed in Table 2), to introduce a ribosome-binding site upstream from each ORF and to create a sequence overlap between the two constructs; second, these two PCR products were used as DNA template for a second amplification with primers P1 and P2. The final construct consisted of DNA specifying a C-terminal His6-tagged ORF51 upstream of DNA encoding a C-terminal Strep-tagged ORF53. The amplicons representing ORF50-(1-590), ORF53-(17-172), ORF53-(58-172), and the bicistronic ORF51-ORF53 were cloned by recombination in a Gateway™ pDEST14 vector (Invitrogen). Protein expression experiments were carried out using M15pREP4 (Qiagen) and Rosetta(DE3)pLysS (Novagen) strains for pQE60 and pDEST14 constructs, respectively. After an overnight induction with 0.5 mM isopropyl 1-thio-β-D-galactopyranoside at 25 or 17°C depending on the target, cells were harvested by centrifugation for 10 min at 4000 × g. Bacterial pellets were resuspended in 40 ml/liter of culture of lysis buffer (Tris 50 mM, NaCl 300 mM, imidazole 10 mM, pH 8.0) supplemented with 0.25 mg/ml lysozyme, 1 μg/ml DNase, 20 mM MgSO4, and antiproteases (Complete EDTA-free antiproteases, Roche) and frozen at −80°C.
After thawing and sonication, lysates were cleared by a 30-min centrifugation at 12,000 × g. Overexpressed proteins were purified on a Pharmacia Äkta FPLC by nickel affinity chromatography (His-Trap 5-ml column, GE Healthcare) using a step-gradient of imidazole followed by a preparative Superdex 200 HR26/60 gel filtration in 10 mM Hepes, 50 mM NaCl, pH 7.5. An additional Strep-Tactin affinity purification (IBA BioTAGnology) was used for the purification of the ORF51-ORF53 dual protein complex. Purified proteins were concentrated using Amicon Ultra-15 ml (Millipore) and characterized by SDS-PAGE, matrix-assisted laser desorption ionization time-of-flight mass spectrometry (Bruker Autoflex), trypsin peptide mass fingerprint, far-UV circular dichroism (Jasco J-810), and dynamic light scattering (Zetasizer Nano-S, Malvern).
Blue Native PAGE-Blue native PAGE (15,16) was performed using a NativePAGE™ Novex 4-16% BisTris gel (Invitrogen). Gels were run following the supplier's instructions, loading 5 μg of each protein. Native gels were stained with Coomassie R-250. The NativeMark™ Unstained Protein Standard (Invitrogen) was used to estimate the molecular weight of proteins after native electrophoresis by plotting Rf (retardation factor) values versus log10 molecular weight.
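The calibration described above amounts to a straight-line fit of log10(molecular weight) against Rf, which is then inverted for each unknown band. The sketch below shows one way to do this with an ordinary least-squares fit. The Rf values assigned to the standards and to the unknown band are hypothetical placeholders, not measurements from this study, and the function names are illustrative.

```python
import math


def fit_calibration(standards):
    """Least-squares line log10(MW) = a + b * Rf from (Rf, MW in kDa) pairs."""
    xs = [rf for rf, _ in standards]
    ys = [math.log10(mw) for _, mw in standards]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def estimate_mass(rf, intercept, slope):
    """Apparent molecular weight (kDa) of a band from its Rf."""
    return 10 ** (intercept + slope * rf)


# Hypothetical (Rf, kDa) pairs for a set of native markers; the Rf values
# are invented for illustration and would be read off the actual gel.
STANDARDS = [(0.15, 720.0), (0.28, 480.0), (0.45, 242.0),
             (0.58, 146.0), (0.78, 66.0)]

a, b = fit_calibration(STANDARDS)
print(f"log10(MW) = {a:.3f} + ({b:.3f}) * Rf")
print(f"band at Rf 0.50 -> {estimate_mass(0.50, a, b):.0f} kDa (apparent)")
```

The same two functions can be reused with a different set of reference bands, for example when known oligomers of one of the sample proteins serve as internal standards.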
SEC with On-line Multiangle Laser Light Scattering, Absorbance, and Refractive Index (MALS/UV/RI) Detectors-SEC was carried out on an Alliance 2695 HPLC system (Waters) using a Silica Gel KW804 column (Shodex) eluted with 10 mM Hepes and 50 mM NaCl at pH 7.5 at a flow of 0.5 ml/min. Detection was performed using a triple-angle light scattering detector (Mini-DAWN™ TREOS, Wyatt Technology), a quasi-elastic light scattering instrument (Dynapro™, Wyatt Technology), and a differential refractometer (Optilab rEX, Wyatt Technology). Molecular weight and hydrodynamic radius determination was performed by the ASTRA V software (Wyatt Technology) using a dn/dc value of 0.185 ml/g. Proteins were loaded at a final concentration of 0.02 mM.

RESULTS
Production of the Baseplate Proteins-We undertook the production of seven tail and baseplate proteins from Tuc2009: ORF45 (MTP), ORF48 (TMP), ORF49 (Dit), ORF50 (Tal), ORF51 (BppU), ORF52 (BppA), and ORF53 (BppL, representing the RBP). All these proteins could successfully be (over)produced either as full-length products, or as individual domains, using our previously published screening procedure (17,18). Except in the case of TMP, all proteins are soluble up to several milligrams/ml, making it possible to perform biophysical characterization and interaction studies on them.
ORF45 (MTP) was expressed as a soluble protein and could be concentrated up to 10 mg/ml. However, no further experiments have been performed on it. ORF48 (TMP) is a 1025-residue long protein, which has been demonstrated to control the length of the phage tail (11). Secondary structure predictions suggest that TMP consists of two domains: a long, almost continuous α-helix spanning residues 1 to 960, and a putative mixed α/β domain from residue 961 to the C terminus. It has been suggested that the long α-helix of TMP drives the MTP assembly and interacts at its N terminus with the phage portal, whereas the C-terminal domain, possibly more globular, may interact with the other components of the IC, Dit and/or Tal (6,7,12). We cloned two DNA fragments that encompass different sections of TMP: the full-length protein and the 961-1025 C-terminal domain. The former construct was not expressed in any of our screening conditions, whereas the latter was expressed as inclusion bodies, which could not be refolded using our standard procedures (18). ORF49 (Dit) has been proposed to orchestrate the IC formation once a certain concentration of Dit is reached (7). Dit was expressed as a soluble protein at concentrations up to 6 mg/ml. However, this protein appeared to be susceptible to protease cleavage, because several time-dependent degradation products appeared on SDS gels. ORF50 (Tal) incorporates a self-cleavage site at a consensus sequence GGSSGG, between the last two Gly residues (7,13). The N terminus, encompassing residues 1-590, was proposed to act as a structural domain, whereas the C-terminal domain of this tail fiber (represented by amino acid residues 591-906) bears the catalytic activity: the last 120 residues of its C-terminal domain share ~40% sequence identity with the catalytic domain of a zinc-containing peptidase from Vibrio cholerae, the structure of which has been determined (PDB 2GU1). We therefore cloned the two domains of Tal using the Gateway™ system.
Both N- and C-terminal Tal domains are well expressed and soluble. The N-terminal domain could be concentrated up to 7 mg/ml, whereas the C-terminal domain was not subjected to further studies. These three components, TMP C-term, Dit, and Tal N-term, are believed to form the conical structure of the IC. In TP901-1, two proteins have been shown to assemble stepwise around the IC: first BppU, followed by BppL (5,6). In Tuc2009, it was assumed that the assembly of these proteins (represented by ORF51 and ORF53, respectively) may follow the same order (7). However, the situation in Tuc2009 is more complex because ORF52, which is absent in TP901-1, is probably associated with the IC structure. The orf52 gene is located between orf51 and orf53 on the phage genome. Its protein product, ORF52, has been identified among the phage structural proteins, but its exact localization could not be assigned using immuno-gold EM (7). Its involvement in the peripheral baseplate is therefore hypothetical. We were able to express ORF51 (BppU) as a soluble protein at a very high concentration of 50 mg/ml. ORF52 (BppA) is also well expressed and is soluble up to 10 mg/ml. These two proteins are stable and no time-dependent cleavage/degradation was detected. ORF53 or BppL acts as the RBP, conferring host specificity (11). We cloned the gene encoding ORF53 (BppL) and expressed it as a soluble protein. It was impossible, however, to obtain the intact protein, due to rapid proteolytic deterioration, probably at the N terminus, yielding shorter species. Protein cleavage has also been observed for BppL of TP901-1, because its crystal structure lacks the first 15 residues (8).
Characterization of the Oligomerization State of the Tuc2009 Baseplate Proteins-The purified Tuc2009 proteins were subjected to weight and size analysis using the MALS/UV/RI Wyatt instrument (Table 1) [...] and appears as a protein with a mass of 60 kDa by MALS/UV/RI, a value pointing to a trimeric state. Because this protein is prone to proteolytic cleavage, the quality of the chromatographic data deteriorated within a few days, ultimately becoming uninterpretable. We expressed a variant of ORF53 that comprises the head domain only. This head domain is measured at 42,700 Da, compared with a theoretical mass of the monomer of 13,838 Da. These data clearly indicate that the trimeric state of the full-length protein is maintained in these constructs, as seen in phage p2 (9). The Rh value of the trimeric head domain is 2.8 ± 0.3 nm.
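The assignment of an oligomeric state from a MALS mass reduces to dividing the measured mass by the theoretical monomer mass and taking the nearest integer. A minimal sketch using the two values quoted above for the ORF53 head domain (42,700 Da measured, 13,838 Da per monomer) is given below; the function name is illustrative.

```python
def oligomeric_state(measured_da, monomer_da):
    """Return (nearest integer copy number, raw mass ratio)."""
    ratio = measured_da / monomer_da
    return round(ratio), ratio


# ORF53 head domain: values quoted in the text.
copies, ratio = oligomeric_state(42_700, 13_838)
print(f"mass ratio = {ratio:.2f} -> {copies} subunits")
```

A ratio close to an integer (here slightly above 3) supports the trimer assignment; a ratio midway between integers would instead point to a mixture of species or to an error in the assumed monomer mass.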
Interaction Studies Using MALS/UV/RI-Following these experiments on the individual proteins, we performed pairwise interaction analysis. When ORF51 and ORF52 were mixed, a new peak appeared with a mass of 215 (±15) kDa, which is in agreement with the theoretical mass of a heterohexamer of 3xORF51 + 3xORF52 (210,810 Da) (Fig. 2). In contrast, mixing ORF51 and ORF53 did not yield any change in the chromatograms, as compared with the individual components. This was very surprising, because we anticipated an interaction between ORF51 and ORF53, based on the results obtained for TP901-1 (5,6). This result can be explained by assuming that the observed cleavage of the N-terminal trimeric α-helix bundle, which anchors the RBP in the BP, prevents interaction with ORF51. Finally, mixing ORF52 and ORF53 also did not yield any change in the chromatograms.
To overcome the presumed cleavage problem occurring with ORF53, which might have caused the lack of interaction with ORF51, we co-expressed both proteins from a bicistronic construct (see "Experimental Procedures"). In this construct, the ORF51 protein was expressed with a C-terminal His6 tag, whereas ORF53 carried a C-terminal Strep tag, making it possible to achieve rapid purification of the putative complex. This complex was indeed isolated and analyzed using MALS/UV/RI. The chromatogram displayed the expected peak at 170 kDa of a protein complex consisting of 3xORF51 + 3xORF53, and another peak at 370 kDa, probably comprising a dimer of this complex, i.e. 2 × (3xORF51 + 3xORF53) (Fig. 2). In this case, most of ORF53 remains intact, its helix bundle apparently being protected through enclosure within ORF51 (Fig. 1a). However, the main peak lacks symmetry, indicating the presence of unbound ORF51, possibly a trimer of ORF51, and a minor peak of possibly three shoulder + neck domains of ORF53 attached to the ORF51 trimer. When we added ORF52 to this ORF51-ORF53 complex, we observed a major peak of 255 kDa representing the complex of the three components, with a 3:3:3 ratio. We also observed a lower molecular weight shoulder that we interpret as a 3:3:3 complex in which the head domain of ORF53 has been cleaved, as well as a smaller peak of higher molecular mass (~520 kDa) corresponding to a dimer of the nonamer complex, 2 × (3xORF51 + 3xORF52 + 3xORF53) (Fig. 2). When we tried mixing proteins from the initiation complex, ORF49 (Dit) and ORF50N (Tal, amino acid residues 1-590), we did not observe a new peak, indicating that no interaction occurs, which may be explained by the absence of the TMP component of the assumed initiation complex.
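The stoichiometry assignments above rest on how closely each observed peak mass matches the theoretical mass of the proposed assembly. The short sketch below tabulates that comparison. Observed and theoretical masses are those quoted in the text (170, 210.81, and 270 kDa for the three complexes); treating the theoretical mass of each "dimer" species as exactly twice that of the parent complex is an assumption made here for illustration.

```python
# (proposed species, observed mass in kDa, theoretical mass in kDa)
PEAKS = [
    ("3xORF51 + 3xORF52", 215.0, 210.81),
    ("3xORF51 + 3xORF53", 170.0, 170.0),
    ("2 x (3xORF51 + 3xORF53)", 370.0, 2 * 170.0),
    ("3xORF51 + 3xORF52 + 3xORF53", 255.0, 270.0),
    ("2 x (3xORF51 + 3xORF52 + 3xORF53)", 520.0, 2 * 270.0),
]


def percent_deviation(observed, theoretical):
    """Signed deviation of the observed mass from theory, in percent."""
    return 100.0 * (observed - theoretical) / theoretical


for species, observed, theoretical in PEAKS:
    dev = percent_deviation(observed, theoretical)
    print(f"{species:<36s} {observed:6.0f} vs {theoretical:7.2f} kDa  ({dev:+.1f}%)")
```

Deviations of a few percent are compatible with the proposed assemblies; the larger discrepancies for the minor high-mass species are consistent with their tentative ("probably") assignment in the text.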
Interaction Analysis Using Blue Native Gels-We performed blue native gradient gel (polyacrylamide 4-16%) analysis of the components of the peripheral baseplate (Fig. 1b). In lane 51, ORF51 exhibits three bands, whereas on SDS gels (Fig. 1a) ORF51 appears as a unique band. Furthermore, the apparent mass of the first band, calculated using the calibration curve with standards (see "Experimental Procedures"), is much larger than expected, 140 versus 100 kDa. We suspect that ORF51 does not migrate at its expected size. Using the calibration curve, the molecular masses for the second and third bands are 295 and 437 kDa, respectively, i.e. those of a hexamer (280 kDa expected) and a nonamer (440 kDa expected). This autoassociation suggests that ORF51 is responsible for building the framework for the peripheral baseplate.
The co-expressed complex of ORF51 and ORF53 (lane 51/53) presents a more complex pattern of several bands. The fastest-migrating band, at 63 kDa, corresponds to an excess of ORF53 trimer (60 kDa expected), whereas the next band at 140 kDa is that of an excess of ORF51. The following band, with an apparent molecular mass of 248 kDa, is that of a 3:3 complex of ORF51 and ORF53, for which the theoretical mass is 170 kDa. This discrepancy probably results from the anomalous behavior of ORF51. The bands migrating more slowly than the 3:3 complex most likely correspond to a mixture of hexamer duplexes and complexes in which the head domain has been cleaved. The complex between ORF51 and ORF52 (lane 51/52) exhibits, starting from the bottom, a band at 322 kDa, versus the expected 210 kDa. The next band is observed at 620 kDa and would represent a dimer of the former 322-kDa complex. The complex between ORFs 51, 52, and 53 (lane 51/52/53) exhibits a major band of apparent molecular mass of 400 kDa instead of 279 kDa. The next band, at 740 kDa, would represent the dimer of this complex.
To correct for the anomalous behavior of ORF51 on these gels, we established an internal calibration curve based on the weights of ORF51 trimer, hexamer, and nonamer. In this case, the calculated molecular masses for complexes ORF51 + 53, ORF51 + 52, and ORF51 + 52 + 53 are 188, 240, and 300 kDa, respectively, which is much closer to the theoretical values of 170, 210, and 270 kDa.

DISCUSSION
The Peripheral Baseplate-Most EM pictures of the TP901-1 or Tuc2009 baseplate display lateral views of the baseplate, in a direction perpendicular to the tail axis (5-7). In EM views of TP901-1, the baseplate appears to be crushed, and the bulbs and peduncles of its 6 RBPs (BppL) can be identified in the EM image plane (5,6). The x-ray structure of the TP901-1 RBP fits well into these low-resolution images once adjusted to the scale of the EM picture (8). Interestingly, in pictures in which the baseplate was rotated by 90° (H. Neve, personal communication), the baseplate presents a clear 6-fold symmetry. Although the available pictures are less clear for Tuc2009, the same conclusions should remain valid because the BP proteins of the two phages display 78-97% identity, if we exclude their BppL. Analysis of Tuc2009 EM images made it possible to deduce the gross dimensions of the baseplate (7). The upper disk, formed by BppU and possibly BppA, has a diameter of 27 nm and a height of 7.5 nm, whereas the lower disk, the petticoat structure, is 21 nm wide and 7.5 nm high. In contrast, the TP901-1 upper disk is 23 nm wide, and the lower disk is wider (28 nm). In this latter phage, the IC conical structure is 7 nm high and the fiber extends 16 nm out of the conical structure (6). We have shown that the three proteins constituting the peripheral baseplate, ORF51, ORF52, and ORF53, form a homotrimeric complex of 3 × (ORF51 + ORF52 + ORF53). This ensemble has an Rh of 6.2 nm, hence a diameter of 12.4 nm, a value much lower than the diameter of 27 nm measured on the EM images of Tuc2009 (7). This ensemble cannot, therefore, constitute the baseplate disks on its own. Because contrast EM pictures have revealed a 6-fold symmetry for the baseplate, it is likely that the baseplate disks consist of a hexamer of ORF51 + 52 + 53 trimers.
Straightforward geometrical calculations from the measured Rh of this ensemble yield a baseplate diameter between 32 and 36 nm, a value higher than, but compatible with, the EM measurement (27 nm). This slight difference between solution data and EM might be explained in two ways: (i) the proteins interpenetrate, or (ii) they are more elongated in the direction perpendicular to the baseplate disk. The latter explanation seems the more probable because ORF53 points downwards from the baseplate and therefore does not increase the Rh horizontally (Fig. 3). An Rh of 4.8 nm, slightly larger than the Rh of ORF51 alone (4.5 nm), would lead to a baseplate dimension of 26.4 nm, close to the EM value, thus indicating that ORF52 contributes little to the horizontal component of the Rh.
Taking these considerations into account, as well as the observed Rh of the ORF53 head trimer (2.8 nm), we reach a lower disk diameter of 19-21 nm (Fig. 3), in agreement with the EM data (21 nm) (7). With this model, however, the upper disk is thicker than observed by EM, whereas the lower disk thickness fits with EM data. Therefore, we have to assume that the ORF51 trimer is not spherical, but is more likely to resemble a flatter ellipsoid 7.5 nm thick, and somewhat larger than 24-27 nm. Because ORF52 is not present in TP901-1, its presence in Tuc2009 is intriguing. ORF52 clearly participates in the upper disk of the baseplate, because we have shown that it forms a strong 3:3 complex with ORF51, but does not interact with ORF53 alone. ORF52 orthologs are present in several bacteriophages, whereas domains of other phage proteins involved in host recognition also display significant similarity with this ORF.
The Initiation Complex-The recent structure of the tail tip of SPP1, determined at 9-Å resolution by cryoEM, provides a wealth of information on interactions in the initiation complex, considering the significant homology between its proteins and those of Tuc2009 (12). In SPP1, the cap (TMP C-term), Dit, and Tal form, in this order, an elongated structure of 35 nm with a width that ranges between 3 and 14 nm (12). Both Dit and Tal were assigned to be present as trimers, whereas TMP might preferably be a hexamer. The SPP1 Tal N-terminal domain exhibits structural resemblance with a protein of known structure, the trimeric P22 tail spike (1TSP) (19). This protein is formed by three β-helical monomers that are assembled tightly. Our observation of a Tal trimer in solution is consistent with the cryoEM data on SPP1. SPP1 Dit is also interpreted to be present as a trimer in the cryoEM map. However, under our experimental conditions, Tuc2009 Dit is a monomer, indicating that the interaction is less tight. It seems therefore that Tal-(1-590) promotes the trimeric assembly of Dit, as for ORF51-ORF52 and as also described in the case of gp4 of phage P22, which exists as a monomer in solution and will only oligomerize upon binding to the portal protein (20).
FIGURE 2. [...] Table 1). Five chromatograms are superposed: ORF51 and ORF52 (inset) (A), the complex ORF51 + ORF52 (B), the complex ORF51-ORF53 (C), and the complex ORF51-ORF52-ORF53 (D). Note that ORF52 provokes a shift of the elution time of the complexes in B and D toward lower molecular weight, probably due to interactions with the column.
Immuno-gold labeling experiments have failed to label Tuc2009 Tal, suggesting that this trimer is masked from the solvent by the peripheral BP components. We measured an Rh of 4.0 nm for Tal. The space within the ORF51 18-mer upper disk structure is estimated to be ~9-10 nm in width, which would therefore be large enough to accommodate the Tal trimer (Fig. 3, top). A side view of the resulting assemblage suggests that Tal is fully buried, being inaccessible to large IgGs (Fig. 3). In contrast, both Dit and TMP can be visualized by immuno-gold labeling, but phages lacking the peripheral components of the BP exhibited stronger labeling. This suggests that both TMP and Dit are left partially accessible to solvent by the peripheral baseplate.
The Mass of the Tail and Baseplate-Knowing the composition of the tail and the baseplate, from literature or the present data, we can evaluate the mass of Tuc2009 organelles. The tail has the largest mass (3.4 MDa) counting only the MTP, but 4.0 MDa if we assign the TMP to the tail. The baseplate weight is 2.6 MDa (Table 3).
Evolutionary Considerations-The role of Tal illustrates interesting aspects of the mosaic assembly of phage proteins. B. subtilis phage SPP1 has been shown to recognize its host through an interaction of its tail tip, probably involving Tal, with YueB, a large membrane protein (21). Tal-(1-590), its structural domain, is similar between Tuc2009 and SPP1 (12). In contrast, the Tuc2009 Tal C terminus differs from that of SPP1 in several ways: (i) it is capable of self-cleavage at a consensus site, and both forms, cleaved and uncleaved, are observed in the phage particle, (ii) it has a clearly identified zinc-dependent protease domain at its tip, and (iii) it has not been shown to interact with any protein. These facts indicate that, besides a common scaffold function of the Tal N terminus in each of those phages, the C terminus might fulfill very different functions: adhesion and DNA injection initiation for one group, cell lysis for the other.
The tail tip constitutes the base of the phage tail and is the minimal structure observed on Siphoviridae such as SPP1 and T5, whereas in other Siphoviridae much larger appendages decorate this scaffold. An intriguing question is why some phages have developed these appendages, whereas others function satisfactorily with the minimal tail tip. It seems that, although similar, Siphoviridae have developed different strategies to attach specifically to their hosts. In phage T5, the antireceptor, the monomeric gp5, associates tightly (subnanomolar) with a siderophore transport porin, FhuA (22). This mechanism is observed with two other phages: BF23, whose antireceptor binds to BtuB (23), a vitamin B12 transporter, and H8, which attaches to FepA, another iron transporter (24). The lactococcal phage c2 has also been shown to recognize its host through an antireceptor/protein contact, very similar to the mechanism shown for SPP1 (21). Its receptor, named phage interacting protein, exhibits ~25% identity with YueB, the SPP1 receptor (25). Both proteins possess a hexahelical membrane-embedded domain and a very large ectodomain of ~900 residues. However, their physiological function remains unknown.
FIGURE 3. Topological model of Tuc2009 tail-tip and baseplate. Top, view perpendicular to the main tail axis. The 3:3:3 complex formed between ORFs 51, 52, and 53 is assembled in a large hexagonal structure. The center of the structure is occupied by the initiation complex, and more probably interacts strongly with ORF50-(1-590), which is non-accessible to specific antibodies (7). Bottom, view parallel to the main tail axis. The geometry of the initiation complex (ORFs 48, 49, and 50) is derived from Plisson et al. (12) and from stoichiometries obtained here (Table 1). The topology of the peripheral baseplate is calculated based on our MALS/UV/RI analysis.
In contrast to phages that bind membrane-attached proteins, it has been proposed that most lactococcal phages that belong to the 936 or P335 families exclusively attach to saccharidic receptors (3,26,27). This hypothesis has been corroborated by the recent elucidation of the three-dimensional structures of RBPs from representatives of both families (8-10). We have shown that phages p2 and bIL170 (936) and phage TP901-1 (P335) bind glycerol molecules in a crevice between the monomers of the trimeric head domain. This glycerol binding site is reminiscent of typical saccharide binding sites (28,29), with a floor constituted by an aromatic amino acid (Trp, Tyr, or Phe) and a dense network of H-bond acceptors or donors (here Arg, His, and Asp). We have shown experimentally, with fluorescence quenching experiments, that the RBPs or head domains tightly bind saccharides, in particular phosphoglycerol, a component of lipoteichoic acids. In the case of TP901-1, the Kd values observed between the RBP and MurNAc-dipeptide or phosphoglycerol are 4 and 50 nM, respectively. However, these affinities are weaker than those reported for gp5 of phage T5 for FhuA. It should be recalled that TP901-1 or Tuc2009 baseplates possess 6 RBP trimers, which thus account for 18 sugar binding sites. The avidity conferred by this stoichiometry should largely compensate for the weaker individual binding affinity.
Several phages, including the Myoviridae T4, have a two-step host binding mechanism (30). This is also the case with the Siphoviridae T5, which possesses long fibers providing the first attachment, followed by gp5/FhuA binding (22). SPP1 does not exhibit any supplementary structure, but possesses an extra domain at the C terminus of some MTPs. This domain clearly has a fibronectin fold and might be involved in recognition of host polysaccharides, in addition to the already described interaction between the tail tip and the host protein YueB. For lactococcal phages, this two-step mechanism is documented for c2, which seems to recognize its host first by means of a reversible step, which involves a cell-wall rhamnose moiety, followed by an irreversible step that requires phage interacting protein (25). In contrast, host recognition by phages such as Tuc2009 and TP901-1 on the one hand, and sk1 and bIL170 on the other, seems to involve a single-step binding to saccharides (3,26,27). If this is true, the mechanism of high avidity displayed by Tuc2009 or TP901-1 would be a clever way to overcome the weakness of a single-step host recognition mechanism involving rather weak interactions with saccharides.