Bacteriophage SPP1 Tail Tube Protein Self-assembles into β-Structure-rich Tubes

Background: In most bacteriophages, a long tail primarily built from tail tube proteins serves as a conduit for DNA delivery into the bacteria.
Results: The tail tube protein of phage SPP1 self-assembles into tubes exhibiting a phage tail-like helical architecture.
Conclusion: A three-dimensional model is proposed for the self-assembled tubes.
Significance: This work opens the way for the generation of artificial tubular structures.

The majority of known bacteriophages have long tails that serve for bacterial target recognition and viral DNA delivery into the host. These structures form a tube from the viral capsid to the bacterial cell. The tube is formed primarily by a helical array of tail tube protein (TTP) subunits. In phages with a contractile tail, the TTP tube is surrounded by a sheath structure. Here, we report the first evidence that a phage TTP, gp17.1 of siphophage SPP1, self-assembles into long tubes in the absence of other viral proteins. gp17.1 does not exhibit a stable globular structure when monomeric in solution, even though it was confidently predicted to adopt the β-sandwich fold of phage λ TTP. However, Fourier transform infrared and nuclear magnetic resonance spectroscopy analyses showed that its β-sheet content increases significantly during tube assembly, suggesting that gp17.1 acquires a stable β-sandwich fold only after self-assembly. EM analyses revealed that the tube is formed by hexameric rings stacked helicoidally with the same organization and helical parameters found for the tail of SPP1 virions. These parameters were used to build a pseudo-atomic model of the TTP tube. The large loop spanning residues 40-56 is located on the inner surface of the tube, at the interface between adjacent monomers and hexamers. In line with our structural predictions, deletion of this loop hinders gp17.1 tube assembly in vitro and interferes with SPP1 tail assembly during phage particle morphogenesis in bacteria.

More than 96% of bacterial viruses (bacteriophages or phages) have a tail (1). This structure is designed for phage attachment to bacteria, penetration of the cell envelope, and delivery of the viral genome into the host. In long-tailed phages, the main elementary unit of the genome delivery tube is the phage tail tube protein (TTP). During assembly, the tail adsorption apparatus is built first, thus providing a platform for helical polymerization of the TTP around a tape measure protein that defines the tail tube length (2,3). The tube is then tapered by tail completion proteins that build the interface for binding the viral capsid connector (4,5). When the tail adsorption apparatus recognizes its cognate receptor on the bacterial surface, a signal is transmitted through the tail to the head-to-tail connection, and viral DNA is ejected from the capsid into the targeted bacterium (6).
Structural analysis of long-tailed phages revealed that tails of Siphoviridae and Myoviridae phages share common features. In particular, they both exhibit a helical tube organization. The TTPs of Siphoviridae most frequently form stacked rotated hexamers with 6-fold rotational symmetry. The distance between hexamers is ~40 Å, and the rotation angle between them is ~20° (6-10). In addition, Siphoviridae tails are often decorated with protein domains encoded within the TTP gene (6, 10-12). Structures of the Siphoviridae phage SPP1 tail before and after DNA ejection showed that structural rearrangements occurred in the tail upon incubation with its cellular receptor (6).
The tail of Bacillus subtilis phage SPP1 includes two TTPs, gp17.1 (19.1 kDa) and gp17.1* (28.3 kDa), that share a common N-terminal sequence (11). A programmed translational frameshift of gene 17.1 leads to the production of gp17.1*, which contains an additional C-terminal domain as compared with gp17.1. In the wild-type phage, approximately one gp17.1* is incorporated into the tail structure for every three gp17.1. However, a virion coding for only gp17.1 is viable and infectious, demonstrating that the additional domain present in gp17.1* is dispensable for virion assembly and function. Here, we observe by electron microscopy (EM) that recombinant gp17.1 alone forms tubes constituted by stacked rings of TTP subunits with a helical organization similar to the one found in tails of SPP1 virions but with variable lengths. NMR and FTIR analyses show that gp17.1 adopts a partially folded structure in its monomeric form and undergoes further structural changes to build a β-sheet-rich helical tube. From EM and bioinformatics results, we propose a first model for the phage noncontractile tail TTP tube.

EXPERIMENTAL PROCEDURES
Cloning Procedures-A PCR fragment spanning from gene 16.1 to the end of gene 17.1 without a stop codon (coordinates 9641-11,011 of the SPP1 genome sequence, accession code X97918.2) was obtained by amplification of SPP1 wild-type DNA using oligonucleotides 5:pIA6For and 6:pIA6Rev 17.1 (Table 1). The sequence of oligonucleotide 6 replaces the rare proline codons CCC at the end of gene 17.1 by more abundant CCG or CCT codons, and it also codes for a noncleavable C-terminal hexahistidine tag. The purified PCR product was cleaved with EcoRI and HindIII and inserted into the pPT25 shuttle vector (11) digested by the same restriction enzymes, generating pIA6. This vector was used for the first production trials of gp17.1 with a noncleavable C-terminal hexahistidine tag, hereafter named gp17.1SPP1.
To increase the protein production yield, we further used a synthetic gene optimized for expression in Escherichia coli inserted into a pETM-13 vector (GenScript, Piscataway, NJ). This gene also coded for gp17.1 with a noncleavable C-terminal hexahistidine tag. The only amino acid sequence difference between gp17.1SPP1 and gp17.1 is an extra glycine residue in position 2 in gp17.1. gp17.1Δ (gp17.1 lacking loop 40-56) was similarly overproduced from a synthetic gene inserted in a pETM-13 vector. Plasmids were transformed into BL21Star (DE3) cells (Stratagene) and selected on kanamycin media at all stages.
To study gp17.1 assembly into the virion within the bacteria, the shuttle vector pIA14 (11,13,14), which replicates both in E. coli and in B. subtilis cells, was engineered to produce a gp17.1Δ form of the TTP. pIA14 produces only the short form of the SPP1 TTP, gp17.1, because the translational frameshift leading to the long TTP form, gp17.1*, has been disrupted by mutagenesis (11). Deletion of the 51 bp of gene 17.1 coding for TTP residues 40-56 was done by the PCR-driven overlap extension method (15) with primers 1-4, generating pIA65 (Table 1).
Protein Labeling, Production, and Purification-E. coli BL21(DE3) freshly transformed with plasmid pIA6 was grown at 37°C in Luria-Bertani (LB) broth supplemented with erythromycin (30 μg ml⁻¹) and chloramphenicol (10 μg ml⁻¹) until reaching an absorbance at 600 nm of 1. The culture was then diluted 10-fold into a minimal medium containing (¹⁵NH₄)₂SO₄ as the sole nitrogen source supplemented with the same antibiotics. Recombinant hexahistidine-tagged gp17.1 proteins were constitutively produced overnight at 37°C. Pelleted bacteria were resuspended in lysis buffer (50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 0.1% Triton, 10 mM MgSO₄) and disrupted by sonication. Crude extracts were centrifuged at 4°C twice for 20 min at 27,000 × g, and gp17.1 proteins were purified by metal-chelating chromatography. The supernatant was loaded onto a 5-ml Hi-Trap Ni-NTA column (GE Healthcare) pre-equilibrated with buffer A (20 mM NaH₂PO₄, pH 7.4, 500 mM NaCl). The column was washed with increasing concentrations of buffer B (buffer A with 1 M imidazole), and the protein eluted at 250 mM imidazole. The eluate was dialyzed against a 50 mM sodium phosphate, pH 6.0, 150 mM NaCl buffer and concentrated by ultrafiltration (Vivaspin 20, 10-kDa molecular mass cutoff). At this step, the protein, hereafter named gp17.1SPP1, was more than 95% pure as assessed by Coomassie Blue staining of SDS-PAGE. Protein concentration was determined by measurement of the UV absorbance at 280 nm, with an extinction coefficient of 18,910 M⁻¹ cm⁻¹. The purified protein was then characterized using solution-state NMR by recording ¹H-¹⁵N HSQC and ¹H-¹⁵N NOE experiments at 25°C and 700 MHz (Fig. 1, A and B) as described below.
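The concentration determination from the absorbance at 280 nm follows the Beer-Lambert law. The short Python sketch below illustrates the arithmetic only; the 1-cm path length, the function name, and the use of the 20.2-kDa monomer mass quoted later in the text for the mass conversion are assumptions made for illustration, not details taken from the protocol.

```python
# Beer-Lambert sketch: A = epsilon * c * l  ->  c = A / (epsilon * l)
EPSILON_280 = 18910.0   # M^-1 cm^-1, extinction coefficient given in the text
MOLAR_MASS = 20200.0    # g/mol; ASSUMPTION: 20.2-kDa monomer mass quoted in the Results


def concentration_from_a280(a280, path_cm=1.0, dilution=1.0):
    """Return (molar concentration in M, mass concentration in mg/ml).

    path_cm and dilution are hypothetical parameters (1-cm cuvette, undiluted).
    """
    molar = a280 * dilution / (EPSILON_280 * path_cm)
    mg_per_ml = molar * MOLAR_MASS  # g/l is numerically equal to mg/ml
    return molar, mg_per_ml


# Example with an illustrative absorbance reading (not a measured value).
molar, mg_per_ml = concentration_from_a280(0.3782)
print(f"{molar * 1e6:.1f} uM, {mg_per_ml:.2f} mg/ml")
```

With these assumptions, 0.1 mM protein corresponds to about 2 mg/ml, in line with the paired mg/ml and mM values quoted throughout the procedures.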
gp17.1SPP1 was also analyzed using a fluorescence-based thermal shift assay with SYPRO Orange dye as a fluorescent probe (Invitrogen) (Fig. 1C). Reactions were carried out in duplicate in a 96-well fast PCR plate with 5 μg of protein in a final volume of 20 μl. The temperature gradient was performed in the range of 25-95°C at a rate of 3°C min⁻¹ on an Applied Biosystems 7900HT Fast real time PCR system. Fluorescence was recorded as a function of temperature in real time (excitation wavelength, 488 nm; emission spectrum, 500-660 nm), and the melting temperature was calculated with the ABI SDS2.4 software as the maximum of the derivative of the resulting denaturation curves.
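The melting temperature definition used here (maximum of the first derivative of the denaturation curve) can be illustrated with a minimal Python sketch. This is not the ABI SDS2.4 algorithm, whose internals are not described in the text; the central-difference derivative, the function name, and the synthetic sigmoidal curve are assumptions chosen only to show the principle.

```python
import math


def melting_temperature(temps, fluorescence):
    """Return the temperature at which dF/dT (central differences) is largest."""
    best_t, best_slope = None, -math.inf
    for i in range(1, len(temps) - 1):
        slope = (fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_t, best_slope = temps[i], slope
    return best_t


# Synthetic two-state unfolding curve (hypothetical data) centered at 55.5 degrees C.
temps = [25.0 + 0.5 * k for k in range(141)]  # 25 to 95 degrees C
curve = [1.0 / (1.0 + math.exp(-(t - 55.5) / 2.0)) for t in temps]
tm = melting_temperature(temps, curve)
print(f"Tm = {tm:.1f} C")
```

Real traces are noisy and usually need smoothing before differentiation; that step is omitted here for brevity.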
E. coli BL21(DE3) strains transformed with the pETM-13 vectors coding for gp17.1 and gp17.1Δ were grown at 37°C to an absorbance of 1.5, induced with 0.25 mM isopropyl 1-thio-β-D-galactopyranoside, and incubated at the same temperature overnight. Uniformly (>98% for ¹⁵N and ¹³C and >70% for ²H) ¹⁵N-labeled, ¹⁵N/¹³C-labeled, and ¹⁵N/¹³C/²H-labeled proteins were produced in minimal media containing 0.5 g/liter ¹⁵NH₄Cl and 2 g/liter [¹²C₆]glucose or [¹³C₆]glucose as the sole nitrogen and carbon sources. For triple labeling, cells were first grown in LB at 37°C overnight, then diluted 25-fold in a 99.9% ²H₂O-based M9 minimal medium and incubated at 37°C overnight, and finally diluted 50-fold in the same deuterated minimal medium, induced with 0.25 mM isopropyl 1-thio-β-D-galactopyranoside, and grown overnight at 37°C. Cells were harvested by centrifugation, resuspended in 20 mM phosphate buffer, pH 7.5, 150 mM NaCl, and 8 M urea to solubilize inclusion bodies, and lysed by sonication. After addition of benzonase and 10 mM MgCl₂, the lysate was centrifuged at 20,000 rpm for 1 h in a JLA25.50 rotor (~48,000 × g). Hexahistidine-tagged proteins were purified on about 7 ml of Ni-NTA resin (GE Healthcare) in denaturing conditions with 8 M urea, 20 mM phosphate buffer, pH 7.5 (buffer A). The resin was washed with buffer A containing 500 mM NaCl and 50 mM imidazole. Proteins were then refolded using a urea gradient from 8 to 0 M and eluted in buffer A containing 500 mM NaCl and 500 mM imidazole. gp17.1 exhibited a denaturation temperature similar to that of gp17.1SPP1 (55°C versus 55.5°C) in the fluorescence-based thermal shift assay (data not shown).
For liquid-state NMR spectroscopy, the gp17.1 and gp17.1Δ proteins were dialyzed overnight against 20 mM sodium phosphate, pH 7.4, 100-500 mM NaCl depending on the sample, 1 mM EDTA, and 1 mM PMSF. They were concentrated by ultrafiltration in Vivaspin 5000 (Vivascience). The final samples consisted of the ¹⁵N-labeled, ¹⁵N/¹³C-labeled, or ¹⁵N/¹³C/²H-labeled proteins at a concentration of 8-10 mg/ml (0.4-0.5 mM) in 20 mM sodium phosphate, pH 7.4, 150-500 mM NaCl depending on the sample, 1 mM EDTA, and 90% H₂O, 10% D₂O. For solid-state NMR (ssNMR) spectroscopy, the ¹⁵N/¹³C-labeled gp17.1 protein was dialyzed overnight against 20 mM sodium phosphate, pH 7.4, 500 mM NaCl, 1 mM EDTA, and 1 mM PMSF. Tube formation was obtained through incubation in this buffer at 37°C for 1 week. Before NMR experiments, the tubes were dialyzed against 20 mM sodium phosphate, pH 7.4, 100 mM NaCl, 1 mM EDTA, and 10% D₂O was added to the sample.
To follow gp17.1 assembly into tubes, gp17.1 and mixtures of gp17.1 with gp17.1Δ were loaded in denaturing conditions (8 M urea, 20 mM phosphate buffer, pH 7.5) on a Ni-NTA column, eluted by adding 500 mM imidazole to the buffer, and refolded by overnight dialysis against 20 mM sodium phosphate, pH 7.4, 50 mM NaCl, 1 mM EDTA. Proteins were concentrated by ultrafiltration in Vivaspin 5000 (Vivascience) to concentrations of 0.5 to 8 mg/ml (0.025-0.4 mM). They were then run through size exclusion chromatography (SEC) on a Superdex 200 16/60 HR column (GE Healthcare) equilibrated in 20 mM phosphate buffer, pH 7.5, and 50 mM NaCl. Fractions corresponding to monomeric proteins were pooled and used for FTIR and EM assays.
Size Exclusion Chromatography-Analytical SEC and multiangle static light scattering (SEC-MALS) experiments were performed with a 2 mg/ml (0.1 mM) gp17.1 sample injected on a Shodex KW-804 column initially calibrated with BSA. The column was pre-equilibrated with 20 mM sodium phosphate buffer, pH 7.5, and 50 mM NaCl.
Liquid-state NMR Spectroscopy-The two- and three-dimensional ¹H/¹⁵N/¹³C/²H NMR experiments were performed at 25°C on Bruker DRX600 and DRX700 spectrometers equipped with triple resonance TCI cryoprobes. All data were processed with Topspin 1.3 (Bruker Biospin, Germany) and analyzed using Sparky3 (T. D. Goddard and D. G. Kneller, University of California, San Francisco). Two-dimensional ¹⁵N-¹H HSQC spectra were first recorded on gp17.1SPP1 (Fig. 1, A and B). To obtain higher production yields, we then used an optimized gene for overproduction of gp17.1 in E. coli and purified it in the same buffer used for gp17.1SPP1 studies. At a concentration higher than 6 mg/ml (0.3 mM), the NMR signal corresponding to the monomeric protein disappeared within a week. Screening different buffer conditions showed that increasing salt concentrations slowed down this process. Therefore, the following NMR analyses of gp17.1 were carried out in 20 mM phosphate buffer, pH 7.5, 500 mM NaCl. Comparison of the two-dimensional ¹H-¹⁵N HSQC spectra recorded on gp17.1SPP1 in the presence of 150 mM NaCl and on gp17.1 in the presence of 500 mM NaCl showed overlapping NMR spectra, confirming that the two proteins adopt the same three-dimensional conformation in this range of salt concentrations. All reported experiments were then carried out using the refolded protein gp17.1 in 500 mM NaCl. Three-dimensional HNCO, HNCA, HNCACO, HNCOCA, HNCACB, CBCA(CO)NH, HBHA(CO)NH, and ¹⁵N- and ¹³C-edited ¹H-¹H NOESY experiments were recorded on ¹⁵N/¹³C-labeled samples. Three-dimensional HNCO, HNCA, HNCACO, HNCOCA, HNCACB, HNCOCACB, NNHN, and NCONHN experiments were recorded on ¹⁵N/¹³C/²H-labeled samples. These experiments were analyzed to assign the backbone resonances of gp17.1. The assigned chemical shifts were submitted to Predator (16) to identify the gp17.1 secondary structure elements.
ssNMR Spectroscopy-gp17.1 tubes were obtained by incubation at 37°C of the refolded protein in 20 mM sodium phosphate, pH 7.4, 500 mM NaCl, 1 mM EDTA, and 1 mM PMSF overnight. Tubes were spun down at 20,000 × g by centrifugation, and the pellet was transferred into a 3.2-mm zirconia rotor. All proton-driven spin diffusion (PDSD) experiments were conducted using a 3.2-mm triple resonance (¹H, ¹³C, and ¹⁵N) probe head at a static magnetic field of 18.8 tesla corresponding to 800 MHz ¹H resonance frequency (Bruker Biospin). All experiments were performed at a magic angle spinning rate of 10 kHz at −20°C with SPINAL 64 (17) ¹H decoupling during evolution and detection periods. Cross-peaks predicted from the liquid-state NMR chemical shifts were generated using the program FANDAS (18) and were analyzed with Sparky3. The average chemical shifts of the amino acid residues in a β-structure were obtained from the work of Wang and Jardetzky (19). Tentative assignments were obtained using a combination of two-dimensional correlation experiments that included two-dimensional ¹³C-¹³C PDSD correlations obtained under weak coupling conditions (20) with spin diffusion times of 30 and 150 ms to encode intra-residue and sequential effects, respectively. The ¹⁵N-¹³C experiments were recorded using a SPECIFIC-CP (21) time of 3 ms. Two-dimensional ¹⁵N-¹³Cα-Cx and two-dimensional ¹⁵N-¹³C′-Cx spectra were recorded with ¹³C-¹³C mixing times of 50 ms.
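The correspondence between the 18.8-tesla static field and the 800 MHz ¹H frequency follows from the Larmor relation ν = (γ/2π)·B₀. The sketch below is an illustrative check only; the gyromagnetic ratios are standard rounded literature values (magnitudes, in MHz per tesla) and are not taken from the text.

```python
# Larmor frequency: nu = (gamma / 2 pi) * B0
# ASSUMPTION: rounded literature values of |gamma|/2pi in MHz/T (15N is negative in sign).
GAMMA_OVER_2PI_MHZ_PER_T = {"1H": 42.577, "13C": 10.708, "15N": 4.316}


def larmor_mhz(nucleus, field_tesla):
    """Return the magnitude of the resonance frequency (MHz) at the given field."""
    return GAMMA_OVER_2PI_MHZ_PER_T[nucleus] * field_tesla


for nucleus in ("1H", "13C", "15N"):
    print(nucleus, round(larmor_mhz(nucleus, 18.8), 1), "MHz")
```

At this field the ¹³C frequency is close to 200 MHz, so a line width of 1 ppm in the carbon dimension corresponds to roughly 200 Hz.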
Fourier Transform Infrared (FTIR) Spectroscopy-Protein solution spectra were measured at 4 cm⁻¹ resolution with a Bruker IFS 66 spectrophotometer equipped with a 45°N ZnSe attenuated total reflection attachment. After elution from the gel filtration column in 20 mM phosphate buffer, pH 7.5, and 50 mM NaCl, gp17.1 was lyophilized. For the FTIR experiments, gp17.1 samples were suspended in D₂O and concentrated to about 3 mg/ml (0.15 mM). The concentrator flow-through was used as the reference buffer sample for measuring the buffer signal. FTIR spectra of the protein sample and its corresponding buffer solution were initially recorded just after dissolution in D₂O. Thereafter, spectra of both protein and buffer samples were recorded every 24 h to follow tube formation from 0 to 16 days. The protein sample was maintained at 37°C between spectra acquisitions. Spectra of both protein and buffer samples were recorded with an average of 30 scans. The buffer spectra were subsequently subtracted from the protein spectra. Each spectrum was normalized to the area between 1530 and 1750 cm⁻¹.
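The spectral processing described above (buffer subtraction, then normalization to the area between 1530 and 1750 cm⁻¹, and later subtraction of the initial spectrum to obtain difference spectra) can be sketched in a few lines of Python. The function names, the trapezoidal integration, and the synthetic Gaussian band are assumptions made for illustration; the instrument software actually used for these steps is not specified in the text.

```python
import math


def band_area(wavenumbers, absorbance, lo=1530.0, hi=1750.0):
    """Trapezoidal area under the spectrum between lo and hi (cm^-1)."""
    pts = sorted((w, a) for w, a in zip(wavenumbers, absorbance) if lo <= w <= hi)
    return sum((w2 - w1) * (a1 + a2) / 2.0 for (w1, a1), (w2, a2) in zip(pts, pts[1:]))


def subtract_and_normalize(wavenumbers, protein, buffer):
    """Subtract the buffer spectrum, then scale so the 1530-1750 cm^-1 area equals 1."""
    corrected = [p - b for p, b in zip(protein, buffer)]
    area = band_area(wavenumbers, corrected)
    return [c / area for c in corrected]


def difference_spectrum(spectrum_t, spectrum_0):
    """Difference between a normalized spectrum at time t and the initial one."""
    return [a - b for a, b in zip(spectrum_t, spectrum_0)]


# Hypothetical data: one Gaussian amide I band on a flat buffer background.
wn = [1500.0 + 2.0 * k for k in range(151)]  # 1500 to 1800 cm^-1
buffer = [0.1] * len(wn)
protein_0 = [0.1 + math.exp(-((w - 1645.0) / 20.0) ** 2) for w in wn]
protein_t = [0.1 + math.exp(-((w - 1620.0) / 20.0) ** 2) for w in wn]
norm_0 = subtract_and_normalize(wn, protein_0, buffer)
norm_t = subtract_and_normalize(wn, protein_t, buffer)
diff = difference_spectrum(norm_t, norm_0)
```

In this toy example the difference spectrum is positive near 1620 cm⁻¹ and negative near 1645 cm⁻¹, which is the signature expected for a shift of intensity from the random coil region to the β-sheet region.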
Electron Microscopy (EM)-To follow gp17.1 tube assembly, EM images of gp17.1 were obtained by transmission electron microscopy of negatively stained samples. About 5 μl of the sample solution was applied on a carbon-coated Formvar copper grid (Agar Scientific). After 2 min, the sample droplet was blotted with filter paper, and 5 μl of 0.5% phosphotungstic acid stain was applied onto the grid. After 2 min, the excess of staining solution was blotted with filter paper, and then the grid was rinsed with deionized water and dried. Samples were imaged in a transmission electron microscope JEOL MET 1400 (120 kV) equipped with a CCD camera SC100 Orius at ×10,000 to ×30,000 magnification.
To assess helical parameters of the gp17.1 tubes, images of samples negatively stained with 2% uranyl acetate were taken on a Tecnai F20, and data were recorded using a Gatan Ultrascan 4000 4 × 4 camera at a magnification of ×80,000. Determination of the contrast transfer function (CTF) was done for each CCD image using CTFFIND3 (22). Segments of tubes were selected manually using BOXER (EMAN (23)). Images were corrected for the CTF effects using Spider on the cut-out segments. Selected particles (segments) were normalized to a mean density of zero and a standard deviation of 1 for all images. All segments were aligned to the vertical rectangle and subjected to multivariate statistical analysis and classification (24). Class averages contained ~10 images/class, and their diffraction patterns were analyzed. The classes that produced the most symmetrical diffraction patterns were analyzed further. The intensity maxima on a layer line are defined by the distance R from the meridian and the radius r of the tube, as described by a Bessel function J of order n: J_n(2πRr) (25). The values of Bessel function orders were determined, and the parameters of the tube helix were defined using the selection rule for the helical symmetry as described in Stewart (25). The distance between gp17.1 rings was determined from the positions of layer lines in the diffraction patterns, and the angle of rotation around the tube axis between the rings was derived from the helical parameters of the tube.
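The normalization of the selected segments to zero mean density and unit standard deviation can be illustrated as follows. This is a minimal pure-Python sketch of the operation itself, not the Spider or EMAN implementation; the function name and the representation of a segment as a nested list of pixel values are assumptions.

```python
from statistics import fmean, pstdev


def normalize_segment(pixels):
    """Rescale a 2D segment (list of rows) to mean 0 and standard deviation 1."""
    flat = [value for row in pixels for value in row]
    mean, sigma = fmean(flat), pstdev(flat)
    if sigma == 0:
        raise ValueError("segment has constant density and cannot be normalized")
    return [[(value - mean) / sigma for value in row] for row in pixels]


# Hypothetical 2 x 2 segment, for illustration only.
segment = [[1.0, 2.0], [3.0, 6.0]]
normalized = normalize_segment(segment)
```

Putting all segments on the same density scale prevents differences in stain depth or exposure from dominating the subsequent multivariate statistical analysis.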
Bioinformatics and Molecular Modeling-Search for the gp17.1 fold was done using the HHpred server (26), which identified four structural homologs. N-terminal gp17.1 residues 1-138 were predicted to adopt the fold of the N-terminal domain of the TTP gpV of phage λ (PDB code 2K4Q) with a probability of 99%. They were also aligned with residues 44-152 of the base plate protein ORF46 from Lactococcus phage TP901-1 (PDB code 4DIV) with a probability of 64%, residues 35-137 of the base plate protein gp19.1 of SPP1 (PDB code 2X8K) with a probability of 53%, and residues 30-136 of the prophage protein XKDM encoded by B. subtilis (PDB code 2GUJ) with a probability of 46%. All these protein fragments share a similar fold, organized around a β-sandwich decorated by an α-helix. To model the whole gp17.1 protein fragment 1-177, we used the I-TASSER server (27). It produced five models of full-length gp17.1, in which fragment 1-138 was structurally similar to the phage λ gpV structure, as predicted by HHpred. The convergence between the five models is high because the average root mean square deviation on their secondary structural elements of the β-sandwich is lower than 2.0 ± 0.2 Å. All these models present a large unfolded loop between residues 40 and 56. I-TASSER calculated a TM score of 0.42 for the first model, indicating a high probability to predict a correct gp17.1 topology from this model.
A three-dimensional model for hexameric gp17.1 was computed from the HHpred and I-TASSER results. We used a multitemplate MODELLER procedure (28) in which we combined gpV (PDB code 2K4Q), gp19.1 (PDB code 2X8K), and I-TASSER structures to build the β-sandwich, the large β-sheet formed by loop β2β3 (residues 40-56), and the C-terminal region of gp17.1, respectively. The SPP1 tail cap gp19.1 structure was solved in a hexameric form, and its loop (gp19.1 residues 37-53) corresponding to gp17.1 loop β2β3 is an essential element of the intermolecular interfaces (PDB code 2X8K (29)). We modeled the gp17.1 loop structure from the corresponding loop structure observed in gp19.1. Then the gp17.1 hexamer was assembled by superimposing the monomeric gp17.1 model onto each of the gp19.1 subunits. Loop β2β3 forms a β-sheet located at the intermolecular interfaces in our hexameric gp17.1 model. Finally, from this model and the helical parameters derived from EM analysis of the tubes, we calculated a model of three stacked hexamers of gp17.1 that is representative of the tube structure.
Tail Assembly in B. subtilis-Phage SPP1sus45 (11,30) was amplified by infection of the B. subtilis HA101B permissive strain as described previously (31). Phage titers and reversion rates in the lysates were determined by titration in strains HA101B and YB886, respectively. Culture infection and phage purification through a discontinuous CsCl density gradient with preformed layers of 1.7, 1.5, and 1.45 g cm⁻³ CsCl in TBT buffer (100 mM NaCl, 10 mM MgCl₂, 100 mM Tris-Cl, pH 7.5) were carried out as described previously (11,32).
Phage SPP1sus45 was titrated in B. subtilis YB886 bearing pIA14 or pIA65 plasmids coding for full-length gp17.1 or for gp17.1Δ, respectively. The titers were expressed as a percentage of the titer obtained by infection of the permissive B. subtilis strain HA101B to determine the level of complementation (33). Extracts of B. subtilis were prepared and analyzed by Western blot as described previously (34).
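The complementation level defined above is a simple ratio of titers. The sketch below shows the calculation; the function name and the example titers are hypothetical and do not correspond to measured values.

```python
def complementation_percent(titer_complementing, titer_permissive):
    """Titer on the plasmid-bearing strain as a percentage of the permissive-strain titer."""
    if titer_permissive <= 0:
        raise ValueError("permissive-strain titer must be positive")
    return 100.0 * titer_complementing / titer_permissive


# Hypothetical titers in plaque-forming units per ml, for illustration only.
level = complementation_percent(2.0e9, 1.0e10)
print(f"complementation: {level:.0f}%")
```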

RESULTS
Monomeric SPP1 TTP Is Only Partially Folded in Solution-In long-tailed bacteriophages, the tail tube is formed by a large number of TTP subunits. NMR analyses demonstrated that in phage λ, the TTP gpV N-terminal domain adopts a β-sandwich fold in its monomeric state (35). Several studies suggested that this fold is conserved in most TTPs of long-tailed phages (35,36) and might be a recurrent fold found in other components that extend the inner tube channel at the long tail extremities (5,36). Analysis of the sequence of the SPP1 TTP protein using HHpred also suggested that fragment 1-138 of gp17.1 adopts the fold of the phage λ gpV N-terminal domain, even though these protein regions share only 14% sequence identity.
To characterize the three-dimensional structure of gp17.1, ¹⁵N-labeled gp17.1SPP1 was produced and purified by affinity chromatography. The protein is mainly monomeric at 4°C and 100 μM as assessed by SEC (data not shown). Its NMR ¹H-¹⁵N HSQC spectrum displays well dispersed cross-peaks, revealing that the protein is at least partially folded (Fig. 1A). NMR ¹H→¹⁵N NOE measurements indicated that only 15 residues are completely unstructured in gp17.1SPP1 (Fig. 1A). Thermal stability monitored by a fluorescence-based thermal shift assay showed that gp17.1SPP1 unfolds cooperatively with a single transition (T m of 55.5°C), demonstrating that it exhibits a folded hydrophobic core (Fig. 1C). However, the large heterogeneity in the ¹H-¹⁵N HSQC peak line widths observed at 25°C (Fig. 1, A and B) marks the presence of conformational exchange. This line width heterogeneity is similarly found over a large temperature range, from 10 to 40°C (data not shown). Moreover, following H-D exchange by NMR at 25°C showed that only a few amide protons are not exchanged with deuterium after 1 h at 30°C (Fig. 1B). This demonstrates that most amide protons form marginally stable hydrogen bonds within gp17.1SPP1, which confirms that the protein structure is dynamic. Similar observations were made for gp17.1, which differs from gp17.1SPP1 by only one amino acid but was produced and purified using a different protocol (simultaneous elution and refolding from an affinity chromatography column), giving higher protein yields.
Triple ¹⁵N, ¹³C, and ²H labeling was necessary to achieve 63% of the gp17.1 backbone frequency assignment (supplemental Table 1). About 110 peaks remained unassigned on the ¹H-¹⁵N HSQC spectrum, corresponding to 67 unassigned non-proline residues and about 40 residues in minor conformations. Further assignment was prevented by the lack of signal corresponding to these peaks on the three-dimensional ¹H, ¹⁵N, and ¹³C NMR experiments. Analysis of the assigned gp17.1 NMR chemical shifts using Predator showed that very few secondary structure elements could be unambiguously identified on the basis of NMR data; only eight assigned residues showed backbone chemical shifts characteristic of β-strands, and no assigned residues corresponded to α-helices. Further NMR chemical shift assignment of residues belonging to predicted secondary structure elements (Fig. 2A) was not possible, most probably because of the presence of conformational exchange processes. This hindered determination of the three-dimensional solution structure of gp17.1.
A model of the gp17.1 fold was calculated using I-TASSER. This model is consistent with the gp17.1 secondary structure predictions (Fig. 2A). It is based on the sequence alignment between the fragment 1-138 of gp17.1 and the N-terminal region of gpV (Fig. 2A). It also exhibits an additional α-helix in the region 139-177 of gp17.1 (Fig. 2, B and C). To evaluate whether the NMR analysis of gp17.1 supports the gp17.1 model proposed by HHpred and I-TASSER, we projected the secondary structure elements identified using Predator onto this atomic model. Fig. 2B shows that the NMR observable β-strands correspond to β3, β7, and β6, all belonging to the same β-sheet. In contrast, no residues belonging to β-strands β1, β4, and β5 could be assigned. Analysis of the ¹H-¹⁵N HSQC spectrum of gp17.1 recorded after 1 h of incubation in D₂O consistently revealed that three amide protons, corresponding to Tyr-68 (β3), Ile-128, and Ser-129 (β7), are involved in stable hydrogen bonds, together with nine other amide protons belonging to unassigned residues (Fig. 1B). Finally, we projected the 59 ¹H→¹⁵N NOE values that could be assigned to a gp17.1 residue (supplemental Table 1) onto the gp17.1 model (Fig. 2C). Most values higher than 0.6, corresponding to well structured residues, are found in the β-sheet formed by β3, β7, and β6 (Fig. 2C, cyan ribbon). Conversely, most of the negative values, corresponding to unstructured residues, are located in the C-terminal fragment 166-173, which adopts no secondary structure in the gp17.1 model (Fig. 2C, red ribbon). Altogether, the NMR and I-TASSER results suggest that the β-sheet formed by β3, β7, and β6 is well structured, and the β-sheet formed by β1, β4, and β5 exhibits large conformational exchange processes. The NMR signals corresponding to the two predicted α-helices are partially assigned, but their chemical shifts indicate that they are not structured as α-helices within monomeric gp17.1.

FIGURE 2. Molecular modeling of the gp17.1 monomer. A, sequence alignment between residues 1-138 of SPP1 gp17.1 (177 amino acids) and residues 6-152 of the N-terminal domain of phage λ gpV (156 amino acids), as proposed by HHpred. gp17.1_SS and gpV_SS correspond to secondary structure predictions using PsiPred and GpV_NMR to secondary structure elements determined using DSSP from the NMR structure of gpV (PDB code 2K4Q). B and C, gp17.1 model calculated by I-TASSER, based on the structural analogy between gp17.1 and gpV. B, ribbon is colored based on the gp17.1 NMR chemical shift analysis as follows: unassigned residues in gray; assigned residues in secondary structure elements in red (as deduced from chemical shift analysis using Predator); and assigned residues not involved in secondary structures in yellow and green (for loop 40-56). The eight residues in red (Thr-66, Tyr-67, Tyr-68, Asp-120, Gly-121, Val-126, Glu-127, and Ile-128) are labeled. C, ribbon is colored based on the gp17.1 ¹H→¹⁵N NOE analysis as follows: unassigned residues are rendered in dark gray; assigned residues with reduced local mobility in cyan (NOE >0.6); assigned residues in flexible regions in orange (0 < NOE < 0.6); and assigned residues in unstructured regions in red (NOE <0).
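The NOE thresholds used to color the model in Fig. 2C amount to a three-way classification of each assigned residue. The sketch below shows one way to express it; the function name, the category labels, and the treatment of values exactly equal to 0 or 0.6 (placed in the flexible class) are assumptions, because the text does not specify the boundary cases.

```python
def classify_noe(noe):
    """Map a 1H->15N NOE value to the mobility class used for coloring in Fig. 2C."""
    if noe > 0.6:
        return "reduced mobility"   # cyan in Fig. 2C
    if noe < 0.0:
        return "unstructured"       # red in Fig. 2C
    return "flexible"               # orange; boundary values 0 and 0.6 ASSUMED to fall here


# Hypothetical NOE values, for illustration only.
for value in (0.75, 0.35, -0.4):
    print(value, classify_noe(value))
```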
Purified gp17.1 Forms Tubes Rich in β-Structure-NMR analysis of gp17.1 at a concentration of 8 mg/ml (0.4 mM) and a temperature of 25°C revealed that the NMR signal of this protein reproducibly disappears with time. To follow its association state as a function of time, we purified unfolded gp17.1 in urea by affinity chromatography and then dialyzed the protein in a refolding buffer and injected the protein at 2 mg/ml (0.1 mM) on a SEC-MALS system (Fig. 3A). In these conditions, gp17.1 is eluted in two peaks. Measurement of the differential refractive index and the light scattering intensity corresponding to each peak provided an estimate of their mass. The second peak corresponds to a mass of 28.8 ± 2.4 kDa, slightly higher than the mass of a monomer (20.2 kDa), whereas the first peak corresponds to a heterogeneous population of multimers with a molecular mass greater than 1000 kDa. These multimers (50 μM) incubated for 3 days at 37°C form long tubes of different lengths as observed by EM (Fig. 3B).
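The SEC-MALS masses can be put in terms of monomer equivalents with simple arithmetic. The sketch below uses the 20.2-kDa monomer mass given above; the assumption that multimers are built from rings of six subunits anticipates the EM analysis and is used here only for an order-of-magnitude illustration.

```python
import math

MONOMER_KDA = 20.2  # monomer mass quoted in the text


def monomer_equivalents(mass_kda):
    """Number of gp17.1 monomers corresponding to a measured mass."""
    return mass_kda / MONOMER_KDA


def min_rings(mass_kda, subunits_per_ring=6):
    """Smallest whole number of rings able to account for the mass (ASSUMES hexameric rings)."""
    return math.ceil(monomer_equivalents(mass_kda) / subunits_per_ring)


print(round(monomer_equivalents(28.8), 2))    # second peak, in monomer equivalents
print(round(monomer_equivalents(1000.0), 1))  # lower bound for the first peak
print(min_rings(1000.0))
```

A mass above 1000 kDa therefore corresponds to at least about 50 subunits, that is, assemblies already several rings long.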
To follow the formation of tubes from gp17.1 monomers, we recorded FTIR spectra and EM images of a refolded and SEC-purified sample of monomeric gp17.1 incubated at 37°C (Fig. 4). We observed tubes 48 h after collection from the SEC column at 3 mg/ml (0.15 mM; Fig. 4A). The tubes increased in length over time. FTIR spectra showed a shift of a band near 1645 cm⁻¹ assigned to random coil toward a band at 1620 cm⁻¹ corresponding to β-sheet in the amide I band spectrum over the reaction time course (Fig. 4B). Changes in gp17.1 secondary structure were highlighted by calculating the difference absorbance spectra corresponding to the amide I band (bottom panel in Fig. 4B). They were obtained by subtracting the initial protein spectrum from spectra obtained at different incubation times at 37°C. The increase in absorbance around 1620 cm⁻¹ with time is indicative of β-sheet formation in gp17.1. This increase of β-sheet content correlates with formation of gp17.1 tubes as observed by EM.
gp17.1 Tubes Structure Studied by ssNMR-Self-assembly of gp17.1 into long tubes prompted their structural analysis by ssNMR. Tubes were obtained from uniformly ¹³C,¹⁵N-labeled gp17.1 proteins incubated at 37°C for 1 week and pelleted by centrifugation. ¹³C-¹³C PDSD experiments were recorded on the resulting sample at 253 K. The favorable ¹³C line widths, below 1 ppm, observed at 800 MHz in ¹³C-¹³C PDSD experiments suggested that the gp17.1 polypeptide chain is well folded in the tubes (Fig. 5A). A ¹³C-¹³C PDSD spectrum was calculated in silico on the basis of the solution NMR chemical shifts; this enabled comparison between the solid-state (black peaks) and solution-state (green symbols) NMR fingerprints of gp17.1 (Fig. 5A). In several regions of the ¹³C-¹³C PDSD spectrum, including regions 3 and 4, the experimental and calculated peaks superimpose poorly. When gp17.1 is instead assumed to adopt the strongly predicted β-sandwich-like fold (Fig. 5A, blue symbols), the correlation between the experimental and calculated NMR cross-peaks improves significantly (Fig. 5A). In the soluble monomer, most threonine and serine Cα-Cβ correlations exhibit secondary chemical shifts expected for random coil conformations. In contrast, the same region in the ssNMR spectrum is dominated by cross-peak correlations typically found in β-strand conformations (Fig. 5, A and B, zoom 4) (19). Thus, additional β-strands rich in serine and threonine residues are observed in tubes. Furthermore, in the region corresponding to proline Cα-Cβ, most solid-state peaks are found at β-structure-specific chemical shifts (Fig. 5, A and B, zoom 3). Finally, in the region corresponding to alanine and leucine Cα-Cβ correlations, several solid-state peaks are found at either α-helix- or β-strand-specific chemical shifts (Fig. 5B, zooms 1 and 2). Altogether, we conclude that the gp17.1 protein fold is stabilized in the self-assembled tubes with Ser, Thr, Pro, Ala, and Leu found in β-strand regions and α-helix conformations. This secondary structure amino acid content is consistent with the β-strands and α-helix predicted for gp17.1 (Figs. 2B and 5B), supporting the presence of the corresponding β-sandwich organization in the tubes.

FIGURE 3. A, ... pre-equilibrated with 20 mM sodium phosphate buffer, pH 7.5, and 50 mM NaCl. The UV light absorbance (UV), differential refractive index, and light scattering intensity were measured as a function of the elution volume. Molecular masses calculated by the MALS-UV-refractive index method (gray) are plotted against elution volume with UV light absorbance (red) and light scattering intensity (blue) traces overlaid. The second peak corresponds to a mass of 28.8 ± 2.4 kDa, slightly higher than the mass of a monomer (20.2 kDa), whereas the first peak corresponds to largely heterogeneous oligomers with a molecular mass greater than 1000 kDa. B, micrograph of material from the oligomer peak at 0.05 mg/ml, 3 days after elution from the chromatography column. The sample was negatively stained with 0.5% phosphotungstic acid.
gp17.1 Assembles into Helical Tubes of Stacked 6-Fold Symmetric Rings-Comparison of gp17.1 tubes with tails from SPP1 virions indicated that they have similar diameters of 11 nm and share a common morphology (Fig. 6, A and B). Stain penetration, revealed by a dark line inside the tube, shows that the tubes are hollow, as observed for the empty tail tubes of virions that ejected their DNA following incubation with the SPP1 bacterial receptor (Fig. 6A) (38). Class averages (~10 images/class) of aligned segments of gp17.1 tubes were obtained, and their diffraction patterns were calculated (Fig. 6C). Analysis of these diffraction patterns showed that the tubes have 6-fold symmetry and are therefore formed by stacked hexameric rings. The rings are 4.0 ± 0.2 nm in height, and each is rotated by 21.4 ± 0.2° relative to its neighbor to form the helical tube (Fig. 6D). SPP1 virion tails are composed of ~40 stacked rings of ~40 Å in height, and each tail ring is rotated by ~21° relative to the previous one. Thus, the recombinant gp17.1 tubes assembled in the absence of any other phage SPP1 protein show the same organization found for gp17.1 in the virion tail.
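The comparison between tube and virion parameters mixes nanometers and ångströms, so a short sketch that puts them in the same units may help. It uses only the values quoted above; the variable names are illustrative, and the virion values are the approximate ones given in the text, with no error bars assumed.

```python
ANGSTROM_PER_NM = 10.0
SUBUNITS_PER_RING = 6  # 6-fold symmetry reported for the tubes

# Recombinant gp17.1 tubes (values and uncertainties from the text).
tube = {"rise_nm": 4.0, "rise_err_nm": 0.2, "twist_deg": 21.4, "twist_err_deg": 0.2}

# SPP1 virion tails (approximate values from the text).
virion = {"rise_angstrom": 40.0, "twist_deg": 21.0, "rings": 40}

# Convert the virion ring height to nm so that both rises share one unit.
virion_rise_nm = virion["rise_angstrom"] / ANGSTROM_PER_NM

rise_difference_nm = abs(tube["rise_nm"] - virion_rise_nm)
twist_difference_deg = abs(tube["twist_deg"] - virion["twist_deg"])

# Height of a stack of ~40 rings of ~40 A each.
virion_tail_length_nm = virion["rings"] * virion_rise_nm

# A 6-fold ring is indistinguishable from itself after a 60 degree rotation,
# so the lattice azimuth repeats after this many rings (not an integer).
rings_per_sixfold_repeat = (360.0 / SUBUNITS_PER_RING) / tube["twist_deg"]

print(f"rise difference:  {rise_difference_nm:.2f} nm")
print(f"twist difference: {twist_difference_deg:.2f} deg")
print(f"stack of {virion['rings']} rings: {virion_tail_length_nm:.0f} nm")
print(f"rings per 60 deg: {rings_per_sixfold_repeat:.2f}")
```

With these inputs the two rises coincide, the twists differ by a fraction of a degree (the virion value being quoted only approximately), and a stack of about 40 rings is about 160 nm tall.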
Modeling the gp17.1 Tube Structure Suggests a Critical Role for Loop β2β3 in Tube Assembly-Although the gp17.1 monomer does not adopt a stable three-dimensional structure in solution, characterization of self-assembled gp17.1 by ssNMR showed that the gp17.1 protein acquires a stable fold in the tube (Fig. 5). The secondary structure content and ssNMR signature (Figs. 4B and 5) of gp17.1 in the tubes are consistent with a tertiary structure similar to the β-sandwich structure of phage λ TTP (Fig. 2, B and C). The same β-sandwich fold is found in the N-terminal part of SPP1 gp19.1, a phage protein of the Dit family (2, 3). Dit proteins form a hexamer at the tail tube end that extends the tail channel to the tail adsorption apparatus (29). The X-ray structure of the gp19.1 hexamer (PDB code 2X8K) was used as a template to model the gp17.1 hexamer (Fig. 7). gp17.1 hexamers were then positioned using the helical parameters determined by EM to simulate a gp17.1 tube (Fig. 8A). In that model, the interface between two subunits consists of β-sheets connecting adjacent β-sandwiches (Fig. 8B). The large loop β2β3, which is unstructured in the monomer (Fig. 2), is located at the interface between subunits and hexamers (Fig. 8B). Its structure was modeled from the corresponding segment in gp19.1, which forms a β-sheet at the gp19.1 hexamer inter-subunit interface. Loop β2β3 from one gp17.1 subunit forms a two-stranded β-sheet that interacts with a β-strand of the β-sandwich of the neighboring gp17.1 subunit and may thus stabilize the three-dimensional structure of this subunit. This model suggests that loop β2β3 is essential for tube assembly by contributing to the formation of a large interface between adjacent subunits of the tail hexamers (Figs. 7B and 8B). Moreover, in this model, loop β2β3 participates in the contact surface between stacked hexamers (Fig. 8B).
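Positioning hexamers with helical parameters amounts to applying a rotation about the tube axis and a translation along it to each successive ring. The sketch below illustrates that operation only; it is not the authors' modeling procedure. The "monomer" is a pair of made-up points standing in for a homology model, the hexamer is generated by ideal 6-fold replication rather than taken from the 2X8K template, and the sense of rotation (handedness) is an assumption because the text does not state it.

```python
import math

RISE_NM = 4.0     # axial shift per ring, from the EM analysis in the text
TWIST_DEG = 21.4  # rotation per ring, from the EM analysis in the text

def rotate_about_z(coords, angle_deg, shift_nm=0.0):
    """Rotate (x, y, z) points about the z axis and shift them along z.
    Positive angles are counterclockwise; the handedness is an assumption."""
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y, z + shift_nm) for x, y, z in coords]

def build_hexamer(monomer):
    """Replicate one subunit six times at 60 degree intervals (ideal C6 ring)."""
    ring = []
    for j in range(6):
        ring.extend(rotate_about_z(monomer, 60.0 * j))
    return ring

def build_tube(hexamer, n_rings, rise_nm=RISE_NM, twist_deg=TWIST_DEG):
    """Stack n_rings copies of the hexamer, applying the helical operator
    (k * twist, k * rise) to ring number k."""
    tube = []
    for k in range(n_rings):
        tube.extend(rotate_about_z(hexamer, k * twist_deg, k * rise_nm))
    return tube

# Hypothetical placeholder coordinates in nm; a real model would supply
# atomic coordinates of the subunit instead.
monomer = [(3.0, 0.0, 0.0), (5.0, 0.5, 1.0)]

hexamer = build_hexamer(monomer)
tube = build_tube(hexamer, n_rings=5)

print(len(hexamer), "points per ring;", len(tube), "points in 5 rings")
```

Because the operator is a rigid-body transform, every ring keeps the internal geometry of the template hexamer; only its azimuth and height change from one ring to the next.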
Loop β2β3 Is Necessary for gp17.1 Tube Self-assembly-A gp17.1 protein lacking segment 40–56 (gp17.1Δ) was stably produced. Comparison of the NMR ¹H-¹⁵N HSQC spectra of gp17.1SPP1 and gp17.1Δ monomers revealed that they have a similar fold (Fig. 9). Attempts to assign the gp17.1Δ ¹H-¹⁵N HSQC spectrum faced the same difficulties encountered when assigning the spectrum of gp17.1. The gp17.1Δ HSQC signals that could be assigned corresponded to residues also assigned in gp17.1, and about 40% of the residues could not be assigned in gp17.1Δ. The heterogeneity of the gp17.1Δ ¹H-¹⁵N HSQC line widths showed that this protein is also subject to conformational exchange processes. ¹H-¹⁵N HSQC peaks corresponding to minor conformations are still observed in gp17.1Δ. However, SEC revealed that, in contrast to gp17.1, gp17.1Δ does not form oligomers upon incubation for 3 days at 37°C (Fig. 10A). Similarly, EM did not show detectable formation of tubes when gp17.1Δ at 3 mg/ml (0.15 mM) was incubated for 3 days at 37°C (data not shown).
After 3 weeks at 4°C, gp17.1 still exists in the monomeric (peak P2) and oligomeric (peak P1) forms, as revealed by SEC (Fig. 10B). Mixing it with gp17.1Δ reduces the amount of gp17.1 oligomers. SDS-PAGE analysis of each gel filtration peak confirms that the peaks containing the oligomers are composed only of gp17.1, whereas both gp17.1 and gp17.1Δ are found in the monomer-containing peaks (Fig. 10C). Altogether, these chromatography results suggest that gp17.1Δ is capable of interacting with gp17.1 and reduces the propensity of gp17.1 to assemble into oligomers.
We then tested whether gp17.1Δ could interfere with assembly of wild-type TTPs to build the tail structure during wild-type infection. The mutant protein was produced as a soluble protein in B. subtilis YB886 (pIA65) in an amount comparable with the amount of wild-type TTP synthesized during SPP1 infection (Fig. 11A, lane 6). The efficiency of plating of SPP1 wild-type in cells producing gp17.1Δ was only slightly reduced (~80%; Fig. 11B), but the phage plaque size was considerably smaller than in normal SPP1 infections. The presence of gp17.1Δ thus had a negative impact on phage multiplication in the cell (Fig. 11B). Analysis of the viral particles produced in the presence and absence of gp17.1Δ, separated on a discontinuous CsCl density gradient, showed that a large amount of DNA-filled capsids accumulates when gp17.1Δ is present during infection, in contrast to normal wild-type infections, in which complete phage particles predominate (Fig. 11C). This observation strongly suggests that the amount of tail particles competent for virus morphogenesis becomes limiting when assembly occurs in the presence of gp17.1Δ.

DISCUSSION
Here, we show that the TTP of phage SPP1, gp17.1, self-assembles into tubes. The organization of gp17.1 in the tubes is remarkably similar to the one found in virus tails. They are both formed by stacked hexamers, rotated by ~21°, and shifted along the tail axis by ~4 nm (Fig. 6) (6). A similar helical tail tube organization has been recently observed in the Siphoviridae Araucaria (10) and in the giant Myoviridae ΦRSL1 (9).
gp17.1 monomers, which do not adopt a stable fold in solution (Fig. 2), undergo a conformational change characterized by stabilization or formation of β-structures to assemble tubes (Figs. 4 and 5). Bioinformatics, FTIR, and ssNMR results suggest that this structure is a β-sandwich similar to the structures of phage λ TTP (35) and Dit protein N-terminal domains, including the domain of Dit gp19.1 of phage SPP1 (29). The pseudo-atomic model of the gp17.1 tube, built by homology modeling and using the tube helical parameters, indicates that, within gp17.1 hexamers, intermolecular interfaces might result from interaction between gp17.1 β-sandwich structures through pairing of accessible β-strands. Recent docking of a TTP model into the cryo-EM density of the ΦRSL1 virion also suggested that large intermolecular β-sheets were responsible for hexamer assembly within the tail tube (9). These interactions should stabilize the β-sandwich fold in the tube structure.
In our experiments, gp17.1 loop β2β3 spanning residues 40–56 is essential for gp17.1 tube formation in vitro (Fig. 10) and assembly of phage tails in vivo (Fig. 11). Positioning of this loop in the model of the helical tube suggests that it participates both in hexamer intersubunit contacts and in inter-hexamer interactions. gp17.1Δ, in which the loop was deleted, interferes with formation of wild-type gp17.1 tubes, showing that the two gp17.1 forms interact and that gp17.1Δ disrupts the TTP assembly reaction. A disordered loop (residues 50–78) was also observed between strands β2 and β3 of gpV (35); it corresponds to SPP1 loop 40–56, as proposed by secondary and tertiary structure predictions (Fig. 2B). As the gpV loop β2β3 is strongly negatively charged, its functional role was assessed by mutating Asp-61 and Asp-62 into alanines (35). It was shown that the mutant D61A/D62A is completely devoid of biological activity and that it displays a dominant negative phenotype in that it inhibits the growth of wild-type phage. The authors suggested that mutant gpV D61A/D62A incorporates into tail assembly intermediates but that its presence inhibits further tail tube growth. On this basis, it was postulated that loop β2β3 mediates hexamer-hexamer interactions essential for phage tail polymerization. The pseudo-atomic model of the gp17.1 tube likewise suggests a role for SPP1 loop β2β3 in intermolecular interactions (Figs. 7B and 8B). In this model, loop β2β3 stabilizes the gp17.1 fold both by forming a β-sheet with the β-sandwich of the adjacent subunit within the hexameric ring and by contributing to interactions between rings of the gp17.1 helical tube.

[Figure legend fragment] The blue spectrum was acquired on gp17.1SPP1 and is the same as in Fig. 1A. The red spectrum was obtained on gp17.1Δ in 20 mM sodium phosphate, pH 7.0, 100 mM NaCl.
The phage TTP fold is also found in bacterial Hcp-like proteins, which form a channel in type VI secretion systems (35). Interestingly, all Hcp-like proteins purified so far are hexamers in solution, with the exception of EvpC from Edwardsiella tarda, which was found in both dimeric and hexameric states in solution (39). Furthermore, no in vitro self-association of these hexamers has been reported to date. Surface plasmon resonance experiments showed that Hcp hexamers transiently interact with each other but are incapable of forming tubes alone at micromolar concentrations (40). After intermolecular disulfide bond engineering, Hcp1 from Pseudomonas aeruginosa self-assembles into nonhelical nanotubes that are observable by EM (41). These tubes can be readily disassembled by addition of a reducing agent. Thus, our study supports the view that the evolutionarily related phage and T6SS bacterial tubes are built from a common structural element that can be observed in a hexameric state. However, assembly of this element into tubes follows different pathways and might lead to different tubular architectures.
In conclusion, we report here the first structural analysis of a phage TTP self-assembled tube. Bacteriophage SPP1 TTP exists in a monomeric form that is partially folded. This form is probably favored in infected bacteria to prevent the formation of TTP tubes competing with assembly of biologically active tail structures. However, in vitro, increasing the gp17.1 concentration at physiological temperature leads to the formation of TTP tubes. We propose that tube assembly involves formation of β-sheets between adjacent β-sandwich structures and contacts involving loop β2β3. Our results strongly suggest that these intermolecular interactions are present both in self-assembled tubes and in phage tails. However, it remains to be explored how they are modified after bacterial recognition and during DNA ejection. gp17.1 structural plasticity could be important for the observed SPP1 tail reorganization triggered during infection (6). Finally, our study demonstrates that the SPP1 TTP has the inherent ability to fold and form tubes in the absence of any other phage protein. This remarkable feature opens the way for generation of TTP-based tubular structures grafted with molecules exhibiting specific functionalities. Such engineered nanotubes could be encoded by a single gene resulting from fusion of the TTP gene with the gene coding for the grafted protein (as observed in TTPs comprising a tube assembly domain and an additional accessory domain (11, 12)), or they could be formed by chemically linking the TTP to the molecule to be grafted.

FIGURE 10. SEC behavior of gp17.1Δ and its interference with gp17.1 oligomerization. A, SEC of gp17.1 (black) and gp17.1Δ (gray) after 3 days at 37°C (thick lines) or 1 week at 4°C (thin lines). All protein samples were injected at ~2 mg/ml on a Superdex 200 16/60 HR column equilibrated in 20 mM sodium phosphate, pH 7.5, 50 mM NaCl. Calibration of the column showed that P1 and P2 would correspond to molecular masses higher than 500 kDa and equal to 21 kDa, respectively, in the case of globular proteins. B, SEC of gp17.1 (–), gp17.1 with 33% of gp17.1Δ (–··), 17% of gp17.1Δ (–·), and 9% of gp17.1Δ (– –) after 3 weeks at 4°C. Concentrations of all the injected samples were between 1.5 and 2 mg/ml. Percentages correspond to mass ratios. C, SDS-PAGE of fractions eluted in B. One fraction of peak P1 (top gel) and several fractions covering peak P2 (middle and bottom gels) of the different samples run in B were analyzed. A gp17.1Δ sample was added as a reference in the top gel. Only gp17.1 was detected in peak P1, and both gp17.1 and gp17.1Δ were eluted in peak P2.

FEBRUARY 6, 2015 • VOLUME 290 • NUMBER 6