Specificity of Interactions among the DNA-packaging Machine Components of T4-related Bacteriophages*

Tailed bacteriophages use powerful molecular motors to package the viral genome into a preformed capsid. Packaging at a rate of up to ∼2000 bp/s and generating a power density twice that of an automobile engine, the phage T4 motor is the fastest and most powerful reported to date. Central to DNA packaging are dynamic interactions among the packaging components, capsid (gp23), portal (gp20), motor (gp17, large “terminase”), and regulator (gp16, small terminase), leading to precise orchestration of the packaging process, but the mechanisms are poorly understood. Here we analyzed the interactions between small and large terminases of T4-related phages. Our results show that the gp17 packaging ATPase is maximally stimulated by homologous, but not heterologous, gp16. Multiple interaction sites are identified in both gp16 and gp17. The specificity determinants in gp16 are clustered in the diverged N- and C-terminal domains (regions I–III). Swapping of diverged region(s), such as replacing C-terminal RB49 region III with that of T4, switched ATPase stimulation specificity. Two specificity regions, amino acids 37–52 and 290–315, are identified in or near the gp17-ATPase “transmission” subdomain II. gp16 binding at these sites might cause a conformational change positioning the ATPase-coupling residues into the catalytic pocket, triggering ATP hydrolysis. These results lead to a model in which multiple weak interactions between motor and regulator allow dynamic assembly and disassembly of various packaging complexes, depending on the functional state of the packaging machine. This might be a general mechanism for regulation of the phage packaging machine and other complex molecular machines.

and conversion of biochemical energy into mechanical motion.
Phage T4 packages its 171-kb, 56-µm-long genome into a capsid shell that is 120 nm long and 86 nm wide (4). A packaging machine is assembled at the special portal vertex of an empty prohead by binding of five molecules of a motor protein, the large "terminase" gp17 (gene product 17; 70 kDa), to the dodecameric portal (gp20; 61 kDa) (5). Prior to assembly, gp17, in complex with the small terminase, gp16 (18 kDa) ("holoterminase"), recognizes and cuts the concatemeric genome. The newly generated end is inserted into the portal channel, and the DNA is forced into the head utilizing ATP energy. After filling the head, which accommodates ∼103% of the viral genome ("headful" packaging), the motor disengages and cuts the DNA. The gp17-DNA complex then docks on another empty prohead and continues packaging in a processive fashion, making a series of cuts after each headful packaging event (1). The phage T4 motor, translocating up to ∼2000 bp/s and producing a power density twice that of an automobile engine (∼5,000 kW/m³), is the fastest and most powerful packaging motor reported to date (6).
Biochemical and x-ray structural analyses show that gp17 consists of two domains, an N-terminal ATPase domain that provides energy for DNA translocation and a C-terminal translocase domain that supports DNA movement as well as making headful packaging cuts (7–9). The ATPase domain has two subdomains, a larger RecA-like nucleotide binding subdomain I and a smaller "transmission" subdomain II, which form a cleft into which ATP binds (10). The C-domain has an RNase H fold that most closely resembles resolvases (11–13) and consists of two putative DNA grooves, one for translocation and another for cutting (5). The domains are connected by a flexible hinge, which facilitates interdomain interactions and domain movements during DNA translocation. In the proposed mechanism (5), ATP hydrolysis triggers a conformational change causing subdomain II to rotate by 6°, aligning an array of complementary charged pairs and hydrophobic surfaces between the domains. The electrostatic force pulls the C-domain-bound DNA upward by ∼7 Å, translocating 2 bp of DNA into the capsid. The adjacent C-domain then comes into alignment with the DNA phosphates, binding DNA and repeating the translocation cycle.
The packaging machine also includes a small terminase, gp16 (18 kDa), which is essential for DNA packaging in vivo (14), but in vitro, where DNA translocation can be measured independent of other steps, the motor can function without gp16 (15–17). gp16 forms oligomeric "rings," each ring consisting of 8–11 subunits (18–20). Although gp16 does not possess any enzymatic activities, it has interaction sites that recognize the viral genome and nucleate the formation of a holoterminase complex with gp17 (7,18,20,21). gp16 has three domains: a central domain containing two long helices that are essential for oligomerization (20) and an N-terminal domain and a C-terminal domain that presumably interact with other packaging components, DNA, ATP, and gp17 (22). Recent evidence indicates that gp16 plays broader roles, as a regulator of the packaging motor, modulating the ATPase, translocase, and nuclease activities associated with gp17 (16,22,23). The x-ray structure of an octameric small terminase from phage SF6 shows that the central long helices form a vase-shaped structure with a 17–27-Å central channel, whereas the N- and C-terminal domains form an appendage at the wider perimeter and a neck emanating from the center, respectively (24). However, which regions of this structure are important for interaction with other packaging components is not known.
* This work was supported, in whole or in part, by National Institutes of Health, NIAID, Grant R56AI081726. This work was also supported by National Science Foundation Grant MCB-0923873.
Central to DNA packaging is the multitude of interactions that occur in concert and with precision to produce a DNA-full head carrying the full complement of viral genome. T4 is considered to be the most efficient phage, producing infectious virions to nearly theoretical efficiency of 1 (25). Among the numerous interactions involved in packaging are those between (i) gp16, gp17, and DNA to generate ends; (ii) gp16, gp17, gp20, and DNA to initiate translocation; (iii) gp20, ATPase domain, translocation domain, and DNA to drive directional motion of DNA; and so on (1). Furthermore, the composition and stoichiometry of complexes must be remodeled at various stages of translocation. The sequence and structure of interacting sites and how the interactions among the components modulate DNA packaging are poorly understood.
Here we investigated the interactions among the packaging machine components of T4-related phages, specifically between the motor (gp17) and the regulator (gp16). Our results identify, for the first time, multiple specificity determinants in both gp16 and gp17. A high degree of specificity is evident even among the related T4 phages. The sites of interaction are clustered in the variable N- and C-domains of gp16, whereas at least two sites are identified in or near the critical "transmission" subdomain II that regulates gp17-ATPase. Swapping of key diverged region(s) (e.g. replacing the C-terminal region of RB49 gp16 with that of T4) resulted in a switch of ATPase stimulation specificity to T4 gp17. These results provide insights into how gp16 interaction might affect the orientation of certain catalytic residues in the ATPase pocket. A model is presented in which multiple weak interactions between motor and regulator allow dynamic assembly and disassembly of complexes for precise orchestration of the packaging process.

EXPERIMENTAL PROCEDURES
Sequences and Alignments-The amino acid sequences of T4 and related phage terminases were obtained from GenBank™. Sequence identity analyses were performed using BLAST from the NCBI data base (26). Multiple sequence alignments were conducted using ClustalW (27).
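As an illustration of what a percent identity value means, the following minimal Python sketch computes identity from two rows of an existing alignment. It is not the BLAST or ClustalW procedure used here; the function name `percent_identity`, the choice of denominator (all columns except those where both rows are gaps), and the toy sequences are assumptions made for illustration only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two rows of a pairwise alignment.

    Assumption: columns where both rows carry a gap ('-') are ignored,
    and every other column counts toward the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no information for this pair
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical toy alignment, not taken from any terminase sequence.
row_a = "MEGLDINKLL"
row_b = "MEGL-INRLL"
print(f"{percent_identity(row_a, row_b):.1f}% identity")
```

Different tools normalize identity differently (for example, by the shorter sequence or by the aligned region only), so values computed this way need not match the reported ones exactly.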
Cloning of Wild-type and Mutant Terminases from T4-related Bacteriophages-The g17 and g16 coding sequences of T4-related phages were amplified by PCR using purified phage DNA as templates (the phage DNAs from T4-related phages were a kind gift from Dr. James D. Karam (Tulane University School of Medicine)). The terminase mutant clones constructed in this study include a series of N- and C-terminal truncations and domain swap mutants. The truncations were constructed by PCR amplification of the desired part of the terminase DNA sequence with specific primers. Most domain swap mutants were constructed using the PCR-directed splicing by overlap extension strategy that was described in detail elsewhere (8,28). When swapping was close to the termini of a gene, altered sequences were directly introduced as part of the forward or reverse primers. The amplified DNA fragments were concentrated by ammonium acetate/isopropyl alcohol precipitation, digested with appropriate restriction enzymes, purified by agarose gel electrophoresis, and ligated to the linearized pET-15b or pET-28b plasmid DNA. In-frame insertion of these fragments into the vector results in the fusion of a hexahistidine tag to the N terminus of each terminase construct. The ligated DNAs were transformed into Escherichia coli XL10 Gold-competent cells (Stratagene, La Jolla, CA), and miniprep plasmid DNAs were prepared by the alkaline lysis procedure (Qiagen, Valencia, CA). The presence of DNA inserts and their orientation were tested using restriction digestion and/or amplification with insert-specific primers. The accuracy of the cloned DNA was confirmed by DNA sequencing (Davis Sequencing, Inc., Davis, CA). The plasmids were then transformed into E. coli BL21 (DE3) pLysS-competent cells (Stratagene) for protein expression.
Purification of Wild-type and Mutant Terminases from T4-related Bacteriophages-The E. coli strain BL21 (DE3) pLysS cells containing a given terminase clone were induced with IPTG at 30°C for 2.5–3.5 h. The solubility of the overexpressed proteins was tested with bacterial protein extraction reagent (B-PER; Pierce). Cells were harvested by centrifugation at 8,200 × g for 15 min at 4°C and were lysed using an Aminco French press (Thermo Fisher Scientific Inc., Waltham, MA). The cell lysate was centrifuged at 34,000 × g for 20 min at 4°C to separate the soluble fraction from the pellet, and the supernatant containing soluble proteins was purified by successive chromatography on Histrap HP (affinity for the hexahistidine tag) and Hiload Superdex 200 preparation grade (size exclusion) columns using AKTA-PRIME and AKTA-FPLC systems (GE Healthcare), respectively (7). For some purifications, the samples after Histrap chromatography were purified by Mono Q-5/50 GL ion exchange chromatography (GE Healthcare) followed by Superdex 200 gel filtration. The purified proteins were concentrated by Amicon Ultra-15 centrifugal filters (Millipore, Temecula, CA) and stored at −70°C as small aliquots.
ATPase Assays-ATPase assays were performed according to the basic procedure developed previously with minor modifications (23). The purified gp17s (0.2–2 µM), either alone or with gp16 at a gp16/gp17 molar ratio (the ratio of monomeric gp16 molecules to monomeric gp17 molecules) of 10:1, were incubated at 37°C for 20 min in a 20-µl reaction mixture. The reaction mixture contained 0.05–2 mM unlabeled (cold) ATP and 75 nM [γ-³²P]ATP (specific activity, 3,000 Ci/mmol; GE Healthcare) in ATPase buffer (50 mM Tris-HCl, pH 7.5, 0.1 M NaCl, 5 mM MgCl₂). EDTA was added to a final concentration of 50 mM to stop the reaction, and the ATP hydrolysis products were separated by thin layer chromatography (TLC) on polyethyleneimine-cellulose plates (Sigma-Aldrich). The plates were air-dried for autoradiography and phosphorimaging (Storm 820 PhosphorImager, GE Healthcare). The radioactive spots were quantified using ImageQuant software (GE Healthcare). For determination of Km, unlabeled ATP concentration was varied (0.05–2 mM) while keeping the other components in the reaction mixture constant. Total Pi produced was calculated from the ³²Pi values, and Km and Kcat were determined using SigmaPlot 8.0 software (Systat Software, Inc., Chicago, IL). Any background radioactivity at the ³²Pi spot in the "buffer only" condition was subtracted from the test samples. This value is ∼0.1% of the total [γ-³²P]ATP present in the reaction mixture. Data shown are averages of duplicate or triplicate values.
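The kinetic analysis described above can be sketched in Python as follows. This is a hedged illustration only: it swaps in a Hanes-Woolf linearization fitted by ordinary least squares rather than the SigmaPlot regression used in this study, and the function names, the background-correction formula, and all numerical values are hypothetical.

```python
def phosphate_released_uM(spot_fraction, background_fraction, total_atp_uM):
    """Convert the 32Pi fraction on a TLC plate into total Pi (uM).

    Assumption: labeled and unlabeled ATP are hydrolyzed alike, so the
    background-corrected 32Pi fraction applies to the whole ATP pool.
    """
    corrected = max(spot_fraction - background_fraction, 0.0)
    return corrected * total_atp_uM


def fit_hanes_woolf(substrate, rates):
    """Estimate (Km, Vmax) from the linear form S/v = S/Vmax + Km/Vmax."""
    if len(substrate) != len(rates) or len(substrate) < 2:
        raise ValueError("need at least two matched (S, v) points")
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, rates)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # equals 1 / Vmax
    intercept = mean_y - slope * mean_x  # equals Km / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return km, vmax


# Synthetic, noise-free data generated from assumed parameters
# (Km = 0.2 mM, Vmax = 30 uM/min); these are not measured values.
true_km, true_vmax = 0.2, 30.0
atp_mM = [0.05, 0.1, 0.25, 0.5, 1.0, 2.0]
rates = [true_vmax * s / (true_km + s) for s in atp_mM]

km, vmax = fit_hanes_woolf(atp_mM, rates)
enzyme_uM = 0.6            # hypothetical enzyme concentration
kcat = vmax / enzyme_uM    # turnover per enzyme per minute
print(f"Km = {km:.3f} mM, Vmax = {vmax:.2f} uM/min, Kcat = {kcat:.1f} per min")
```

Linearized fits weight experimental error differently from direct nonlinear regression, so with real, noisy data the two approaches can give somewhat different estimates.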
Nuclease Assays-In the in vivo assay, the E. coli BL21 (DE3) pLysS cells containing cloned large terminase constructs were induced with 1 mM IPTG, and 1.5-ml aliquots were collected at different time points. Half of each aliquot was used to verify the overexpression of the terminases by SDS-PAGE. The remaining half was subjected to plasmid DNA preparation by the alkaline lysis procedure (Qiagen). The plasmid DNAs were electrophoresed on a 0.8% (w/v) agarose gel. Uninduced cells or induced cells containing nuclease-deficient constructs show no cleavage of plasmid DNA. Induced nuclease-proficient constructs cleave resident plasmid and E. coli genomic DNAs and yield a DNA smear throughout the lane (9).
In the in vitro assay, the purified gp17 (0.5–2 µM) was incubated either alone or in the presence of gp16 (4–16 µM) with 100 ng of 48.5-kb phage DNA (Fermentas) in a 20-µl reaction mixture containing 5 mM Tris-HCl, pH 8, 6 mM NaCl, and 5 mM MgCl₂ for 15 min at 37°C. The reactions were terminated by the addition of EDTA to a final concentration of 50 mM, and the samples were electrophoresed on a 0.8% (w/v) agarose gel followed by ethidium bromide staining (11).
DNA Packaging-The purified gp17s (0.1–3 µM) were incubated with proheads purified according to the procedure described previously (15) (1–2 × 10⁹ particles) and 300 ng of 48.5-kb phage DNA in a 20-µl reaction mixture containing 50 mM Tris-HCl, pH 7.5, 5 mM MgCl₂, 1 mM spermidine, 1 mM putrescine, 5% (w/v) polyethylene glycol, 1 mM ATP, and 100 mM NaCl for 45 min at room temperature. Unpackaged DNA was degraded by adding DNase I (Sigma-Aldrich) to a final concentration of 0.5 µg/µl and incubated for 30 min at 37°C. Proteinase K (Fermentas) was added to a final concentration of 0.5 µg/µl in 50 mM EDTA (pH 8) and 0.2% SDS, and the samples were incubated for 30 min at 65°C to release the packaged (DNase-protected) DNA. The reaction mixtures were loaded on a 0.8% agarose gel for electrophoresis. The ethidium bromide-stained DNA was quantified by the Gel DOC XR imaging system (Bio-Rad) (15).
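To put the DNA input of the packaging reaction on the same scale as the prohead count, the short Python sketch below converts a DNA mass into a number of molecules. It assumes an average mass of 650 g/mol per base pair, a common approximation that is not stated in this study; the function name `dna_molecules` is likewise hypothetical.

```python
AVOGADRO = 6.02214076e23     # molecules per mole
AVG_BP_MASS_G_PER_MOL = 650.0  # assumed average mass of one base pair


def dna_molecules(mass_ng: float, length_bp: int) -> float:
    """Approximate number of double-stranded DNA molecules in a given mass."""
    if mass_ng < 0 or length_bp <= 0:
        raise ValueError("mass must be non-negative and length positive")
    molar_mass = length_bp * AVG_BP_MASS_G_PER_MOL  # g/mol for one molecule
    moles = (mass_ng * 1e-9) / molar_mass
    return moles * AVOGADRO


molecules = dna_molecules(300, 48_500)
print(f"{molecules:.2e} DNA molecules in 300 ng of 48.5-kb DNA")
for proheads in (1e9, 2e9):
    print(f"{molecules / proheads:.1f} DNA molecules per prohead at {proheads:.0e} particles")
```

Under this approximation, 300 ng of 48.5-kb DNA corresponds to roughly 5.7 × 10⁹ molecules, i.e. a few DNA molecules per prohead at the particle numbers given above.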
M13 Phage Display Library Screening-Biopanning of the Ph.D.-12 phage display peptide library (M13 phage displaying random 12-mer peptide-gIII fusions; New England Biolabs) with T4 gp16 was carried out in 96-well polystyrene microtiter plates (29). The plate was coated with 150 µl/well of 100 µg/ml purified T4 gp16 diluted with 0.1 M NaHCO₃ (pH 8.6) and incubated with gentle agitation in a humidified chamber overnight at 4°C. Coated wells were blocked with 10% milk powder (w/v) in blocking buffer (0.1 M NaHCO₃, pH 8.6, 0.02% NaN₃) for 1 h at 4°C and washed six times with TBST buffer (50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 0.1–0.5% (v/v) Tween 20). To select M13 peptide-displayed phage bound to gp16, 10¹¹ phage from the Ph.D.-12 M13 phage display peptide library diluted in 100 µl of TBST were added to each coated well and incubated for 15 min at room temperature with gentle shaking. The unbound phage were washed off with TBST buffer 10 times. The bound phage were either eluted with 100 µl of elution buffer (0.2 M glycine-HCl, pH 2.2, 1 mg/ml BSA) for 10 min and neutralized with 15 µl of 1 M Tris-HCl, pH 9.1, or eluted with 100 µl of 1 µM purified T4 gp17 in TBS buffer (50 mM Tris-HCl, pH 7.5, 150 mM NaCl) for 30 min. The eluted phage were amplified by infecting the E. coli ER2738 host strain (New England Biolabs) and titrated on LB/IPTG/X-Gal plates. About 10¹¹ phage from this stock were used for the next round of biopanning. After the final (third or fourth) round of biopanning, the eluted phage were titrated, and the plaque-purified DNA was prepared from single plaques. The nucleotide sequences corresponding to the 12-mer peptides fused to gpIII were sequenced (Davis Sequencing, Inc.). These gp16-interacting peptide sequences were then aligned with T4 gp17 sequence using ClustalW (parameter settings: gap open, 25; gap extension, 0.5).
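The idea of locating a selected peptide on a protein sequence can be illustrated with a much simpler stand-in for the ClustalW alignment used here: an ungapped sliding-window scan that reports the window with the most identical residues. The function name `best_ungapped_match` and both sequences below are hypothetical and are not taken from gp17 or from the selected peptides.

```python
def best_ungapped_match(peptide: str, protein: str):
    """Return (1-based start, identical residues) of the best ungapped window.

    Ties are resolved in favor of the first (most N-terminal) window.
    """
    if not peptide or len(peptide) > len(protein):
        raise ValueError("peptide must be non-empty and no longer than protein")
    best_start, best_score = 1, -1
    for start in range(len(protein) - len(peptide) + 1):
        window = protein[start:start + len(peptide)]
        score = sum(1 for p, q in zip(peptide, window) if p == q)
        if score > best_score:
            best_start, best_score = start + 1, score
    return best_start, best_score


# Hypothetical toy sequences for illustration only.
protein = "MKTAYIAKQRQISFVKSHFSRQ"
peptide = "AKQRQLSF"
start, identical = best_ungapped_match(peptide, protein)
print(f"best window starts at residue {start}: {identical}/{len(peptide)} identical")
```

Unlike a gapped alignment with substitution scoring, this scan counts only exact identities, so it would miss matches that rely on conservative substitutions or insertions.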
Structural Modeling of gp16-Based on the sequence and secondary structure alignments of T4 gp16 with SF6 small terminase gp1, a monomeric homology model of T4 gp16 was constructed with the SWISS-MODEL program (30), using the x-ray structure of gp1 (Protein Data Bank code 3hef) (24) as a template. The C-terminal amino acids 134–164 could not be modeled due to a lack of available template. The gp16 monomeric model was then aligned with the gp1 octamer model to generate the gp16 octamer model using the PyMOL program (31).

RESULTS
Selection of T4-related Phage Terminases for Analysis of Functional Specificity-Most tailed phages encode small and large terminases, and their order on the respective genomes is well conserved (32). Small and large terminase sequences from a number of T4-related phages have been analyzed for sequence identity. The highest identity to T4 terminases was found, not unexpectedly, with the coliphage terminases from RB69 and JS98 (80–90%). However, terminases from another coliphage RB49 showed more divergence (55% identity for the small terminase and 69% for the large terminase) (Table 1). The lowest identity was observed with the Vibrio phages KVP40 and KVP20 (45% identity for the small terminase and 54% identity for the large terminase). Although these values are generally consistent with what was previously reported for other head genes, such as the major capsid protein gene 23 (34), the small terminase showed consistently greater divergence than the large terminase. This is in keeping with its role in the recognition of the respective viral genome for packaging initiation (1). Based on these analyses, the RB49 terminases that had moderately diverged and the KVP40 terminases that had greatly diverged from the T4 terminases were selected for functional and specificity analyses.
Overexpression and Purification of T4, RB49, and KVP40 Terminases-Small and large terminases from phages T4, RB49, and KVP40 were cloned into T7 expression vectors with an N-terminal hexahistidine tag and overexpressed in E. coli (Fig. 1A). The new protein bands that appeared after IPTG induction were present mostly in the soluble form (data not shown) and corresponded to the sizes calculated from the predicted amino acid sequences (Fig. 1A, arrows; see "Experimental Procedures" for details). All of the terminases were purified to >95% purity, as judged by SDS-PAGE (Fig. 1B). One or more shorter bands seen in some of the preparations most likely arose by nonspecific proteolysis during purification, as was observed previously (23).
Native PAGE showed that both the T4 and RB49 large terminases (gp17) migrated as a ladder of bands, presumably corresponding to monomers, dimers, trimers, and so on (Fig. 1C) (23). KVP40 gp17, however, remained at the top of the gel and in addition showed a smear below, which indicated that this protein may not have folded correctly. All of the small terminases (gp16) showed two oligomeric species (Fig. 1D). Previous analyses with T4 gp16 suggested that these correspond to single and double rings observed by scanning transmission electron microscopy, each ring having 8–11 subunits (18,19,23).
The T4 large terminase also exhibits a sequence-nonspecific endonuclease activity that cleaves circular plasmid DNA, which is further degraded to a smear of short DNA fragments (11). This activity appears to make the initiation and termination cuts during the processive packaging of concatemeric viral genome (9,11,36). An in vivo nuclease assay used to analyze this activity showed that the RB49 gp17 exhibited a strong nuclease activity that is qualitatively similar to that of T4 gp17 (Fig. 2B; see the broad smears of fragmented DNA in induced lanes) (7). The KVP40 gp17 showed no significant in vivo nuclease activity (as assessed by the presence of broad DNA smear) (data not shown), but the purified protein showed in vitro nuclease activity that is comparable with that of T4 gp17 (Fig. 2C). However, unlike the T4 gp17, the KVP40 gp17 produced no DNA smear; instead, it showed progressive loss of substrate DNA with increasing gp17 concentration. The KVP40 gp17 nuclease activity, like that of T4 gp17, was completely inhibited by KVP40 gp16 at a ratio of ∼1 gp17 monomer to 1 gp16 oligomer (e.g. see Fig. 2C, lanes 20 (without gp16) and 21 (with gp16)).
RB49 and KVP40 Large Terminases Can Translocate DNA into T4 Proheads-In a defined in vitro DNA-packaging system (15), both the RB49 gp17 and the KVP40 gp17 translocate phage DNA into T4 proheads at 70 and 25% efficiency, respectively, of T4 gp17 (Fig. 2D). This suggests that the packaging motor (gp17) of one T4-related phage can interact with the prohead portal of another phage and drive DNA translocation, albeit inefficiently.
Stringent Specificity of Large and Small Terminase Interactions-When T4 gp17 was mixed with gp16 from various phages, robust ATPase stimulation was observed only with its own gp16 but not with a heterologous gp16. For instance, stimulation of T4 gp17-ATPase by RB49 gp16 was only 14% of that of T4 gp16. Stimulation by the more diverged KVP40 gp16 was even lower, 4% of that of T4 gp16 (Fig. 3A). Similarly, the RB49 gp17-ATPase was stimulated only by 12% by T4 gp16 and 2% by KVP40 gp16 when compared with its own gp16 (Fig. 3C).
The specificity of gp16-gp17 interactions appears to be associated, in large part, with the ATPase domain of the large terminase because a very similar ATPase stimulation pattern was observed when the full-length gp17 was replaced with the ATPase domain (T4-N360) (7) (Fig. 3B). Furthermore, Km and Kcat data suggest that the specificity of gp16-gp17 interactions affected the ATP hydrolysis step but not the ATP binding step, because Kcat was greatly decreased (up to >40-fold) with heterologous gp16s, whereas the Km did not change significantly (<3-fold). This is consistent with our previous results that showed that ATP binding to gp17 was not affected by gp16, but hydrolysis of gp17-bound ATP was (22,23).
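The stimulation percentages quoted in this section are Kcat values normalized to the homologous gp17-gp16 pair, which is set at 100%. A minimal Python sketch of that normalization follows; the function name `percent_stimulation`, the dictionary keys, and the Kcat numbers are placeholders chosen for illustration and are not measured values.

```python
def percent_stimulation(kcat_by_partner: dict, homologous: str) -> dict:
    """Express each Kcat as a percentage of the homologous pair's Kcat."""
    reference = kcat_by_partner[homologous]
    if reference <= 0:
        raise ValueError("homologous Kcat must be positive")
    return {name: 100.0 * kcat / reference
            for name, kcat in kcat_by_partner.items()}


# Placeholder Kcat values in arbitrary units (hypothetical).
kcat_values = {
    "homologous gp16": 50.0,
    "heterologous gp16 A": 5.0,
    "heterologous gp16 B": 1.25,
}
for name, pct in percent_stimulation(kcat_values, "homologous gp16").items():
    print(f"{name}: {pct:.1f}% of homologous stimulation")
```

Because the reference is always the homologous pair for the gp17 being tested, percentages for different gp17s are not directly comparable as absolute rates.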
Specificity of Terminase Interactions Correlates with Sequence Divergence-To test if the specificity of terminase interactions correlates with the degree of sequence divergence of the small terminase, gp16s from phages RB69, RB43, and 44RR2, which diverged to different extents (Table 1), were overexpressed and purified (Fig. 4A). Native PAGE and gel filtration profiles showed that these proteins formed oligomers similar to those of T4 gp16, although the RB43 and 44RR2 gp16s showed a single predominant oligomer species (Fig. 4B) (data not shown). The ATPase stimulation results showed strong correlation between the degree of divergence and specificity. The gp16 from the closely related coliphage RB69 (90% sequence identity to T4 gp16) stimulated T4 gp17-ATPase up to 46%, whereas gp16s from more distantly related phages RB49 and RB43 stimulated only up to 16 and 20%, respectively, and the Aeromonas phage 44RR2 and Vibrio phage KVP40 gp16s that are even more diverged stimulated to <5% as compared with T4 gp16.
Sequence Alignments of T4-related gp16 Sequences Show Three Diverged Regions-Sequence alignments show that, of the three domains in gp16 (22), the central domain has a high degree of sequence similarity, whereas the N- and C-domains diverged, containing three stretches of diverged regions (Fig. 5). Region I is in the N-terminal domain between amino acids 1 and 25 (see T4 gp16 sequence), region II is at the junction of the central domain and the C-domain (amino acids 112–126), and region III is at the C terminus (amino acids 148–164). This is consistent with our model (22) in which the central oligomerization domain forms the conserved structural core, whereas the N- and C-domains form the functional core that involves interactions with large terminase and DNA. Thus, it is probable that the diverged regions in the N- and C-domains encode determinants for gp17-ATPase stimulation specificity.
C-terminal Region III Is the Most Essential for ATPase Stimulation Specificity-A series of swap mutants were constructed by exchanging one or more of the diverged regions from one phage gp16 with another (Fig. 6A). These include swapping of region I, II, or III from T4 to RB49 (mutants 6, 11, 13, and 16), from RB49 to T4 (mutants 17, 18, and 19), and from KVP40 to T4 (mutant 20). All of the swap mutants were overexpressed and purified to near homogeneity as soluble proteins (Fig. 6B). Swapping did not affect folding or oligomerization of gp16 because each of the proteins migrated as compact bands by native PAGE to positions similar to that of T4 gp16 oligomers (Fig. 6C) as well as showing similar elution profiles by FPLC gel filtration (data not shown).
Quantification of gp17-ATPase stimulation showed that the swap mutants 6 and 13, in which the diverged regions I and II were swapped from T4 to RB49, respectively, showed no major change in specificity (Fig. 6D), whereas the mutant 16, in which region III was swapped, showed a substantial change in specificity. This mutant, unlike the wild-type T4 gp16, stimulated both T4 and RB49 gp17s, and the RB49 gp17-ATPase stimulation was even higher than that of the T4 gp17-ATPase. Mutant 11, in which both regions II and III were swapped, behaved like RB49 gp16 (i.e. it essentially completely switched the ATPase stimulation specificity from T4 gp17 to RB49 gp17). Furthermore, mutant 17, in which the region III of RB49 gp16 was swapped with that of T4 gp16, even more dramatically switched its ATPase stimulation specificity, from RB49 gp17 to T4 gp17. Even the mutant 20, in which the region III of the highly diverged KVP40 gp16 was swapped with that of T4 gp16, showed a substantial shift in ATPase stimulation of T4 gp17 (22% of that of the homologous T4 gp16-gp17 pair) (Fig. 6D, mutant 20).
FIGURE 3. Specificity of interactions between large and small terminases among T4-related phages. ATPase assays were performed using purified T4 gp17 (A), T4 ATPase domain (T4-N360) (B), or RB49 gp17 (C), each with purified gp16 from T4, RB49, and KVP40 at a gp16/gp17 molar ratio of 10:1. Concentrations used were as follows: T4 gp17, 0.6 µM; T4-N360, 0.8 µM; RB49 gp17, 0.2 µM. The tables show Km and Kcat values determined from the ATP concentration versus ATP hydrolysis data shown in the graphs. Percentages of ATPase stimulation were calculated from the Kcat values by setting the Kcat of each homologous gp17-gp16 pair at 100% and normalizing the Kcat of the rest of the pairs relative to the respective homologous pair.
The possibility that the observed differences in ATPase stimulation were due to differences in the optimal ratio of [gp16] to [gp17] required for stimulation was ruled out because very similar profiles of ATPase stimulation as a function of the gp16/gp17 ratio were obtained for all of the swap mutants, and the optimal gp16/gp17 ratio for stimulation is ∼8–10:1 (Fig. 6, E and F). These analyses demonstrate that the diverged region III at the C terminus of gp16 is the most important specificity determinant for ATPase stimulation. Furthermore, the entire region III seems to be essential for specificity because splitting RB49 region III into two parts and swapping each with the corresponding T4 sequence (mutants 18 and 19) did not show a significant switch in the specificity (Fig. 6D).
Regions I and II Also Contribute to Specificity of Large and Small Terminase Interactions-A series of deletions at the C terminus of T4 gp16 resulted in progressive reduction of gp17-ATPase stimulation. The gp16-I128 construct (T4-I128, retaining amino acids 1–128 and deletion of amino acids 129–164), in which region III was deleted, showed 63% of the wild-type activity (22). These results can be explained in light of the importance of region III in gp16-gp17 interactions, but the retention of as high as 63% activity and stimulation of RB49 gp17 to only 6% (Fig. 7C) indicated that regions I and II also contribute to specificity. In fact, the above data show that T4 gp16 region II accentuates the specificity switch by region III in some of the swap mutants (Fig. 6D, compare mutant 11 with mutant 16).
To further discern the contribution by region I, mutant 9 was constructed by swapping region I of T4 gp16 with that of RB49 in the background of the truncated mutant gp16-I128 (Fig. 7A). This mutant was also purified to near homogeneity (Fig. 7B) and shown to oligomerize similarly to the wild-type gp16 (data not shown). Although the stimulation activity is greatly reduced when compared with gp16-I128, mutant 9 showed a shift in ATPase stimulation, having no detectable activity with T4 gp17 but showing 11% activity with the RB49 gp17 (Fig. 7C).
Specificity Determinants in the Large Terminase-The large terminases show an overall higher amino acid sequence similarity than the small terminases (Table 1). Sequence alignments show that the least conserved regions in gp17 are at the N and C termini (38). Because the T4 ATPase domain (N360) is sufficient for stimulation specificity (Fig. 3B), the diverged N-terminal region might be part of the specificity determinants to interact with gp16. A gp17 swap mutant was therefore constructed by replacing the N-terminal 85 amino acids of T4 gp17 with the N-terminal 83 amino acids of RB49 gp17 (Fig. 8A) and purified as a soluble protein (Fig. 8B). This mutant showed ATPase stimulation by RB49 gp16 up to 45% of that of T4 gp16 (Fig. 8C), a significant shift in specificity (the wild type shows 14% stimulation) but not a complete switch, suggesting that other sites also contribute to specificity.
Multiple Sites in gp17 Potentially Interact with gp16-In order to identify gp16-interacting peptides that share similarity to regions in gp17, biopanning experiments were conducted using a random 12-mer M13 phage display library binding to a gp16-coated polystyrene plate. After 3–4 cycles of biopanning, 14 different gp16-binding peptide sequences were identified (Table 2). Alignment of these peptides with gp17 sequence showed that at least seven potential interaction sites are present in gp17, five in the ATPase domain and two in the nuclease/translocase domain. Of these, the two most frequent matches, amino acids 37–52 and 290–315, are either in the ATPase transmission subdomain II that regulates ATP hydrolysis (amino acids 37–52) or at the junction of subdomains I and II (amino acids 290–315). Both of the sites interact with the adenine binding motif residues, YQ (amino acids 142 and 143) (38), and the 290–315 site in addition interacts with Walker A Ser 161 and is adjacent to the ATPase coupling motif, TTT (amino acids 285–287) (5). Because these are part of a network of interactions that are critical for ATPase catalysis, gp16 interaction at these sites could lead to regulation of ATP hydrolysis and other functions of the packaging machine.

DISCUSSION
The basic components of the phage DNA-packaging machine are the capsid shell, dodecameric portal, small terminase, large terminase, and DNA. DNA translocation into capsid requires precise orchestration of a series of enzyme activities, protein-protein interactions, and protein-DNA interactions involving these components. Not only are these dynamic events, some happening on a millisecond time scale, but also the composition and stoichiometry of the complexes must change at different stages of packaging (1).
Biochemical characterization of several T4-related small and large terminases shows that they have similar oligomerization and enzymatic properties. All of the large terminases show a weak ATPase activity that is stimulated by the respective small terminase, an in vitro DNA translocation activity that translocates externally added DNA into proheads, and a sequence-nonspecific nuclease activity that cleaves double-stranded DNA. These features are consistent with sequence and structural analyses, which established that the terminases share similar functional signatures and domain organization (5,11,22,38).
There is, however, clear divergence with respect to interactions among the packaging components, even among the related phages. Although all gp17s can assemble on the T4 prohead portal and translocate DNA, the highest activity was observed with its own partner, T4 gp17, the next highest activity with the moderately diverged RB49 gp17, and the least activity with the most diverged KVP40 gp17. The gp16-gp17 interactions that lead to gp17-ATPase stimulation showed even more stringent specificity. The RB49 gp16 can stimulate T4 gp17 only up to 14% of the level achieved by T4 gp16, and the KVP40 gp16 a mere 5%. Similarly, the RB49 gp17 was maximally stimulated by its own gp16 but not by T4 gp16 or KVP40 gp16. Analysis of six different T4-related small terminases showed that differences in ATPase stimulation specificity correlated with sequence divergence. We therefore hypothesized that the specificity determinant(s) are probably located in the diverged regions of gp16: region I at the N terminus (amino acids , region II at the junction of the central and C-terminal domains (amino acids 112-126), and region III at the C terminus (amino acids 148-164). Swapping of these regions resulted in a partial to complete switch in specificity. The most dramatic result was observed when RB49 gp16 region III was swapped with T4 region III, which completely switched specificity to T4 gp16, suggesting that region III has a key specificity determinant. However, the residues that confer specificity seem to be spread over a stretch of the gp16 surface because splitting region III and swapping each half did not produce a significant change in specificity. The swap mutant data further suggested that regions I and II also contribute to specificity, each causing a partial switch in specificity.
Multiple sites in the gp17 molecule are involved in gp16 interactions. Swapping of the least conserved N-terminal 85 amino acids with RB49 gp17 resulted in a partial switch in specificity, which also meant that additional sites must be involved. Because these sites could not be inferred from sequence alone, a biopanning approach was used to select gp16-interacting 12-mer peptides from a random phage M13 peptide library. Of these, the most frequently selected peptide (28 times) is similar to the gp17 amino acid sequence 290-315 that is at the junction of subdomains I and II (39). The next most frequently selected peptide (11 times) showed similarity to the divergent N-terminal residues 37-52 in subdomain II, which is consistent with the swap mutant data that showed a partial switch in specificity and also with the evidence in phage λ, where genetic studies showed that the N terminus of large terminase gpA interacts with the C terminus of the small terminase gpNu1 (40,41).
FIGURE 6. The diverged region III at the C terminus of gp16 has a key specificity determinant for gp17-ATPase stimulation. A, schematic of gp16 polypeptide showing the proposed domain organization (22). Numbers represent the number of amino acids in the T4 gp16 coding sequence. Various gp16 swap mutant polypeptides are shown as lines with different styles representing different gp16 sequences (solid dark, T4 gp16 sequence; dotted, RB49 gp16 sequence; solid light, KVP40 gp16 sequence) (e.g. mutant 6 was constructed by swapping the diverged region I of T4 gp16 with the region I of RB49 gp16). B, SDS-PAGE (10% gel) showing the purified gp16 swap mutants. Cloning, overexpression, and purification were performed as described under "Experimental Procedures." C, native non-denaturing PAGE (4-12% gradient gel) showing the oligomeric state of the gp16 swap mutants. D, gp17-ATPase stimulation by the gp16 swap mutants. The activities shown are percentages of ATPase stimulation calculated from the kcat values by setting the kcat of each homologous gp17-gp16 pair (wild-type, WT) at 100% and normalizing the kcat of the swap mutants relative to the respective homologous pair. E and F, gp17-ATPase activity as a function of the gp16/gp17 molar ratio. For each gp17, the homologous gp16, the heterologous gp16, and three gp16 swap mutants were tested. Values represent the average of duplicates from two independent experiments.
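The normalization described in the Fig. 6D legend, in which the kcat of the homologous gp17-gp16 pair is set to 100% and other gp16 variants are expressed relative to it, can be written out as a short sketch. The function name `percent_stimulation` and the kcat numbers are hypothetical; the values were chosen only so that the output reproduces the 14% and 5% figures quoted in the text and are not measured data.

```python
def percent_stimulation(kcat_by_gp16: dict[str, float], homologous: str) -> dict[str, float]:
    """Express each kcat as a percentage of the homologous-pair kcat."""
    reference = kcat_by_gp16[homologous]
    if reference <= 0:
        raise ValueError("homologous-pair kcat must be positive")
    return {name: 100.0 * kcat / reference for name, kcat in kcat_by_gp16.items()}


# Hypothetical kcat values in arbitrary units (assumption, not measured data).
t4_gp17_kcat = {
    "T4 gp16 (WT)": 400.0,
    "RB49 gp16": 56.0,
    "KVP40 gp16": 20.0,
}
for name, pct in percent_stimulation(t4_gp17_kcat, "T4 gp16 (WT)").items():
    print(f"{name}: {pct:.0f}% of homologous stimulation")
```

Because each gp17 is normalized to its own homologous pair, percentages are comparable within one gp17 but not directly across different large terminases.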
The 37-52 and 290-315 sites offer good rationalizations as to how gp16 interactions at these sites can lead to regulation of packaging ATPase activity. Both of the sites interact with the adenine-binding motif residues, YQ (amino acids 142 and 143) (38), which are critical for ATP hydrolysis.⁴ Movement of the adjacent loop (Cys 125-Arg 140) is thought to be coupled to ATP hydrolysis and DNA translocation (10). Furthermore, the 290-315 site is adjacent to TTT (amino acids 285-287), also an ATPase-coupling motif that is critical for ATP hydrolysis in several phage terminases, including T4, λ, and herpes viruses (39,42). Structural and biochemical evidence shows that the above residues are part of a large interacting network that regulates ATP hydrolysis as well as transmitting signals to the C-terminal translocase domain. Mutations at certain residues in this network result in complete loss of ATP hydrolysis and DNA translocation, whereas others lead to substantial reduction in translocation velocity (8,39,42). It is therefore conceivable that gp16 interaction at these sites would cause conformational changes that either orient or misorient certain catalytic residues into the ATPase pocket, regulating ATP hydrolysis.
⁴ K. Kondabagil and V. B. Rao, unpublished data.
(7). The ATPase domain is divided into subdomain I (white) and subdomain II (dark gray). Amino acids 1-58 and 313-360 fold into subdomain II, as determined by the x-ray structures (5,10). B, SDS-PAGE (10% gel) showing the purified gp17 proteins. C, ATPase activity of WT and swap mutant gp17s stimulated by T4 and RB49 gp16s. Percentages of ATPase stimulation were calculated from the kcat values by setting the higher kcat value as 100% for each large terminase. Values represent the average of duplicates from two independent experiments.
Previously, we proposed that gp16 consists of three domains, a central domain containing two long helices that are essential for oligomerization (amino acids 36-115), an N-terminal domain (amino acids 1-35) that interacts with DNA and gp17, and a C-terminal domain (amino acids 116-164) that interacts with gp17 and ATP, a basic organization that seems to be well conserved in other small terminases (20,22,38). The recent atomic structure of phage SF6 small terminase (gp1) octamer confirmed this organization and generated finer structural details (24). The two long central helices, α-4 and α-5, interact with α-5 of the adjacent monomer to form a three-helix bundle. These bundles form a vase-like structure with a 17-27-Å diameter central channel. The wider end of the vase extends as a narrow cylinder formed by two interwoven C-terminal β-barrels. The smaller N-terminal domain hangs down from the wider edge of the vase. We have generated an octamer structural model for T4 gp16 using sequence and secondary structure alignments with SF6 gp1 (Fig. 9). Although recent mass spectrometry data suggest that gp16 may be an 11-mer, evidence from a number of studies suggests that the stoichiometry of small terminase may not be fixed (18,23,43-45). The octamer model thus gives the overall architecture of the gp16 oligomer, which is very similar to gp1.
The model further shows that the specificity regions I-III are relatively well exposed on the surface for interaction with either monomeric gp17 or the pentameric packaging motor (Fig. 10).
The location of the small terminase in the packaging machine is unknown. It has been speculated that it is either sandwiched between the portal and large terminase (43) or hanging below the portal-large terminase complex (24). The central channel, which is continuous with the portal and motor channels, was thought to be involved in active translocation of DNA. Neither of these models is consistent with the data from the T4 system: (i) gp16 is not essential for DNA translocation per se, which means that its role in moving DNA through the channel, if any, is secondary (15,16); (ii) a prohead-gp17 complex can be isolated by mixing proheads with gp17 but not with gp16, either with or without gp17, suggesting that it is unlikely that gp16 directly interacts with the portal⁵; and (iii) at best, only a minute amount of gp16 is associated with gp17 (17,35,46), suggesting that interactions between gp16 and gp17 are quite weak and unlikely to be the ones that can sustain stable association with the packaging machine that generates >60 piconewtons of force. Furthermore, all attempts to isolate a stable gp16-gp17 complex or gp16-gp17-gp20 complex thus far have failed (data not shown), suggesting that the interactions between the packaging components are relatively weak, although the biological responses resulting from these interactions are quite profound (22,23,34).
⁵ Z. Zhang, S. Hegde, and V. B. Rao, unpublished data.
36-115), and C-domain (amino acids 116-164) are underlined by green, yellow, and cyan bars, respectively. B, homology model of T4 gp16 octamer constructed based on x-ray structure of SF6 gp1. The ribbon diagrams of the modeled N-domain, central domain, and C-domain are shown in green, yellow, and cyan, respectively. The C-terminal amino acids 134-164 (including region III) could not be modeled due to a lack of available template. The homology model was constructed using the SWISS-MODEL program. The ribbon diagrams were generated using the PyMOL program.
Our results described here and elsewhere thus point to a novel model for regulation of the phage-packaging machine and complex molecular machines in general. Multiple weak interactions between motor (gp17) and regulator (gp16) would allow assembly and disassembly of complexes with different composition and stoichiometry, depending on the functional (e.g. cutting, translocation), assembly (e.g. holoterminase, portal-bound), or nucleotide (e.g. ATP, ADP, apo) state of the machine (Fig. 10). Unlike the classic models where stable interactions generate complexes with relatively fixed stoichiometry, this model would allow dynamic remodeling of complexes through a series of well orchestrated interactions in the packaging process.
The small terminase architecture is conducive to this type of regulation. Our results indicate that small terminases consist of "constant" (central oligomerization domain) and "variable" (N- and C-domains) domains. The constant domain allows clustering and exposure of the variable domains for efficient, yet dynamic, interactions with other packaging components. For instance, a gp16 oligomer having at least eight N-terminal DNA-binding domains can efficiently capture viral DNA through multivalent interactions. Similarly, and at the same time, multiple C-domain interactions can capture gp17, nucleating the assembly of a holoterminase complex and triggering DNA cutting. These interactions can as easily be broken by subtle conformational changes following cutting, allowing gp17 to form new interactions with the portal to insert the bound DNA into the prohead and initiate packaging. This type of model will also have implications for faithful reproduction of viruses in mixed infections that frequently occur in nature. Specific interactions among packaging components allow the virus to select the correct genome and packaging partners for encapsidation. However, the stringency of interactions is not absolute, leaving considerable room to generate novel recombinant viruses. Thus, the specificity determinants that govern the interactions among packaging components modulate the functions of the packaging machine, production of infectious virions, and generation of novel viruses.