Modular organization of T4 DNA polymerase. Evidence from phylogenetics.

We describe the use of a phylogenetic approach to analyze the modular organization of the single-chained (898 amino acids), multifunctional DNA polymerase of phage T4. We have identified, cloned in expression vectors, and sequenced the DNA polymerase gene (gene 43) of phage RB69, a distant relative of T4. The deduced primary structure of the RB69 protein (RB69 gp43) differs from that of T4 gp43 in discrete clusters of short sequence that are interspersed with clusters of high similarity between the two proteins. Despite these differences, the two enzymes can substitute for each other in phage DNA replication, although T4 gp43 does exhibit a preference for its own genome. A 55-amino acid internal gp43 segment of high sequence divergence between T4 and RB69 could be replaced in RB69 gp43 with the corresponding segment from T4 without loss of replication function. The reciprocal chimera and a deletion mutant of the T4 gp43 segment were both inactive for replication and specifically inhibitory (“dominant lethal”) to the T4 wild-type allele. The results show that phylogenetic markers can be used to construct chimeric and truncated forms of gp43 that, although inactive for replication, can still exhibit biological specificity.

In DNA replication, DNA polymerases bear the major responsibility for copying genomes with high accuracy. As a group, these enzymes display a variety of molecular types, but most are unified by exhibiting two catalytic functions that control fidelity: primer/template-dependent nucleotidyl transferase (polymerase) and DNA 3′ exonuclease (proofreading function) (Kornberg and Baker, 1992). In bacteriophage T4, the two functions are part of the same polypeptide chain, the product of phage gene 43 (gp43), whereas in some biological systems the polymerase and DNA 3′ exonuclease activities are specified by separate protein subunits, e.g. Escherichia coli DNA polymerase III holoenzyme (Kelman and O'Donnell, 1995). Another E. coli enzyme, DNA polymerase I, resembles T4 gp43 in size and in possessing polymerase and DNA 3′ exonuclease functions in the same polypeptide chain; however, unlike T4 gp43, polymerase I also has an N-terminal 5′ to 3′ exonuclease function. A third E. coli DNA polymerase, polymerase II, resembles T4 gp43 in biochemical properties and amino acid sequence motifs but is slightly smaller than the phage enzyme (Cai et al., 1995). One group of DNA polymerases, the reverse transcriptases, lacks editing function altogether (Skalka and Geoff, 1993). T4 gp43 also bears a sequence-specific RNA-binding autogenous translational repressor function (Andrake et al., 1988) that only partially overlaps the DNA binding function of the enzyme (Pavlov and Karam, 1994).
Typically, replicative DNA polymerases work in complex with other proteins, which provide accessory functions that help meet a number of requirements and overcome a variety of constraints inherent to the semiconservative duplication of long supercoiled and condensed double-helical DNA genomes. In the case of T4, the interfacing of replication with other DNA metabolic processes in the phage-infected cell complicates definition of what constitutes a replication complex; however, it is clear that T4 gp43 works in partnership with several other phage-induced proteins, including the products of genes 32 (a single-strand-binding, Ssb, protein), 45 (sliding clamp), 44/62 (clamp loader, DNA-dependent ATPase), 41 (helicase), 61 (primase), and others (for recent reviews, see Kreuzer and Morrical (1994) and ). Some of the analogous proteins of the E. coli replicase are subunits of the polymerase III holoenzyme. Many studies have suggested that the single-chained 898-amino acid T4 DNA polymerase is organized into modules that specify its various activities (Lin et al., 1994; Nossal, 1969; Reha-Krantz, 1994; Spicer et al., 1988), but boundaries between modules remain largely undefined because of the interdependence of functions within the gp43 molecule and the lack of direct structural information defining modules and relating functions to one another. The most unambiguous identification of a T4 gp43 domain has been that of the DNA 3′ exonuclease site, which can be differentially inactivated by a single amino acid substitution (D219A) (Frey et al., 1993) and has been demonstrated to exist in defined gp43 fragments (Lin et al., 1994). We report here results of phylogenetic studies that provide an expanded view of the modular organization of this multifunctional replication enzyme.
We identified, cloned, sequenced, and expressed the structural gene for DNA polymerase (gene 43) of phage RB69, whose genetic map is similar to that of the canonical T-even phages, although it excludes these other phages in mixed infections and does not recombine or exhibit phenotypic mixing with them (Russell, 1967; Russell and Huskey, 1974). We show here that the T4 and RB69 polymerase genes are related in primary structure and biological functions, but they are identical only at about 65% of nucleotide positions, and neither yields viable phage recombinants when propagated in cells carrying the cloned heterologous gene (frequencies lower than 10⁻⁶). The deduced amino acid sequence of RB69 DNA polymerase also diverges from that of T4 gp43 (~62% of residues identical plus ~14% chemically similar).¹ The divergence between the two gp43 phylogenetic relatives occurs in clusters rather than being distributive, and their amino acid similarity patterns suggest that domains essential to replication functions are highly conserved. Also, despite large differences in primary structure, plasmid-encoded gp43 from either T4 or RB69 can complement the other protein for phage DNA replication; however, quantitatively, the T4 enzyme shows a preference for its own genome. An internal 55-residue segment of very high divergence between the T4 and RB69 enzymes (37 dissimilar positions) could be replaced in RB69 gp43 with its counterpart from T4 to yield a biologically functional chimeric enzyme. The reciprocal domain exchange yielded a nonfunctional gp43 that was partially inhibitory to replication by wild-type T4 gp43. An internally deleted T4 gp43 was also specifically inhibitory to T4 replication by wild-type enzyme. These inhibitory proteins may retain activities that counteract wild-type gp43. The results demonstrate the use of phylogenetic markers to define exchangeable segments in the modular T4 DNA polymerase. 
The construction of chimeric and specifically deleted derivatives of this enzyme may ultimately help assign specific gp43 functions to specific modules.

MATERIALS AND METHODS
Phage and Bacterial Strains-In addition to T4 and RB69, wild-type strains of the following T4-related phages were used: T2, T6, RB6, RB18, RB19, RB51, and RB70. The RB phages were initially described by Russell (1967), and some have been used in other, more recent studies (Jozwik and Miller, 1992;Miller and Jozwik, 1990;Eddy and Gold, 1991). They were obtained from W. B. Wood (University of Colorado, Boulder, CO). Harvests from platings of the phages on E. coli CAJ70 (plate lysates) were used for phage purification by the polyethylene glycol precipitation method (Yamamoto et al., 1970), and DNA was subsequently purified from the phage by phenol extraction as described by Kricker and Carlson (1994). Southern blot analysis revealed that all, except RB69, were closely related to T4 (see "Results"), and consequently, T4 DNA primers could be used to amplify desired gene 43 DNA segments from most of these phages by polymerase chain reaction for subsequent cloning and sequencing.
The T4 gene 43 double amber mutant 43amE4322-B22 has been described previously (O'Donnell and Karam, 1972); it bears UAG codons for positions 386 and 731 of the gene product (Reha-Krantz, 1994). The RB69 gene 43 mutant 43sacd carries a small out-of-frame internal deletion that inactivates the phage DNA polymerase; the deletion was produced by cleaving cloned RB69 gene 43 DNA at a unique SacI site (see Fig. 4) and then treating the DNA with mung bean nuclease before religation and transformation of host cells. The sacd mutation was subsequently transferred to phage by marker rescue. E. coli BL21(DE3), which contains a T7 RNA polymerase gene under lac UV5 promoter control (Studier and Moffatt, 1986), was used as host for recombinant plasmids expressing the T4 and RB69 DNA polymerase genes under control of the T7 Φ10 promoter in the pSP72 and pSP73 vectors sold by Promega. This E. coli strain was also used as host in plasmid-phage complementation assays. E. coli CR63 (Sup D, ser) was used for platings of T4 amber mutants, and E. coli strains CAJ70 (a UGA suppressor; Sambrook et al., 1967) and S/6strʳ, both nonpermissive for phage amber mutants, were used to score for wild-type phage. Growth conditions and complementation assays were as described previously (Hughes et al., 1987).
RB69 Genomic Library Construction and Screening-The RB69 library was prepared for us by the technical staff at Stratagene (La Jolla, CA), who used our purified RB69 phage DNA, partially digesting it with DraI and SspI and charging the ends with EcoRI oligonucleotide linkers before ligating the fragments to EcoRI-digested ZAP II cloning vector. Stratagene also provided the bacterial, phage, and plasmid strains and instructions for library screening. To screen for clones carrying gene 43 sequences, plaques from platings of the recombinant ZAP II lysate were hybridized to a ³²P-labeled riboprobe (1.5-2.0 × 10⁶ cpm/μl) corresponding to an internal SspI fragment of RB69 gene 43 cloned in pSP72 (see "Results"). Preparation of riboprobes was as described by Hsu and Karam (1990). Several hybridizing clones were identified, and their RB69 DNA inserts were transferred, in vivo, to pBluescript SK(−) phagemid by the use of f1 helper phage. These clones were then used to obtain the complete nucleotide sequence of RB69 gene 43 and its neighboring genomic segments. There was sufficient sequence homology with the T4 genome to allow for identification of several RB69 genes.
Cloning an Expressible RB69 Gene 43-The library screening yielded one clone that contained the entire RB69 gene 43 in a DNA segment extending from the middle of the upstream regA gene to a point about 4 kilobase pairs downstream of the DNA polymerase gene. This clone was used to transfer the intact gene in a BglII-SmaI fragment to pSP72, where it could be expressed under control of the plasmid-borne T7 Φ10 promoter.

RESULTS
DNA Polymerase Homologies among the T-even and Related Phages-T4 gene 43 encodes the only DNA polymerase known to be essential for replication of this phage. We expected changes in its primary structure, over the course of evolution, to have been of a type that conserved those aspects of tertiary structure that are necessary for replicative functions. To gain insights about the degree of divergence that has been allowed to survive evolution of gene 43, we screened several T4-related phages for their similarities in this gene. Fig. 1 shows results of Southern blot assays demonstrating gene 43 polymorphism among nine phages known to exhibit different degrees of serological and other similarities to T4 (Russell and Huskey, 1974). The experiment for Fig. 1A utilized DraI digests of genomic phage DNAs that were separated by agarose gel electrophoresis and hybridized, under low stringency conditions, to a ³²P-labeled RNA probe prepared by in vitro transcription of a cloned T4 gene 43. All revealed DNA fragments that hybridized with the T4 riboprobe. Under high stringency conditions for hybridization, the RB69 DNA sample yielded a very weak signal or no visible signal (results not shown). The RB69 DraI DNA fragment could be digested further with SspI to yield several smaller fragments that also hybridized poorly with the T4 probe (Fig. 1B). One of these fragments was cloned in pSP72 and ultimately served as the source of RNA hybridization probes for RB69 genomic library screening and isolation of RB69 gene 43 (see "Materials and Methods"). Gene 43 fragments from some of the other phages were also isolated, after polymerase chain reaction amplification, and cloned for sequencing studies.
FIG. 1. Southern blot analysis of genomic DNA preparations from T4-related phages. DNA samples were digested with either DraI (panel A) or DraI plus SspI (panel B), and the resulting digests were separated by electrophoresis on a 1% agarose gel, transferred to nitrocellulose filter, and hybridized (at 50°C) with a ³²P-labeled riboprobe prepared by antisense transcription of a complete T4 gene 43. Methods were as described by Hsu and Karam (1990). Autoradiogram lanes depict analyses of the following genomic DNA samples.
Primary Structure of RB69 DNA Polymerase-The complete nucleotide sequence of RB69 gene 43 was derived from analysis of several overlapping DNA fragments isolated by genomic library screening and subcloning. The sequencing studies confirmed relatedness of the RB69 gene to T4 gene 43 (~65% nucleotide identity) and also showed that the T4 and RB69 genetic maps have the same order for five cistrons: 43-regA-62-44-45 (results not shown). Differences in nucleotide sequence between the two phages do exist for all of these cistrons, as well as for intercistronic segments therein. The regA genes from the two sources have already been compared (Jozwik and Miller, 1992; Miller and Jozwik, 1990); the DNA polymerase genes are addressed in this report, and the other genes will be described elsewhere.² Fig. 2 shows an alignment between the primary structure deduced for RB69 gp43 from the nucleotide sequence and the known 898-amino acid sequence for T4 DNA polymerase (Spicer et al., 1988). There is a total of 348 single amino acid differences between the two proteins, including the five additional residues in RB69 gp43 (903 amino acids) as compared with the T4 enzyme. This represents almost 40% of all amino acid positions. Overall the two proteins are either identical or chemically similar at ~74% of all positions; however, most of the differences occur in clusters rather than at dispersed locations.
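The observation that differences fall in clusters rather than at dispersed positions can be illustrated with a sliding-window similarity profile over an aligned pair of protein sequences. The sketch below is a minimal illustration and not the procedure used in this study: the function names, the window size, the toy sequences, and the grouping of "chemically similar" residues are all assumptions, and the two sequences are assumed to be pre-aligned to equal length without gaps.

```python
# Hypothetical grouping of chemically similar amino acids (an assumption made
# for illustration; the criteria used in the study are not reproduced here).
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KRH"), set("DE"),
                  set("NQ"), set("ST"), set("AG")]


def is_similar(x, y):
    """Return True if two residues are identical or share a similarity group."""
    return x == y or any(x in group and y in group for group in SIMILAR_GROUPS)


def windowed_similarity(seq_a, seq_b, window):
    """Fraction of identical-or-similar positions in each sliding window."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if window < 1 or window > len(seq_a):
        raise ValueError("window must be between 1 and the alignment length")
    scores = [1 if is_similar(x, y) else 0 for x, y in zip(seq_a, seq_b)]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


# Toy aligned sequences (not real gp43 sequence): a conserved start followed
# by a short divergent stretch.
profile = windowed_similarity("MKEFYLDA", "MKEWQPNA", window=4)
print(profile)
```

Windows with low values mark candidate divergent clusters; in a real comparison the window size and similarity groups would need to be chosen and justified explicitly.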
² L.-S. Yeh, C.-C. Wang, and J. D. Karam, manuscript in preparation.
FIG. 2. Primary structure alignment for the T4 and RB69 DNA polymerases. The sequence of RB69 gp43 was deduced from DNA sequence determinations as described under "Materials and Methods." The sequence of T4 gp43 was determined in the studies by Spicer et al. (1988). The chart shows several landmarks on the T4 protein: EXO, conserved exonuclease motifs in DNA polymerases; POL, conserved sequence motifs in Family B DNA polymerases (Braithwaite and Ito, 1993), also referred to as the Pol α family (Joyce and Steitz, 1994; Wang et al., 1989). The POL I, POL II, and POL III motifs (overlap with motifs C, A, and B, respectively, of Delarue et al., 1990) have been implicated as "polymerase sites" by mutational studies in T4 (see Reha-Krantz (1994) for a review). Amino acid residues underscored with a dot are suspected to be active site (POL or EXO) residues (Spacciapoli and Nossal, 1994). Some amino acid differences from T2, T6, and RB18 are also marked (see text for gene segments sequenced). The central shading marks the segment of least homology between the T4 and RB69 proteins.
One conspicuous difference between the two proteins involves an internal 55-residue segment, which exhibits only about 33% similarity (8 identical plus 10 chemically similar positions) between the two proteins, i.e. residues 498(T4)/501(RB69) to 552(T4)/555(RB69); this segment may extend further N-terminally to residue 482(T4)/485(RB69). Several shorter clusters with somewhat higher degrees of similarity (40-50%) were also observed, particularly in comparisons between the N-terminal segments of the two proteins. Fig. 2 also shows the few sites of difference we have detected between T4 gp43 and its counterparts from other T4-like phages, which were only partially analyzed for their gene 43 sequences. When compared for the internal segment of high divergence between the T4 and RB69 proteins, only T2, T6, and RB18 showed differences from their T4 counterpart (one amino acid difference in each case; Fig. 2). As shown in Fig. 3, the amino acid similarity profiles for the internal gp43 segment of these phages are reflected in the corresponding nucleotide sequences of the structural gene. T2 and T4 gp43 also exhibited three amino acid differences between them within the first 106 residues of the protein, while T6 and RB70 gp43 were identical to the T4 protein in this N-terminal segment; these observations are based on sequencing only the first 318 base pairs of the genes from T2, T6, and RB70. Although no additional gp43 segments of the T4-related phages were examined, the data collected so far suggest that T4 gp43 is very closely similar to its T2, T6, RB6, RB18, RB19, RB51, and RB70 counterparts but clearly distinct from RB69 gp43. A schematic of the clustered differences between the T4 and RB69 gp43 species is given in Fig. 4 and is consistent with a segmental structure for this class of single-chained DNA polymerases (Reha-Krantz, 1994; Lin et al., 1994), where conserved amino acid clusters may mark segments critical to replication or other vital functions of these enzymes.
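Nucleotide-level comparisons of the kind summarized for Fig. 3 (the number of identical positions, and the split of differences into transitions and transversions) can be sketched as follows. This is an illustrative example only: the function name and the short toy sequences are hypothetical, and the input is assumed to be two ungapped DNA sequences of equal length containing only A, C, G, and T.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_differences(ref, other):
    """Count identical positions, transitions, and transversions.

    A transition swaps purine for purine (A<->G) or pyrimidine for
    pyrimidine (C<->T); any other substitution is a transversion.
    """
    if len(ref) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    identical = transitions = transversions = 0
    for x, y in zip(ref.upper(), other.upper()):
        if x == y:
            identical += 1
        elif {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return identical, transitions, transversions


# Toy 12-nucleotide example (not real gene 43 sequence).
ident, ts, tv = classify_differences("ATGGCATTAGCA", "ATGACTTTCGCG")
print(f"identity = {ident / 12:.0%}, transitions = {ts}, transversions = {tv}")
```

Applied to an aligned 165-nucleotide segment, the same counts give the percent identity and the proportion of differences that are transversions.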
Plasmid-mediated Expression and Complementation between the T4 and RB69 DNA Polymerases-DNA sequencing of RB69 gene 43 and its neighboring cistrons provided detailed information on the restriction map for this region of the phage chromosome and allowed us to design and effect transfer of the intact RB69 polymerase gene to the same T7-promoter expression vectors that we had used previously in studies of T4 gene 43 (Andrake and Karam, 1991). Biological activity of the cloned RB69 gene was measured by its ability both to complement an RB69 gene 43 mutant and to support growth of a T4 gene 43 double-amber mutant in infections of E. coli hosts harboring the plasmid-borne gene. Results are shown in Fig. 5. We found that despite the large differences in primary structure, RB69 DNA polymerase is able to substitute for its T4 counterpart in T4 DNA replication. This functional substitution occurs in the absence of wild-type recombinants (from exchange between the plasmid-borne and phage-delivered genes) as no such T4 recombinants could be detected (frequency < 10⁻⁶). Thus, this system provides a complementation assay for the T4 and RB69 DNA polymerase genes that circumvents the need to use mixed infections of E. coli hosts with the two phages, which result in strong exclusion of T4 genomes by RB69 (Russell and Huskey, 1974). Complementation was also obtained in reciprocal experiments where a cloned T4 gene 43 was used to support growth of infecting RB69 and T4 gene 43 mutants, respectively. T4 DNA polymerase supported both T4 and RB69 DNA replication (Fig. 5), and no wild-type RB69 recombinants could be detected via marker rescue in the infections with RB69; however, under the experimental conditions used, the cloned T4 enzyme showed preference for replicating T4 as compared with RB69 genomes. This type of specificity of the T4 gp43 was particularly apparent in experiments that utilized liquid culture rather than plating assay as the growth condition (Fig. 5). Considerable plasmid-phage recombination (wild-type marker rescue frequencies of 0.1-0.5%) occurred in infections involving homologous T4 and RB69 gene 43 pairs, respectively.
FIG. 3. The gene 43 segment of high divergence between T4 and RB69; comparisons with the corresponding segments of other T4-like phages. This segment encodes the 55-amino acid sequences highlighted by the darkest shading in Fig. 2. Note that, except for RB69, the differences between the other phages and T4 ranged between two and five nucleotides within the 165-base span (97-99% identity between these phages and T4), and most of the differences involve the same positions of the homologous genes. The RB69 sequence differed from the T4 counterpart at 105 of the 165 nucleotide positions (i.e. the two sequences are 37% identical), with about 70% of the differences being related by transversion. A dash means the residue is identical to that in T4.

FIG. 4. Graphical representation of clustered similarities (identical and chemically similar residues) between the primary structures of the T4 and RB69 DNA polymerases. The Gene 43 panel shows partial restriction maps for the two structural genes, and the cross-shaded (^) bars represent polymerase chain reaction-generated DNA fragments from both RB69 and T4 gene 43 that were used in constructing chimeric and internally truncated gp43 species (see Figs. 5 and 6). Restriction site abbreviations were as follows: Bg, BglII; Bm, BamHI; Bx, BstXI; Dr, DraI; Pv, PvuII; Sc, SacI; Xh, XhoI. The BglII site (AgatcT), when introduced into the gene, created a 4-base pair (gatc) insertion; the bracketing A and T nucleotides of the site are part of the wild-type gene 43 sequence. The insertion was removed by mung bean nuclease digestion and religation following BglII treatment. The gp43 similarities panel highlights segmental differences between the T4 and RB69 gp43 molecules. The different shadings represent similarities ranging from 33% (Ⅺ) to completely identical (s). The asterisk marks regions of less than 50% similarity.

Domain Swapping between the DNA Polymerases of T4 and RB69-Complementation in the absence of wild-type phage recombinants facilitated further examination of the modularity of the gene 43 products from the two phages. The exceptionally high degree of divergence between the internal segments of the RB69 and T4 gp43 molecules prompted us to ask if this 55-amino acid span is essential for DNA polymerase function or even present in the protein products of the genes from the two phages (e.g. Perler et al. (1992) and Xu et al. (1993)). We note, especially, the paucity of point mutations in this region of T4 gene 43 (Reha-Krantz, 1994). The nucleotide sequence for this span was confirmed to exist at the mRNA level by the use of reverse transcriptase in primer extension and Sanger sequencing assays (results not shown), and the analyses described below indicated that it is indeed present and necessary for function in the wild-type enzymes. Fig. 6 compares protein sizes for a number of gp43 constructs that were used in the biological experiments summarized in Fig. 5. Clearly, as observed with the in-frame internal delM deletions, removal of the 165-bp sequence for the 55-residue segment of gp43 results in synthesis of gp43 species that are shorter than wild-type protein (Fig. 6) as well as in loss of replication function (Fig. 5).
We next asked if the divergent internal segments of the two gp43 species could substitute for each other. We constructed the chimeras CH-1 and CH-2 (Fig. 6) and assayed their biological activities by the plasmid-phage complementation assay (Fig. 5). One of the two alternate swaps, CH-2, did indeed yield a product that supported phage replication, whereas replacement of the T4 segment with its counterpart from RB69 yielded a chimera (CH-1) that was inactive for phage replication (Fig. 5). Interestingly, the CH-1 chimera exhibited a partial inhibitory effect on replication of an infecting T4 wild-type phage and must therefore actively counter the function of wild-type gp43, or perhaps inhibit its synthesis (Andrake and Karam, 1991). This replicon-specific "dominant lethal" effect was more evident in liquid culture than in plating assays (Fig. 5). The delM constructs from RB69 and T4 also exhibited replicon-specific trans-dominant inhibitory effects, whereas the C-terminal RB69 deletion (delC clone) and T4 frameshift (FS801) mutants used neither inhibited replication of infecting wild-type phage nor complemented infecting gene 43 phage mutants (Fig. 5) and are therefore probably devoid of biologically relevant activity. Preliminary studies indicate that the delM1 mutants retain RNA-binding (repressor) function, but it is still unclear if they bind DNA as well.³ It may also be important to note here that RB69 gp43 migrates more slowly than T4 gp43 in SDS-PAGE and that this property is associated with the 55-residue segment exchanged in CH-1 and CH-2 rather than with the additional five amino acids of the RB69 enzyme. The reason for the difference in electrophoretic behavior is not known.

³ C.-C. Wang and J. D. Karam, unpublished observations.
FIG. 5. Biological activities of T4-RB69 gp43 chimeras. For qualitative spot tests, 5 μl of phage solution (about 10⁵ particles) were deposited on lawns of E. coli BL21(DE3) carrying the desired plasmid, and plates were incubated overnight at 30°C. For quantitative tests, plasmid-bearing cells (at 3 × 10⁸/ml) were infected and analyzed for phage production and DNA synthesis as described by Hughes et al. (1987). The numbers shown below the spots are relative values, comparing growth of the specified phage on the wild-type clone of the homologous gene. A value of 1.0 refers to a phage yield of 300-500/cell with T4 and 150-200 with RB69 infections. Measurements of DNA synthesis ([³H]thymidine incorporation) were also carried out, and the results (data not shown) were consistent with phage yields. The infecting phage strains were: WT, wild-type; T4 43⁻, T4 double-amber mutant 43E4322-B22; RB69 43⁻, RB69 deletion mutant 43sacd. The plasmids in the cells used for spot testing and liquid culture experiments were: WT, wild-type gene 43; delC1, deletion of the C-terminal 99 RB69 gp43 residues; delM1, deletion of the 55-residue internal gp43 segment of either T4 or RB69; and FS801, 4-base deletion affecting reading frame starting after amino acid residue 801 (a histidine) in T4 gp43; this construct yields wild-type T4 gene 43 recombinants when infected with the T4 43⁻ mutant used. CH-1 and CH-2 are reciprocal T4-RB69 gp43 chimeric forms (see Fig. 6 for constructions).
FIG. 6. Electrophoretic analysis of proteins made by plasmid clones of wild-type, partially deleted, and chimeric T4 and RB69 gene 43. The gene 43 constructs diagrammed in the figure were expressed in pSP72-bearing E. coli BL21(DE3) under T7 promoter control, and ³⁵S-labeled proteins were analyzed by SDS-polyacrylamide gel electrophoresis (SDS-PAGE) as described in Andrake and Karam (1991). Positions of the gene 43 products from these clones are marked with a "E" in the autoradiogram panel where the direction of protein migration is presented in left-to-right orientation. The delC1 and FS801 constructs originated from WT clones of the corresponding genes that were digested with BstXI (see Fig. 4 for site location). With T4 FS801, BstXI digestion was followed by mung bean nuclease treatment and religation. With RB69 delC1, the sequence downstream of the BstXI site was removed by digesting the linearized WT plasmid with XhoI (site distal to the cloned gene boundary), and the truncated DNA was treated with mung bean nuclease and then religated. The CH-1, CH-2, and delM1 constructs were made by fusing different combinations of the polymerase chain reaction-generated fragments diagrammed in Fig. 4; the gp43 segment exchanged or deleted in these constructs spans from residue 498(T4)/501(RB69) to residue 552(T4)/555(RB69), i.e. the darkly shaded segment in Fig. 2. All constructs were confirmed by DNA sequencing.

DISCUSSION
The results described here underscore the utility of comparing functionally analogous proteins from organisms that belong to the same phylogeny but that are not very closely related. We can safely conclude that T4 and RB69 share common ancestry, although some segments of their genomes may have unrelated origins. Opportunities for horizontal transfer of genetic elements make questions about origins and diversification of genomes difficult to resolve in any biological system (Campbell and Botstein, 1983), but especially so with virulent phages where the host(s) that contributed to their natural selection cannot be ascertained (Shub, 1994). T4 and RB69 are only slightly similar in serological properties, but they resemble each other more closely in appearance and in physical and genetic properties of their nucleic acids (Russell, 1967; Russell and Huskey, 1974). Similarity by such criteria could relate any two phages that may have acquired functionally analogous genetic cassettes from already highly diverged, or even unrelated, origins, e.g. phages λ and 21 (Campbell and Botstein, 1983). Ultimately, we expect to encounter two levels of segmental divergence between the RB69 and T4 genomes: (i) intragenic divergence of the type we report here for the DNA polymerase genes and (ii) intergenic divergence, whereby entire segments of the two genomes will prove to be dissimilar in sequence (although perhaps still similar in function), as has been observed among the mix of immunity, replication, and assembly gene clusters of lambdoid phages (Susskind and Youderian, 1983). There are several examples of horizontal acquisition of genetic information in the evolution of T-even phages, including the capture of introns (Clyman and Belfort, 1994) and insertion elements (Miller and Jozwik, 1990) by some T-even genomes but not by others that are otherwise very closely related to one another (Figs. 1 and 2) and the sharing of tail fiber antigen determinants between some T-even and other phages like Mu and λ (Henning and Hashemolhosseini, 1994).
The clustered differences between the primary structures of the DNA polymerases of T4 and RB69 (Fig. 2), and other evidence suggesting that T4 gp43 is a modular enzyme (for review, see Reha-Krantz (1994)), lead us to consider that intracistronic evolution of gene 43 may have also occurred by both horizontal and vertical change. In particular, the 55-amino acid internal segment of least similarity between them could have either originated by horizontal transmission of two unrelated DNA segments that converged to a similarity level consistent with modern gp43 function, or diverged from one DNA origin by an unusually permissive acceptance of amino acid substitutions. In this regard, it is interesting to mention that there are no known conditional lethal missense mutations that map in this region of T4 gene 43 (Reha-Krantz, 1994) and that one of the highly conserved positions between the T4 and RB69 gp43 segments (i.e. Ser523 in T4 gp43) is changed to a chemically similar residue in T2 (to Thr) and a dissimilar one in T6 (to Asn) (Fig. 2), two phages that are very closely related to T4 in their gene 43 sequences (Fig. 3). Interestingly, however, we do note that the T4 and RB69 segments are both particularly rich in acidic amino acid content (pI ~4.7) and there may have been a selection for this chemical property during their evolution. We also note that the nucleotide sequence encoding this gp43 segment in RB69 possesses the high AT/GC ratio characteristic of T-even phages but with a different base distribution from T4 (Fig. 3). It will be interesting to find out if this segment is less tolerant of mutational alterations that cause a drift away from acidity.
The T4 and RB69 DNA polymerases can substitute for each other in phage replication, although the T4 enzyme appears to show a strong preference for its own genome (Fig. 5). Qualitatively, the broader specificity of the RB69 enzyme is a remarkable property in view of the presumption that several components in each of the T4 and RB69 DNA replicase assemblies must have co-evolved to maintain mutual recognition. In this context, the segmental distribution of conserved sequences between T4 and RB69 gp43 may prove to be mirrored by similarly clustered patterns of sequence conservation in other protein components of their respective replicases. Gp43 segments at which divergence in primary structure was permitted during evolution may serve any of a number of important roles, such as providing appropriate spacing between interacting activity domains of a folded enzyme or providing determinants that distinguish biological specificities of the two replication systems from each other. We have been testing for such roles in the 55-residue internal gp43 sector of highest divergence between T4 and RB69. In the "domain swap" experiments described here, we observed that reciprocal exchange of this segment between the two phages did not alter replicon specificity of the gp43 recipients (Figs. 5 and 6). In the one case, an RB69 gp43 with a T4-derived insert (CH-2) replicated T4 and RB69 equally well, i.e. exhibited RB69 gp43 characteristics. In the reciprocal exchange (CH-1), the chimeric gp43 was inactive for replication but had the added interesting property of inhibiting function from T4 (but not RB69) wild-type gp43. That is, it exhibited a T4-specific "trans-dominant" phenotype. A similar, but more inhibitory, phenotype was exhibited by a T4 gp43 that was deleted for this internal segment (T4 delM1). The deleted RB69 gp43 counterpart (RB69 delM1) was inhibitory toward both wild-type phages, which is a phenotype that mirrors the broader specificity of wild-type RB69 gp43. 
We conclude that this internal segment is essential for replication function but does not determine replicon specificity of gp43. Also, the trans-dominant phenotypes exhibited by internally deleted gp43 from T4 and RB69 and the inhibitory effect of CH-1 suggest that these proteins retain some of the activities of the wild-type enzymes and may compete with them if present in the same cell. It should be possible to localize the target for inhibition by both genetic and in vitro assays.
T4 DNA polymerase exhibits several amino acid sequence similarities with a number of eukaryotic DNA polymerases (Spicer et al., 1988; Braithwaite and Ito, 1993). Interestingly, the segment of divergence from RB69 gp43 appears as a "gap" in sequence alignments with these other enzymes (Wang et al., 1989) and is positioned to the immediate N-terminal side of a highly basic sequence motif (pI = 10-10.3; designated POL III in Fig. 2) that is conserved among DNA-dependent DNA polymerases from a variety of biological sources (Blanco et al., 1991; overlaps motif B of Delarue et al., 1990). Since the segment from T4 can substitute for the one in RB69, the amino acid sequence contained therein may have no interactions with other parts of the intact gp43 molecule. The segment may be an innocuous linker or spacer whose divergence during evolution was limited only by amino acid changes that altered its length or interfered with other segments of the enzyme. Such explanations can be tested by site-directed and randomized mutagenesis. It is still possible that divergence of the gp43 internal segment is related to the evolution of different biological specificities in the two gp43 species examined here. Considering the incompatibility of T4 and RB69 in coinfected E. coli hosts (Russell and Huskey, 1974), the two phages must have experienced much of their natural selection in separate cellular environments, and their gene 43 products may have evolved different signatures that are not functionally distinguishable in the bacterial hosts that, by experimental design, were used for their initial detection.