Distribution, Sequence Homology, and Homing of Group I Introns among T-even-like Bacteriophages

Self-splicing group I introns are being found in an increasing number of bacteriophages. Most introns contain an open reading frame coding for a homing endonuclease that confers mobility to both the intron and the homing endonuclease gene (HEG). The frequent occurrence of intron/HEG has raised the question of whether group I introns are spread via horizontal transfer between phage populations. We have determined complete sequences for the known group I introns among T-even-like bacteriophages together with sequences of the intron-containing genes td, nrdB, and nrdD from phages with and without introns. A previously uncharacterized phage isolate, U5, is shown to contain all three introns, the only phage besides T4 found with a “full set” of these introns. Sequence analysis of td and nrdB genes from intron-containing and intronless phages provides evidence that recent horizontal transmission of introns has occurred among the phages. The fact that several of the HEGs have suffered deletions rendering them non-functional implies that the homing endonucleases are of no selective advantage to the phage and are rapidly degenerating and probably dependent upon frequent horizontal transmissions for maintenance within the phage populations. Several of the introns can home to closely related intronless phages during mixed infections. However, the efficiency of homing varies and is dependent on homology in regions flanking the intron insertion site. The occurrence of optional genes flanking the respective intron-containing gene can strongly affect the efficiency of homing. These findings give further insight into the mechanisms of propagation and evolution of group I introns among the T-even-like bacteriophages.

The three self-splicing group I introns in bacteriophage T4 were the first prokaryotic introns to be found. The T4 introns are situated in the genes coding for the small subunit of the aerobic ribonucleotide reductase (nrdB) (1,2), the anaerobic ribonucleotide reductase (nrdD, formerly called sunY) (1,3), and thymidylate synthase (td) (4). The presence of introns in a bacteriophage was unexpected, because phages generally are under strong selective pressure to keep a compact genome to maintain rapid replication and to fit the genome inside the phage capsid. The subsequent finding that the T4 introns are mobile genetic elements, mediated by the presence of a homing endonuclease gene (HEG) within each intron, explains how they were inserted into the phage genome (5,6), but the low number of introns among T-even-like phages has hitherto prevented a systematic study of how the introns have propagated within this phage population. Phages of both Gram-negative and Gram-positive bacteria have subsequently been shown to contain group I introns, all belonging to the same subgroup (IA2) of group I introns as the T-even introns, indicating a common ancestry of phage introns. Most phage introns also contain homing endonuclease genes (7-16).
Homing endonucleases are a diverse group of proteins that generate double-strand cuts in DNA at, or near, a generally very long recognition sequence spanning up to 40 bp (for reviews see Refs. 17-19). HEGs are often associated with self-splicing introns or inteins and can confer mobility both to themselves and to their surrounding splicing element via the process of homing. Intron homing occurs when a HEG-containing intron encounters a cognate chromosome copy without an intron (i.e. for phage introns during mixed infection with a cognate or closely related phage without intron). The recognition sites of intron-associated homing endonucleases generally span their respective intron insertion site, and they will therefore only cleave intronless alleles. Intron-containing alleles are immune to cleavage, because the intron interrupts the recognition site. After cleavage the double-strand break is repaired via recombination-dependent replication repair, using an intact, intron-containing chromosome copy as template (20). This results in insertion of the intron and the HEG into the repaired genome. Efficient homing is dependent upon sequence homology between donor and recipient alleles in the sequences flanking the insertion site (5, 21-23). If the requirement of homology in the homing site and surrounding regions is met, the efficiency of homing can be close to 100%.
Of the three HEGs in the T4 introns the td and nrdD HEGs (denoted I-TevI and I-TevII, respectively) belong to the GIY-YIG family of endonucleases. They have been shown to specifically cleave intronless alleles of td and nrdD, respectively (6,24,25). The T4 nrdB intron contains a non-functional HEG of the H-N-H family, named I-TevIII, which has suffered a large deletion and therefore cannot promote homing (5,26). Previous studies have shown that T4 is exceptional among its closer relatives in having several introns. Most of the known T-even-like phages completely lack introns, whereas a small number only contain the td intron (27,28). The only phage besides T4 that has been shown to have more than one intron is RB3, which has a td intron and an nrdB intron with an intact, functional HEG (26). Given the high efficiency of homing of the two functional T4 introns (6,24,25) and the propensity for horizontal gene transfer that appears to exist between closely related phages (29,30) and perhaps also globally between different phages (31), it is striking how few phages actually contain introns.
* This work was supported by the Swedish Science Research Council.
Here we show that the previously uncharacterized T-even-like phage isolate, U5, contains all three known T-even introns. This gives a total of 11 identified introns (6 td, 3 nrdB, and 2 nrdD) within the T-even-like phages and enables a systematic study of how group I introns have propagated within this phage population. We have used a combination of sequence comparisons of the introns and the intron-containing genes from a number of phages belonging to the T-even-like family as well as mixed infections between phages with and without introns/HEGs to determine the history and potential of intron transfer among these phages. Our results provide evidence for a recent horizontal spread of introns mediated by the cognate HEGs among phages with high sequence similarity at the intron insertion sites. We also show that homing can overcome the effect of genetic exclusion that occurs between some phages and that the presence of optional genes flanking the intron-containing genes can strongly affect the efficiency of homing.

Bacteria and Phages-Strains of the original T-evens, T2L, T2H, T4D, and T6, were kindly provided by Elisabeth Haggård, Dept. of Genetics, Stockholm University. Strains RB2, RB3, RB27, RB32, RB49, RB69, LZ1, LZ7, U5, and TuIa were kindly provided by Karin Carlsson, Dept. of Cell and Molecular Biology, Uppsala University. RB14, RB15, RB23, RB26, RB51, and RB61 are from our stocks, originally a gift from Sean Eddy, and phage LZ2 was kindly provided for this study by Sean Eddy, Dept. of Genetics, Washington University of Medicine. The intron-deleted T4 strain T4ΔIVS was kindly provided by David Shub, Dept. of Biological Sciences and Center for Molecular Genetics, University at Albany, State University of New York. Escherichia coli B 0 and CR63 were used as host strains for all phage work and were from our stocks.
PCR Amplification and Sequencing of nrdB, nrdD, and td Genes from T-even-like Phages-Individual plaques were picked and suspended in 40 μl of water per reaction. 20 μl of this suspension was used as template for PCR. Amplifications were performed using 2.5 units of proofreading Pfu DNA polymerase (Stratagene) for 30 cycles according to the manufacturer's recommendations. Primers for PCR amplification of td, nrdB, and nrdD genes were based on T4 sequences. T4nrdA1 5′-GCAGCTCCAAATACGTTTGTG, T4nrdA2 5′-CCAATATTGGCATAACATAGACC, T4nrdA3 5′-GAAATAGAAACTCCTAAAGCCG (complement: C-terminal end of T4 nrdA gene for nrdB amplification), LS1043B 5′-CCATTGTCGTCTAATTCTAGC (complement: T4 nrdB-denA intergenic region for nrdB amplification); T4frd1 5′-AGATGAAGTAACAACCCTTACG (complement: C-terminal end of T4 frd gene for td amplification), T4nrdA4 5′-TACTGATGTTCCTTCAGCTGC (complement: N-terminal end of T4 nrdA for td amplification); PAT22 5′-CATATGACAATTGAAAAAG (complement: start of T4 nrdD gene for nrdD amplification), PAT23 5′-CTGTCATAATTCATTGGTGCT (complement: end of T4 nrdD gene for nrdD amplification), T4nrdG3 5′-CCAAGTACTACGATTATAACAACC (complement: middle of T4 nrdG gene for nrdD amplification). Primers specific for TuIa were: TuIatd1 5′-TGATAGAACATATGACAAGGAGG, TuIatd2 5′-TGCCTTGACGCATATTGTAGCG, TuIanrdB1 5′-TGTGTATACGAACGAGAGTG, and TuIanrdB2 5′-CAACGCCTTAGAAGCTATTCG. PCR products were analyzed on agarose gels and purified using a Qiagen QIAquick PCR purification kit to remove buffers, nucleotides, and primers prior to sequencing. PCR fragments were sequenced on both strands with primers based on known T4 sequences using a DYEnamic ET terminator cycle sequencing kit or DYEnamic ET terminator cycle sequencing for MegaBACE from Amersham Biosciences. Samples were analyzed on an ABI Prism 377 sequencing machine at the KISeq facility at the Center for Genomics and Bioinformatics at Karolinska Institutet.
The sequences of the td, nrdB, and nrdD genes obtained in this study have been reported to GenBank™ and given accession numbers AY262125-AY262157. During the progress of this project the complete genome sequences of phages RB49 and RB69 and the Aeromonas phage Aeh1 were determined as part of The T4-like Phages Genome Project (Drs. J. D. Karam, H. M. Krisch, and J. M. Nolan). Sequences from these phages used in the following analyses were obtained from the GenBank™ genome data (accession numbers NC_005066, NC_004928, and AY266303, respectively).
In Vitro GTP Labeling of Introns-Group I splicing was studied in vitro by observing the autocatalytic addition of a guanosine nucleotide ([α-32P]GTP) to the 5′-end of the intron (1). Total RNA was prepared from infected cells by the RNeasy procedure on mini-spin columns (Qiagen). Approximately 15 μg of total RNA was used in the GTP-labeling reaction under self-splicing conditions as described previously (1). Samples were analyzed on a 5% polyacrylamide gel containing 7 M urea. Labeled bands were visualized by phosphorimaging analysis (FujiFilm FLA-3000). For the additional screen for potential uncharacterized group I introns in U5, genomic DNA from U5 was cleaved with NdeI and ligated to a pBluescript (KS+) derivative. PCR was performed using one vector-specific primer together with either a 5′-3′-directed or a 3′-5′-directed degenerate primer based on the conserved catalytic center of group I introns. Degenerate primer sequences were as follows: intron primer 5′-3′, 5′-GATATAGTCTGNNCTGCATRG; intron primer 3′-5′, 5′-CYATGCAGNNCAGACTATATC. Generated PCR products were purified and sequenced as described.
Phylogenetic Analysis-Sequence alignments were done with ClustalX v. 1.8.1, and phylogenetic analysis was carried out using PAUP v. 4.0b10. Randomization of intron presence on the gene trees and calculation of frequency distributions for the number of intron losses were performed with MacClade v. 4.06 (Sinauer Associates, Sunderland, MA). The p values given in the text were calculated for the observed minimum number of intron losses for td (5) and nrdB (4) from the frequency distributions.
Multiplex-PCR Screen for Intron Homing in Mixed Infections-Equal amounts of two phage strains (2.5 × 10⁹ plaque-forming units) were mixed with 5 × 10⁸ cells from a mid log-phase culture giving a multiplicity of infection of 5 of each phage. Incubation was continued for 90 min at 37°C with vigorous shaking. Cells were lysed with chloroform, and progeny phage were plated on new bacteria. Primers for intron screening of progeny plaques were designed to have similar annealing temperatures and to yield gene-specific fragments for use with all phage combinations unless otherwise stated. For each cross 48 plaques were picked and each resuspended in 20 μl of water. A 5-μl aliquot of the suspension was used as template in a 25-μl PCR reaction with 1 unit of Taq polymerase (Invitrogen) according to the manufacturer's recommendations. Primer sequences for intron presence were as follows: td IVS A, CAATCCCGTGCTAAATTGTAGGAC; td IVS B, CGTTGCATCAGCAATATTGTATCC; td IVS B (TuIa), AGACGCATCAGCAATATTATATCC; nrdB IVS A, CTCCTTTAATTGCTGGAAAGTCC; nrdB IVS B, GGATGGGTCCCGTTCTTAACC; nrdD IVS A, GTCAAGCTCATGTAAAATCTGCC; nrdD IVS B (T4), CTTCATCTTTAAGGGCCGATTC; and nrdD IVS B (U5), CTTCATGTTTAAGGGCCGATTC. When T2 and RB49 were used as recipient phages we also screened for a unique marker gene in each recipient phage (located in a different part of the chromosome compared with the introns) to have a neutral marker to relate intron frequencies to. For T2 gene 42.3 was used, and for RB49 the wac gene was used. Primers used for screening for these genes were: T2 42.3A, 5′-GTTGTTAACGCAGAAAAAGCG; T2 42.3B, 5′-CAAATAACTTTAAAGCATCTTCAG; RB49 wacA, 5′-CATGTTGACGGTGTACCAAGC; RB49 wacB, 5′-GTTAACCGCCGTTACAACCTC. For flanking screens of RB69 crosses primers RB69 nrdA 1, 5′-CCAGTCGATAGGTAATTCGCCTC, and RB69 frd.1 1, 5′-CCAGAGCATCAAAGATTCAGAAG, were used.
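The two degenerate intron primers listed above appear to be reverse complements of each other once the IUPAC ambiguity codes (R, Y, N) are taken into account. The short Python sketch below checks this; the helper name `reverse_complement` and the lookup table are illustrative assumptions and are not part of the published protocol.

```python
# Illustrative helper (an assumption, not from the original methods):
# reverse-complement a primer while honoring IUPAC ambiguity codes.
IUPAC_COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R",            # purine <-> pyrimidine
    "S": "S", "W": "W",            # strong and weak pairs are self-complementary
    "K": "M", "M": "K",
    "B": "V", "V": "B", "D": "H", "H": "D",
    "N": "N",                      # any base
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(IUPAC_COMPLEMENT[base] for base in reversed(seq.upper()))


# Degenerate primer sequences as given in the text.
forward_primer = "GATATAGTCTGNNCTGCATRG"   # intron primer 5'-3'
reverse_primer = "CYATGCAGNNCAGACTATATC"   # intron primer 3'-5'

print(reverse_complement(forward_primer) == reverse_primer)
```

The check is purely a string operation on the printed sequences; it says nothing about annealing behavior.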

PCR Amplification and Sequencing of nrdB, nrdD, and td Genes from T-even-like Phages-A total of 21 T-even-like phage strains, including the original T-even phages T2, T4, and T6, were used for amplification of PCR fragments of the nrdB, nrdD, and td genes (Table I). Strains were chosen to include all previously known intron-containing T-even-like phages together with a number of T-even-like phages lacking introns (27) and one previously uncharacterized strain (U5 from Uppsala, Sweden). Primers for amplification of td, nrdB, and nrdD were designed to bind in flanking genes to generate fragments of full-length genes for sequencing (see Fig. 1). Direct sequencing of the PCR products from 13 of the phage strains (boldface in Table I) confirmed the presence of all introns seen in the PCR screen. All sequences were readily alignable to the T4 sequences, although larger rearrangements were found in intergenic regions of some phages (data not shown).
The U5 Genome Contains Three Introns-The immediate striking result from the PCR screen was the presence of an intron in each of the td, nrdB, and nrdD genes in the previously uncharacterized phage U5. This is the first phage besides T4 found to contain all three introns. U5 td and nrdD intron sequences are highly similar to the corresponding T4 introns, and the U5 nrdB intron sequence is highly similar to the corresponding RB3 intron, i.e. without the T4 deletion in the endonuclease gene. However, the U5 HEGs I-TevII (in nrdD) and I-TevIII (in nrdB) each contain a 2-bp deletion early in their coding regions, prematurely terminating the I-TevII reading frame after 39 amino acids and the I-TevIII reading frame after 65 amino acids. Furthermore, the sequence shows that the U5 I-TevIII also has a 6-bp deletion exactly spanning the late promoter known to direct the main expression of the I-TevIII homing endonuclease gene (32). We therefore propose that the U5 I-TevII and I-TevIII homing endonuclease genes are non-functional and in the process of degeneration.
To find out if U5 contains any novel group I introns at other sites in the genome we analyzed total RNA isolated from U5-infected E. coli cells with the GTP-labeling assay (1). As can be seen in Fig. 2, we found bands of expected sizes for the U5 introns and two weaker bands of lower molecular mass in the U5 lane without counterparts in the T4 lane. These may be products from further processing of the known introns after complete splicing or may represent labeling of uncharacterized novel introns. In a PCR screen of the U5 genome with degenerate primers based on the highly conserved group I intron catalytic core sequences we picked up the td, nrdB, and nrdD introns but failed to detect any novel introns (data not shown). Although this does not completely rule out the possibility of other group I introns in U5, it makes it less likely.
Phylogenetic Analyses Support Horizontal Spread of Introns among the T-even-like Phages-To obtain further understanding of the history of intron gain/loss among the T-even-like phages we performed independent phylogenetic analyses of the three intron-containing genes and compared them to the distribution of introns. Such analyses have been used earlier to differentiate between horizontal and vertical spread of mobile introns (33-36). Phylogenies were constructed for each of the td, nrdB, and nrdD genes using alignments of exon sequences only. The td and nrdB sequences generate well supported trees (Fig. 3), whereas the nrdD sequences are too similar to yield a reliable tree. All intron-containing genes (marked with an "I" in Fig. 3), except the RB3 and LZ2 td genes, are more closely related to at least one intronless gene than to another intron-containing gene and separated by branches with 100% bootstraps. The scattered distribution of genes with introns indicates either gain of introns via horizontal transfer, or multiple independent intron losses. If no horizontal intron transfers have occurred, the hypothesis predicts that the ancestral phage had introns that were then lost along the lineages leading to taxa with intronless genes. Once an intron is lost by a phage all descendants of that phage would also lack the intron, and, therefore, genes without introns would be clustered on the phylogenies and the number of intron losses would be minimized. If, on the other hand, horizontal transfers have occurred, the intron occurrences would appear scattered on the phylogenetic trees. The minimum number of losses inferred from the trees in Fig. 3 is 5 for td and 4 for nrdB and not significantly less than if intron presence/absence is randomized on the trees (p = 0.72 and p = 0.18, respectively; n = 100 randomizations).
Thus, the scattered distribution of introns together with the very efficient homing properties of the HEG-containing td and nrdD introns during T4 infection (Refs. 5, 6, 20, 21, and this study, see below) suggests that the T-even-like introns have spread among the phages via horizontal transfer.
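The logic of the randomization test can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the MacClade procedure used in the study: the tree is a hypothetical eight-tip topology with placeholder tip names, the ancestor is assumed to carry the intron, only losses are allowed, and the p value is taken as the fraction of randomizations needing no more losses than observed.

```python
import random


def tips(tree):
    """List tip names of a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in tips(child)]


def _scan(tree, has_intron):
    """Return (clade_has_any_intron, minimum_losses) for a subtree."""
    if isinstance(tree, str):
        present = has_intron[tree]
        return present, (0 if present else 1)
    results = [_scan(child, has_intron) for child in tree]
    if not any(present for present, _ in results):
        return False, 1  # a fully intronless clade needs one loss on its stem
    return True, sum(losses for _, losses in results)


def min_losses(tree, has_intron):
    """Minimum number of losses, assuming an intron-containing ancestor and no gains."""
    return _scan(tree, has_intron)[1]


def randomization_p_value(tree, has_intron, n_randomizations=100, seed=0):
    """Fraction of label shuffles that need no more losses than the observed labels."""
    rng = random.Random(seed)
    names = tips(tree)
    states = [has_intron[name] for name in names]
    observed = min_losses(tree, has_intron)
    hits = 0
    for _ in range(n_randomizations):
        rng.shuffle(states)
        if min_losses(tree, dict(zip(names, states))) <= observed:
            hits += 1
    return hits / n_randomizations


# Hypothetical tree and intron distributions (placeholders, not the phages in Fig. 3).
toy_tree = ((("A", "B"), ("C", "D")), (("E", "F"), ("G", "H")))
scattered = {name: name in "ACEG" for name in tips(toy_tree)}
clustered = {name: name in "ABCD" for name in tips(toy_tree)}

print(min_losses(toy_tree, scattered), randomization_p_value(toy_tree, scattered))
print(min_losses(toy_tree, clustered), randomization_p_value(toy_tree, clustered))
```

In this toy case a scattered distribution needs as many losses as any random arrangement, so it cannot be distinguished from randomness, whereas a clustered distribution needs far fewer losses than most shuffles.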
Lack of Sequence Variation in Intron HEGs Suggests Recent Transfer of td Introns among American Phages-If the introns and the genes they reside in have had different phylogenetic histories, we would expect trees based on the intron sequences to differ from the gene trees in Fig. 3. It is, however, not possible to achieve useful phylogenetic intron trees in the current study, because there are only three identified nrdB introns and the six td introns are too similar (the td introns of the American isolates T4, T6, RB3, and LZ2 are nearly identical, see below) to generate a phylogenetic tree with more than three branches. The lack of sequence variation among the American td introns is clearly seen when the differences in the intron-containing td genes are plotted as the number of varying sites in a 20-nucleotide sliding window (Fig. 4). The differences between T4, T6, RB3, and LZ2 are almost exclusively located in the td exons. There is only one nucleotide difference throughout the whole td intron, including the HEG, among these phages. In contrast, U5 and especially TuIa show substantially more variation both in non-coding intron sequences and in the HEG, compared with T4. The same low degree of variation as for the American td introns is seen in the nrdB and nrdD intron sequences (see Table II).
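A sliding-window count of varying sites of the kind plotted in Fig. 4 can be computed as in the following Python sketch. The function name, the treatment of gaps as ordinary characters, and the toy alignment are assumptions made for illustration; the original analysis may have handled these details differently.

```python
def variable_sites_per_window(aligned_seqs, window=20):
    """Count alignment columns with more than one character state in each window.

    aligned_seqs: equal-length strings from a multiple alignment.
    Gap characters are treated like any other state (an assumption of this sketch).
    Returns one count per window start position.
    """
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    # True where the column contains at least two different characters.
    is_variable = [len(set(column)) > 1 for column in zip(*aligned_seqs)]
    return [sum(is_variable[start:start + window])
            for start in range(length - window + 1)]


# Toy alignment (made-up sequences, not phage data) with a short window.
toy_alignment = [
    "AAAAAAAAAA",
    "AAAATAAAAA",
]
print(variable_sites_per_window(toy_alignment, window=5))
```

Flat stretches of zeros in such a profile correspond to regions with no differences between the compared sequences, which is the pattern described for the American td introns.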
Because the catalytic parts of group I introns are under high selective pressure to keep their splicing capacity, they are expected to have a low mutation rate, explaining the low number of differences between cognate introns of different phages. The coding regions of the HEGs however can be presumed to accumulate at least synonymous changes over time. Such variations are clearly present in the td exons of all the phages and to a lesser extent in nrdB and nrdD exons. The most plausible explanation for the lack of variation in the HEGs of the American td introns is that there has been a recent invasion and lateral spread of the td intron among the American phages.
Taken together with the phylogenetic data, this further strengthens the view that the td introns have been spread horizontally among T-even-like phages.
Variations at the Intron Insertion Sites-A prerequisite for intron homing is that the recipient phage contains a homing site and enough sequence similarity in flanking regions to promote efficient homologous recombination. One possible explanation for the limited spread of introns among the T-even-like phages would be if the phages without introns lacked homing sites. Fig. 5 shows the sequence variations found around the intron insertion sites (IS) for the td, nrdB, and nrdD genes from intron-containing and intronless phages. Apart from the intron-associated variations in the nrdD gene, there is very little variation in the region around the IS in both the nrdB and the nrdD genes among the T-even-like phages. In contrast, more variation is found around the IS in the td genes even between closely related phages and even among the intron-containing phages. In agreement with the co-conversion of short flanking sequences that accompanies intron homing (21,24,37) the American intron-containing phages are very similar around the IS. However, TuIa differs markedly from the other five intron-containing phages in the td region between the CS and the IS and is instead more similar to the intronless phage RB69 in concordance with the phylogenetic data. The more distantly related phages RB49 and Aeh1 (classified as pseudo-T-even and schizo-T-even phages, respectively (30,38,39)) show extensive sequence divergence around the IS in all three genes, and homing of the T-even introns would most likely not be possible to these phages.
FIG. 1. Relative organization of td, nrdB, and nrdD genes.
Although the I-TevI endonuclease can tolerate quite a lot of sequence variation within its recognition site (6, 40-42) the differences in this region in some of the intronless phages may limit the efficiency of homing of the td intron. In contrast, the low variability among T-even-like phages in the I-TevIII and especially the I-TevII recognition sites together with the fact that homing of the T4 nrdD intron to T2 occurs quite efficiently (5,6) indicates that it is not a lack of homing sites that has limited the spread of these introns among the T-even-like phages.
Mixed Infections between Intron-containing and Intron-less Phages-The efficiency of intron homing within the T-even-like group of phages and between more distantly related T4-type phages was studied in pairwise mixed infections at high multiplicities between intron-containing (donor) phages and intron-less (recipient) phages. We used four different recipient phages: the T-even phages T4ΔIVS (which lacks all three introns) and T2 and the more distantly related T4-type phages RB49 and RB69. The expected frequency of a non-homing gene in the progeny of a mixed infection is 50%, whereas genes capable of homing will increase their occurrence approaching 100% for efficient homing. However, the frequency of genetic markers in the progeny can be strongly affected by the still largely uncharacterized processes of general and localized exclusion (43). To discriminate between exclusion and homing we mapped recombination sites around intron insertion sites, and in mixed infections with T2 or RB49 as recipient phages we also screened for intron-independent markers (Figs. 6B and 7B, inverted gray bars).
With T4ΔIVS as recipient phage, the T4 td and nrdD introns, the U5 td and nrdB introns, and the T6 and TuIa td introns were found at frequencies indicative of homing (Fig. 6A). Despite the influence of exclusion when T2 was the recipient phage, the T2 results agree with the T4 results with respect to which introns are found together with the T2 marker indicating homing (Fig. 6B). These results agree well with our sequencing data as to which introns have functional/non-functional HEGs with three exceptions. The first one is the U5 nrdB intron, which was found at high frequency in the mixed infections despite the fact that the U5 I-TevIII endonuclease is expected to be non-functional according to the DNA sequence. All progeny that had acquired the U5 nrdB intron also contained the U5 td intron and displayed U5 genetic markers throughout the region between these genes, indicating that both introns have been transferred in a single recombination event. It is unlikely that the homing capacity of the td I-TevI can mobilize the nrdB intron 4 kb downstream of td. The region between nrdA and nrdB in U5 contains a previously unknown open reading frame of 544 amino acids without homology to any sequence currently in GenBank™ that replaces the putative freestanding homing endonuclease gene mobE in T4 (see "Discussion"). Whether this potential U5 gene can promote the transfer of the U5 nrdB intron remains to be investigated (work in progress). The second exception is RB3 in which td and nrdB introns were only found at low frequency or not at all in crosses with T4ΔIVS or T2. The RB3 td intron is identical to the T4 and T6 introns and should therefore be homing-competent. The RB3 nrdB I-TevIII endonuclease was shown by Eddy and Gold (26) to be able to cleave an intron-less target sequence, but they could not detect any homing. They also confirmed an earlier report that the td-nrdB region of RB3 is strongly excluded by T4 in mixed infections (26,43).
When we mapped the region between frd and nrdA in intron-containing progeny from the RB3/T4ΔIVS cross we found that most (8/10) phages had a recombination point within 550 bp downstream of the td intron insertion point and 4/10 had a recombination point within the upstream td exon, clearly indicating that homing has taken place (Fig. 6C). A comparison screen with progeny from the T6/T4ΔIVS cross (high frequency of td intron presence) showed few isolates (2/10) with recombination sites close to the intron, whereas the rest (8/10) had T6 sequences in the mapped region. Our sequence results show that both RB3 and T6 contain a previously unknown open reading frame between the frd and td genes (according to T4 nomenclature we call it td.1). To analyze what effect the presence of td.1 has on td homing, we recrossed some of the chimeric progeny isolates from the first RB3 and T6 crosses with T4ΔIVS. Progeny from all recrosses showed increased frequencies of td intron presence compared with the original crosses, indicating less T4 exclusion of RB3 alleles and a more efficient homing. However, isolates containing td.1 showed lower homing efficiencies of the td intron than isolates without td.1 (Fig. 6, right part), indicating that the presence of larger differences between the flanking sequences of donor and recipient phages may be involved in the exclusion process or interfere substantially with homing. The third exception is the TuIa td intron, which differs markedly in sequence from the other td introns and was found at low frequency in the T2 cross but at high frequency when crossed with T4ΔIVS (Fig. 6, A and B). No recipient sequences were found close to the intron (between the frd and nrdA genes) in the td-containing progeny. We can therefore not know whether the td intron has been transferred through homing or as part of a larger rearrangement.
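As a rough guide to how far an observed intron frequency among screened plaques departs from the 50% expected for a non-homing gene, an exact binomial tail probability can be computed as in the Python sketch below. This calculation is an illustration only and was not part of the reported analysis: the plaque count with intron is a hypothetical value, and the model assumes independent plaques and no exclusion, which the text notes can strongly shift marker frequencies on its own.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


plaques_screened = 48       # plaques picked per cross, as stated in the methods
plaques_with_intron = 40    # hypothetical count, for illustration only

tail = binomial_upper_tail(plaques_with_intron, plaques_screened)
print(f"P(at least {plaques_with_intron} of {plaques_screened} at 50%) = {tail:.3g}")
```

A small tail probability only shows that the frequency exceeds 50%; distinguishing homing from exclusion still requires the recombination mapping and neutral markers described above.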
The sequence of the TuIa td gene also differs substantially from the T4 and T2 sequences, and even if the I-TevI endonuclease could cleave the intronless alleles recombination may be inhibited by the low degree of sequence similarity. This would result in the survival only of progeny with larger recombination regions, including the td intron and of original TuIa phages.
To determine the ability of the T-even introns to home to more distantly related T4-type phages, we also performed crosses with RB69 and RB49 as recipient phages. Based on sequence homology, RB69 and TuIa form a slightly more distant group of the T-even-like phages (see Fig. 3), whereas RB49 is even more distant in relation, belonging to the pseudo-T-evens (38). In the RB49 crosses there was no co-localization of any of the introns with the RB49 marker (Fig. 7A). RB49 appears to exclude TuIa, and conversely T4 and U5 appear to efficiently exclude RB49 (gray bars in Fig. 7A). In the RB69 crosses, on the other hand, the TuIa td intron was found at high frequency and screening of sequences flanking the td insertion site confirms the presence of recombination points close to the intron indicative of homing (Fig. 7, B and C). In contrast, when T4 or RB3 was used as donor, we found very few intron-containing progeny, of which most had all introns and also displayed donor plaque phenotypes and therefore most likely were original donor phages. However, in one of our four T4/RB69 crosses we found a small number of progeny that had inherited only the nrdD intron. Screening of flanking restriction markers showed recombination points within the nrdD exons in all progeny examined (4/4), clearly indicating homing (Fig. 7D). The nrdD sequences flanking the intron insertion site are very similar between T4 and RB69, whereas td IS sequences differ more. These differences in the I-TevI recognition site together with the lower degree of similarity in flanking regions may decrease the otherwise highly efficient homing of the td IVS in RB69. RB69 has been reported to strongly exclude the original T-even phages and most of the RB-strains (43), explaining the generally low occurrence of T4 and RB3 sequences in the crosses. Our data show that the homing properties of the nrdD I-TevII can overcome part of this exclusion.
DISCUSSION
Bacteriophage T4 appears to be an optimal target for homing endonucleases with its highly efficient recombination machinery and multiple copies of the genome present during a large part of the lifecycle. Its response to superinfection (a second phage infecting an already infected cell) is lysis inhibition, a delay in lysis during which replication continues, and an increasing number of new phage are produced (for review see Ref. 44), a perfect environment for transfer of homing endonucleases from one phage to another. In agreement with this the T4 genome contains, apart from the three intron-encoded HEGs, at least 12 genes with homing endonuclease motifs, several of which have been shown to possess endonuclease activity (45-50). Many of these free-standing, putative HEGs are not present in other T-even-like phages (49,51), perhaps for the same reasons as the introns are absent.
Our results indicate that the introns in the T-even-like phages each share a recent common ancestor that has spread horizontally among the phages most likely via mixed infections. The nearly total lack of sequence variation between the td introns of the American phage isolates clearly suggests recent spread of this intron among these phages, further strengthening the view that horizontal transmission of introns can be very efficient among phages in Nature. The high degree of similarity between the nrdB intron of U5, T4, and RB3 and between the nrdD intron of U5 and T4 indicates recent exchange of these introns between the three phages despite the geographic separation of their isolation sites. The differing td intron allele of U5 and the lack of an nrdD intron in RB3 suggest that the introns have been acquired in separate transfers and not as multiple homing events during one mixed infection.
How recent are the introns in the T-even-like phages? Even if the introns in most of the phages are highly similar, the presence of distinctly different versions of the td intron in geographically separated phage populations shows that these introns can persist and evolve among phages over time. Furthermore, the highly T-even specific nature of the introns with respect to HEG promoters and codon usage (32,52,53) indicates that the introns and phages have a history of coexistence. We therefore favor the idea that the introns have resided among the T-even-like phages for a long time but that their distribution goes through major fluctuations. A similar conclusion was implicit from studies of the intron within saccharomycete yeast populations (34). Goddard and Burt (34) proposed that recurring horizontal transfers of the intron and cognate HEG into new host populations might be the only way for the HEG to stay viable over larger tracts of evolutionary time. Once at fixation in a population, selection for endonuclease function is relaxed, because there are no additional homing sites to transfer to. The HEG will then degenerate and eventually be lost, unless it can spread horizontally to a HEG-less population or evolve to home to a different recognition sequence in a second gene and start a new cycle of spreading into HEG-less individuals of the same population. Our results on the transmission of group I introns within the T-even-like bacteriophages are consistent with the steps in the cycle proposed by Goddard and Burt (outlined in Fig. 8) as follows: (i) The initial stage with an intronless gene containing a recognition site for cleavage by the respective intron homing endonuclease is the most abundant among T-even-like bacteriophages.
There is little sequence variation around the intron insertion sites for the nrdB and nrdD introns of the T-even-like phages sequenced here and in many of the phages also for the td intron, indicating that functional homing endonuclease recognition sites most likely are present in many intronless phages. Our mixed infections show that intron homing is possible to sites with enough sequence similarity but with varying efficiency mainly due to the process of local marker exclusion. The phylogenetic data also support that a rapid spread of the td intron has occurred between closely related American T-even-like phages. (ii) The second step in the cycle, degeneration of the HEG, is seen in three of the T-even-like introns, U5 nrdB and nrdD and T4 nrdB. All three HEGs have suffered frameshift deletions that render the proteins non-functional. The low number of additional nucleotide differences between functional and non-functional HEGs suggests that these deletions have occurred recently and shortly after intron acquisition by the phage. This frequently observed loss of HEG function indicates that the HEGs present no selective advantage to the phage and that the dependence upon frequent transfer of the introns to intronless hosts is vital for their survival in phages. Whether the loss of homing function is particularly rapid for certain HEGs in some phages (i.e. I-TevIII in T4 and U5 and I-TevII in U5) due to negative effects on phage viability remains to be investigated. No T-even-like phages have been found with an intron that totally lacks an endonuclease gene, but such introns have been reported in other phages (9, 13, 15). (iii) No signs of a precise loss of an intron, the step completing the cycle, are found for any of the intronless phages in this study. For example, they do not show the signs of intron-proximal exon sequences that are vital for splicing (54) (see Fig. 5). 
Precise loss of introns is likely to be such a rare event that the number of phage sequences collected so far is too small to detect it. Precise intron loss may also occur by mechanisms that change the sequences surrounding the intron insertion site as well and would then not be detected here.
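The cycle described in steps (i) to (iii) can be written down as a small state model. The sketch below is a simplified illustration, not a quantitative model: the `Stage` names and the `next_stage` helper are hypothetical, and the only phage assignments included are the three HEGs stated in the text to carry frameshift deletions.

```python
from enum import Enum


class Stage(Enum):
    """Stages of the proposed intron/HEG cycle (labels are illustrative)."""
    EMPTY_SITE = "intronless gene with endonuclease recognition site"
    FUNCTIONAL_HEG = "intron with functional HEG (acquired by homing)"
    DEGENERATE_HEG = "intron with non-functional HEG"


# Transitions: homing, HEG degeneration, then precise intron loss,
# which restores the empty site and closes the cycle.
TRANSITIONS = {
    Stage.EMPTY_SITE: Stage.FUNCTIONAL_HEG,
    Stage.FUNCTIONAL_HEG: Stage.DEGENERATE_HEG,
    Stage.DEGENERATE_HEG: Stage.EMPTY_SITE,
}


def next_stage(stage: Stage) -> Stage:
    """Return the stage that follows `stage` in the cycle."""
    return TRANSITIONS[stage]


# Introns reported in the text to carry HEGs with frameshift deletions.
DEGENERATE_EXAMPLES = {
    ("U5", "nrdB"): Stage.DEGENERATE_HEG,
    ("U5", "nrdD"): Stage.DEGENERATE_HEG,
    ("T4", "nrdB"): Stage.DEGENERATE_HEG,
}

if __name__ == "__main__":
    stage = Stage.EMPTY_SITE
    for _ in range(len(TRANSITIONS)):
        print(stage.value)
        stage = next_stage(stage)
```

In this representation the final transition (precise intron loss) is the one for which no evidence was found among the phages studied here, so it should be read as the proposed, not the observed, closing step.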
If homing sites are available and the introns can spread rapidly and efficiently among the phages, why are there so few T-even-like phages with introns? First, if the frequency of mixed infections between populations in Nature is low, the horizontal spread of introns would be hampered. Such frequencies are not easily estimated and some phages of Gram-negative bacteria, including T2 and T4, and plausibly most of the T-even phages, carry genetic systems for limiting secondary phage infections. Second, even if a recognition site for endonucleolytic cleavage is present, the efficiency of homing is also dependent upon longer regions (at least 50 bp) of sequence identity in the exon sequences (5,22,23), a requirement that is not met by other than closely related phages. Our data also show that the presence of larger gene rearrangements in regions flanking the intron-containing genes can have a negative effect on homing. Third, the phenomenon of partial exclusion (43) of regions from one phage by the other in mixed infections strongly reduces the efficiency of intron homing. The mechanisms behind general or partial exclusion are still not fully understood, but several recent studies have shown that the activity of homing endonucleases can (at least partly) explain the dominance of a certain genetic region from one phage over another in mixed infections (48,50). Putative free-standing homing endonuclease genes with sequence homology to the H-N-H family are present near both the nrdD (mobC) and the td-nrdB (mobE) regions in T4. What effects competition between two nearby homing endonucleases may have on each other remain to be investigated. Liu et al. (50) recently reported the same high inheritance of the T4 nrdB intron in mixed infections with T2 as we find here and speculated on the possibility of the flanking mobE gene being a functional homing endonuclease with a cleavage site within the first exon of nrdB, thereby promoting homing of the T4 nrdB gene, including the intron. In support of the functionality of mobE as a homing endonuclease, our sequence data of the nrdA-nrdB region differ from the T4 genome sequence (obtained from a cloned fragment) by two frame shifts and show a longer, continuous mobE open reading frame spanning the whole distance between the nrdA and the nrdB genes. Because homing endonucleases are notoriously difficult to clone as functional proteins, this suggests that the originally cloned sequence is of a non-functional version of mobE. 3 These possible limitations to horizontal transfer between phages in Nature, together with our observation that the intron HEGs appear to be rapidly degenerating once inserted into the phage, may explain why introns are not a more general feature of phages.
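The second limitation above, the need for at least 50 bp of sequence identity in the exons, can be illustrated with a short check. This is a minimal sketch under stated assumptions: it treats the requirement as uninterrupted identity on each side of the insertion site in pre-aligned, equal-length exon sequences, which is a simplification of our own; the function names and the toy sequences are hypothetical and do not represent real phage data.

```python
from typing import Tuple

MIN_FLANK_BP = 50  # "at least 50 bp" of exon identity, as cited in the text


def contiguous_identity(donor: str, recipient: str, site: int) -> Tuple[int, int]:
    """Count identical bases extending left and right from the insertion site.

    `donor` and `recipient` are assumed to be pre-aligned exon sequences of
    equal length (intron removed); `site` is the index of the first base
    downstream of the insertion point.
    """
    if len(donor) != len(recipient):
        raise ValueError("sequences must be aligned to the same length")
    left = 0
    i = site - 1
    while i >= 0 and donor[i] == recipient[i]:
        left += 1
        i -= 1
    right = 0
    j = site
    while j < len(donor) and donor[j] == recipient[j]:
        right += 1
        j += 1
    return left, right


def meets_flank_requirement(donor: str, recipient: str, site: int,
                            min_bp: int = MIN_FLANK_BP) -> bool:
    """True if both flanks show at least `min_bp` of uninterrupted identity."""
    left, right = contiguous_identity(donor, recipient, site)
    return left >= min_bp and right >= min_bp


# Toy sequences: 60 bp upstream and 60 bp downstream of the insertion site.
donor = "A" * 60 + "C" * 60
diverged = donor[:40] + "G" + donor[41:]  # one mismatch 20 bp upstream of the site

print(meets_flank_requirement(donor, donor, site=60))
print(meets_flank_requirement(donor, diverged, site=60))
```

A stricter or looser criterion (for example, percent identity over a window rather than uninterrupted identity) could be substituted; the cited studies, not this sketch, define the actual requirement.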