Identification and Characterization of an Essential Telomeric Repeat Binding Factor in Fission Yeast

Whereas mammalian cells harbor two double-strand telomeric repeat binding factors, TRF1 and TRF2, the fission yeast Schizosaccharomyces pombe has been thought to harbor solely the TRF1/TRF2 ortholog Taz1p to perform comparable functions. Here we report the identification of telomeric repeat binding factor 1 (Tbf1), a second TRF1/TRF2 ortholog in S. pombe. Like Taz1p, the identified Tbf1p shares amino acid sequence similarity, as well as structural and functional characteristics, with the mammalian TRF1 and TRF2 proteins. This family of proteins shares a common architecture with two separate structural domains. An N-terminal domain is necessary and sufficient for the formation of homodimers, and a C-terminal MYB/homeodomain mediates sequence-specific recognition of double-stranded telomeric DNA. The identified Tbf1p binds S. pombe telomeric DNA with high sequence specificity in vitro. Targeted deletion of the tbf1 gene reveals that it is essential for survival, and overexpression of the tbf1 gene leads to telomere elongation in vivo, which is dependent upon the MYB domain. These data suggest that fission yeast, like mammals, has two factors that bind double-stranded telomeric DNA and perform distinct roles in telomere length regulation.

Telomeres are the dynamic DNA-protein complexes that make up the natural ends of eukaryotic chromosomes. Early cytological and genetic studies revealed that telomeres are necessary for chromosome stability and genome maintenance (1,2). It was demonstrated that broken chromosomes fuse end-to-end and become dicentric, ring-formed, or adopt other unstable forms that cause genomic instability. The instability caused by broken chromosome ends contrasted with the stability of natural chromosome ends, and suggested that telomeres are essential structures that make the natural ends of eukaryotic chromosomes unique. Without telomeres, eukaryotic chromosomes are unstable and suffer chromosomal rearrangements.
Telomeric DNA consists of extended arrays of tandem repeats with common endings. One strand is rich in guanines and thymines and forms a single-stranded 3′ extension (3). Consequently, the complementary strand is rich in cytosines and adenines, and has a recessed 5′ end. Moreover, telomeric repeat sequences appear well conserved through evolution. All vertebrates have an identical GGTTAG hexanucleotide telomeric repeat sequence (4), whereas the telomeric repeat sequence in Schizosaccharomyces pombe is somewhat degenerate. The general S. pombe consensus sequence is GGTTAC(A)(C)G0-6 (GGTTAC followed by an optional A, an optional C, and zero to six G residues), with GGTTACA being the most frequently occurring repeat (5).
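The degenerate consensus quoted above can be written as a regular expression, which makes it easy to test whether a candidate repeat unit fits. The following is a minimal sketch in Python; the function name `is_consensus_repeat` is hypothetical, and the pattern assumes the usual reading of the notation (parentheses mark optional bases, G0-6 means zero to six G residues).

```python
import re

# Assumed reading of GGTTAC(A)(C)G0-6:
# GGTTAC, then an optional A, an optional C, then zero to six G residues.
CONSENSUS = re.compile(r"GGTTACA?C?G{0,6}")

def is_consensus_repeat(unit: str) -> bool:
    """Return True if a single repeat unit fits the consensus exactly."""
    return CONSENSUS.fullmatch(unit.upper()) is not None

if __name__ == "__main__":
    for unit in ("GGTTACA", "GGTTAC", "GGTTACACGG", "GGTTAG"):
        print(unit, is_consensus_repeat(unit))
```

The check applies to one repeat unit at a time. Splitting a long telomeric tract into units is ambiguous, because the trailing G residues of one unit can also be read as the leading GG of the next, so this sketch does not attempt it.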
Conventional replication of linear chromosomes fails to make a complete copy of the lagging strand because of discontinuous DNA synthesis and the requirement for RNA priming (6), or because of loss of the overhang at the end of leading strand synthesis (7). To complete replication, unicellular eukaryotes like S. pombe therefore depend on the telomerase enzyme (8). Telomerase is a dimeric reverse transcriptase that uses part of its intrinsic RNA moiety as a template to extend the guanine-rich strand of the telomeric DNA (9-12). In higher eukaryotes telomere extension occurs in all cells during early development, but is later restricted to a select subset of cells including the germline and stem cells. As a consequence, the length of the telomeric DNA gradually shortens in differentiated cells (3). It has been suggested that this provides a molecular clock that tells cellular age, and that telomere shortening serves as a tumor suppressor mechanism that prevents accumulation of mutations (13). A differentiated cell enters an irreversible state of arrested growth known as replicative senescence (14) when its shortest telomeres eventually become critically short (15).
Telomere extension largely depends on the state of the telomere, whose conformation is modulated by proteins that bind telomeric DNA sequence specifically and fine-tune the synthesis of telomeric DNA (16-22). Three DNA-binding telomeric proteins have been found in man and other organisms with the telomeric repeat sequence GGTTAG. Two structurally related proteins, telomeric repeat binding factors 1 (TRF1) (23) and 2 (TRF2) (24,25), bind double-stranded telomeric DNA, whereas protection of telomeres 1 (POT1) binds telomeric single-stranded 3′ extensions (26-29).
A POT1 ortholog has been identified and characterized in S. pombe (26), and Taz1p was the first TRF1/TRF2 ortholog to be discovered in S. pombe (30). However, despite suggestions that S. pombe also contains two telomeric recognition factors (31,32), a second TRF1/TRF2 ortholog remained to be identified. Here we report the identification and characterization of telomeric repeat binding factor 1 (Tbf1), a second TRF1/TRF2 ortholog in S. pombe.

EXPERIMENTAL PROCEDURES
Sequence Analysis, Cloning, and Protein Expression-A candidate S. pombe TRF1/TRF2 ortholog was identified from sequence homology. ClustalW sequence alignments were made using the BLOSUM62 substitution scoring matrix with open gap and extend gap penalties of 10 and 0.5, respectively. The identified tbf1 gene was amplified from S. pombe chromosome II cosmid c19G7 (GenBank accession number AL021839) by standard PCR using Vent polymerase (New England Biolabs), as were the nucleotide sequences encoding two spTbf1p deletion mutants (aa 1-411 and aa 405-456). The taz1 gene was amplified from S. pombe chromosome I cosmid c16A10 (GenBank accession number Z97185). The PCR products were cloned into a modified pET30a vector (Novagen) containing an N-terminal His6 tag/S-tag followed by a Tobacco Etch Virus (TEV) protease cleavage site, and the inserts were verified by sequencing (Geneservice). The constructed plasmids were transformed into Escherichia coli (E. coli) Rosetta cells (Novagen), and protein expression was induced by adding isopropyl β-D-thiogalactopyranoside to a final concentration of 0.5 mM when the cell cultures reached an optical density (λ = 600 nm) of about one. Cells were harvested by centrifugation (5000 × g for 15 min), and pellets were resuspended in lysis buffer (50 mM Tris/HCl, pH 8.0, 500 mM KCl, 1% Triton X-100, 2 mM DTT) containing protease inhibitors (Complete EDTA-free protease inhibitor mixture tablet, Roche Applied Science). Cells were disrupted by sonication and, following centrifugation (25,000 × g for 30 min), the supernatant was incubated for 60 min with Ni-NTA resin (Qiagen) equilibrated in lysis buffer. After centrifugation (200 × g for 3 min), the Ni-NTA resin was washed with wash buffer (50 mM Tris/HCl, pH 8.0, 500 mM KCl, 10 mM imidazole, 2 mM DTT, 10% glycerol), and the protein was stepwise eluted with wash buffer containing 200 mM imidazole.
Protein-containing fractions were pooled and dialysed into digestion buffer (50 mM Tris/HCl, pH 8.0, 500 mM KCl, 2 mM DTT, 10% glycerol). TEV proteolytic digestion was performed overnight, with TEV protease at a ratio of approximately 1:100 to eluted protein, to remove the N-terminal His6 tag/S-tag. The TEV protease was itself His6-tagged, which allowed the cleaved protein to be purified from TEV protease and cleaved N-terminal His6 tag/S-tag by binding to Ni-NTA resin equilibrated with digestion buffer. The proteins were further purified by gel filtration on a Superdex 200 column also equilibrated with digestion buffer, and dialysed into binding buffer (50 mM Tris/HCl, 125 mM KCl, 5 mM DTT, 10% glycerol).
Proteolysis, Western Blotting, and N-terminal Sequencing-Purified full-length protein was digested at room temperature using a protease mixture containing papain, trypsin, and chymotrypsin, each at a final concentration of 25 μg/ml, and aliquots were removed into 2× SDS loading buffer at 15-min intervals. Proteolytic fragments and prestained markers were separated using standard SDS-PAGE, and the protein content of the gel was transferred onto Immobilon-P polyvinylidene difluoride membrane (Millipore) for N-terminal sequence analysis by automated Edman degradation in a Procise 494 Protein Sequencer.
Electrophoretic Mobility Shift Assays-Synthetic oligonucleotides (Sigma) containing common S. pombe telomeric repeat sequences were purified by polyacrylamide gel electrophoresis and hybridized in equimolar amounts to make up double-stranded telomeric DNA. The pNSU70 plasmid containing cloned S. pombe telomeric DNA (originally constructed by Neal Sugawara, PhD thesis, Harvard University, 1989) was digested by ApaI and SacI (New England Biolabs) to produce a 295-bp fragment consisting of 263-bp telomeric DNA, equivalent to 30 naturally occurring S. pombe telomeric DNA repeats, and 32-bp telomere-associated sequence. The restriction fragment was purified by standard agarose gel electrophoresis. DNA was radiolabeled using 5′-[γ-32P]triphosphate (Amersham Biosciences) and T4 polynucleotide kinase (New England Biolabs). Unincorporated probe was removed on Bio-Spin P30 columns (Bio-Rad). Radiolabeled DNA (1 nM) was incubated with protein in binding buffer (50 mM Tris/HCl, pH 8.0, 125 mM KCl, 5 mM DTT, 10% glycerol) containing 100 μg/ml bovine serum albumin (New England Biolabs) and 10 μg/ml sheared E. coli DNA for 30 min at +4°C, and reaction mixtures were analyzed by native agarose (BioGene HiPure Low EEO agarose) gel electrophoresis (0.5-3% agarose, 0.25× TB) at 7.5 V/cm for up to 80 min at +4°C. Gels were dried onto DE81 anion exchange chromatography paper (Whatman) and scanned using a Typhoon 8600 imaging system (Amersham Biosciences).
S. pombe Strain Construction and Culture Conditions-All strains used in this study (Table 1) were cultured in YE or EMM medium as described (33). The diploid strain JCF1101 was made by mating JCF22 and JCF24. A linear DNA fragment carrying the kanMX module (34) was generated by PCR using primers corresponding to sequences flanking the open reading frame of the identified gene (fwd: AGCAATCGAT TAATCAGACT GCTTCTCACC TATAGTTTGT ATTTTCTTTG ATCAATTGAA AAACTACGAT TTCCAAGAAA CGGATCCCCG GGTTAATTAA, rev: AATGCCTCAC GCTTAATCGT TTTAGTATTT AAAAAAAAAT CGCAAACTTA ATCAAGAATG AAATAAACTC CTGATACTAC GAATTCGAGC TCGTTTAAAC). This was used to transform the diploid to G418 resistance, and deletion of one allele was confirmed by Southern hybridization (Table 1).
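Long gene-targeting primers printed in blocks of ten are easy to corrupt when copied, so a quick sanity check before ordering can help. The sketch below, in Python, strips grouping spaces and stray line-break hyphens and then splits each primer into a gene-flanking part and a 3′ tail. The helper names are hypothetical, and the 20-nt tail length is an assumption about the cassette-annealing portion, not a value stated in the text.

```python
import re

# Primer sequences as printed in the text (blocks of ten nucleotides).
FWD = ("AGCAATCGAT TAATCAGACT GCTTCTCACC TATAGTTTGT ATTTTCTTTG "
       "ATCAATTGAA AAACTACGAT TTCCAAGAAA CGGATCCCCG GGTTAATTAA")
REV = ("AATGCCTCAC GCTTAATCGT TTTAGTATTT AAAAAAAAAT CGCAAACTTA "
       "ATCAAGAATG AAATAAACTC CTGATACTAC GAATTCGAGC TCGTTTAAAC")

TAIL_LENGTH = 20  # assumption: length of the 3' tail that anneals to the cassette

def normalize(primer: str) -> str:
    """Remove whitespace and hyphens, upper-case, and validate the alphabet."""
    cleaned = re.sub(r"[\s\-]", "", primer).upper()
    if not re.fullmatch(r"[ACGT]+", cleaned):
        raise ValueError("primer contains characters other than A, C, G, T")
    return cleaned

def split_primer(primer: str, tail_length: int = TAIL_LENGTH):
    """Return (gene-flanking part, 3' tail) of a normalized primer."""
    seq = normalize(primer)
    return seq[:-tail_length], seq[-tail_length:]

if __name__ == "__main__":
    for name, primer in (("fwd", FWD), ("rev", REV)):
        flank, tail = split_primer(primer)
        print(name, len(normalize(primer)), "nt; flank", len(flank), "nt; tail", tail)
```

Under the stated assumption, each primer would consist of a long stretch matching sequence next to the open reading frame followed by a short tail that primes on the kanMX module.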
Protein Overexpression and Telomere Length Measurement-The following plasmids, carrying the identified gene under control of the NMT81 promoter, were constructed using the pNMT TOPO Expression kit (Invitrogen): pNMT81-tbf1, pNMT81-tbf1-V5, pNMT81-tbf1-ΔMYB-V5 (V5 = V5 epitope tag). These were used to transform JCF109 to leucine prototrophy in the presence of thiamine. Single colonies were re-isolated while maintaining plasmid selection and thiamine repression, and then grown to late log phase in EMM-leucine, with or without thiamine (5 μg/ml). Samples were isolated for Western blot and Southern blot analyses. Mouse ANTI V5-TAG (Serotec MCA1360) was used as primary antibody in the detection of the V5 epitope, with anti-mouse IgG-HRP (GE Healthcare NA931V) as secondary antibody. The loading control was Cdc2 p34 PSTAIRE, detected with rabbit polyclonal primary antibody (Santa Cruz Biotechnology SC-53), with anti-rabbit IgG-HRP (Amersham Biosciences NA934) as secondary antibody.
Telomere length was assessed as described (30), except that Southern blots were probed with a random prime-labeled 450-bp synthetic telomeric fragment (35).

RESULTS AND DISCUSSION
Identification of a Candidate TRF1/TRF2 Ortholog in S. pombe-A candidate TRF1/TRF2 ortholog from S. pombe was identified from sequence homology. Sequence comparisons between human telomeric repeat binding proteins and their S. pombe orthologs reveal that the amino acid sequences of human POT1 and S. pombe Pot1p share around 19% identity and 32% homology. The Taz1p amino acid sequence shares ~16% identity and 30% homology with TRF1 as well as TRF2, while the identified S. pombe Tbf1p shares in the region of 17% identity and 30% homology with both TRF1 and TRF2 (Fig. 1).
Despite moderate overall sequence homology, secondary structure prediction using a combination of the Profile-based neural network prediction of protein structure (PredictProtein) (36) and Predictor of natural disordered regions (PONDR) (37) algorithms reveals striking similarities between the human TRF1 and TRF2 proteins and the spTbf1p (Fig. 1). The human TRF1 and TRF2 proteins both contain unstructured N termini. Similarly, the N terminus (aa 1-64) of the spTbf1p is predicted to lack fixed secondary structure. The human TRF1 and TRF2 proteins both have large homodimerization domains that consist of nine α-helices, which despite low sequence homology are structurally conserved (38). The spTbf1p has a corresponding structural domain (aa 71-287), which consists of nine predicted α-helices with very high probability scores. The homodimerization domains of the human TRF1 and TRF2 proteins are connected to C-terminal DNA-binding MYB/homeodomains via flexible linker peptides, which contain putative nuclear localization domains. The spTbf1p has a predicted C-terminal DNA-binding MYB/homeodomain (aa 405-456), with a linking peptide, which is predicted to lack fixed secondary structure and which contains a putative nuclear localization signal (aa 329-335). Notably, the reliability of the PROF and PONDR algorithms was validated through their ability to predict all secondary structure elements and unstructured regions for the human TRF1 and TRF2 proteins with very high precision.
Figure legend: A time course proteolysis study shows that the spTbf1p contains two distinct structural domains that resist degradation. The ~25-kDa peptide fragment corresponds to a structural domain containing nine predicted α-helices (aa 76-287), which likely mediates dimerization. The ~10-kDa peptide fragment corresponds to a C-terminal DNA-binding MYB/homeodomain (aa 408-485), which mediates sequence-specific recognition of S. pombe telomeric DNA.
Sequence analyses and secondary structure predictions indicate that the spTbf1p, just like the spTaz1p, is a TRF1/TRF2 ortholog.
The spTbf1p Has the Same Architecture as Mammalian TRF1 and TRF2-The results from the sequence analyses and secondary structure predictions encouraged us to explore the structural architecture of the spTbf1p in more detail. To this end we cloned, expressed, and purified the spTbf1p to homogeneity. Purification by size exclusion chromatography immediately suggested that the oligomeric state of the spTbf1p is a dimer, and a preferred homodimeric arrangement was confirmed by sedimentation analysis by ultracentrifugation (data not shown). Thus, just like the human TRF1 and TRF2 proteins (38), and the spTaz1p (32), the spTbf1p forms homodimers.
To establish the domain organization of the spTbf1p we performed a time course proteolysis study of the purified protein. This left two proteolytic peptide fragments, with molecular masses of ~25 kDa and ~10 kDa, which resisted degradation (Fig. 2). Western blotting and N-terminal sequence analysis revealed that the first eight amino acids of the ~25-kDa peptide fragment had the sequence MNQGMDYA, which is identical to residues 76-83 of the amino acid sequence of the spTbf1p. The first eight amino acids of the ~10-kDa peptide fragment had the sequence SWTKEEEE, which exclusively matches residues 407-414 of the amino acid sequence of the spTbf1p. These two protease-resistant peptide fragments likely represent highly structured domains, which render internal cleavage sites inaccessible to proteases.
We next compared the information obtained from the proteolysis study with the predicted structural organization of the identified TRF1/TRF2 ortholog. The spTbf1p is predicted to contain twelve α-helical secondary structure elements, organized in two domains which are connected by an unstructured linker (Fig. 1). This domain organization is conserved in the human TRF1 and TRF2 proteins (38,39). The amino acid sequence MNQGMDYA, representing the N terminus of the ~25-kDa proteolysis fragment, falls within the first predicted α-helix of the domain that contains the nine predicted α-helices (aa 76-287). The calculated molecular mass of the predicted domain is ~24 kDa, which corresponds very well with the experimentally determined molecular mass of the ~25-kDa proteolysis fragment. This probably corresponds to a highly structured domain similar to those found in the human TRF1 and TRF2 proteins, in which nine α-helices pack tightly to form a compact homodimerization domain (38). The amino acid sequence SWTKEEEE, representing the N terminus of the ~10-kDa proteolysis fragment, is found within the first predicted α-helix of the domain that contains the remaining three predicted α-helices (aa 407-485). The calculated molecular mass of this domain is 9 kDa, which corresponds well with the experimentally determined molecular mass of the ~10-kDa proteolysis fragment. From sequence homology, and available structural information (39), this domain would be expected to fold into a three α-helix bundle that makes up the C-terminal DNA-binding MYB/homeodomain.
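The agreement between domain boundaries and fragment sizes can be checked with a back-of-the-envelope estimate. The sketch below, in Python, multiplies the number of residues in each domain by an average residue mass of 110 Da. That average is a common rule of thumb and an assumption here; the calculated masses quoted in the text were presumably derived from the actual sequence, which this estimate does not use. The function name `approx_mass_kda` is hypothetical.

```python
# Rule-of-thumb average mass per amino acid residue (assumption, not from the text).
AVERAGE_RESIDUE_MASS_DA = 110.0

def approx_mass_kda(first_residue: int, last_residue: int) -> float:
    """Estimate the mass in kDa of a fragment spanning the given residues (inclusive)."""
    n_residues = last_residue - first_residue + 1
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0

if __name__ == "__main__":
    domains = {"helical domain": (76, 287), "MYB/homeodomain": (407, 485)}
    for label, (start, end) in domains.items():
        print(f"{label}: aa {start}-{end}, about {approx_mass_kda(start, end):.1f} kDa")
```

With these boundaries the estimate gives roughly 23 kDa for aa 76-287 (212 residues) and roughly 9 kDa for aa 407-485 (79 residues), in line with the ~24-kDa and 9-kDa calculated masses and with the ~25-kDa and ~10-kDa fragments.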
We conclude from the proteolysis study that the spTbf1p, just like the human TRF1 and TRF2 proteins (38,39) and the spTaz1p (32), contains two distinct structural domains. The larger N-terminal domain is likely responsible for the formation of homodimers, whereas the smaller C-terminal domain is a conserved MYB/homeodomain that could mediate sequence-specific recognition of telomeric DNA.
Figure legend: The TRF1/TRF2 family of proteins binds double-stranded telomeric DNA via conserved MYB/homeodomains. A, residues that are identical across species can be separated into two groups. B, hydrophobic residues (gray) make up a hydrophobic core that maintains the fold of the three-helix bundle, while charged and aromatic residues (red, blue, and gray) belonging to the third α-helix (bottom) of the MYB/homeodomain mediate specific recognition of telomeric DNA repeat sequences. GenBank accession numbers in parentheses.
The spTbf1p Binds S. pombe Telomeric DNA in Vitro-Because the spTbf1p appeared to have the same overall architecture as the human TRF1 and TRF2 proteins, we proceeded to investigate whether it bound S. pombe double-stranded telomeric DNA sequence specifically. Binding was investigated using cloned S. pombe telomeres and synthetic oligonucleotides, each containing two repeats of the most frequently occurring S. pombe telomeric repeat sequences (ACACAGGTTACAGGTTACG, ACACAGGGTTACAGGGTTACG, and ACACAGGGGTTACAGGGGTTACG) (5) flanked by AC dinucleotides and a 3′-G to prevent oligonucleotides from forming multimers.
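The design of the three oligonucleotides can be reproduced programmatically, which also yields the complementary strand needed to anneal a double-stranded probe. The following Python sketch reads each printed sequence as an ACAC 5′ flank, two copies of a repeat unit, and a 3′ G. That decomposition, the register in which the units are written (AGGTTAC and so on), and the function names are assumptions made for illustration.

```python
# Repeat units in the register in which they appear in the printed oligonucleotides.
REPEAT_UNITS = ["AGGTTAC", "AGGGTTAC", "AGGGGTTAC"]
FIVE_PRIME_FLANK = "ACAC"   # assumed reading of the "AC dinucleotides"
THREE_PRIME_FLANK = "G"     # the 3' G mentioned in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def build_oligo(unit: str, copies: int = 2) -> str:
    """Assemble flank + tandem repeat units + 3' G (top strand, 5' to 3')."""
    return FIVE_PRIME_FLANK + unit * copies + THREE_PRIME_FLANK

def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]

if __name__ == "__main__":
    for unit in REPEAT_UNITS:
        top = build_oligo(unit)
        print(top, reverse_complement(top))
```

Assembling the three units this way regenerates the three sequences listed above, so the decomposition is at least consistent with the printed oligonucleotides.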
Electrophoretic mobility shift assays show that full-length spTbf1p (Fig. 3A), just like Taz1p (Fig. 3B), binds double-stranded S. pombe telomeric DNA with high affinity and specificity in vitro. The spTbf1p polymerizes along a DNA fragment containing 30 naturally occurring S. pombe telomeric DNA repeats without significant cooperativity (Fig. 3C). The spTbf1p thus coats S. pombe telomeric DNA in the same way that TRF1 coats human telomeric DNA in vitro (40,41). The spTbf1p does not bind single-stranded telomeric DNA, nor does it bind nontelomeric G-rich DNA (Fig. 3D). Moreover, while the MYB/homeodomain of spTbf1p binds S. pombe telomeric DNA on its own, the Tbf1p dimerization domain exhibits no detectable affinity for telomeric DNA (Fig. 3E). Electrophoretic mobility shift assays using oligonucleotides containing two telomeric repeats and full-length spTbf1p produce a single slow-migrating band (Fig. 3A). In contrast, the same experiment using only the MYB/homeodomain of spTbf1p yields two bands (Fig. 3E), further supporting the notion that full-length spTbf1p preferentially forms homodimers.
The high degree of sequence homology between the DNA-binding MYB/homeodomains within the telomeric recognition factor family of proteins is particularly noteworthy (Fig. 4). All family members, including the S. cerevisiae TBF1 protein, use a conserved MYB/homeodomain motif for sequence-specific recognition of double-stranded DNA (42).
Taken together, the DNA binding studies demonstrate that the spTbf1p binds S. pombe double-stranded telomeric DNA sequence specifically, and thus may function at telomeres in vivo.
The tbf1 Gene Is Essential for Survival-To explore in vivo functions of the spTbf1p, the entire open reading frame of one copy of the tbf1+ gene was replaced with the KanR marker in a diploid S. pombe strain (JCF1101). While asci dissected from the parental JCF1101 diploid each yielded four colonies (all of which were KanS), each ascus dissected from tbf1+/- diploids yielded only two colonies and these were both KanS (Fig. 5), indicating that the gene is essential. Microscopic examination of tbf1Δ spores that failed to produce colonies revealed that they produced only one or two cells before dying. However, unlike fission yeast whose telomeres erode due to loss of telomerase (trt1) or pot1 (26,43), survivors harboring circular chromosomes do not arise in tbf1Δ strains. Similar observations have been reported for the mammalian trf1 gene. Targeted deletion of exon 1 of the trf1 gene causes early embryonic lethality in mice (44). The trf1 gene family thus appears to share a common essential function, and we proceeded to investigate a role of the spTbf1p in telomere length regulation.
The spTbf1p Affects Telomere Length in Vivo-To address the role of spTbf1p in telomere length regulation, we overexpressed versions of the tbf1+ gene on plasmids under control of the NMT81 promoter. Expression from this promoter can be repressed by growing cells in medium containing thiamine. For clarity, here we say that the culture is induced if it lacks thiamine and repressed if it contains thiamine.
Transformants were isolated under repressing conditions, then induced or repressed in liquid medium for 6 days. A transformant carrying epitope-tagged tbf1+ (pNMT81-tbf1-V5) displayed telomere elongation by ~100-150 bp when induced (Fig. 6A, lane 2), relative to the same strain carrying an empty vector (Fig. 6, lane 11). Similar elongation was seen when untagged tbf1+ (pNMT81-tbf1) was induced (Fig. 6, lane 8). The elongation seen for full-length tbf1+ was not observed in a C-terminal deletion epitope-tagged tbf1 (pNMT81-tbf1-ΔMYB-V5), which lacks the last 82 MYB-containing amino acids (Fig. 6, lane 5). We saw intermediate telomere elongation at day 0 (repressed) for transformants carrying full-length tbf1+ constructs (Fig. 6, lanes 1 and 7). This is probably due to incomplete repression of the NMT81 promoter by thiamine. Indeed, after a further 6 days of growth in repressing medium, we saw a further slight increase in telomere length (Fig. 6, lanes 3 and 9), consistent with incomplete repression. We also found that the stronger promoter, pNMT41, led to equivalent telomere elongation (150 bp) under repressing and inducing conditions (data not shown). However, the telomeres never elongated beyond 150 bp, suggesting that a new mean telomere length is maintained when tbf1+ is overexpressed. In agreement with our observation, the S. cerevisiae TBF1 protein was recently reported to provide a telomere length-sensing mechanism, which allows telomerase to preferentially elongate short telomeres (45).
To confirm that Tbf1p was overexpressed, we visualized samples from the above experiment by Western blot (Fig. 6B, same loading order as 6A). We were able to detect high levels of Tbf1-V5 (calculated molecular mass 57.6 kDa) and Tbf1-ΔMYB-V5 (calculated molecular mass 48.2 kDa) when induced, and low levels of the full-length protein when repressed (consistent with the mild telomere elongation discussed above).
In conclusion, the in vivo experiments reveal that the tbf1 gene is essential for survival, that the encoded spTbf1p can affect telomere length, and that this elongation is dependent upon the presence of the MYB domain.
The spTbf1p May Be a Fission Yeast Ortholog of Mammalian TRF1-Mammalian telomeric dsDNA is bound by telomeric repeat binding factors TRF1 (23) and TRF2 (24,25). Both proteins form homodimers (38) that bind telomeric dsDNA directly through MYB/homeodomains (46), both proteins appear to be regulators of telomere length (22,47), and both proteins recruit additional proteins to telomeres. Specifically, TRF2 protects telomeres from end-to-end fusions (48) and recruits RAP1 (49) to mammalian telomeres. In a similar way spTaz1p, a negative regulator of telomere length and a confirmed TRF1/TRF2 ortholog, protects S. pombe telomeres from end-to-end fusions (50) and recruits spRAP1 to S. pombe telomeres (51).
TRF1 recruits TIN2 (52), TANK (53) and PINX1 (54) to mammalian telomeres. Preliminary sequence analyses and secondary structure predictions suggest that all of the mammalian TRF1 interacting partners have potential orthologs in S. pombe, but their roles in telomere length regulation remain to be clarified. In the meantime we speculate that the identified spTbf1p may be a fission yeast ortholog of mammalian TRF1. The identification of a second telomeric repeat binding factor in S. pombe suggests that telomere maintenance machineries may be more evolutionarily conserved than previously thought, and supports the view that telomeric proteins contain multiple distinct functional domains which can move relative to one another to modulate the dynamic structure of telomeres (46). It may also help reveal new functions of telomeres in general and telomeric repeat binding factors in particular.