A Plant Gene Encoding a Myb-like Protein That Binds Telomeric GGTTTAG Repeats in Vitro *

A gene (AtTRP1) encoding a telomeric repeat-binding protein has been isolated from Arabidopsis thaliana. AtTRP1 is a single copy gene located on chromosome 5 of A. thaliana. The protein AtTRP1 encoded by this gene is not only homologous to the Myb DNA-binding motifs of other telomere-binding proteins but is also similar to several initiator-binding proteins in plants. Gel retardation assay revealed that the 115 residues at the C terminus of this protein, including the Myb motif, are sufficient for binding to the double-stranded plant telomeric sequence. The isolated DNA-binding domain of AtTRP1 recognizes each telomeric repeat centered on the sequence GGTTTAG. The almost full-length protein of AtTRP1 does not form any complex at all with DNA fragments carrying four or fewer GGTTTAG repeats. However, it forms a complex with the sequence (GGTTTAG)8 more efficiently than with the sequence (GGTTTAG)5. These data suggest that the minimum length of a telomeric DNA for AtTRP1 binding consists of five GGTTTAG repeats and that optimal AtTRP1 binding may require eight or more GGTTTAG repeats. They also imply that AtTRP1 may bind in vivo primarily to the ends of plant chromosomes, which consist of long stretches of telomeric repeats.

Telomeres, the specialized nucleoprotein structures at the ends of eukaryotic chromosomes, are essential for the maintenance of chromosome integrity (1, 2). The telomeric DNA in most eukaryotic chromosomes consists of tandemly repeated sequences (3); the sequence TTTAGGG has been identified in the telomeres of most plant species (4, 5). The length of plant telomeres varies from a few kilobase pairs (kbp)¹ in Arabidopsis (6) to a few hundred kbp in Nicotiana (7). A study in maize suggested that variation of telomere length among strains is controlled by multiple genes (8). Studies in yeast and mammalian cells revealed that telomere length is regulated by multiple factors including telomerase and telomeric repeat-binding proteins (TRPs) (9). Telomerase is a reverse transcriptase that uses an internal RNA moiety as a template for the extension of the G-rich strand at the ends of DNA molecules (10) under the control of telomerase-associated proteins (11-14). In human cells, telomerase activity and telomere length are tissue-specific and under developmental control (15). In plants, telomerase activity is also correlated with cellular proliferation capacity (16-18). However, telomere length remains unchanged during development in some species but shortens dramatically in others (19-21). This implies that telomerase may not be the sole factor controlling telomere length in plants. Disruption of the gene for the telomerase catalytic subunit in Arabidopsis results in only a slow loss of telomeric DNA (22), indicating that telomerase-deficient plant cells may have other, unknown mechanism(s) to prevent telomeres from shortening rapidly.
TRPs bind directly to either single-stranded or double-stranded telomeric DNA. The former interact with the single-stranded 3′ extension of the extreme termini and are important for the maintenance of telomere length by restricting access of telomerase to chromosome termini (23, 24). Double-stranded telomeric repeat-binding proteins, such as Rap1p in budding yeast (25), Taz1p in fission yeast (26), and hTRF1 (27) and hTRF2 (28, 29) in human cells, have various functions. Rap1p, an abundant telomeric protein in budding yeast, inhibits telomere elongation (30) and controls the expression of numerous genes involved in cell growth (31). Taz1p is not only a factor in the negative regulation of telomere length (26) but is also required for meiotic telomere clustering and genetic recombination in fission yeast (32-34). In human cells, telomere length is negatively regulated by hTRF1 (35) and also by hTRF2 (36), which is also required for maintaining telomere integrity (37). Lack of hTRF2 induces apoptosis in some cell lines (38). These results suggest that TRPs play a crucial role in telomere and cellular metabolism.
To explore the structure and function of TRPs in plants, efforts have been made to characterize the factors that can bind to plant telomeric sequence. These studies included the detection of double-stranded telomeric DNA-binding protein in maize and Arabidopsis crude extracts (39), the identification of factors that bind to single-stranded G-rich telomeric repeats in rice and mung bean nuclear extracts (40, 41), and the characterization of an Arabidopsis protein (ATBP1) that binds to the G-rich as well as to the double-stranded telomeric sequence (20, 42). A recent paper described the cloning of a rice cDNA that encodes a protein (RTBP1) that contains a C-terminal telomeric DNA-binding domain (43). The isolated DNA-binding domain of RTBP1 bound specifically to the duplex oligonucleotide sequence (TTTAGGG)4; however, no evidence was shown that the full-length protein of RTBP1 can distinguish DNA fragments carrying long arrays of telomeric repeats from those with short ones. Therefore, it remains unclear whether RTBP1 is bound to the long stretches of telomeric repeats at chromosome ends or to the short telomeric repeats in interstitial regions of plant chromosomes. Moreover, the exact core sequence in the plant telomeric repeat recognized by the isolated DNA-binding domain of RTBP1 has not yet been defined. The mode of interaction between TRP and telomeric DNA in plants therefore remains unclear.
We report here the cloning of an Arabidopsis gene (AtTRP1) and the corresponding cDNA, which was expressed in bacteria to produce a protein that specifically binds to the double-stranded plant telomeric sequence. The DNA-binding domain of AtTRP1 was defined, and the core sequence of each telomeric repeat recognized by the isolated DNA-binding domain of AtTRP1 has been determined. The minimum length of a telomeric DNA fragment bound by an AtTRP1 protein has been defined. Our in vitro evidence indicates that the AtTRP1 protein may be located primarily at the ends of plant chromosomes.

* This work was supported by Grant NSC 88-2311-B-001-080 from the National Science Council and Academia Sinica in the Republic of China. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EBI Data Bank with accession number(s) ATH17722.
§ To whom correspondence should be addressed. Tel.: 886-2-27899590 (Ext. 216); Fax: 886-2-27827954; E-mail: bocmchen@ccvax.sinica.edu.tw.
¹ The abbreviations used are: kbp, kilobase pairs; GAD, GAL4 activation domain; RACE, rapid amplification of cDNA ends; TRPs, telomeric repeat-binding proteins; PCR, polymerase chain reaction; bp, base pair; PMSF, phenylmethylsulfonyl fluoride; IPTG, isopropyl-1-thio-β-D-galactopyranoside; PAGE, polyacrylamide gel electrophoresis; NLS, nuclear localization signals.

EXPERIMENTAL PROCEDURES
Cloning of AtTRP1-The oligonucleotide (TTTAGGG)4 was placed upstream of the minimal promoter in pHISi and pLacZi vectors (CLONTECH, CA). These vectors were linearized and sequentially integrated into the chromosomes of yeast strain YM4271 (MATa,112, trp1-903, tyr1-501, gal4-Δ512, gal80-Δ538, ade5::hisG) to generate a reporter strain YR14. Background LacZ activity in late logarithmic phase cells of YR14 was low enough to be distinguished from a positive interaction. Low plating density (10^8 cells per 150-mm plate) and 3-aminotriazole (45 mM) were used to suppress growth on histidine-deficient (−His) plates resulting from the low level of (TTTAGGG)4-HIS3 reporter gene expression. An Arabidopsis cDNA expression library cloned into the GAL4 activation domain (GAD) vector pGAD10 (CLONTECH, CA) was amplified in Escherichia coli. Purified library DNA (30 μg) was used to transform the YR14 strain. Positive clones were selected on plates containing SD/−Leu/−His medium plus 45 mM 3-aminotriazole followed by assay for β-galactosidase activity (CLONTECH, CA). Plasmids in positive clones were rescued by transformation into E. coli. The insert in cDNA clone 1-1 was used as a probe to isolate overlapping cDNA clones from an Arabidopsis cDNA library in λgt11 and genomic clones from an Arabidopsis genomic library in EMBL3SP6/T7 (CLONTECH, CA). To obtain the 5′ end of the full-length cDNA, rapid amplification of cDNA ends (RACE) was performed by polymerase chain reaction (PCR) to amplify the Arabidopsis cDNA library in pGAD10 with flanking primer GADP1 (5′-CTATTCGATGATGAAGATACCCCACCAAACCC-3′) and AtTRP1 primer TBP12 (5′-GGGACTCAAAGATCCCCG-3′). The 5′-RACE PCR product was digested with EcoRI and cloned into pUC18. All the cDNA and genomic clones were sequenced on both strands using an ABI377 automatic sequencer. Homology searches were made against sequences in the GenBank™ database and in the database generated by the Arabidopsis Genome Initiative using BLAST.
Production of AtTRP1 Protein in Bacteria-A full-length cDNA was obtained by digestion of partial cDNA clones (Fig. 1A) with appropriate enzymes, followed by sequential ligation. To generate cDNA subfragments, oligonucleotides containing modified sequences of AtTRP1 were used as primers for PCR to amplify the full-length cDNA: Bamp12 (GAATTTGGATCCGCAAGCTACC, derived from 1068-1089) and Bamp16 (GTACAACGGATCCTCTAAAAGCC, derived from 3685-3663) for Δ12, Bamp3 (TATCTGGATCCGTTACTGATGA, derived from 2007-2028) and Bamp16 for Δ1, Bamp4 (GGTGATGGATCCGTGAGCACTTTAC, derived from 2441-2466) and Bamp16 for Δ2, Bamp5 (CTTGTGAGGATCCCTACCTGTG, derived from 2731-2753) and Bamp16 for Δ3, Bamp6 (GCAGCAGGATCCCAACGCAGAA, derived from 2942-2964) and Bamp16 for Δ4. These PCR products were digested with BamHI and cloned into pET3a (44). To obtain DNA fragment Δ11, the 640-bp XbaI fragment was excised from the DNA fragment Δ12, and the flanking fragments were ligated and cloned into pET3a. To generate DNA fragment Δ31, Bamp3 was coupled with Sacp3 (GTTGTGGAGCTCCTTAGCCGTG, derived from 3328-3306) and Bamp16 was coupled with Sacp1 (GGTCTGTTAGAGCTCTAAATGAATC, derived from 3469-3493) to generate, respectively, DNA fragments Δ16 and Δ18 from the full-length cDNA by PCR amplification. Fragments Δ16 and Δ18 were digested with SacI and BamHI and then ligated with BamHI-cut pET3a. All the truncated cDNAs were sequenced to confirm that the sequences were correct. These truncated cDNAs in pET3a were placed under the control of a bacteriophage T7 RNA polymerase promoter. Each construct was transformed into E. coli BL21(DE3) cells that were then induced by 0.4 mM isopropyl-1-thio-β-D-galactopyranoside (IPTG) to synthesize the truncated AtTRP1 (44). All the AtTRP1 derivatives produced in bacteria contain an N-terminal addition of 14 amino acids encoded by the DNA sequence of gene 10 of T7 phage fused to the BamHI site in pET3a.
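As a quick consistency check on the primer list above, the following minimal sketch confirms that each Bamp primer carries a BamHI recognition site (GGATCC) and each Sacp primer a SacI site (GAGCTC). The primer sequences are copied from the text with line-break hyphens removed; the dictionary and function names are illustrative and are not part of the original protocol.

```python
# Recognition sequences of the two enzymes used for cloning.
SITES = {"BamHI": "GGATCC", "SacI": "GAGCTC"}

# Primers as listed in the text (5' to 3').
PRIMERS = {
    "Bamp12": "GAATTTGGATCCGCAAGCTACC",
    "Bamp16": "GTACAACGGATCCTCTAAAAGCC",
    "Bamp3": "TATCTGGATCCGTTACTGATGA",
    "Bamp4": "GGTGATGGATCCGTGAGCACTTTAC",
    "Bamp5": "CTTGTGAGGATCCCTACCTGTG",
    "Bamp6": "GCAGCAGGATCCCAACGCAGAA",
    "Sacp3": "GTTGTGGAGCTCCTTAGCCGTG",
    "Sacp1": "GGTCTGTTAGAGCTCTAAATGAATC",
}


def expected_enzyme(primer_name):
    """Assumption: the primer name prefix encodes the engineered site."""
    return "BamHI" if primer_name.startswith("Bam") else "SacI"


def site_positions(seq, site):
    """Return all 0-based start positions of `site` within `seq`."""
    width = len(site)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == site]


def check_primers(primers=PRIMERS):
    """Map each primer name to the positions of its expected site."""
    return {
        name: site_positions(seq, SITES[expected_enzyme(name)])
        for name, seq in primers.items()
    }


if __name__ == "__main__":
    for name, positions in check_primers().items():
        print(name, expected_enzyme(name), positions)
```

Each primer should report exactly one position; an empty list would point to a transcription error in the sequence.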
After induction of the cultures with IPTG for 2 h at 37°C, the cells were spun down, washed twice with buffer containing 50 mM Tris, pH 7.5, 0.2 M NaCl, and 5% glycerol, resuspended in 1× DNA-protein binding buffer (20 mM Hepes, pH 7.6, 1 mM EDTA, 5% glycerol, 10 mM (NH4)2SO4, 1 mM dithiothreitol, 0.2% Tween 20, and 30 mM KCl) plus 0.1 mM phenylmethylsulfonyl fluoride (PMSF) and 1× proteinase inhibitor mixture (Roche Molecular Biochemicals), broken by sonication, and centrifuged at 18,000 rpm for 1 h at 4°C. The supernatant was aliquoted and stored at −80°C. The amount of truncated AtTRP1 in each extract was estimated on an 8% SDS-polyacrylamide gel; comparable amounts of truncated proteins were tested for DNA binding activity in gel retardation experiments.
Purification of AtTRP1 Derivatives-The crude extract containing the protein ΔN-463 (Fig. 3A) was precipitated with 45% (NH4)2SO4, and the pellet was resuspended in 1× DNA-protein binding buffer plus 0.1 mM PMSF and 1× proteinase inhibitor mixture prior to fractionation on the plant telomeric sequence-specific DNA affinity column. To purify the almost full-length protein of AtTRP1, the extract containing the protein ΔN-12 (Fig. 3A) was fractionated through the DEAE column in the equilibration buffer (20 mM ethanolamine, pH 9.5, 1 mM dithiothreitol, 1 mM EDTA, 5% glycerol, 0.1 mM PMSF, and 1× proteinase inhibitor mixture) containing stepwise increases in NaCl concentration. Most of the degraded form of ΔN-12 appeared in both the flow-through and the 0.1 M NaCl eluent. The intact form of ΔN-12 was eluted from the DEAE column with 0.4 M NaCl and further purified on the specific DNA affinity column in 1× DNA-protein binding buffer containing stepwise increases in KCl concentration. Proteins ΔN-463 and ΔN-12 were eluted, respectively, at 0.6 and 1 M KCl from the DNA affinity column. The eluents containing telomeric sequence binding activity were desalted, concentrated, and purified once more on the same DNA affinity column, which was set up by coupling the biotinylated duplex (TTTAGGG)8 DNA to streptavidin-agarose beads in 1× DNA-protein binding buffer. The purified AtTRP1 derivatives were stored at −80°C.
Gel Retardation Assay-The single-stranded or duplex oligonucleotides were end-labeled with digoxigenin-11-ddUTP (Roche Molecular Biochemicals) and used as the probe for gel retardation assay. Gel retardation assays were done in a volume of 20 μl containing 1× DNA-protein binding buffer, 1 μg of poly[d(I-C)], 0.1 μg of poly-L-lysine, 1 pmol of digoxigenin-labeled probe, and 5-10 μg of crude extracts or various amounts of the purified proteins. For competition experiments, the binding reaction was supplemented with various amounts of competitor DNA. After 30 min at 25°C, the free and bound DNA were separated by electrophoresis on 4 or 6% polyacrylamide gels in 0.5× TBE (44.5 mM Tris borate, 1 mM EDTA, pH 8.0) at 4°C, transferred to nylon membranes, and detected by chemiluminescent reaction using CSPD (C18H20ClO7PNa2) as substrate (Roche Molecular Biochemicals).
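For orientation, the amounts quoted for the binding reaction can be converted to molar concentrations. The short sketch below does the arithmetic for 1 pmol of probe in a 20 μl reaction and computes a protein-to-probe molar ratio; the function names are illustrative, and the 8 pmol protein amount in the last line is a hypothetical value chosen only to show the calculation.

```python
def concentration_nM(amount_pmol, volume_ul):
    """Convert an amount in pmol and a volume in microliters to nM.

    1 pmol per microliter equals 1 micromolar, i.e. 1000 nM.
    """
    return amount_pmol * 1000.0 / volume_ul


def molar_ratio(protein_pmol, probe_pmol):
    """Molar ratio of protein to DNA probe in one reaction."""
    return protein_pmol / probe_pmol


# Probe concentration in the reaction described in the text.
probe_nM = concentration_nM(1, 20)

# Hypothetical titration point: 8 pmol of protein against 1 pmol of probe.
ratio = molar_ratio(8, 1)

if __name__ == "__main__":
    print(f"probe: {probe_nM:.0f} nM, protein:probe = {ratio:.0f}:1")
```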
Western Hybridization-The purified protein ΔN-12 was either analyzed directly on an 8% denaturing SDS-polyacrylamide gel or subjected to gel retardation assay on a 4% native polyacrylamide gel, electrophoretically transferred onto polyvinylidene difluoride membranes, and hybridized with a polyclonal rabbit antiserum against a synthetic peptide containing residues 34-45 of AtTRP1. The membranes were washed to remove the unbound rabbit antiserum, incubated with alkaline phosphatase-conjugated goat anti-rabbit IgG antibody, and detected by chemiluminescent reaction using CSPD as substrate (CLONTECH, CA).
Other Techniques-Southern hybridization, amplification, and screening of DNA libraries, bacterial transformation, plasmid isolation, gel electrophoresis, and PCR were performed as described (45). Yeast transformation and recovery of plasmids from yeast were done as described by the manufacturer (CLONTECH, CA).

RESULTS
Cloning of AtTRP1-The cDNA clones encoding TRPs in A. thaliana were isolated by a yeast one-hybrid system. Plasmids from all three positive transformants of yeast contained an identical cDNA insert, 1-1, of 1.1 kbp in size (Fig. 1A). Overlapping cDNAs 1-9 and 1-20 were subsequently obtained using 1-1 as a probe. The resulting sequence was extended further in the 5′ direction by RACE on the cDNA library from which the clone 1-1 was isolated. Alignment of the cDNA clones and the RACE product, 1-21, yielded a contiguous sequence spanning 2391 base pairs (bp). A genomic clone, G1, containing AtTRP1 (Fig. 1A) was obtained from a phage library using cDNA 1-1 as a probe. Southern hybridization revealed that AtTRP1 is a single copy gene (data not shown). Comparison of the sequences of the genomic and cDNA clones of AtTRP1 indicated that this gene contains nine introns and that the first exon is untranslated (Fig. 1A). Each intron contains the conserved dinucleotides GT and AG at the 5′ and 3′ boundaries, respectively, suggesting that all the introns are of the U2-type (46). A TATAAA sequence was located 63 bp upstream from the first exon and is assumed to be the TATA box. The entire sequence of AtTRP1 matches exactly that of a genomic DNA fragment cloned from chromosome 5 of Arabidopsis thaliana (GenBank™ AB025604).
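The GT-AG boundary rule mentioned above is straightforward to check once exon coordinates are known from a cDNA-to-genomic alignment. The sketch below shows one way to do so; it uses a made-up toy gene, not the AtTRP1 sequence, and all function names are illustrative assumptions.

```python
def is_gt_ag(intron):
    """True if the intron starts with GT and ends with AG."""
    s = intron.upper()
    return len(s) >= 4 and s.startswith("GT") and s.endswith("AG")


def assemble(parts):
    """Join alternating exon/intron strings (exon first) into a gene.

    Returns the genomic sequence and 0-based half-open exon coordinates.
    """
    coords, pos = [], 0
    for index, part in enumerate(parts):
        if index % 2 == 0:  # even positions are exons
            coords.append((pos, pos + len(part)))
        pos += len(part)
    return "".join(parts), coords


def introns_between(genomic, exon_coords):
    """Extract the sequence lying between each pair of adjacent exons."""
    return [
        genomic[prev_end:next_start]
        for (_, prev_end), (next_start, _) in zip(exon_coords, exon_coords[1:])
    ]


# Toy gene (hypothetical): exon, intron, exon, intron, exon.
TOY_PARTS = ["ATGGCC", "GTAAGTTTTTCAG", "GGATCC", "GTATGAAATAG", "TGA"]
genomic, exons = assemble(TOY_PARTS)
introns = introns_between(genomic, exons)

if __name__ == "__main__":
    for seq in introns:
        print(seq, is_gt_ag(seq))
```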
Conceptual translation of the 2.4-kbp cDNA sequence reveals an open reading frame of 578 amino acids that is predicted to encode a 65-kDa protein with a pI of 8.4 (Fig. 1B). This protein, AtTRP1, contains four clusters of basic residues (Fig. 1, B and C). The amino acid sequences in the first two clusters (residues 22-36 and 213-219) are similar to those in the nuclear localization signals (NLS) of the maize R gene product (47). The sequence in the third cluster (residues 250-256) is homologous to that in the NLS of SV40 T antigen, which facilitates the targeting of proteins into nuclei of plants (48) and yeast (49). The presence of multiple NLS suggests that AtTRP1 may be a nuclear protein. The last cluster (residues 450-454), located next to the Myb motif, may help the DNA-protein interaction since it is rich in positive charges and close to the DNA-binding domain of AtTRP1 (see below). The sequence of amino acids 466-520 is not only homologous to the Myb-related DNA-binding motifs present in yeast and mammalian TRPs (Fig. 1D) but is also highly similar to the DNA-binding domain of the rice protein RTBP1 (Fig. 2), suggesting that this region may be important for binding plant telomeric sequence. In addition to the Myb motif and NLS, AtTRP1 contains short clusters of acidic residues at its N-terminal end and glutamine-rich regions on both sides of the Myb motif (Fig. 1B). Most interestingly, AtTRP1 also contains contiguous residues at several regions identical or similar to those of RTBP1 (43) and of three initiator-binding proteins in plants, including HPPBF1 (GenBank™ AF072536) from A. thaliana, BPF1 from parsley (50), and IBP1 from maize (51) (Fig. 2). HPPBF1 and BPF1 bind, respectively, to the promoters of the genes encoding the H protein of glycine decarboxylase and phenylalanine ammonia lyase. IBP1 binds to the promoter of the shrunken (Sh) gene involved in carbohydrate metabolism.
The DNA-binding Domain of AtTRP1 Is Located at Its C Terminus-To map the DNA-binding domain of AtTRP1, cDNAs encoding various truncated proteins (Fig. 3A) were cloned into the vector pET3a and expressed in bacteria. Treatment of these cultures with IPTG resulted in the accumulation of a unique protein in each bacterial lysate (Fig. 3B). Since the DNA sequence analysis has revealed that each cDNA construction only encodes a single polypeptide, the unique protein in each lysate probably represents the intact form of the corresponding truncated protein. The gel retardation assay of truncated proteins showed that the protein ΔN-463 retained the DNA binding activity (Fig. 3, A and C). This truncated protein consists of the Myb motif and a C-terminal polypeptide rich in glutamine (Fig. 1B). Removal of this glutamine-rich polypeptide abolished the DNA binding activity of the truncated protein ΔN-262, suggesting that the protein ΔN-463 contains a domain essential for recognizing the telomeric sequence.
Since the molecular weight of the protein ΔN-12 is greater than that of the protein ΔN-325 (Fig. 3, A and B), it is expected that the complex containing ΔN-12 will migrate more slowly on the native gel than the complex containing ΔN-325 in the reaction with an identical probe. However, incubation of the extract containing ΔN-12 with the probe (TTTAGGG)4 produced two weak complexes that migrated faster than the single strong complex produced by incubating the extract containing ΔN-325 with the identical probe (Fig. 3C). The unexpected mobility of ΔN-12-containing complexes may be due to the proteolysis of ΔN-12 in the complexes but could also be due to some difference in shape between the complexes containing ΔN-12 and that containing ΔN-325. Failure to detect the strong slowly migrating complex(es) in either case indicates that the probe (TTTAGGG)4 may be too short to be bound efficiently by the protein ΔN-12, since the SDS-PAGE analysis revealed that the amount of intact ΔN-12 is similar to that of intact ΔN-325 in the extract (Fig. 3B). A similar phenomenon was also observed in the gel retardation assay of the extracts containing the proteins 13-223 + 452-578 and ΔN-262 (Fig. 3C). Based on their ability to bind the probe (TTTAGGG)4, the truncated proteins were divided into groups (Fig. 3A). The DNA binding activities among the truncated proteins within each group cannot be compared with one another until each one of them is purified.
The DNA-binding Domain of AtTRP1 Recognizes the Sequence GGTTTAG-To identify the core sequence in the telomeric DNA recognized by the DNA-binding domain of AtTRP1, the protein ΔN-463 containing the DNA-binding domain of AtTRP1 was purified to near-homogeneity (Fig. 4A). Seven duplex oligonucleotides, representing all possible combinations of four contiguous and identical plant telomeric repeats with one permuted nucleotide at a time, were used as probes for gel retardation assay of the purified protein ΔN-463 (Fig. 4B). By using as little as 1 pmol of protein, one or two types of complexes (I and II) were observed, and as the protein concentration was increased, complexes with slower mobilities (III and IV) were observed. In the reactions using the probe (GGTTTAG)4, four complexes were observed, of which complex IV became predominant at high molar ratios of protein to DNA probe (Fig. 4B, lanes 31-36), suggesting that this protein recognizes each of the four sites in the probe (GGTTTAG)4 with equal efficiency (Fig. 4C). The probe (GGGTTTA)4 also formed four complexes with the same protein, but the intensity of complex III was always higher than that of complex IV, even when a high molar ratio of protein to the DNA probe was used (Fig. 4B, lanes 25-30), suggesting that one of the sites in the probe (GGGTTTA)4 is bound less efficiently by this protein than the other three (Fig. 4C). Similarly, one of the four sites in (TTTAGGG)4 is bound less efficiently than the other three by this protein (Fig. 4B, lanes 1-6, and Fig. 4C). It should be noted that to form complex IV, (TTTAGGG)4 requires greater amounts of the protein ΔN-463 than does (GGGTTTA)4 (Fig. 4B, lanes 1-6 and 25-30). In the reactions containing probes (TAGGGTT)4 (Fig. 4B, lanes 13-18) and (AGGGTTT)4 (Fig. 4B, lanes 19-24), complexes I-III were observed clearly but complex IV only weakly.
These data suggest that, in each probe, one site is recognized poorly, and the other three sites are bound efficiently by the protein ΔN-463. Only three complexes were formed in the reactions containing probes (TTAGGGT)4 (Fig. 4B, lanes 7-12) and (GTTTAGG)4 (Fig. 4B, lanes 37-42), suggesting that each probe consists of three recognition sites for the protein ΔN-463. By combining these data, we conclude that the isolated DNA-binding domain of AtTRP1 may recognize each GGTTTAG repeat. Partial telomeric sequences such as TTTAG, GGTTTA, GGTTT, and GGTT may also be bound by this protein but less efficiently (Fig. 4C).
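The sequence logic behind this interpretation can be illustrated with a short script. The sketch below builds the seven permuted four-repeat probes, counts the intact copies of the proposed core GGTTTAG in the top strand of each, and reports the partial sequences left over at the two ends. It is only sequence bookkeeping under the assumption that GGTTTAG is the recognition unit; it does not model binding affinity, and the function names are illustrative.

```python
REPEAT = "TTTAGGG"  # plant telomeric repeat as usually written
CORE = "GGTTTAG"    # core sequence proposed in the text


def rotations(unit):
    """All circular permutations of a repeat unit."""
    return [unit[i:] + unit[:i] for i in range(len(unit))]


def core_sites(probe, core=CORE):
    """0-based start positions of every intact core in the probe."""
    starts = []
    i = probe.find(core)
    while i != -1:
        starts.append(i)
        i = probe.find(core, i + 1)
    return starts


def flanks(probe, core=CORE):
    """Sequence left over before the first and after the last intact core."""
    starts = core_sites(probe, core)
    if not starts:
        return probe, ""
    return probe[:starts[0]], probe[starts[-1] + len(core):]


def summarize(n_repeats=4):
    """Map each permuted unit to (intact cores, 5' remainder, 3' remainder)."""
    table = {}
    for unit in rotations(REPEAT):
        probe = unit * n_repeats
        left, right = flanks(probe)
        table[unit] = (len(core_sites(probe)), left, right)
    return table


if __name__ == "__main__":
    for unit, (count, left, right) in summarize().items():
        print(f"({unit})4: {count} intact cores, "
              f"5' remainder {left or '-'}, 3' remainder {right or '-'}")
```

Only (GGTTTAG)4 contains four intact cores; every other permutation contains three plus end remainders. The longer remainders (TTTAG, GGTTTA, GGTTT, GGTT) coincide with the partial sequences named in the text as weaker binding sites.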
The Minimum Length of a Telomeric DNA Bound by AtTRP1 Contains Five GGTTTAG Repeats-To explore the minimum length of the telomeric DNA bound by AtTRP1, the protein ΔN-12 was purified. SDS-PAGE analysis of the purified ΔN-12 revealed that the resulting protein appeared to be 90-95% pure and had an apparent molecular mass of 83 kDa (Fig. 5A, left panel). The discrepancy between the apparent and the calculated molecular mass of AtTRP1 suggests that this protein migrates anomalously on SDS gels. Since this protein was eluted from the DNA affinity column and can be detected with the antibody against a peptide near the N terminus of AtTRP1 (Fig. 5A, right panel), we infer that this protein is probably the intact form of ΔN-12. This protein does not bind the duplex probe (TTTAGGG)4 (data not shown) and forms complexes only with the probes carrying five or more GGTTTAG repeats (Fig. 5B), suggesting that the minimum length of a telomeric DNA bound by the protein ΔN-12 or AtTRP1 spans five GGTTTAG repeats. ΔN-12 binds (GGTTTAG)8 more efficiently than (GGTTTAG)6, whereas the probe (GGTTTAG)5 binds very little of ΔN-12 (Fig. 5B, lanes 6-20), indicating that AtTRP1 may require eight or more GGTTTAG repeats for optimal complex formation. Our data do not exclude the possibility that probes with more than eight GGTTTAG repeats show an additional enhancement of AtTRP1 binding.
One or two complexes formed with the probes carrying five, six, and eight GGTTTAG repeats. The weak complex I appeared occasionally in the reactions containing the probes (GGTTTAG)5 and (GGTTTAG)6 (Fig. 5B, lanes 7, 14, and 15), and always migrated the same distance. Complex II was always the major one in all of the reactions and also always migrated the same distance. Since the migration rate of DNA-protein complexes is strongly influenced by the protein moiety in the complex (52) and since these probes do not differ significantly from each other in size, it is likely that these complexes with identical migration rates contain an identical number of protein molecules associated with each probe. To make sure that the protein moiety in both complexes is ΔN-12, the complexes formed with the probe (GGTTTAG)5 were transferred from the gel to the membrane after gel retardation assay and were detected with an antibody against a short peptide at the N terminus of AtTRP1 (Fig. 5C). The Western hybridization showed that both complexes I and II reacted with the antibody, confirming that the protein moiety in both complexes is ΔN-12.
AtTRP1 Binds Specifically to DNA Fragments Carrying Long Arrays of Plant Telomeric Repeats-To determine the sequence specificity of AtTRP1, the competition for the protein ΔN-12 binding between the probe (GGTTTAG)8 and unlabeled duplex oligonucleotides carrying plant telomere-related sequences and human telomeric repeats was investigated (Fig. 6). Although both oligonucleotides (GGTTTAG)4 and (GGTTTAG)8 consist of contiguous plant telomeric repeats, only the latter competes well for complex formation (Fig. 6, lanes 2-6), confirming that the proteins ΔN-12 or AtTRP1 do not form any complex with the oligonucleotide (GGTTTAG)4 (Fig. 5B). The sequences TTTTGGG and TTAAGGG have been found at subtelomeric or telomeric regions in some plant genomes (6, 53, 54), but the competition experiment showed that the oligonucleotides (TTTTGGG)4 and (TTAAGGG)4 only compete weakly in complex formation (Fig. 6, lanes 7-12). The oligonucleotide carrying the human telomeric repeat (TTAGGG)4 is not an effective competitor (Fig. 6, lanes 13-15). The single-stranded oligonucleotides (GGTTTAG)8 and (CTAAACC)8 fail to compete in complex formation (data not shown).
The sequence in the DNA-binding domain of AtTRP1 is highly homologous to those in the proteins HPPBF1, BPF1, and IBP1 (Fig. 2). Since the sequences of the binding sites of IBP1 and BPF1 are available (50, 51), we examined whether AtTRP1 would recognize the binding sites of both initiator-binding proteins. To address this question, we used the oligonucleotides containing the binding sites of IBP1 (Fig. 6, lanes 16-18) and BPF1 (Fig. 6, lanes 19-21) as the competitors of (GGTTTAG)8 in the assay of complex formation. Although both binding sites seemed to compete weakly with the probe (GGTTTAG)8 in complex formation, neither binding site formed a complex with the protein ΔN-12 when used as the probe (data not shown).
Inspection of the nucleotide sequence of AtTRP1 has revealed that a copy of the plant telomeric repeat, GGGTTTA, is located next to the putative TATA box of this gene. We have found that the isolated DNA-binding domain of AtTRP1 could bind the duplex oligonucleotide AtTRPro, which carries the sequence ATAGGCTTATAAAGGGTTTAAGCA (bases 67-90 of AtTRP1) around the putative TATA box of AtTRP1 (data not shown). However, the competition experiment shown in Fig. 6 (lanes 22-24) indicated that the protein ΔN-12 binds poorly to AtTRPro. Gel retardation assay also revealed that no complex formed between the protein ΔN-12 and the probe AtTRPro (data not shown). The combined results of Figs. 5 and 6 suggest that the protein AtTRP1 binds specifically to telomeric DNA fragments containing at least five GGTTTAG repeats.
The sequences for the binding sites of IBP1 and BPF1 are respectively G3AG3T3CTCTG3ACG3AGAG3AC (51) and A2GA2G2AGT2G2T2GAGA2T2A2 (50).

DISCUSSION
Gel retardation assay of cellular extracts followed by SDS-PAGE analysis of the DNA-protein complex has identified a 67-kDa Arabidopsis protein (ATBP1) that forms a complex with the duplex probe (TTTAGGG)4 (20, 42). Here we have shown that the molecular mass of the almost full-length AtTRP1 protein is ~83 kDa, based on SDS-PAGE analysis (Fig. 5A), and this protein does not form any complex with the probe (TTTAGGG)4 (data not shown). The differences in both molecular mass and DNA binding behavior of these two proteins suggest that there may be two different telomere-binding proteins in A. thaliana. Alternatively, ATBP1 could be one of the proteolytic products of AtTRP1 that have been found to form specific complexes with the probe (TTTAGGG)4 during the purification of the protein ΔN-12. It is important to know whether the proteins HPPBF1 and AtTRP1 are different classes of telomere-binding factors in A. thaliana, since they are highly homologous to each other in the Myb-like DNA-binding domain (Fig. 2).
The isolated DNA-binding domain of the rice protein RTBP1 formed only three complexes with the probe (TTTAGGG)4 (43), whereas our data showed that the corresponding polypeptide in AtTRP1 formed a fourth complex with the identical probe at an extremely high molar ratio of protein to DNA probe (Fig. 4B). Since both polypeptides have highly similar sequences (Fig. 2), it is very likely that the isolated DNA-binding domain of RTBP1 will form four complexes with the probe (TTTAGGG)4 in our experimental conditions. In other words, the isolated DNA-binding domain of RTBP1 will behave like that of AtTRP1 in the reaction with plant telomeric sequences. On the other hand, the duplex oligonucleotide (TTTAGGG)4 has been used by some as a probe for the isolation of plant TRPs from nuclear extracts (20, 42). Our results have indicated that this probe is bound efficiently by some of the truncated proteins of AtTRP1 (Figs. 3 and 4) but not bound by the almost full-length molecule of the same protein (data not shown), suggesting that the duplex oligonucleotide (TTTAGGG)4 may not be the appropriate probe for the assay of intact TRPs in plant cells.
The isolated DNA-binding domain of AtTRP1 can form complexes with the probes carrying four or fewer telomeric repeats (Fig. 4), whereas the almost full-length molecule of the same protein binds probes with at least five GGTTTAG repeats (Fig. 5). Comparison of the DNA-binding properties of the proteins ΔN-463, ΔN-393, ΔN-325, ΔN-262, and ΔN-12 (Figs. 3-5) indicates that the presence of the peptide containing residues 13-325 in the N-terminal half of AtTRP1 may cause AtTRP1 to recognize the longer telomeric DNA fragments. Using the longer telomeric DNA fragments as recognition sites not only makes the binding specificity of AtTRP1 more stringent but also implies that this protein may be located at the ends of plant chromosomes, which consist of long stretches of telomeric repeats. Among the complexes formed between the protein ΔN-12 and the telomeric DNA, complex II migrates more slowly than complex I on the native gel (Fig. 5B), suggesting that the apparent molecular weight of complex II is greater than that of complex I. This led us to propose that the formation of complexes I and II may be attributed to the binding of one and two molecules of protein, respectively, to a single oligonucleotide. Complex I appeared as fuzzy and inconsistent bands, suggesting that complexes formed by binding one protein to a single DNA molecule may be highly unstable. Complex II is the major one formed in all of the reactions, suggesting that AtTRP1 may bind plant telomeric DNA predominantly as a dimer. In addition, complex II is much more stable than complex I, raising the possibility that two molecules of AtTRP1 may interact with each other to form a dimeric protein that, perhaps, binds the DNA much more tightly than does a single AtTRP1 molecule.
Moreover, protein ΔN-12 forms complex II with probes carrying various numbers of telomeric repeats, implying that the protein AtTRP1 may have a flexible region that allows the two DNA-binding domains of the putative dimeric protein to recognize telomeric DNAs of various lengths.
The competition assay has shown that the protein ΔN-12 does not bind the IBP1-binding site, which contains a single telomeric repeat (Fig. 6), suggesting that AtTRP1 does not recognize interstitial regions in plant chromosomes that contain a single telomeric repeat. A survey has shown that the 5′ regions of some Arabidopsis genes contain two or more non-contiguous telomeric repeats (39). Since the dimeric hTRF1 can bind DNA fragments containing non-contiguous telomeric repeats separated by non-telomeric sequences (55, 56), it would be of interest to see whether the AtTRP1 protein would bind the non-contiguous telomeric repeats at the 5′ regions of these genes and function as a transcriptional regulator. On the other hand, the degraded AtTRP1 might play a role in transcriptional regulation of some plant genes, since we have observed that the isolated DNA-binding domain of AtTRP1 can recognize both the IBP1-binding site and the AtTRPro DNA fragment (data not shown). It should be noted that only the C-terminal truncated proteins containing the Myb-like motif of IBP1 or BPF1 were shown to interact with their binding sites. It is not clear whether the corresponding full-length proteins would recognize the same binding sites (50, 51). Both AtTRP1 and hTRF1 proteins belong to the class of Myb proteins that harbor only a single Myb motif (Fig. 1D). Protein hTRF1 has been shown to bind predominantly as a homodimer to human telomeric DNA (56). It would be of interest to see whether protein AtTRP1 would function as a dimeric protein and use a pair of Myb-like DNA-binding domains to recognize DNA.
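The IBP1 and BPF1 binding sites quoted earlier (51, 50) are written in a compact form in which a number repeats the preceding base. The sketch below expands that notation and scans each site for any seven-nucleotide window that is a circular permutation of TTTAGGG on either strand. Treating any such window as "a telomeric repeat" is an assumption made for illustration, as are the function names.

```python
import re

IBP1_SITE = "G3AG3T3CTCTG3ACG3AGAG3AC"
BPF1_SITE = "A2GA2G2AGT2G2T2GAGA2T2A2"


def expand(compact):
    """Expand run-length notation, e.g. 'G3AT2' becomes 'GGGATT'."""
    pairs = re.findall(r"([ACGT])(\d*)", compact)
    return "".join(base * int(count or 1) for base, count in pairs)


def rotations(unit):
    """All circular permutations of a repeat unit."""
    return [unit[i:] + unit[:i] for i in range(len(unit))]


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def telomeric_hits(seq, unit="TTTAGGG"):
    """List (position, window) for windows matching a permuted repeat."""
    targets = set(rotations(unit)) | set(rotations(revcomp(unit)))
    k = len(unit)
    return [
        (i, seq[i:i + k])
        for i in range(len(seq) - k + 1)
        if seq[i:i + k] in targets
    ]


if __name__ == "__main__":
    for name, site in (("IBP1", IBP1_SITE), ("BPF1", BPF1_SITE)):
        full = expand(site)
        print(name, full, telomeric_hits(full))
```

Under these assumptions the expanded IBP1 site contains one permuted repeat (AGGGTTT), consistent with the statement that it carries a single telomeric repeat, whereas the expanded BPF1 site contains none.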
Our data have revealed a possible mode by which a telomeric repeat-binding protein recognizes telomeric DNA in plants. The information about the features of the AtTRP1 protein allows us to hypothesize a structure for protein-DNA complexes at plant telomeres and to propose possible roles for this protein in plant cells.