Structure and Function of the Smallest Vertebrate Telomerase RNA from Teleost Fish*

Telomerase extends chromosome ends by copying a short template sequence within its intrinsic RNA component. Telomerase RNA (TR) from different groups of species varies dramatically in sequence and size. We report here the bioinformatic identification, secondary structure comparison, and functional analysis of the smallest known vertebrate TRs from five teleost fishes. The teleost TRs (312-348 nucleotides) are significantly smaller than the cartilaginous fish TRs (478-559 nucleotides) and tetrapod TRs. This remarkable length reduction of teleost fish TRs correlates positively with the genome size, reflecting an unusual structural plasticity of TR during evolution. The teleost TR consists of a compact three-domain structure, lacking most of the sequences in regions that are variable in other vertebrate TR structures. The medaka and fugu TRs, when assembled with their telomerase reverse transcriptase (TERT) protein counterparts, reconstituted active and processive telomerase enzymes. Titration analysis of individual RNA domains suggests that the efficient assembly of the telomerase complex is influenced more by the telomerase reverse transcriptase (TERT) binding of the CR4-CR5 domain than the pseudoknot domain of TR. The remarkably small teleost fish TR further expands our understanding about the evolutionary divergence of vertebrate TR.

Telomeres are specialized DNA-protein complexes that cap chromosome ends and are important for genome stability and cellular proliferation (1). Telomeres consist of repetitive DNA sequences and a variety of telomere-associated proteins. The length of telomeric DNA in most eukaryotes is maintained by telomerase, a specialized reverse transcriptase that synthesizes telomeric DNA repeats at chromosome ends to counterbalance the natural shortening that occurs during DNA replication. Telomerase, a ribonucleoprotein (RNP) enzyme, consists of at least two essential core components, the catalytic protein component telomerase reverse transcriptase (TERT), and the telomerase RNA (TR) that provides a template for telomeric DNA synthesis.
TR is remarkably variable in size, sequence, and even secondary structure between different groups of eukaryotes. To date, TR sequences have been identified in 28 ciliates, 14 yeasts, and 38 vertebrates. Due to the lack of sequence similarity between groups of species, the TR secondary structures were determined independently for each of these three groups (2). The vertebrate TR secondary structure is composed of three highly conserved structural domains: the pseudoknot/template domain, the CR4-CR5 domain, and the scaRNA domain (3-5). The pseudoknot/template domain contains a template region for telomeric DNA synthesis, and a conserved pseudoknot structure essential for telomerase activity. The CR4-CR5 domain and the pseudoknot/template domain are both required for reconstituting active telomerase in vitro (6). However, their mechanistic roles are unclear. The scaRNA domain is crucial for the 3′-end processing of TR and telomerase RNP biogenesis in vivo (3,7). Whereas TRs from 34 tetrapods and 4 cartilaginous fishes share this three-domain structure (4), TRs have not yet been identified from teleost fish, which comprise nearly half of the extant vertebrate species.
Teleost fish are the most diverse group among vertebrates (8), and are distinct from the cartilaginous fish. The teleosts and tetrapods (including amphibians, reptiles, birds, and mammals) diverged from each other around 450 million years ago. Since then, teleost fish have undergone genome duplication and rediploidization, resulting in a high level of genomic diversity. The relatively faster evolution rate and the consequent diversity in teleost fish offer an attractive model for evolutionary studies. Identification of TR from teleost fish using degenerate PCR or BLAST search has, however, not been successful due to a high degree of sequence variation in TR.
Here we report the identification of TRs from five teleost fish, Danio rerio, Oryzias latipes, Gasterosteus aculeatus, Takifugu rubripes and Tetraodon nigroviridis, using a novel bioinformatics method. To structurally and functionally characterize the teleost TR, we have cloned TR as well as TERT protein genes from medaka, fugufish, and zebrafish, and reconstituted telomerase activity for medaka and fugufish. The structural and functional analyses of the teleost fish telomerase enzyme provide important new insights into the evolution of the vertebrate telomerase RNP.

EXPERIMENTAL PROCEDURES
Bioinformatics Search of Teleost Fish TR Sequences-A sequence search was performed using fragrep2. The input pattern, shown in supplemental Fig. S1, consists of eight position-specific weight matrices (PWMs). The quality of match between a PWM and a DNA sequence is measured as a fraction of similarity above an unavoidable background (9). The computational approach and implementation details of fragrep2 are described in Mosig et al. (10). Our search pattern was generated by annotating the eight conserved regions in the TR alignment published in Chen et al. (4), and converting the annotated alignment to a fragrep2 search pattern using the aln2pattern tool (both fragrep2 and aln2pattern are available at www.bioinf.uni-leipzig.de/Software). The initial search pattern (Fig. S1) resulted in a single plausible hit in the medaka genome (assembly MEDAKA1). A BLAST search using the medaka sequence as query against other teleost fish genomes revealed homologs in the stickleback (assembly BROAD S1), fugu (assembly FUGU 4.0), and tetraodon (assembly TETRAODON 7). Based on the four teleost TR sequences, a modified and less stringent search pattern was generated, with which we found 79 candidate sequences in the zebrafish genome (assembly Zv6). These were screened using INFERNAL (11) and the secondary structure annotated TR alignment from the Rfam database (12), resulting in a single sequence that fit well with other teleost candidates and the previously known vertebrate TR sequences. The alignment of all 43 known vertebrate TR sequences can be obtained from the Telomerase Database (telomerase.asu.edu).
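The PWM matching step can be illustrated with a minimal sketch. It is not the fragrep2 implementation: the scoring rule (a raw column-frequency sum rescaled between the worst and best possible windows) is an assumption standing in for the "fraction of similarity above an unavoidable background" measure, and the names build_pwm, match_quality, and scan, as well as the toy block, are hypothetical.

```python
from collections import Counter

ALPHABET = "ACGT"

def build_pwm(aligned_block, pseudocount=0.5):
    """Column-wise base frequencies for a gap-free aligned block."""
    length = len(aligned_block[0])
    total = len(aligned_block) + pseudocount * len(ALPHABET)
    pwm = []
    for i in range(length):
        counts = Counter(seq[i] for seq in aligned_block)
        pwm.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return pwm

def match_quality(pwm, window):
    """Assumed score: 0 for the worst possible window, 1 for the best.

    The window must contain only A, C, G, or T.
    """
    raw = sum(col[b] for col, b in zip(pwm, window))
    worst = sum(min(col.values()) for col in pwm)
    best = sum(max(col.values()) for col in pwm)
    return (raw - worst) / (best - worst)

def scan(pwm, sequence, threshold):
    """Return (position, quality) for every window at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        q = match_quality(pwm, sequence[start:start + width])
        if q >= threshold:
            hits.append((start, round(q, 3)))
    return hits

# Hypothetical toy block, not taken from the TR alignment.
block = ["TTAGGG", "TTAGGG", "TCAGGG", "TTAGGA"]
pwm = build_pwm(block)
hits = scan(pwm, "ACGTTAGGGTTCAGGGAC", threshold=0.9)
```

Unlike a single consensus string, the matrix still gives a high score to the variant TCAGGG because a C was observed at the second position of the block.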
Genomic DNA and Total RNA Isolation-For isolation of genomic DNA and total RNA, medaka fish (O. latipes) were purchased from Aquatic Eco-Systems (Apopka, FL), and Zebrafish (D. rerio) were obtained from Dr. Yung Chang (Arizona State University, AZ) or purchased from Aquatical Tropicals, Inc. (Plant City, FL). Green spotted pufferfish (T. nigroviridis) were purchased from AquariumFish.net. Liver tissue of fugu (T. rubripes) fish was obtained from Dr. Shugo Watabe (University of Tokyo, Japan).
Genomic DNA was isolated from 50 to 100 mg of fish tissue using the DNAzol reagent (Invitrogen) following the manufacturer's instructions. Stickleback (G. aculeatus) genomic DNA was a generous gift from Dr. David Kingsley (Stanford University). Total RNA was isolated from 100 to 200 mg of gill or liver tissues using 1 ml of TRIzol reagent (Invitrogen) following the manufacturer's instructions. Concentrations of DNA and RNA samples were determined by A260 measurement using the Nanodrop ND-1000 spectrophotometer (NanoDrop Technologies).
Sequencing and Cloning of TR Genes-To verify the sequences, teleost fish TR genes were PCR amplified from genomic DNA and the PCR products were sequenced directly. The verified sequences of five teleost fish TR genes were deposited into GenBank™ with the following accession numbers: [...] (T. rubripes), EF680233 (T. nigroviridis), and EF680234 (G. aculeatus).
For medaka, zebrafish, and fugu, the PCR products of TR genes were cloned into the EcoRV site of the pZero vector (Invitrogen) to generate pMedaka-TR, pZebrafish-TR, and pFugu-TR. Plasmids were sequenced to confirm sequence accuracy of the cloned TR genes.
Identification and Cloning of Teleost Fish TERT Genes-To reconstitute telomerase activity, we cloned TERT genes from medaka, zebrafish, and fugu. The fugu TERT (AY861384) and medaka TERT (DQ248968) gene sequences have been previously identified and were available from GenBank (13). The zebrafish TERT gene was identified in this study via a BLAST search of the zebrafish genome database using the fugu TERT protein sequence as query. The exact 5′- and 3′-ends of the full-length zebrafish TERT cDNA sequence were determined by 5′- and 3′-rapid amplification of cDNA ends (RACE) using a SMART-RACE cDNA Amplification Kit (Clontech). The cDNA sequence was determined by direct sequencing of the reverse transcriptase-PCR products. The sequence of the zebrafish TERT gene has been deposited into GenBank with accession number EF202140.
To clone the TERT genes, the coding sequences of medaka and zebrafish TERT genes were PCR amplified from the cDNA samples prepared from total RNA samples using Thermoscript reverse transcriptase (Invitrogen) and an oligo(dT)18 reverse primer. The fugu TERT cDNA was PCR amplified from a cDNA library obtained from Dr. Byrappa Venkatesh (Institute of Molecular and Cell Biology, Singapore). The PCR products of the medaka, zebrafish, and fugu TERT cDNAs were cloned into the pCITE vector for in vitro synthesis of the recombinant TERT proteins.
In Vitro Transcription of TR-RNA was prepared by T7 in vitro transcription using PCR DNA products as template as described previously (14,15).
Northern Blotting Analysis-Twenty micrograms of total RNA was resolved on a 4% polyacrylamide, 8 M urea denaturing gel and electrotransferred to Hybond-XL membrane (Amersham Biosciences) at 0.5 A for 1 h. The membrane was UV cross-linked and prehybridized at 65°C for 30 min in 20 ml of UltraHyb hybridization buffer (Ambion). Riboprobes with sequences complementary to the target RNA were generated by in vitro transcription from a PCR DNA template that contained the T7 promoter and labeled internally with [α-32P]UTP using a MaxiScript kit (Ambion). After incubation at 37°C for 1 h, 1 µl of RNase-free DNase I (2 units/µl) was added to the reaction, followed by a 20-min incubation at 37°C to remove the DNA template. Riboprobes were then purified using microspin G-25 columns (GE Healthcare). The membrane was hybridized at 65°C overnight in 20 ml of UltraHyb buffer with the riboprobe added to 1 × 10^6 cpm/ml. The hybridized membrane was washed twice in 20 ml of 1× SSC (20× SSC stock: 3.0 M NaCl and 0.3 M sodium citrate, pH 7.0), 0.2% SDS for 10 min at 65°C, and twice in 20 ml of 0.2× SSC, 0.1% SDS for 30 min at 65°C. The blot was analyzed using a phosphorimager, Bio-Rad FX Pro.
In Vitro Reconstitution of Telomerase-Human, medaka, fugu, and zebrafish telomerases were reconstituted using the TNT (transcription and translation) Quick Coupled rabbit reticulocyte lysate system (Promega). Briefly, recombinant TERT protein was synthesized in 10 µl of rabbit reticulocyte lysate at 30°C for 60 min following the manufacturer's instructions. To assemble the telomerase complex, in vitro synthesized TR was added to the TNT reaction of TERT synthesis, and incubated at 30°C for 30 min. For the titration experiments of individual RNA domains, either the pseudoknot/template or the CR4-CR5 RNA fragment was added at a saturating concentration of 3 µM, whereas the other RNA fragment was added at various concentrations as indicated.
Conventional Telomerase Activity Assay-Enzymatic activity of in vitro reconstituted telomerase was analyzed using a direct primer extension assay. A 10-µl reaction was carried out with 3 µl of in vitro reconstituted telomerase sample in the presence of 1× PE buffer (50 mM Tris-HCl, pH 8.3, 50 mM KCl, 2 mM dithiothreitol, 3 mM MgCl2, and 1 mM spermidine), 1 mM dATP, 1 mM dGTP, 1 mM dTTP, and 2 pmol of 5′-32P-end-labeled (TTAGGG)3 telomere primer at 30°C for 2 h. The products were subjected to phenol/chloroform extraction and ethanol precipitation, followed by 10% denaturing PAGE. Gels were dried, and products were detected and analyzed using a Bio-Rad FX Pro Imager. For each reaction, activity was determined by measuring the total intensity of extended telomere substrate, correcting for background, and normalizing against unextended primer (loading control). Relative activities were obtained by dividing the activity of each reaction by that of the reaction with saturating concentrations of RNA fragments. For the titration assay, the relative activities were plotted against concentrations of RNA fragment and nonlinear regression curve fitting was carried out using the one-site binding (hyperbola) equation, Y = Bmax × X/(Kd + X) (Prism 5, GraphPad Software, Inc.).
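The curve fit can be sketched without Prism as follows. This is a minimal illustration, not the routine used in the study: it profiles out Bmax in closed form and searches log10(Kd) by golden-section search, which assumes a single minimum of the squared error within the search bounds. All function names and the synthetic titration values are hypothetical. In this model the concentration giving half-maximal activity equals Kd, since Y = Bmax/2 when X = Kd.

```python
import math

def relative_activities(intensities, saturated_intensity):
    """Divide each reaction's activity by that of the saturated reaction."""
    return [v / saturated_intensity for v in intensities]

def hyperbola(x, bmax, kd):
    """One-site binding equation Y = Bmax * X / (Kd + X)."""
    return bmax * x / (kd + x)

def best_bmax(xs, ys, kd):
    """For a fixed Kd, the least-squares Bmax has a closed form."""
    f = [x / (kd + x) for x in xs]
    return sum(y * fi for y, fi in zip(ys, f)) / sum(fi * fi for fi in f)

def sse(xs, ys, kd):
    """Sum of squared residuals at the best Bmax for this Kd."""
    bmax = best_bmax(xs, ys, kd)
    return sum((y - hyperbola(x, bmax, kd)) ** 2 for x, y in zip(xs, ys))

def fit_one_site(xs, ys, kd_lo=1e-3, kd_hi=1e5, iterations=200):
    """Golden-section search over log10(Kd); returns (Bmax, Kd)."""
    phi = (math.sqrt(5) - 1) / 2
    a, b = math.log10(kd_lo), math.log10(kd_hi)
    c = b - phi * (b - a)
    d = a + phi * (b - a)
    for _ in range(iterations):
        if sse(xs, ys, 10 ** c) < sse(xs, ys, 10 ** d):
            b = d
        else:
            a = c
        c = b - phi * (b - a)
        d = a + phi * (b - a)
    kd = 10 ** ((a + b) / 2)
    return best_bmax(xs, ys, kd), kd

# Synthetic, noise-free titration (hypothetical values, in nM).
true_bmax, true_kd = 1.0, 90.0
concentrations = [10, 30, 100, 300, 1000, 3000]
activities = [hyperbola(x, true_bmax, true_kd) for x in concentrations]
bmax_fit, kd_fit = fit_one_site(concentrations, activities)
```

With real gel quantifications, the values passed to fit_one_site would be the output of relative_activities rather than synthetic points.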

RESULTS
A Novel Bioinformatics Approach to Identify TR Sequences-Despite significant efforts to clone TRs from a diverse array of vertebrate species, TR sequences have not been identified from teleost fish (4). Computational searches for TR candidates using the Basic Local Alignment Search Tool (BLAST) of the sequenced teleost fish genomes have been unsuccessful (data not shown). The inability to identify TR sequences in teleost fish using either degenerate PCR or BLAST presumably stems from the fact that vertebrate TRs are conserved only in eight relatively short regions (called Conserved Region 1-8, or CR1-CR8) that are interrupted by highly variable sequences with a large number of indels (4).
To identify TR sequences, we employed an improved homology search tool, fragrep2, to search teleost fish genomes. The original version of the fragrep program implements a specialized algorithm for homology search that considers gap-free sequence patterns separated by variable-length regions of nonaligned sequence (16). This approach has been demonstrated to work well for genomewide searches of non-coding RNAs (16,17). However, it had not been successful in finding teleost fish TRs. This is because even the relatively well conserved blocks, i.e. CR1-CR8, contained too many variations to be well represented by a single consensus sequence. To circumvent this, in fragrep2, we have replaced consensus sequences by PWMs to search for matched DNA sequences (10). As shown in supplemental Fig. S1, the initial search pattern contains a collection of PWMs as well as minimal and maximal distances between these PWM blocks.
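The distance constraints between blocks can be illustrated with a small sketch. It is a simplified stand-in for the fragrep2 search, not its actual algorithm: it assumes that per-block hit positions have already been computed and enumerates every ordered chain whose inter-block gaps fall within the allowed range. The function name chain_hits and all positions, widths, and gap bounds are hypothetical.

```python
def chain_hits(block_hits, block_widths, gaps):
    """Combine per-block hits into chains that respect gap constraints.

    block_hits[i]   : start positions where block i matched
    block_widths[i] : width of block i in nucleotides
    gaps[i]         : (min_gap, max_gap) between the end of block i
                      and the start of block i + 1
    Returns a list of tuples holding one start position per block.
    """
    chains = [(p,) for p in block_hits[0]]
    for i in range(1, len(block_hits)):
        lo, hi = gaps[i - 1]
        extended = []
        for chain in chains:
            prev_end = chain[-1] + block_widths[i - 1]
            for p in block_hits[i]:
                if lo <= p - prev_end <= hi:
                    extended.append(chain + (p,))
        chains = extended
    return chains

# Hypothetical hits for three blocks on one genomic sequence.
block_hits = [[5, 40], [20, 60, 90], [75, 130]]
block_widths = [6, 8, 5]
gaps = [(5, 20), (5, 40)]
chains = chain_hits(block_hits, block_widths, gaps)
```

Only the chain (40, 60, 75) survives here: the hit at position 5 pairs with a second-block hit at 20, but no third-block hit lies within the allowed distance of it.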
Using this new approach, we successfully found a TR candidate in the medaka genome. Homologs of this medaka sequence could then be readily found by means of BLAST in stickleback, fugu, and Tetraodon genomes. All four sequences are flanked upstream by an ADP-ribosylation factor and downstream by homologs of human LASP1 and/or PLXDC2 (Table S1). Based on the alignment of the four teleost fish sequences, we modified the search pattern and were able to retrieve a single convincing candidate from the zebrafish genome using fragrep2. Surprisingly, the genomic location of the zebrafish TR candidate is syntenic neither with that of the other teleost sequences nor with the human locus (Table S1). All five teleost TR genes were PCR amplified from genomic DNA samples and the PCR DNA products were sequenced directly to verify the sequences identified from the genome databases (see "Experimental Procedures").
Unique Transcription Elements of Fish TR Genes-Analysis of genomic sequences upstream of the fish TR-coding sequences revealed transcriptional elements typical of an RNA polymerase II promoter: a conserved TATA box-like element and a CCAAT box element (Fig. S2). This suggests that, like other vertebrate TRs, teleost TRs are products of RNA polymerase II. Interestingly, a putative CRE-BP1/c-Jun binding element, located between the TATA and CCAAT boxes, is conserved in both teleost and cartilaginous fishes, and some amphibians (bullfrog and horned frog) (Fig. S2). These data suggest an evolutionary change in transcriptional regulation of the TR gene along the tetrapod lineage.
The Compact Size of Teleost Fish TR-To confirm the presence of the identified teleost TR transcripts in cells, we performed Northern blotting analysis to detect the endogenous TRs. The medaka and zebrafish TRs were each detected as a single band on the Northern blot (Fig. 1A, lane 1). Based on the Northern result, the sizes of the endogenous medaka and zebrafish TRs are estimated to be slightly smaller than those of the in vitro transcribed RNA markers, which are 317 and 322 nt, respectively (Fig. 1A, compare lanes 1 and 2).
To determine the actual size of the endogenous TR, we mapped the 5′-ends of medaka and zebrafish TRs by 5′-RACE. The results showed that the 5′-ends of both medaka and zebrafish TRs lie 14 nucleotides upstream of the template sequence. Assuming that the 3′-end of the fish TR is located, like other vertebrate TRs, 3 residues downstream of the box ACA motif, the medaka and zebrafish TRs are predicted to be 312 and 317 nt long, respectively, consistent with the sizes observed from the Northern analysis. Based on sequence alignment, the other three teleost TR homologs are predicted to be 348 (stickleback), 325 (fugu), and 328 nt (Tetraodon). This makes teleost TRs the smallest among all known vertebrates, as the size of previously known vertebrate TRs ranges from 382 to 559 nt (4).
Teleost fishes have notably small genomes, whereas the cartilaginous fishes have relatively large genomes (18). Intriguingly, teleost fishes with smaller genomes have the smallest TRs, whereas cartilaginous fishes with larger genomes have the largest TRs (from 478 to 559 nt) among vertebrates. By plotting the TR size against the genome size, we found a positive correlation between the size variation of TR and the genome size with an R² value of 0.5007 and a p value <0.0001 (Fig. 1B). This statistically significant correlation suggests that the size variation of fish TR resulted from evolution of the fish genome.
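The R² statistic can be computed as in the following minimal sketch, which assumes an ordinary least-squares line of TR size on genome size (the text does not state the regression model). The function name linear_fit is hypothetical, and the data points are placeholders for illustration only, not the values plotted in Fig. 1B.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

# Placeholder data: genome size (arbitrary units) and TR length (nt).
genome_sizes = [0.4, 0.8, 1.7, 3.0, 3.5, 6.0]
tr_lengths = [325, 312, 317, 451, 420, 540]
slope, intercept, r_squared = linear_fit(genome_sizes, tr_lengths)
```

An R² near 0.5 means that roughly half of the variance in TR size is accounted for by a linear dependence on genome size.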
Secondary Structure of Teleost Fish TR-To determine whether these small teleost TRs share a similar secondary structure with other vertebrate TRs, we constructed secondary structure models for teleost fish TRs using phylogenetic comparative analysis. The primary sequences of the five teleost TRs identified were aligned manually as described previously (4). The eight conserved regions CR1-CR8 found previously in 35 vertebrate TRs are largely conserved in the teleost TRs (Fig. 2). Because of their small size and the presence of the CR sequences, teleost fish TR sequences can be readily aligned without much ambiguity. The aligned sequences were analyzed for covariations to derive a conserved secondary structural model for the teleost TR (Fig. 3A; Fig. S3).
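The covariation test used in comparative structure analysis can be sketched as follows. This is a minimal illustration under stated assumptions: two alignment columns are taken to support a base pair when every sequence shows a Watson-Crick or G-U pair at those columns and both columns vary across sequences. The function name covariation and the toy alignment are hypothetical and are not TR sequences.

```python
CANONICAL = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}

def covariation(alignment, i, j):
    """Summarize pairing evidence between alignment columns i and j."""
    pairs = [(seq[i], seq[j]) for seq in alignment]
    # Gaps or non-canonical combinations count as incompatible.
    compatible = all(p in CANONICAL for p in pairs)
    left_varies = len({p[0] for p in pairs}) > 1
    right_varies = len({p[1] for p in pairs}) > 1
    return {
        "compatible": compatible,
        "covaries": compatible and left_varies and right_varies,
        "pairs": sorted(set(pairs)),
    }

# Toy alignment of three hypothetical hairpins.
alignment = [
    "GGCAUUGCC",
    "GACAUUGUC",
    "GCCAUUGGC",
]
supported = covariation(alignment, 1, 7)   # G-C, A-U, C-G
invariant = covariation(alignment, 0, 8)   # always G-C
mismatch = covariation(alignment, 4, 5)    # always U-U
```

Columns 1 and 7 change together while keeping the pair intact, which is the signature used to infer a helix; columns 0 and 8 are compatible with pairing but provide no covariation evidence because they never change.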
Being the smallest, the teleost TR resembles the essential core of vertebrate TR (Fig. 3B). It contains shorter linker sequences between the three conserved domains. The commonalities and differences of the vertebrate TR structures are discussed in detail below.
Figure legend fragment (sequence alignment): The residues shaded in yellow are located in the single-stranded regions and are universally conserved among the five teleost fishes. Dashes (-) denote alignment gaps. Every tenth nucleotide of the zebrafish sequence is marked with dots above the alignment. The size of each RNA is indicated at the end of the respective sequence. Asterisks indicate organisms for which the 5′-end of the RNA was determined by 5′-RACE.
Pseudoknot/Template Domain-The pseudoknot/template domain consists of a highly conserved pseudoknot structure, the template sequence, and the P1 helix that defines the boundary of the RNA template. The pseudoknot structure consists of the P2a-P2b and P3 helices that are universally present in vertebrate TRs (Fig. 3B). The mammalian pseudoknot, however, contains an additional helix P2a.1 that extends the P2a helix (Fig. 3B, human TR). This mammal-specific P2a.1 helix is essential for human telomerase activity and is possibly involved in binding to the TERT protein (19). In teleost TR, the P2a and P2b helices are separated by a conserved asymmetric (0/6) internal loop (Fig. 3A), whereas, in other groups of vertebrates, this internal loop contains a varying number of residues.
The P3 helix, in tetrapods, is conserved as a 9-base pair helix with a single nucleotide bulge (Fig. 4A, tetrapods). The shark and ray P3 helix has the same length but with a 2-nucleotide bulge at a different position (Fig. 4A, sharks and rays). Medaka TR interestingly lacks any bulge in its P3 helix, whereas other teleost TRs have a 1-nucleotide bulge at the position identical to the sharks. Notably, the lack of a bulge in the medaka P3 helix seems to be compensated by extensions of the P3 helix and J2b/3 loop (Fig. 4A, medaka). The variation of the size and position of the bulge in the P3 helix suggests that it might not be a critical element for the function or structure of the pseudoknot structure. Deletion of the bulge in the human P3 helix results in a minor reduction of telomerase activity (20,21). The real role of the P3 bulge has yet to be revealed.
Based on an NMR solution structure, the pseudoknot of human TR forms a triple helix that involves 5 base triples and a base pair at the junction of P2b and P3 helices (21). The sequences that form the triple helix are absolutely conserved even in teleost TR, confirming its critical role in telomerase function (Fig. 4A). In contrast, the distal portion of the P3 helix and the J2b/3 are less conserved, and are slightly variable in length and sequence (Fig. 4A, teleost panel).
In all vertebrates, except for some rodents, TRs possess a long-range interacting P1 helix upstream of the template region (Fig. 3). In human TR, the P1 helix consists of two individual helices, P1a and P1b, separated by an internal loop. The teleost P1 helix is substantially shorter, containing only the P1b equivalent portion while lacking the P1a portion. The integrity of P1b helix and its distance from the template defines the boundary of the RNA template (22). In human telomerase, disruption of the P1b helix alters the template boundary, resulting in template usage outside of the normal template. Likewise, disruption of the P1 helix in medaka TR also altered the template boundary (data not shown). This supports the notion that the P1 helix is also the element for template boundary definition in teleost telomerase.
CR4-CR5 Domain-The CR4-CR5 domain, in addition to the pseudoknot/template domain, is a structural element essential for in vitro telomerase activity. The P6 and P6.1 helices in this domain are universally present in all known vertebrate TRs (Fig. 3B). Remarkably, the sequence (5′-AAGAGNUNGNCUCUG-3′) of the P6.1 stem-loop is highly conserved even in the teleost fish. It was previously thought that the invariant sequence of the P6.1 stem-loop was due to a biased sequence collection that resulted from the PCR amplification strategy used for cloning most of the vertebrate TRs (4). This PCR strategy presumably amplified only the TR sequences with conserved sequence in the P6.1 stem-loop, part of the annealing site of the PCR reverse primers. However, all five teleost fish TRs were identified through bioinformatic searches, instead of PCR. The structure, not the sequence, of the P6.1 helix is known to be important for telomerase activity in vitro, as compensatory mutations that maintain the helical structure of P6.1 do not reduce activity of reconstituted telomerase (15). Surprisingly, similar compensatory mutations of the P6.1 helix resulted in reduced telomerase activity reconstituted in vivo (23). The absolute sequence conservation in the P6.1 helix suggests that, in addition to its base-paired structure, the sequence of this helix might also be important for the in vivo function of telomerase.
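The distinction between sequence conservation and structure conservation can be made concrete with a short sketch. It searches an RNA for the degenerate motif quoted above and separately tests whether two stem arms remain paired after mutation. The helper names are hypothetical, and treating the AGAG and CUCU arms of the motif as the paired stem is an assumption made for illustration (the two arms are reverse complements of each other).

```python
import re

# Only the symbols needed for this motif; N matches any nucleotide.
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U", "N": "[ACGU]"}
P6_1_MOTIF = "AAGAGNUNGNCUCUG"

PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}

def find_motif(rna, motif=P6_1_MOTIF):
    """Start positions of non-overlapping matches to a degenerate motif."""
    pattern = re.compile("".join(IUPAC[c] for c in motif))
    return [m.start() for m in pattern.finditer(rna)]

def stem_is_paired(five_prime_arm, three_prime_arm):
    """True if the arms form an uninterrupted antiparallel helix."""
    if len(five_prime_arm) != len(three_prime_arm):
        return False
    return all(
        (a, b) in PAIRS
        for a, b in zip(five_prime_arm, reversed(three_prime_arm))
    )

# Hypothetical test sequence containing one motif instance.
positions = find_motif("GGAAGAGUUGGGCUCUGCC")

wild_type = stem_is_paired("AGAG", "CUCU")
single_mutant = stem_is_paired("ACAG", "CUCU")   # helix disrupted
compensatory = stem_is_paired("ACAG", "CUGU")    # helix restored
```

A compensatory double mutant passes the structural check but no longer matches the sequence motif, which is the situation in which in vitro and in vivo activities were reported to differ.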
The teleost TR, lacking the distal stem-loop P6b, consists of a shorter P6 (i.e. homologous to the P6a in human TR), P6.1, and P5 helices in the CR4-CR5 domain (Fig. 4B). Whereas the P6b helix is dispensable in teleost fish and some tetrapods such as turtle and frog, the proximal part of the P6b stem-loop is required for human telomerase activity (24). The single-stranded regions, J5/6 and J6.1/5, at the three-way junction between P5, P6, and P6.1 helices are relatively more variable in teleost than in other vertebrates. Although its essential role in telomerase function is evident, the mechanistic role of the CR4-CR5 domain remains to be uncovered.
snoRNA/scaRNA Domain-The 3′-portion of vertebrate TR contains a unique secondary structure (hairpin-hinge-hairpin-tail) and sequence motifs (box H and ACA) that are critical for TR biogenesis and shared by the box H/ACA snoRNAs (7). Most vertebrate TRs contain an additional motif called the CAB box that is shared by the small Cajal body RNAs (scaRNAs) (3). Whereas the box H and ACA are important for RNA localization to nucleoli, the CAB box is important for localization of the RNA to the Cajal body where RNP complex assembly is thought to take place (25). Interestingly, teleost TR lacks an obvious CAB box (UGAG) in the CR7 region (Fig. 4C). The lack of CAB box implies that teleost TR might not localize to the Cajal body. Because the Cajal body has been suggested to play a role in telomerase regulation and telomere recruitment (26), it would be interesting to understand TR localization in teleost and its correlation with the regulation of telomerase function.
Medaka and Fugu Telomerases Reconstituted in Vitro Are Active and Processive-Telomerase activity reconstituted in vitro requires both the TR component and the catalytic TERT protein. To functionally characterize the structural elements of teleost TR, we reconstituted telomerase from in vitro synthesized TERT protein and TR (see "Experimental Procedures"). Active telomerases were successfully reconstituted for medaka and fugu, confirming the authenticity of the teleost telomerase components cloned (Fig. 5). As predicted from the presence of the 4-nucleotide alignment sequence in their RNA templates, the reconstituted medaka telomerases are processive, generating a typical 6-nucleotide ladder pattern of the elongated products (Fig. 5A).
Vertebrate TERT protein possesses two RNA-binding sites that bind independently to the CR4-CR5 and pseudoknot domains of the TR. As shown previously, human TERT is functionally compatible with the mouse CR4-CR5 domain but not the mouse pseudoknot domain (14). In this study, we also showed that the medaka and fugu TERT proteins reconstituted telomerase activity with CR4-CR5 RNA fragments, but not the pseudoknot domain, from other teleost fish species (Fig. 5B, lanes 1-6 and 10-15) or even distantly related vertebrates such as human, quoll, chicken, turtle, frog, and shark (data not shown). This difference in cross-species compatibility indicates that the CR4-CR5 domain is functionally more conserved across a wide variety of species than the pseudoknot domain. Unlike the fugu TERT, the medaka TERT assembled with the fugu pseudoknot RNA to reconstitute telomerase activity with a low processivity (Fig. 5B, lanes 4-6), suggesting a more relaxed RNA binding specificity of the medaka TERT protein. However, the pseudoknot fragment of zebrafish TR failed to generate telomerase activity when assembled with medaka or fugu TERT proteins (Fig. 5B, lanes 7-9 and 16-18), suggesting a cross-species incompatibility of the zebrafish pseudoknot with the TERT protein.
To analyze activity of zebrafish telomerase, we thus identified and cloned zebrafish TERT cDNA (see "Experimental Procedures"). Unexpectedly, the in vitro synthesized zebrafish TERT protein failed to reconstitute a detectable activity when assembled with zebrafish, medaka, or fugu TRs (data not shown). Based on the alignment of TERT amino acid sequences, the cloned zebrafish TERT protein was unlikely to be an alternative splicing variant, as it contained all essential motifs. The possibility of mutations in the cloned zebrafish TERT gene was ruled out as identical sequences were found from two individual zebrafish obtained from different sources.
Whereas gene duplication is relatively common in teleost, more rigorous BLAST searches of the zebrafish genome did not reveal any other candidate sequences for the TERT gene. We speculate that the in vitro synthesized zebrafish TERT protein, unlike the medaka and fugu TERT proteins, might not fold correctly as the recombinant zebrafish TERT protein migrated faster than expected on SDS-PAGE (data not shown).
The CR4-CR5 Domain Is the Main Determinant in TR for Functional Binding to Medaka TERT-While reconstituting teleost fish telomerase, we observed a significantly lower activity of the enzyme reconstituted using two RNA fragments than that of the enzyme reconstituted using the full-length RNA (data not shown). To determine which RNA fragment was responsible for the lower reconstituted activity, we carried out the in vitro reconstitution with titrations of each of the two RNA fragments as well as the full-length RNA. We define the median effective concentration (or EC50) as the RNA concentration required to generate 50% of the saturated activity of reconstituted telomerase. It is noteworthy that the EC50 value measured in this assay is related only to the functional binding (or assembly) of the RNA fragment to the TERT protein, excluding nonspecific or non-functional binding. A lower EC50 value of the RNA indicates that the RNA assembles more efficiently with the TERT protein to generate active telomerase. Remarkably, the CR4-CR5 fragments and the full-length TR gave rise to comparable EC50 values. The medaka CR4-CR5 and full-length RNAs had similar EC50 values of 87.4 and 85.9 nM, respectively, whereas the human CR4-CR5 and full-length RNAs had EC50 values of 203.9 and 241.6 nM, respectively (Fig. 6). In comparison, the medaka and human pseudoknot RNA fragments had high EC50 values of 506.2 and 523.5 nM, respectively (Fig. 6). The reduction of reconstituted activity at high concentrations of the full-length TR might be due to the multimerization or aggregation of TR as previously reported (27). Our result indicates that the CR4-CR5 domain is the main determinant for efficient binding and assembly of TR to the TERT protein.

DISCUSSION
Unlike the TERT, TR is prominently divergent in size, sequence, and even structure. In this study, by using a novel bioinformatics approach, we have successfully identified TR sequences from five teleost genomes. The structural and functional analyses of teleost fish telomerase provide important insights into the structural evolution of vertebrate TR as well as the co-evolution of the TR and TERT protein.
Fast Evolution of TR Structure and Size-Because of the various numbers of species-specific structural elements, the size of TR is remarkably variable, up to 1 order of magnitude, from 150 nt in ciliates to 1500 nt in yeasts. From the evolutionary point of view, the emergence or disappearance of structural elements in TR over a short evolutionary time scale is rather intriguing. The unusual plasticity of TR structure was likely facilitated by the non-lethal and progressive nature of the consequences of TR mutations. In organisms with long telomeres, the impact of telomerase mutations is delayed for a number of generations (28). Such delay could allow an accumulation of secondary mutations, some of which might compensate for the initial deleterious mutation, eventually leading to emergence of novel structural elements in TRs.
A possible scenario for the emergence of new structural elements is the insertion of a transposable element into the TR gene during evolution. For example, the scaRNA or snoRNA domain in the vertebrate TR is absent in both the ciliate and yeast TRs, and has been acquired during evolution along the vertebrate lineage. As some snoRNAs and scaRNAs contain characteristics of retrotransposons (29), it is possible that a transposition event may have occurred and fused a mobile scaRNA gene with an ancestral TR gene. Because most vertebrates, including the early branched cartilaginous fish, contain the scaRNA-specific motif (CAB box), we propose that it was a scaRNA, rather than a snoRNA, that was inserted into the vertebrate TR gene. Teleost fish and some bird TRs that lack an obvious CAB box might have subsequently evolved to function without a CAB box motif. Notably, other scaRNAs, e.g. U100, from teleost fish contain a conserved CAB box sequence (30). Identification of TRs from early branching chordates such as the sea squirt will provide crucial clues about the origin of the vertebrate-specific structural domains.
Based on the phylogenetic tree derived from the aligned TR sequences, tetrapods, teleost fishes, and cartilaginous fishes are grouped into three monophyletic clades (Fig. 7), representing three separated evolutionary lineages that lead to three distinct size groups of TR molecules. Cartilaginous and teleost fish TRs evolved in opposite directions toward size expansion and reduction, respectively, corresponding to their genome size evolution. The small sizes of teleost genomes are mainly due to the low abundance of transposable elements and the significant reduction in intron size (31). Our data suggest that genome compression affected not only the intergenic or intronic DNA sequences but also the RNA genes. Similarly, teleost RNase P RNA is about 50 nt shorter than the 350-nt long human RNase P RNA (data not shown).
FIGURE 6. Effective concentrations of the pseudoknot and CR4-CR5 domains to assemble active telomerase in vitro. Titration experiments were performed with pseudoknot and CR4-CR5 RNA fragments or full-length TR alone for reconstituting medaka (upper panel) and human (lower panel) telomerase enzymes. Various concentrations of pseudoknot or CR4-CR5 RNA fragments were assembled with the other RNA fragment at a saturating concentration of 3 µM and the in vitro synthesized TERT protein, followed by the conventional telomerase assay. The pseudoknot (medaka, nt 1-150 and human, nt 32-195) and CR4-CR5 (medaka, nt 170-220 and human, nt 241-328) RNA fragments were titrated as indicated. The relative activity represents the ratio of total activity of each reaction over the total activity of the reaction with saturating concentrations of both RNA fragments. The median effective concentration (EC50) values of each RNA fragment are indicated.
FIGURE 7. The neighbor-joining tree inferred from the vertebrate TR sequences. The tree was derived using the neighbor-joining method from the aligned TR sequences of 14 vertebrates including 5 tetrapods (human, mouse, macaw, turtle, and frog), 5 teleost fishes (fugu, tetraodon, stickleback, medaka, and zebrafish), and 4 cartilaginous fishes (stingray, cownose ray, sharpnose shark, and dogfish shark). The phylogenetic tree was constructed using the program MEGA3.1 (37). The number next to each node indicates a value as a percentage of 1000 bootstrap replicates. Branch lengths are proportional to the number of residue changes. Scale bar indicates an evolutionary distance of 0.05 nucleotide substitution per position in the sequence.
Interestingly, teleost fish TR appears to be more divergent from tetrapod TR than cartilaginous fish TR is (Fig. 7). This is consistent with a recent comparative genomic study that showed a higher degree of sequence conservation between the human and elephant shark genomes than between the human and teleost fish genomes (32,33). It is generally believed that teleost fish experienced a genome duplication after diverging from the tetrapod lineage and before the fish radiation (34). However, no extra TR gene or pseudogene was found in the 5 teleost fish species, suggesting that either the teleost TR gene was not duplicated or the duplicated TR copy was lost in the common ancestor of teleost fish.
Co-evolution of the TR and TERT Protein-During structural diversification, the function of the telomerase RNP has to be conserved through co-evolution of the RNA and protein components, which can be reflected in the interspecies compatibility of the components. For example, the CR4-CR5 RNA fragments from distantly related species such as human were able to reconstitute telomerase activity with medaka TERT (Fig. 5 and data not shown). In contrast, the pseudoknot/template RNA domain appears to be incompatible even between closely related species (e.g. between medaka and fugu, or between human and mouse), suggesting a faster rate of co-evolution between the pseudoknot RNA domain and the TERT protein.
The triple helix within the pseudoknot domain contains invariant sequences and is one of the most conserved structural elements in vertebrate TRs (Fig. 4A). As the triple helix seems to be an ancient feature conserved in many species (2,21,35), it is unlikely to be responsible for the interspecies incompatibility of the pseudoknot domains. The distal helix of the P3 stem and the J2b/3 loop, on the other hand, show some variation among vertebrate species (Fig. 4A). Swapping the whole pseudoknot structure (P3, P2b, and J2b/3) between medaka and fugu TRs did not improve their interspecies compatibility (data not shown).
The teleost CR4-CR5 domain is considerably smaller than that of other vertebrates, as it lacks the distal P6b helix. Nonetheless, the smaller medaka CR4-CR5 RNA fragment (50 nt) exceeds its human counterpart (89 nt) in effectiveness of reconstituting telomerase activity in vitro (Fig. 6). The higher assembly efficiency is likely due to a higher binding affinity between the medaka TERT protein and the CR4-CR5 RNA fragment, which would require substantial co-evolution between the medaka TERT protein and the TR. Because the P6b helix in the CR4-CR5 domain of human TR is essential for binding to the human TERT protein (24), the human TERT might have evolved with an additional binding pocket for the P6b helix.
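The comparison of assembly efficiency above rests on EC50 values read from titration curves of relative activity. The following Python sketch illustrates one simple way such a value could be estimated, assuming relative activity is the ratio of each reaction's activity to that at saturating RNA concentrations, and locating the half-maximal point by linear interpolation on a log-concentration axis. The function names and the titration numbers are hypothetical and are not data or methods from this study; the actual analysis may have used a different curve-fitting procedure.

```python
import math

def relative_activity(activities, saturated_activity):
    """Normalize raw activities to the activity measured at saturating RNA."""
    return [a / saturated_activity for a in activities]

def ec50_by_interpolation(concentrations, rel_activity, target=0.5):
    """Estimate the concentration that gives `target` relative activity.

    Interpolates linearly on a log10 concentration axis between the two
    adjacent titration points that bracket the target. Returns None when
    no pair of points brackets the target.
    """
    points = sorted(zip(concentrations, rel_activity))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        if r_lo <= target <= r_hi and r_hi > r_lo:
            frac = (target - r_lo) / (r_hi - r_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None

# Hypothetical titration series (illustrative numbers only, NOT from the paper).
conc_nM = [1, 10, 100, 1000, 3000]          # RNA fragment concentration, nM
raw_activity = [2.0, 10.0, 30.0, 70.0, 100.0]  # arbitrary activity units

rel = relative_activity(raw_activity, saturated_activity=raw_activity[-1])
ec50 = ec50_by_interpolation(conc_nM, rel)
print(f"Estimated EC50: {ec50:.0f} nM")
```

A lower EC50 for one RNA fragment than another under the same assay conditions would indicate that less of that fragment is needed to reach half-maximal activity, which is the sense in which the medaka CR4-CR5 fragment is described as more effective than its human counterpart.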
Although we were able to reconstitute activity from medaka and fugu telomerases, it is unclear why the zebrafish TERT failed to reconstitute detectable telomerase activity. Among the five teleost species studied, zebrafish branched off early and is more divergent than the other four teleost fishes (36).
In summary, the identification of teleost TR and the characterization of its structure and function reveal an unusual divergence of vertebrate TR. The novel bioinformatic tool fragrep2 is an effective approach for finding notoriously divergent TR sequences in eukaryotic genomes. The small teleost fish TR and the large cartilaginous fish TR reflect the unusual plasticity of TR structure during evolution. Teleost fish telomerase is highly processive and contains a functional P1 helix that defines the template boundary. The conservation of the structure and function of teleost fish telomerase supports the use of teleost fish as a model organism for the study of telomerase biology.