tRNA Splicing*

Introns interrupt the continuity of many eukaryal genes, and therefore their removal by splicing is a crucial step in gene expression. Interestingly, even within Eukarya there are at least four splicing mechanisms. mRNA splicing in the nucleus takes place in two phosphotransfer reactions on a complex and dynamic machine, the spliceosome. This reaction is related in mechanism to the two self-splicing mechanisms for Group 1 and Group 2 introns. In fact the Group 2 introns are spliced by an identical mechanism to mRNA splicing, although there is no general requirement for either proteins or co-factors. Thus it seems likely that the Group 2 and nuclear mRNA splicing reactions have diverged from a common ancestor. tRNA genes are also interrupted by introns, but here the splicing mechanism is quite different because it is catalyzed by three enzymes, all proteins and with an intrinsic requirement for ATP hydrolysis. tRNA splicing occurs in all three major lines of descent, the Bacteria, the Archaea, and the Eukarya. In bacteria the introns are self-splicing (1–3). Until recently it was thought that the mechanisms of tRNA splicing in Eukarya and Archaea were unrelated as well. In the past year, however, it has been found that the first enzyme in the tRNA splicing pathway, the tRNA endonuclease, has been conserved in evolution since the divergence of the Eukarya and the Archaea. Surprising insights have been obtained by comparison of the structures and mechanisms of tRNA endonuclease from these two divergent lines.


tRNA Precursors in Eukarya and Archaea
The earliest studies of tRNA splicing were in the yeast Saccharomyces cerevisiae, where tRNA introns were first discovered (4,5). With the completion of the S. cerevisiae genome sequence it is now known that yeast contains 272 tRNA genes of which 59, encoding 10 different tRNAs, are interrupted by introns (6). The introns are 14-60 nucleotides in length and interrupt the anticodon loop immediately 3′ to the anticodon (7). Among the 10 different yeast pre-tRNAs there is no conservation of sequence at the splice junctions, although the 3′-splice junction is invariably in a bulged loop (8). Early studies on the structure of yeast tRNA precursors showed that the conformation of the mature domain is retained, suggesting the model of the tertiary structure of eukaryal pre-tRNA shown in Fig. 1A (9,10).
In the Archaea the introns are also small and often interrupt the anticodon loop, but they are found elsewhere as well, for example interrupting the dihydro U stem (11). In several of the Archaea, tRNA genes have been found that contain two introns. The splice sites are found in an absolutely conserved structural motif consisting of two loops of three bases separated by a four-base pair helix, the bulge-helix-bulge (BHB) motif (12). This structure, modeled in Fig. 1B from the related TAR RNA structure (13), allows the archaeal splicing mechanism to be extended to introns in rRNA that also retain this motif. Thus, early on it was suggested that the eukaryal and archaeal splicing systems operate by different mechanisms.

The Pathway of tRNA Splicing in Eukarya
The early discovery by Hopper and co-workers (14) that pre-tRNAs accumulate in the yeast mutant rna1-1 provided a source of pre-tRNA substrates, which allowed the development of the first in vitro RNA splicing system (15,16). Using this system the pathway of tRNA splicing was deduced (17–20).
The tRNA splicing reaction in yeast occurs in three steps; each step is catalyzed by a distinct enzyme, which can function interchangeably on all of the substrates (Fig. 2). In the first step the pre-tRNA is cleaved at its two splice sites by an endonuclease. The products of the endonuclease reaction are the two tRNA half-molecules and the linear intron with 5′-OH and 3′-cyclic PO₄ ends. The endonuclease has been purified to homogeneity (6,21). The enzyme behaves as an integral membrane protein, and since splicing takes place in the nucleus, it may be an inner nuclear envelope protein. The two tRNA half-molecules, in essence a nicked tRNA, are the substrate for the ensuing ligase reaction. This baroque reaction, catalyzed by the 90-kDa tRNA ligase (22,23), takes place in three steps. In the first step the cyclic PO₄ is opened to give a 2′-PO₄ and 3′-OH. In the second step the 5′-OH is phosphorylated with the γ-PO₄ of GTP (24,25). In the third step tRNA ligase is adenylated at an active site lysine (26), and then the AMP is transferred to the 5′-PO₄ of the substrate. Formation of the 5′-3′-phosphodiester bond proceeds and AMP is released. The phosphate at the spliced junction is derived from the γ-phosphate of GTP, and the phosphate originally at the 5′-splice site remains at the spliced junction as a 2′-phosphate and must be removed to complete the splicing reaction. Phizicky and co-workers (27) have characterized the enzymology of the removal of the 2′-PO₄ from the spliced tRNA. A nicotinamide adenine dinucleotide (NAD)-dependent phosphotransferase catalyzes the transfer of the 2′-PO₄ to NAD (28,29). Surprisingly the structure of the transfer product is ADP-ribose 1″-2″-cyclic phosphate (30). The nicotinamide moiety is displaced, apparently supplying the energy for cyclization. It is tempting to speculate that this unique and hitherto unknown compound goes on to play some crucial regulatory role in the cell.
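The order of events at the exon junction can be made explicit with a small bookkeeping sketch. It is only an illustration of the pathway as described in the text, not a kinetic or structural model: the class name `Junction`, the function names, and the string labels for the termini are all hypothetical choices made here, and each function simply refuses to run unless the termini produced by the preceding step are present.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Junction:
    """Chemistry at the exon-exon junction during splicing (illustrative labels)."""
    three_end: str                     # 3' terminus of the 5' half-molecule
    five_end: str                      # 5' terminus of the 3' half-molecule
    ligated: bool = False              # True once the 5'-3' phosphodiester exists
    two_prime_phosphate: bool = False  # 2'-PO4 left at the junction


def endonuclease() -> Junction:
    # Cleavage leaves a 2',3'-cyclic phosphate and a 5'-OH on the half-molecules.
    return Junction(three_end="2',3'-cyclic-PO4", five_end="5'-OH")


def open_cyclic_phosphate(j: Junction) -> Junction:
    # Ligase step 1: the cyclic phosphate is opened to a 2'-PO4 and a 3'-OH.
    assert j.three_end == "2',3'-cyclic-PO4"
    return replace(j, three_end="3'-OH", two_prime_phosphate=True)


def phosphorylate_5_end(j: Junction) -> Junction:
    # Ligase step 2: the 5'-OH receives the gamma-phosphate of GTP.
    assert j.five_end == "5'-OH"
    return replace(j, five_end="5'-PO4")


def adenylylate_5_end(j: Junction) -> Junction:
    # Ligase step 3a: AMP is transferred from the enzyme lysine to the 5'-PO4.
    assert j.five_end == "5'-PO4"
    return replace(j, five_end="AMP-5'-PO4")


def join(j: Junction) -> Junction:
    # Ligase step 3b: the phosphodiester bond forms and AMP is released.
    assert j.three_end == "3'-OH" and j.five_end == "AMP-5'-PO4"
    return replace(j, three_end="", five_end="", ligated=True)


def remove_2_prime_phosphate(j: Junction):
    # NAD-dependent phosphotransferase: the 2'-PO4 is transferred to NAD.
    assert j.ligated and j.two_prime_phosphate
    return replace(j, two_prime_phosphate=False), "ADP-ribose 1''-2''-cyclic phosphate"


def splice():
    """Run the steps in the order given in the text; return (junction, byproduct)."""
    j = endonuclease()
    for step in (open_cyclic_phosphate, phosphorylate_5_end, adenylylate_5_end, join):
        j = step(j)
    return remove_2_prime_phosphate(j)
```

Calling `splice()` returns a ligated junction with no 2′-phosphate, together with the name of the transfer product. Calling `join(endonuclease())` directly raises an `AssertionError`, which mirrors the point that the half-molecules produced by the endonuclease cannot be joined until their ends have been processed.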

Specificity in tRNA Splicing: Recognition of pre-tRNA by the Endonuclease
The eukaryal endonuclease is solely responsible for the recognition of the splice sites contained in the pre-tRNA. Since only the mature domain in the pre-tRNA is conserved, it was postulated that endonuclease recognizes the splice sites by measuring the distance from the mature domain to the splice sites (31). This hypothesis was confirmed by experiments in which insertion mutations in the pre-tRNA that changed the distance between the mature domain and the anticodon resulted in a predictable shift of the splice sites (31).
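The logic of the ruler hypothesis can be illustrated with a deliberately crude one-dimensional sketch. It assumes, purely for illustration, that the enzyme cuts a fixed number of nucleotides from an anchor point in the mature domain; the real measurement is presumably made on the folded pre-tRNA rather than along the sequence, and the toy sequence, the anchor index, and the ruler length below are all hypothetical.

```python
def cut_offset_from_anticodon(seq: str, anchor: int, ruler: int, anticodon: str) -> int:
    """Position of the cut relative to the 3' end of the anticodon.

    The cut index depends only on the anchor and the ruler length, never on
    where the anticodon happens to lie. 0 means the cut falls immediately 3'
    to the anticodon; a negative value means it falls upstream of that point.
    """
    cut = anchor + ruler
    anticodon_end = seq.index(anticodon) + len(anticodon)
    return cut - anticodon_end


def insert(seq: str, pos: int, extra: str) -> str:
    """Return seq with `extra` inserted before index `pos`."""
    return seq[:pos] + extra + seq[pos:]


# Toy pre-tRNA: 'x' = exon filler, 'GAA' = anticodon, 'y' = intron filler.
wild_type = "x" * 15 + "GAA" + "y" * 10
ANCHOR, RULER = 10, 8  # hypothetical values chosen so the wild-type offset is 0

# Insertion of two nucleotides between the anchor and the anticodon.
mutant = insert(wild_type, 12, "zz")

wt_offset = cut_offset_from_anticodon(wild_type, ANCHOR, RULER, "GAA")
mut_offset = cut_offset_from_anticodon(mutant, ANCHOR, RULER, "GAA")
```

In this toy case the wild-type cut falls immediately 3′ to the anticodon (`wt_offset` is 0), whereas in the insertion mutant the cut stays at the same distance from the anchor and therefore lands two positions upstream of that point (`mut_offset` is -2). This is the kind of predictable shift that a distance-measuring mechanism implies.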
The intron is not completely passive in the recognition process. The Xenopus oocyte tRNA endonuclease has been shown to recognize a crucial element involving the intron (8). Yeast tRNA introns contain a conserved purine residue three nucleotides upstream of the 3′-splice site. This base must be able to pair with a pyrimidine at position 32 in the anticodon loop in order for the intron to be recognized by either yeast or Xenopus endonuclease (8,32). These experiments suggested complexities in the structure of the pre-tRNA affecting recognition by the eukaryal endonuclease, which had not been previously appreciated. It has also been demonstrated that there are different requirements for the recognition of the 5′- and 3′-splice sites (32).
Recognition of archaeal tRNA splice sites by the archaeal tRNA endonuclease relies solely on the BHB motif (Fig. 1B). Yeast pre-tRNAs are not substrates for the Haloferax volcanii endonuclease (33), and unlike the eukaryal endonuclease, the mature domain of the pre-tRNA is not required for intron excision by the archaeal enzyme (12). Despite the differences in both substrate and the mechanism for substrate recognition between the archaeal and eukaryal systems, as we shall discuss below, the endonuclease that catalyzes the first step in splicing has been conserved between Eukarya and Archaea. Different mechanisms for substrate recognition have evolved since the divergence from their common ancestor.

The Yeast and Archaeal tRNA Splicing Endonucleases: Related Enzymes
The characterization of the yeast tRNA endonuclease was extremely difficult for two reasons. The enzyme was present at very low levels, approximately 150 molecules per cell, and it appeared to be an integral membrane protein. However, after many years of work the enzyme was successfully purified, and the genes for all four of its subunits were cloned (6,21). Of particular advantage in purifying the enzyme was the construction of a modified gene for the 44-kDa subunit, SEN2, containing His8 and Flag epitope affinity tags. The SEN2 gene had been found earlier in genetic screens (34,35), and the sen2-3 allele was shown to specifically block 5′-splice site cleavage (35).
The enzyme turned out to be an αβγδ heterotetramer whose subunits have molecular masses of 54 (SEN54), 44 (SEN2), 34 (SEN34), and 15 kDa (SEN15). Each of these genes proved to be essential for cell viability. The amino acid sequence of each of the four subunits contains a canonical nuclear localization sequence. Sen2 contains the only plausible transmembrane sequence, suggesting that it anchors the endonuclease complex to the nuclear membrane. Two subunits of endonuclease, Sen2 and Sen34, contain a homologous domain approximately 130 amino acids in length, suggesting that they perform a similar function. The excitement came when we learned that apparent homologs of this domain are encoded by the gene for the archaeal tRNA splicing endonuclease of H. volcanii cloned by Daniels and co-workers (36) and in endonuclease homologs found in the sequenced genomes of Methanococcus jannaschii, Methanobacterium thermoautotrophicum, and Archaeoglobus fulgidus.
The homology between Sen2, Sen34, and the archaeal endonucleases immediately suggested a model in which the yeast endonuclease contains two distinct active sites, one for each splice site (6,37). The fact that sen2-3 is defective in cleavage of the 5′-splice site suggested that Sen2 contains the active site for 5′-splice site cleavage and led to the prediction that Sen34 cleaves the 3′-splice site. This hypothesis was strongly supported by the observation that a Sen34 mutant enzyme is defective in 3′-splice site cleavage (6).
Our biochemical experience with the endonuclease had suggested strong interactions between the subunits. To probe the nature of these interactions a two-hybrid experiment was performed in which all possible pairwise combinations of the four subunits were probed (6). It turned out that strong interactions were only seen between Sen2 and Sen54 and between Sen34 and Sen15. These results together with those described above lead us to a model for the yeast endonuclease in which Sen2 contains the active site for 5′-splice site cleavage and Sen34 the active site for 3′-splice site cleavage (see below; Fig. 4C). There is as yet no evidence as to how the ruler mechanism works, although we propose that it could be via the interaction of Sen54, a very basic protein, and Sen2.
The tRNA splicing endonuclease of the archaeon H. volcanii was shown to behave as a homodimer in solution (36). Since its substrate, the consensus BHB motif, has pseudo 2-fold symmetry, it seems very likely that the 2-fold symmetric dimer recognizes its substrate such that each splice site is cleaved by a separate active site (see below). Thus we are led to a unified model of tRNA splicing in which the two splice sites are cleaved by separate protein subunits, each containing an active site.

The Three-dimensional Structure of an Archaeal tRNA Splicing Endonuclease
M. jannaschii contains a gene that encodes an endonuclease homologous to the H. volcanii enzyme but is about half the size (179 amino acids in the case of M. jannaschii). We believed that a high resolution structure of the simpler archaeal enzyme would shed light on the mechanism of the more complicated but related eukaryal endonuclease and thus embarked on a structural study of the M. jannaschii endonuclease, obtaining an x-ray structure refined to a resolution of 2.3 Å (13).
The M. jannaschii endonuclease is an α₄ tetramer, different from the dimeric H. volcanii enzyme (13,36,38). Fig. 3A shows that the M. jannaschii endonuclease monomer consists of two distinct domains: the N-terminal domain (residues 9-84) and the C-terminal domain (residues 85-179). The N-terminal domain consists of three α-helices and a mixed antiparallel/parallel β-pleated sheet of four strands. The C-terminal domain contains two α-helices flanking a five-stranded mixed β-sheet. The last strand β9 is partially hydrogen bonded with β8, but its main interactions mediate the isologous pairing seen in the tetramer (see below). Understanding the interactions between monomers has turned out to be crucial to understanding the structure and evolution of the members of the tRNA splicing endonuclease family.
Two pairs of subunits (A1 and A2, B1 and B2) associate to form isologous dimers via extensive interactions between their β9 strands (Fig. 3B). The carboxyl half of β9 from one subunit forms main chain hydrogen bonds with the symmetry-related residues of β9 from the other subunit (β9′), leading to a two-stranded β-sheet spanning the subunit boundary. The two L8 loops, also hydrogen bonded and related by the same symmetry, form another layer on top of the two-stranded β9 sheet. The β9-β9′ sheet together with L8-L8′ encloses a hydrophobic core at the intersubunit surface. These hydrophobic residues are important for stabilizing the dimer interface and lead to an extremely stable dimeric unit, which we believe has been conserved in evolution (see below).
The tetramer is formed via heterologous interaction between the two dimers. The main interaction between the two dimers is via the insertion of loop L10 from subunits A2 and B2 into a cleft in subunits B1 and A1 between the N- and C-terminal domains of each monomer. The interaction is primarily polar between acidic residues in loop L10 and basic residues in the cleft. This arrangement causes the two isologous dimers to be translated relative to each other by about 20 Å. This brings subunits A1 and B1 much closer together than A2 and B2, which do not interact at all, and results, as we shall see below, in an arrangement of subunits in which only one symmetrically disposed pair of active sites can recognize the substrate. These interactions, though probably less stable, are also likely to have been conserved in evolution because they lead to a distinctive interaction between the dimers, which in turn leads to the required positioning of the two active sites.
The tRNA splicing endonucleases cleave pre-tRNA leaving 5′-OH and 2′-3′-cyclic PO₄ termini. This is the same specificity seen in the ribonucleases, RNase A, T1, etc. The RNase A mechanism has been studied extensively, and a first guess would be that chemically the endonuclease mechanism should be similar (39,40). The reaction pathway for RNase A is a two-step acid-base-catalyzed reaction. A general base abstracts a proton from the 2′-OH of ribose leading to an in-line attack on the adjacent phosphodiester bond and the formation of a pentacovalent intermediate. The general acid protonates the 5′-leaving group leading to the 2′-3′-cyclic PO₄ product. In a second step a proton is abstracted from H₂O, OH⁻ attacks, and the 2′-3′-cyclic PO₄ is hydrolyzed to give the 3′-PO₄. In RNase A, His12 is the general base in the first step, His119 is the general acid, and the pentacovalent transition state is stabilized by Lys41.
In the endonuclease family there is a conserved histidine residue at position 125 in the M. jannaschii enzyme. There is strong evidence that this histidine is part of the active site. A change to alanine in the equivalent histidine at position 242 in Sen34 was shown to impair 3′-splice site cleavage (6). Daniels (C. J. Daniels, personal communication) has shown that the equivalent histidine mutant in the H. volcanii enzyme impairs cleavage, as has Garrett (38) for a His125 to Ala mutant in the M. jannaschii endonuclease.
His125 (in L7) is found in a cluster with conserved residues Tyr115 (in L7) and Lys156 (in α5) on the surface of the monomer and forms a pocket into which the scissile phosphate to be cleaved is proposed to fit (Fig. 3A). Significantly these three residues can be spatially superimposed with the catalytic triad of RNase A. In the superposition, His125 is equivalent to His12 of RNase A and should be the general base; Tyr115 should be the general acid, and Lys156 stabilizes the transition state. This would appear to be a case of convergent evolution, because it is clear that the tRNA endonucleases and the RNases do not share a common ancestor.
We expect two of the M. jannaschii endonuclease active sites to recognize and cleave the symmetric tRNA substrate. We favor the choice of the symmetrically disposed subunits A1 and B1 to function as active subunits. The active sites on subunits A1 and B1 are at one side of the tetramer that is shown to be basic from an electrostatic potential calculation. This side of the surface could therefore bind the phosphodiester backbone of the tRNA substrate.
A model of the substrate derived from the TAR RNA NMR structure docks well with the proposed active subunits A1 and B1 (Fig. 3C). The two phosphodiester bonds fit exactly into the A1 and B1 active site pockets and superimpose with the putative SO₄²⁻ density found in each site. The distance between other pairs of active sites in the tetramer is too long to fit with this model substrate. This is particularly so of the other symmetrically related pair, in A2 and B2, which are so far apart that it is unlikely that any change in substrate geometry could allow for a fit. Obviously it is of high priority to solve a structure of the enzyme-substrate complex.

The Structure of the M. jannaschii Endonuclease Provides Insight into the Evolution of tRNA Splicing
The H. volcanii dimer and the M. jannaschii tetramer recognize the same consensus substrate so their active sites must ultimately be arrayed similarly in space. This was difficult to understand until Garrett pointed out that the H. volcanii monomer is actually a tandem repeat of the consensus sequence of the endonuclease gene family (38). The N-terminal repeat does not contain the N-terminal domain, and it lacks 2 of the 3 putative active site residues. It does, however, contain the structural features of the C-terminal domain, in particular the Loop L10 sequence. Fig. 4 shows a proposed model of the H. volcanii enzyme, which is best described as a pseudo-tetramer of two pseudo-dimers. The structure of the pseudo-dimer is predicted to contain a two-stranded β9-β9′ pleated sheet, an important structural feature of the M. jannaschii dimer. The H. volcanii enzyme only contains two active sites (found in the C-terminal repeats), and these are proposed to occupy an identical spatial configuration to those in the A1 and B1 subunits of the M. jannaschii enzyme. The pseudo-dimers are proposed to interact via the conserved Loop L10 sequences in the N-terminal repeats, equivalent to those in the A2 and B2 subunits in the M. jannaschii enzyme. The H. volcanii enzyme tells us that only two of the active sites are necessary, but to array these in space correctly one must retain important features of both the isologous dimer interactions (β9-β9′) and the dimer-dimer interactions mediated by Loop L10.
The yeast endonuclease contains two active site subunits, Sen2 and Sen34. The other two subunits do not appear to belong to the endonuclease gene family; however, Garrett pointed out that both Sen15 and Sen54 contain a stretch of sequence similarity to the endonuclease family near their COOH termini (38). Upon inspection of the crystal structure of the M. jannaschii endonuclease, this sequence conservation appears to contain the Loop L10 and hydrophobic core interactions crucial to the association of the monomers to form the tetrameric M. jannaschii endonuclease. We have proposed that the strong Sen2-Sen54 and Sen34-Sen15 interactions seen in two-hybrid experiments (13) are mediated by the C-terminal β9-β9′-like interactions. These two heterodimers are proposed to interact to form the heterotetramer via the conserved Loop L10 sequences of Sen15 and Sen54.
Thus it is likely that what has been conserved since the divergence of the Eukarya and the Archaea is the endonuclease active site and the means to array two of them in a precise and conserved spatial orientation. Further support for this evolutionary pathway is provided by the results of Tocchini-Valentini and co-workers (41), who demonstrated that both the eukaryal and archaeal endonucleases can accurately cleave a universal substrate containing the BHB motif. The eukaryal enzyme seems to dispense with the ruler mechanism for tRNA substrate recognition when cleaving the universal substrate. This leads to the conclusion that the precise positioning of two active sites in endonuclease has been conserved. Thus, subunits A1 and B1 comprise the active site core of all tRNA splicing endonucleases, and subunits A2 and B2 position the two active sites precisely in space. The eukaryal enzyme has evolved a distinct measuring mechanism for splice site recognition via the specialization of the A2 and B2 subunits while retaining the ability to recognize and cleave the primitive consensus substrate.