Crystal structures of the RNA triphosphatase from Trypanosoma cruzi provide insights into how it recognizes the 5′-end of the RNA substrate

RNA triphosphatase catalyzes the first step in mRNA cap formation, hydrolysis of the terminal phosphate from the nascent mRNA transcript. The RNA triphosphatase from the protozoan parasite Trypanosoma cruzi, TcCet1, belongs to the family of triphosphate tunnel metalloenzymes (TTMs). TcCet1 is a promising antiprotozoal drug target because the mechanism and structure of the protozoan RNA triphosphatases are completely different from those of the RNA triphosphatases found in mammalian and arthropod hosts. Here, we report several crystal structures of the catalytically active form of TcCet1 complexed with a divalent cation and an inorganic tripolyphosphate in the active-site tunnel at resolutions of 2.20-2.51 Å. The structures revealed that the overall structure, the architecture of the tunnel, and the arrangement of the metal-binding site in TcCet1 are similar to those in other TTM proteins. On the basis of the position of three sulfate ions that cocrystallized on the positively charged surface of the protein and results obtained from mutational analysis, we identified an RNA-binding site in TcCet1. We conclude that the 5′-end of the triphosphate RNA substrate enters the active-site tunnel directionally. The structural information reported here provides valuable insight into designing inhibitors that could specifically block the entry of the triphosphate RNA substrate into the TTM-type RNA triphosphatases of T. cruzi and related pathogens.
Trypanosomatid parasites belong to the order Kinetoplastida and cause several neglected diseases of humans and animals that affect close to 100 million people worldwide, including African sleeping sickness caused by Trypanosoma brucei, Chagas disease caused by Trypanosoma cruzi, and a spectrum of diseases caused by various Leishmania species (1). New therapeutics against trypanosomatids are needed because current medications are generally ineffective at late stages of infection or have severe side effects (2).
The 5′-cap (m7Gp) is an essential feature of eukaryotic cellular and viral mRNAs that protects mRNA from degradation and promotes translation initiation. RNA triphosphatase, which catalyzes the first step in mRNA capping, is a promising target for antiprotozoal drug development because the mechanism of the protozoan RNA triphosphatase is completely different from that of the enzyme in the mammalian or arthropod host (3,4). Metazoan and plant RNA triphosphatases belong to the cysteine-phosphatase enzyme superfamily, which catalyzes a two-step phosphoryl transfer reaction in which the active-site cysteine in the phosphate-binding loop attacks the γ-phosphate of triphosphate-terminated RNA (pppRNA) to form a covalent protein-cysteinyl-S-phosphate intermediate and release the diphosphate RNA product in the absence of a metal cofactor (5,6). The RNA triphosphatases of fungi, protozoa, and several DNA viruses belong to the TTM family, which hydrolyzes the γ-phosphate of pppRNA in the presence of magnesium and that of NTP in the presence of either manganese or cobalt (7-17). The crystal structure of Saccharomyces cerevisiae RNA triphosphatase (Cet1) revealed that the catalytic core is located in a topologically closed hydrophilic tunnel composed of eight antiparallel β-strands (18). The Cet1 active site consists of 15 essential amino acid side chains that either stabilize the topology of the tunnel or make contacts with the divalent cation or sulfate ion (7,19-21). The position of the sulfate ion was proposed to reflect the position of the γ-phosphate of pppRNA and NTP or the leaving group phosphate (18). Two glutamate-containing motifs (designated as motifs A and C) comprise the metal-binding site and are conserved among all TTM family members (Fig. 1) (22-24).
RNA triphosphatases from kinetoplastids have been characterized in T. brucei (TbCet1) and T. cruzi (TcCet1) (10,11). Based on the primary structures of TbCet1 and TcCet1, the kinetoplastid enzymes contain all the putative counterparts of the β-strands that comprise the yeast Cet1 triphosphate tunnel but lack the extra domains appended to the N-terminal region that are essential for homodimerization and interaction with Ceg1 guanylyltransferase (25-27). TbCet1 is essential for procyclic cell growth (28) and can also complement the function of yeast Cet1 in S. cerevisiae as a cap-forming enzyme (11). Several classes of low-molecular-weight compounds have been reported to inhibit the triphosphatase activity of TbCet1, and mutagenesis studies have illuminated functional groups that are important for catalytic activity (28-30). What remains obscure is the structural basis of inhibition and the selectivity for the RNA substrate.
Figure 1. Structure-based alignment of T. cruzi RNA triphosphatase. The secondary structure of T. cruzi RNA triphosphatase (TcCet1) is shown above the amino acid sequence. TcCet1 amino acids shaded in blue correspond to the segments that are removed in the TcCet1(18-243 Δ55-75) protein. The TcCet1 sequence shaded in green (amino acid residues 166-177) corresponds to the segment that interacts with C13H13NO2 and C10H14N4O2 but was disordered in the TcCet1(18-243 Δ55-75) manganese- and Mn·PPPi-bound structures. The amino acid sequence of TcCet1 is aligned with those of the Leishmania major (LmCet1), T. brucei (TbCet1), Schizosaccharomyces pombe (SpPct1), and S. cerevisiae (ScCet1) RNA triphosphatases. The secondary structure of ScCet1 is indicated below the aligned sequences. Identical side chains found in all polypeptides are highlighted in red. Amino acids with similar side chains are highlighted in gray. The positions of conserved motifs A and C, located within the catalytic domain of the metal-dependent RNA triphosphatases, are indicated. The alignment was prepared with ESPript. The secondary structure assignment was based on the DSSP program.
In the present study, we produced a catalytically active form of TcCet1 and crystallized the enzyme in complex with a divalent cation and tripolyphosphate (PPPi) bound at the active site. We propose that the 5′-triphosphate terminus of the RNA substrate directionally enters the active-site tunnel through a positively charged surface, based on the three sulfate ions that cocrystallized with TcCet1. The structural information could provide valuable insight into designing inhibitors that specifically block the entry of the triphosphate RNA substrate in TTM-type RNA triphosphatases.

Results
Defining the minimal catalytic unit of Trypanosoma RNA triphosphatase
Deletion and limited proteolysis analyses of TbCet1 suggest that the hydrophilic N-terminal ~20-amino acid segment is connected by a protease-sensitive site and is dispensable for triphosphatase activity (30). To define the minimal catalytic core for structural analysis, we deleted several amino acids from the N and C termini of TcCet1 (TcCet1(18-243)) and further made an internal deletion to remove a putative protease-sensitive loop (TcCet1(18-243 Δ55-75); Fig. 1 and Fig. S1). Similar deletions were made for TbCet1 (TbCet1(29-253) and TbCet1(29-253 Δ62-90)). The truncated TcCet1 and TbCet1 proteins were expressed in bacteria and purified from a soluble extract by nickel-agarose chromatography, and the proteins were assayed for release of ³²Pi from [γ-³²P]ATP in the presence of 2 mM manganese. Removal of the N-terminal region apparently increased the ATPase activity of TcCet1 and TbCet1 1.3- and 2.6-fold, respectively, compared with their full-length enzymes.
PPPi is a potent competitive inhibitor of TbCet1, suggesting that PPPi binds tightly to the active site of the enzyme (30). Although TbCet1 is capable of hydrolyzing PPPi, its tripolyphosphatase activity is only 0.5% of its NTPase activity. We therefore cocrystallized TcCet1(18-243 Δ55-75) in the presence of PPPi and manganese (see "Experimental procedures"). The crystal was further soaked with potassium iodide to improve the mosaicity and obtain experimental phase information. The structure was then refined at 2.20 Å resolution with R/Rfree of 22.6%/26.1% (Fig. S2 and Table S1). The electron density corresponding to PPPi was evident, and a density for manganese was modeled adjacent to PPPi. We subsequently solved the structure of TcCet1(18-243 Δ55-75) in complex with manganese alone by molecular replacement using the TcCet1(18-243 Δ55-75) Mn·PPPi structure as a search model. The root-mean-square deviation between the pair of structures was 0.9 Å for the main-chain atoms of residues 37-242, suggesting that binding of PPPi may not induce a conformational change. Efforts to crystallize full-length TcCet1 and TbCet1 and other derivatives of the proteins were unsuccessful.
Two unique interfaces are formed in the structure of TcCet1(18-243 Δ55-75) (Fig. S2). The dimer-forming interfaces in the manganese-bound and Mn·PPPi-bound forms differ in the position of the N-terminal α-helix (residues Asp18 to Leu36). Analysis with the PDBePISA server (32) reveals that the manganese-bound dimer interface buries 663.9 Å² with a Complex Formation Significance Score of 0.000, suggesting that this interaction occurs only in the crystal. In the Mn·PPPi-bound form, the N-terminal α-helix is extended and protrudes from one protomer to the other, with an interface area of 654.1 Å² and a calculated Complex Formation Significance Score of 0.440, suggesting that the interface plays an auxiliary role in complex formation. However, an identical contact is formed by the monomeric enzyme, in which the N-terminal α-helix folds back through a connecting loop (Fig. S2C). Indeed, TcCet1 behaves as a monomer in solution at micromolar concentration, as judged by gel-filtration analysis and as previously noted (10). Thus, the two independent molecules of TcCet1 found in the asymmetric unit of the Mn·PPPi-bound form can be described as a dimer formed by crystal packing.

Structure of TcCet1 with tripolyphosphate in the active site
The overall structure of TcCet1(18-243 Δ55-75) is similar to that of other TTM proteins: an eight-stranded antiparallel β-barrel with three α-helices surrounding the active-site tunnel (Fig. 2A). Structural comparison with yeast RNA triphosphatase Cet1 reveals that the diameter of the tunnel and the metal-binding site are nearly identical between the two enzymes (Fig. 2, C-E). The positions of 15 side chains in the Cet1 active-site tunnel are conserved in TcCet1 (Fig. 2, B and E), of which 14 side chains were shown to be essential for Cet1 (7,19,21) and TbCet1 (30) triphosphatase activity. Three glutamates (Glu42, Glu44, and Glu210) from motifs A and C on the tunnel floor coordinate the Mn²⁺ and allow it to interact with the PPPi at the P1 and P2 positions (Fig. 2F). The basic side chains (Arg118 and Lys138) extending from the other side of the tunnel likely stabilize the interaction with PPPi. Lys182 and Arg184 interact with the central phosphate (P2 position), and Lys138 interacts with the phosphate at the P3 position of PPPi (Fig. 2F). A similar mode of PPPi binding and metal arrangement was observed in the structure of the Arabidopsis protein AtTTM3·Mn·PPPi complex (24,33). In TcCet1, Arg46 also forms water-mediated contacts with Glu44 and Glu210.
Based on the configuration between Mn²⁺ and PPPi, Mn²⁺ could facilitate the hydrolysis of the phosphate at the P1 position. Hence, we predict that the γ-phosphate of pppRNA and NTP will be situated at the position of the P1 phosphate. In support of this view, superimposition of the TcCet1 and yeast Cet1 structures reveals that the P1 phosphate of PPPi is in close proximity to the sulfate ion present in the Cet1 structure (Fig. 2E), which has been proposed to mimic the position of the γ-phosphate of pppRNA and NTP (18).
We performed a fluorescence-based protein thermal stability shift assay to identify potential ligands that can interact with and stabilize the TcCet1 protein. In this procedure, TcCet1(18-243 Δ55-75) was incubated with SYPRO Orange dye, which preferentially binds to the hydrophobic core of the protein and emits fluorescence when the protein is unfolded by a temperature shift. We determined that the Tm of TcCet1(18-243 Δ55-75) was 44.6°C, and the addition of 2 mM ATP increased the Tm by 1.4°C to 46.0°C (Fig. 3A).
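As an illustration of how a melting temperature can be read from a thermal shift curve, the following minimal sketch estimates Tm as the temperature at which the fluorescence signal rises most steeply. This first-derivative approach is one common choice and is an assumption here, not necessarily the method used to obtain the reported values. The curves are synthetic two-state sigmoids centered on the reported Tm values (44.6 and 46.0°C); the function names and the curve steepness are hypothetical.

```python
import math

def melt_curve(tm, temps, steepness=1.5):
    """Synthetic unfolding curve (fraction unfolded); illustration only, not measured data."""
    return [1.0 / (1.0 + math.exp((tm - t) / steepness)) for t in temps]

def estimate_tm(temps, signal):
    """Return the temperature where the central-difference slope dF/dT is largest."""
    best_t, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        slope = (signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_t, best_slope = temps[i], slope
    return best_t

# Temperature ramp from 25 to 70 degrees C in 0.5 degree steps
temps = [25.0 + 0.5 * i for i in range(91)]

tm_apo = estimate_tm(temps, melt_curve(44.6, temps))  # enzyme alone (synthetic)
tm_atp = estimate_tm(temps, melt_curve(46.0, temps))  # enzyme + ATP (synthetic)
delta_tm = tm_atp - tm_apo
print(f"Tm(apo) ~ {tm_apo:.1f}, Tm(+ATP) ~ {tm_atp:.1f}, shift ~ {delta_tm:.1f}")
```

Because the estimate is limited to the sampling grid, the recovered shift can differ from the true shift by up to one temperature step; curve fitting would be needed for finer resolution.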
We screened for compounds that can bind and increase the Tm of TcCet1(18-243 Δ55-75) using an in-house chemical compound library consisting of 965 commercially available low-molecular-weight compounds (Fig. 3A). We identified four compounds that increase the Tm of TcCet1

Structure of TcCet1 in complex with ligands
We attempted to cocrystallize TcCet1(18-243 Δ55-75) with the four compounds identified in the above screen that stabilize the enzyme. Only the complex formed with compound 951 (C10H14N4O2) crystallized with satisfactory quality for X-ray diffraction analysis. By soaking 2 mM compound 466 (C13H13NO2) into the C10H14N4O2-bound crystal, we also obtained the structure of TcCet1(18-243 Δ55-75)·C13H13NO2. The structures of TcCet1(18-243 Δ55-75) with the ligands were solved at 2.39 Å (complex with C10H14N4O2) or 2.51 Å (complex with C13H13NO2) resolution using the TcCet1(18-243 Δ55-75) Mn·PPPi-bound structure as a search model (Fig. 4, A and B). The bound ligands were well-defined in the electron-density maps (Fig. S4) and bound on the same surface of the protein, which forms a helical loop between amino acid residues 166 and 177, a region that was disordered in both the manganese- and Mn·PPPi-bound structures. Neither manganese nor PPPi is evident in either ligand-bound structure, and the tunnel cavities are slightly narrower than in the manganese- or Mn·PPPi-bound structures (Fig. 4).
Two aromatic residues, Phe169 from the loop and Trp117 from the β4-strand, interact with C13H13NO2 and C10H14N4O2 through π-π stacking. The C13H13NO2 oxo groups are hydrogen-bonded to the His119 and Asp165 side chains (Fig. 4C). Similarly, one of the oxo groups in C10H14N4O2 forms a hydrogen bond with the His119 side chain, and the other oxo group is stabilized by Arg115 (Fig. 4D). Neither compound had a substantial effect on the triphosphatase activity of TcCet1 at concentrations up to 1 mM (Fig. 4E). Point mutations in Phe169 and His119 did not have a significant impact on the triphosphatase activity, although the Trp117 mutation showed a moderate effect, with 20-50% of WT activity (Fig. 4, F and G). These results indicate that the ligand-binding site is not essential for TcCet1 activity and that the ligand-bound structures constitute an active form of the enzyme.
Sulfate ions bound on the surface of TcCet1 reveal the entry site for triphosphate RNA
The striking feature of the C13H13NO2- and C10H14N4O2-bound structures is that three sulfate ions, designated SO4-a, SO4-b, and SO4-c, were bound on the same positively charged surface of the protein (Fig. 5, A and B). The distance between consecutive sulfur atoms is ~6 Å (S-S distance), which is similar to the distance between two adjacent phosphorus atoms in single-stranded RNA (Fig. 5C). We predict that the three consecutive sulfate ions reflect the positions of backbone phosphates on the RNA chain. Superposition of the ligand-bound and the PPPi-bound structures reveals that SO4-a is situated 12.6 Å (S-P distance) from the nearest phosphate of the PPPi in the active-site tunnel, suggesting that SO4-a likely occupies the position of the phosphate between the second and third nucleotides of the RNA substrate. SO4-b and SO4-c may occupy the positions between the third and fourth and between the fourth and fifth nucleotides of the RNA, respectively.
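The spacing argument above can be sketched as a simple geometric check: compute the distances between consecutive sulfur atoms and test whether each falls near the ~6 Å phosphate-to-phosphate spacing. The coordinates below are hypothetical placeholders, not values taken from the deposited structures, and the 1 Å tolerance and the function names are assumptions chosen for illustration.

```python
import math

def consecutive_distances(points):
    """Distances (in Å) between each pair of neighboring points in the list."""
    return [math.dist(a, b) for a, b in zip(points, points[1:])]

def matches_backbone_spacing(points, expected=6.0, tol=1.0):
    """True if every consecutive distance lies within tol of the expected spacing."""
    return all(abs(d - expected) <= tol for d in consecutive_distances(points))

# Hypothetical sulfur coordinates (Å) for three sulfate ions; NOT from the PDB entries
sulfurs = [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0), (9.0, 5.2, 0.0)]

for d in consecutive_distances(sulfurs):
    print(f"S-S distance: {d:.2f} Å")
print("Consistent with RNA backbone spacing:", matches_backbone_spacing(sulfurs))
```

With real data, the sulfur coordinates would come from the sulfate HETATM records of the ligand-bound coordinate files.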
We hypothesized that, if the three sulfate ions found in the structure of TcCet1 reflect the positions of phosphodiester bonds of the RNA substrate, mutations in the residues surrounding the bound sulfate ions would affect the RNA triphosphatase activity while the enzyme retains its ability to hydrolyze NTP. Alanine substitutions were introduced at two positively charged residues, Arg156 and Lys144, that coordinate SO4-a and SO4-c, respectively (Fig. 5C). Phe50 and Phe79 were also selected, because these aromatic residues may participate in stacking interactions with the RNA bases. In addition, Arg58, situated within the internal deletion of TcCet1(18-243 Δ55-75), was substituted with alanine as a positive control.
The full-length WT and Ala-substituted TcCet1 proteins were produced in bacteria (Fig. 6A). The RNA triphosphatase and ATPase activities were assayed by the release of ³²Pi from 1 mM [γ-³²P]pppRNA and [γ-³²P]ATP, respectively (Fig. 6, B and C). The specific activities were calculated from the averages of the slopes of the titration curves in the linear range of enzyme dependence. Under these conditions, TcCet1 preferentially hydrolyzes pppRNA, with a specific activity 5.4-fold higher than that for ATP (Fig. 6D). The R156A mutation showed significantly reduced RNA triphosphatase activity compared with the WT enzyme. The ratio of pppRNA hydrolysis to ATP hydrolysis was 0.4, suggesting that R156A is selectively impaired in hydrolyzing triphosphate RNA (13.5-fold lower compared with the WT enzyme). F50A and F79A displayed reduced pppRNA and ATP hydrolysis compared with the WT enzyme. The ratio of pppRNA/ATP hydrolysis by F50A and F79A was 2.4 and 0.6, respectively, implying that F79A was less active in hydrolyzing pppRNA than ATP. K144A and R58A displayed near-WT activity in hydrolyzing pppRNA and ATP. Consistent with these findings, the F50A, F79A, and R156A mutations showed a significant reduction in binding to nucleic acid, whereas the R58A and K144A mutants maintained WT binding (Fig. S5). Taken together, these results support the notion that the three sulfate ions found in the structure of TcCet1 reflect the positions of phosphodiester bonds on the RNA, and that the 5′-end of the RNA likely enters the active-site tunnel directionally through the positively charged nucleic acid-binding surface.
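The selectivity comparison in this paragraph is simple arithmetic on the reported pppRNA/ATP hydrolysis ratios, and a short sketch makes the 13.5-fold figure for R156A explicit. The ratios are the ones stated in the text; the dictionary and function names are hypothetical, and expressing the other mutants the same way is an illustration rather than a result reported by the authors.

```python
# Ratio of pppRNA hydrolysis to ATP hydrolysis, as stated in the text
PPPRNA_TO_ATP_RATIO = {"WT": 5.4, "R156A": 0.4, "F50A": 2.4, "F79A": 0.6}

def fold_loss_of_rna_preference(mutant, ratios=PPPRNA_TO_ATP_RATIO):
    """How many fold lower the mutant's pppRNA/ATP ratio is than the WT ratio."""
    return ratios["WT"] / ratios[mutant]

for name in ("R156A", "F50A", "F79A"):
    fold = fold_loss_of_rna_preference(name)
    print(f"{name}: pppRNA/ATP ratio is {fold:.2f}-fold lower than WT")
```

A ratio below 1, as for R156A and F79A, means the mutant hydrolyzes ATP more efficiently than pppRNA under the assay conditions.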

Discussion
The present study provides new insights into the structure and mechanism of TTM-type RNA triphosphatase by capturing structures of catalytically active TcCet1 in complex with manganese and PPPi. The catalytic domain of TcCet1 adopts a characteristic TTM-enzyme fold with an eight-stranded antiparallel β-barrel, and the arrangement of the metal-binding site is similar to that of other TTM enzymes (18,23). The RNA must bind near one of the tunnel openings to allow the 5′-triphosphate terminus to enter the active site. We further cocrystallized TcCet1 with the two phenolic compounds that were identified to stabilize TcCet1. The crystals were grown in the presence of ammonium sulfate, and the TcCet1-ligand complexes contained three sulfate ions on an electrostatically positive surface near the tunnel entrance. The sulfate ions are separated by ~6 Å, similar to the distance between consecutive phosphates in single-stranded RNA. The distance between the PPPi found in the active-site tunnel and the closest sulfur atom was 12 Å. We predict that the sulfate ions reflect the positions of the second, third, and fourth backbone phosphates of the pppRNA substrate, that is, the phosphates following N2, N3, and N4 in pppN1pN2pN3pN4p.
Two metal-binding sites were identified in the structures of other TTM members, including AtTTM3 polyphosphatase (24), ygiF polyphosphatase (24), and adenylate cyclase CyaB (34). The second metal ion participates in binding of the substrate, coordinating the triphosphate, or stabilizing the diphosphate leaving group for optimal catalysis. TcCet1 may employ a two-metal catalytic mechanism, because previous studies suggest that TbCet1 and other RNA triphosphatases exhibit synergistic activation by magnesium and manganese (30,35). However, only a single metal was found in the structure of TcCet1(18-243 Δ55-75). We speculate that the second metal is absent from the structure because PPPi lacks the nucleoside moiety found in NTP and pppRNA. Alternatively, an extra negative charge on the PPPi, which is not present in NTP or pppRNA, may interfere with the binding of the second metal.
Mutational analysis of selected amino acids on the sulfate-binding surface identified Arg156 as important for the RNA triphosphatase activity. Alanine substitution of Arg156 severely reduced RNA triphosphatase activity, but the mutant protein showed a 2-fold increase in NTPase activity. The Arg156 residue, located at the entrance of the tunnel, is conserved as arginine or lysine in the cellular TTM-type RNA triphosphatases characterized to date but is not present in other TTM enzymes (13,15,23). The counterpart of TcCet1 Arg156 in Cet1 is Lys427. Substitution of Cet1 Lys427 with Glu results in a cold-sensitive growth arrest phenotype in yeast, suggesting that a positively charged residue at this position could be important for Cet1 function in vivo (36). Two aromatic residues, Phe50 and Phe79, in the vicinity of the sulfate ions were also crucial for TcCet1 RNA triphosphatase activity, suggesting that these aromatic residues may participate in stacking interactions with the nucleoside bases of the RNA. The effect of the F79A substitution appeared to be much more severe than that of F50A, possibly because Phe79 is positioned closer to the tunnel entrance than Phe50 and may have a greater impact on stabilizing the 5′-end of the RNA.
The TTM-enzyme fold and the active-site architecture are evolutionarily conserved, but TTM enzymes differ in their substrate specificity for phosphate hydrolysis. We propose that the 5′-end of the nascent RNA enters the triphosphate tunnel directionally through the positively charged RNA-binding surface to allow the γ-phosphate to be aligned in the active-site tunnel. This view is supported by cryo-EM and CX-MS analysis of the yeast capping enzyme in complex with RNA polymerase II (37). In that model, the 5′-end of the newly synthesized RNA enters the Cet1 tunnel cavity in the same direction as we propose for TcCet1. The directionality of substrate binding has been proposed to dictate the cleavage specificity of TTMs (22,24). Thiamine triphosphatase hydrolyzes thiamine triphosphate into thiamine diphosphate and Pi but is unable to hydrolyze pppRNA. The α-helix at the C-terminal end protrudes into the tunnel to interact with thiamine and orient the terminal phosphate for the hydrolysis reaction, in the same direction as we propose for TcCet1 (24). A similar C-terminal plug-in helix is also present in AtTTM3 tripolyphosphatase (24,33) but is absent in TcCet1. The C-terminal plug-in helix may also function to block unwanted substrates from entering the tunnel cavity, which may explain why thiamine triphosphatase and AtTTM3 cannot hydrolyze pppRNA. In the crystal structure of class IV adenylate cyclase, ATP is bound in a reverse orientation in the active-site tunnel compared with the thiamine triphosphate of thiamine triphosphatase (24,34). This conformation of ATP positions the α-phosphate in a suitable orientation for an in-line nucleophilic attack by the ribose O3′ at the cleavage site.

Because TcCet1 can also hydrolyze the γ-phosphate of NTP, we predict that the triphosphate of the NTP will enter the tunnel in the same direction as that of pppRNA. In contrast, a symmetric molecule such as PPPi may enter from either side of the tunnel. This may partly explain why PPPi is a potent inhibitor of the triphosphatase activity (30,31).
In summary, we solved the structures of TcCet1 and illuminated how the RNA substrate is recognized and how the triphosphate end enters the active-site tunnel. The structural information on TcCet1 could be exploited in the development of effective inhibitors that specifically block the entry of the RNA substrate into TTM-type RNA triphosphatases.

Experimental procedures
Expression plasmids for TcCet1 and TbCet1
The bacterial expression plasmid pET44-TcCet1 encodes T. cruzi RNA triphosphatase fused in-frame to an N-terminal His tag with a custom HRV3C protease recognition site. This was accomplished by amplifying the TcCet1 ORF from the T. cruzi (Tulahuen strain) genome by PCR and inserting it into a modified pET44, in which the plasmid segments encoding the Nus·Tag, S·Tag, and thrombin cleavage segments were replaced with a His tag and the HRV3C protease recognition site between the NdeI and BamHI sites. The truncated allele TcCet1(18-243) was generated by PCR amplification from the pET-TcCet1 template using a mutagenic sense primer that replaces Gly17 with methionine and an antisense-strand primer that introduces a stop codon in lieu of the codon for Ser244, and the fragment was inserted between the NdeI and SalI restriction sites. The TcCet1(18-243 Δ55-75) allele, spanning residues 18 to 243 with an internal deletion of amino acids 55 to 75, and point mutations in the TcCet1 gene were generated with synthetic oligonucleotides using the two-stage PCR-based overlap extension strategy. T. brucei RNA triphosphatase was amplified by PCR from pET-His/Smt3-TbCet1 (11) and inserted into the pET44a plasmid between the NdeI and SalI restriction sites. The plasmids for expression of TbCet1(29-253) and TbCet1(29-253 Δ62-90) were generated by strategies similar to those described for TcCet1(18-243) and TcCet1(18-243 Δ55-75). The presence of the desired mutation was confirmed in every case by sequencing the entire insert. For deletion and mutational analysis, the plasmids were transformed into E. coli BL21 (DE3), and the proteins were purified from the soluble lysates on a Ni-NTA column as described (11).
Large-scale purification of TcCet1(18-243 Δ55-75)
E. coli BL21 (DE3) transformed with pET-TcCet1(18-243 Δ55-75) was grown in LB medium (6 liters) with 0.1 mg/ml ampicillin at 37°C until the absorbance at 600 nm reached 0.6. The expression of TcCet1(18-243 Δ55-75) was induced by addition of isopropyl β-D-1-thiogalactopyranoside to a final concentration of 0.5 mM. The temperature was reduced to 16°C following induction, and the cells were incubated for another 12-16 h. The bacteria were harvested by centrifugation and resuspended in lysis buffer (50 mM Tris-HCl, pH 8.0, 0.3 M NaCl, 10 mM imidazole, 2 mM MnCl2, and 1 mM DTT), sonicated on ice, and centrifuged. The supernatant was mixed with Ni-NTA-Sepharose resin that had been equilibrated with lysis buffer, and the suspension was mixed by continuous rotation at 4°C for 1 h. The nickel-Sepharose resin was poured into a column. The packed column was washed with lysis buffer containing 20 mM imidazole, and the N-terminal His tag was cleaved by GST-fused HRV3C protease on the column at 4°C for 16 h. Tag-less TcCet1(18-243 Δ55-75) was recovered from the Ni-NTA column with elution buffer (50 mM Tris-HCl, pH 8.0, 0.1 M NaCl, 20 mM imidazole, 2 mM MnCl2, and 1 mM DTT). The eluate was passed through a Q-Sepharose column (3 ml, GE Healthcare) and then through a GS4B column (3 ml, GE Healthcare), and the flow-through fractions were collected at each step. The GS4B flow-through fraction was concentrated with an Amicon-Ultra device (molecular mass cutoff, 10 kDa) to 20-30 mg/ml, and ~15 mg of protein was applied to a Superdex 75 gel filtration column (GE Healthcare) equilibrated with 10 mM HEPES (pH 7.5), 2 mM MnCl2, and 100 mM NaCl. Peak fractions containing the recombinant protein were collected, concentrated to ~15 mg/ml, and stored at −80°C. Purified TcCet1(18-243 Δ55-75) was used for crystallization and the thermal shift assay.
The diffraction data were collected at BL15A at the National Synchrotron Radiation Research Center (Hsinchu, Taiwan) or BL-1A at the Photon Factory (Tsukuba, Japan). The data were integrated with XDS (38) and scaled with AIMLESS in the CCP4 software package (39). For I-SAD data collection, the crystal was soaked with 0.25 M potassium iodide for 30 min at 293 K. Phenix AutoSol was used for phase determination (40). In the Mn·PPPi complex, seven iodine sites in the asymmetric unit were located and used for phase determination and improvement, which yielded a traceable electron density map. The initial figure of merit was 0.327. Initial model building was performed with Phenix AutoBuild. The structure was refined through iterative rounds of manual adjustment in Coot and refinement with Phenix.refine. The B-factors of the phosphate atoms (α, β, and γ) were 89.21, 88.30, and 99.52, respectively.
Other crystal structures were solved by molecular replacement using Phaser in Phenix. Iodine and Mn·PPPi-bound form structures were used as a search model for the molecular replacement. The structures were refined to the indicated statistics using iterated rounds of manual adjustments in Coot (41,42), followed by refinement using Phenix (40). Final coordinates and structure factors were submitted to the Protein Data Bank under accession codes 6L7W (manganese complex), 6L7V (Mn·PPPi complex), 6L7Y (C13H13NO2 complex), and 6L7X (C10H14N4O2 complex).

Thermal shift assay
A library of 965 chemical compounds was assembled in-house from commercially available low-molecular-weight compounds based on the "rule of three" (molecular weight < 300, number of hydrogen bond donors ≤ 3, number of hydrogen bond acceptors ≤ 3, and ClogP ≤ 3), as described (43). Compounds were stored desiccated and were resuspended in DMSO to 100 mM prior to the assay. The thermal shift assay was carried out with 1 mg of TcCet1(18-243 Δ55-75) and SYPRO Orange fluorescent dye (Thermo Fisher Scientific) (1:1000 dilution) in 96-well plates. The reaction mixture contained 10 mM HEPES (pH 7.5), 2 mM MnCl2, 1 mM DTT, and 2 mM of each compound. The temperature of the plate was raised from 25 to 70°C in increments of 0.5°C/5 s using the StepOne Plus real-time PCR system (Thermo Fisher Scientific). The resulting melting curves were analyzed with Protein Thermal Shift Software, version 1.1 (Thermo Fisher Scientific) to determine ΔTm.
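The "rule of three" selection criteria listed above can be expressed as a simple filter. The sketch below is only an illustration of those four thresholds: the `Compound` class, the function name, and the two example entries are hypothetical and do not correspond to compounds in the actual library.

```python
from dataclasses import dataclass

@dataclass
class Compound:
    name: str
    mol_weight: float   # molecular weight in Da
    h_donors: int       # number of hydrogen bond donors
    h_acceptors: int    # number of hydrogen bond acceptors
    clogp: float        # calculated logP

def passes_rule_of_three(c: Compound) -> bool:
    """Apply the four thresholds: MW < 300, donors <= 3, acceptors <= 3, ClogP <= 3."""
    return (
        c.mol_weight < 300
        and c.h_donors <= 3
        and c.h_acceptors <= 3
        and c.clogp <= 3
    )

# Hypothetical example entries, for illustration only
candidates = [
    Compound("fragment-like example", 180.2, 1, 3, 1.2),
    Compound("too-large example", 412.5, 2, 6, 4.1),
]
library = [c for c in candidates if passes_rule_of_three(c)]
print([c.name for c in library])
```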

Preparation of triphosphate terminated RNA
The substrate for the RNA triphosphatase assay was prepared by in vitro transcription with T7 RNA polymerase from a partially duplexed oligonucleotide DNA (44). Oligonucleotides corresponding to the T7 promoter sequence (5′-TAATACGACTCACTATA-3′) and the complementary sequence with a poly(dT) stretch (5′-TTTTTTTTTTTTTTTTTTTTTATAGTGAGTCGTATTA-3′) were annealed to produce a template for T7 RNA polymerase. The reaction mixture contained 1 mM of [γ-³²P]ATP, 20 mM of duplex oligonucleotide DNA, 1 unit/ml of RNase inhibitor (Toyobo Co., Ltd.), and 2.5 units/ml of T7 RNA polymerase (New England Biolabs Inc.) in the provided reaction buffer. The reaction was carried out for 3 h at 37°C. DNase I (Nippon Gene Co., Ltd.) was then added to digest the template oligonucleotide DNAs, according to the manufacturer's instructions. The product was extracted with phenol/chloroform/isoamyl alcohol, precipitated with ethanol, and separated on a 20% denaturing polyacrylamide gel. The radioactive product that migrated between 7 and 9 nucleotides was excised from the gel and eluted with 10 mM Tris-HCl (pH 8), 1 mM EDTA. The eluate was passed through a spin column to remove gel debris. The RNA was precipitated with ethanol and resuspended in water.
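The design of the partially duplexed template can be checked with a short sketch: the 3′ portion of the long oligonucleotide should be the reverse complement of the T7 promoter oligonucleotide, leaving the poly(dT) stretch single-stranded as the transcribed region. The two sequences are the ones given above; the helper names are hypothetical, and the predicted poly(A) sequence is an inference from base pairing, not a measured product.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

PROMOTER = "TAATACGACTCACTATA"                       # T7 promoter strand, 5'->3'
TEMPLATE = "TTTTTTTTTTTTTTTTTTTTTATAGTGAGTCGTATTA"  # template strand, 5'->3'

duplex_part = reverse_complement(PROMOTER)
# Portion of the template left single-stranded after annealing
overhang = TEMPLATE[: len(TEMPLATE) - len(duplex_part)]
# RNA templated by the overhang, written 5'->3' (U in place of T)
predicted_rna = reverse_complement(overhang).replace("T", "U")

print("Promoter anneals to template 3' end:", TEMPLATE.endswith(duplex_part))
print("Single-stranded overhang:", overhang)
print("Predicted transcript sequence:", predicted_rna)
```

A transcript templated by poly(dT) starts with adenosine, which is consistent with using [γ-³²P]ATP to label the 5′-triphosphate end.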

Triphosphatase assay
Reaction mixtures containing 50 mM Tris-HCl (pH 7.5), 2 mM DTT, 2 mM of divalent cation (MgCl2 or MnCl2) with either [γ-³²P]ATP or [γ-³²P]pppRNA, and the indicated amount of protein were incubated for 15 min at 30°C. The reaction was quenched by the addition of EDTA or formic acid. The reaction products were separated by PEI-cellulose TLC with 0.45 M ammonium sulfate. The TLC plate was exposed to a PhosphorImager plate, scanned with a BAS-2000 (FujiFilm, Japan), and quantitated with Image Gauge software.

Data availability
The atomic coordinates and structure factors have been deposited into the Protein Data Bank as entries 6L7V, 6L7W, 6L7X, and 6L7Y. All other information and data are available from the authors upon request.