Specificity and Evolutionary Conservation of the Escherichia coli RNA Pyrophosphohydrolase RppH*

Background: In bacteria, RppH triggers mRNA degradation by converting the 5′-terminal triphosphate to a monophosphate. Results: The substrate specificity of E. coli RppH has been determined, along with residues important for this property and species in which they are conserved. Conclusion: The specificity of RppH is species-dependent. Significance: These findings help to explain the selective influence of RppH on rates of mRNA decay. Bacterial RNA degradation often begins with conversion of the 5′-terminal triphosphate to a monophosphate by the RNA pyrophosphohydrolase RppH, an event that triggers rapid ribonucleolytic attack. Besides its role as the master regulator of 5′-end-dependent mRNA decay, RppH is important for the ability of pathogenic bacteria to invade host cells, yet little is known about how it chooses its targets. Here, we show that Escherichia coli RppH (EcRppH) requires at least two unpaired nucleotides at the RNA 5′ end and prefers three or more such nucleotides. It can tolerate any nucleotide at the first three positions but has a modest preference for A at the 5′ terminus and either a G or A at the second position. Mutational analysis has identified EcRppH residues crucial for substrate recognition or catalysis. The promiscuity of EcRppH differentiates it from its Bacillus subtilis counterpart, which has a strict RNA sequence requirement. EcRppH orthologs likely to share its relaxed sequence specificity are widespread in all classes of Proteobacteria, except Deltaproteobacteria, and in flowering plants. By contrast, the phylogenetic range of recognizable B. subtilis RppH orthologs appears to be restricted to the order Bacillales. These findings help to explain the selective influence of RppH on bacterial mRNA decay and show that RppH-dependent degradation has diversified significantly during the course of evolution.

Messenger RNA degradation is an important mechanism for controlling gene expression in all organisms. By limiting the number of times that a transcript can be translated, mRNA lifetimes have a direct effect on protein synthesis. Half-lives can vary greatly for different mRNAs within the same cell, ranging from seconds to about an hour in bacteria and from minutes to days in higher eukaryotes. mRNA lifetimes can also be modulated in response to growth conditions or other signals.
In bacteria, the ends of primary transcripts are protected by a 5′-terminal triphosphate and a 3′-terminal stem-loop, both of which are nuclease-resistant. As a consequence, bacterial mRNA degradation generally begins by either of two mechanisms in which those barriers are bypassed or removed. The bypass mechanism involves direct access by an endonuclease to an internal cleavage site (1,2). In Escherichia coli, RNase E is the endonuclease that most often initiates mRNA degradation in this manner, whereas in species that lack RNase E, such as Bacillus subtilis, the endonuclease RNase Y plays a similarly important role (3). Less frequently, internal cleavage by another endonuclease such as RNase III triggers mRNA decay. The other mechanism for initiating mRNA degradation involves conversion of the 5′-terminal triphosphate to a monophosphate by the RNA pyrophosphohydrolase RppH, a member of the Nudix hydrolase family (4-6). In E. coli, this modification creates better substrates for RNase E, whose endonucleolytic activity is greatly enhanced when the RNA 5′ end is monophosphorylated (7), whereas in B. subtilis it exposes the 5′ end to attack by the 5′-exonuclease RNase J, an enzyme that is absent in E. coli (6). mRNA degradation by the 5′-end-dependent pathway is important for bacterial pathogenesis, as evidenced by the impaired invasiveness of mutant strains that lack RppH (8-10).
The specificity of the two enzymes involved in RppH-dependent degradation in B. subtilis has been investigated. Purified B. subtilis RppH (BsRppH) requires two unpaired nucleotides at the RNA 5′ end and prefers three or more such nucleotides (11). Furthermore, BsRppH is sequence-dependent, requiring G as the second nucleotide of its substrates, preferring a purine at the third position, and slightly favoring A over G at position 1 (11,12). In contrast, RNase J appears to require at least 9-10 unpaired nucleotides at the 5′ end of the monophosphorylated intermediate for maximum 5′-exonuclease activity and 4-5 unpaired nucleotides for exonucleolytic processivity, but it exhibits little if any sequence specificity (13). Consistent with these observations, RppH-dependent mRNA degradation in B. subtilis requires an unpaired 5′ end and a G at position 2 (6,11).
Interestingly, B. subtilis appears to contain a second, as yet unidentified, RNA pyrophosphohydrolase whose action is sequence-independent (6,11).
In E. coli, hundreds of messages are targeted for degradation by the RppH-dependent pathway, but even more are not (5,14). However, little is understood about the basis for the specificity of this pathway in E. coli other than its requirement for an unpaired 5′ end (1,5,15,16). Neither the number of unpaired 5′-terminal nucleotides required for pyrophosphate removal by E. coli RppH (EcRppH) or for 5′-monophosphate-assisted cleavage by RNase E nor the influence of the sequence of those nucleotides has previously been determined.
We have now examined the specificity of EcRppH. Our findings show that, although its 5′-terminal length requirements resemble those of BsRppH, its sequence preferences are much more permissive. Phylogenetic comparison indicates which other bacterial species contain an EcRppH-like ortholog and which contain a BsRppH-like ortholog. Finally, mutational analysis of EcRppH has identified amino acid residues important for its catalytic activity and relaxed substrate specificity.

EXPERIMENTAL PROCEDURES
Purification of RppH-Wild-type and mutant forms of E. coli RppH bearing an amino-terminal hexahistidine tag were produced in E. coli cells that contained plasmid pPlac-RppH6 (5) or a derivative thereof and lacked the chromosomal genes encoding RppH and RNase I. Protein synthesis was induced by adding isopropyl 1-thio-β-D-galactopyranoside (1 mM) to a log phase culture (A600 = 0.5) growing in LB medium (1000 ml) at 37°C. After 2 h, the cells were pelleted, resuspended in 10 ml of buffer E (10 mM HEPES, pH 7.5, 0.3 M sodium chloride) containing 0.25% Genapol, 0.1 mM PMSF, and Complete EDTA-free protease inhibitor (Roche Applied Science), and disrupted in an EmulsiFlex-C3 high pressure homogenizer (Avestin). The cell lysates were cleared twice by centrifugation at 14,500 × g for 15 min at 4°C, and the supernatants were incubated with 2 ml of BD TALON metal affinity resin (Clontech) for 1 h at 4°C with gentle agitation. The resin was washed five times with buffer E containing 0-20 mM imidazole, and RppH was then eluted with buffer E containing 400 mM imidazole. The imidazole was then removed by gel filtration through Sephadex G-25 (1 ml). Peak fractions were stored at −80°C in buffer E containing 50% (v/v) glycerol. Protein concentrations were measured colorimetrically by using the Bio-Rad Protein Assay and verified by SDS-PAGE and staining with Coomassie Blue.
Monitoring Pyrophosphate Removal by RppH-RppH substrates bearing a 5′-terminal γ-³²P label and a single internal fluorescein label were prepared by in vitro transcription and purified by PAGE, as described previously (11).
To examine the specificity of wild-type EcRppH, equal amounts of the 5′ γ-³²P-labeled RNA to be tested and 5′ γ-³²P-labeled A8XL (0.4 pmol each) were prewarmed to 37°C for 3 min in EcRppH reaction buffer (20 mM HEPES, pH 7.5, 10 mM magnesium chloride, 1 mM dithiothreitol, 1% glycerol; total volume of 99 µl). A 0-min sample (9 µl) was removed, mixed with 8 µl of loading buffer (90 mM EDTA, 82% formamide), and cooled on ice. RppH (10 µl of an 80 nM solution in EcRppH reaction buffer) was added to the remainder (final RppH concentration of 8 nM), and 10-µl samples were withdrawn at time intervals (1, 2, 4, and 8 min) and quenched on ice with 8 µl of a loading buffer containing excess EDTA. (Under these conditions, the reaction rate is limited by the concentration of RppH.) All of the samples were then heated to 95°C for 3 min and subjected to electrophoresis on a 13.5% polyacrylamide, 8 M urea gel. After visualizing the fluorescence of the RNA bands with a Typhoon Trio scanner (GE Healthcare), the gel was dried, and the radioactivity of the bands was visualized with a Storm 820 PhosphorImager (GE Healthcare). Band intensities were quantified by using ImageQuant software, and radioactivity was normalized to fluorescence.
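The dilution arithmetic and the normalization step in this protocol can be illustrated with a minimal Python sketch. The function names and the band intensities are hypothetical and serve only to show the calculation; they are not data or software from this study (band quantification was performed with ImageQuant).

```python
def final_concentration(stock_nM, added_ul, total_ul):
    """Concentration after diluting `added_ul` of stock to a final volume of `total_ul`."""
    return stock_nM * added_ul / total_ul


def fraction_triphosphate_remaining(radioactivity, fluorescence):
    """Normalize the 32P signal of each band to its fluorescence, then to the 0-min sample.

    Both arguments are lists of band intensities ordered by time,
    with the 0-min sample first.
    """
    ratios = [r / f for r, f in zip(radioactivity, fluorescence)]
    return [x / ratios[0] for x in ratios]


# 99 µl reaction, minus the 9-µl 0-min sample, plus 10 µl of 80 nM RppH
enzyme_nM = final_concentration(80.0, 10.0, 99.0 - 9.0 + 10.0)

# Hypothetical band intensities (arbitrary units) at 0, 1, 2, 4, and 8 min
p32 = [1000.0, 800.0, 640.0, 410.0, 170.0]
fluor = [500.0, 500.0, 500.0, 500.0, 500.0]
remaining = fraction_triphosphate_remaining(p32, fluor)
```

Because the radiolabel is on the γ-phosphate, the normalized radioactivity that remains in a band reports the fraction of that RNA still bearing a triphosphate.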
To examine the effect of amino acid substitutions on the activity and specificity of EcRppH, equal amounts of 5′ γ-³²P-labeled A8XL and either A4 AGAA or A4 ACAA (final concentration of 5 nM each) were prewarmed to 37°C for 5 min in EcRppH reaction buffer containing unlabeled total E. coli RNA (1 µg/ml) as a competitor. EcRppH or a mutant thereof was added to a final concentration of either 15 nM (for A4 AGAA) or 60 nM (for A4 ACAA), and samples (10 µl) were withdrawn at time intervals (0, 1, 2, 4, and 8 min) and quenched on ice with a loading buffer containing excess EDTA. Sample analysis was then performed as described above.
Examining mRNA Degradation in E. coli-To determine the effect of sequence on the 5′ phosphorylation state of yeiP mRNA in E. coli, either strain BW25113 containing plasmid pYeiP1 (17) or strain BW25113ΔyeiP (17) containing plasmid pYeiP1-U2G was grown to exponential phase at 37°C in MOPS/glucose medium. Total cellular RNA was then extracted and analyzed by PABLO as described previously (17), with the following exceptions: 10-23 deoxyribozyme YeiP-DZM7 (GTAATTCAGTAGGCTAGCTACAACGACATACCTTTT) was used instead of YeiP-DZM3; oligo X22 (GAACAATATGAATGATAACTTG) was used instead of oligo X32; oligo Y yeiP-U2G (AGTCGAAAATGTCAAAAATCTCAAGTTATCATTCATATTGTTC) was used instead of oligo Y yeiP (AGTCGAAAATGTCAAAAATATCAAGTTATCATTCATATTGTTC) to determine the phosphorylation state of the yeiP-U2G transcript; and purified EcRppH was used instead of tobacco acid pyrophosphatase to generate fully monophosphorylated RNA in vitro.
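The design of the two bridging oligonucleotides can be checked with a short Python sketch. It assumes that each oligo Y is perfectly complementary to oligo X22 over its 3′ portion and to the RNA 5′ end over its 5′ portion; the helper names are hypothetical, and the RNA sequence is inferred from the oligonucleotides rather than taken from a sequence record.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


OLIGO_X22 = "GAACAATATGAATGATAACTTG"
OLIGO_Y = {
    "yeiP": "AGTCGAAAATGTCAAAAATATCAAGTTATCATTCATATTGTTC",
    "yeiP-U2G": "AGTCGAAAATGTCAAAAATCTCAAGTTATCATTCATATTGTTC",
}


def implied_rna_5prime_end(oligo_y, oligo_x):
    """RNA 5'-terminal sequence that the bridging oligo would base pair with."""
    x_site = reverse_complement(oligo_x)
    if not oligo_y.endswith(x_site):
        raise ValueError("oligo Y does not end with the complement of oligo X")
    # The remaining 5' portion of oligo Y pairs with the RNA 5' end
    rna_site = oligo_y[: len(oligo_y) - len(x_site)]
    return reverse_complement(rna_site).replace("T", "U")


for name, oligo_y in OLIGO_Y.items():
    print(name, implied_rna_5prime_end(oligo_y, OLIGO_X22))
```

Under these assumptions, the two oligo Y variants differ at a single position, which pairs with the second nucleotide of the transcript (U in yeiP, G in yeiP-U2G).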
Rates of yeiP mRNA degradation in these strains or in congenic strains lacking the rppH gene were measured by Northern blot analysis of total RNA extracted at time intervals after inhibiting transcription, as described previously (17).
Identification of EcRppH and BsRppH Orthologs-Bacterial proteins with a high degree of sequence homology to EcRppH and BsRppH were identified by using Blastp. True orthologs of these two enzymes were distinguished from other Nudix hydrolases on the basis of their p value (≤3 × 10⁻¹² for proteobacterial EcRppH orthologs or ≤4 × 10⁻¹⁵ for BsRppH orthologs) and confirmed by the presence of conserved amino acid residues characteristic of each protein clade. Nudix hydrolases with higher p values had clear deficiencies in sequence homology outside the vicinity of the Nudix motif. Subclades of EcRppH orthologs and their evolutionary relatedness were determined by using ClustalW2 to analyze protein sequences. The degree to which individual amino acid residues are conserved in orthologs of EcRppH and BsRppH was examined by using the Protein versus All Alignment tool on the website of the Comprehensive Microbial Resource.
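The two-step ortholog criterion (a p value cutoff followed by a check for clade-specific conserved residues) can be expressed as a simple filter. The sketch below is illustrative only: the `Hit` record, its field names, and the example entries are hypothetical, and only the two cutoffs are taken from the text.

```python
from dataclasses import dataclass

# p value cutoffs stated in the text for each query protein
P_CUTOFF = {"EcRppH": 3e-12, "BsRppH": 4e-15}


@dataclass
class Hit:
    protein_id: str            # hypothetical identifier
    query: str                 # "EcRppH" or "BsRppH"
    p_value: float             # p value reported for the alignment
    has_clade_residues: bool   # outcome of the conserved-residue check


def is_candidate_ortholog(hit):
    """Apply the p value cutoff, then require the clade-specific residues."""
    return hit.p_value <= P_CUTOFF[hit.query] and hit.has_clade_residues


# Hypothetical hits for illustration only
hits = [
    Hit("protein_1", "EcRppH", 1e-40, True),
    Hit("protein_2", "EcRppH", 5e-9, True),    # p value too high
    Hit("protein_3", "BsRppH", 1e-20, False),  # lacks clade residues
    Hit("protein_4", "BsRppH", 4e-15, True),   # exactly at the cutoff
]
orthologs = [h.protein_id for h in hits if is_candidate_ortholog(h)]
```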

RESULTS
Requirement for Unpaired Nucleotides at the 5′ Terminus-To determine the minimum number of unpaired 5′-terminal nucleotides required by EcRppH, we used a set of structurally unambiguous substrates previously employed to examine the specificity of BsRppH. A8, the prototype of these RNA substrates, comprised an 8-nucleotide single-stranded segment followed by two stem-loop structures, the first of which contained the only U nucleotide in the entire molecule (Fig. 1). It was synthesized by in vitro transcription in the presence of [γ-³²P]ATP and fluorescein-12-UTP so as to incorporate a γ radiolabel within the 5′-terminal triphosphate and a single fluorescein label at the top of the first stem-loop. For use as an internal standard, we also prepared doubly labeled A8XL RNA, which was identical to A8 except for an additional stem-loop at the 3′ end.
To monitor pyrophosphate removal from these RNAs, we combined equal amounts of each with EcRppH in a Mg²⁺-containing buffer, quenched reaction samples periodically with EDTA, and subjected the reaction products to gel electrophoresis. The extent of reaction at the various time points was then quantified for each substrate by normalizing the radioactivity of the corresponding gel band to its fluorescence intensity. As expected, the rate of pyrophosphate removal was very similar for A8 and A8XL (Fig. 2A and supplemental Table 1).
FIGURE 1. EcRppH substrates. The sequence and expected secondary structure of A8, A4, A3, A2, A1, A1+3, G8, and A8XL RNA are shown. Each bore a 5′-terminal triphosphate (ppp), a γ-³²P radiolabel (*) at the 5′ end, and a fluorescein label at the top of the first stem-loop. In each RNA name, the letter indicates the identity of the 5′-terminal nucleotide, and the numeral indicates the number of unpaired nucleotides at the 5′ end. Truncated derivatives of A8 (A4, A3, A2, A1) lacked 4-7 nucleotides from the 3′ boundary of the 5′-terminal single-stranded segment. G8, G4, G3, G2, G1, and G0 were identical to their A-series counterparts except for the presence of G instead of A at the 5′ end. A1+3 was the same as A1 except for three additional nucleotides at the 3′ end.
FIGURE 2. In vitro transcribed A8 and A1 bearing a γ-³²P radiolabel and an internal fluorescein label were mixed with labeled A8XL and treated with purified EcRppH (8 nM), and the radioactivity (P-32) and fluorescence (Fluor) of each RNA were monitored as a function of time by gel electrophoresis. B and C, graphs. RppH-catalyzed phosphate removal from A8, A4, A3, A2, A1, and A1+3 or from G8, G4, G3, G2, G1, and G0 was monitored as in A and quantified by normalizing the radioactivity remaining in each RNA to the corresponding fluorescence intensity. Each time point is the average of two or more independent measurements. Error bars have been omitted to improve the legibility of the graph; instead, the standard error of each measurement is reported in supplemental Table 1.
We then shortened the single-stranded segment at the 5′ end of A8 from 8 to 4, 3, 2, or 1 nucleotides by removing nucleotides from its 3′ boundary so as to generate A4, A3, A2, and A1 (Fig. 1). The reactivity of each toward EcRppH was compared by monitoring pyrophosphate removal in the presence of A8XL. A4 and A3 were nearly as reactive as A8, whereas A2 was significantly less reactive, and A1 was completely unreactive (Fig. 2, A and B). Adding three unpaired nucleotides to the 3′ end of A1 (A1+3) did not increase its reactivity (Fig. 2B), demonstrating that its resistance to pyrophosphate removal by EcRppH was due to an insufficient number of unpaired nucleotides at the 5′ terminus and not merely to its shorter overall length. A similar length dependence was observed for a related set of RNA substrates in which the 5′-terminal nucleotide was changed to G (Fig. 2C). These results show that substrates of EcRppH, like those of BsRppH, must have at least two unpaired nucleotides at the 5′ end and preferably three or more.
Effect of 5′-terminal RNA Sequence-To ascertain whether EcRppH has sequence preferences at the 5′ end of its substrates, we introduced single nucleotide substitutions into A4 (hereafter referred to as A4 AGAA to indicate both the identity of the 5′-terminal nucleotide and the sequence of unpaired nucleotides at the 5′ end) and tested their effect on the rate of pyrophosphate removal. The reaction rate of G4 GGAA, in which the first nucleotide was changed from A to G (the other one commonly present at the 5′ end of primary transcripts), was somewhat slower than that of A4 AGAA (Fig. 3A). Changing the identity of the third nucleotide had only a small effect, as A4 AGGA and A4 AGCA were as reactive as A4 AGAA, and A4 AGUA was only slightly less reactive (Fig. 3D). By contrast, substitutions at the second position had a clear effect on reactivity. Although replacing the G at position 2 of G4 GGAA with A (to create G4 GAAA) impeded the reaction only slightly, substituting a pyrimidine at that position in the context of either G4 GGAA or A4 AGAA (to create G4 GCAA, G4 GUAA, A4 ACAA, or A4 AUAA) allowed the reaction to proceed but slowed it significantly (Fig. 3, B and C). Thus, the 5′-terminal sequence specificity of EcRppH differs from that of BsRppH in one crucial respect: whereas BsRppH has an absolute requirement for G at position 2, EcRppH prefers either G or A at that position but can also tolerate either C or U.
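The length and sequence rules described in this and the preceding subsection can be summarized as a rough qualitative classifier. The sketch below is only a mnemonic for the trends reported here: the function names and the category labels ("reactive", "reduced", "slow", "unreactive") are hypothetical simplifications, not measured rate constants, and the weaker preferences at positions 1 and 3 are ignored.

```python
PURINES = {"A", "G"}


def ecrpph_reactivity(unpaired_5prime):
    """Qualitative EcRppH reactivity for the unpaired 5'-terminal RNA segment."""
    n = len(unpaired_5prime)
    if n < 2:
        return "unreactive"   # fewer than two unpaired nucleotides
    if unpaired_5prime[1] not in PURINES:
        return "slow"         # pyrimidine at position 2 is tolerated but disfavored
    if n == 2:
        return "reduced"      # two unpaired nucleotides suffice but are suboptimal
    return "reactive"


def bsrpph_reactivity(unpaired_5prime):
    """Qualitative BsRppH reactivity: G at position 2 is strictly required."""
    if len(unpaired_5prime) < 2 or unpaired_5prime[1] != "G":
        return "unreactive"
    return "reduced" if len(unpaired_5prime) == 2 else "reactive"


for segment in ["AGAA", "ACAA", "AG", "A"]:
    print(segment, ecrpph_reactivity(segment), bsrpph_reactivity(segment))
```

The contrast between the two functions at a position 2 pyrimidine (slow for EcRppH, unreactive for BsRppH) captures the difference in stringency between the two enzymes.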
Influence of 5′-Terminal RNA Sequence in E. coli-To determine whether the sequence preference of EcRppH is also evident in vivo, we examined the influence of the second nucleotide on the steady-state ratio of monophosphorylated to triphosphorylated yeiP mRNA, a transcript known to be degraded by an RppH-dependent mechanism in E. coli (5,17). This ratio, which reflects the relative values of the rate constants for RppH-dependent pyrophosphate removal and subsequent ribonucleolytic degradation (14), was measured by PABLO, a splinted ligation assay that takes advantage of the ability of T4 DNA ligase to covalently join a DNA oligonucleotide to the 5′ end of monophosphorylated RNA, but not triphosphorylated RNA, when the two are juxtaposed by simultaneous base pairing to a bridging oligonucleotide (4,18). The ligation product and any unligated RNA that remains are then separated and quantified by gel electrophoresis and Northern blotting, and the percentage of 5′ ends that are monophosphorylated is calculated from the ratio of the ligation yields before and after enzymatic conversion of all 5′ ends to ligatable monophosphates.
In accordance with a previous report (17), the ratio of monophosphorylated to triphosphorylated yeiP 5′ ends was found to be 0.78 ± 0.02 in E. coli (Fig. 4), a sign that the rate constants for pyrophosphate removal and subsequent degradation are in close balance for this mRNA. Replacing the U ordinarily present at the second position of yeiP with G increased this ratio 3-fold to 2.4 ± 0.2, a finding consistent with more rapid pyrophosphate removal as a consequence of substituting the nucleotide that EcRppH prefers there. Nevertheless, although RppH is crucial for the rapid decay of yeiP mRNA (17), the U-to-G substitution had little effect on the half-life of this transcript in E. coli (data not shown), suggesting that formation of the monophosphorylated intermediate is not the rate-limiting event in the multistep process of yeiP degradation. The latter conclusion is consistent with the high relative abundance of this decay intermediate at steady state.
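The arithmetic that links PABLO ligation yields, the percentage of monophosphorylated 5′ ends, and the monophosphate-to-triphosphate ratio can be sketched in Python as follows. The function names are hypothetical; only the two reported ratios (0.78 and 2.4) come from the text.

```python
def percent_monophosphorylated(yield_untreated, yield_converted):
    """Ligation yield before conversion relative to the yield after full conversion."""
    return 100.0 * yield_untreated / yield_converted


def mono_to_tri_ratio(percent_mono):
    """Ratio of monophosphorylated to triphosphorylated 5' ends."""
    fraction = percent_mono / 100.0
    return fraction / (1.0 - fraction)


def percent_from_ratio(ratio):
    """Inverse of mono_to_tri_ratio."""
    return 100.0 * ratio / (1.0 + ratio)


wild_type = percent_from_ratio(0.78)  # about 44% monophosphorylated
u2g = percent_from_ratio(2.4)         # about 71% monophosphorylated
fold_change = 2.4 / 0.78              # about 3.1-fold
```

Expressed as percentages, the reported ratios correspond to somewhat less than half of wild-type yeiP 5′ ends and roughly 70% of yeiP-U2G 5′ ends being monophosphorylated at steady state.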
Phylogenetic Distribution of EcRppH and BsRppH Homologs-Alignment of the sequences of EcRppH and BsRppH shows them to be quite different, with only 23% identity (Fig. 5A). Much of what little similarity exists is localized in and around the sequence motif (GX₅EX₇REX₂EEXG) characteristic of all Nudix hydrolases, irrespective of their function. As a result, many of the BsRppH residues implicated by x-ray crystallography in substrate binding (12) appear to have no counterpart in the sequence of EcRppH. However, superposition of the three-dimensional structures of BsRppH and EcRppH (12,19) reveals a pair of nonhomologous residues, BsRppH Lys-97 and EcRppH Lys-140, that, despite their disparate sequence contexts, appear to be similarly situated in the active site of each enzyme.
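The Nudix motif as written above (G, five residues of any kind, E, seven residues, RE, two residues, EE, one residue, G) translates directly into a regular expression. The following sketch uses a synthetic sequence constructed purely for illustration; it is not the sequence of any real protein, and the function name is hypothetical.

```python
import re

# GX5EX7REX2EEXG, where X is any residue
NUDIX_MOTIF = re.compile(r"G.{5}E.{7}RE.{2}EE.G")


def find_nudix_motif(protein_seq):
    """Return (1-based start position, matched segment) or None if absent."""
    match = NUDIX_MOTIF.search(protein_seq)
    if match is None:
        return None
    return match.start() + 1, match.group()


# Synthetic sequence for illustration only; not a real protein
toy = "MKT" + "G" + "AAAAA" + "E" + "LLLLLLL" + "RE" + "VV" + "EE" + "T" + "G" + "KK"
print(find_nudix_motif(toy))
```

As the text notes, a match to this motif identifies a Nudix hydrolase but says nothing about its function, which is why additional conserved residues were needed to recognize RppH orthologs.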
Screening for orthologs by phylogenetic comparison reveals that recognizable homologs of EcRppH are widespread among 1,367 sequenced species representing all classes of Proteobacteria except Deltaproteobacteria, where they are very rare (Fig.  5, B and C). By contrast, the phylogenetic range of BsRppH homologs is much narrower, as they are present in 186 sequenced species of the order Bacillales but almost entirely absent from the closely related order Lactobacillales and from other classes of Firmicutes (Fig. 5C). In bacterial species that lack an identifiable EcRppH or BsRppH ortholog, sequence divergence makes it difficult to predict which of their Nudix hydrolases functions as an RNA pyrophosphohydrolase.
Sequence alignment of proteobacterial EcRppH homologs shows a high degree of conservation of the amino acid residues that line the active site, as defined by the structure of the unliganded enzyme determined by NMR and corroborated by x-ray crystallography (19,20). Not only most of the residues characteristic of the Nudix motif but also 14 other residues are strictly conserved in virtually all of those orthologs (Fig. 5, A and B). According to the structure of ligand-associated EcRppH reported in the accompanying article by Vasilyev and Serganov (20), all but a couple of these residues are situated in close proximity to the bound substrate. Conserved residues Glu-53, Glu-56, Glu-57, and Glu-120 coordinate magnesium ions near the 5Ј triphosphate of the trinucleotide ligand; Arg-8 and Gly-39 directly contact the triphosphate; Gln-37, Tyr-77, and Gln-95 interact with the phosphate between nucleotides 1 and 2; and Arg-27, Phe-139, and Lys-140 contact the base of the second nucleotide.
Alignment of the BsRppH orthologs in various Bacillales species reveals a greater number of strictly conserved residues (Fig.  5A), as expected from the closer evolutionary relationship of those species. According to the structure of ligand-associated BsRppH (12), most of these conserved residues line the substrate-binding pocket. In particular, Gly-52, Glu-68, Glu-72, and Glu-115 coordinate magnesium ions in the vicinity of the 5Ј triphosphate; His-43 and Arg-46 make direct contact with the 5Ј triphosphate; Lys-54 interacts with the phosphate between nucleotides 1 and 2; and Asp-6, Tyr-86, Lys-97, and Asp-141 surround the base of the second nucleotide.
Effect of Active Site Mutations in EcRppH-To assess the catalytic importance of the conserved EcRppH glutamate residues that coordinate Mg²⁺, we examined the effect of alanine substitutions at these positions on the ability of the purified enzyme to remove pyrophosphate from A4 AGAA and A8XL (Fig. 6A). These experiments showed that mutating Glu-53, Glu-56, or Glu-57 abolished enzyme activity. By contrast, replacing Glu-120 with alanine had little effect.
FIGURE 4. Total RNA was extracted from log-phase E. coli cells containing either wild-type yeiP mRNA (5′ AU) or a variant (yeiP-U2G) in which the second nucleotide was changed from U to G (5′ AG), and the phosphorylation state of each transcript was determined by PABLO. RNA samples that had first been treated in vitro with an excess of purified EcRppH (PPase) were analyzed in parallel so that the ligation yields of the fully monophosphorylated transcripts could be used as correction factors for calculating the percentage of yeiP and yeiP-U2G that was monophosphorylated. The steady-state ratios of monophosphorylated to triphosphorylated mRNA reported in the text (mean values and standard deviations) were each calculated from measurements made on three independent RNA preparations.
FIGURE 5. The dendrogram includes only bacterial phyla in which recognizable sequence homologs of EcRppH (p value ≤3 × 10⁻¹²) or BsRppH (p value ≤4 × 10⁻¹⁵) are common. The number of species within a class or order whose genomes were represented in the Blastp database and found to encode an ortholog of EcRppH (red) or BsRppH (blue) is indicated. Phylogenetic branching reflects RppH clades, as calculated by ClustalW2. Line lengths are arbitrary.
We next investigated the importance of the five EcRppH residues that surround the base of the second nucleotide of substrate RNAs, as initially suggested by modeling and later substantiated by x-ray crystallography. These include the conserved residues Arg-27, Phe-139, and Lys-140 as well as Ser-32 and Val-137 (20). Although the S32A substitution mutant retained full activity toward A4 AGAA and A8XL, the R27A, V137A, and K140M mutants were somewhat less active, and the F139A mutant lacked detectable activity (Fig. 6B). To determine whether these five amino acid residues influence the preference of EcRppH for a purine at position 2, the rate at which the same substitution mutants remove pyrophosphate from A4 AGAA and A4 ACAA was compared (Fig. 7). Interestingly, the ability of the enzyme to discriminate between G and C at that position was considerably diminished in the K140M mutant. By contrast, the S32A mutation increased the preference for G over C at position 2, as did an S32D or S32E substitution. Mutating Arg-27 or Val-137 had a more modest effect on the sequence bias of EcRppH, whereas the inactivity of the F139A mutant precluded a meaningful assessment of its specificity. Unexpectedly, the preference of the enzyme for G at position 2 was also enhanced by replacing the Mg²⁺-coordinating residue Glu-120 with alanine.
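The specificity comparison for A4 AGAA and A4 ACAA rests on the ratio of triphosphate remaining on the two substrates over time, with a more negative slope indicating a stronger preference for A4 AGAA. A minimal Python sketch of that calculation is shown below; the function names and all numerical values are hypothetical and are not measurements from this study.

```python
def remaining_ratio(frac_agaa, frac_acaa):
    """Ratio of triphosphate remaining on A4 AGAA versus A4 ACAA at each time point."""
    return [g / c for g, c in zip(frac_agaa, frac_acaa)]


def initial_slope(times, values):
    """Slope between the first two time points (change in ratio per minute)."""
    return (values[1] - values[0]) / (times[1] - times[0])


times = [0, 1, 2, 4, 8]  # min

# Hypothetical fractions of triphosphate remaining (already normalized to fluorescence)
agaa = [1.00, 0.70, 0.50, 0.25, 0.06]
acaa = [1.00, 0.90, 0.80, 0.65, 0.42]

ratio = remaining_ratio(agaa, acaa)
slope = initial_slope(times, ratio)  # negative when A4 AGAA reacts faster
```

Because four times more enzyme was added to A4 ACAA than to A4 AGAA in these experiments, a slope of zero in such a plot would still correspond to a residual preference for A4 AGAA.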

DISCUSSION
Both E. coli and B. subtilis can degrade some but not all mRNAs by a 5′-end-dependent mechanism in which the triphosphorylated 5′ end of primary transcripts is converted by RppH to a monophosphate amenable to ribonuclease binding. The specificity of this pathway for a subset of messages has been a subject of considerable interest. However, apart from the requirement for an unpaired 5′ terminus, nothing was known about the features of E. coli transcripts that govern their susceptibility to degradation by this mechanism. Here, we have shown that pyrophosphate removal by purified EcRppH requires at least two unpaired nucleotides at the 5′ end and prefers three or more, exactly as has been reported for BsRppH (11). However, unlike BsRppH, which has a strict requirement for G at the second position of its substrates (11,12), EcRppH can act on unpaired 5′ ends with any nucleotide sequence. Nevertheless, it has a modest preference for A over G at position 1 and a greater preference for a purine at position 2.
FIGURE 6. Effect of mutations on the activity of EcRppH. A, mutation of Mg²⁺-coordinating glutamate residues. The catalytic activity of EcRppH bearing an alanine substitution for Glu-53, Glu-56, Glu-57, or Glu-120 was compared with that of wild-type EcRppH (WT) by monitoring phosphate removal from A4 AGAA, as in Fig. 2. B, mutation of residues that contact the base of the second RNA nucleotide. The catalytic activity of EcRppH bearing a mutation of Arg-27, Ser-32, Val-137, Phe-139, or Lys-140 was compared. The standard error of each measurement is reported in supplemental Table 1.
FIGURE 7. Effect of mutations on the specificity of EcRppH. A, representative gel images. The substrate specificity of various EcRppH mutants was compared with that of wild-type EcRppH (WT) by monitoring phosphate removal from A4 AGAA and A4 ACAA, as in Fig. 2. To achieve greater balance in the reaction rates, four times more EcRppH was added to A4 ACAA than to A4 AGAA. Although both radioactivity and fluorescence were measured, only the former is shown in the gel images. B, graph. After normalizing radioactivity to fluorescence and then averaging data from multiple experiments, the ratio of triphosphate remaining on A4 AGAA versus A4 ACAA was calculated at each time point and plotted. The standard error of each measurement is reported in supplemental Table 1. A more negative slope indicates a greater preference for A4 AGAA over A4 ACAA. Because more EcRppH was added to A4 ACAA than to A4 AGAA, even an initial slope of 0 (horizontal line) signifies a residual preference for the latter substrate. Red zone, enhanced specificity mutants; green zone, diminished specificity mutants.
The greater promiscuity of EcRppH in comparison with BsRppH is consistent with the smaller number and more flexible hydrogen bonding potential of the EcRppH side chains that interact with the edges of the base at position 2 (20). When bound to either enzyme, the planar surfaces of this base are sandwiched between hydrophobic or cationic residues that are conserved or conservatively substituted in their respective orthologs (Arg-27, Val-137, and Phe-139 in EcRppH; Val-88, Ile-95, Phe-137, and Ile-138 in BsRppH), and a lysine residue (Lys-140 in EcRppH; Lys-97 in BsRppH) forms hydrogen bonds with O6 and N7 of guanine (Fig. 8) (12,20). Our data show that all four of these EcRppH residues contribute to enzyme activity, especially Phe-139. Furthermore, Lys-140 is important for the ability of EcRppH to discriminate between G and C at position 2, a finding consistent with the more favorable interaction of the positively charged side chain of this lysine residue with the carbonyl oxygen (O6) and imino nitrogen (N7) of guanine than with the amino group (N4) of cytosine. Besides Lys-140, the only other EcRppH residue that interacts with the edge of this base is Ser-32, which does so by accepting a hydrogen bond from the N2 amino group of guanine (Fig. 8). Because the hydroxyl-bearing side chain of this serine residue can either accept or donate hydrogen bonds, it has the potential to accommodate any of the four possible RNA bases. Changing Ser-32 either to alanine (whose side chain cannot hydrogen bond) or to aspartate or glutamate (both obligate hydrogen bond acceptors) increases the preference of EcRppH for G over C at position 2, possibly by preventing the protein from donating a hydrogen bond to either the carbonyl oxygen (O2) or imino nitrogen (N3) of cytosine and thereby eliminating all productive interactions with the edges of that base. In contrast, mutating EcRppH residues that engage the planar surfaces of the second base in cation-π or hydrophobic interactions (Arg-27 or Val-137, respectively) has only a small effect on substrate specificity.
Unlike EcRppH, the nucleotide-binding pocket of BsRppH contacts the edge of guanine via hydrogen bond interactions involving not only Lys-97 but also three other residues (Asp-6, Tyr-86, and Asp-141) that are strictly conserved, two of which (Asp-6 and Asp-141) have negatively charged side chains that can accept but not donate hydrogen bonds (12). The number and specificity of these interactions likely explain the inability of BsRppH to tolerate other nucleotides at this position.
The importance of the four glutamate residues that coordinate the three Mg²⁺ ions in the catalytic center of EcRppH has also been investigated. Although the three Mg²⁺-coordinating glutamates that are part of the sequence motif characteristic of every Nudix hydrolase (Glu-53, Glu-56, and Glu-57) are all essential for activity, a fourth Mg²⁺-coordinating glutamate residue (Glu-120) is not, even though a fourth Mg²⁺-coordinating glutamate (Glu-115) is similarly situated in the catalytic center of BsRppH, and a homologous glutamate residue is reportedly important for the hydrolysis of Ap4A by an EcRppH homolog present in Lupinus angustifolius (12,21). In view of crystallographic evidence that the two Mg²⁺ ions coordinated by Glu-120 are critical for enzyme function (20), it seems likely that their productive binding does not require this glutamate residue at the Mg²⁺ concentration used in these experiments, a postulate consistent with the absence of an additional Mg²⁺-coordinating glutamate residue in some Nudix hydrolases (22,23). Interestingly, an E120A substitution enhances the preference of EcRppH for G over C at position 2, presumably by an indirect mechanism because of the large distance between them.
Nearly all bacterial species contain multiple Nudix hydrolases with diverse functions. Although a Nudix hydrolase with RNA pyrophosphohydrolase activity has so far been authenticated only in the Gammaproteobacterium E. coli, the Deltaproteobacterium Bdellovibrio bacteriovorus, and the Firmicute B. subtilis (5,6,23), there is reason to believe that such enzymes are widespread in bacteria, as 5′-end-dependent RNA degradation has been observed in other species (24-26). However, the divergent evolution of these enzymes often makes them difficult to recognize solely on the basis of sequence homology, a complication that has hindered their discovery and characterization. Nevertheless, sequence homologs of EcRppH are readily identifiable in all classes of Proteobacteria except Deltaproteobacteria, including many important pathogens such as Yersinia pestis, Vibrio cholerae, Neisseria meningitidis, Bordetella pertussis, and Helicobacter pylori. Interestingly, they are also present in flowering plants, where they presumably originated by horizontal gene transfer from colonizing or infecting bacteria (most likely Alphaproteobacteria, judging by homology) and where their biological role is unknown.
Because four of the five EcRppH residues that surround the base of nucleotide 2 are generally conserved (Arg-27, Phe-139, and Lys-140) or conservatively replaced (Val-137) in its orthologs, these enzymes might be expected to share the relaxed sequence specificity of EcRppH. However, the identity of the fifth such residue, corresponding to Ser-32, is highly variable in proteobacterial homologs of EcRppH; although it is usually serine, it is often replaced by alanine and sometimes by a residue with a large hydrophobic side chain or a side chain that is negatively or positively charged. Therefore, our evidence that S32A, S32D, and S32E substitutions reduce the promiscuity of EcRppH raises the possibility that the sequence specificity of many of its orthologs may be more pronounced. Selectivity for the nucleotide at position 2 may also be affected by the protein loop in which Ser-32 is embedded. Anchored by the conserved residues Arg-27 and Trp-33, this loop is highly variable in length and sequence, which may influence the interaction of EcRppH homologs with the second nucleotide of potential substrates.
Orthologs of BsRppH are readily identifiable throughout the order Bacillales (e.g. in Bacillus anthracis, Staphylococcus aureus, and Listeria monocytogenes) but not in Lactobacillales or other Firmicutes. Among the Bacillales homologs, the strict conservation of the four BsRppH residues that contact the guanine at position 2 of substrate RNAs (Asp-6, Tyr-86, Lys-97, and Asp-141) suggests that the stringent sequence specificity of BsRppH is also conserved. Although an ortholog of BsRppH is not recognizable in Lactobacillales, there nevertheless is evidence for 5′-end-dependent mRNA degradation in one such species, Streptococcus pyogenes, whose ska transcript can be stabilized by 5′-terminal base pairing (26). Therefore, Lactobacillales, which contain the 5′-monophosphate-dependent exonuclease RNase J, presumably also contain an RNA pyrophosphohydrolase, most likely a Nudix hydrolase with a divergent sequence.
In addition to BsRppH, B. subtilis appears to contain a second as yet unidentified RNA pyrophosphohydrolase that is indifferent to the identity of the second RNA nucleotide (6,11). Its presence is thought to broaden the subset of transcripts that are capable of being degraded by a 5′-end-dependent mechanism, which would otherwise be limited to those in which the second nucleotide is G. By contrast, there is as yet no evidence for an alternative RNA pyrophosphohydrolase in E. coli, where the greater promiscuity of EcRppH may be adequate to allow targeting of a sufficient variety of cellular transcripts. Whether the presence of an additional RNA pyrophosphohydrolase is generally correlated with the sequence specificity of RppH in other bacterial species remains to be determined.