The rluC Gene of Escherichia coli Codes for a Pseudouridine Synthase That Is Solely Responsible for Synthesis of Pseudouridine at Positions 955, 2504, and 2580 in 23 S Ribosomal RNA*

Escherichia coli ribosomal RNA contains 10 pseudouridines, one in the 16 S RNA and nine in the 23 S RNA. Previously, the gene for the synthase responsible for the 16 S RNA pseudouridine was identified and cloned, as was a gene for a synthase that makes a single pseudouridine in 23 S RNA. The yceC open reading frame of E. coli is one of a set of genes homologous to these previously identified ribosomal RNA pseudouridine synthases. In this work, the gene was cloned, overexpressed, and shown to code for a pseudouridine synthase able to react with in vitro transcripts of 23 S ribosomal RNA. Deletion of the gene and analysis of the 23 S RNA from the deletion strain for the presence of pseudouridine at its nine known sites revealed that this synthase is solely responsible in vivo for the synthesis of three of the nine pseudouridine residues, at positions 955, 2504, and 2580. Therefore, this gene has been renamed rluC. Despite the absence of one-third of the normal complement of pseudouridines, there was no change in the exponential growth rate in either LB or M-9 medium at temperatures ranging from 24 to 42 °C. From this work and our previous studies, we have now identified three synthases that account for 50% of the pseudouridines in the E. coli ribosome.

Pseudouridine (⌿),¹ the 5-ribosyl isomer of uridine, occurs in rRNA (1), tRNA (2), and small nuclear and nucleolar RNA (3, 4) but not in mRNA or viral genomic RNAs. All the RNAs in which ⌿ is found share a common characteristic, namely a tertiary structure that must be maintained for proper function. ⌿ is made after the polynucleotide chain has been formed, by an enzyme-catalyzed but energy-independent isomerization of uridine (reviewed in Ref. 5). A considerable amount of ⌿ is found in ribosomal RNA, approaching 8% of the uridines in mammals (6). The number and distribution of ⌿ differ between the two large rRNAs. In small subunit (SSU) RNA, the number varies from 0 or 1 (yeast mitochondria) to 1 (Escherichia coli) to ~40 (mammals), and the ⌿ residues are distributed throughout the molecule. In the large subunit (LSU) RNA, the number also varies widely, from 1 (yeast mitochondria) to 4-9 (prokaryotes) to 55-57 (mammals), but the distribution is conserved in all organisms, being confined to three defined secondary structural regions at or near the peptidyl transferase center (reviewed in Ref. 5). In E. coli, the organism studied in this work, there are 10 ⌿ residues, one in the SSU RNA at position 516 (7) and nine in the LSU RNA at positions 746, 955, 1911, 1915, 1917, 2457, 2504, 2580, and 2605 (8, 9). ⌿1915 is further modified by methylation at N3 (10).
Despite the specificity implicit in the conservation of geographic localization in the LSU to the functionally important peptidyl transferase center, there is no known role for ⌿ in the ribosome. To address this issue, we have embarked on a program to identify all of the synthases responsible for formation of the 10 ⌿ in E. coli rRNA with the aim of deleting specific ⌿ residues by inactivating the genes for the corresponding synthases. So far, rsuA, the gene for the synthase responsible for forming ⌿516 in 16 S RNA (11), and rluA, the gene for the synthase that makes ⌿746 in LSU RNA (12), have been identified. In this work we show that the gene yceC, renamed rluC, makes the synthase responsible for formation of ⌿ at positions 955, 2504, and 2580. Deletion of this gene and thus the absence of these three ⌿ residues has no detectable effect on the exponential growth rate in either rich or minimal glucose medium at temperatures ranging from 24 to 42°C.

EXPERIMENTAL PROCEDURES
Gene Deletion-The yceC gene was deleted by the method of Hamilton et al. (13). The insert cloned into the KpnI and XbaI sites of pMAK705 was prepared by PCR as described by Nelson and coworkers (see Fig. 2 in Ref. 14). It contained 841 bases 5′ to the AUG start and 870 bases 3′ to the UAA termination codon. Of the gene itself, 39 bases of the N-terminal portion and 49 bases of the C terminus were retained, and the remainder was replaced by the kanamycin resistance gene, obtained by PCR amplification from pUC4K (Amersham Pharmacia Biotech, catalog number 27-4958-01). The host strain for pMAK705 was MC1061 as described by Hamilton et al. (13) (17).
Rescue Plasmid-This plasmid (pLG338/yceC) was constructed by insertion into the SmaI site of pLG338 (18) of a PCR-amplified fragment of DNA starting 116 bases 5′ to the AUG initiator of yceC and ending 124 nucleotides 3′ to the termination codon. Insertion at this site inactivates the KanR gene of the plasmid. The construct includes the 71-base promoter, 45-base spacer, 960-base gene, 77-base spacer, 44-base termination sequence, and 3 bases beyond. pLG338 also carries a tetracycline resistance gene. Putative promoter and terminator sequences were identified by examination of the upstream and downstream regions. For the promoter, a web site was used.² The terminator sequence was located visually and verified by folding using M-fold version 3.0 (19).³
Other Methods and Materials-Transformants of yceC-deleted SJ134 and of wild type and yceC-deleted MG1655 with pLG338 and pLG338/yceC were selected by tetracycline resistance. All growth media contained 10 μg/ml tetracycline to retain the plasmid in the tetracycline-sensitive host cells. ⌿ sequencing was performed as described previously (8, 20). For the growth experiments, overnight cultures at 37 °C in the medium to be tested were diluted 50-fold (minimal medium) or 100-fold (rich medium) and placed at the testing temperature. Cell density was monitored at 600 nm.
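As a consistency check on the rescue plasmid insert described under "Rescue Plasmid," the element lengths listed there can be summed and compared with the stated 5′ and 3′ flanks (116 and 124 bases). The following is only an illustrative sketch; the element labels and function names are ours and are not part of the original methods.

```python
# Element lengths (in bases) of the pLG338/yceC insert, in 5' to 3' order,
# as listed in the text. The labels are illustrative names chosen here.
ELEMENTS = [
    ("promoter", 71),
    ("upstream spacer", 45),
    ("yceC gene", 960),
    ("downstream spacer", 77),
    ("terminator", 44),
    ("bases beyond terminator", 3),
]


def insert_length(elements):
    """Total length of the insert in bases."""
    return sum(length for _, length in elements)


def flank_lengths(elements, gene_label="yceC gene"):
    """Return (bases upstream of the gene, bases downstream of the gene)."""
    labels = [label for label, _ in elements]
    i = labels.index(gene_label)
    upstream = sum(length for _, length in elements[:i])
    downstream = sum(length for _, length in elements[i + 1:])
    return upstream, downstream


if __name__ == "__main__":
    print("insert length:", insert_length(ELEMENTS))
    print("5' and 3' flanks:", flank_lengths(ELEMENTS))
```

The element list sums to 116 bases upstream and 124 bases downstream of the 960-base gene, in agreement with the fragment boundaries given in the text.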

RESULTS
Identification of Putative ⌿ Synthase Genes in E. coli-From the amino acid sequences of RsuA and RluA, as well as those for two tRNA ⌿ synthases, TruB (21) and TruA (22), Koonin (23) and Gustafsson et al. (24) were able to identify putative ⌿ synthase ORFs by searching for sequence motifs. Five ORFs were identified in E. coli (Table I). A sixth ORF, ymfC, was found subsequently when the entire E. coli genome sequence became available. The ORFs could be divided into four subfamilies based on their sequence motifs (23, 24) as indicated in Table I. By chance, the four initially identified genes each define one of the four subclasses. Note, however, that the six putative synthase genes subsequently identified are all in class A or B. The predicted protein properties and the site(s) of ⌿ formation recognized by the synthases are also indicated in Table I when known.
yceC-Because homologs to yceC are common in other species (23, 24), we selected this ORF for our initial study. The gene was cloned, the protein was overexpressed, and ⌿ formation activity was detected using 5-[3H]uridine-labeled 23 S RNA as an in vitro substrate, confirming that this gene product was indeed a ⌿ synthase. To determine the specificity of the synthase under in vivo conditions, the gene was deleted by insertion of the kanamycin resistance gene (13). Deletion of the gene was verified by PCR amplification from the N and C termini of the yceC gene in the chromosomal DNA of the deletion mutant. The wild type control gave the expected 1.0-kb band, whereas the product from the mutant, which should contain the KanR insert, was 1.4 kb in size, as expected. Further evidence was obtained by amplification from the N and C termini of the KanR gene. The mutant gave the expected 1.3-kb band, but nothing was obtained from the wild type. Preliminary ⌿ sequence analysis showed the absence of ⌿ 955, 2504, and 2580 as a result of this single gene deletion.
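The expected PCR band sizes quoted above follow from the construct dimensions given under "Experimental Procedures." The short sketch below reproduces that arithmetic. It assumes, for illustration only, that the primers sit at the extreme ends of the yceC gene and that the KanR insert is about 1300 bases long (a value approximated from the reported 1.3-kb band, not a figure stated in the text); all names are ours.

```python
GENE_LENGTH = 960    # yceC gene, bases (from the rescue plasmid description)
RETAINED_5 = 39      # bases of the N-terminal portion retained
RETAINED_3 = 49      # bases of the C terminus retained
KAN_INSERT = 1300    # ASSUMPTION: approximated from the reported 1.3-kb band


def wild_type_product(gene_length=GENE_LENGTH):
    """Product size (bases) when amplifying the intact gene end to end."""
    return gene_length


def deletion_product(retained_5=RETAINED_5, retained_3=RETAINED_3,
                     kan_insert=KAN_INSERT):
    """Product size (bases) when the gene interior is replaced by the insert."""
    return retained_5 + kan_insert + retained_3


def to_kb(bases):
    """Convert bases to kilobases, rounded to one decimal place."""
    return round(bases / 1000, 1)


if __name__ == "__main__":
    print("wild type:", to_kb(wild_type_product()), "kb")
    print("deletion mutant:", to_kb(deletion_product()), "kb")
```

Under these assumptions the wild type product rounds to 1.0 kb and the mutant product to 1.4 kb, matching the bands reported.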
For further experiments, the deleted gene was transferred by P1 transduction (17) into strain SJ134. Mutant cells were then transformed with a low copy number plasmid, pLG338 (18), or a rescue plasmid, pLG338 containing the yceC gene inserted at its SmaI site, creating strains yceC− (pLG338) and yceC− (pLG338/yceC). The yceC gene insert included the putative yceC gene promoter and terminator elements, and there were no other apparent ORFs in this operon. Because pLG338 carries a tetracycline resistance gene, the plasmid-containing cultures were grown in the presence of tetracycline to retain the plasmid in its tetracycline-sensitive host strain.
³ www.ibc.wustl.edu/~zuker/rna.
Fig. 1 legend: The yceC-deleted strain SJ134 and the plasmids pLG338 (+) and pLG338 carrying the yceC gene (+/yceC) were prepared as described under "Experimental Procedures." Transformation of the yceC− strain with the plasmids, RNA preparation, and ⌿ sequencing were also done as described there. The five naturally occurring ⌿ sites monitored in this figure are indicated by arrows. RNA for ACUG sequencing lanes was from wild type SJ134 (left side) and a transcript (25)
Table I footnotes: Koonin (23). In this report, SfhB was YfiI, and YqcB was ECU29581_5. g Gustafsson et al. (24). In this report, SfhB was YfiI, and YqcB was f260. h Values in parentheses were obtained using the next downstream initiator AUG. There is uncertainty regarding the true start site (K. Rudd, personal communication).
Ribosomal RNA from the wild type and yceC-disrupted strain SJ134 and from the two transformed strains, yceC− (pLG338) and yceC− (pLG338/yceC), was isolated and sequenced for the presence of ⌿ at the nine known sites in 23 S RNA (Fig. 1). This figure shows clearly that of the five ⌿ shown, the deletion mutant is lacking ⌿ at 955, 2504, and 2580 but not at 2457 or 2605. Because the bands for 2457 and 2605 are found in the same lanes as 2504 and 2580, respectively, they serve as effective internal controls for the absence of bands at 2504 and 2580. The other four ⌿, at positions 746, 1911, 1915, and 1917, were present in the mutant as well as the wild type (data not shown). When the deletion strain was supplemented with the rescue plasmid pLG338/yceC, but not with pLG338 alone, all three ⌿ reappeared. We conclude that the yceC gene encodes a ⌿ synthase that can recognize these three sites in 23 S RNA and that no other synthase in the cell can take over this function. In view of this result, we have renamed the gene rluC, ribosomal large subunit pseudouridine synthase C.
The locations of the nine ⌿ sites in 23 S RNA and of the five sites monitored in Fig. 1 are shown in Fig. 2A on the secondary structure of 23 S RNA, and in Fig. 2B, the immediate vicinity of each of the three sites recognized by RluC is shown in expanded form. Clearly, the three sites are separated from each other and do not share any obvious common structural elements that would be suitable for recognition purposes.
Growth Rate of the Mutant Strain-Deletion of these three ⌿ residues from the ribosome was not lethal. To assess any more subtle metabolic defects, growth rates were measured at different temperatures in both rich and minimal glucose media (Table II). For this purpose, the rluC deletion was moved by P1 transduction into strain MG1655, the same strain whose genome was sequenced by Blattner et al. (16), to provide a well defined background. The MG1655 deletion strain was selected by its kanamycin resistance, and PCR amplification from the yceC termini confirmed the deletion (the expected 1.4-kb band was found instead of a 1.0-kb band for the wild type). Wild type and mutant MG1655 were transformed with both the rescue plasmid and its control. Exponential growth rates for all four strains are shown in Table II. Even though both rich and minimal media were tested over a range of temperature from 24 to 42°C, no significant change in growth rate was observed. Moreover, no instance was found where the maximum cell density at stationary phase was limited by the deletion mutation. We conclude that at least under the described conditions, there is no effect of the loss of three of the nine ⌿ present in the large subunit of the ribosome.
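For readers who wish to reproduce this type of comparison, an exponential growth rate can be estimated from cell density readings at 600 nm by a linear least-squares fit of ln(OD600) against time. The sketch below is a generic illustration of that calculation, not necessarily the procedure used to generate Table II; the function name and the readings are hypothetical.

```python
import math


def doubling_time(times_min, od600):
    """Estimate the doubling time (minutes) from exponential-phase readings.

    Fits ln(OD600) = a + k * t by ordinary least squares and returns
    ln(2) / k, where k is the specific growth rate per minute.
    """
    if len(times_min) != len(od600) or len(times_min) < 2:
        raise ValueError("need at least two paired time/OD readings")
    log_od = [math.log(v) for v in od600]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(log_od) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y)
              for t, y in zip(times_min, log_od))
    growth_rate = sxy / sxx
    return math.log(2) / growth_rate


if __name__ == "__main__":
    # HYPOTHETICAL readings constructed to double every 30 minutes;
    # these are not data from Table II.
    times = [0, 30, 60, 90, 120]
    ods = [0.05 * 2 ** (t / 30) for t in times]
    print(round(doubling_time(times, ods), 1), "min")
```

Only readings from the exponential phase should be passed to such a fit; points from the lag or stationary phase would bias the estimated rate.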

DISCUSSION
Specificity-In this work, we have shown that yceC is indeed a gene for a ⌿ synthase and that its gene product is the only synthase capable of forming ⌿955, ⌿2504, and ⌿2580 in E. coli 23 S RNA in vivo. In view of its specificity, the gene has been renamed rluC. Because this work was done in vivo in the presence of all of the other ⌿ synthases of the cell, it is clear that none of the other synthases share the specificity for positions 955, 2504, and 2580 with RluC. However, the reverse is not necessarily the case. RluC might share with another synthase the ability to recognize one or more of the remaining ⌿ sites. This can only be determined with certainty when all of the other rRNA ⌿ synthases are identified and deleted. RluC does not share the ability to form ⌿516 in 16 S RNA with RsuA (Table I), because deletion of the rsuA gene blocks formation of ⌿516 in vivo.⁴ A possible dual specificity of recognition with one of the sites in tRNA, like that found for RluA (12), has not yet been tested.
RNA Recognition Site-Although all three sites for ⌿ formation are well separated in the primary structure of 23 S RNA, 2504 and 2580, but not 955, approach each other when the RNA is folded into its secondary structure (Fig. 2A). However, in the tertiary structure existing in the ribosome, ⌿955 is probably not far away either, because cross-linking results place A960 next to C2475 (28). These results suggest a possible mode of recognition of these three sites, which otherwise appear dissimilar in both sequence and secondary structure. Indeed, the only common structural element is that all three Us destined to be isomerized to ⌿ are followed by a G residue. We postulate that any UG sequence within a given short distance of the catalytic center will be recognized. Thus if there is a tertiary structure of the 23 S RNA at some stage of ribosome biogenesis in which U955 is sufficiently close to the recognition site of the synthase, which must also be near U2504 and U2580, it might be possible for the synthase to catalyze ⌿ formation at all three sites. In this regard it should be noted that the closest other UG sequence to any of the three sites is 11 residues away at U2493.
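The observation about neighboring UG sequences (U2493 lies 2504 − 2493 = 11 residues from U2504) is the kind of check that can be run on any RNA sequence. The sketch below shows one way to do so. It operates on a short toy sequence because the 23 S RNA sequence is not reproduced here; the function names, the toy sequence, and the `offset` parameter (the coordinate assigned to the first residue of a fragment) are all illustrative assumptions.

```python
def ug_positions(seq, offset=1):
    """Return positions of every U that is immediately followed by G.

    Positions are numbered from `offset`, so a fragment of a longer RNA
    can be reported in the coordinates of the full-length molecule.
    """
    seq = seq.upper().replace("T", "U")
    return [i + offset
            for i in range(len(seq) - 1)
            if seq[i] == "U" and seq[i + 1] == "G"]


def nearest_other_ug(seq, site, offset=1):
    """Return (position, distance) of the UG closest to `site`, excluding
    `site` itself, or None if the sequence holds no other UG."""
    others = [p for p in ug_positions(seq, offset) if p != site]
    if not others:
        return None
    best = min(others, key=lambda p: abs(p - site))
    return best, abs(best - site)


if __name__ == "__main__":
    # TOY sequence for illustration only; it is not part of 23 S RNA.
    toy = "ACUGCCCAAUGACCUGA"
    print(ug_positions(toy))
    print(nearest_other_ug(toy, site=10))
```

When two UG sequences are equidistant from the site, this sketch reports the one with the lower coordinate. A search of this kind considers only the primary sequence and therefore cannot capture proximity in the folded ribosome, which is the essential feature of the recognition model proposed here.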
An alternative way to account for this unusual specificity would be for the synthase to possess separate polypeptide binding domains, one specific for each site, each of which would bring the desired U residue into the catalytic center upon binding.
Function of ⌿-No effect on exponential phase growth rate was found when ⌿955, 2504, and 2580 were absent, even when both medium and temperature were varied. Other growth parameters such as survival in stationary phase and the length of the lag phase were not examined in detail, but preliminary observations suggest that there is little or no effect (data not shown). In the absence of any such clues as to ⌿ function, the ribosomes from ⌿-deficient cells will need to be examined for their ability to support the partial reactions of protein synthesis in vitro. This has been done previously to study the effects of rRNA mutations on ribosome function (29-31).
Failure to find a physiological effect upon ⌿ deletion is disappointing but not surprising. Neither the rsuA deletion strain mentioned above nor a strain with the rluA gene deleted showed any major metabolic defects,⁴ and deletion of various ⌿ residues from yeast SSU and LSU RNAs by deletion of their guide RNAs also had no effect (reviewed in Ref. 5).
Nevertheless, the geographic juxtaposition of at least two of these ⌿ residues, namely 2504 and 2580, to sites known to be functionally important in peptide bond formation (reviewed in Ref. 6) is a strong indication that these ⌿ residues play an important, if still undiscovered, role. This conclusion is reinforced by the recent finding that of the four 23 S RNA residues whose chemical modification interferes with binding of tRNA to the ribosomal P site, U2506 and U2585 are within 2 and 5 bases, respectively, of ⌿2504 and ⌿2580 (35).
Comparison with Other Known ⌿ Synthases-Table III summarizes the properties of all known cloned ⌿ synthases and their genes. Table III includes synthases for E. coli and Bacillus subtilis rRNA and for E. coli and yeast tRNA. It is clear that the specificity of each synthase varies from being limited to one kind of RNA at one site to multiple sites in one class of RNA or even sites in different kinds of RNA. To help in categorizing these synthases and those still to be discovered, four specificity classes have been defined. Class I specificity is most stringent, in that only a single specific site in one kind of RNA is recognized. RsuA, RluB, TruB, and its homolog in yeast, Pus4p, are examples of this class. Class II defines those synthases that modify any U residue within a span of 5-6 nucleotides, but only in one kind of RNA. TruA and its yeast homolog, Pus3p, are examples. In class III, the specificity is relaxed even more, and distant sites become recognizable, although again only in the same class of RNA. Pus1p in yeast is an example of this type because it can recognize up to eight different sites in tRNA, and RluC is another example. Although both synthases recognize sites spread some distance apart, the maximum distance in the case of Pus1p is 41 residues, whereas for RluC it is 1625 nucleotides. Class IV specificity is reserved for those synthases that recognize specific single sites in more than one class of RNA, a property we have termed "dual specificity" (12). So far, RluA is the only member of this class, but this may be at least in part because testing for dual specificity is not straightforward: it requires a predetermination of which site in which RNA to test. The dual specificity of RluA was discovered only serendipitously.
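The four specificity classes can be expressed as a simple decision rule over the set of sites a synthase modifies. The sketch below is one possible encoding and is offered only as an illustration. The threshold `SHORT_SPAN` (6 nucleotides, taken as the upper end of the "5-6 nucleotides" criterion), the function and class names, and the tRNA positions in the examples are assumptions introduced here; only the rRNA positions 516, 746, 955, 2504, and 2580 come from the text.

```python
from enum import Enum


class SpecificityClass(Enum):
    I = "single site in one kind of RNA"
    II = "sites within a short span in one kind of RNA"
    III = "distant sites in one kind of RNA"
    IV = "sites in more than one kind of RNA (dual specificity)"


# ASSUMPTION: the upper end of the "5-6 nucleotides" span is used as cutoff.
SHORT_SPAN = 6


def classify(sites, short_span=SHORT_SPAN):
    """Assign a specificity class to a synthase.

    `sites` is an iterable of (rna_kind, position) pairs, one per
    pseudouridine formed by the synthase.
    """
    sites = list(sites)
    if not sites:
        raise ValueError("at least one site is required")
    kinds = {kind for kind, _ in sites}
    if len(kinds) > 1:
        return SpecificityClass.IV
    positions = [pos for _, pos in sites]
    if len(positions) == 1:
        return SpecificityClass.I
    # Number of nucleotides covered from the first site to the last.
    span = max(positions) - min(positions) + 1
    if span <= short_span:
        return SpecificityClass.II
    return SpecificityClass.III


if __name__ == "__main__":
    rsua = [("16S rRNA", 516)]
    rluc = [("23S rRNA", 955), ("23S rRNA", 2504), ("23S rRNA", 2580)]
    # The tRNA positions below are PLACEHOLDERS for illustration only.
    rlua = [("23S rRNA", 746), ("tRNA", 32)]
    clustered = [("tRNA", 38), ("tRNA", 39), ("tRNA", 40)]
    for name, s in [("RsuA", rsua), ("RluC", rluc),
                    ("RluA", rlua), ("clustered example", clustered)]:
        print(name, classify(s).name)
```

With these inputs, RsuA falls in class I, RluC in class III, and RluA in class IV, as described above. The rule checks for more than one kind of RNA first, so a synthase with dual specificity is placed in class IV regardless of how its sites are spaced within each RNA.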
Further Studies-The intriguing RNA recognition properties of the RluC synthase, coupled with the ease of overexpression of the recombinant enzyme with demonstrable ⌿ formation activity of appropriate specificity, make this protein a logical candidate for x-ray crystallographic analysis. Major questions are how RNA recognition at three such disparate sites is accomplished and what the mechanism of U isomerization to ⌿ is. Such studies, in collaboration with R. Fenna of this department, are underway.