Identification of Four Genes Necessary for Biosynthesis of the Modified Nucleoside Queuosine*

Queuosine (Q) is a hypermodified 7-deazaguanosine nucleoside located in the anticodon wobble position of four amino acid-specific tRNAs. In bacteria, Q is produced de novo from GTP via the 7-deazaguanosine precursor preQ1 (7-aminomethyl-7-deazaguanine) by an uncharacterized pathway. PreQ1 is subsequently transferred to its specific tRNA by a tRNA-guanine transglycosylase (TGT) and then further modified in situ to produce Q. Here we use comparative genomics to implicate four gene families (best exemplified by the B. subtilis operon ykvJKLM) as candidates in the preQ1 biosynthetic pathway. Deletions were constructed in genes for each of the four orthologs in Acinetobacter. High pressure liquid chromatography analysis showed that the Q nucleoside was absent from the tRNAs of each of the four deletion strains. Electrospray ionization mass spectrometry confirmed the absence of Q in each mutant strain. Finally, introduction of the Bacillus subtilis ykvJKLM operon in trans complemented the Q deficiency of the two deletion mutants that were tested. Thus, the products of these four genes (named queC, -D, -E, and -F) are essential for the Q biosynthetic pathway.

Nucleoside modification typically occurs in ~10% of the nucleosides of a particular tRNA but can involve as many as 25% of the nucleosides for a specific tRNA (1). Over 80 modified nucleosides have been characterized (1), many of which are phylogenetically conserved. The nature of nucleoside modification varies from simple methylation of the base or ribose ring to extensive "hypermodification" of the canonical bases. The latter involves multiple enzymatic steps and can result in substantive structural changes (2). Queuosine (Q)¹ is an example of a highly modified nucleoside located in the anticodon wobble position 34 of tRNAs specific for Tyr, His, Asp, and Asn (3). With few exceptions (such as yeast and mycoplasma), it is widely distributed in most prokaryotic and eukaryotic phyla (4). Q is based on a very unusual 7-deazaguanosine core, which is further modified by addition of a cyclopentenediol ring (5,6). The Q modification has long been known and has been implicated in a number of disparate physiological phenomena, such as eukaryotic cell proliferation and differentiation (7-10), tyrosine biosynthesis (11), and bacterial virulence (12). However, despite the presence of Q in the tRNA anticodon loop, no definitive role for Q in translation has been established (13).
Unlike eukaryotes, which import the queuine base, bacteria synthesize the Q modification de novo from GTP (14). The first stage of synthesis gives 7-cyano-7-deazaguanosine (preQ0), which is then further modified into 7-aminomethyl-7-deazaguanine (preQ1) (Fig. 1). The preQ1 base is subsequently transferred to the appropriate tRNA in a guanine-exchange reaction catalyzed by a tRNA-guanine transglycosylase (TGT) (15). It is then further modified in situ to give Q (16). A second 7-deazaguanosine tRNA modification, archaeosine, is found in all archaeal tRNAs at position 15 of the D-loop (17). Archaeosine is also synthesized via preQ0, the substrate for the archaeal TGT enzyme (18). Although the TGT-dependent guanine exchange reactions are well characterized for both Q and archaeosine biosynthesis, little is known about the enzymes and intermediates involved in replacing N-7 of the purine ring with a carbon atom to produce the 7-deazaguanosine bases preQ0 and preQ1. Elucidation of genes involved in 7-deazaguanosine biosynthesis could potentially lead to the discovery of novel metabolites and enzyme activities.
We report here the identification of four new genes involved in the biosynthesis of the 7-deazaguanosine bases in bacteria. No definitive phenotype or genetic selection exists for the Q modification. Therefore, comparative genomics was used to identify candidate genes. Three criteria were used: phylogenetic distribution of tRNA modification activities, clustering of tRNA modification genes, and prediction of catalytic mechanism. This approach has been successful in finding other tRNA modification genes such as those encoding the tRNA dihydrouridine synthase that produces 5,6-dihydrouridine (19). Once promising gene candidates had been identified, we took advantage of an efficient gene deletion strategy that makes use of the competence properties of Acinetobacter calcoaceticus.² This organism is naturally competent and is efficient at homologous recombination (20,21), thus allowing rapid construction of mutant alleles. The candidate 7-deazaguanine biosynthesis genes were independently deleted from the Acinetobacter genome, and the bulk tRNA extracted from these deletion strains was examined for the loss of Q. Two of the deletion mutants were then tested for complementation of Q deficiency by the introduction of the orthologous B. subtilis ykvJKLM operon.
* This work was supported by National Institutes of Health Grant GM23562, by National Science Foundation Grant MCB-0128901, and by a fellowship from the National Foundation for Cancer Research. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
¹ The abbreviations used are: Q, queuosine; preQ0, 7-cyano-7-deazaguanosine; preQ1, 7-aminomethyl-7-deazaguanine; TGT, tRNA-guanine transglycosylase; HPLC, high pressure liquid chromatography; ESI-MS, electrospray ionization mass spectrometry; COG, clusters of orthologous groups.

EXPERIMENTAL PROCEDURES
Strains and Growth Conditions-All genetic manipulations of Acinetobacter strains were performed at 30°C in LB liquid media or on LB-agar plates supplemented, when appropriate, with 25 μg/ml kanamycin. Growth of constructed bacterial strains for bulk tRNA purification was performed at 37°C in LB liquid media with 50 μg/ml kanamycin.
Strain Construction-Table I lists all mutant strains made in this study. To generate chromosomal gene deletions in the wild-type Acinetobacter strain ADP1 (22), a positive/negative selection cassette was constructed as described elsewhere in detail.² (Methods and tools are available upon request from vcrecy@scripps.edu.) DNA fragments encoding the ykvJKLM homologs in Acinetobacter were identified by BLAST (23) and used to design primers (see Supplemental Material, Table II) that allowed the precise replacement of the four open reading frames by either a Km/tdk or a Km/SacB cassette. For each mutant, three independent PCR fragments were amplified and spliced together: one covered a 1000-base region upstream of the target gene, one covered the selective marker cassette, and one covered a 1000-base region downstream of the target gene. The PCR oligonucleotides were designed so that the three fragments could be spliced together (24) in a PCR. The DNA product of this PCR was used directly to transform Acinetobacter by recombination (21). All potential recombinant deletion mutants were screened by PCR (using the outside flanking primers) for products of the size expected for the correct recombination inserts.
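The layout of the three-fragment deletion construct described above can be sketched in a few lines of Python. This is only an illustration of the fragment geometry (1000 bases upstream, cassette, 1000 bases downstream); the function name, the 0-based half-open coordinates, and the toy sequences are assumptions made for the example and are not taken from the study, and primer overlap design is not modeled.

```python
def build_deletion_construct(genome, gene_start, gene_end, cassette, flank=1000):
    """Return upstream flank + cassette + downstream flank for a target gene.

    Coordinates are 0-based and half-open (an assumption of this sketch):
    the gene occupies genome[gene_start:gene_end] on a linear sequence.
    """
    if gene_start < flank or gene_end + flank > len(genome):
        raise ValueError("not enough flanking sequence around the target gene")
    upstream = genome[gene_start - flank:gene_start]    # 1000-base upstream region
    downstream = genome[gene_end:gene_end + flank]      # 1000-base downstream region
    # The open reading frame itself is replaced by the selective marker cassette.
    return upstream + cassette + downstream


# Toy sequences (hypothetical, chosen only so the three parts are easy to tell apart).
toy_genome = "A" * 1200 + "G" * 300 + "C" * 1200
toy_cassette = "T" * 50
construct = build_deletion_construct(toy_genome, 1200, 1500, toy_cassette)
print(len(construct))
```

In this toy case the construct length is the sum of the two flanks and the cassette; in practice the size of the PCR product obtained with the outside flanking primers serves the same diagnostic purpose.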
The strain PS6434 carrying PlipA SpecR SacB² was used to insert the ykvJKLM operon ectopically into the lipAB operon. By using the appropriate oligonucleotides (see Supplemental Material, Table II), a three-way PCR product was produced and used to transform strain PS6434. The transformed cells were then plated on LB plates (without NaCl) supplemented with 6% sucrose at 30°C. The sucrose-resistant transformants were screened for the loss of spectinomycin resistance. The sucrose-resistant, spectinomycin-sensitive colonies were analyzed by PCR to check for correct integration of the ykvJKLM operon to give strain PS6437. The ykvJ::Km and ykvL::Km alleles were then introduced into strain PS6437 by transformation with the appropriate amplified deletion-cassette PCR products (used previously to produce strain PS6381 (ykvJ::Km) and strain PS6378 (ykvL::Km)) to give strains PS6447 and PS6449, respectively.
Purification of Bulk tRNA and Enzymatic Digestions-Each deletion strain (4 liters) was grown overnight at 37°C in 6-liter flasks of LB media supplemented with kanamycin (50 μg/ml), harvested, and then stored at −20°C. The cell pellets were defrosted, resuspended in 20 ml of resuspension buffer (0.3 M NaOAc, 10 mM EDTA, pH 4.3), and extracted with buffer-saturated phenol, pH 4.3 (Sigma). The aqueous layer was collected, and the RNA was precipitated with ethanol. The bulk tRNA was then purified on Nucleobond AX-500 columns (Clontech, Palo Alto, CA) according to the manufacturer's protocol and precipitated with isopropanol.
HPLC and Electrospray Mass Spectrometry Analysis of Digested Bulk tRNA-To compare the nucleoside constituents of the bulk tRNA purified from wild-type strain ADP1, single-gene-deletion strains, and complemented deletion strains, samples of each purified bulk tRNA (1 mg) were enzymatically hydrolyzed (25), and the digestion products were used directly for HPLC analysis (26). HPLC was performed (typically with 200 μg of tRNA nucleoside per run) with a flow rate of 1.5 ml/min with 5 mM NH4OAc, pH 6.0, for 60 min and a gradient of 0-20% (40:60, v/v) ACN/H2O on a Supelco C18 reverse-phase analytical column (Sigma). HPLC chromatograms of the nucleoside constituents of bulk tRNA from single-gene-deletion strains and ykvJKLM-complemented strains were then compared with those from the wild-type strain. Fractions were collected in a region of the HPLC chromatogram with retention times of 22-28 min and then lyophilized. These fractions were resuspended in H2O, pooled, and again lyophilized to remove any remaining salt. The samples were then resuspended in methanol or methanol/water mixes and analyzed by ESI-MS.

RESULTS
Identifying preQ1 Biosynthesis Gene Candidates by Comparative Genomics-Phylogenetic occurrence was the first criterion used to identify potential preQ1 biosynthesis gene candidates. To generate an occurrence profile in a given organism, the combined presence of both tgt and queA (the only other known gene in the Q biosynthesis pathway) in the genome was used as a marker of the Q biosynthesis pathway. The two organisms (S. cerevisiae and Mycoplasma) whose tRNAs lack the Q modification contained neither tgt (clusters of orthologous groups (COG) 0343) nor queA (COG0809) in their genomes. Interestingly, tgt and queA homologs were also absent from Mycobacterial species and Treponema pallidum, suggesting that Q is absent from these organisms.
The second stage was to use the predicted catalytic mechanism of the preQ1 biosynthetic gene products to guide a COG data base search. Early labeling studies showed that GTP is the precursor of preQ1 (14). An uncharacterized GTP cyclohydrolase was proposed to catalyze the first step in the Q biosynthesis pathway (13). A search of the COG data base for GTP cyclohydrolases identified the two known GTP cyclohydrolase families involved in riboflavin (ribA) and tetrahydrofolate synthesis (folE). It also extracted COG0780, annotated as an "enzyme related to GTP cyclohydrolase I." Importantly, no orthologs were found in Saccharomyces cerevisiae, Mycoplasma, Mycobacterium, or T. pallidum. Therefore, the COG0780 family met our phylogenetic criteria.
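The phylogenetic occurrence criterion used above lends itself to a simple set-based filter. The following Python sketch is one possible way to express it, not the procedure actually used in the study: the organism names, the candidate family names, and the presence/absence values are hypothetical placeholders, and requiring a candidate to be present in every Q-positive genome is an added assumption that is stricter than the text states.

```python
# Marker genes whose combined presence flags the Q pathway (from the text).
Q_MARKERS = {"tgt", "queA"}

# Toy presence/absence table: placeholder values, NOT real genome annotations.
toy_genomes = {
    "organism_A": {"tgt", "queA", "ribA", "folE", "cand1", "cand2"},
    "organism_B": {"tgt", "queA", "ribA", "cand1"},
    "organism_C": {"ribA", "folE", "cand2"},  # lacks the Q markers
    "organism_D": {"folE"},                   # lacks the Q markers
}


def has_q_pathway(gene_set):
    """A genome is scored Q-positive only if it carries both tgt and queA."""
    return Q_MARKERS <= gene_set


def find_candidates(genomes):
    """Return families found in all Q-positive and no Q-negative genomes."""
    q_pos = [g for g in genomes.values() if has_q_pathway(g)]
    q_neg = [g for g in genomes.values() if not has_q_pathway(g)]
    families = set().union(*genomes.values()) - Q_MARKERS
    return sorted(
        f for f in families
        if all(f in g for g in q_pos) and not any(f in g for g in q_neg)
    )


print(find_candidates(toy_genomes))
```

With this toy table only the hypothetical family `cand1` passes, because it tracks the marker genes exactly, whereas the other families also occur in at least one genome that lacks the markers or are missing from a genome that has them.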
Several critical residues for the GTP cyclohydrolase I (CYH) family members have been identified (27). Cys 110, His 113, and Cys 181 are zinc-binding residues that are strictly conserved. His 179 has been implicated in enzymatic events occurring after ring opening (28). However, only Cys 110 is conserved in the COG0780 family, suggesting that, although both families share a common ancestor, the reaction catalyzed by this enzyme family might differ from that of the CYH family.
Genes encoding enzymes of the same pathway have a probability higher than random chance of being physically linked (29). Analysis of the neighboring regions of the COG0780 members in many organisms revealed that the COG0780 member ykvM was the last gene of the ykvJKLM operon in B. subtilis (Fig. 2a). The remaining three genes (ykvJ, K, and L) were absent in S. cerevisiae, Mycoplasma and Mycobacterium, thus fulfilling the occurrence criteria. In addition, in 80% of the organisms whose genomes are completely sequenced, two or three of these four genes are clustered together in different combinations in operons (Fig. 2b). This observation suggested that all four families were involved in the same pathway.
The ykvL COG0602 has been annotated as nrdG. However, this COG is divided into two subfamilies: one containing nrdG homologs that encode the anaerobic class III ribonucleotide reductase, and the other containing the ykvL subclass that is clearly distinct from nrdG. All genes in the two subfamilies have been identified as part of the SAM superfamily of genes that encode proteins that generate radical species of S-adenosylmethionine through unusual Fe-S centers. They also catalyze diverse reactions such as methylation, isomerizations, sulfur insertions, ring formation, anaerobic oxidations, and protein radical formation (30). Interestingly, an iron-dependent enzyme is known to be present on the Q biosynthesis pathway (31).
The ykvK COG0720 has been annotated as 6-pyruvoyl-tetrahydropterin synthase, an enzyme involved in tetrahydropterin (BH4) biosynthesis in higher animals (27). Because BH4 is not found in most bacteria, the function of members of this family in Escherichia coli or B. subtilis is not clear. Recently, the ykvK homolog in E. coli (ygcM) was cloned and expressed and shown to have some 6-pyruvoyl-tetrahydropterin synthase activity (8.7% of that of the mammalian counterpart; Ref. 32). The gene product of the E. coli ygcM gene can also convert sepiapterin to 7,8-dihydropterin without any cofactor, thus showing a new activity of sepiapterin C6 side chain cleavage. (Because sepiapterin is also not present in E. coli, the physiological substrate and true in vivo activity of this enzyme are unknown.)
The ykvJ COG0603 has been annotated as an ATPase aluminum-resistance gene. To date, no biochemical studies have validated this assignment.
FIG. 4. Comparison of HPLC chromatograms of enzymatically digested bulk tRNA from wild-type Acinetobacter and from strains harboring single deletions of potential Q biosynthetic genes. HPLC profiles of bulk tRNA nucleoside digests from wild-type, Δtgt, and single-gene-deletion strains (ΔykvJ, ΔykvK, ΔykvL, and ΔykvM) were performed as detailed under "Experimental Procedures." Chromatograms show peaks with retention times in the 22- to 30-min region, with the peak for Q labeled in the wild-type sample. For clarity, small run-to-run variations in peak retention times were normalized to the average retention time of the adenosine reference peak.
Analysis of Single-gene-deletion Strains Shows Lack of Q Modification in tRNA-To test whether each of the four different gene families identified by comparative genomics was involved in Q biosynthesis, single-deletion strains for each of the homologous Acinetobacter genes were generated (PS6334, PS6338, PS6378, and PS6381; see "Experimental Procedures"). All four strains had growth rates similar to those of the wild-type and Δtgt::Km strains. In parallel, two additional single-gene deletions were also generated as controls. First, tgt was deleted from the Acinetobacter genome. This deletion prevented transfer of preQ1 to tRNA and hence the formation of Q (15). Second, a gene not implicated in preQ1 biosynthesis was deleted as a control for any effects of the presence of the Km cassette (results not shown). Enzymatically digested bulk tRNAs from all six deletion strains were then compared (by HPLC) with those from the wild-type ADP1 strain for the presence or absence of Q.
For enzymatically digested bulk tRNA, Q is known to elute on the Supelco reverse-phase column between guanosine and adenosine (Fig. 3; Ref. 26). The HPLC profile was studied in detail between those two peaks, using tRNAs from each of the six single-gene-deletion strains and from the wild-type ADP1 strain (Fig. 4). The HPLC chromatogram for digests of tRNA isolated from the wild-type strain showed a small but pronounced peak with an average elution time of 26.8 min. This peak was absent from tRNAs of the Δtgt::Km strain and was, therefore, assigned to Q. Examination of bulk tRNA extracted from strains deleted in any of the ykvJ, K, L, and M genes showed HPLC patterns in the 22- to 28-min region identical to the chromatogram of the tRNA extracted from the Δtgt::Km strain (absence of the queuosine nucleoside peak). The ΔyadB::Km control for the influence of the Km cassette showed an HPLC profile of isolated tRNA digests that was indistinguishable from that of the wild-type strain (data not shown).
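The legend of Fig. 4 notes that small run-to-run variations in retention time were normalized to the average retention time of the adenosine reference peak. The exact procedure is not given, so the Python sketch below assumes a simple additive shift per run; the function name and all retention times in the toy data are hypothetical and are not measurements from the study.

```python
from statistics import mean


def normalize_retention_times(runs, reference="adenosine"):
    """Shift every peak in each run so the reference peak sits at its mean.

    `runs` is a list of dicts mapping peak name -> retention time (min).
    An additive shift is an assumption of this sketch; a multiplicative
    rescaling would be an equally plausible reading of the figure legend.
    """
    ref_mean = mean(run[reference] for run in runs)
    normalized = []
    for run in runs:
        shift = ref_mean - run[reference]
        normalized.append({name: rt + shift for name, rt in run.items()})
    return normalized


# Toy chromatogram peak tables (hypothetical values, in minutes).
toy_runs = [
    {"guanosine": 20.1, "Q": 26.9, "adenosine": 30.2},  # wild-type-like run
    {"guanosine": 19.9, "adenosine": 29.8},             # run lacking the Q peak
]
aligned = normalize_retention_times(toy_runs)
for run in aligned:
    print(sorted(run.items()))
```

After alignment the reference peak coincides across runs, so the presence or absence of a peak in the window between guanosine and adenosine can be compared directly.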
To confirm that the single ykvJ, K, L, and M deletion strains lacked the Q modification in their bulk tRNAs, elution fractions were collected in the 22- to 28-min region of each HPLC profile, lyophilized, and then examined by ESI-MS (Fig. 5). The results showed a peak at m/z = 410 (equivalent to that of protonated Q) in digests of tRNAs from the wild-type and ΔyadB::Km strains. No such mass was present in digests of tRNA from the Δtgt::Km strain or from any of the four ykvJ, K, L, and M single-gene-deletion strains. Thus, like the Δtgt strain, the deletion strains did not contain Q in their tRNA. These data strongly support the hypothesis that the ykvJKLM cluster encodes proteins involved in the biosynthesis of Q.
The B. subtilis ykvJKLM Operon Complements Single-gene-deletion Strains Defective in Q Modification-To eliminate the possibility of polar effects, the whole B. subtilis ykvJKLM operon was introduced ectopically under the PlipA promoter and combined with the ΔykvJ or ΔykvL deletions to give strains PS6447 and PS6449, respectively. Nucleoside constituents of bulk tRNA samples from the complemented strains were checked by HPLC (Fig. 6). The single ΔykvJ::Km and ΔykvL::Km deletion strains complemented by the B. subtilis operon showed a nucleoside peak with a retention time similar to that of the Q peak from the ADP1 strain. Nucleoside constituents from tRNAs isolated from each of the two complemented strains showed ESI-MS peaks at m/z 410, equivalent to protonated Q (data not shown). Thus, successful complementation demonstrated that loss of Q in the single ΔykvJ::Km and ΔykvL::Km deletion strains is not an artifact of the deletion procedure but is rather due to the removal of genes involved in Q biosynthesis. The other single-gene-deletion strains were not tested.
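The value of 410 cited for protonated Q can be checked with a short calculation. The Python sketch below assumes the molecular formula C17H23N5O7 for queuosine and standard monoisotopic atomic masses; both come from general chemical reference data rather than from this article, and the function names are illustrative.

```python
# Monoisotopic masses (u) of the most abundant isotopes; standard reference values.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON = 1.00727647  # mass of a proton (u)

# Assumed molecular formula of queuosine (not stated in the article).
Q_FORMULA = {"C": 17, "H": 23, "N": 5, "O": 7}


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses of all atoms in the formula."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def protonated_mz(formula):
    """m/z of the singly charged [M+H]+ ion."""
    return monoisotopic_mass(formula) + PROTON


print(f"[M+H]+ for Q: {protonated_mz(Q_FORMULA):.3f}")
```

Under these assumptions the [M+H]+ ion falls at a nominal m/z of 410, consistent with the peak assigned to Q in the wild-type and complemented strains.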
After consultation with K. E. Rudd of the Ecogene data base (University of Miami, Miami, FL) and M. Berlyn of the E. coli Genetic Stock Center data base (Yale University, New Haven, CT), the ykvJ, ykvK, ykvL, and ykvM (B. subtilis) genes identified here have been renamed queC, queD, queE, and queF, respectively.

DISCUSSION
Although the Q hypermodification was discovered early and proved to be one of the most complex tRNA modifications, only two of the enzymes involved in its biosynthesis had been elucidated. This situation was primarily due to the lack of a selectable phenotype linked to this nucleoside. Comparative genomics enabled us to overcome the inherent difficulty of identifying gene families involved in Q biosynthesis. The identification of four gene families underlines the relatively complex chemistry required to produce hypermodified 7-deazaguanine nucleoside from the simple starting block of a purine ring. The proposed biosynthetic pathway in Fig. 1 flags a number of unknown steps. If the pathway is correct, then the four gene products of queC, -D, -E, and -F could potentially be used to find key intermediates and activities on the Q biosynthetic pathway by employing standard molecular cloning and biochemical approaches.
Because preQ0 is known to be an intermediate in the formation of the ubiquitous archaeal tRNA-modified nucleoside archaeosine (18), Q and archaeosine may share a common 7-deazaguanosine biosynthesis pathway. Examination of the genomes of archaea indicates that these organisms contain the ykvJ, K, and L homologs but are missing the ykvM homolog (except in Aeropyrum pernix) that is believed to be a GTP cyclohydrolase homolog (see above). In this scenario, the ykvM gene product is most likely involved in one of the steps converting GTP to preQ0. Two explanations for the phylogenetic distribution of ykvM are possible. First, the ykvM gene product could be involved in preQ0 biosynthesis in bacteria and has been displaced by another enzyme in archaea. (For example, archaea have a third class of GTP cyclohydrolases that could be involved in archaeosine biosynthesis; Ref. 33.) Second, the ykvM gene product, although sharing significant homology with GTP cyclohydrolases, might instead be an enzyme in bacteria involved in converting preQ0 to preQ1, a reaction that does not occur in archaea. However, to satisfy the latter hypothesis, an existing bacterial GTP cyclohydrolase would be needed to catalyze the first predicted step in Q biosynthesis. Further studies are underway to discriminate between these two hypotheses.
The genome sequencing efforts of the last decade have revealed how little is known about the relationship between DNA sequences and biological functions. In the best genetically characterized organisms, a third of the genes have no assigned function. Analysis of the literature reveals many known enzymatic activities or pathways for which the cognate genes remain unknown. The tentative identification of these "missing" genes by comparative genomics may turn out to be a general approach to link genes to function (34), once confirmation is obtained by the kind of genetic and biochemical analysis described here.