Yeast-like mRNA Capping Apparatus in Giardia lamblia*

A scheme of eukaryotic phylogeny has been suggested based on the structure and physical linkage of the RNA triphosphatase and RNA guanylyltransferase enzymes that catalyze mRNA cap formation. Here we show that the unicellular pathogen Giardia lamblia encodes an mRNA capping apparatus consisting of separate triphosphatase and guanylyltransferase components, which we characterize biochemically. We also show that native Giardia mRNAs have blocked 5′-ends and that 7-methylguanosine caps promote translation of transfected mRNAs in Giardia in vivo. The Giardia triphosphatase belongs to the tunnel family of metal-dependent phosphohydrolases that includes the RNA triphosphatases of fungi, microsporidia, and protozoa such as Plasmodium and Trypanosoma. The tunnel enzymes adopt a unique active-site fold and are structurally and mechanistically unrelated to the cysteine-phosphatase-type RNA triphosphatases found in metazoans and plants, which comprise part of a bifunctional triphosphatase-guanylyltransferase fusion protein. All available evidence now points to the separate tunnel-type triphosphatase and guanylyltransferase as the aboriginal state of the capping apparatus. We identify a putative tunnel-type triphosphatase and a separate guanylyltransferase encoded by the red alga Cyanidioschyzon merolae. These findings place fungi, protozoa, and red algae in a common lineage distinct from that of metazoa and plants.

The m7GpppN cap is a defining feature of eukaryotic mRNA that is formed via three enzymatic reactions: (i) the 5′-triphosphate end of the pre-mRNA is hydrolyzed to a diphosphate by RNA triphosphatase; (ii) the diphosphate RNA end is capped with GMP by RNA guanylyltransferase; and (iii) the GpppN cap is methylated by RNA (guanine-N7) methyltransferase (1). Capping enzymes are a good focal point for considering eukaryotic evolution, because the cap structure is thought to be ubiquitous in eukaryotic organisms but absent from the bacterial and archaeal domains of life. Thus, any differences in the capping apparatus between eukaryal taxa would reflect events that post-date the emergence of ancestral nucleated cells. Previously, we proposed a scheme of eukaryotic phylogeny based on two features of the mRNA capping apparatus: the structure and mechanism of the triphosphatase component (metal-dependent fungal-type versus metal-independent cysteine-phosphatase type) and whether the triphosphatase is covalently fused to the guanylyltransferase component (2,3).
Metazoans and plants have a two-component capping system consisting of a bifunctional triphosphatase-guanylyltransferase polypeptide and a separate methyltransferase polypeptide, whereas fungi, microsporidia, and protozoa (e.g. Plasmodium falciparum and Trypanosoma brucei) have a three-component system consisting of separate triphosphatase, guanylyltransferase, and methyltransferase polypeptides (2,4,5). The primary structures and biochemical mechanisms of the fungal and mammalian guanylyltransferases and cap methyltransferases are conserved. However, the atomic structures and catalytic mechanisms of the fungal and mammalian RNA triphosphatases are completely different (6-8).
Metazoan and plant RNA triphosphatases belong to the cysteine-phosphatase enzyme superfamily, which is defined by the conserved phosphate-binding loop motif HCxxxxxR(S/T). Mammalian RNA triphosphatase catalyzes a two-step phosphoryl transfer reaction in which the conserved cysteine of the signature motif attacks the γ-phosphorus of triphosphate-terminated RNA to form a covalent protein-cysteinyl-S-phosphate intermediate and expel the diphosphate RNA product (7,8). The covalent phosphoenzyme intermediate is hydrolyzed to liberate inorganic phosphate. The reaction does not require a divalent cation cofactor.
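The signature motif HCxxxxxR(S/T) lends itself to a simple pattern search. The following is a minimal sketch, assuming one-letter amino acid codes; the function name and the two toy sequences are hypothetical illustrations and are not taken from any real enzyme or from the methods used in this study.

```python
import re

# Cysteine-phosphatase P-loop signature from the text: HCxxxxxR(S/T),
# where x is any residue (one-letter amino acid codes assumed).
P_LOOP = re.compile(r"HC[A-Z]{5}R[ST]")


def find_p_loops(protein):
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in P_LOOP.finditer(protein.upper())]


# Hypothetical toy sequences for illustration only (not real proteins).
toy_with_motif = "MKTAVHCKAGVGRSGTLLA"
toy_without_motif = "MKTAVHAKAGVGRSGTLLA"  # the cysteine is replaced by alanine

print(find_p_loops(toy_with_motif))
print(find_p_loops(toy_without_motif))
```

A match of this kind is only a first filter; assignment to the superfamily rests on the structural and mechanistic evidence cited above.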
In contrast, the RNA triphosphatases of fungal species such as Saccharomyces cerevisiae, Candida albicans, and Schizosaccharomyces pombe are strictly dependent on a divalent cation. The fungal enzymes belong to a family of metal-dependent phosphohydrolases that embraces the triphosphatase components of the P. falciparum, T. brucei, Encephalitozoon cuniculi, Chlorella virus, poxvirus, and baculovirus mRNA capping systems (2, 4, 5, 9-13). The signature biochemical property of this enzyme family is the ability to hydrolyze nucleoside triphosphates to nucleoside diphosphates and inorganic phosphate in the presence of manganese or cobalt. The defining structural features of the metal-dependent RNA triphosphatases are two glutamate-containing motifs that are required for catalysis by every family member and which comprise the metal-binding site. The crystal structure of the S. cerevisiae RNA triphosphatase Cet1 revealed a novel tertiary structure in which the active site is situated within a topologically closed hydrophilic tunnel composed of eight antiparallel β strands (6).
The radically different flavors of RNA triphosphatase exemplify an unusual situation in nature, whereby two different structural solutions to an enzyme chemistry problem (gamma phosphate hydrolysis, a fairly pedestrian reaction) have evolved within a single biological pathway. When and how did the RNA triphosphatase component of the capping apparatus diverge during eukaryotic evolution, and will the differences in the organization of the capping apparatus continue to provide a useful guide to phylogenetic relationships among taxa as more genomic information becomes available? We are pursuing these questions by identifying the cap-forming enzymes of unicellular eukaryotes and evaluating their structural and mechanistic relatedness to exemplary fungal and mammalian enzymes.

* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Here we focus on the amitochondriate protozoan parasite Giardia lamblia. In addition to being a significant enteric pathogen of humans, Giardia is posited to occupy a deeply branching position in eukaryal phylogeny. Analysis of the sequence of the 12-Mb Giardia genome is providing important insights into the early origins of RNA processing mechanisms, such as spliceosomal pre-mRNA splicing, which are regarded as uniquely eukaryal (14). Although it has been a matter of some debate whether Giardia mRNAs even have a 5′ cap structure (15,16), we show here that Giardia does possess the enzymatic machinery for cap synthesis, modifies the 5′-ends of its mRNAs, and exploits the m7G cap for enhanced translation of a reporter mRNA in vivo. Biochemical characterization of the separately encoded triphosphatase and guanylyltransferase of Giardia shows that they resemble the capping enzymes of fungi and protozoa.
Mining of other newly available genome databases permitted the identification of candidate cap-forming enzymes in the unicellular red alga C. merolae (17) and the water mold Phytophthora ramorum. The red alga separately encodes a tunnel family RNA triphosphatase and a guanylyltransferase, whereas P. ramorum appears to encode a bifunctional fusion enzyme consisting of an N-terminal tunnel-type triphosphatase domain and a C-terminal guanylyltransferase domain. We integrate these findings into an expanded cap-centric scheme of eukaryotic phylogeny.

EXPERIMENTAL PROCEDURES
Giardia Cultivation-Trophozoites of the G. lamblia isolate WB, clone WB/1267 (ATCC number 50582), were cultured essentially as described previously (33) with a slight modification. Cells were maintained anaerobically in glass culture tubes at 37°C in Keister's modified TYI-S-33 medium supplemented with antibiotics in which the sodium phosphate buffer was replaced by a 25 mM sodium bicarbonate buffer. This change had no detectable effect on parasite growth.
Recombinant G. lamblia RNA Triphosphatase-The open reading frame encoding GlCet1 was PCR-amplified from Giardia genomic DNA with primers that introduced an NdeI site at the start codon and a BamHI site 3′ of the stop codon. The PCR product was digested with NdeI and BamHI and then inserted into a customized T7 RNA polymerase-based pET-His10-Smt3 expression vector, wherein GlCet1 was fused to an N-terminal His10-Smt3 domain (18,19). An E58A missense mutation was introduced into the GlCET1 gene by the PCR-based two-stage overlap extension method. The mutated PCR product was inserted into the pET-His10-Smt3 vector. The inserts of the resulting pET-His10-Smt3-GlCET1 and pET-His10-Smt3-GlCET1-E58A plasmids were sequenced completely to confirm the presence of the wild-type open reading frame or the desired mutation and to exclude the acquisition of unwanted changes during amplification and cloning.
The wild-type and E58A pET-His10-Smt3-GlCET1 plasmids were transformed into Escherichia coli BL21-CodonPlus(DE3). Cultures (500 ml) amplified from single transformants were grown at 37°C in LB medium containing 50 μg/ml kanamycin and 50 μg/ml chloramphenicol until the A600 reached 0.6. The cultures were adjusted to 2% ethanol and 0.2 mM isopropyl β-D-thiogalactopyranoside and then incubated at 17°C for 20 h with constant shaking. Cells were harvested by centrifugation, and the pellets were stored at −80°C. All subsequent procedures were performed at 4°C. Thawed bacteria were resuspended in 25 ml of buffer A (50 mM Tris-HCl, pH 8.0, 200 mM NaCl, 10% glycerol). Cell lysis was achieved by the addition of lysozyme and Triton X-100 to final concentrations of 100 μg/ml and 0.1%, respectively. The lysates were sonicated to reduce viscosity, and insoluble material was removed by centrifugation. The soluble extracts were applied to 1-ml columns of nickel-nitrilotriacetic acid-agarose resin (Qiagen) that had been equilibrated with buffer A containing 0.1% Triton X-100. The columns were washed with 10 ml of the same buffer and then eluted stepwise with 2.5-ml aliquots of buffer A containing 50, 100, 250, and 500 mM imidazole. The polypeptide compositions of the column fractions were monitored by SDS-PAGE. The 54-kDa recombinant His10-Smt3-GlCet1 polypeptide was recovered predominantly in the 250 and 500 mM imidazole fractions. The 250 mM imidazole eluates were dialyzed against buffer A containing 2 mM DTT and 0.01% Triton X-100 and then stored at −80°C. The protein concentrations were determined by SDS-PAGE analysis of serial dilutions of the GlCet1 preparation and a BSA standard. The gels were stained with Coomassie Blue, and the staining intensities of the His10-Smt3-GlCet1 and BSA polypeptides were quantified with a Fujifilm FLA-5000 digital imaging and analysis system. GlCet1 concentrations were calculated by interpolation to the BSA standard curve.
Triphosphatase Assay-Reaction mixtures (20 μl) containing 50 mM Tris-HCl, pH 7.5, 5 mM DTT, 2 mM MnCl2, 1 mM [γ-32P]ATP, and GlCet1 as specified were incubated for 20 min at 37°C. An aliquot (2 μl) of the mixture was applied to a polyethyleneimine-cellulose TLC plate, which was developed using 0.5 M LiCl, 1 M formic acid. The radiolabeled material was visualized by autoradiography, and 32Pi formation was quantified by scanning the TLC plate with a Fujix BAS2500 imager. Alternatively, reaction mixtures (20 μl) containing 50 mM Tris-HCl, pH 7.5, 5 mM DTT, 2 mM MnCl2, 1 mM unlabeled NTP, and GlCet1 as specified were incubated for 20 min at 37°C. The reactions were quenched by adding 1 ml of malachite green reagent (Biomol Research Laboratories, Plymouth Meeting, PA). Phosphate release was determined by measuring A620 and interpolating the value to a phosphate standard curve.
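Interpolation of an A620 reading to a phosphate standard curve can be sketched as follows. This is an illustration only: it assumes a linear standard curve fitted by ordinary least squares, and the standard amounts and absorbance readings are hypothetical values, not data from this study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def phosphate_from_a620(a620, slope, intercept):
    """Invert the standard curve to convert an A620 reading to nmol phosphate."""
    return (a620 - intercept) / slope


# Hypothetical phosphate standards (nmol) and their A620 readings.
std_nmol = [0.0, 1.0, 2.0, 4.0]
std_a620 = [0.05, 0.15, 0.25, 0.45]

slope, intercept = fit_line(std_nmol, std_a620)
print(phosphate_from_a620(0.35, slope, intercept))  # nmol Pi in a sample
```

Readings outside the range spanned by the standards should be treated with caution, because the dye response need not remain linear there.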
Recombinant G. lamblia Guanylyltransferase-The open reading frame encoding GlCeg1 was PCR-amplified from genomic DNA and cloned into a T7 RNA polymerase-based bacterial expression vector pET16b-Nco (which contains an NcoI site instead of an NdeI site) so as to fuse the GlCeg1 protein to an NH2-terminal 14-amino acid leader peptide (MGHHHHHHHHHHSA). pET-His-GlCEG1 was transformed into E. coli BL21-CodonPlus(DE3). A 500-ml culture amplified from a single transformant was grown at 37°C in LB medium containing 50 μg/ml kanamycin and 50 μg/ml chloramphenicol until the A600 reached 0.6. The culture was adjusted to 2% ethanol and 0.2 mM isopropyl β-D-thiogalactopyranoside and then incubated at 17°C for 20 h. Recombinant His10-GlCeg1 was isolated from the soluble bacterial extract by nickel-agarose chromatography as described above for His10-Smt3-GlCet1. The recombinant His10-GlCeg1 polypeptide was recovered predominantly in the 250 mM and 500 mM imidazole fractions. The 250 mM imidazole eluate was dialyzed against buffer A containing 2 mM DTT and 0.01% Triton X-100 and stored at −80°C. Protein concentrations were determined using the Bio-Rad dye reagent with bovine serum albumin as the standard.
Guanylyltransferase Assay-Guanylyltransferase activity was assayed by the formation of the covalent enzyme-GMP intermediate. Reaction mixtures (20 μl) containing 50 mM Tris-HCl, pH 8.0, 5 mM DTT, 5 mM MgCl2, and [α-32P]GTP and enzyme as specified were incubated at 37°C. The reaction was halted by addition of SDS to 1% final concentration. The samples were analyzed by SDS-PAGE. Transfer of labeled GMP to the GlCeg1 protein was visualized by autoradiography of the dried gel and quantified by scanning the gel with a Fujix BAS2500 imager.
Glycerol Gradient Sedimentation-An aliquot (40 μg) of the nickel-agarose preparation of GlCeg1 was mixed with catalase (40 μg), BSA (40 μg), and cytochrome c (40 μg). The mixture was applied to a 4.8-ml 15-30% glycerol gradient containing 50 mM Tris-HCl, pH 8.0, 0.2 M NaCl, 1 mM EDTA, 2 mM DTT, 0.05% Triton X-100. The gradient was centrifuged for 14 h at 4°C in a Beckman SW50 rotor at 50,000 rpm. Fractions (~0.19 ml) were collected from the bottom of the tube. The polypeptide compositions of the gradient fractions were analyzed by SDS-PAGE.
Cap-dependent 5′-RACE Assay-The state of the 5′-end of specific Giardia transcripts was determined by using the GeneRacer™ kit from Invitrogen, Inc. (Carlsbad, CA). Total RNA was isolated from Giardia using RNA-STAT 60 according to the manufacturer's protocol (Tel-Test, Inc., Friendswood, TX) and then treated with DNase I (Promega, Inc., Madison, WI) to remove all remaining DNA. Four separate reactions were established in parallel for each transcript to be assayed: 1) untreated; 2) treated with calf intestinal phosphatase (CIP); 3) treated with tobacco acid pyrophosphatase (TAP); and 4) treated with CIP and TAP. To remove all unmodified phosphates from the 5′-end of transcripts, RNA in the CIP and CIP + TAP tubes was initially dephosphorylated (10 units of CIP per 1 μg of total RNA) at 50°C for 60 min.
Untreated and TAP-only tubes were processed in parallel without the addition of CIP. CIP was then removed by phenol:chloroform extraction. To remove modified phosphate structures from the 5′-end of RNA transcripts and leave a single 5′-phosphate behind, RNA in the TAP and CIP + TAP samples was subsequently treated with TAP (0.5 unit/1 μg of total RNA) at 37°C for 60 min. Untreated and CIP-treated samples were processed in parallel without the addition of enzyme.
To provide an anchor for subsequent PCR reactions, 0.25 μg of the GeneRacer™ oligoribonucleotide (5′-CGACUGGAGCACGAGGACACUGACAUGGACUGAAGGAGUAGAAA-3′) was ligated to the 5′-end of RNA molecules that retained a 5′-phosphate. The RNA was then reverse transcribed using an oligo(dT) primer according to the manufacturer's protocol (SuperScript™ II RT, Invitrogen). The RNA template was degraded with RNase H, and PCR amplification of the cDNA was then performed using a primer that matched the GeneRacer™ RNA oligonucleotide at the 5′-end (5′-CGACTGGAGCACGAGGACACTGA-3′) and a gene-specific primer at the 3′-end (Table I). For rare transcripts, a second round of nested PCR was performed using the GeneRacer™ 5′ Nested Primer (5′-GGACACTGACATGGACTGAAGGAGTA-3′) and another gene-specific primer. The PCR products were analyzed by agarose gel electrophoresis and visualized by staining the gel with ethidium bromide.
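The relationship between the anchor oligoribonucleotide and the two 5′ primers can be checked with a few lines of code. The sketch below uses the three sequences quoted in the text (with the line-break hyphens removed); the helper names are hypothetical and are not part of the kit.

```python
# Sequences as quoted in the text (5' to 3').
RNA_OLIGO = "CGACUGGAGCACGAGGACACUGACAUGGACUGAAGGAGUAGAAA"
PRIMER_5 = "CGACTGGAGCACGAGGACACTGA"
NESTED_5 = "GGACACTGACATGGACTGAAGGAGTA"


def rna_to_dna(rna):
    """Write an RNA sequence in the DNA alphabet (U -> T)."""
    return rna.upper().replace("U", "T")


def primer_offset(primer, oligo_rna):
    """0-based position of the primer within the anchor sequence, or -1."""
    return rna_to_dna(oligo_rna).find(primer.upper())


print(len(RNA_OLIGO))                      # length of the anchor
print(primer_offset(PRIMER_5, RNA_OLIGO))  # first-round primer position
print(primer_offset(NESTED_5, RNA_OLIGO))  # nested primer position
```

The first-round primer corresponds to the 5′ end of the anchor, whereas the nested primer lies internal to it, which is what a nested second-round amplification requires.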
Luciferase Expression Assays-Plasmids pT7Luc and pT7LucA50 (34) were kindly provided by Dr. D. Gallie. Transcription from either plasmid yields a luciferase transcript with an 8-nucleotide 5′ leader. Transcription from pT7LucA50 results in a 50-nucleotide 3′ poly(A) tail, whereas transcripts produced from pT7Luc have no poly(A) tail. The plasmids were linearized by digestion with BamHI and DraI, respectively, and mRNA was synthesized by using AmpliScribe and AmpliCap T7 in vitro transcription kits (Ambion Inc., Austin, TX). RNA was recovered after DNase I digestion, phenol extraction, and ammonium acetate precipitation. RNA was quantified by UV absorbance, and RNA integrity was judged by denaturing agarose gel electrophoresis. For transfection experiments, 10 million trophozoites were resuspended in 800 μl of cytomix with 50 units of RNasin (Promega) and 10 μg of yeast tRNA and then kept on ice (25). Luciferase transcripts (20 μg) were added, and cells were electroporated at 800 V, 1260 microfarads, and 500 V in a 0.4-cm cuvette using a BTX ECM600 electroporator (Harvard Apparatus, Holliston, MA). Cells were kept on ice an additional 10 min and then placed in 15 ml of fresh medium for the indicated times. Cells were harvested, and luciferase activity was determined as described (35).

Identification of a Candidate G. lamblia Triphosphatase GlCet1-We identified a candidate Giardia RNA triphosphatase gene (GenBank™ accession number AACB01000007) on the third iteration of a psi-blast search of the NCBI data base with the T. brucei RNA triphosphatase TbCet1. The putative Giardia triphosphatase gene encodes a 344-aa polypeptide (GenBank™ accession number EAA42388) with primary structure similarity to the catalytic domain of fungal, viral, and protozoal RNA triphosphatases. We named this gene product GlCet1 (capping enzyme triphosphatase). GlCet1 includes the two glutamate-containing metal-binding motifs characteristic of the tunnel triphosphatases (Fig. 1), plus putative homologs of the other β strands that comprise the active site of yeast Cet1 (Fig. 2) (6). Ten hydrophilic amino acids that are essential for catalysis are conserved in the Giardia polypeptide (denoted by dots in Fig. 2) (10, 11). It is notable that a psi-blast search using the GlCet1 sequence itself failed to identify any other member of the metal-dependent RNA triphosphatase family, thus accounting for the previously unannotated status of this Giardia gene product.
Metal-dependent Triphosphatase Activity of GlCet1-The putative triphosphatase was produced in bacteria as a His10-Smt3 fusion protein and partially purified from a soluble bacterial extract by adsorption to nickel-agarose and elution with 250 mM imidazole. The nickel-agarose preparation was enriched with respect to the recombinant Giardia polypeptide, as judged by SDS-PAGE (Fig. 3A). We purified in parallel a mutated version of GlCet1 in which Glu58 was replaced by alanine (Fig. 3A). Glu58 corresponds to one of the signature metal-binding residues of the tunnel enzyme family. Recombinant wild-type GlCet1 catalyzed the release of 32Pi from [γ-32P]ATP in the presence of manganese; the extent of ATP hydrolysis was proportional to enzyme concentration (Fig. 3B). The ATPase activity of the E58A mutant was <1% of the activity of wild-type enzyme. We conclude that the observed phosphohydrolase activity is intrinsic to GlCet1.
GlCet1 catalyzed no detectable ATP hydrolysis in the absence of a divalent cation (Fig. 4A). ATPase activity was optimal at 2.5 mM MnCl2 and declined slightly at 5 mM MnCl2. Cobalt also satisfied the metal requirement; activity was optimal at 2.5-5 mM CoCl2. No activity was observed when the manganese and cobalt concentrations were below the level of input ATP. Magnesium supported only feeble ATP hydrolysis at up to 5 mM concentration (Fig. 4A). Specificity for NTP hydrolysis in the presence of manganese and cobalt is characteristic (indeed, virtually diagnostic) of the fungal-type RNA triphosphatase family.
The rate of release of 32Pi from [γ-32P]ATP was identical to the rate of conversion of [α-32P]ATP to [α-32P]ADP in a parallel reaction mixture containing the same concentration of GlCet1 (Fig. 4C), indicating that GlCet1 catalyzes the hydrolysis of ATP to ADP and Pi. From the initial rate, we estimate a turnover number of 3.4 s−1. We detected only low levels of [α-32P]AMP formation during the reaction. NTP specificity was examined by colorimetric assay of the release of Pi from unlabeled ATP, GTP, CTP, or UTP. GlCet1 displayed nearly equivalent specific activity with each of the four ribonucleoside triphosphates (Fig. 4B).
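A turnover number is the initial rate of product formation divided by the amount of enzyme present. The arithmetic can be sketched as follows; the function name and the input values are hypothetical and chosen only to illustrate the units, not to reproduce the measurement reported here.

```python
def turnover_number(product_pmol, seconds, enzyme_pmol):
    """Return the turnover number in s^-1.

    product_pmol: product formed during the linear (initial-rate) phase
    seconds:      duration of that phase
    enzyme_pmol:  amount of enzyme in the reaction
    """
    if seconds <= 0 or enzyme_pmol <= 0:
        raise ValueError("time and enzyme amount must be positive")
    return product_pmol / seconds / enzyme_pmol


# Hypothetical example: 6000 pmol Pi released in 120 s by 20 pmol enzyme.
print(turnover_number(6000.0, 120.0, 20.0))
```

The estimate is meaningful only while product formation is linear with time and substrate is not appreciably depleted.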
Identification and Characterization of Giardia Guanylyltransferase GlCeg1-We identified a candidate mRNA guanylyltransferase gene (GenBank™ accession number AACB01000002) encoding a 511-aa polypeptide (GenBank™ accession number EAA42847) with extensive similarity to fungal, protozoal, metazoan, and virus-encoded RNA guanylyltransferases (Fig. 1). We named this gene product GlCeg1 (capping enzyme guanylyltransferase). GlCeg1 contains the six conserved peptide motifs (I, III, IIIa, IV, V, and VI) that comprise the active site of GTP binding and nucleotidyl transfer (Fig. 1) (1). Recombinant His10-GlCeg1 was produced in bacteria and isolated from a soluble bacterial extract by nickel-agarose chromatography. The 59-kDa Giardia polypeptide adsorbed to the affinity resin and was eluted with 50-250 mM imidazole (Fig. 5A).
All known cellular guanylyltransferases employ a ping-pong reaction mechanism of nucleotidyl transfer through a covalent enzyme-(lysyl-N)-GMP intermediate (1). Hence, guanylyltransferase activity can be detected with high sensitivity and specificity by label transfer from [α-32P]GTP to the enzyme. To assay guanylyltransferase activity of the expressed GlCeg1 protein, we incubated the initial soluble bacterial extract and the nickel-agarose column fractions with [α-32P]GTP and a divalent cation. This resulted in the formation of an SDS-stable ~60-kDa nucleotidyl-protein adduct (Fig. 5A). The guanylyltransferase activity eluted in parallel with the GlCeg1 protein.
Formation of the covalent intermediate was optimal at pH 8.0 to 9.0 in Tris-HCl buffer and was sharply reduced at pH ≤6.5 in Tris acetate buffer (Fig. 6A). Enzyme-GMP formation required a divalent cation cofactor, either manganese or magnesium. Both cations were equally effective at their respective optima, which were 0.3-5 mM MnCl2 and 5 mM MgCl2 (Fig. 6B). Calcium was unable to support guanylyltransferase activity at up to 5 mM concentration. The extent of enzyme guanylylation increased with GTP concentration up to 0.3 μM and saturated at >1 μM GTP (Fig. 6C). Half-maximal guanylylation was achieved at 0.09 μM GTP. The nucleotidyltransferase reaction was specific for GTP, insofar as we could detect no formation of a protein-NMP complex during reaction of GlCeg1 with [α-32P]ATP (data not shown).
The forward guanylylation reaction of capping enzymes is in equilibrium with the reverse pyrophosphorolysis reaction, in which PPi attacks the enzyme-GMP complex to regenerate GTP. We observed that inclusion of low concentrations of PPi in the guanylyltransferase reaction mixture exerted a stimulatory effect on the extent of label transfer from [α-32P]GTP to the GlCeg1 protein. A 4-fold stimulation was evident at the optimal concentration of 10 μM PPi (Fig. 6D). We ascribe this stimulation to pyrophosphorolysis of the population of enzyme-GMP present in the recombinant protein preparation, which thereby regenerates additional apoenzyme to react with the radiolabeled GTP substrate. From the data in Fig. 6D, we estimated that 48% of the enzyme molecules were labeled with [32P]GTP in the presence of 10 μM PPi.
The native size of GlCeg1 was gauged by sedimentation through a 15-30% glycerol gradient. Marker proteins catalase (248 kDa), BSA (66 kDa), and cytochrome c (13 kDa) were included as internal standards. Guanylyltransferase activity sedimented as a single discrete peak between BSA and cytochrome c (Fig. 5B); the activity profile coincided with the abundance of the GlCeg1 polypeptide (data not shown). An S value of 4.1 was determined for GlCeg1 by interpolation to the internal standard curve. We surmise that GlCeg1 is a monomer in solution.
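One simple way to interpolate an S value from internal markers is piecewise-linear interpolation between the two markers that flank the activity peak. The sketch below is an assumption-laden illustration: the marker sedimentation coefficients (catalase 11.3 S, BSA 4.4 S, cytochrome c 1.9 S) are commonly cited values that are not stated in the text, and the peak fraction numbers are hypothetical. It is not necessarily the fitting procedure used for the data above.

```python
def interpolate_s(peak_fraction, markers):
    """Piecewise-linear interpolation of S from (peak fraction, S) markers.

    Fractions are numbered from the bottom of the gradient, so a lower
    fraction number corresponds to a faster-sedimenting species.
    """
    points = sorted(markers)
    for (f1, s1), (f2, s2) in zip(points, points[1:]):
        if f1 <= peak_fraction <= f2:
            return s1 + (s2 - s1) * (peak_fraction - f1) / (f2 - f1)
    raise ValueError("peak lies outside the range spanned by the markers")


# Assumed marker S values with hypothetical peak fractions.
MARKERS = [
    (8.0, 11.3),   # catalase
    (17.0, 4.4),   # BSA
    (22.0, 1.9),   # cytochrome c
]

print(interpolate_s(18.0, MARKERS))  # sample peaking between BSA and cytochrome c
```

A peak that falls outside the marker range raises an error rather than being extrapolated, since extrapolation along a glycerol gradient is unreliable.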
5′ Modification of Endogenous Giardia Transcripts-The existence of capping machinery in Giardia prompted us to query by 5′-RACE whether native Giardia transcripts have blocked 5′-ends. We examined sense transcripts that code for tubulin, glutamate dehydrogenase, and NADH oxidoreductase (NOR), and an antisense transcript for NOR that is one of many such sterile transcripts present in Giardia (36). 5′-RACE experiments were designed to distinguish transcripts that contain a blocked 5′-end (presumed to be a cap) from transcripts that have unblocked 5′-mono-, -di-, or -triphosphate termini (Fig. 7). Only RNAs having 5′-monophosphate termini can be ligated to the synthetic oligonucleotide that provides an anchor for reverse transcription-PCR amplification. The experimental logic was as follows: 1) if the transcript has a 5′-monophosphate end, then a PCR product should appear in the RACE reactions programmed by both the untreated and TAP-treated RNA samples but not in the reactions programmed by CIP or CIP + TAP-treated RNA samples; 2) if the transcript has an unblocked 5′-triphosphate or diphosphate end, then a PCR product should appear in the RACE reactions programmed by TAP-treated RNA but not in reactions containing the CIP-treated, CIP + TAP-treated, or untreated RNAs; 3) if the transcript has a blocking 5′ cap structure, then a PCR product will appear in RACE reactions programmed by both the TAP-treated and CIP + TAP-treated RNAs but not in reactions containing the untreated or CIP-treated RNAs. As shown in Fig. 7, we consistently observed PCR products for all of the sense transcripts and the antisense-NOR transcript in the TAP and CIP + TAP treated samples, but not in the CIP-treated or untreated samples. These results indicate that endogenous Giardia RNAs have blocked 5′-ends, consistent with the presence of a cap.
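The experimental logic above can be written out as a small truth table. The sketch below encodes the enzyme specificities as described in the text (CIP removes unblocked 5′ phosphates; TAP converts a cap or an unblocked di- or triphosphate to a 5′-monophosphate; only a 5′-monophosphate end is ligatable). The state labels and function names are hypothetical conveniences.

```python
END_TYPES = ("p", "pp", "ppp", "cap")  # 5' mono-, di-, triphosphate, or cap


def apply_cip(end):
    """CIP strips unblocked 5' phosphates; a cap is left untouched."""
    return "OH" if end in ("p", "pp", "ppp") else end


def apply_tap(end):
    """TAP leaves a single 5' phosphate on capped, pp, or ppp ends."""
    return "p" if end in ("cap", "pp", "ppp") else end


TREATMENTS = {
    "untreated": (),
    "CIP": (apply_cip,),
    "TAP": (apply_tap,),
    "CIP+TAP": (apply_cip, apply_tap),  # CIP first, then TAP
}


def race_product_expected(end, treatment):
    """True if the treated RNA ends in a ligatable 5'-monophosphate."""
    for step in TREATMENTS[treatment]:
        end = step(end)
    return end == "p"


def race_pattern(end):
    """Expected presence of a PCR product under each treatment."""
    return {t: race_product_expected(end, t) for t in TREATMENTS}


def infer_end_types(observed):
    """5'-end states whose expected pattern matches the observed one."""
    return [e for e in END_TYPES if race_pattern(e) == observed]


observed = {"untreated": False, "CIP": False, "TAP": True, "CIP+TAP": True}
print(infer_end_types(observed))
```

A product in the TAP and CIP + TAP lanes only, as observed for the four transcripts examined, is the one pattern that the model assigns uniquely to a blocked end; di- and triphosphate ends give identical patterns and cannot be told apart by this assay.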
Cap-dependent Translation of mRNA Transfected into Giardia in Vivo-To evaluate whether m7G caps were utilized by Giardia to promote translation, we synthesized luciferase transcripts in vitro with and without 5′-m7G caps and 3′-poly(A) tails. We found, in agreement with the recent report from Li and Wang (25), that luciferase activity was detected after transfection of capped mRNA, but not after transfection of uncapped mRNA (Fig. 8). Luciferase production was evident within 1 h post-transfection, and activity persisted for at least 20 h. The presence of a 50-nucleotide poly(A) tail was also required for expression of the reporter mRNA. We conclude that caps and poly(A) tails can be utilized by Giardia to promote translation in vivo.
Candidate Capping Enzymes in the Red Alga C. merolae-Yeast-type metal-dependent RNA triphosphatases of the tunnel family have now been characterized from a variety of unicellular eukarya. The RNA guanylyltransferases of these unicellular organisms are conspicuously not fused to a cysteine-phosphatase-type RNA triphosphatase, as are the guanylyltransferases of metazoa and plants (e.g. Arabidopsis, rice, tomato, potato, maize, barley, and wheat). Plants, green algae, and red algae are believed to descend from a common ancestor that acquired its plastid by endosymbiosis of a cyanobacterium (20). The recently reported genome sequence of the unicellular red alga C. merolae (17) provides an opportunity to query whether plants, green algae, and red algae are monophyletic with respect to the organization of their mRNA capping systems.
By searching the red alga genome data base (merolae.biol.s.u-tokyo.ac.jp) with S. pombe RNA triphosphatase Pct1 (21), we identified a C. merolae gene (locus CMR231C) encoding a putative 397-aa metal-dependent RNA triphosphatase enzyme, which we named CmCet1 (Fig. 1). CmCet1 contains the two signature glutamate-containing metal-binding motifs and putative homologs of the β strands that comprise the triphosphate tunnel of yeast Cet1, including the conserved catalytic side chains (Fig. 2). A psi-blast search using the CmCet1 sequence itself readily identified all fungal metal-dependent RNA triphosphatases as well as viral and protozoal members of this family.
We identified a 414-aa monofunctional guanylyltransferase encoded by C. merolae locus CMA035C, which we named CmCeg1. CmCeg1 contains the six nucleotidyltransferase motifs that comprise the active site of prototypal capping enzymes (Fig. 1). We also identified a putative 373-aa cap guanine-N7 methyltransferase encoded by C. merolae locus CMP091C (data not shown). We surmise that the red alga has a yeast-like three-component mRNA capping apparatus that includes a tunnel family RNA triphosphatase. The capping apparatus of this unicellular red alga is substantially diverged from that of plants.

Fig. 7 legend: Trophozoite RNA treated with CIP + TAP (lanes 1), TAP alone (lanes 2), CIP alone (lanes 3) or an untreated control (lanes 4) was used for 5′-RACE as described under "Experimental Procedures." An anchor primer was ligated to the RNA 5′-ends, and gene-specific transcripts were amplified by reverse transcription-PCR. For less abundant transcripts (glutamate dehydrogenase (GDH), NOR sense, and NOR antisense), one round of nested PCR was performed. The expected outcomes of treating a 5′-capped RNA (GpppRNA), a 5′-triphosphate-terminated RNA (pppRNA), and a 5′-monophosphate-terminated RNA (pRNA) with CIP alone, TAP alone, or CIP then TAP are depicted at right. The ligatable pRNA species that can give rise to a 5′-RACE product are highlighted in shaded boxes.

DISCUSSION
Capping in Giardia-An initial report that bulk Giardia poly(A)-containing RNAs do not have a blocking cap structure at their 5′-ends (15) raised the prospect that either (i) the mRNA cap did not exist in the last common ancestor of all present eukarya, but arose subsequent to the branching of Giardia, or (ii) the cap did exist in the ancestral eukaryote but was lost in the lineage that gave rise to Giardia. It was hypothesized that the mechanism of translation initiation in Giardia was fundamentally different from that of other eukarya in being cap-independent (15).
However, other analyses of a specific Giardia transcript highlighted two distinct 5′ starts, one of which had either a 5′-triphosphate or blocked 5′-end, whereas the other had a 5′-monophosphate (16). Our analysis of four additional transcripts, including one antisense transcript, indicates that all four have blocked 5′-ends consistent with cap-like modifications of the primary transcripts. In the absence of direct analysis of the cap structure of specific Giardia mRNAs, we regard the demonstration that Giardia transcripts have blocked 5′-ends and that the parasite encodes an mRNA capping apparatus as reasonable evidence that this primitive eukaryote does indeed cap its mRNAs. Our biochemical characterization of the recombinant Giardia triphosphatase and guanylyltransferase proteins highlights their similarities to the well characterized fungal mRNA capping enzymes. Although the present study was limited to the enzymes that are predicted to catalyze the first two steps in formation of the m7GpppN cap in Giardia, we and others (22) also identified a 422-aa putative Giardia cap methyltransferase (GenBank™ AACB01000009) that resembles the biochemically and structurally characterized cap methyltransferases of fungi, metazoa, and microsporidia (data not shown) (23,24). Our initial attempts to produce catalytically active Giardia cap methyltransferase in bacteria were not successful. Nonetheless, we surmise that Giardia can synthesize m7GpppN caps.
This conclusion is consistent with a recent study (25) and our own data showing that an m7GpppN-capped reporter mRNA, but not an uncapped reporter RNA, was translated in Giardia trophozoites in vivo when introduced into cells by electroporation. This result, and the fact that Giardia encodes two homologs of the cap-binding translation initiation factor eIF4E (data not shown), argue that Giardia is not an exception to the general reliance on the cap structure for optimal gene expression in eukarya. The hypothesis that Giardia caps its RNAs is also consistent with an early study identifying several small RNAs in Giardia that reacted with antibody to the 2,2,7-trimethylguanosine (TMG) cap; the TMG-capped species were tentatively identified as Giardia snRNAs with possible roles in RNA processing (26). Available evidence suggests that TMG caps are formed by post-transcriptional N2-methylation of the standard m7GpppN cap (27). In this light, it is noteworthy that Giardia encodes a putative homolog of the yeast TMG cap synthase, Tgs1 (data not shown).
As current methods only allow two alleles in the tetraploid Giardia genome to be targeted in vivo, it is not possible to test directly whether the GlCEG1 and GlCET1 genes are essential for Giardia viability or to assess their role in mRNA metabolism. We attempted to exploit a surrogate genetic system by querying whether expression of the Giardia proteins could sustain the growth of S. cerevisiae cells that lack the endogenous guanylyltransferase Ceg1 or the triphosphatase Cet1. We found that neither GlCeg1 nor GlCet1 could complement the respective yeast null mutations when they were expressed under the control of a constitutive yeast promoter on a centromeric plasmid (data not shown). GlCeg1 was also unable to complement the yeast ceg1Δ mutation when provided on a multicopy plasmid (data not shown). A possible explanation for the failure of the Giardia enzymes to function in yeast is that they are unable to interact with yeast RNA polymerase II. It is noteworthy that the largest subunit of Giardia RNA polymerase II (28) lacks a carboxyl-terminal domain, composed of tandem repeats of the heptapeptide YSPTSPS, which is characteristic of fungal, plant, and metazoan RNA polymerase II (29). Among its many known or imputed functions, the carboxyl-terminal domain acts as a landing pad for the targeting of the guanylyltransferase component of the fungal and metazoan capping apparatus to the transcription elongation complex (30,31).
G. lamblia is a significant cause of water-borne enteritis in humans. Although not yet validated as essential for Giardia growth, the triphosphatase GlCet1 is a potential target for anti-giardial drug discovery because: (i) the structure and catalytic mechanism of the triphosphatase family to which GlCet1 belongs are completely different from those of the human host capping enzyme and (ii) humans encode no plausible homolog of GlCet1 or other tunnel family RNA triphosphatases. The similarity between GlCet1 and other metal-dependent RNA triphosphatases raises the possibility of achieving broad-spectrum antiprotozoal and antifungal activity with a single class of mechanism-based inhibitors.
An Expanded Cap-centric Phylogeny of Eukarya-A scheme of eukaryotic phylogeny was suggested (2, 3) based on two features of the mRNA capping apparatus: the structure and mechanism of the triphosphatase component (metal-dependent "tunnel" type versus metal-independent cysteine-phosphatase type) and whether the triphosphatase is physically linked in cis to the guanylyltransferase component (Fig. 9). By these simple criteria, relying on stark differences in the same metabolic pathway, we arrived at different relationships among taxa than those suggested by comparisons of sequence variations among proteins that are themselves highly conserved in all eukaryotes (32). For example, the capping-based phylogeny placed metazoans (specifically arthropods, nematodes, and vertebrates) in a common lineage with plants (exemplified by Arabidopsis thaliana) because these organisms have a cysteine-phosphatase type RNA triphosphatase fused in cis to their guanylyltransferase (Fig. 9). Fungi, microsporidia, and many protozoa (including Plasmodium, Trypanosoma, and, as shown here, Giardia) fall into a different lineage distinguished by a tunnel family RNA triphosphatase that is physically separate from RNA guanylyltransferase. In contrast to the cap-centric view, the protein sequence variation-based scheme (32) places fungi in the same supergroup as metazoa and puts Plasmodium and Trypanosoma nearer to plants.
FIG. 9. Cap-centric scheme of eukaryal evolution. A capping system consisting of separately encoded metal-dependent RNA triphosphatase (TPase, white) and guanylyltransferase (GTase, gray) enzymes is found in the indicated protozoal, red algal, and fungal lineages and is proposed to represent the ancestral state of the capping apparatus. The metazoan and plant capping systems, consisting of a cysteine phosphatase-type RNA triphosphatase (TPase, black) fused to a guanylyltransferase (GTase), represent a derived state. A different derived state, resulting from fusion of the metal-dependent triphosphatase to the guanylyltransferase, is found in Entamoeba, Dictyostelium, and Phytophthora.
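The two criteria of the cap-centric scheme (triphosphatase type, and whether the triphosphatase is fused to the guanylyltransferase) amount to a simple decision rule, which can be sketched as follows. This is a minimal illustration only: the class, function, and category labels are hypothetical, and the "unresolved" category for an unfused cysteine-phosphatase-type enzyme is an assumption added to make the rule complete, since the scheme itself does not define that case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CappingApparatus:
    # "tunnel" = metal-dependent tunnel family; "cysteine" = cysteine-phosphatase type
    tpase_type: str
    # True if the triphosphatase is linked in cis to the guanylyltransferase
    fused_to_gtase: bool


def classify(apparatus):
    """Assign a capping apparatus to a state of the cap-centric scheme (labels are hypothetical)."""
    if apparatus.tpase_type not in ("tunnel", "cysteine"):
        raise ValueError(f"unknown triphosphatase type: {apparatus.tpase_type!r}")
    if apparatus.tpase_type == "tunnel":
        return "derived: tunnel fusion" if apparatus.fused_to_gtase else "ancestral"
    # Unfused cysteine-phosphatase type is not covered by the scheme; flagged here by assumption.
    return "derived: cysteine-phosphatase fusion" if apparatus.fused_to_gtase else "unresolved"


# Examples taken from the organisms discussed in the text.
EXAMPLES = {
    "Giardia lamblia": CappingApparatus("tunnel", False),
    "Entamoeba histolytica": CappingApparatus("tunnel", True),
    "Arabidopsis thaliana": CappingApparatus("cysteine", True),
}

if __name__ == "__main__":
    for organism, apparatus in EXAMPLES.items():
        print(organism, "->", classify(apparatus))
```

The rule deliberately ignores sequence similarity between the enzymes themselves, which is the point of contrast with the protein sequence variation-based scheme (32).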
Recent progress in plant genome sequencing has consolidated the conservation of bifunctional capping enzymes in plants and metazoans. Searching the plant databases at TIGR (tigrblast.tigr.org/tgi) with an Arabidopsis capping enzyme revealed clear evidence of a cysteine-phosphatase/guanylyltransferase fusion in many species of higher plants, including rice, tomato, potato, maize, barley, and wheat (data not shown). It is therefore highly instructive that the plastid-containing red alga, which shares an evolutionary history with plants (20), has a different capping apparatus akin to that of yeast and unicellular protozoa. The implication is that the plant/metazoan-type capping apparatus appeared subsequent to branching of the red alga from the ancestral algal/plant lineage. A key question is whether the differences in capping enzymes between fungi/red algae/protozoa and metazoa/plants reflect a transition from unicellularity to multicellularity. We suspect this is not the case, insofar as the unicellular green alga Chlamydomonas reinhardtii encodes a cysteine phosphatase-type RNA triphosphatase along with a guanylyltransferase. However, because we do not find in the available genome database a continuous genomic clone or cDNA that includes both the triphosphatase and the guanylyltransferase active sites, we can only tentatively place the green alga in the plant/metazoan group with respect to capping. It is conceivable that a capping enzyme composed of a cysteine phosphatase-type RNA triphosphatase arose first in green algae and was retained in the ensuing plant lineage and transferred horizontally to a separate ancestor of the metazoan lineage. Alternatively, green algae, plants, and metazoans might have descended from a common ancestral lineage.
Given that Giardia is regarded as a basal eukaryote, the available evidence strongly favors the model that a three-component capping system with a metal-dependent tunnel family triphosphatase is the ancestral state from which other eukarya evolved by gene fusion and gene replacement with a novel analog (Fig. 9). It was noted previously (3) that certain protozoa (e.g. Entamoeba histolytica) acquired a bifunctional capping enzyme by fusion of the ancestral triphosphatase and guanylyltransferase genes (Fig. 9). The candidate E. histolytica capping enzyme is composed of an N-terminal tunnel family triphosphatase domain fused to a guanylyltransferase domain. A similar bifunctional tunnel family triphosphatase/guanylyltransferase fusion appears to exist in Dictyostelium discoideum. By mining the genome database for the water mold Phytophthora ramorum (www.jgi.doe.gov/genomes), we find that this unicellular organism also encodes a bifunctional tunnel-type triphosphatase/guanylyltransferase fusion (data not shown). Phytophthora is classified as a Stramenopile, a kingdom that includes diatoms and brown algae.
Metazoans and plants have experienced a different and more radical gene rearrangement that transferred a cysteine-phosphatase domain into the same transcription unit as the guanylyltransferase. A pathway of evolution was suggested, entailing the appearance of a new cysteine phosphatase enzyme (e.g. via duplication and mutation of one of the protein phosphatase genes present in lower eukarya) that gained the capacity to hydrolyze an RNA 5′-phosphate instead of, or in addition to, a phosphoprotein (3). The subsequent fusion of the cysteine phosphatase to the guanylyltransferase then allowed for the loss of the tunnel family enzyme from the genome of a putative common metazoan/plant ancestor or else the divergence of the protein to a point that it is no longer discernible as Cet1-like. An alternative explanation is that the cysteine-phosphatase/guanylyltransferase fusion arose first in the green algal/plant (or metazoan) lineage and was subsequently transferred horizontally to the metazoan (or green algal/plant) ancestor. Another alternative explanation is that green algae/plants and metazoans independently experienced this gene fusion in distant branches of the phylogenetic tree.
The cap-centric scheme of eukaryotic evolution is useful for its simplicity and because answers to questions about missing evolutionary links and the order of events in the progression from fungal and protozoal-type to metazoan and plant-type capping systems are emerging as more genomes are sequenced. A key outstanding issue is the nature of the mRNA capping apparatus in the earliest branching metazoan organisms.