Characterization of the adenylation site in the RNA 3'-terminal phosphate cyclase from Escherichia coli.

RNA 3'-terminal phosphate cyclases are a family of evolutionarily conserved enzymes that catalyze ATP-dependent conversion of the 3'-phosphate to the 2',3'-cyclic phosphodiester at the end of RNA. The precise function of cyclases is not known, but they may be responsible for generating or regenerating cyclic phosphate RNA ends required by eukaryotic and prokaryotic RNA ligases. Previous work carried out with human and Escherichia coli enzymes demonstrated that the initial step of the cyclization reaction involves adenylation of the protein. The AMP group is then transferred to the 3'-phosphate in RNA, yielding an RNA-N(3')pp(5')A (N is any nucleoside) intermediate, which finally undergoes cyclization. In this work, by using different protease digestions and mass spectrometry, we assign the site of adenylation in the E. coli cyclase to His-309. This histidine is conserved in all members of the class I subfamily of cyclases identified by phylogenetic analysis. Replacement of His-309 with asparagine or alanine abrogates both enzyme-adenylate formation and cyclization of the 3'-terminal phosphate in a model RNA substrate. The cyclase is the only known protein undergoing adenylation on a histidine residue. Sequences flanking the adenylated histidine in cyclases do not resemble those found in other proteins modified by nucleotidylation.

The RNA 3'-terminal phosphate cyclase, an enzyme originally identified in extracts from human HeLa cells and Xenopus oocyte nuclei, catalyzes the ATP-dependent conversion of the 3'-terminal phosphate group into a 2',3'-cyclic phosphodiester at the 3'-end of RNA (Refs. 1-3; reviewed in Ref. 4). The exact biological role of the cyclase in RNA metabolism remains unknown, but the demonstration that several eukaryotic and prokaryotic RNA ligases require 2',3'-cyclic phosphate RNA ends (Refs. 1 and 5-16; reviewed in Refs. 17-19) suggests that the enzyme may be involved in generation or maintenance of cyclic termini in RNA ligation substrates (4,20). Alternatively, the cyclase could be responsible for producing cyclic phosphate 3'-ends identified in the spliceosomal U6 small nuclear RNA (21) and some other small RNAs (Ref. 22; for discussion of additional possible functions, see Ref. 20).
The cyclase has been purified from HeLa cell extracts, and its cDNA has been cloned (20,23). The enzyme is expressed in all mammalian tissues and cell lines investigated, and has a nucleoplasmic localization, consistent with its postulated role in RNA processing (20). The cyclase has no apparent motifs in common with any proteins of known function. However, data base searches indicated that genes encoding proteins with a significant similarity to the human cyclase are conserved among eucarya, bacteria, and archaea. When the protein encoded in the Escherichia coli genome was overexpressed, it showed RNA 3'-phosphate cyclase activity. The E. coli cyclase gene forms part of a previously uncharacterized operon, expression of which is controlled by an alternative sigma factor, σ54 (20,24).
The properties of the human and bacterial cyclases are very similar. Both enzymes catalyze conversion of the 3'-terminal phosphate to a 2',3'-cyclic phosphodiester in a reaction dependent on ATP, other nucleoside triphosphates being much less active co-factors. With both enzymes, the cyclization of the 3'-phosphate at the 3'-end of RNA occurs by a three-step mechanism (2-4, 20, 23, 24) as follows. (i) Enzyme + ATP → Enzyme-AMP + PPi. (ii) RNA-N(3')p + Enzyme-AMP → RNA-N(3')pp(5')A + Enzyme, where N is any nucleoside, and p is a phosphate group. (iii) RNA-N(3')pp(5')A → RNA-N>p + AMP, where N>p is nucleoside 2',3'-cyclic phosphate.
Evidence for step (i) comes from identification by either SDS-polyacrylamide gel electrophoresis (SDS-PAGE) or gel filtration of the covalent cyclase-AMP complex.
Step (ii) is supported by the ability of 3'-phosphorylated RNA but not 3'-OH-terminated RNA to release AMP from the preformed adenylated cyclase complexes and by accumulation of the RNA-N(3')pp(5')A molecules when the ribose at the RNA 3' terminus is replaced with the 2'-deoxy- or 2'-O-methylribose (2,3,20,23,24). Step (iii) probably takes place nonenzymatically as the result of nucleophilic attack by the adjacent 2'-OH on the phosphorus in the phosphodiester linkage.
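The logic of the three steps, and of the substrate requirements that support them, can be summarized in a minimal sketch. The function name `run_cyclase` and its boolean arguments are illustrative assumptions and not part of the original work; the sketch follows one enzyme molecule, one ATP, and one RNA 3' end, and ignores kinetics entirely.

```python
def run_cyclase(three_prime_phosphate: bool, two_prime_oh: bool) -> list[str]:
    """Return the species formed, in order, for one enzyme, one ATP and one RNA end.

    three_prime_phosphate: True for RNA-N(3')p, False for 3'-OH-terminated RNA.
    two_prime_oh: False when the terminal ribose is 2'-deoxy or 2'-O-methyl.
    """
    # Step (i): the enzyme is adenylated regardless of the RNA present.
    events = ["Enzyme-AMP + PPi"]
    if not three_prime_phosphate:
        # Without an acceptor 3'-phosphate, AMP is not released from the enzyme.
        return events
    # Step (ii): AMP is transferred to the 3'-phosphate.
    events.append("RNA-N(3')pp(5')A + Enzyme")
    if not two_prime_oh:
        # No adjacent 2'-OH to attack the phosphorus: the intermediate accumulates.
        return events
    # Step (iii): attack by the 2'-OH yields the cyclic phosphate and free AMP.
    events.append("RNA-N>p + AMP")
    return events


print(run_cyclase(True, True))    # normal substrate: all three steps
print(run_cyclase(False, True))   # 3'-OH RNA: stops after step (i)
print(run_cyclase(True, False))   # 2'-deoxy or 2'-O-methyl end: stops after step (ii)
```

The two truncated outcomes correspond to the two observations cited as evidence for step (ii).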
Mechanistically, with respect to formation of the covalent protein-nucleoside monophosphate intermediate and transfer of nucleoside monophosphate to the terminal phosphate (or pyrophosphate) in nucleic acid, the cyclase resembles RNA and DNA ligases and capping enzymes (reviewed in Ref. 25). In all the latter cases, nucleotidyl transfer occurs through a covalent lysyl-nucleoside monophosphate phosphoamide intermediate; the active-site lysine is present in a conserved short sequence motif, KXDG. RNA ligases, ATP-dependent DNA ligases, and capping enzymes also contain several additional conserved motifs (25). Neither KXDG nor these additional sequence motifs are identifiable in cyclases.
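A scan for the KXDG motif (X being any residue) amounts to a simple pattern search. The following is a minimal sketch; the function name `find_kxdg` and the toy sequence are hypothetical, introduced only for illustration, whereas the 19-residue cyclase peptide is the adenylated peptide reported in this paper.

```python
import re

# Lookahead with a capture group so that overlapping occurrences are all reported.
KXDG = re.compile(r"(?=(K[A-Z]DG))")


def find_kxdg(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position, matched tetrapeptide) for each KXDG occurrence."""
    return [(m.start() + 1, m.group(1)) for m in KXDG.finditer(sequence.upper())]


# Adenylated peptide of the E. coli cyclase identified in this work.
cyclase_peptide = "FTVAHPSCHLLTNIAVVER"
# Hypothetical toy sequence, invented only to show what a positive hit looks like.
toy_ligase_like = "MSTAEKADGERLIV"

print(find_kxdg(cyclase_peptide))   # [] (the peptide contains no lysine at all)
print(find_kxdg(toy_ligase_like))   # [(6, 'KADG')]
```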
In this work, we determined the adenylation site of the E. coli cyclase. The adenylated amino acid His-309 is conserved in a large subfamily of cyclases encompassing all bacterial and archaeal proteins and also some metazoan proteins.

Overexpression and Purification of Wild-type and Mutant Cyclases-The pET11d vector-based plasmid for overexpression of the wild-type E. coli cyclase containing a 6×His tag at the C terminus has been previously described (20,24). Plasmids for overexpression of mutant proteins were generated by a polymerase chain reaction approach as described previously (26). The identity of mutants was checked by DNA sequencing. Overexpression was performed in the E. coli strain BL21(DE3)pLysS as described before (20). For purification, the bacterial pellet was resuspended in buffer A (50 mM Tris-HCl, pH 8.0, 0.3 M NaCl, 10 mM imidazole, 1 mM DTT) supplemented with 0.5% Triton X-100 and protease inhibitors (complete protease inhibitors-EDTA mixture, Roche Molecular Biochemicals). The cells were lysed by sonication, and the lysate, cleared by centrifugation, was applied to a Ni2+-silica gel column (Qiagen) pre-equilibrated with buffer A. The column was washed with buffer A containing 40 mM imidazole, and the cyclase protein was eluted with buffer A containing 0.4 M imidazole. Samples were desalted into 50 mM HEPES-NaOH, pH 7.8, 0.1 M NaCl, 0.5 mM DTT, 5% glycerol, concentrated using the UltraFree Biomax system (Millipore), and stored at -20°C. The protein was more than 95% pure as judged by SDS-PAGE (20). Protein concentration was determined using the Bradford procedure with bovine serum albumin as a standard (27).
Cyclization Assay-Cyclase activity was assayed by the Norit method as described elsewhere (4). Unless otherwise indicated, the 10-µl assays contained 40 fmol of the substrate, and incubations were for 20 min at 25°C. Other details are indicated in the figure legends.
Analytical Adenylation Assays with [α-32P]ATP-Reactions (15 µl) containing 20 ng of wild-type or mutant cyclase and 2.5 µM [α-32P]ATP (specific activity, 300 Ci/mmol) were incubated at 25°C for 3 h in 50 mM HEPES-NaOH buffer, pH 8.0, containing 0.2 M NaCl, 10 mM MgCl2, 1 mM DTT, and 10% glycerol. The reactions were analyzed by SDS-PAGE and autoradiography. Immediately before loading onto the gel, samples were supplemented with unlabeled ATP (final concentration 10 mM) to decrease the background.
Adenylation with [α-32P]ATP and Protease Mapping-Twenty-five µg (0.7 nmol) of cyclase was adenylated in 60 µl of buffer T (50 mM Tris-HCl, pH 8.0, 0.2 M NaCl, 2 mM MgCl2, 1 mM CaCl2, 1 mM DTT) containing 20 µM [α-32P]ATP. After incubation for 3 h at 25°C, the sample was divided into three equal aliquots, which were submitted to digestion with either trypsin, endoproteinase Glu-C (both from Promega, Madison, WI), or Lys-C protease (Wako, IG Instrumenten Gesellschaft, Zurich, Switzerland) overnight at 30°C (Lys-C) or 37°C (trypsin and endoproteinase Glu-C). A 10% aliquot of each digestion reaction was analyzed by SDS-PAGE using a Tris/Tricine/urea system described by Schagger and Von Jagow (28) after boiling the samples in the presence of SDS and DTT. Separated peptides were visualized by silver staining and autoradiography. The remaining 90% of each digest was resolved in the same electrophoresis system, and the peptides were electroblotted onto a polyvinylidene difluoride membrane. After autoradiography, labeled peptides retained on the membrane were excised and subjected to sequence analysis by Edman degradation using an Applied Biosystems (Foster City, CA) model 477A sequencer and following the manufacturer's recommendations.
Adenylation with Unlabeled ATP, and LC-ESIMS and NanoESI-MSMS Analyses-Two hundred µg (6.7 nmol) of cyclase was incubated in buffer T with 60 µM ATP. After incubation for 2 h at 25°C, proteins were alkylated with 5 mM iodoacetamide for 45 min directly in the adenylation buffer. The protein was then digested at 37°C for 1 h with 5 µg of trypsin followed by a 1-h cleavage with 5 µg of endoprotease Glu-C. Liquid chromatography-purified adenylated peptide was digested with 0.2 µg of subtilisin (Sigma) for 90 min at 25°C in 50 mM NH4HCO3.
Peptides were separated on a 1 × 250-mm Vydac C8 column (Hesperia, CA) equilibrated in 98% solvent A (25 mM ammonium acetate, pH 6, 2% CH3CN) and solvent B (25 mM ammonium acetate, pH 6, 80% CH3CN), and a linear gradient was developed from 5 to 50% solvent B in 60 min at a flow rate of 50 µl/min. The LC-ESIMS system was as described previously (29). NanoESI-MSMS was performed according to the published method of Wilm and Mann (30). The mass spectra were acquired on an API 300 mass spectrometer (PE Sciex, Toronto, Ontario, Canada) equipped with a NanoESI source (Protana, Odense, Denmark).
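For orientation, the mobile-phase composition during the linear gradient can be estimated as follows. This is a minimal sketch under stated assumptions: the function names are illustrative, the gradient is taken to be perfectly linear from 5 to 50% solvent B over 60 min, the solvents are assumed to mix ideally by volume, and system dwell volume is ignored.

```python
def percent_b(t_min, start_b=5.0, end_b=50.0, duration_min=60.0):
    """Percent solvent B at time t (min), clamped to the programmed gradient window."""
    t = min(max(t_min, 0.0), duration_min)
    return start_b + (end_b - start_b) * t / duration_min


def percent_acetonitrile(b_percent, acn_in_a=2.0, acn_in_b=80.0):
    """Acetonitrile content (% v/v) of the blend; solvent A is 2% and solvent B 80% CH3CN."""
    return acn_in_a + (acn_in_b - acn_in_a) * b_percent / 100.0


for t in (0, 30, 60):
    b = percent_b(t)
    print(f"t = {t:2d} min: {b:.1f}% B, about {percent_acetonitrile(b):.1f}% CH3CN")
```

Under these assumptions the acetonitrile content rises from roughly 6% to 41% over the course of the run.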
Model Building-A homology-based model of the human cyclase was obtained using the coordinates of the E. coli enzyme. The two enzymes were aligned using the program Bestfit (Genetics Computer Group, Madison, WI). The optimal alignment (34% identity with 5 gaps) was used by the program MODELER (Ref. 31; BIOSYM/Molecular Simulations, San Diego, CA) to build and refine six human cyclase models. Human cyclase is 27 residues longer than the E. coli enzyme. No attempt was made to model this part of the polypeptide chain. The best model, selected for having the lowest violations of the probability density functions, was evaluated using the program Profiles_3D (Ref. 32; BIOSYM/Molecular Simulations). Its overall self-compatibility score (162.2) was slightly better than that expected for a polypeptide chain of this length (157.6), supporting the reliability of the model.

Known Cyclases and Cyclase-like Proteins Can be Grouped into Two Subfamilies-A dendrogram of the E. coli and human cyclases and cyclase-like proteins encoded in different organisms indicated that they can be subdivided into two classes (Fig. 1A). Members of class I include all prokaryotic proteins, the Dictyostelium discoideum protein, and one of the two proteins expressed in Drosophila and humans. The previously characterized E. coli and human cyclases belong to this class of enzymes. Class II comprises proteins encoded in the genomes of budding and fission yeast and Caenorhabditis elegans, and also the second forms of the proteins expressed in Drosophila and humans.
Alignment of the members of the cyclase class I subfamily is shown in Fig. 1B. The class I proteins analyzed, as well as the class II proteins (data not shown), have no apparent motifs in common with other proteins in various data bases. The N-terminal halves and C-proximal parts of class I cyclases are relatively highly conserved at the sequence level, with central regions of proteins being highly variable. The two most outstanding regions of similarity are glycine-rich sequences corresponding to residues 9-27 and 155-166 of the E. coli protein. Interestingly, the sequence of the human Hs1 protein is more closely related to the D. discoideum protein than to the Drosophila Dm1 protein (Fig. 1). It is possible that human and Drosophila proteins are not orthologs and that additional cyclase genes are expressed in vertebrates and/or insects.
Previous demonstration that the covalent human cyclase-AMP complex is unstable when heated in 0.1 N HCl or when treated with hydroxylamine at pH 4.7, but insensitive to heating in 0.1 N NaOH, has suggested that AMP is linked to the protein via a phosphoamide bond, possibly involving the lysine ε-amino group (3,23). The E. coli cyclase-AMP complex is likewise resistant to treatment with alkali (0.1 N NaOH, 1 min at 95°C) but is unstable in 0.1 N HCl (1 min at 95°C), consistent with the phosphoamide linkage (data not shown). As evident from the alignment shown in Fig. 1B, no conserved lysine residue is present in class I cyclases.
Protease Mapping of the Adenylation Site Region in the E. coli Cyclase-To delineate the protein region containing the adenylation site, protease mapping of the adenylated E. coli cyclase was performed. The protein was adenylated in the presence of [α-32P]ATP and treated with either trypsin, endoproteinase Glu-C, or Lys-C protease. In this and other experiments it was essential to add proteases directly to adenylation reaction mixtures, since attempts to purify the adenylated complex by gel filtration under nondenaturing conditions resulted in loss of incorporated label. Hence, as observed previously for the human enzyme (3,23), the E. coli cyclase-AMP complex may undergo hydrolysis in the absence of SDS. The peptides resulting from protease digestion of the cyclase-AMP complex were separated by SDS-PAGE, located by autoradiography, and analyzed by microsequencing. This analysis demonstrated that the adenylated amino acid is located between residues 258 and 324 (data not shown; see Fig. 1B).
To identify the adenylated amino acid residue, the enzyme adenylated in the presence of cold ATP and nonadenylated control enzyme were digested with trypsin followed by endoproteinase Glu-C. The peptides were separated by reverse-phase LC-ESIMS. Comparison of digests of the adenylated and nonadenylated proteins indicated the presence in the former of one additional peptide with a measured mass of 2436 Da (data not shown), suggesting that it corresponds to FTVAHPSCHLLTNIAVVER + AMP - H2O. The recovery of this peptide was improved by reduction and carboxyamidomethylation of the cyclase before protease digestion. The peptides were separated by reverse-phase LC-ESIMS, and a peptide with a 57-Da higher measured mass (2493 Da), eluting at nearly the same position as the 2436-Da peptide, was detected (Fig. 2). Its mass would be consistent with the sequence: FTVAHPSCHLLTNIAVVER + Cam + AMP - H2O. The identity of this peptide was confirmed by MSMS. This analysis did not identify the adenylated amino acid, since the AMP residue was lost under the MSMS conditions, and only nonadenylated ions were observed (Fig. 3).
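The mass assignments above can be checked with a short calculation. This is a minimal sketch, not the authors' procedure: it assumes that the reported values of 2436 and 2493 Da are average (not monoisotopic) masses, uses standard average residue masses, and the function name `peptide_mass` is illustrative.

```python
# Standard average residue masses (Da) for the amino acids present in the peptide.
AVG_RESIDUE = {
    "A": 71.0788, "C": 103.1388, "E": 129.1155, "F": 147.1766,
    "H": 137.1411, "I": 113.1594, "L": 113.1594, "N": 114.1038,
    "P": 97.1167, "R": 156.1875, "S": 87.0782, "T": 101.1051,
    "V": 99.1326,
}
WATER = 18.0153
AMP = 347.2212  # adenosine 5'-monophosphate, C10H14N5O7P
CAM = 57.0513   # carboxyamidomethyl group (C2H3NO) introduced by iodoacetamide


def peptide_mass(seq, adenylated=False, cam_groups=0):
    """Average mass of a peptide, optionally with AMP (minus water) and Cam groups."""
    mass = sum(AVG_RESIDUE[aa] for aa in seq) + WATER
    if adenylated:
        mass += AMP - WATER  # phosphoamide formation releases one water
    return mass + cam_groups * CAM


peptide = "FTVAHPSCHLLTNIAVVER"
print(f"unmodified:  {peptide_mass(peptide):.1f} Da")
print(f"+ AMP - H2O: {peptide_mass(peptide, adenylated=True):.1f} Da")
print(f"+ Cam too:   {peptide_mass(peptide, adenylated=True, cam_groups=1):.1f} Da")
```

Under these assumptions both calculated values fall within about 1 Da of the measured masses, and the difference between them equals the mass of one Cam group.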
The adenylated and carboxyamidomethylated peptide FTVAHPSCHLLTNIAVVER was further digested with subtilisin, yielding a number of short peptides that were analyzed by NanoESI-MSMS. The mass spectra and their interpretation are shown in Fig. 4A. The generated peptides were further characterized by MSMS, which confirmed the amino acid sequence of the following peptides: CHLL + Cam + AMP - H2O (870.3 Da), FTVAHPS (757.3 Da), and IAVVER (685.3 Da). The peptide with a mass of 870.3 Da showed a facile loss of a 329-Da mass, corresponding to an anhydro-AMP residue. As a result, only nonadenylated ions were observed, so the information about the residue originally carrying the modification was lost. The presence of the AMP residue was directly examined by MSMS analysis of this peptide in the negative mode. Two major product ions were detected (Fig. 4B), whose masses could be

The active-site peptide, CHLL, could in principle be adenylated on Cys-308 or His-309, and carboxyamidomethylation could also have taken place on either residue (33,34). That the AMP moiety is present at Cys-308 can, however, be excluded. From the difference in the m/z value for the y11 and y12 ions (160.5 Da) in the tandem mass spectrum of the original 2493-Da peptide (Fig. 3), it can be concluded that Cys-308 is carboxyamidomethylated. Furthermore, the y13 ion (m/z 1512) demonstrates that His-309 is not carboxyamidomethylated. This was confirmed by Edman degradation of this peptide, which yielded phenylthiohydantoin-Cam-Cys in cycle 8 (data not shown). These results leave His-309 as the only possible site of adenylation. This conclusion is strengthened by two observations. (i) The first is the stability of the cyclase-AMP complex in 0.1 N NaOH and its sensitivity to acidic pH. This is consistent with a P-N rather than a P-S linkage (35).
The acid lability of the P-N bond also explains why phenylthiohydantoin-His, rather than a modified residue, was observed in cycle 9 during Edman degradation of the 2493-Da peptide (data not shown). (ii) Second, the histidine residue is conserved in all class I cyclases, whereas the cysteine residue is only found in the E. coli enzyme (Fig. 1B).
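The subtilisin fragment masses and the y-ion argument can be reproduced numerically. The sketch below is an illustration under stated assumptions rather than the analysis actually performed: it uses standard monoisotopic residue masses, treats y ions as singly protonated, places a Cam group on every cysteine, and the function names are hypothetical.

```python
# Standard monoisotopic residue masses (Da) for the residues in the peptide.
MONO = {
    "A": 71.03711, "C": 103.00919, "E": 129.04259, "F": 147.06841,
    "H": 137.05891, "I": 113.08406, "L": 113.08406, "N": 114.04293,
    "P": 97.05276, "R": 156.10111, "S": 87.03203, "T": 101.04768,
    "V": 99.06841,
}
WATER = 18.01056
PROTON = 1.00728
ANHYDRO_AMP = 329.05252  # AMP (C10H14N5O7P) minus water
CAM = 57.02146           # carboxyamidomethyl group, C2H3NO


def neutral_mass(seq, extra=0.0):
    """Monoisotopic mass of a peptide plus any modification mass in `extra`."""
    return sum(MONO[aa] for aa in seq) + WATER + extra


def y_ion(seq, n):
    """m/z of the singly protonated y_n ion, with Cam on every Cys in the fragment."""
    frag = seq[-n:]
    return neutral_mass(frag, CAM * frag.count("C")) + PROTON


# Subtilisin fragments (reported: 870.3, 757.3 and 685.3 Da).
print(f"CHLL + Cam + AMP - H2O: {neutral_mass('CHLL', CAM + ANHYDRO_AMP):.2f}")
print(f"FTVAHPS:                {neutral_mass('FTVAHPS'):.2f}")
print(f"IAVVER:                 {neutral_mass('IAVVER'):.2f}")

# y ions of the 19-residue peptide after loss of AMP.
peptide = "FTVAHPSCHLLTNIAVVER"
print(f"y12 - y11: {y_ion(peptide, 12) - y_ion(peptide, 11):.2f}")
print(f"y13:       {y_ion(peptide, 13):.2f}")
```

With these assumptions, the y12 - y11 difference equals the mass of a Cam-modified cysteine residue (about 160 Da, against about 103 Da for unmodified cysteine), and the calculated y13 value agrees with m/z 1512 for a fragment carrying a single Cam group, leaving no room for a second Cam group on His-309.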
Activity of the Cyclase Mutants-To directly assess the importance of His-309 for activity of the E. coli cyclase, two single-amino acid mutant enzymes were engineered, overexpressed in E. coli, and purified. Inspection of the three-dimensional structure of the enzyme revealed that replacement of His-309 by either Asn or Ala would most likely not disturb the structure of the protein.
Mutant proteins were tested for activity as acceptors in the adenylation reaction and for their ability to catalyze cyclization of the 3'-phosphate in the model oligoribonucleotide substrate, AAAACAAAAGp* (the asterisk indicates a radiolabeled phosphate). Both of the His-309 mutations completely abolished adenylation of the protein (Fig. 5A) and its ability to catalyze cyclization of the 3'-phosphate (Fig. 5B). The recombinant C308A mutant protein was also engineered, overexpressed, and purified. Although less efficiently than the wild-type protein, this mutant underwent adenylation and catalyzed cyclization of the 3'-phosphate (data not shown). These results provide further evidence for His-309 acting as an adenylate acceptor.

DISCUSSION
In this work we assigned the site of adenylation in the E. coli cyclase to His-309 by using protease digestions and mass spectrometry. Consistent with this histidine residue acting as an AMP acceptor, its replacement with asparagine or alanine abrogated both formation of the enzyme-AMP complex and the cyclization of the 3'-terminal phosphate in a model RNA substrate. Based on the crystal structure of the E. coli cyclase, the introduced His-309 mutations should not interfere with the local structure of the protein, since this residue is largely exposed to solvent. Furthermore, changing it into a smaller residue (Ala or Asn) circumvents the problem of steric hindrance (see Fig. 6). Hence, it is unlikely that these mutations exerted their effect on enzyme activity indirectly by modifying folding of the protein rather than its catalytic site.
The human class I cyclase has 34% identity and 43% similarity with the E. coli enzyme. Modeling of the human sequence using the coordinates of the E. coli enzyme showed a very similar overall fold (data not shown). Importantly, the human enzyme contains a histidine residue at position 329, which corresponds to His-309 of the E. coli enzyme. Moreover, as expected for a catalytic-site residue, its immediate neighborhood in the three-dimensional model also appears to be highly conserved (Fig. 6). In particular, the presence of Glu-14 and Gln-18 is noteworthy, since these residues have been found to form hydrogen bonds with the histidine in the E. coli enzyme. Alignment of proteins belonging to the class II subfamily did not identify a conserved histidine residue in the C-terminal region or anywhere else in this group of proteins. It remains to be established whether class II proteins have cyclase activity.
To the best of our knowledge, the E. coli cyclase is the only established example of a protein adenylated on a histidine and using ATP as a co-factor. However, three other proteins are known to undergo modification of a histidine residue with nucleotidyl groups other than adenylyl. Galactose-1-phosphate uridylyltransferase, an enzyme involved in the Leloir pathway for galactose metabolism, is transiently uridylated at the Nε2 position of the imidazole ring of a histidine residue; UDP-glucose is a uridylyl group donor in this reaction (36,37). A second example is the gag protein of the S. cerevisiae double-stranded RNA L-A virus. His-154 of this protein makes a covalent complex with m7GMP (m7G is 7-methylguanosine), following a nucleophilic attack by the imidazole nitrogen on the α phosphate of the m7GpppN cap structure in mRNA. The capture of m7GMP by the viral protein results in decapping of cellular mRNAs (38). Recently, Cartwright and McLennan (39) reported that the brine shrimp GTP:GTP guanylyltransferase, the enzyme responsible for synthesis of diguanosine tetraphosphate (Gp4G), forms a histidyl-GMP reaction intermediate via Nε2 of a histidine residue. Although only a few proteins are known to be nucleotidylated on a histidine, phosphoryl transfer reactions involving a phosphohistidine residue are more common and found in many proteins that are members of two-component signaling systems in both prokaryotes and eukaryotes (40).
Amino acid sequences flanking the adenylated His-309 in the E. coli cyclase and corresponding histidines in other members of the class I family of cyclases do not resemble sequences found in other proteins that undergo nucleotidylation on histidine or other amino acid residues. For example, the bacteriophage T4 RNA ligase, like many other RNA and DNA ligases, forms a covalent protein-adenylyl intermediate and transfers AMP to the 5'-terminal phosphate in nucleic acid to form the 5'-5' phosphoanhydride ligation intermediate (reviewed in Ref. 19). Notably, in the absence of the physiological 5'-phosphorylated substrate, T4 RNA ligase can inefficiently transfer AMP to 3'-terminal phosphate, resulting in 3'-5' phosphoanhydride formation and 3'-phosphate cyclization, via a mechanism probably very similar to that of the RNA cyclase (41,42). However, in RNA and DNA ligases, the nucleotidyl transfer occurs to the lysine in a conserved sequence motif, KXDG; this motif is not present in cyclase class I or class II families. We have previously tested whether, in the absence of the 3'-phosphorylated end, the human cyclase has a potential to activate the 5'-terminal phosphate in RNA. No evidence of A(5')pp(5')N formation was found, even when a large excess of the enzyme was used. In addition, no evidence of cyclase-catalyzed inter- or intramolecular ligation of either 5'- or 3'-phosphorylated oligoribonucleotides was obtained (20). Taken together, these data argue that RNA ligases and RNA 3'-phosphate cyclases are very distinct enzymes.