UPF0586 Protein C9orf41 Homolog Is Anserine-producing Methyltransferase

Background: Anserine is an abundant dipeptide in vertebrate skeletal muscles.
Results: We identified UPF0586 protein C9orf41 homolog as a carnosine N-methyltransferase, responsible for anserine formation in rat muscle.
Conclusion: Besides being a carnosine N-methyltransferase, UPF0586 protein is likely to be a novel peptide or protein methyltransferase in eukaryotes.
Significance: This molecular identification will help to elucidate physiological functions of UPF0586 protein in eukaryotes.

Anserine (β-alanyl-N(π)-methyl-L-histidine), a methylated derivative of carnosine (β-alanyl-L-histidine), is an abundant constituent of vertebrate skeletal muscles. Although it has been suggested to serve as a proton buffer and radical scavenger, its physiological function remains mysterious. The formation of anserine is catalyzed by carnosine N-methyltransferase, recently identified in chicken as histamine N-methyltransferase-like (HNMT-like) protein. Although the HNMT-like gene is absent in mammalian genomes, the activity of carnosine N-methyltransferase was reported in most mammalian species. In the present investigation, we purified carnosine N-methyltransferase from rat muscles about 2600-fold. Three polypeptides of ∼45, 50, and 70 kDa coeluting with the enzyme activity were identified in the preparation. Mass spectrometry analysis of these polypeptides resulted in the identification of UPF0586 protein C9orf41 homolog as the only meaningful candidate. Rat UPF0586 and its yeast, chicken, and human orthologs were expressed in COS-7 cells and purified to homogeneity. Although all recombinant proteins catalyzed the formation of anserine, as confirmed by chromatographic and mass spectrometry analysis, rat UPF0586 was more active on carnosine than other orthologs. Confocal microscopy of HeLa cells expressing recombinant UPF0586 proteins revealed their presence in both cytosol and nucleus. Carnosine and Gly-His were the best substrates for all UPF0586 orthologs studied, although the enzymes also methylated other L-histidine-containing di- and tripeptides. Finally, cotransfection of COS-7 cells with rat or human UPF0586 and carnosine synthase transformed the cells into efficient anserine producers. We conclude that UPF0586 is mammalian carnosine N-methyltransferase and hypothesize that it may also serve as a peptide or protein methyltransferase in eukaryotes.

Carnosine (β-alanyl-L-histidine) and its methylated derivative, anserine (β-alanyl-N(π)-methyl-L-histidine), are dipeptides commonly found in excitable animal tissues, whereas they have never been detected in plants, fungi, or other eukaryotes (for a review, see Ref. 1). In contrast to carnosine, which is present at high concentrations (in the range of 0.6-30 mM) in both skeletal muscles and the central nervous system of almost all vertebrates, including human beings (2,3), the storage of anserine is more selective. Anserine is abundant in skeletal muscles of most vertebrates (1), and only traces of this dipeptide have been occasionally detected in brain (4,5). Anserine was reported as a major L-histidine-containing dipeptide in avian tissues (up to 43 mM in chicken pectoral muscle) (4), whereas much lower concentrations of this compound have been measured in muscles of mammals, such as rats (2-9 mM), cats (8 mM), and rabbits (17 mM) (1,6). Intriguingly, no endogenous anserine has ever been detected in human muscles (1,3).
A great effort has been made to understand the physiological role of both carnosine and anserine. Originally, these two dipeptides were postulated to serve as pH buffers neutralizing lactic acid produced in working muscle due to their abundance and a pKa value that is close to the physiological pH (4). This hypothesis, however, provides no explanation for the synthesis of anserine, which shows buffer capacity similar to that of carnosine. Recently, histidine-containing dipeptides have been postulated to exert a more complex effect on cell and tissue metabolism via their antiglycemic (7), antiglycation (8), and antioxidant properties (9). Unfortunately, no definitive explanation for their physiological importance has been provided so far.
Knowledge of the enzymes catalyzing the formation of histidine-containing dipeptides is rather limited, and only recently have carnosine synthase and chicken carnosine N-methyltransferase been molecularly identified as ATPGD1 (ATP-grasp
ml of H₂O and centrifuged at 13,000 × g for 10 min. After neutralization of the supernatant with 3 M K₂CO₃, the salts were removed by centrifugation (13,000 × g for 10 min); the clear supernatant was diluted 5 times with 20 mM Hepes, pH 7.5; and 2 ml were applied to Dowex 50W-X4 columns (1 ml, Na⁺ form) equilibrated with 20 mM Hepes, pH 7.5. The columns were washed with 5 × 2 ml of 20 mM Hepes, pH 7.5, to remove minor radioactive contaminants of the radioreagent. Anserine and methylated forms of L-histidine were eluted with 5 × 2 ml of 20 mM Hepes, pH 7.5, containing 0.5 M NaCl. To elute non-consumed [¹H + ³H]SAM, the columns were washed with 4 × 2 ml of 1 M NH₄OH. In the experiments where Gly-Gly-His, Gly-His-Gly, Ala-His or His-Ala, and His-Gly were used as substrates, 20 mM Hepes, pH 7.5, in the equilibration and elution buffers for Dowex 50W-X4 columns was replaced by 20 mM Hepes, pH 7.0, or 20 mM MES, pH 5.5, respectively, to obtain retention of the methylated peptides on the cation exchanger. In all cases, the samples to be counted were mixed with 6 volumes of scintillation fluid (Ultima Gold), and the incorporated radioactivity was analyzed with a Beckman LS6000 IC liquid scintillation counter.
Purification of Rat Carnosine N-Methyltransferase-Rat leg muscles (130 g) from four male Wistar rats, aged 4 months, were homogenized with 4 volumes (w/v) of buffer A (50 mM Hepes, pH 7.6, 10 mM KCl, 1 mM DTT, 1 mM EGTA, 1 mM MgCl₂, 5 μg/ml leupeptin, and 5 μg/ml antipain) with a Waring blender 7011HS. The homogenate was centrifuged for 40 min at 20,000 × g at 4°C. The resulting supernatant (400 ml) was filtered through three layers of gauze to remove fat particles and was applied to a DEAE-Sepharose column (300-ml bed volume) equilibrated with buffer A. The column was washed with 850 ml of buffer A and developed with a NaCl gradient (0-0.5 M in 715 ml) in buffer A, and fractions (6.5 ml) were collected. The most active fractions of the DEAE column (39 ml) were diluted to 220 ml with buffer B (50 mM Tris-HCl, pH 8.0, 10 mM KCl, 1 mM DTT, 1 mM EGTA, 1 mM MgCl₂, 5 μg/ml leupeptin, and 5 μg/ml antipain) and applied to a Q-Sepharose column (50-ml bed volume) equilibrated with buffer B. The column was washed with 135 ml of buffer B containing 25 mM NaCl, the retained protein was eluted with a NaCl gradient (25-500 mM in 350 ml in buffer B), and fractions (5 ml) were collected. The most active fractions (15 ml) were pooled, concentrated to 2.6 ml in Vivaspin-20 ultrafiltration devices, and loaded on a Superdex 200 16/60 column (120-ml bed volume) equilibrated with buffer A containing 100 mM NaCl. One-ml fractions were collected. To obtain a more purified enzyme preparation for tandem mass spectrometry, fractions 44-53 (10 ml) of the Superdex 200 purification step were pooled again, concentrated to 1 ml in Vivaspin-20 ultrafiltration devices, and loaded on a HiScreen blue Sepharose column (4.7-ml bed volume) equilibrated with buffer A. The column was washed with 25 ml of buffer A and developed with a NaCl gradient (0-1 M in 50 ml) in buffer A, and fractions (1 ml) were collected. All purification steps were performed at 4°C, and the enzymatic preparation was stored at −70°C between steps. Protein concentration was determined spectrophotometrically according to Bradford (13) using bovine γ-globulin as a standard. Protein content in the most active blue Sepharose fractions was determined following their 10-fold concentration in Vivaspin-500 ultrafiltration devices.
Identification of Rat Carnosine N-Methyltransferase by Tandem Mass Spectrometry-Because SDS-PAGE analysis of the peak activity fractions from the HiScreen blue Sepharose revealed a low protein content, resulting in very faint protein bands, fractions were concentrated 10-fold in a Vivaspin-500 ultrafiltration device and reanalyzed by SDS-PAGE. The bands co-eluting with carnosine N-methyltransferase activity in the blue Sepharose purification step were cut from a 10% gel and digested with trypsin. In-gel digestion was performed as described previously (14). Peptides were analyzed by nano-UPLC-tandem mass spectrometry employing Acquity nano-UPLC coupled with a Synapt G2 HDMS Q-TOF mass spectrometer (Waters) fitted with a nanospray source and working in MS^E mode under default parameters. Briefly, products of in-gel protein digestion were loaded onto a Waters Symmetry C18 trapping column (20 mm × 180 μm) coupled to the Waters BEH130 C18 UPLC column (250 mm × 75 μm). The peptides were eluted from these columns in a 1-85% gradient of acetonitrile in water (both containing 0.1% formic acid) at a flow rate of 0.3 μl/min. The peptides were directly eluted into the mass spectrometer. Data were acquired and analyzed using MassLynx version 4.1 software (Waters) and ProteinLynx Global Server version 2.4 software (Waters) with a false discovery rate of ≤4%, respectively. To identify rat carnosine N-methyltransferase, the complete rat (Rattus norvegicus) reference proteome was downloaded from UniProt, randomized, and used as the database for the MS/MS software.
Phylogenetic Analysis-Sequences homologous to rat UPF0586 protein C9orf41 homolog (GenBank™ accession number NP_001020145.1) were identified by Protein BLAST searches. A phylogenetic analysis was performed on the Phylogeny.fr platform (15). Amino acid sequences were aligned with MUSCLE (version 3.7) (16). After alignment, ambiguous regions were removed with Gblocks (version 0.91b) (17). Phylogenetic trees were generated using phylogenetic estimation using maximum likelihood (18) with the WAG model for amino acid substitution (19). The final tree was customized with the editing interface TreeDyn (20). A confidence level was assessed using the approximate likelihood ratio test (minimum of Shimodaira-Hasegawa-like procedure and χ²-based parametric) (21).
Overexpression and Purification of Recombinant UPF0586 Proteins-Yeast (Saccharomyces cerevisiae), chicken, and rat total RNA was prepared from 250 mg of cell pellet, 200 mg of pectoral muscle, and 200 mg of leg muscle, respectively, with the use of TriPure reagent according to the manufacturer's instructions. cDNA was synthesized using Moloney murine leukemia virus-reverse transcriptase (Thermo-Fermentas), with oligo(dT)₁₈ primer and 2.5 μg of total RNA according to the manufacturer's instructions.
The open reading frames encoding yeast (GenBank™ accession number NM_001182930.1), chicken (XM_003643032.1), and rat UPF0586 protein (NM_001024974.1) were PCR-amplified using Pfu DNA polymerase in the presence of 1 M betaine, whereas the ORF coding for the human protein (BC034033.1) was amplified from cDNA clone HsCD00296470 (DNASU Plasmid Repository). UPF0586s were amplified using 5′ primers containing the initiator codon preceded by the Kozak consensus sequence (22) and an EcoRI site and 3′ primers in which the original stop codon was replaced by an amino acid coding codon flanked by an XbaI site (for primer sequences, see Table 1). The amplified DNA products of the expected size were digested with the appropriate restriction enzymes; cloned into the pEF6/Myc-His A expression vector (Invitrogen), which allows the production of proteins with a C-terminal His₆ tag; and verified by DNA sequencing. For transfections, COS-7 cells (Cell Lines Service, Eppelheim, Germany) or HEK-293T cells (a kind gift of Dr. Maria Veiga-da-Cunha) were plated in 100-mm Petri dishes at a cell density of 1.7 × 10⁶ or 2.1 × 10⁶ cells/plate, respectively, in Dulbecco's minimal essential medium supplemented with 100 units/ml penicillin, 100 μg/ml streptomycin, and 10% (v/v) fetal bovine serum and grown in a humidified incubator under a 95% air and 5% CO₂ atmosphere at 37°C. After 24 h, each plate was transfected with 6 μg of either unmodified pEF6/Myc-His A vector or the same vector encoding HNMT-like protein using the TurboFect transfection reagent according to the protocol provided by the manufacturer. After 48 h, the culture medium was removed, and the cells were washed with 5 ml of phosphate-buffered saline and harvested in 1 ml of 50 mM Hepes, pH 7.5, containing 10 mM KCl, 1 mM MgCl₂, 1 mM EGTA, 5 μg/ml leupeptin, and 5 μg/ml antipain. The cells were lysed by freezing in liquid nitrogen, and after thawing and vortexing, the extracts were centrifuged at 4°C (20,000 × g for 30 min) to remove insoluble material.
In the experiments where COS-7 cells were co-transfected with two different plasmids, 1.7 × 10⁶ seeded cells were transfected with 3 μg of each plasmid to reach a total of 6 μg of DNA. Plasmid encoding carnosine synthase (CARNS1) was a kind gift of Dr. Maria Veiga-da-Cunha and was prepared as described (10). Twenty-four hours after transfection, the culture medium was changed to a fresh one (10 ml) containing 0.1 mM β-alanine, and the cells were left for another 24 h before removing the medium, washing with PBS, and collecting from each plate in 0.5 ml of 50 mM Hepes, pH 7.6, 10 mM KCl, 1 mM DTT, 1 mM EGTA, 1 mM MgCl₂, 5 μg/ml leupeptin, and 5 μg/ml antipain. The cells were lysed by freezing three times in liquid nitrogen, and insoluble material was removed by centrifugation (13,000 × g for 15 min). Part (0.2 ml) of the soluble extract was deproteinized by the addition of 0.6 ml of acetonitrile, followed by centrifugation (13,000 × g for 15 min) and filtration (Whatman 0.2 μm PVDF filter), and used to determine carnosine and anserine content by HPLC-hydrophilic interaction chromatography (HILIC) (see "Product Analysis"). Another part (0.3 ml) was used to measure protein concentration and for expression analysis of the recombinant proteins by SDS-PAGE and Western blotting (11).
For the purification of recombinant UPF0586 proteins, the supernatant of COS-7 lysate (7-10 ml) was diluted 4-fold with buffer A (50 mM Hepes, pH 7.5, 300 mM NaCl, 10 mM KCl, 20 mM imidazole, 1 mM MgCl₂, 5 μg/ml leupeptin and 5 μg/ml antipain) and applied on a HisTrap HP column (1 ml) equilibrated with the same buffer. The column was washed with 6 ml of buffer A, and the retained protein was eluted with a stepwise gradient of imidazole (7 ml of 30 mM, 7 ml of 60 mM, and 8 ml of
Protein content in purified recombinant enzyme preparations was quantitated by densitometric analysis using Quantity One (Bio-Rad). The yield of recombinant proteins ranged between 0.07 mg (yeast) and 0.2-0.3 mg (vertebrates) of homogeneous protein per 20 mg of soluble COS-7 cell protein. The purified enzymes were supplemented with 1 mg/ml BSA and stored at −70°C. When appropriate, the C-terminal His₆-tagged recombinant proteins were detected by Western blot analysis as described previously (11).
Rat UPF0586 protein was purified to homogeneity by affinity chromatography on nickel-Sepharose (HisTrap HP) as described under "Experimental Procedures." For the SDS-PAGE analysis, 7.5 μl of sample from each fraction was loaded onto a 10% gel and electrophoresed, and the resulting gel was then stained with silver (11). For the Western blot analysis, 7.5 μl of each fraction was loaded onto a 10% gel, electrophoresed, and blotted to nitrocellulose membrane, which was then sequentially probed with a mouse primary antibody against His₆ tag and a horseradish peroxidase-conjugated goat antimouse antibody. Secondary antibody was detected through autoradiography using chemiluminescence. M, prestained protein marker; L, cell-free lysate of COS-7 cells overexpressing the recombinant enzyme; AL, 4-fold diluted lysate applied on the column; FT, flow-through; W, wash. Fractions 30-300 were eluted with the indicated concentrations of imidazole.
FIGURE 2. SDS-PAGE analysis of purified recombinant UPF0586 proteins. Rat (rUPF0586), human (hUPF0586), chicken (chUPF0586), and yeast (yUPF0586) proteins were purified to homogeneity by affinity chromatography on nickel-Sepharose (HisTrap HP), as described under "Experimental Procedures." For the SDS-PAGE analysis, 10 μl of sample derived from each fraction was loaded onto a 10% gel and electrophoresed, and the resulting gel was then stained with silver (11).
Product Analysis-To obtain a sufficient amount of the methylated dipeptide formed in the reaction catalyzed by recombinant UPF0586 proteins for mass spectrometry analysis, the reaction mixture was scaled up. Briefly, 1 μg of yeast or 4 μg of vertebrate UPF0586 proteins were incubated for 12 h at 30°C in 0.15 ml of a reaction mixture containing 25 mM Hepes, pH 7.5, 80-100 μg of BSA derived from the enzyme preparation, 10 mM KCl, 1 mM EGTA, 1 mM MgCl₂, 1 mM DTT, and 4 mM carnosine in the absence or presence of 0.5 mM SAM. The reaction was stopped by the addition of 0.45 ml of acetonitrile. Precipitated protein was removed by centrifugation (13,000 × g for 10 min), and the clear supernatants were analyzed by HPLC-HILIC according to a slightly modified method of Mora et al. (23). Briefly, carnosine and anserine were separated by the gradient mode on Ascentis Express HILIC (2.1 × 100 mm, 2.7 μm) using Acquity UPLC (Waters). Mobile phases consisted of solvent A, containing 1 mM ammonium acetate, pH 5.5, in water/acetonitrile (25:75), and solvent B, containing 4.9 mM ammonium acetate, pH 5.5, in water/acetonitrile (70:30). The separation was performed in a linear gradient from 0 to 100% of solvent B for 3.5 min at a flow rate of 0.3 ml/min, followed by the column wash in 100% of solvent B for 2.6 min and equilibration for 10 min under the initial conditions. The column eluate was monitored by a UV detector at λ = 214 nm, followed by a mass spectrometer. All mass spectral analyses were performed on a Synapt G2 HDMS Q-TOF mass spectrometer fitted with an electrospray source (Waters). The detector worked in positive MS/MS mode. The electrospray ionization-MS source was set at a temperature of 100°C, capillary voltage of 3.5 kV, and cone voltage of 40 V. The flow rate of the nebulizer gas (nitrogen) was 700 liters/h. To confirm the structure of the anserine precursor ion, collision-induced dissociation experiments were run by selecting the target ion (m/z 241). The trap collision energy was 20 eV. Quantification was achieved using external standards of carnosine and anserine.
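The target ion at m/z 241 corresponds to singly protonated anserine. As a minimal sketch of where that number comes from, the following computes the monoisotopic m/z of the [M+H]⁺ ions of carnosine and anserine from their elemental compositions (C₉H₁₄N₄O₃ and C₁₀H₁₆N₄O₃). The function and constant names are illustrative and not part of the original methods; the masses are standard monoisotopic values.

```python
# Monoisotopic masses (u) of the most abundant isotopes (standard reference values).
MONOISOTOPIC_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON_MASS = 1.00727646  # mass of a proton (u)

# Elemental compositions of the neutral dipeptides.
CARNOSINE = {"C": 9, "H": 14, "N": 4, "O": 3}   # beta-alanyl-L-histidine
ANSERINE = {"C": 10, "H": 16, "N": 4, "O": 3}   # carnosine with one H replaced by CH3


def protonated_mz(composition):
    """Return the m/z of the singly charged [M+H]+ ion for an elemental composition."""
    neutral_mass = sum(MONOISOTOPIC_MASS[el] * n for el, n in composition.items())
    return neutral_mass + PROTON_MASS


if __name__ == "__main__":
    for name, comp in (("carnosine", CARNOSINE), ("anserine", ANSERINE)):
        print(f"{name}: [M+H]+ m/z = {protonated_mz(comp):.4f}")
```

Methylation adds one CH₂ unit (about 14.016 u), so the product ion is expected 14 m/z units above protonated carnosine, which is why selecting m/z 241 isolates the methylated dipeptide.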
Subcellular Localization of Recombinant UPF0586 Proteins in Transfected HeLa Cells-Expression plasmids for rat and human UPF0586s fused to a C-terminal EGFP tag were constructed by PCR amplification of the appropriate open reading frame, using Pfu DNA polymerase, the primers shown in Table 1, and either pEF6/rat UPF0586 or pEF6/human UPF0586 construct as a template. The amplified fragments were digested using EcoRI and SalI restriction enzymes, cloned into the pEGFP-N1 vector (Clontech), and verified by sequencing.
Subcellular localization of the fusion proteins was analyzed as described (24). Briefly, HeLa cells (European Collection of Cell Cultures) were cultured in DMEM supplemented with 10% fetal bovine serum, penicillin (100 units/ml), and streptomycin (100 μg/ml) at 37°C in a humidified atmosphere containing 5% CO₂. For transfection with the FuGENE HD reagent (Roche, Mannheim, Germany), 1.5 × 10⁵ cells were seeded on a 16-mm glass coverslip in a 35-mm plate, and the manufacturer's recommended protocol for transfection with plasmid DNA in the presence of serum was followed. Forty-eight hours after transfection, cells were washed with PBS and analyzed by confocal microscopy. For DNA staining, Hoechst 33342 was added at a concentration of 1 μg/ml for 20 min just before cells were washed and analyzed.
Images of cells were acquired sequentially on a Zeiss LSM700 confocal laser scanning microscope equipped with a Plan-Apochromat ×63/1.40 oil objective. For detecting green fluorescence of EGFP-tagged UPF0586 proteins, a 488-nm excitation line was used, and for detecting blue fluorescence of Hoechst 33342, a 405-nm excitation line was employed.
Quantitative Real-time PCR Assays-Human kidneys from deceased donors were obtained from Eurotransplant. These kidneys were unsuitable for transplantation due to technical reasons. Human brain tissue was obtained at autopsy, and normal muscle tissue was obtained from the non-affected part of tumor resection samples. All tissues were coded and handled anonymously in accordance with the Dutch National Ethics Guidelines (Code for Proper Secondary Use of Human Tissue, Dutch Federation of Medical Scientific Societies).
Total RNA was isolated from fresh rat and frozen human kidney cortex, leg muscle, and brain with TRIzol according to the manufacturer's instructions. RNA was converted to cDNA by using avian myeloblastosis virus reverse transcriptase (Roche) with random hexamer priming. Gene-specific primers (Table 1) were designed to generate PCR products from UPF0586 and glyceraldehyde-3-phosphate dehydrogenase (GAPDH). A SYBR Green quantitative polymerase chain reaction (Bio-Rad) was performed to quantify the relative UPF0586 mRNA levels. The expression of the enzyme was normalized to the expression of GAPDH using the 2^ΔCt method (25).
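The normalization step can be sketched as follows. This is a hedged illustration, not the authors' script: it assumes that ΔCt is defined as Ct(GAPDH) minus Ct(UPF0586), so that higher values mean more transcript, and that amplification efficiency is close to 100% (a doubling per cycle). The Ct values and tissue names in the usage example are hypothetical.

```python
def relative_expression(ct_target, ct_reference):
    """Target mRNA level relative to the reference gene, 2 ** (Ct_ref - Ct_target).

    Assumes roughly 100% PCR efficiency for both primer pairs.
    """
    return 2.0 ** (ct_reference - ct_target)


def percent_of_highest(levels):
    """Express each tissue's level as a percentage of the highest level observed."""
    top = max(levels.values())
    return {tissue: 100.0 * value / top for tissue, value in levels.items()}


if __name__ == "__main__":
    # Hypothetical Ct values: (Ct of UPF0586, Ct of GAPDH) per tissue.
    ct_values = {"tissue A": (24.0, 18.0), "tissue B": (26.0, 18.0)}
    levels = {t: relative_expression(*cts) for t, cts in ct_values.items()}
    print(percent_of_highest(levels))
```

Expressing each tissue as a percentage of the highest one mirrors the way relative mRNA levels are reported in the tissue-distribution results.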
Calculations-Vmax, Km, and kcat for the methyltransferase activity of studied enzymes were calculated with Prism version 4.0 (GraphPad Software) using a nonlinear regression.
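For readers without Prism, the kind of fit described here can be approximated in a few lines. The sketch below is an assumption-laden stand-in, not GraphPad's algorithm: it fits the Michaelis-Menten equation v = Vmax·S/(Km + S) by least squares, solving for Vmax analytically at each trial Km and locating Km with a golden-section search on a logarithmic scale. The helper `kcat_per_second` and its molecular-mass argument are illustrative names; substrate concentrations must be positive.

```python
import math


def michaelis_menten(s, vmax, km):
    """Initial velocity predicted by v = Vmax * S / (Km + S)."""
    return vmax * s / (km + s)


def _profile(km, s_values, v_values):
    """For a fixed Km, return the least-squares Vmax and the residual sum of squares."""
    x = [s / (km + s) for s in s_values]
    vmax = sum(v * xi for v, xi in zip(v_values, x)) / sum(xi * xi for xi in x)
    sse = sum((v - vmax * xi) ** 2 for v, xi in zip(v_values, x))
    return vmax, sse


def fit_michaelis_menten(s_values, v_values, iterations=200):
    """Return (Vmax, Km) minimizing the squared error; Km is searched on a log scale."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    lo = math.log10(min(s_values) / 1000.0)
    hi = math.log10(max(s_values) * 1000.0)
    for _ in range(iterations):
        c = hi - ratio * (hi - lo)
        d = lo + ratio * (hi - lo)
        sse_c = _profile(10.0 ** c, s_values, v_values)[1]
        sse_d = _profile(10.0 ** d, s_values, v_values)[1]
        if sse_c < sse_d:
            hi = d  # minimum lies in [lo, d]
        else:
            lo = c  # minimum lies in [c, hi]
    km = 10.0 ** ((lo + hi) / 2.0)
    return _profile(km, s_values, v_values)[0], km


def kcat_per_second(vmax_nmol_min_mg, molecular_mass_da):
    """Convert Vmax (nmol x min^-1 x mg^-1) to kcat (s^-1) for a given enzyme mass (Da)."""
    # 1 mg of enzyme contains 1e6 / M nmol of enzyme molecules.
    return vmax_nmol_min_mg * molecular_mass_da / 1e6 / 60.0
```

Given lists of substrate concentrations and measured initial velocities, `fit_michaelis_menten(s, v)` returns the estimates in the same units as the inputs. Dedicated packages use more general optimizers and also report confidence intervals, which this sketch does not.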

Results
Purification and Identification of Rat Carnosine N-Methyltransferase-Carnosine N-methyltransferase activity was assayed by measuring the incorporation of the [³H]methyl group of [³H]SAM into carnosine. The enzyme was purified ~2600-fold from rat leg muscle by a procedure involving chromatography on DEAE-Sepharose, Q-Sepharose, Superdex 200, and HiScreen blue Sepharose (Table 2). The methyltransferase was eluted as a single peak in each of the purification steps (Fig. 3), indicating the presence of a single enzyme species. The gel filtration step on Superdex 200 disclosed that the molecular mass of native enzyme in the peak fraction (F49) was equal to about 95,000 (not shown). The overall yield of the purification was only about 1.5% (see Table 2) due to the fact that only fractions exhibiting at least 70% activity of the most active one were used for the next step of the procedure. The only exceptions to this rule were Superdex 200 fractions loaded on the blue Sepharose column containing almost all activity recovered from the gel filtration step (see Fig. 3).
SDS-PAGE analysis of the peak activity fractions (F66-F80) derived from the HiScreen blue Sepharose revealed a low protein content, resulting in very faint protein bands, which prevented the identification of protein bands coeluting with the enzyme activity (not shown). Thus, both peak activity fractions F73-F75 and fraction F63, displaying a residual enzyme activity, were concentrated about 10-fold and reanalyzed by SDS-PAGE. Carnosine N-methyltransferase activity coeluted with three protein bands of about 45, 50, and 70 kDa present in fractions F73-F75 but not in the fraction F63 withdrawn from the last purification step (see Fig. 3). The bands were cut out from the gel and digested with trypsin, and the resulting peptides were analyzed by MS/MS and compared with the UniProt reference proteome of rat. The analysis indicated that only bands B and C contained the methyltransferase Smyd1 and a protein of unknown function designated UPF0586 protein C9orf41 homolog (Table 3), the methyltransferase identity of which could be inferred only from its similarity to bacterial methyltransferases (not shown). To exclude the possibility of missing any potential methyltransferase in protein bands that did not coelute with the enzyme activity, we also performed MS/MS analysis of all bands visualized by SDS-PAGE. No methyltransferase was identified in the remaining protein bands (not shown).
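The two summary figures quoted above, fold purification and overall yield, follow from the protein and activity totals recorded at each step. The sketch below shows the arithmetic under the usual definitions (fold = specific activity relative to the starting extract; yield = total activity recovered relative to the starting extract). The class and function names are illustrative, and any numbers passed in are the caller's own; the values of Table 2 are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class PurificationStep:
    name: str
    total_protein_mg: float
    total_activity: float  # e.g. nmol of product formed per min

    @property
    def specific_activity(self):
        """Activity per mg of protein."""
        return self.total_activity / self.total_protein_mg


def purification_table(steps):
    """Return (name, fold purification, yield in %) for each step, relative to the first."""
    first = steps[0]
    rows = []
    for step in steps:
        fold = step.specific_activity / first.specific_activity
        yield_percent = 100.0 * step.total_activity / first.total_activity
        rows.append((step.name, fold, yield_percent))
    return rows
```

A high fold purification combined with a low yield, as reported here, is the expected outcome when only the most active fractions of each column are carried forward.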
Although Smyd1 protein is a histone-lysine N-methyltransferase (26) and it was very unlikely that it would catalyze methylation of the L-histidine residue in carnosine, we cloned and overexpressed recombinant rat Smyd1 (GenBank™ accession number NM_001106595.1) in COS-7 cells. As expected, Smyd1 did not catalyze methylation of carnosine (not shown). Thus, UPF0586 protein was the only meaningful candidate for rat carnosine N-methyltransferase (see Table 3). UPF0586 was the only protein identified in band C, and 11 matching peptides (underlined in Fig. 4) were found to cover about 59% of its sequence. Expression profiles of mRNA available in the EMBL Expression Atlas database (27) revealed a predominant expression of UPF0586 in leg muscles compared with other tissues of both rats and mice, as expected for carnosine N-methyltransferase. Taken together, these data strongly suggest that the rat UPF0586 protein homologous to human C9orf41 is carnosine N-methyltransferase.

Sequences of primers used in PCR and quantitative RT-PCR experiments
The nucleotides corresponding to the coding sequences are in capital letters, the Kozak consensus sequence is shown in boldface type, and the restriction sites added are underlined.

Identification of Mammalian Carnosine N-Methyltransferase
JULY 10, 2015 • VOLUME 290 • NUMBER 28

Analysis of UPF0586 Protein Sequences-Protein BLAST searches (28) with rat UPF0586 and phylogenetic analysis of resulting sequences indicated that orthologs of this protein were found in all eukaryotes, including vertebrates (~90% identity with the rat sequence), insects (~50% identity), and plants and fungi (both ~30% identity) (Figs. 4 and 5). All taxa followed the expected lines of descent, indicating that the enzyme was already present in a common eukaryotic ancestor (see Fig. 5). Because a low similarity in amino acid sequence between eukaryotic UPF0586 orthologs and some bacterial proteins was also detected (~20% identity, Chitinophaga pinensis, GenBank™ accession number YP_003120969.1), it is likely that the eukaryotic enzyme was acquired from an ancestral prokaryote.
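The identity percentages quoted above come from pairwise alignments. As a minimal sketch of the underlying calculation, the function below counts identical residues over the columns of a pre-computed alignment in which neither sequence has a gap. That denominator is an assumption: BLAST reports identity over the full alignment length, gaps included, so its numbers can differ slightly. The function name and the toy sequences in the usage note are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences carry a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)
```

For example, `percent_identity("MKT-LV", "MRTALV")` compares five gap-free columns, four of which are identical.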
All identified proteins contain the N-2227 domain of unknown function at their C terminus, whereas an N-terminal sequence of 80 -180 amino acids could plausibly form an additional domain (see Fig. 4). Although the amino acid sequence identity between orthologs from distinct species, such as rat and yeast, was rather low (about 30%), the alignment of their sequences showed the presence of motifs that were strictly or at least moderately conserved in both putative N-terminal and N-2227 domains (see Fig. 4). Protein structures of UPF0586 orthologs have not been reported so far. However, we performed their prediction employing the I-TASSER server (29), which indicated that UPF0586 proteins were methyltransferases, and at least seven conserved amino acid residues within the N-2227 domain were involved in the binding of S-adenosyl-L-methionine (see Fig. 4). No function was identified for the N-terminal part of UPF0586 proteins.
Characterization of Recombinant UPF0586 Proteins-To confirm the molecular identity of rat carnosine N-methyltransferase with that of rat UPF0586 protein and to compare the enzymatic activities among orthologs of UPF0586 protein from evolutionarily distinct species, rat, human, chicken, and yeast (S. cerevisiae) UPF0586 proteins were expressed in both COS-7 and HEK-293T cells as fusion proteins with the C-terminal polyhistidine tag (Fig. 6). The recombinant enzymes catalyzed synthesis of anserine, as determined by the radiochemical assay (see Fig. 6). Large amounts of recombinant enzymes for further studies were produced in COS-7 and purified to homogeneity (see Figs. 1 and 2).
The identity of the methylated product formed from carnosine by homogeneous recombinant rat UPF0586 protein was verified by HPLC-HILIC MS/MS. As shown in Fig. 7, chromatographic analysis of the generated product revealed its comigration with a commercial standard of anserine. The addition of anserine to the reaction mixture resulted in a selective increase in the peak area of the product (from 11,627 to 57,122 units) without a noticeable disturbance in its peak symmetry, disclosing the product's identity as anserine (see Fig. 7). No anserine signal was detected in the control reaction with no SAM in the reaction mixture. Analysis of the product by electrospray mass spectrometry indicated the presence of a protonated molecular ion with m/z 241, as expected for anserine (not shown). As shown in Fig. 8, the tandem mass spectrometry analysis of this ion revealed a fragmentation pattern in agreement with the anserine structure, which was indeed identical with that of commercial anserine. The recorded fragmentation spectrum of anserine was also in perfect agreement with that available for this dipeptide (MassBank Record PR100392). Similar results of HPLC-HILIC MS/MS analysis were obtained for all recombinant UPF0586 orthologs tested (not shown).
Activities of the four homogeneous recombinant enzymes determined at various temperature and pH values with carnosine as substrate are presented in Fig. 9. They were maximal at pH between 7.0 and 7.5 (i.e. a typical value estimated for cytosolic enzymes) (30). Whereas the yeast enzyme showed an optimal activity at 30°C, which was in fact consistent with the optimal temperature for yeast growth, mammalian proteins were most active at 50°C, with ~1.3-fold (rat) and 2-fold (human) higher activities than at the physiological temperature of 37°C. The chicken enzyme exhibited an optimal activity at 40°C (i.e. within the range of physiological temperature of the chicken body), and it was constant up to 50°C (see Fig. 9). The ability of vertebrate enzymes to be catalytically active at relatively high temperatures suggests that structures of these proteins were subjected to significant changes when compared with the yeast enzyme. More interestingly, all of these enzymes were still able to produce anserine. In the subsequent experiments, carnosine N-methyltransferase activity was assayed in the presence of Hepes buffer, pH 7.5, at 30°C (yeast enzyme) or 40°C (vertebrate enzymes).
Substrate specificities of the four homogeneous recombinant enzymes are shown in Table 4. Several compounds structurally related to carnosine were tested as possible methyl group acceptors. A similar profile of specificity was found among all enzymes tested, with carnosine (β-Ala-His) as the best substrate, followed by Gly-Gly-His, Gly-His, and homocarnosine (GABA-His). The highest activity of yeast enzyme on carnosine, when compared with the vertebrate enzymes, was due to its lowest Km value for SAM (Table 5). Substitution of the non-standard amino acid by α-Ala resulted in a ∼5-10-fold decrease in the activity of all enzymes, whereas tripeptide Gly-Gly-His was a good methyl group acceptor. All peptides used as substrates by the enzymes contained C-terminal histidine, whereas methylation of free L-His was negligible. These findings suggest that C-terminal histidine of other peptides or proteins might be methylated, provided a correct amino acid sequence is present at their C terminus.
The kinetic properties of homogeneous recombinant UPF0586 proteins were studied in the presence of carnosine, which was the best substrate for all enzymes tested (see Table 4). Rat enzyme exhibited the highest affinity for carnosine, and all vertebrate enzymes displayed about 3-6-fold higher affinity for the dipeptide than the yeast one (see Table 5), presenting Km values much lower than the physiological concentration of carnosine in vertebrate skeletal muscles (1). Similar differences between vertebrate and yeast enzymes were observed for Vmax values estimated at saturating concentrations of SAM, indicating that vertebrate enzymes are clearly better at methylating carnosine than the yeast enzyme. This was also true when the reaction rate was determined in the presence of intracellular SAM concentration (∼30 μM in both yeast and vertebrate cells). Yeast enzyme operated at V0 ≈ Vmax (i.e. 15 nmol × min−1 × mg−1 protein) due to its very high affinity for SAM (Km = 2 μM), whereas rat, human, and chicken enzymes exhibited 12-25-fold higher Km for SAM than did the yeast enzyme and catalyzed the reaction at V0 equal to about 38, 26, and 25 nmol × min−1 × mg−1 protein, respectively (data not shown).
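The observation that the yeast enzyme operates near Vmax at intracellular SAM while the vertebrate enzymes do not can be illustrated with a minimal sketch. It assumes a simple hyperbolic (Michaelis-Menten) dependence of the rate on SAM with carnosine non-limiting, which is an assumption made here and not a rate law stated in the text. The function name is hypothetical, and the 12- and 25-fold bounds are applied directly to the yeast Km of 2 μM.

```python
def fraction_of_vmax(substrate_um: float, km_um: float) -> float:
    """Michaelis-Menten saturation: v / Vmax = [S] / (Km + [S])."""
    return substrate_um / (km_um + substrate_um)


SAM_UM = 30.0        # approximate intracellular SAM concentration quoted in the text
YEAST_KM_UM = 2.0    # Km for SAM of the yeast enzyme quoted in the text

yeast = fraction_of_vmax(SAM_UM, YEAST_KM_UM)

# Vertebrate enzymes: Km for SAM 12-25-fold higher than the yeast value
# (i.e. 24-50 uM under this assumption).
vertebrate_high = fraction_of_vmax(SAM_UM, 12 * YEAST_KM_UM)
vertebrate_low = fraction_of_vmax(SAM_UM, 25 * YEAST_KM_UM)

print(f"yeast enzyme: {yeast:.2f} of Vmax")
print(f"vertebrate enzymes: between {vertebrate_low:.3f} and {vertebrate_high:.3f} of Vmax")
```

Under these assumptions the yeast enzyme is about 94% saturated with SAM, whereas the vertebrate enzymes fall roughly between 40 and 55% of Vmax.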
Analysis of SAM reagent purity revealed the presence of 1% S-adenosyl-L-homocysteine (SAH). SAH contamination did not affect Km values obtained for the investigated recombinant enzymes, as judged by V0 values determined in the presence of rabbit SAH hydrolase and adenosine deaminase from Streptococcus thermophilus (data not shown).
Tissue Distribution of mRNA for Rat and Human UPF0586 Proteins-EMBL-EBI Expression Atlas searches with rat and human UPF0586 revealed that rat enzyme showed about 7-fold higher expression of its mRNA in skeletal muscle in comparison with all other tissues, whereas mRNA of human enzyme was present at a low level in all tissues with the exception of kidney and liver, where it doubled the basal level. A highly enriched expression of the mRNA for human enzyme was also present in kidney but not in liver, as shown by the Genevestigator database (31). To verify microarray results, we determined the level of UPF0586 mRNAs in both rat and human brain, leg muscle, and kidney cortex by quantitative real-time PCR using GAPDH transcript as a reference. As shown in Fig. 10, the expression of UPF0586 mRNA was different in human tissues from that in rat tissues. As for the human enzyme, the highest level of expression was observed in kidney. Brain and skeletal muscle showed about 25 and 9% of the level found in kidney, respectively. The rat enzyme was mainly expressed in skeletal muscle, followed by brain and kidney (about 38 and 3% of the muscle level, respectively). It is, however, worth noticing that the relative levels of UPF0586 mRNA observed in skeletal muscle may be underestimated due to a high expression of glycolytic GAPDH in muscle.

2 μg of protein). The indicated bands were cut out of the gel, submitted to trypsin digestion, and analyzed by tandem mass spectrometry.

TABLE 3 Proteins identified in the gel bands submitted to trypsin digestion and MS/MS analysis
Identified proteins are listed for each band according to their score calculated by ProteinLynx Global Server software (PLGS). For each protein, its molecular weight (Mr) and the sequence coverage are also indicated. Occasional peptide hits corresponding to keratins have not been included.

Amino acid sequence alignment of rat UPF0586 protein with its yeast, chicken, and human orthologues. Sequences were obtained with the following GenBank™ accession numbers: yeast (S. cerevisiae, NP_014307.1), chicken (G. gallus, XP_003643080.1), rat (R. norvegicus, NP_001020145.1), and human (H. sapiens, NP_689633.1). All shown sequences have been confirmed by PCR amplification of the cDNA and sequencing. The percentage amino acid identities with rat UPF0586 protein are given at the top right. The highly conserved domain of unknown function (N-2227) is labeled above the alignment, whereas amino acid residues interacting with SAM are indicated by asterisks, as predicted by I-TASSER (29). The peptides identified by mass spectrometry in the protein purified from rat leg muscle are underlined in the rat sequence. The level of residue conservation is indicated as 100% (black background), 70% and more (dark gray background), and 50% and more (light gray background).
Although our results are in a fairly good agreement with reported data showing that (i) activity of rat carnosine N-methyltransferase was associated with skeletal muscle (32) and (ii) no anserine was detected in human muscle tissue (3), the unexpectedly high expression of UPF0586 mRNA in human kidney suggests that in some species (e.g. human beings), the enzyme may be active in cells or tissues that have not been considered as a source of anserine so far.
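The GAPDH normalization behind the relative mRNA levels discussed above can be sketched as follows. This is a minimal illustration, assuming that ΔCt is defined as Ct(GAPDH) − Ct(UPF0586), so that a larger value means higher expression; the function names and all Ct values are hypothetical and are not data from this study.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Target level relative to the reference transcript: 2 ** (Ct_ref - Ct_target)."""
    delta_ct = ct_reference - ct_target
    return 2.0 ** delta_ct


def percent_of_highest(levels: dict[str, float]) -> dict[str, float]:
    """Express each tissue as a percentage of the tissue with the highest level."""
    top = max(levels.values())
    return {tissue: 100.0 * value / top for tissue, value in levels.items()}


# Hypothetical (Ct UPF0586, Ct GAPDH) pairs, for illustration only.
example_ct = {
    "kidney": (24.0, 18.0),
    "brain": (26.0, 18.0),
    "muscle": (26.0, 16.0),  # same target Ct as brain, but more GAPDH (lower Ct)
}

levels = {
    tissue: relative_expression(ct_target, ct_ref)
    for tissue, (ct_target, ct_ref) in example_ct.items()
}
print(percent_of_highest(levels))
```

In this toy data set brain and muscle have the same UPF0586 Ct, yet muscle scores lower after normalization because its GAPDH Ct is lower. This is the kind of underestimation that a high GAPDH expression in muscle could cause.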
Subcellular Localization of Recombinant UPF0586 Proteins-Analysis with WoLF PSORT (33) and CELLO (34) predicted cytoplasmic and/or nuclear localization of vertebrate UPF0586 orthologs, whereas yeast enzyme was suggested to interact with the cytoskeleton or the plasma membrane of fungal cells. To verify subcellular localization of mammalian enzymes, confocal microscopy was used to analyze HeLa cells transfected with C-terminal EGFP-tagged rat or human UPF0586 proteins. The direct fluorescence microscopic observations indicated that these two mammalian UPF0586 proteins were present in both the cytoplasm and nucleus of HeLa cells, exhibiting a very similar diffuse pattern that excluded their localization in the nucleoli (Fig. 11). It was previously shown that both cytosolic MEK1 (mitogen-activated protein kinase kinase 1) and nuclear ER (enhancer of rudimentary) protein fused to a C-terminal EGFP correctly localized in HeLa cells, confirming the validity of the employed reporting system (35). Both nuclear and cytoplasmic localization of human UPF0586 was also reported in normal human tissues (36), confirming our results.
Overexpression of Carnosine Synthase and UPF0586 Proteins in COS-7 Cells-To investigate anserine-producing activity of rat and human UPF0586 proteins in more physiological conditions, we transfected COS-7 cells with plasmids driving the expression of human carnosine synthase (12) with or without either rat or human UPF0586 protein. After 24 h of transfection, cell culture media were replaced with fresh media supplemented with 0.1 mM β-alanine, and the cells were further incubated for 24 h. Next, acetonitrile extracts of cells were prepared, and the concentrations of both carnosine and anserine were determined in deproteinized samples by HPLC-HILIC.
FIGURE 5. Phylogenetic tree of UPF0586 proteins. Protein sequences were aligned using Muscle (16), whereas the phylogenetic tree was inferred with the use of phylogenetic estimation using maximum likelihood (18). Branch support values assessed using the approximate likelihood ratio test are indicated (21). The protein sequences that were used for the analysis and corresponding GenBank™ accession numbers are as follows:

As shown in Fig. 12, COS-7 cells transfected with human carnosine synthase synthesized carnosine, and the intracellular concentration of the dipeptide was about 11.8 ± 1.8 mM (n = 6) based on a water/protein ratio for COS-7 cells equal to 6.8 μl/mg, as determined by three independent measurements. The co-expression of carnosine synthase with rat or human UPF0586 protein triggered anserine production in the COS-7 cells that led to its accumulation up to about 2.2 ± 0.3 and 3.7 ± 0.4 mM (for human and rat UPF0586s, respectively, n = 6) accompanied by a decrease in the intracellular content of carnosine (by ∼30%), probably due to its conversion into anserine. Intriguingly, anserine concentration in COS-7 cells expressing human UPF0586 protein was 40% lower than in cells transfected with rat enzyme, which seems not to fully reflect a clearly higher difference in the expression of these two enzymes. This finding suggests that factors other than the level of UPF0586 expression may affect the production of anserine (e.g. a limited availability of SAM).
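The conversion from dipeptide content per milligram of protein to an intracellular concentration, and the comparison of the two anserine levels, can be sketched as below. The function names are hypothetical, and the input of 80.24 nmol/mg protein is back-calculated here from the reported 11.8 mM and 6.8 μl/mg; it is not a value reported in the study.

```python
def intracellular_mm(nmol_per_mg_protein: float, water_ul_per_mg: float = 6.8) -> float:
    """nmol per mg protein divided by ul water per mg protein gives nmol/ul, i.e. mM."""
    return nmol_per_mg_protein / water_ul_per_mg


def percent_lower(value: float, reference: float) -> float:
    """How much lower `value` is than `reference`, in percent of `reference`."""
    return 100.0 * (1.0 - value / reference)


# Back-calculated carnosine content (assumption, see lead-in): 11.8 mM x 6.8 ul/mg.
carnosine_mm = intracellular_mm(80.24)

# The two reported anserine concentrations (mM): the lower one versus the higher one.
anserine_gap = percent_lower(2.2, 3.7)

print(f"carnosine: {carnosine_mm:.1f} mM")
print(f"lower anserine level is {anserine_gap:.1f}% below the higher one")
```

The second figure comes out at roughly 40%, in line with the difference quoted in the text.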
Most importantly, both the concentrations of the dipeptides and anserine/carnosine ratio determined in the transfected COS-7 are in fairly good agreement with the corresponding data reported for rat leg muscle (6), indicating that UPF0586

Discussion
Molecular Identity and Biochemical Properties of Rat Carnosine N-Methyltransferase-Recently, we have reported the identification of carnosine N-methyltransferase as chicken HNMT-like protein. To our surprise, the HNMT-like gene is absent in available mammalian genomes (11), although the activity of carnosine N-methyltransferase has been shown in most mammalian species (1). These observations led us to hypothesize that mammals possessed an anserine-forming enzyme completely different from the HNMT-like one (11). In the present study, we report the molecular identification of rat carnosine N-methyltransferase as UPF0586 protein C9orf41 homolog, disclosing the identity of the mammalian enzyme. This conclusion results from the following findings: (i) rat leg muscle is a rich source of mammalian carnosine N-methyltransferase (32) and UPF0586 protein is the only logical candidate for the enzyme that was identified in the most highly purified preparation from the rat tissue; (ii) the recombinant rat UPF0586 protein catalyzes the transfer of the methyl group from SAM onto carnosine, yielding anserine; (iii) the identity of the product made by the recombinant enzyme was confirmed by both hydrophilic interaction chromatography and tandem mass spectrometry; and (iv) the overexpression of carnosine synthase together with UPF0586 protein in COS-7 cells transformed them into efficient anserine producers.
The fact that COS-7 cells overexpressing carnosine synthase and rat UPF0586 protein efficiently produce anserine is an argument supporting the identification of rat carnosine N-methyltransferase. However, the production of anserine by transfected COS-7 cells was observed under conditions of strong overexpression of UPF0586 protein, raising the question of the physiological relevance of those data. Taking into account that the half-lives of both carnosine and anserine are ∼3 weeks in the rat (37), there is no need for a rapid process, and much lower expression of the enzymes may be sufficient to reach and maintain physiological concentration of the dipeptides.
To the best of our knowledge, there are no published studies on the substrate specificity and kinetic properties of rat carnosine N-methyltransferase. In our hands, the recombinant rat UPF0586 acted best on carnosine and homocarnosine, whereas α-alanine-containing dipeptide was a poor substrate, confirming identification of the enzyme as carnosine N-methyltransferase. More interestingly, the enzyme also showed a high activity toward carnosine analogues containing glycine or glycine-glycine instead of β-alanine, suggesting that there might be other endogenous substrate(s) for it. Considering that the intracellular concentrations of SAM and carnosine in rat leg muscle are about 30 μM and 9 mM, respectively (1,38), the rat UPF0586 showed rather low affinity for SAM (Km = 42 μM) and a high affinity for carnosine (Km = 3.3 mM). These observations indicate that the enzyme works at a rate of about 40% of the Vmax in rat muscle, and the SAM concentration rather than carnosine availability affects its activity in vivo.
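The reasoning that SAM, rather than carnosine, limits the rat enzyme in muscle can be made explicit with a short sketch. It assumes that each substrate can be treated independently with a hyperbolic saturation term, which is a simplification introduced here; the actual bisubstrate mechanism is not described in the text. Function names are hypothetical, and the concentrations and Km values are those quoted above.

```python
def saturation(conc: float, km: float) -> float:
    """Fractional saturation [S] / (Km + [S]); conc and km share the same unit."""
    return conc / (km + conc)


def gain_on_doubling(conc: float, km: float) -> float:
    """Fold increase in rate if the substrate concentration were doubled."""
    return saturation(2.0 * conc, km) / saturation(conc, km)


SAM_UM, KM_SAM_UM = 30.0, 42.0            # uM, rat muscle and rat UPF0586
CARNOSINE_MM, KM_CARNOSINE_MM = 9.0, 3.3  # mM, rat muscle and rat UPF0586

sam_sat = saturation(SAM_UM, KM_SAM_UM)
carnosine_sat = saturation(CARNOSINE_MM, KM_CARNOSINE_MM)
sam_gain = gain_on_doubling(SAM_UM, KM_SAM_UM)
carnosine_gain = gain_on_doubling(CARNOSINE_MM, KM_CARNOSINE_MM)

print(f"saturation with SAM: {sam_sat:.2f}, with carnosine: {carnosine_sat:.2f}")
print(f"rate gain on doubling SAM: {sam_gain:.2f}x, carnosine: {carnosine_gain:.2f}x")
```

With these numbers the SAM term is about 0.42, matching the roughly 40% of Vmax quoted above, and doubling SAM would raise the rate more than doubling carnosine would, which is consistent with SAM being the limiting factor under this simplified model.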
It is intriguing that both rat and human enzymes exhibit optimal activities at an unexpectedly high temperature (50°C). This phenomenon may simply be a consequence of their protein structures without a clear physiological relevance, as shown for other enzymes (e.g. plant aspartate aminotransferases (39) or human L-isoaspartyl methyltransferase (40)). On the other hand, the high optimum temperature for the enzymes may be beneficial from a physiological standpoint. As the temperature of working muscle rises during even short-term moderate exercise (41), a higher activity of carnosine N-methyltransferase could result in an increase in the rate of anserine formation, which in turn might raise the buffering capacity of working muscle.
Other Roles of UPF0586 Protein-Besides the identification of rat carnosine N-methyltransferase, the major finding reported in this work is the demonstration that UPF0586 orthologs from evolutionarily distinct species (i.e. yeast, chicken, and human) are all anserine-producing methyltransferases. To the best of our knowledge, this is the first experimental evidence that N-2227 domain-containing enzymes do indeed catalyze the SAM-dependent transfer of a methyl group, and the N-2227 domain is most likely involved in SAM binding. Furthermore, UPF0586 is a widespread protein in eukaryotes, and its orthologs are also found in organisms that are unable to produce carnosine due to the lack of carnosine synthase (plants, fungi including yeast, and several fishes). This indicates that the enzyme may have another function related to its ability to N-methylate histidine residues in peptides. This notion is further supported by our findings that (i) all tested UPF0586 orthologs are very sluggish enzymes, with a kcat of 1-5 min−1, resembling that estimated for protein methyltransferases rather than that measured for the enzymes catalyzing methylation of low molecular weight metabolites (38,42), and (ii) anserine-producing UPF0586 protein is present in chicken as well as other avian and reptilian species that express, mainly in their muscles, HNMT-like protein, a carnosine N-methyltransferase that is about 22-fold more efficacious than UPF0586 protein (kcat/Km = 2224 and 100 min−1 mM−1, respectively) (11). Finally, a gene encoding a protein similar to UPF0586 is also present in genomes of some prokaryotic species (e.g. Haliangium ochraceum, GenBank™ accession number YP_003265816.1), where it is located together with a non-ribosomal peptide synthase in a single gene cluster, plausibly an operon. Thus, this prokaryotic methyltransferase may catalyze methylation of bacterial non-ribosomal peptides. We therefore hypothesize that UPF0586 protein may also serve as a peptide or protein histidine-methylating enzyme in eukaryotes. In fact, several eukaryotic proteins have been shown to contain methylhistidine residues, although yeast YIL110W enzyme responsible for the modification of Rpl3 protein has been the only identified protein histidine methyltransferase to date (43). Based on our results, a cytosolic and/or nuclear protein might be a plausible substrate that would be methylated at its C-terminal histidine. The presence of Gly or Gly-Gly upstream of histidine would facilitate its methylation, whereas α-Ala could be expected to inhibit the reaction. No such modification of protein C terminus has been reported so far, although there are several proteins in vertebrates that could be potential substrates (e.g. caspase-1 (GenBank™ accession number P43527) or inositol monophosphatase 3 (GenBank™ accession number D4AD37)). A plausible function of such post-translational modification might be to protect proteins from attack by carboxypeptidases. Interestingly, methylation of the N terminus has also been observed for a variety of proteins in both prokaryotes and eukaryotes (44), and protein N-terminal methyltransferases have been recently identified in yeast and humans (45). However, a clear physiological role of such protein modification remains unknown.

H]Methylation of both L-histidine and various L-histidine-containing peptides catalyzed by UPF0586 proteins of various species
The reaction of methyl group transfer from SAM onto the indicated methyl group acceptors was determined with the use of homogeneous recombinant rat, human, chicken, or yeast UPF0586 proteins. Enzyme preparations (0.20-0.72 μg of protein) were incubated for 10 min at 30°C (yeast) or 40°C (vertebrates) in the presence of 1 μM [1H +

Relative mRNA levels for rat and human UPF0586 protein as determined in brain, kidney, and muscle tissues by quantitative real time PCR. Cycle threshold (Ct) values were determined for rat and human UPF0586 and GAPDH. The expression of a target UPF0586 gene was normalized to the expression of the related GAPDH using the 2^ΔCt method. Values are either the means ± S.E. (error bars) of three to four independent samples (rat tissues and human kidney) or the means of two independent samples ± range (error bars) (human brain and muscle).

Evolution of Rat Carnosine N-Methyltransferase-Our present hypothesis that the function of UPF0586 in eukaryotes may be different from carnosine methylation and related to its ability to methylate peptides and/or proteins raises the question of how the enzyme gained carnosine N-methyltransferase activity in rats and other mammals. Because the enzyme is slow, to be an effective anserine producer, it would have to be expressed at a high level in the major carnosine-containing tissue (i.e. muscle). This condition is indeed fulfilled in rodents, as indicated by tissue expression profiles of the UPF0586 mRNA in rats and mice but not in chicken, which in fact exhibits another form of carnosine N-methyltransferase (HNMT-like). Our findings therefore suggest that the acquisition of carnosine N-methyltransferase activity by UPF0586 protein may simply result from both its promiscuous activity (i.e. broad substrate specificity) and a large increase in the level of its expression in muscle tissue. In fact, this conclusion is in perfect agreement with both experimental data and theoretical analyses showing that the acquisition of a new enzyme activity frequently results from the utilization of its promiscuous activity and an up-regulation of its expression, which appears to be a key step in the divergence of a new function (46) (for a review, see Ref. 47).
Human UPF0586 Protein-Human beings are believed to be the only mammalian species that has no methylated carnosine in its tissues (1). This rare exception could be easily explained by a finding that the gene encoding mammalian carnosine N-methyltransferase is absent or is mutated in Homo sapiens. Alternatively, the absence of anserine in humans may be more apparent than real. In fact, our findings suggest that humans possess carnosine N-methyltransferase activity, and the enzyme may be considerably more active in kidney cortex than in skeletal muscle, leading to the accumulation of anserine in renal tissue. Because virtually all searches for anserine in humans were limited to the skeletal muscle (1,3), the presence of anserine in kidney as well as in other tissues, which have not been considered as a source of anserine so far, might be easily overlooked. This hypothesis is supported by findings that storage of both carnosine and anserine are clearly detectable in the kidney cortex of human and mouse renal tissue by immunohistochemistry and by HPLC analysis (48), 3 although the physiological role of these dipeptides in the kidney is unclear.
Recently, more than a dozen patients have been reported with novel chromosome microdeletions at 9q21.13, presenting mental retardation, speech delay, and epilepsy (49,50). The smallest region of deletion contained six genes, including that encoding human UPF0586 (also known as C9orf41). Although the authors suggested the RORB gene as a candidate for a neurological phenotype (49), the effect of UPF0586 gene loss remains undetermined. Intriguingly, almost half of the patients have suffered from urinary tract disorders (enuresis, pyelonephritis, and vesico-ureteral reflux) of unknown origin (49,50), suggesting that some of the deleted genes may be important for renal physiology. We therefore hypothesize that these renal disorders may be related to the loss of UPF0586 protein, and it would be of interest to determine more systematically the renal functions in patients with deletion of the UPF0586 gene, which could be helpful to identify the physiological role of the enzyme.
Conclusions-In the current investigation, we have identified rat carnosine N-methyltransferase, which catalyzes the last step in the anserine biosynthesis pathway. The UPF0586 protein is a novel form of vertebrate carnosine N-methyltransferase that is unrelated to the reptilian and avian enzyme (HNMT-like) and seems to be typical of mammalian species. Because anserine-producing orthologs of UPF0586 are found in organisms that are unable to produce carnosine, we hypothesize that the enzyme might also catalyze methylation of histidine residues in some eukaryotic peptides or proteins.
Author Contributions-J. D. conceived and designed the study, wrote the paper, and partially performed and analyzed the experiments shown in Tables 1-5