Arylsulfatase from Klebsiella pneumoniae Carries a Formylglycine Generated from a Serine*

Eukaryotic sulfatases share an unusual posttranslational protein modification, which converts a cysteine into α-formylglycine. The α-formylglycine is essential for the catalytic activity. Klebsiella pneumoniae expresses an inducible arylsulfatase for which the DNA predicts a serine at the position occupied by the α-formylglycine residue in eukaryotic sulfatases. Structural analysis showed that the majority of the arylsulfatase polypeptides from K. pneumoniae carries the α-formylglycine, whereas the remaining arylsulfatase polypeptides contain the predicted serine residue. This demonstrates the evolutionary conservation between prokaryotes and eukaryotes of this novel protein modification that so far has been found only in sulfatases. α-Formylglycine in Klebsiella is generated from a serine and not from a cysteine as in eukaryotes.
Eukaryotic sulfatases share a cysteine residue, which is posttranslationally converted into α-formylglycine (FGly; 2-amino-3-oxopropanoic acid). This novel posttranslational modification occurs in the endoplasmic reticulum and is directed by a linear sequence surrounding the cysteine to be modified (1-3). The FGly residue is critical for the catalytic activity of sulfatases. Crystallographic analysis of two sulfatases has shown that the FGly residue is part of the catalytic site (4, 5). The aldehyde is likely to be hydrated and to serve as an acceptor for sulfate during catalysis (5).
In prokaryotes, expression of sulfatases is generally controlled by the sulfur or the carbon content of their environment. Under appropriate conditions, such as the absence of SO₄²⁻ and the presence of alkyl sulfates, the sulfatases are expressed in the periplasmic space (for review see Ref. 7). The sequences of five prokaryotic sulfatases have been reported (8-12), and they share sequence homology with eukaryotic sulfatases to a similar extent as the members of the eukaryotic sulfatase family among each other. Surprisingly, in the genes encoding the sulfatases from Klebsiella pneumoniae (8) and from Escherichia coli (9), a serine residue is predicted at a position where all other known sulfatase DNAs predict a cysteine that is known to be converted into FGly in eukaryotes. To examine whether the FGly residue is found also in prokaryotic sulfatases and whether it can be generated also from a serine residue, we purified arylsulfatase from K. pneumoniae and examined the protein for the presence of a FGly residue.
Sulfatase Production and Purification-Bacterial culture and purification of arylsulfatase were performed as described by Okamura et al. (13) with some modifications: K. pneumoniae DSM 681 from blood agar plates was grown overnight in aliquots of 0.4 liters of medium containing methionine as sulfur source. At an A₆₀₀ of 1.5-2.5, cells were sedimented, washed, and disrupted for 20 min in aliquots of 30 ml using a Sonicator W 220-F (Heat Systems-Ultrasonics, Inc.). For analysis of arylsulfatase activity, 200 µl of sample were incubated with 200 µl of 20 mM p-nitrocatecholsulfate in 10 mM Tris/HCl, 150 mM NaCl, pH 7.4, for 10 min at 37°C. After addition of 1 ml of 1 M NaOH, absorbance at 515 nm was measured. The initial ammonium sulfate precipitation (13) was omitted, and Sephadex G-100 was replaced by Superdex 75 (120-ml volume, Pharmacia Biotech Inc.) equilibrated with 20 mM Tris/HCl, 150 mM NaCl, pH 7.2, on a fast protein liquid chromatograph. Arylsulfatase activity eluted after 66 ml. After concentration in an ultra thimble and dialysis against 20 mM Tris/HCl, pH 7.4, arylsulfatase was loaded on a MonoQ fast protein liquid chromatography column replacing the DEAE-Sephadex A-25 column (13). By this procedure arylsulfatase was purified about 224-fold to a specific activity of 123 units/mg protein, yielding about 1 mg of enzyme from 30 liters of culture with a recovery of 7%. For further analysis the sulfatase was desalted by RP-HPLC on a SMART system (Pharmacia) using an Aquapore Butyl 7 micron (220 × 2.1 mm) column (Applied Biosystems) equilibrated with 0.1% trifluoroacetic acid/H₂O. Arylsulfatase A was eluted by an acetonitrile gradient from 0 to 90% in 36 min.
Reduction with [³H]NaBH₄-30 µg of arylsulfatase were lyophilized and solubilized in 4 M guanidinium hydrochloride, 25 mM Tris/HCl, 10 mM EDTA, pH 9. Reduction with [³H]NaBH₄, desalting, tryptic digestion, and purification of the peptides on RP-HPLC were performed as described, omitting reductive carboxymethylation with dithiothreitol and iodoacetic acid (1, 3). Before digestion with trypsin an aliquot of the ³H-labeled arylsulfatase was analyzed by SDS-PAGE using high Tris gels containing 10% acrylamide, 0.13% bis-acrylamide (14), followed by Coomassie Blue staining and phosphoimaging. Fractions from RP-HPLC were analyzed by liquid scintillation counting, mass spectrometry, amino acid sequencing, and radiosequencing (1, 3).
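The purification figures quoted above (about 224-fold, 123 units/mg, about 1 mg of enzyme, 7% recovery) can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes the usual definitions (fold purification as the ratio of final to starting specific activity, recovery as the ratio of final to starting total activity) and treats the rounded values in the text as exact. The function name and the back-calculated starting values are not taken from the original work.

```python
def back_calculate_start(fold_purification, final_specific_activity,
                         final_protein_mg, recovery):
    """Infer starting-material values from end-point purification figures.

    Assumes fold = final SA / starting SA and
    recovery = final total units / starting total units.
    """
    final_units = final_specific_activity * final_protein_mg
    start_units = final_units / recovery
    start_specific_activity = final_specific_activity / fold_purification
    start_protein_mg = start_units / start_specific_activity
    return {
        "final_units": final_units,
        "start_units": start_units,
        "start_specific_activity": start_specific_activity,
        "start_protein_mg": start_protein_mg,
    }


# Values as quoted in the text; all of them are approximate ("about").
summary = back_calculate_start(
    fold_purification=224,
    final_specific_activity=123.0,  # units/mg protein
    final_protein_mg=1.0,
    recovery=0.07,
)

for name, value in summary.items():
    print(f"{name}: {value:.4g}")
```

Under these assumptions the starting extract from 30 liters of culture would have contained roughly 3.2 g of protein at about 0.55 units/mg. These are estimates implied by the rounded figures, not measured values.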
Proteolytic In-gel Digestion-180 µg of arylsulfatase were lyophilized and subjected to SDS-PAGE (see above). After staining with Coomassie Blue (0.195% Coomassie R-250, 0.005% Coomassie G-250, 0.5% acetic acid, 20% methanol) for 20 min and destaining with 30% methanol, the gel slice containing arylsulfatase was excised and cut into small pieces. The gel pieces were washed twice with 0.5 ml of 50% acetonitrile, 50% 100 mM ammonium carbonate (buffer A) and air-dried. 3.6 µg of trypsin in 100 µl of buffer A were added. After 30 min of incubation at room temperature, buffer A was added to cover the gel pieces. Following incubation for 16 h at 37°C, peptides were extracted with 0.2 ml of 50% acetonitrile, 50% trifluoroacetic acid, 0.2 ml of 50% acetonitrile/H₂O, and 0.2 ml of 75% acetonitrile/H₂O, each at 60°C for 30 min. The supernatants were pooled, concentrated to 20 µl, made up to 100 µl with 0.1% trifluoroacetic acid/H₂O, and subjected to RP-HPLC (see above). Fractions containing nonmodified and modified peptide 2, as identified by mass spectrometry (MALDI III, Shimadzu), were pooled, concentrated to 50% volume, and applied to a peak C2/C18 column (Pharmacia) equilibrated with 0.1% trifluoroacetic acid/H₂O. The peptides
* This work was supported by the Deutsche Forschungsgemeinschaft and the Fonds der Chemischen Industrie. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

RESULTS AND DISCUSSION
K. pneumoniae was grown in the presence of 1 mM methionine to induce the expression of arylsulfatase (8). The arylsulfatase was purified from the periplasmic fraction to a specific activity of 123 units/mg. The final preparation contained one major polypeptide (Fig. 1, lane 1), which was identified as arylsulfatase by protein sequencing. Among the contaminating proteins, which became visible upon overloading the gel, the xylose binding protein (15) represented the major species accounting for up to 5% of total protein (data not shown).
In the arylsulfatase from K. pneumoniae, serine 72 is equivalent to the cysteine residue that in eukaryotic sulfatases is converted into FGly. To examine whether serine 72 is converted to FGly, the arylsulfatase was incubated with NaB[³H]H₄ in the presence of 4 M guanidinium chloride. This would reduce the aldehyde group of FGly and generate a [³H]serine residue (1, 3). An aliquot of the reduced arylsulfatase was analyzed by SDS-PAGE followed by Coomassie Blue staining and phosphoimaging. Phosphoimaging showed that the arylsulfatase had incorporated ³H radioactivity (Fig. 1, lane 2). The [³H]arylsulfatase was digested with trypsin, and the tryptic peptides were separated by RP-HPLC (Fig. 2A). The fractions were analyzed for radioactivity and by mass spectrometry. The peak of ³H radioactivity eluted with a retention time of 26.7 min and was collected in fraction 36. Mass spectrometry of fraction 36 yielded a mass of 1590 Da (Fig. 2B), which corresponds to the mass calculated for the tryptic peptide 2 comprising residues 63-76. Mass spectrometry of other fractions identified seven tryptic peptides representing 126 of the 444 amino acids of arylsulfatase from K. pneumoniae (8).
Radiosequencing of the HPLC fraction 36 showed that the radioactivity was released in the tenth cycle (Fig. 2C). In peptide 2 this position corresponds to residue 72, for which the DNA sequence predicts a serine. The association of the radioactivity with residue 72 strongly suggests that serine 72 in newly translated arylsulfatase of Klebsiella had been oxidized to FGly, which by treatment with NaB[³H]H₄ had been reduced to [³H]serine.
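The assignment of the radioactive cycle to residue 72 is a simple index calculation: Edman degradation releases one residue per cycle, starting from the N terminus of the peptide. The following minimal sketch, with hypothetical helper names, reproduces that mapping for peptide 2 (residues 63-76) and also restates the sequence coverage implied by the 126 of 444 residues identified by mass spectrometry.

```python
def residue_for_cycle(peptide_start, cycle):
    """Map a 1-based Edman cycle to the residue number in the full protein."""
    if cycle < 1:
        raise ValueError("Edman cycles are counted from 1")
    return peptide_start + cycle - 1


def sequence_coverage(residues_identified, protein_length):
    """Fraction of the protein sequence represented by identified peptides."""
    return residues_identified / protein_length


# Tryptic peptide 2 spans residues 63-76 of the Klebsiella arylsulfatase.
PEPTIDE_2 = (63, 76)

peptide_length = PEPTIDE_2[1] - PEPTIDE_2[0] + 1
labeled_residue = residue_for_cycle(PEPTIDE_2[0], cycle=10)
coverage = sequence_coverage(126, 444)

print(f"peptide 2 length: {peptide_length} residues")
print(f"cycle 10 releases residue {labeled_residue}")
print(f"sequence coverage by mass spectrometry: {coverage:.1%}")
```

The tenth cycle of a peptide starting at residue 63 corresponds to residue 72, in agreement with the position for which the DNA predicts a serine.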
When sulfatases are expressed at high level in mammalian cells, a fraction of the cysteine can escape conversion to FGly, presumably due to saturation of the oxidizing machinery (1). Because the Klebsiella arylsulfatase had been reduced prior to tryptic digestion, it was not possible to determine whether a fraction of the arylsulfatase had retained serine 72 predicted by the DNA. Therefore the experiment was repeated with the omission of the reduction of the arylsulfatase. In addition the tryptic digestion was performed after separation by SDS-PAGE to remove the remaining contaminants. The tryptic peptides were separated by RP-HPLC. Fractions containing masses predicted for the nonmodified serine 72-containing form of peptide 2 (1590 Da) or for the modified FGly 72-containing form of peptide 2 (1588 Da, designated as peptide 2*) were identified by mass spectrometry and subjected to rechromatography (Fig. 3, upper panel). A ³H-labeled standard of peptide 2 (obtained as described above) eluted in fractions 31-34 as indicated in Fig. 3 (upper panel) by a horizontal bar. The material in these fractions had a mass of 1589-1590 Da, and sequencing of the pooled fractions 31-34 yielded the full-length sequence of peptide 2 (Fig. 3, lower panel).
Fig. 2. A, the [³H]arylsulfatase (see Fig. 1) was digested with trypsin, and its tryptic peptides were separated by RP-HPLC. The acetonitrile gradient (0-90%) used for elution is indicated. The absorbance and the radioactivity (shaded area) are shown. Fraction 36 containing the peak of radioactivity associated with tryptic peptide 2 is indicated. B, mass spectrometry of fraction 36. The material was embedded in indole-2-carboxylic acid. C, fraction 36 was subjected to amino acid sequencing. The radioactivity released in each cycle is given. The sequence indicated on the abscissa corresponds to that of tryptic peptide 2 (residues 63-76) of Klebsiella arylsulfatase.
The peptide(s) in fractions 26-30 had a mass of 1587-1589 Da. To determine whether this material corresponds to the modified peptide 2*, the mass was additionally determined after embedding the material in a p-nitroaniline matrix. If a peptide contains an aldehyde group, a Schiff's base is formed with p-nitroaniline, and the mass of the peptide increases by 120 Da (1, 3). The peptide in fractions 26-30 formed a Schiff's base of 1707-1708 Da (not shown). Fractions 26-30 were pooled and subjected to amino acid sequencing. The first eight amino acids of peptide 2 were clearly detectable. The recovery of the ninth amino acid, a methionine, was markedly reduced, and no signal was obtained for the following five residues (Fig. 3, lower panel). The presence of an FGly residue in a peptide is known to block Edman degradation at the position of the FGly and to reduce the efficiency in the preceding cycle (1-3). Thus, amino acid sequencing and mass spectrometry using the p-nitroaniline matrix clearly demonstrated the presence of the modified peptide 2* containing the FGly residue in fractions 26-30. The relative frequency of peptide 2 and peptide 2* estimated from the sequencing data was about 2:3.
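The mass shifts used in this argument follow from elemental composition. Oxidation of the serine side chain (CH₂OH) to the FGly aldehyde (CHO) removes two hydrogen atoms, and Schiff's base formation adds one p-nitroaniline (C₆H₆N₂O₂) with loss of one water. The sketch below recomputes both shifts from rounded average atomic masses. The constants and variable names are illustrative assumptions, and the 1590-Da starting value is the observed mass of peptide 2 quoted in the text, not a mass calculated from the sequence.

```python
# Rounded average atomic masses in Da (an assumption of this sketch).
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def formula_mass(formula):
    """Average mass of a molecule given as {element: atom count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


P_NITROANILINE = {"C": 6, "H": 6, "N": 2, "O": 2}
WATER = {"H": 2, "O": 1}

# Serine -> FGly: the side chain loses two hydrogens (CH2OH -> CHO).
ser_to_fgly_shift = -formula_mass({"H": 2})

# Schiff's base: aldehyde + amine -> imine + water.
schiff_base_shift = formula_mass(P_NITROANILINE) - formula_mass(WATER)

peptide_2_mass = 1590.0  # observed mass of serine-containing peptide 2 (from the text)
peptide_2_star_mass = peptide_2_mass + ser_to_fgly_shift
schiff_adduct_mass = peptide_2_star_mass + schiff_base_shift

print(f"Ser -> FGly shift:     {ser_to_fgly_shift:+.3f} Da")
print(f"Schiff's base shift:   {schiff_base_shift:+.3f} Da")
print(f"expected peptide 2*:   {peptide_2_star_mass:.1f} Da")
print(f"expected Schiff's base: {schiff_adduct_mass:.1f} Da")
```

The computed shifts of about -2 Da and +120 Da agree with the 1588-Da mass assigned to peptide 2* and with the 1707-1708 Da adduct observed in the p-nitroaniline matrix.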

CONCLUSIONS
The present data clearly demonstrate that the arylsulfatase from K. pneumoniae contains an FGly residue at position 72, where its DNA predicts a serine residue. About 40% of the arylsulfatase polypeptides, which were synthesized under conditions of strong induction, still contained the predicted serine 72, suggesting that the capacity of the machinery generating the FGly residue is saturable. FGly 72 in the Klebsiella sulfatase occupies a position that is homologous to that of the FGly residue in eukaryotic sulfatases. Thus, the FGly residue, which is critical for sulfate ester cleavage, is conserved between prokaryotes and eukaryotes. The pathways responsible for this unusual protein modification, however, may be different. If in eukaryotic sulfatases the cysteine is replaced by a serine, catalytically inactive sulfatases are formed (6) in which the serine residue is not converted into FGly. Whether the prokaryotic machinery converting serine into FGly can also catalyze the conversion of cysteine to FGly remains to be determined. At least one prokaryotic sulfatase is known in Pseudomonas aeruginosa for which the DNA sequence predicts a cysteine at the position occupied by FGly in sulfatases from eukaryotes and K. pneumoniae (10).
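The figure of about 40% unmodified polypeptide follows directly from the approximate 2:3 ratio of peptide 2 to peptide 2* estimated from the sequencing data. A minimal sketch of that conversion, with a hypothetical helper name, is given here for completeness. It treats the estimated ratio as exact, which the sequencing data do not warrant.

```python
def fractions_from_ratio(unmodified, modified):
    """Convert an unmodified:modified ratio into fractional abundances."""
    total = unmodified + modified
    if total <= 0:
        raise ValueError("ratio terms must sum to a positive value")
    return unmodified / total, modified / total


# Peptide 2 (serine 72) : peptide 2* (FGly 72) was estimated at about 2:3.
serine_fraction, fgly_fraction = fractions_from_ratio(2, 3)

print(f"serine 72 retained: {serine_fraction:.0%}")
print(f"FGly 72 formed:     {fgly_fraction:.0%}")
```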
The ability of K. pneumoniae to convert a serine into FGly in arylsulfatase should facilitate genetic approaches to identify components required for this novel protein modification. This in turn may help to explain the genetic defect in multiple sulfatase deficiency. In this human disease catalytically inactive sulfatases are synthesized due to a failure to generate the FGly residues in sulfatases (1).