Posttranslational Formation of Formylglycine in Prokaryotic Sulfatases by Modification of Either Cysteine or Serine

Eukaryotic sulfatases carry an α-formylglycine residue that is essential for activity and is located within the catalytic site. This formylglycine is generated by posttranslational modification of a conserved cysteine residue. The arylsulfatase gene of Pseudomonas aeruginosa also encodes a cysteine at the critical position. This protein could be expressed in active form in a sulfatase-deficient strain of P. aeruginosa, thereby restoring growth on aromatic sulfates as sole sulfur source, and in Escherichia coli. Analysis of the mature protein expressed in E. coli revealed the presence of formylglycine at the expected position, showing that the cysteine is also converted to formylglycine in a prokaryotic sulfatase. Substituting the relevant cysteine codon by a serine codon in the P. aeruginosa gene led to expression of inactive sulfatase protein, lacking the formylglycine. The machinery catalyzing the modification of the Pseudomonas sulfatase in E. coli therefore resembles the eukaryotic machinery, accepting cysteine but not serine as a modification substrate. By contrast, in the arylsulfatase of Klebsiella pneumoniae the formylglycine is generated by modification of a serine residue. The expression of both the Klebsiella and the Pseudomonas sulfatases as active enzymes in E. coli suggests that two modification systems are present, or that a common modification system is modulated by a cofactor.

Sulfatases are members of a highly conserved gene family (1,2) sharing extensive sequence homology and a unique posttranslational modification (3,4). This novel protein modification generates a 2-amino-3-oxopropanoic acid, Cα-formylglycine (FGly), residue (5). In eukaryotes, FGly formation occurs in the endoplasmic reticulum by oxidation of a conserved cysteine residue and is directed by a linear sequence surrounding this cysteine (6,7). Deficiency of FGly formation is observed in multiple sulfatase deficiency, a rare human lysosomal storage disorder that is characterized by synthesis of sulfatase polypeptides with greatly reduced enzyme activity (8). The FGly residue is part of the catalytic site, as has been shown by crystallographic analysis of two lysosomal sulfatases (9,10). The aldehyde group of the FGly residue, most likely in its hydrated form (10,11), accepts the sulfate during sulfate ester cleavage, leading to the formation of a covalently sulfated enzyme intermediate (11) from which the sulfate can subsequently be eliminated with concomitant regeneration of the aldehyde group.
It has recently been shown that a prokaryotic member of the sulfatase family, the arylsulfatase from Klebsiella pneumoniae, also carries the FGly residue (12). In contrast to eukaryotic sulfatases, the modified residue in the Klebsiella protein is a serine instead of a cysteine, though the amino acid sequence downstream of the FGly, which is thought to direct this protein modification, is highly conserved between eukaryotes and prokaryotes (XPXRXXXLTG). Only 60% of the arylsulfatase polypeptides expressed under strongly inducing conditions carried the FGly residue, and the remaining 40% carried the serine predicted from the DNA sequence (12). Under these conditions it seems likely that the FGly generating machinery may be saturated. The FGly residue, which is critical for sulfate ester cleavage, is therefore conserved between prokaryotes and eukaryotes. However, the pathways responsible for FGly formation may be different. If the cysteine in eukaryotic sulfatases is replaced by a serine, catalytically inactive proteins are synthesized, and the serine residue does not undergo conversion to FGly (6,11).
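The conserved downstream sequence quoted above can be turned into a simple pattern search. The following is a minimal sketch, not a validated predictor: it assumes the motif XPXRXXXLTG begins immediately after the modified cysteine or serine, and the two test sequences are synthetic toy strings, not real sulfatase sequences. The function and variable names are hypothetical.

```python
import re

# Assumption: the residue to be modified (C or S) is directly followed by
# X-P-X-R-X-X-X-L-T-G, where X is any amino acid (one-letter code).
# A lookahead is used so that overlapping candidates are all reported.
FGLY_MOTIF = re.compile(r"(?=([CS].P.R.{3}LTG))")


def find_candidate_sites(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position, residue) for each C/S followed by the motif."""
    return [(m.start() + 1, protein[m.start()]) for m in FGLY_MOTIF.finditer(protein)]


# Synthetic toy sequences (hypothetical, for illustration only).
toy_cys = "MAAACSPTRAAALTGKK"
toy_ser = "MAAASSPTRAAALTGKK"

cys_sites = find_candidate_sites(toy_cys)  # [(5, 'C')]
ser_sites = find_candidate_sites(toy_ser)  # [(5, 'S')]
```

A match only flags a candidate position. Whether a cysteine or a serine at that position is actually converted to FGly depends on the modification system, so a sequence match alone does not predict modification.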
To investigate whether prokaryotes can convert both serine and cysteine to FGly, we studied the arylsulfatase from Pseudomonas aeruginosa, for which the DNA sequence predicts a cysteine at the position occupied by FGly in the sulfatase polypeptides from eukaryotes and K. pneumoniae (13). The Pseudomonas sulfatase was expressed in the wild-type form, carrying a cysteine in position 51, and as a C51S mutant. The recombinant sulfatase proteins were purified and analyzed for the presence of FGly. The results obtained showed that the Pseudomonas sulfatase was modified with high specificity and that the modification system accepted only the cysteine form of this sulfatase. The modification machinery therefore shows a different specificity to the machinery converting serine to FGly in the Klebsiella sulfatase.
P. aeruginosa ATS2 (ΔatsA recA7::Tn501) was constructed as follows. An 832-base pair deletion was created in the atsA gene of P. aeruginosa by exonuclease III/mung bean nuclease digestion of plasmid pME4051, which carried the atsA gene on a 3.9-kilobase EcoRI-SalI fragment in pBluescript, and religation. The deleted allele was transferred to the chromosome of strain PAO1 using a ColE1-based allele exchange vector, and correct recombination was confirmed by Southern analysis and polymerase chain reaction. The recA allele was introduced subsequently by transduction from P. aeruginosa PDO3 (15) and confirmed by testing for UV-sensitivity.
Site-directed Mutagenesis of Cysteine 51. Plasmid pME4055 was constructed by cloning the atsA gene of P. aeruginosa as a 2.9-kilobase SalI fragment in pBBR1MCS (16), under the control of the lac promoter. This plasmid served as a template for polymerase chain reactions with noncoding mutagenic primers comprising an NcoI site (C51S: 5′-CATGCCATGGTGCCGATCCCGGCGATGTGGTGGTCGGTGCCGGTGAGCAGCATCGAGCGGGTCGGCGAGCTGGTCG-3′; C51A carried a GGC triplet instead of GCT). The polymerase chain reaction products were subcloned as NcoI fragments replacing the corresponding fragments of the template DNA, which carried an additional NcoI site in the vector. The subcloned fragments were checked by DNA sequencing to preclude any polymerase chain reaction-derived errors.
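Because the mutagenic primers are noncoding, the triplets quoted for them read as the reverse complement of the codon on the coding strand. The sketch below illustrates this under the assumption that GCT and GGC are the noncoding-strand triplets at position 51; the helper name and the small codon lookup are hypothetical and cover only the codons needed here.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Minimal subset of the standard genetic code (DNA codons, coding strand).
CODONS = {"AGC": "Ser", "GCC": "Ala", "TGC": "Cys", "TGT": "Cys"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


# Triplets as quoted for the noncoding primers.
c51s_codon = reverse_complement("GCT")  # coding-strand codon for C51S
c51a_codon = reverse_complement("GGC")  # coding-strand codon for C51A

c51s_residue = CODONS[c51s_codon]
c51a_residue = CODONS[c51a_codon]
```

Under this reading, GCT corresponds to the serine codon AGC and GGC to the alanine codon GCC, which matches the C51S and C51A designations.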
Sulfatase Production and Purification. For overproduction of arylsulfatase protein, an NdeI site was first introduced at the atsA translation start site by site-directed mutagenesis of plasmid pME4055, using mutagenic primer ATS-N (5′-GACCCGCATATGAGCAAACG-3′). The polymerase chain reaction product was then cloned into the expression vector pET24-b (Novagen Inc.), and the 3′-part of the gene was replaced by an NcoI-SalI DNA fragment from pME4055 to give plasmid pME4322. The polymerase chain reaction-derived region of this plasmid was checked by DNA sequencing. For overproduction of the C51S and C51A mutant proteins, the mutant alleles were subcloned into pME4322 as EcoRV-SalI fragments.
Overproduction of the sulfatase proteins was carried out in E. coli BL21(DE3), as described previously (17). After lysozyme/DNase/RNase treatment the cells were lysed using a French press, and membranes were removed by ultracentrifugation (250,000 × g, 30 min). Cell extract was then desalted into 20 mM Tris/HCl, pH 7.5, with a PD-10 gel filtration column (Amersham Pharmacia Biotech). The proteins were purified by two chromatographic steps, using a BioCAD SPRINT apparatus (Perseptive Biosystems Inc.). A first separation was obtained by anion-exchange chromatography on a 1-ml Resource-Q column (Amersham Pharmacia Biotech), at a flow rate of 3 ml/min. Proteins were eluted with a gradient of 0-100 mM Na2SO4 in 20 mM Tris/HCl, pH 7.5. The pooled fractions containing the enzyme were concentrated by ultracentrifugation with a Vivaspin 4 concentrator (Vivascience), and corrected to 1 M (NH4)2SO4 by addition of concentrated ammonium sulfate solution. They were then further purified by hydrophobic interaction chromatography on a 1-ml Resource-Iso column (Amersham Pharmacia Biotech) at a flow rate of 3 ml/min and eluted with a descending gradient of (NH4)2SO4 in 20 mM Tris/HCl, pH 7.5. Fractions containing the enzyme were desalted and concentrated as before and stored at −20°C until required.
Arylsulfatase was assayed in whole cells and in cell extracts as described previously using 4-nitrocatechol sulfate (Fluka) as substrate (13). For small scale enzyme assays with E. coli, concentrated cell suspensions were treated with lysozyme (10 min, 4°C), and lysed by brief ultrasonication. Protein concentration was measured by the method of Bradford (18), using bovine serum albumin as standard.

RESULTS
Arylsulfatase expression in P. aeruginosa PAO1 is repressed during growth in LB medium because of the presence of excess inorganic sulfate (Table I; Ref. 13). Derepression is observed during growth in minimal medium with alternative sulfur sources such as sulfate esters, sulfonates, or methionine, and under these conditions arylsulfatase activities of up to 56 nmol/min per mg of total cell protein are observed (21). P. aeruginosa ATS2 is an arylsulfatase-deficient derivative of strain PAO1 in which an 832-base pair fragment of the sulfatase gene atsA was deleted. This strain therefore shows no arylsulfatase activity even under derepressing conditions, and is unable to utilize aromatic sulfate esters as a sulfur source for growth. Upon transformation of this strain with a plasmid containing the wild-type atsA gene under control of the lac promoter (pME4055), the ability of the strain to grow with aromatic sulfates as sole sulfur source was restored, and sulfatase activity could be measured during growth either in LB (Table I) or minimal medium (not shown). However, when strain ATS2 was transformed with plasmids coding for the mutated C51S or C51A versions of the arylsulfatase protein, expression of sulfatase activity was below the detection limit (Table I), and no growth was possible with aromatic sulfates as sole sulfur source. The changed residue in these constructs, cysteine 51, is equivalent to the residue that is converted to FGly in the sulfatases of eukaryotes and of K. pneumoniae, suggesting that FGly plays a crucial role in the Pseudomonas enzyme, as well as in the other sulfatases.
In the Klebsiella sulfatase, the FGly is generated from a serine (12). This sulfatase can be expressed as an active enzyme in E. coli (22). To test whether the loss of arylsulfatase activity caused by the C51S and C51A mutations in the Pseudomonas sulfatase was only observed after expression in P. aeruginosa, the same plasmids were used to transform E. coli DH5α. After induction with isopropyl thiogalactopyranoside, arylsulfatase activity was measured in the cell extracts (Table I). Although E. coli carries at least one sulfatase-related gene, the aslA gene (23), this species has not yet been found to express active endogenous sulfatases. Expression of the native Pseudomonas arylsulfatase in E. coli led to significant levels of enzyme activity in the cells, whereas the mutant forms again showed no enzyme activity (Table I). This demonstrates that the P. aeruginosa arylsulfatase can be produced in a stable, active form by E. coli.
For further protein chemical studies, the atsA gene and its mutant derivatives were placed under the control of the T7 promoter, and the proteins were overproduced in E. coli BL21(DE3). After overexpression of the Pseudomonas sulfatase or its C51S derivative in this strain, the target proteins constituted 20-30% of total cell protein (not shown). The wild-type protein and the C51S mutant were purified to homogeneity (4-10-fold purification); the purified wild-type enzyme showed a catalytic activity of 47 µmol/min per mg, whereas the activity of the C51S mutant was extremely low (13 nmol/min per mg). The C51A mutant could not be purified, because it was degraded by the cells on overexpression (not shown).
Table I (title and body only partly recovered). [...] and in E. coli DH5α. Strains harboring the indicated plasmids were grown in LB medium. Gene expression in E. coli was induced in the mid-exponential growth phase with isopropyl-β-D-thiogalactopyranoside (0.5 mM) over 3 h, and arylsulfatase activity was then measured. Arylsulfatase activity in P. aeruginosa was measured in the late exponential phase. Surviving entries: AtsA, 42; pPAS-S1, C51S-AtsA, 0; pPAS-A3, C51A-AtsA, 0.
To examine whether residue 51 was converted to FGly, the purified proteins were denatured in guanidine hydrochloride and incubated with NaB[3H]H4, which reduces the aldehyde group of FGly and generates a [3H]serine residue (4,5,12). The samples were then subjected to reductive carboxymethylation of cysteines. After removal of all low molecular weight compounds by gel filtration, aliquots were analyzed by SDS-polyacrylamide gel electrophoresis, followed by Coomassie staining and fluorography (Fig. 1). The wild-type protein was found to carry a 3H-label, but the C51S mutant was not labeled by this treatment (Fig. 1B). The two carboxymethylated sulfatases were digested with trypsin and their tryptic peptides were separated by RP-HPLC. The fractions were analyzed for radioactivity and by mass spectrometry. Radioactivity was recovered only from the wild-type sample (Fig. 2) and found to be associated with a single tryptic peptide (Fig. 2A) of 1521 Da (Fig. 3A). This mass is predicted for the [3H]serine 51-containing form of the tryptic peptide 3 comprising residues 42-55 of the Pseudomonas sulfatase, i.e. after reduction of FGly-51. A peptide of the C51S mutant eluting with the same retention time (Fig. 2B) also showed a mass of 1521 Da (Fig. 3B) but carried no 3H-label (Fig. 2B). Sequencing of the two peptides led to the amino acid sequence predicted for the reduced wild-type and the C51S form of peptide 3 (Fig. 4, A and B), both carrying a serine in position 51. In the tenth sequencing cycle, corresponding to residue 51, the 3H-radioactivity was released from the wild-type peptide (Fig. 4C). The association of the radioactivity with residue 51 strongly suggests that cysteine 51 in the newly translated wild-type protein had been oxidized to FGly, which was then reduced to [3H]serine by treatment with NaB[3H]H4.
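The correspondence between sequencing cycle and residue number follows from the position of peptide 3 in the protein. The short sketch below shows the index arithmetic, assuming that Edman degradation starts at the N-terminal residue of the peptide (residue 42) and releases one residue per cycle; the function name is hypothetical.

```python
def residue_at_cycle(first_residue: int, cycle: int) -> int:
    """Protein residue number released in a given 1-based Edman cycle."""
    if cycle < 1:
        raise ValueError("cycle numbers start at 1")
    return first_residue + cycle - 1


PEPTIDE_3_START = 42  # peptide 3 comprises residues 42-55
PEPTIDE_3_END = 55

labelled_residue = residue_at_cycle(PEPTIDE_3_START, 10)
peptide_length = PEPTIDE_3_END - PEPTIDE_3_START + 1
```

With a start at residue 42, cycle 10 corresponds to residue 51, and the 14-residue peptide is fully covered by cycles 1 to 14.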
In control samples, in which the treatment with NaB[3H]H4 had been omitted, peptide 3 showed a mass of 1519 Da for the wild-type peptide (Fig. 3C) and of 1521 Da for the C51S peptide (Fig. 3D), as predicted for the FGly-51 and serine 51-containing forms of peptide 3. When p-nitroaniline was used as a matrix for matrix-assisted laser desorption ionization mass spectrometry of the same peptides, masses of 1639 Da and 1521 Da were determined (Fig. 3, E and F). The increase in mass of the wild-type peptide 3 by 120 Da is because of formation of a Schiff base between the peptide and p-nitroaniline (4,5,12), thereby confirming the presence of the aldehyde group in the wild-type and its absence in the mutant peptide. Further support for this conclusion was obtained when sequencing the peptides. Whereas the entire sequence of the C51S-containing peptide 3 could be determined (not shown, cf. Fig. 4B), almost no signal was obtained for the C-terminal residues 51-55 of the wild-type peptide (Fig. 4D). This became most obvious in the case of proline 53, the only amino acid within this sequence (FGlySPTR) that can be recovered with good efficiency during sequencing (see Fig. 4, A and B). The presence of a FGly residue in a peptide is known to block Edman degradation at the position of the FGly (4,5,12).
Fig. 2 (start of legend lost; it refers to Fig. 1). [...] were digested with trypsin, and their tryptic peptides were separated by RP-HPLC. The UV absorbance and the radioactivity (shaded area) associated with the tryptic peptides of the wild-type protein (A) and the C51S mutant (B) are shown. The position of the peptide 3, as identified by mass spectrometry (Fig. 3, A and B) and amino acid sequencing (Fig. 4, A and B), is indicated.

DISCUSSION
The presence of FGly in the arylsulfatase of P. aeruginosa is a prerequisite for sulfatase activity, as has been observed earlier for the eukaryotic sulfatases (5,11). FGly formation in the Pseudomonas arylsulfatase is catalyzed in E. coli by specific oxidation of cysteine residue 51. Generation of FGly appeared to be quantitative, because by mass spectrometry we could not detect the unmodified peptide 3 comprising carboxymethylcysteine 51 (1595 Da) among its tryptic peptides. Modification of the Pseudomonas enzyme was only observed on a cysteine residue, and serine could not substitute for cysteine as a substrate for the reaction. This specificity, observed here in E. coli, is most likely also true in P. aeruginosa, because also in this species sulfatase activity could only be detected after expression of the wild-type enzyme, and not with the C51S sulfatase. Interestingly, the arylsulfatase protein was degraded by the E. coli host after replacement of the cysteine 51 by an alanine residue, suggesting that changes in the modification status of the protein may also have an effect on its correct folding in vivo.
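The peptide 3 masses quoted in this study (1519, 1521, 1595, and 1639 Da) are related by simple mass shifts at residue 51. The following sketch reproduces them from the serine-containing peptide, using nominal integer atomic masses as a simplifying assumption; the variable and function names are hypothetical, and only the four reported masses come from the text.

```python
# Nominal (integer) atomic masses; an approximation adequate for these shifts.
MASS = {"H": 1, "C": 12, "N": 14, "O": 16, "S": 32}


def nominal_mass(formula: dict[str, int]) -> int:
    """Sum nominal atomic masses for a formula given as {element: count}."""
    return sum(MASS[element] * count for element, count in formula.items())


SER_PEPTIDE = 1521  # reported mass of peptide 3 with serine at position 51

# FGly is the aldehyde of serine: oxidation removes two hydrogens.
fgly_peptide = SER_PEPTIDE - nominal_mass({"H": 2})

# Schiff base: aldehyde + p-nitroaniline (C6H6N2O2), with loss of water.
p_nitroaniline = nominal_mass({"C": 6, "H": 6, "N": 2, "O": 2})
water = nominal_mass({"H": 2, "O": 1})
schiff_shift = p_nitroaniline - water
schiff_peptide = fgly_peptide + schiff_shift

# Cysteine replaces the serine oxygen by sulfur; carboxymethylation then
# adds C2H2O2 (S-H becomes S-CH2-COOH).
cys_peptide = SER_PEPTIDE - MASS["O"] + MASS["S"]
cm_cys_peptide = cys_peptide + nominal_mass({"C": 2, "H": 2, "O": 2})
```

In this scheme the FGly peptide is 2 Da lighter than the serine peptide, the Schiff base adds 120 Da, and the unmodified carboxymethylcysteine peptide that was not detected would lie 74 Da above the serine form.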
By contrast, the FGly in the K. pneumoniae sulfatase is generated by modification of a serine residue (12). This modification reaction is also catalyzed in E. coli, because the Klebsiella enzyme can be expressed in E. coli as an active sulfatase (22). Generation of FGly, i.e. serine semialdehyde, from serine is most likely to be a one-step oxidation process, whereas FGly generation from cysteine has been proposed to occur in two steps involving an oxidation and a hydrolysis reaction (3,5). It is highly unlikely that the cysteine is converted to serine before being oxidized to FGly, because substitution of the critical cysteine residue of the Pseudomonas sulfatase by serine abolished FGly formation. Thus, the modification of both cysteine and serine is achieved by direct oxidation of the respective residue found in the primary translation product.
The modifying machinery catalyzing this oxidation is highly specific for the respective residue, as was demonstrated for the cysteine-converting system of prokaryotes (this study) and of eukaryotes (6,7,11). The serine-converting system, which modifies the serine of the Klebsiella sulfatase but not that of the Pseudomonas C51S sulfatase, may involve a modification machinery independent of the cysteine-converting system. Alternatively a cofactor may confer specificity to a modification machinery catalyzing FGly formation from either cysteine or serine. Because the Pseudomonas sulfatase is a cytosolic enzyme, the cysteine-specific modification system must exist in the cytosol. On the other hand, the Klebsiella sulfatase contains a leader peptide and is exported into the periplasm. The serine-converting system may therefore not be localized in the cytosol, but in the plasma membrane or in the periplasm.
In both P. aeruginosa and K. pneumoniae, expression of arylsulfatase is coupled to the sulfur status of the cell, and is repressed when preferred sulfur sources such as sulfate or cysteine are present (21,24). When the atsA gene was expressed under lac control, however, arylsulfatase activity was observed even in the presence of excess sulfate. Expression of the bacterial genes encoding the modification system(s) therefore appears to be independent of the sulfur supply to the cells.
To date, the FGly modification has only been found in sulfatase enzymes of both eukaryotic and prokaryotic origin. However, the presence of FGly modification systems in E. coli, which lacks an active sulfatase gene, and the difference in the regulatory pattern between bacterial sulfatases and the enzyme system that modifies them suggest that other as yet unidentified bacterial proteins also undergo a similar FGly modification. The cysteine-modifying system of prokaryotes shows exactly the same specificity as the eukaryotic system. We anticipate, therefore, that a genetic approach to identify the modifying enzyme(s) in bacteria may be of help in elucidating the genetic defect in human multiple sulfatase deficiency, which leads to synthesis of catalytically inactive sulfatases lacking the FGly.