Molecular Characterization of the Human Cα-formylglycine-generating Enzyme*

Cα-formylglycine (FGly) is the catalytic residue in the active site of sulfatases. In eukaryotes, it is generated in the endoplasmic reticulum by post-translational modification of a conserved cysteine residue. The FGly-generating enzyme (FGE), performing this modification, is an endoplasmic reticulum-resident enzyme that upon overexpression is secreted. Recombinant FGE was purified from cells and secretions to homogeneity. Intracellular FGE contains a high mannose type N-glycan, which is processed to the complex type in secreted FGE. Secreted FGE shows partial N-terminal trimming up to residue 73 without losing catalytic activity. FGE is a calcium-binding protein containing an N-terminal (residues 86-168) and a C-terminal (residues 178-374) protease-resistant domain. The latter is stabilized by three disulfide bridges arranged in a clamp-like manner, which links the third to the eighth, the fourth to the seventh, and the fifth to the sixth cysteine residue. The innermost cysteine pair is partially reduced. The first two cysteine residues are located in the sequence preceding the N-terminal protease-resistant domain. They can form intramolecular or intermolecular disulfide bonds, the latter stabilizing homodimers. The C-terminal domain comprises the substrate binding site, as evidenced by yeast two-hybrid interaction assays and photocross-linking of a substrate peptide to proline 182. Peptides derived from all known human sulfatases served as substrates for purified FGE, indicating that FGE is sufficient to modify all sulfatases of the same species.

active site (1-9). In the native enzyme FGly is present as an aldehyde hydrate carrying two geminal hydroxyls at its β-carbon that both participate in catalysis (10-12). One hydroxyl performs a nucleophilic attack on the sulfur atom of the sulfate ester leading to covalent sulfation of the enzyme. The presence of the second hydroxyl is essential for the consecutive elimination of the sulfate leading to regeneration of the formyl group. The latter step represents an "intramolecular hydrolysis" made possible by the initial aldehyde hydration and is the hallmark of this novel hydrolytic mechanism (see Ref. 10). Sulfatase activity relies on the presence of FGly. This is highlighted by multiple sulfatase deficiency (MSD), a rare inherited disorder of fatal nature that is characterized by the synthesis of catalytically inactive sulfatases lacking FGly (1,2,9,13,14).
In most eu- and prokaryotic sulfatases FGly is generated by the recently discovered FGly-generating enzyme (FGE) and all MSD patients analyzed so far carried mutations in the FGE-encoding gene (13-15). In eukaryotes, FGly formation occurs in the endoplasmic reticulum (ER) and is a late cotranslational or early posttranslational oxidation of a specific cysteine residue (5,6,16). This oxidation is directed by a conserved sequence motif downstream of the cysteine to be modified (CXPSRXXX(L/M)TG(R/K/L) in human sulfatases, see Ref. 17). In vitro FGly formation can be measured using sulfatase-derived synthetic peptides as substrates that show a high affinity toward FGE. These peptides, in modified form, could also be used to construct an affinity matrix that allowed purification of FGE from bovine testis, leading to its identification as a single 374-amino acid glycoprotein responsible for FGly generation (14). Expression of FGE in MSD patient cells led to reconstitution of sulfatase activities in these cells (13,14).
Human FGE belongs to a new protein family with orthologous members in pro- and eukaryotic sulfatase-expressing organisms. Some distant FGE-related homologs seem to have distinct oxygenase and other functions (18). Deuterostomia, including vertebrates and echinodermata, also express a paralog of FGE (pFGE) (13,14,18) 2 that also localizes to the ER. pFGE shows no FGly-generating activity in vitro but reduces the specific activity of overexpressed sulfatases in vivo, suggesting an interaction with the sulfatases and/or with FGE (32). FGE defines a new enzyme family capable of performing an oxidation reaction without any obvious requirement for a cofactor and most likely using molecular oxygen as the terminal electron acceptor. 3 How oxygen-dependent cysteine oxidation is facilitated in the ER remains unclear, even though the crystal structures of two other cysteine-oxygen oxidoreductases have been solved (19,20). These oxidoreductases, Erv2p and Ero1p, contain FAD and are unrelated to FGE on the primary sequence level. For Ero1p, which is the protein-disulfide isomerase (PDI)-oxidizing enzyme and hence the primary source of disulfides in the ER, specific cysteine residues have been implicated in redox catalysis. Also for FGE cysteines have been speculated to play an essential role for catalysis (18). To comprehend these fundamental redox mechanisms, detailed functional and structural information of the involved oxygen-utilizing oxidases is required. Here we describe functional and structural properties of human FGE allowing the depiction of a first model of its molecular architecture and its substrate binding site.

MATERIALS AND METHODS
Construction of Expression Plasmids-A pMPSVEH plasmid, containing an FGE-RGS-His6-encoding cDNA insert (see Ref. 14), served as template for an add-on PCR reaction using FGE_SP_EcoRI (gGAATTCgggacaacATGGCTGCG) and FGE-His_XbaInc (gcTCTAGAttaGTGATGGTGATGGTGATGC) as forward and reverse primers, thereby adding a 5′-EcoRI and a 3′-XbaI site to the cDNA. The PCR product was cut with EcoRI, blunted with Klenow polymerase and then cut with XbaI. After purification, it was cloned into the pSB4.7pA expression vector, provided by Transkaryotic Therapies Inc. (Cambridge, MA), downstream of its CMV promoter using the BamHI (5′), blunted with Klenow polymerase, and the XbaI (3′) sites.
For construction of a cDNA encoding FGEΔ34-68-His, a BspEI site was added 5′ of codon 69 (primer FGE_BspEI, aatgcggcTCCGGAcCGATACTCGCGGGAGG), and an EcoRI site at the 3′-end of the FGE-RGS-His6-encoding cDNA (see above). The resulting product was cloned as a BspEI/EcoRI fragment into plasmid pTD17 (5) in-frame with the signal peptide of preprolactin. The resulting plasmid pTD101 was used as a template for a second add-on PCR adding a BamHI site to the 5′-end (primer ppl_BamHI, cgGGATCCgccaccATGGACAGCAAAGGTTCG) and an XbaI site to the 3′-end (primer FGE-His_XbaInc, see above). Because of an endogenous BamHI site in the coding region the resulting PCR product had to be cloned in two steps into pSB4.7pA. First the 162-bp 3′-BamHI/XbaI fragment was inserted into the corresponding sites of the vector and then the remaining 908-bp 5′ BamHI fragment. The resulting construct with correct orientation of the BamHI fragment encodes a fusion protein consisting of the preprolactin signal peptide and mature FGE-His with a deletion of residues 34-68. All PCR reactions were performed using Pfu-Ultra polymerase (Stratagene). The resulting constructs were fully sequenced to preclude any sequence error.
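As a quick sanity check on the add-on primers named above, the following minimal sketch confirms that each primer contains the recognition sequence of the restriction enzyme it is meant to introduce. The primer sequences are taken from the text with line-break hyphens removed; the recognition sequences are the standard ones for these enzymes, and the function and dictionary names are illustrative assumptions, not part of the original protocol.

```python
# Recognition sequences (5'->3') of the restriction enzymes named in the text.
SITES = {
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "BspEI": "TCCGGA",
    "BamHI": "GGATCC",
}

# Primer name -> (sequence as printed, enzyme whose site it should add).
PRIMERS = {
    "FGE_SP_EcoRI": ("gGAATTCgggacaacATGGCTGCG", "EcoRI"),
    "FGE-His_XbaInc": ("gcTCTAGAttaGTGATGGTGATGGTGATGC", "XbaI"),
    "FGE_BspEI": ("aatgcggcTCCGGAcCGATACTCGCGGGAGG", "BspEI"),
    "ppl_BamHI": ("cgGGATCCgccaccATGGACAGCAAAGGTTCG", "BamHI"),
}


def site_position(primer: str, enzyme: str) -> int:
    """Return the 0-based index of the enzyme's site in the primer, or -1."""
    return primer.upper().find(SITES[enzyme])


def check_primers() -> dict:
    """Map each primer name to the position of its intended restriction site."""
    return {name: site_position(seq, enz) for name, (seq, enz) in PRIMERS.items()}


if __name__ == "__main__":
    for name, pos in check_primers().items():
        print(f"{name}: site found at index {pos}")
```

In each primer the site sits one or more bases in from the 5′ end, behind a short overhang; a result of -1 would flag a transcription error in a primer sequence.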
Cell Culture and Protein Expression-HT1080 cells and human skin fibroblasts were grown in Dulbecco's modified Eagle's medium supplemented with 10% fetal calf serum and 1% penicillin/streptomycin (Invitrogen) under 5% CO2 at 37°C. Transfection of HT1080 cells with pSB4.7pA-plasmids (see above) was performed with Lipofectamine 2000 following the protocol recommended by Invitrogen. After growth for 2 weeks in medium containing geneticin (G418 sulfate, Invitrogen) with increasing concentrations from 200 to 800 µg/ml, drug-resistant cells were cloned and expanded in growth medium containing geneticin (800 µg/ml). Clones were screened by Western blot analysis for expression of FGE protein.
Purification of Recombinant FGE-His-HT1080 cells stably overexpressing FGE-His were grown to near confluence in normal growth medium. Medium was collected every 48 h, cleared by spinning at 1,100 × g for 10 min, and the supernatant was subjected to ammonium sulfate precipitation (50% w/v). The precipitate obtained from 4 liters of medium was resuspended in 35 ml of buffer I (50 mM sodium phosphate, 300 mM sodium chloride, pH 8.0) and dialyzed two times for 8-12 h at 4°C against buffer I. The dialysate was cleared by centrifugation and incubated overnight at 4°C with 25 ml of Ni-NTA agarose (50% suspension, Qiagen). The agarose was washed with 12 column volumes of 25 mM imidazole in buffer I. FGE-His6 was eluted three times with 15 ml of buffer I containing 100 mM imidazole and another time with 15 ml of 250 mM imidazole in buffer I (each time incubating the Ni-NTA agarose for 20 min at 4°C). The elution fractions were pooled and incubated overnight at 4°C with 35 ml of Affi-Gel 10, substituted with ASA-C69S-(65-80) peptide (see Ref. 14), that had been equilibrated with PBS. After binding, the Affi-Gel was washed six times with 50 ml of 2 M sodium chloride in buffer II (10 mM Tris-HCl, 2.5 mM DTT, 1 mg/ml amino acid mix (0.05 mg/ml of each amino acid), pH 8.0) and again with 50 ml of buffer II. Elution was carried out as follows. The Affi-Gel was incubated for 4 h at 4°C with 35 ml of buffer II containing 300 mM KCl and 200 µM ASA-C69S-(65-80) peptide. Afterwards the Affi-Gel was washed with 35 ml of the same buffer without peptide. This procedure was repeated two times. The elution fractions were pooled, concentrated to a volume of about 500 µl using an Ultra Thimble 100/25 (MWCO 25 kDa, Schleicher & Schuell, Germany), and dialyzed against 10 mM Tris-HCl, 2.5 mM DTT, pH 8.0.
For purification of intracellular FGE-His, stably overexpressing HT1080 cells (fifteen 150-mm dishes) grown to confluency were harvested with trypsin and extracted by sonication in denaturation buffer B (The QIAexpressionist, Qiagen). The extract was used for non-native purification of FGE-His on 0.8-ml Ni2+-NTA agarose according to the manufacturer's protocol (Qiagen) using Qiagen buffers B, C, and E for binding, washing, and elution. Eluate fractions were pooled and loaded on a C4 reversed-phase column (Aquapore BU-300 7 microns, 20 × 2.1 mm, PerkinElmer Life Sciences) pre-equilibrated with 0.1% trifluoroacetic acid/water. Proteins were eluted with a gradient from 0-90% acetonitrile in 0.1% trifluoroacetic acid (1.5% per min). Peak fractions were analyzed by Coomassie Blue staining and Western blotting.
Treatment with N-ethylmaleimide (NEM)-HT1080 cells, stably expressing FGE-His or FGEΔ34-68-His, were grown to near confluency. After washing with PBS, cells were incubated with 10 mM NEM (Fluka)/PBS for 10 min at 37°C. Controls were incubated with PBS without NEM. After trypsinization, cells were harvested by centrifugation, washed once with PBS, and lysed in PBS containing 10 mM NEM. Protein concentration of cell extracts was determined by Bradford's reagent (Bio-Rad). For Western blot analysis, equal amounts of proteins were applied for SDS-PAGE under reducing and non-reducing conditions (with and without 5% mercaptoethanol), transferred to nitrocellulose, and probed with anti-His tag monoclonal antibody and horseradish peroxidase-conjugated goat anti-mouse antibody as primary and secondary antibody, respectively.
EndoH/PNGase Treatment-HT1080 cells stably expressing FGE-His were grown to confluency. After 48 h in fresh medium, cells and medium were collected for partial purification of FGE-His by chromatography on Ni-NTA agarose under native conditions according to the manufacturer's protocol (Qiagen). Instead of using imidazole, protein was eluted by boiling the Ni-agarose beads in 0.1% SDS/0.1 M β-mercaptoethanol at 95°C for 5 min. The eluate was used for PNGase (with addition of Triton X-100) and EndoH treatment as recommended by the manufacturer (Roche Applied Science).
FGE Activity Assays and Western Blotting-Activity assays of FGE were performed as described earlier using the synthetic ASA-derived peptide P23 as substrate (14). Corresponding 23-mer peptides derived from the 15 other human sulfatases were synthesized (see "Results" and Table I) and purified. The turnover of all peptides was assayed at 200 nM substrate concentration under conditions that led to 50% turnover of P23 in 30 min at 37°C.
For Western blot analysis a monoclonal anti-RGS(His6) antibody (Qiagen) or polyclonal antibodies, raised against purified FGE-His in rabbits, were used as primary antibodies. Horseradish peroxidase-conjugated goat anti-mouse or anti-rabbit antibodies were used as secondary antibodies. Western blot signals were quantified using the AIDA 2.1 software package (Raytest).
Dependence of FGE Activity on Metal Ions-FGE-His6 (~300 µg of protein purified from secretions) was incubated for 1 h on ice in the presence of 5 mM EDTA and 5 mM EGTA. Small compounds were then removed by gel filtration on a Fast Desalting PC 3.2/10 column (Amersham Biosciences) equilibrated with 10 mM Tris-HCl, pH 8.0 in the absence of DTT. Eluted FGE-His6 was dialyzed overnight at 4°C against 150 ml of the same buffer. FGE-His6 free of divalent metal ions was shock-frozen in liquid nitrogen and stored at -80°C. FGE activity assays were carried out as described earlier (14) except that the routinely present 15 µM CaCl2 was replaced by 0-150 µM of either CaCl2 or CuCl2 (final concentration). FGE-His6 was used at 2.3 nM final concentration.
Limited Proteolysis-FGE-His purified from the secretion of HT1080 cells was treated with thermolysin or elastase in phosphate-buffered saline at a protein to protease ratio of 100:1 (w/w) for up to 24 h at room temperature. The reaction was stopped by adding SDS-PAGE loading buffer and heating at 95°C. The digests were separated by SDS-PAGE, followed by Coomassie Blue staining. For N-terminal sequencing, the proteolytic fragments were resolved by SDS-PAGE, blotted onto a polyvinylidene difluoride membrane (Schleicher & Schuell, Germany), stained with Coomassie Blue, excised from the membrane and subjected to Edman degradation on a Procise cLC sequenator (Applied Biosystems).

Reductive Carbamidomethylation and Tryptic Digestion-Purified FGE-His (20-200 µg) was dried, solubilized in 45 µl of 8 M urea, 0.4 M ammonium bicarbonate, pH 8.6 (buffer U), and reduced with 5 µl of 100 mM DTT in buffer U at 56°C for 1 h. Cysteines were carbamidomethylated by incubation with 5 µl of 300 mM iodoacetamide in buffer U in the dark at 25°C for 30 min. Reduction and carbamidomethylation were repeated (adding 7.5 µl of 100 mM DTT and 5 µl of 300 mM iodoacetamide, respectively) and the excess of iodoacetamide quenched with 2.5 µl of 100 mM DTT. The pH was adjusted to about 2 by adding 130 µl 2% trifluoroacetic acid/water, and the enzyme was purified on a C4 reversed-phase column (Aquapore BU-300 7-micron 20 × 2.1 mm) applying a 0-90% acetonitrile gradient in 0.1% trifluoroacetic acid (1.5% per min). The FGE-His containing fraction was lyophilized, solubilized in 25 µl buffer U, diluted to 100 µl with water, and digested with trypsin (2% w/w) at 37°C overnight. For the identification of disulfide bridges, reduction and carbamidomethylation were omitted, and native FGE was directly digested with trypsin (at 10% w/w). After digestion the peptides were purified in aliquots, corresponding to 20 µg of FGE, on a microbore C18 reverse phase column (C18 PepMap100 1 × 150 mm, Dionex) applying a 0-63% acetonitrile gradient in 0.1% trifluoroacetic acid (1.0% per min). All reverse phase purifications were performed on a SMART system (Amersham Biosciences).
MALDI-TOF Mass Spectrometry and Edman Sequencing-For the analysis of cysteine modifications, disulfide bridges and the oligosaccharide side chain structure, tryptic peptides of FGE-His were subjected to matrix-assisted laser desorption ionization time of flight mass spectrometry (MALDI-TOF MS) using a Reflex III instrument (Bruker Daltonik, Germany) and dihydroxybenzoic acid as matrix. The identity of cysteine or oligosaccharide-containing peptides was verified by Edman sequencing (Procise cLC, Applied Biosystems), allowing also quantification of peptides. For MALDI LIFT-TOF/TOF mass spectrometry see "Photocross-linking."
Metal Content of Purified FGE-His Protein-Duplicates (0.75 mg each) of FGE-His protein, purified from HT1080 secretions, were used at 407 µM concentration in 10 mM Tris-HCl, 2.5 mM DTT (pH 8.0) for plasma mass spectrometry (ICP-MS). A buffer sample served as blank.
Photocross-linking-For photoaffinity labeling, six p-benzoyl phenylalanine (Bpa)-containing peptides were synthesized corresponding to residues 60-80 of arylsulfatase A (ASA) with an additional ε-biotinylated lysine residue at the N terminus and an additional serine amide at the C terminus. The Bpa residue replaced either phenylalanine 62, valine 66, or leucine 77 in the ASA sequence. Each of the peptides was synthesized with a cysteine (wild-type) or an alanine (C69A) residue in position 69. Cross-linking was carried out in PBS in a final volume of 50 µl. 1 mM stock solutions of the Bpa-containing P23 peptides were prepared in 30% acetonitrile, 0.1% trifluoroacetic acid. FGE-His6 (final concentration 1 µM) was incubated for 5 min on ice with a 50-fold molar excess of the respective Bpa peptide. Then the mixture was irradiated with UV light (400 watt metal vapor lamp, Ultratech, Osram, Germany) on ice for 30 min. To avoid photodamage of the protein, UV light with wavelengths below 300 nm was filtered out by a special glass filter plate (B270, Schott, Germany) (21,22). The photoadducts were analyzed by Western blotting using anti-RGS-His6 or anti-biotin monoclonal antibodies as primary antibodies and a horseradish peroxidase-conjugated goat anti-mouse antibody as secondary antibody.
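The pipetting implied by the analytical cross-linking conditions (1 µM enzyme, 50-fold molar excess of peptide, 50 µl final volume, 1 mM peptide stock in 30% acetonitrile) follows from simple dilution arithmetic. The sketch below works it out; the function names are hypothetical and the calculation assumes the peptide stock is the only source of acetonitrile in the mixture.

```python
def peptide_stock_volume_ul(enzyme_um: float, molar_excess: float,
                            final_volume_ul: float, stock_um: float) -> float:
    """Volume of peptide stock (in µl) needed to reach the desired molar excess."""
    peptide_final_um = enzyme_um * molar_excess       # target peptide concentration
    return peptide_final_um * final_volume_ul / stock_um  # C1*V1 = C2*V2


def carried_over_percent(stock_percent: float, stock_volume_ul: float,
                         final_volume_ul: float) -> float:
    """Final percentage of a stock solvent after dilution into the reaction."""
    return stock_percent * stock_volume_ul / final_volume_ul


if __name__ == "__main__":
    # 1 µM enzyme, 50-fold excess, 50 µl reaction, 1 mM (= 1000 µM) stock
    v = peptide_stock_volume_ul(1.0, 50.0, 50.0, 1000.0)
    print(f"peptide stock per reaction: {v} µl")
    print(f"acetonitrile in reaction: {carried_over_percent(30.0, v, 50.0)} %")
```

Under these assumptions each reaction takes 2.5 µl of peptide stock, which carries about 1.5% acetonitrile into the mixture.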
For determination of the cross-link site, FGE was reacted with peptide P23-Bpa77 on a preparative scale. 64 µg of FGE were cross-linked with P23-Bpa77 in four aliquots (each in 400 µl) under the conditions described for the analytical scale. Thereafter, aliquots were pooled and lyophilized. Reductive carbamidomethylation, separation on a C4 column, tryptic digestion, peptide purification and analysis by MALDI-TOF-MS and Edman sequencing were performed as described above. MALDI LIFT-TOF/TOF analysis of the metastable fragments of the cross-linked peptide (CLP) was performed on an Ultraflex MALDI-TOF/TOF mass spectrometer (Bruker Daltoniks) using Biotools (Bruker Daltoniks) for evaluation of the fragment spectra. To increase fragmentation in the C-terminal part of CLP, a truncated form was generated by treatment with proteinase K. 3 pmol of CLP were incubated with 1 ng of proteinase K in 2 µl of 25 mM NH4HCO3 at 37°C for 10 min. The reaction was stopped by adding 2 µl of dihydroxybenzoic acid as matrix for MALDI-TOF and MALDI-TOF/TOF analysis.
Yeast Two-hybrid Analysis-Based on the domain mapping predictions of FGE (18), fusion constructs of the FGE bait cDNA were generated in-frame with the Gal4 DNA binding domain vector, pGBKT7 (BD Biosciences). The subcloned FGE cDNAs encoded amino acids 87-374, 176-374, 176-312, and 320-374, which were designated as bait FGE I-III, FGE II-III, FGE II, and FGE III, respectively. As prey, partial ASA cDNAs encoding amino acid residues 19-114 or 59-80 (harboring the critical motif, Ref. 17) were subcloned in-frame with the Gal4 activation domain in pGADT7 (BD Biosciences). Bait and prey constructs were co-transformed into Saccharomyces cerevisiae (strain AH109) by a lithium acetate method (23). Co-transformants selected on -trp and -leu plates were tested for His3, ADE2, β-galactosidase, and α-galactosidase reporter expression according to the MATCHMAKER user manual (BD Biosciences). 2.5 mM 3-amino-1,2,4-triazole was sufficient to suppress the His3 leaky expression observed for FGE II bait. To confirm positive interactions, appropriate controls were performed, as mentioned in "Results."
Indirect Immunofluorescence and Immunoelectron Microscopy-To detect endogenous expression of FGE, human skin fibroblasts were grown on cover slips for 2 days and analyzed by indirect immunofluorescence as described previously (24,25). For colocalization studies, an FGE antiserum, raised in rabbits against purified FGE-His, was used as a primary antibody. Various marker proteins, PDI (ER), GM130 (Golgi), or LAMPI (endosomes/lysosomes), were detected with monoclonal antibodies. The primary antibodies were detected with appropriate secondary antibodies coupled to Cy2 or Cy3 (Molecular Probes). Confocal images were taken on a Leica TCS SP2 AOBS laser scan microscope.
For immunoelectron microscopy of human fibroblasts, ultrathin cryosections were prepared as described (26,27). Where indicated, endosomal compartments had been labeled by endocytosis (for 10 min) of 5-nm bovine serum albumin-gold (27). For immunolabeling, sections were incubated with a polyclonal antibody against FGE (1:25) for 30 min followed by a 20-min incubation with protein A-gold (10 nm). After washing steps, sections were fixed for 5 min with 1% glutaraldehyde. Sections were then quenched with glycine and blocked with bovine serum albumin before they were incubated with monoclonal antibodies against PDI (1:30) followed by a bridge antibody for 30 min (rabbit anti-mouse) and protein A-gold (5 nm) for 20 min. Sections were contrasted with uranyl acetate/methyl-cellulose for 10 min on ice, embedded in the same solution and examined with a Philips CM120 electron microscope.

RESULTS
Localization of Endogenous FGE-FGE colocalization with the ER marker PDI was shown earlier in HT1080 cells for recombinant FGE, equipped with an HA or His6 tag (14). To allow for detection of endogenous FGE, antisera were raised in rabbits against purified human FGE (see below) and shown to be suitable for indirect immunofluorescence, immunoprecipitation, and immunodetection on Western blots. Human skin fibroblasts were analyzed for expression of endogenous FGE by immunofluorescence and immunoelectron microscopy. Both methods demonstrate that the majority of FGE colocalizes with PDI, thus classifying FGE as an ER-resident protein (Fig. 1, A and B). Using immunofluorescence analysis no colocalization of FGE with the Golgi marker GM130 or the endosomal marker LAMPI was detectable (not shown), while by electron microscopy some FGE was detectable, together with PDI, on the cis side of the Golgi apparatus, possibly in the intermediate compartment (Fig. 1C). In addition some FGE was found in endosomes and lysosomes, where it colocalized with endocytosed BSA-gold particles (shown for endosomes in Fig. 1D).
Purification of Recombinant FGE-Upon stable overexpression FGE escapes from the ER and is secreted. The time course of intracellular accumulation and secretion of FGE-His6 by HT1080 cells showed that a constant level of FGE is maintained in the cells and that the excess is secreted (Fig. 2A). On Western blots intracellular FGE was detected as one major band of ~41 kDa, whereas in the medium a major ~37 kDa and a minor ~42 kDa form of FGE were detectable. The two forms accumulated in the secretions with similar kinetics, suggesting that they were generated in the cells (Fig. 2A).
FGE-His6 was purified from the secretions in a two-step protocol on Ni-NTA agarose followed by affinity chromatography on the immobilized 16-mer peptide PVSLSTPSRAALLTGR (Fig. 2, B and C, lanes 1-3). This synthetic peptide, designated ASA-C69S-(65-80), is derived from arylsulfatase A (positions 65-80 with the critical cysteine 69 mutated to serine) and was used earlier for purification of FGE from bovine testis (14). From 1 liter of medium ~2 mg of FGE-His6 with a specific activity of 0.137 ± 0.013 units/mg (n = 5) could be purified to homogeneity.
On SDS gels the purified FGE-His6 protein appeared in two molecular forms (Fig. 2B, lane 3). The larger 42-kDa form (~10% of total FGE) corresponded to mature FGE with serine 34 at its N terminus, as revealed by Edman sequencing. Residues 1-33 are predicted to represent a cleavable signal peptide (18). The smaller 37-kDa form (80-90% of total) was truncated at the N terminus starting with glutamate 73. N-terminal truncation also had been observed in FGE purified from bovine testis and obviously is compatible with catalytic activity (14). This was confirmed by expression of an N-terminally truncated form of FGE, lacking residues 34-68, which was catalytically active (not shown). All forms of FGE-His6 contained an intact C terminus as detected by anti-His6 antibodies (Fig. 2C). Occasionally traces of FGE with a further recessed N terminus were detected (marked by asterisks in Fig. 2C) exposing methionine 91 or proline 106 at the N terminus.
Intracellular FGE-His6 was partially purified under denaturing conditions by chromatography on Ni-NTA-agarose and reversed-phase HPLC on a C4 column (Fig. 2, B and C, lanes 4-6). Purified and non-purified intracellular FGE consisted of one major 41-kDa band comprising the mature full-length protein starting with serine 34 and ending with the His6 tag (Fig. 2, B and C, lane 6). Also here traces of smaller forms with N-terminal truncations were occasionally observed.
Basically the same result was obtained when intracellular FGE was purified under native conditions using the same chromatographic steps as described for purification of secreted FGE (not shown). The specific activity of FGE from cells was identical to that of secreted FGE. This demonstrates that the truncated form of FGE, which is the major form of secreted FGE, is catalytically fully functional.
Carbohydrate Structures of Intra- and Extracellular FGE-Full-length FGE from cells and secretions differed in electrophoretic mobility by ~1 kDa (Fig. 2). Since FGE is a glycoprotein with a single N-linked oligosaccharide at asparagine 141 (14,18), we tested whether the heterogeneity in size arose from differences in the carbohydrate moiety. Intracellular but not secreted FGE was sensitive to endoglucosaminidase H (Fig. 3A, lanes 1-4), indicating that intracellular FGE contains a high mannose-type oligosaccharide. Interestingly this high mannose type oligosaccharide was resistant to treatment with PNGase F (not shown). The major, i.e. the N-terminally truncated form of secreted FGE contains a complex-type oligosaccharide that was resistant to endoglucosaminidase H and sensitive to treatment with PNGase F. The minor, i.e. the full-length form of secreted FGE was resistant to endoglucosaminidase H and, again surprisingly, also to PNGase F (Fig. 3A, lanes 3-6).

FIG. 1. Subcellular localization of endogenous FGE in human fibroblasts. A, indirect immunofluorescence of FGE (red) and PDI (green). The merge reveals co-localization of PDI and FGE in yellow. B-D, immunoelectron microscopic detection of FGE with a rabbit antiserum and protein A-gold (10 nm). B and C show double labeling with the ER marker PDI (5-nm gold). D shows FGE in lysosomal and endosomal structures, the latter labeled by a 10-min uptake of BSA-gold (5 nm). ER, GApp, E, and L designate endoplasmic reticulum, Golgi apparatus, endosome, and lysosome, respectively. The scale bar corresponds to 100 nm.

1 and 4), the eluate after Ni-NTA-agarose (lanes 2 and 5), and the final eluate obtained after the ASA-C69S-(65-80) affinity column (5 µg of FGE, lane 3) or C4 reversed-phase column (lane 6), respectively, were analyzed by SDS-PAGE and Coomassie Blue staining (B). A 10% aliquot of the same samples was also analyzed by Western blotting using antibodies directed against the C-terminal His tag (C). All visible bands in the final eluates, after electroblotting onto a polyvinylidene difluoride membrane, were subjected to Edman sequencing, revealing that the indicated 41- and 42-kDa bands both represent mature FGE with serine 34 at the N terminus; the 37-kDa band started with glutamate 73. Traces of FGE with a further recessed N terminus, marked by asterisks, carried methionine 91 or proline 106 as the N-terminal amino acid.

In a more detailed analysis, the tryptic glycopeptides of FGE were purified by reversed-phase HPLC and subjected to MALDI mass spectrometry. Thereby intracellular FGE was shown to contain a family of high mannose carbohydrates comprising 4-9 hexoses attached to two N-acetylhexosamine residues (see Fig. 3B for the composition of the largest of these glycopeptides with a molecular size of 1864 Da). Extracellular FGE contained a family of complex type oligosaccharides with a core carbohydrate, consisting of 2 N-acetylhexosamine, 3 hexose, and 1 deoxyhexose residues, that carried up to 3 lactosamine, 2 sialic acid, and 1 deoxyhexose residues (see Fig. 3C for the composition of the largest form with a molecular size of 2861 Da). Thus, the size heterogeneity of intra- and extracellular FGE is explained by the differences in glycosylation and N-terminal truncation.
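The high mannose series described above can be illustrated with a short mass calculation. The sketch below sums standard monoisotopic residue masses for hexose and N-acetylhexosamine (values assumed here, not taken from the paper) for 4-9 hexoses on two N-acetylhexosamines. It treats the glycan as a residue attached to the peptide, i.e. without an additional water; the function names are illustrative.

```python
# Monoisotopic residue masses in Da (assumed standard values).
HEX = 162.05282      # hexose residue, C6H10O5
HEXNAC = 203.07937   # N-acetylhexosamine residue, C8H13NO5


def glycan_residue_mass(n_hex: int, n_hexnac: int) -> float:
    """Mass added to a peptide by a glycan of the given composition."""
    return n_hex * HEX + n_hexnac * HEXNAC


def high_mannose_series() -> dict:
    """Masses for Hex(4..9)HexNAc(2), keyed by the number of hexoses."""
    return {n: round(glycan_residue_mass(n, 2), 2) for n in range(4, 10)}


if __name__ == "__main__":
    for n_hex, mass in high_mannose_series().items():
        print(f"Hex{n_hex}HexNAc2: {mass} Da")
```

With these residue masses the largest member of the series comes out at about 1864.6 Da, which appears to agree with the 1864 Da quoted for the largest high mannose form, and neighboring members of the series differ by one hexose (about 162 Da).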
Domain Mapping-FGE has been predicted to consist of three subdomains corresponding to residues 91-154, 179-306, and 327-366, separated by short linker sequences (18). To examine whether they might correspond to folding domains, FGE-His6 purified from the secretions was subjected to treatment with elastase or thermolysin at a protease/FGE ratio of 1:100 (w/w). Both proteases produced two stable fragment families (Fig. 4). Protease-resistant fragments of ~28 kDa, comprising subdomains II and III, were generated by elastase and thermolysin. In addition, smaller protease-resistant fragments of ~18 kDa (elastase) and ~16 kDa (thermolysin), respectively, were generated. These smaller fragments comprised subdomain I (Fig. 4). By Edman sequencing we determined that elastase cleaved N-terminal of residues Asn-169, Ala-173, Val-174, Ala-175, Ala-176, Ala-177, and Pro-178, i.e. in the linker region between subdomains I and II. Also thermolysin cleaved in this linker region, namely N-terminal of Ala-173 and Val-174. The ~18-kDa fragment generated by elastase contained subdomain I with the intact N terminus of secreted FGE (glutamate 73), whereas thermolysin was trimming the N terminus leading to a stable ~16-kDa fragment exposing leucine 86 at the N terminus. The apparent molecular mass of the smaller ~16- and ~18-kDa fragments was 3-4 kDa higher than theoretically calculated, which is attributed to their glycosylation. Taken together these data suggest that in the native protein subdomains II and III are tightly interconnected and form one protease-resistant domain. Subdomain I is also protease-resistant and is connected to subdomains II and III via a protease accessible and therefore surface-exposed linker region. It should be noted that protease-treated FGE had lost FGly generating activity, but still bound the peptidic substrates of FGE (data not shown).
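The assignment of the cleavage sites to the linker between subdomains I and II can be reproduced by comparing the reported positions with the predicted subdomain boundaries. The following minimal sketch does this lookup; the boundaries and cleavage positions are those given in the text, while the function name and the output labels are illustrative assumptions.

```python
# Predicted subdomain boundaries (inclusive residue numbers) from the text.
SUBDOMAINS = {"I": (91, 154), "II": (179, 306), "III": (327, 366)}

# Residues found at the new N termini after limited proteolysis.
ELASTASE_SITES = [169, 173, 174, 175, 176, 177, 178]
THERMOLYSIN_SITES = [173, 174]


def locate(residue: int) -> str:
    """Classify a residue as lying in a subdomain, a linker, or outside both."""
    for name, (start, end) in SUBDOMAINS.items():
        if start <= residue <= end:
            return f"subdomain {name}"
    ordered = sorted(SUBDOMAINS.items(), key=lambda item: item[1][0])
    for (left, (_, left_end)), (right, (right_start, _)) in zip(ordered, ordered[1:]):
        if left_end < residue < right_start:
            return f"linker {left}-{right}"
    return "outside predicted subdomains"


if __name__ == "__main__":
    for site in ELASTASE_SITES + THERMOLYSIN_SITES:
        print(site, locate(site))
    print(86, locate(86))  # N terminus exposed by thermolysin trimming
```

All reported internal cleavage positions fall between residues 155 and 178, whereas leucine 86 lies just upstream of predicted subdomain I.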
Disulfide Bonds in Intracellular and Secreted FGE-FGE contains eight cysteines, five of which are fully and the others nearly fully conserved in eukaryotes. Two cysteines are located in subdomain II (positions 218 and 235), four in subdomain III (positions 336, 341, 346, and 365), and two cysteines are found in positions 50 and 52, i.e. in a region preceding subdomain I and lacking in the truncated form of secreted FGE. To determine which cysteine residues are part of disulfide bridges, secreted FGE was subjected to treatment with NEM, an alkylating reagent acting on free cysteines, followed by reduction with DTT and carbamidomethylation. The latter serves to modify those cysteines that are part of disulfide bridges. From this material tryptic peptides were generated, purified by reversedphase chromatography and subjected to mass spectrometric The tryptic peptide 139 -151 of intracellular FGE was purified by reversed-phase chromatography and analyzed by MALDI-TOF mass spectrometry. A series of mass signals was found corresponding to peptide 139 -151 linked to two N-acetylhexosamine residues and 4 -9 hexoses (mannoses). The structure of the largest glycopeptide form with a mass of 1864 Da is shown. C, structure of the oligosaccharide side chain at asparagine 141 of secreted FGE, determined as descibed above (see B). The mass signals obtained indicate a complex type structure of the oligosaccharide consisting of a core carbohydrate of 2 N-acetylhexosamine, 3 hexose (mannose), and 1 deoxyhexose (fucose) residues, that carried up to 3 lactosamine, 2 sialic acid, and 1 deoxyhexose (fucose) residues. The structure of the largest peptide is shown, which has a mass of 2861 Da. The pairing of the cysteine residues was identified by analyzing S-S-bridged tryptic peptides generated by tryptic digestion of native FGE. The first two cysteines (positions 50 and 52), which are part of the same tryptic peptide, form an intrapeptidic disulfide bond. 
The following six cysteines form disulfide bonds that are arranged in a clamp-like manner (Fig. 5). Cysteine 3 (position 218) and 8 (365), cysteine 4 (235) and 7 (346), and cysteine 5 (336) and 6 (341) are disulfide-bonded, thereby connecting subdomains II and III tightly.
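The clamp-like arrangement can be checked with a few lines of Python. This is only an illustrative sketch: the cysteine positions, the pairing, and the subdomain boundaries are taken from the text, whereas the names `CYS`, `BONDS`, `subdomain_of`, `is_nested`, and `inter_subdomain_bonds` are hypothetical helpers introduced here.

```python
# Cysteine numbering -> sequence position, as given in the text.
CYS = {1: 50, 2: 52, 3: 218, 4: 235, 5: 336, 6: 341, 7: 346, 8: 365}
# Disulfide pairs forming the clamp (cysteine numbers, not positions).
BONDS = [(3, 8), (4, 7), (5, 6)]
# Predicted subdomain boundaries (residue ranges, inclusive).
SUBDOMAINS = {"I": (91, 154), "II": (179, 306), "III": (327, 366)}


def subdomain_of(position):
    """Return the subdomain containing a residue, or None if it lies outside."""
    for name, (start, end) in SUBDOMAINS.items():
        if start <= position <= end:
            return name
    return None


def is_nested(bonds):
    """True if each bond lies strictly inside the next outer one (clamp pattern)."""
    spans = sorted((CYS[a], CYS[b]) for a, b in bonds)
    return all(outer[0] < inner[0] and inner[1] < outer[1]
               for outer, inner in zip(spans, spans[1:]))


def inter_subdomain_bonds(bonds):
    """Bonds whose two cysteines sit in different subdomains."""
    return [(a, b) for a, b in bonds
            if subdomain_of(CYS[a]) != subdomain_of(CYS[b])]


if __name__ == "__main__":
    print("nested:", is_nested(BONDS))
    print("bonds linking subdomains II and III:", inter_subdomain_bonds(BONDS))
```

Under these inputs the three bonds are strictly nested, and two of them (3-8 and 4-7) bridge subdomains II and III, while the innermost pair (5-6) lies within subdomain III.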
FGE is active in the endoplasmic reticulum. We therefore examined the disulfide bonding of intracellular FGE. To alkylate free cysteine residues, HT1080 cells overexpressing FGE-His₆ were treated for 10 min with NEM prior to extraction and purification of FGE. The purified protein was subjected to reductive carbamidomethylation and analyzed as above. Again cysteines 336 and 341 were partially NEM-modified (~20% of total). Different from secreted FGE, part (≤30%) of the peptide containing cysteines 50 and 52 carried a single NEM-modified cysteine. This implies that in up to 30% of intracellular FGE one of the first two cysteines is free, while the other is disulfide-bonded to a cysteine of another FGE molecule.
FGE Dimerization-The accessibility of cysteine residues to NEM (see above) suggested that intracellular FGE might exist as a disulfide-bridged homodimer. Therefore FGE-His₆ from NEM-treated HT1080 cells was examined for its electrophoretic mobility before and after reduction with β-mercaptoethanol. FGE from NEM-treated cells was recovered in two forms, a major ~40-kDa form and a minor ~80-kDa form (Fig. 6, lane 1). Reduction converted both forms into a single ~42-kDa form (Fig. 6, lane 3). This indicates that the majority of FGE exists as a monomer, while a minority (~15%) exists as a disulfide-bonded homodimer. Reduction converted all FGE into monomers, which migrated slower because of the unfolding following cleavage of the disulfide bonds (Fig. 6, lanes 3 and 4). MALDI peptide mass fingerprint analysis verified that the ~80-kDa band consisted only of FGE, thus excluding heterodimer formation of FGE with an unrelated ~40-kDa protein.
The accessibility of cysteine residues to NEM had indicated that the homodimer is established by disulfide bonds between cysteine 50 and/or cysteine 52. In agreement with the latter, no homodimers were detected in cells expressing an N-terminally truncated form of FGE lacking residues 34-68 (Fig. 6, lanes 5 and 6).
Dimeric forms of FGE were not observed in cells that had not been treated with NEM (Fig. 6, lane 2). This strongly suggests that during preparation of the cell lysate, the free cysteine residue (either cysteine 50 or 52) of the homodimeric FGE attacks the intermolecular disulfide bond resulting in FGE monomers in which cysteines 50 and 52 are intramolecularly disulfide-bonded.
Metal Content-FGE is a redox enzyme that has to transfer electrons from the cysteine residue in the sulfatases to an unknown acceptor. It was therefore of interest to analyze active FGE for its metal content. FGE-His₆, purified from HT1080 secretions, was subjected to plasma mass spectrometry. Calcium (1.65 mol/mol FGE) and potassium (0.69 mol/mol FGE) were detected, with no metals in the buffer blank. Traces of copper (0.09 mol/mol FGE) were also detectable, while significant amounts of further redox center-associated metals (Fe, Co, Ni, Mo, Mn, Zn, Se) were not found.
For depletion of divalent cations, purified FGE was treated with a mixture of EDTA and EGTA and subsequently subjected to gel filtration and extensive dialysis. This treatment did not significantly inhibit FGE activity, nor did re-addition of Ca²⁺ result in a significant stimulation, indicating that Ca²⁺ is not required for FGE activity and/or that Ca²⁺ cannot be removed from FGE by this treatment. Addition of Cu²⁺ strongly inhibited FGE activity (IC₅₀ ~5 µM), possibly because of a reaction with the easily accessible cysteine pair 336/341 (see above).
[...] 1-4). The same treatment was applied to HT1080 cells stably expressing an FGE-His mutant lacking the N-terminal residues 34-68 (lanes 5-6). After separation of samples by SDS-PAGE and transfer to nitrocellulose, Western blots were probed with anti-His antibodies.

Substrate Specificity-The human genome predicts 16 sulfatases of which 13 have been described. To test whether FGE can modify all of them, we synthesized peptides derived from all 16 sulfatases as N-acetylated 23-mers comprising residues −10 to +11, with the critical cysteine in position 0 and a C-terminal serine amide. These peptides were used as FGE substrates at 200 nM concentration, i.e. about 15-fold above the half-saturating concentration determined for P23 (14). As shown in Table I, FGE was able to generate FGly in all tested peptides.
The sulfatases can be grouped in four subfamilies differing in the +1 residue X of the CXPSR motif. Seven sulfatases contain a threonine in this position, four a serine, another four a cysteine and one an alanine residue. A comparison reveals that on average the activity toward the threonine subfamily of sulfatase-derived peptides is highest (Table I). Taken together these data indicate that FGE can modify all sulfatases, irrespective of the residue in position +1 of the CXPSR motif, albeit with different efficiencies.
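The peptide design described above (residues −10 to +11 around the critical cysteine, plus a C-terminal serine amide) can be made concrete with a short sketch. The function name `substrate_window` and its parameters are hypothetical; the offsets and the position of the ASA cysteine (69) come from the text, and the position 84 used for the iduronate 2-sulfatase peptide is an inference from the F74L substitution that the Table I legend places in the first position of that peptide.

```python
def substrate_window(cys_position, upstream=10, downstream=11):
    """Return (first residue, last residue, peptide length) for a substrate peptide.

    The sulfatase-derived part spans cys_position - upstream .. cys_position + downstream;
    one extra residue is counted for the C-terminal serine amide.
    """
    first = cys_position - upstream
    last = cys_position + downstream
    sulfatase_residues = last - first + 1
    length = sulfatase_residues + 1  # + C-terminal serine amide
    return first, last, length


if __name__ == "__main__":
    # ASA-derived peptide P23: critical cysteine 69.
    print(substrate_window(69))
    # I2S-derived peptide, assuming the critical cysteine is residue 84.
    print(substrate_window(84))
```

For cysteine 69 this gives ASA residues 59-80, which agrees with the F59M substitution in the first position of P23 and yields a 23-mer once the serine amide is counted.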
Substrate Binding Site-To characterize the interaction between FGE and its sulfatase substrates, a yeast two-hybrid assay was established employing FGE (residues 87-374) as a bait and an ASA fragment that contains the FGly modification motif as prey. Fig. 7A shows that ASA residues 19-114 fused to the GAL4 activation domain were sufficient to interact with FGE fused to the GAL4 binding domain, as indicated by expression of four independent GAL4-driven reporters. These reporters conveyed auxotrophic growth on histidine and/or adenine-deficient media (Fig. 7A), as well as the capability to cleave α- and β-galactosidase substrates, thereby developing blue color (not shown). A shorter prey construct, with ASA residues 59-80 fused to the GAL4 activation domain, failed to reveal a detectable interaction (Fig. 7A).
Next we determined which of the subdomains of FGE are required and sufficient for interaction with the sulfatase. It turned out that subdomain II in the bait protein (FGE residues 176-312 fused to the GAL4 binding domain) was sufficient to turn on all four reporter genes (Fig. 7B). Subdomain III (bait comprising FGE residues 320-374) did not lead to GAL4 promoter activation (Fig. 7B) and subdomain I could not be analyzed due to prey-independent autoactivation of the reporters (not shown). Controls verified that the interaction of subdomain II, subdomains II+III, or full-length FGE with the ASA prey depended on the presence of the inserts in both bait and prey. An interaction was not observed when the FGE or the ASA sequence was replaced by sequences derived from either lamin C, p53, or large T-antigen (data not shown). In conclusion, subdomain II is sufficient to bind the sulfatase polypeptide.

[...] −, vector without insert. Co-transformants were selected on medium lacking tryptophan and leucine. Two-hybrid interaction leads to reporter activation allowing growth on selection medium that lacks, as indicated, histidine, adenine, or both, in addition to tryptophan and leucine. In all cotransformants showing two-hybrid interaction, activity of two further reporters, α- and β-galactosidase, could be measured (not shown).

TABLE I FGE-mediated turnover of peptides derived from human sulfatases
The indicated synthetic peptides are derived from the 16 human sulfatases predicted by the human genome, with the homologous amino acid positions given in the peptide name. All peptides are N-terminally acetylated and carry a serine amide in the C-terminal position as a protective group against peptidases. Two peptides (P23 and I2S) carry a point mutation in the first position (F59M and F74L, respectively) in order to avoid coincidence of their masses with those of others. The core FGly modification motif (17) of all peptides is underlined. The cysteine to be modified and its +1 neighbor are boldface. The peptides are grouped according to the nature of this +1 residue (threonine, serine, cysteine, or alanine). 6 pmol of each peptide were incubated for 30 min under standard assay conditions (see "Materials and Methods") with 1 ng of FGE, purified from HT1080 secretions. The FGly modification of the peptides by FGE is given as absolute (pmol/h) and relative activity (percentage of P23 turnover).
In an independent approach, we tried to covalently cross-link a peptidic substrate to FGE. For that purpose, we synthesized a series of p-benzoyl phenylalanine (Bpa)-containing peptides, the sequence of which was based on that of the ASA-derived peptide P23. The photoactivatable Bpa residue (28,29) was used to replace the hydrophobic residues phenylalanine 62, valine 66, or leucine 77 in the ASA sequence. In a second series, each of the three peptides was synthesized with a C69A mutation to prevent peptide turnover by FGly modification of cysteine 69. All six peptides carried a biotinylated N-terminal lysine residue. Competition experiments indicated that the two Bpa66 peptides acted as effective inhibitors of P23-modification, whereas the two Bpa62 and the two Bpa77 peptides had only a low inhibitory activity (not shown). For photocross-linking, 1 µM FGE-His was incubated on ice with a 50-fold molar excess of the respective Bpa peptide and UV-illuminated for 30 min under conditions avoiding photodamage of the protein (22). Western blot analysis using anti-biotin antibodies demonstrated that both the cysteine and the alanine form of the Bpa77 peptide were covalently cross-linked to FGE (Fig. 8, upper panel). The cross-link product of FGE and unmodified FGE could be separated by SDS-PAGE (Fig. 8, lower panel), the cross-linked FGE showing a slightly lower electrophoretic mobility (lanes 5 and 6). The cysteine 69 variant of the Bpa77 peptide was cross-linked with higher efficiency (~40% of total FGE, lane 6) than the alanine 69 variant (~20%, lane 5). The cross-link products were specific, as their formation was dependent on UV light (lane 7) and could be fully competed by the presence of equimolar concentrations of P23 (lane 8). No cross-link products were detectable when FGE was replaced by bovine serum albumin, ovalbumin, or carbonic anhydrase (lane 9 and data not shown).
To determine the site of cross-linking in FGE, the cross-link reaction with biotinylated P23-Bpa77 was performed at a larger scale. The material was then subjected to denaturation, reductive carbamidomethylation, and purification on a C4 reversed-phase column. The latter allowed the separation of the cross-linked derivative of FGE from non-cross-linked FGE, as detected in a dot blot assay for biotin. Tryptic peptides of the cross-link product were generated, separated on a C18 column and analyzed by MALDI-TOF mass spectrometry for an FGE-derived peptide modified with AAL(Bpa)TGR, i.e. the tryptic fragment of P23-Bpa77 (mass increment of 838.43 Da). The FGE peptide TNIQQAVAAAPWWLPVK (residues 168-184) was found to carry such a modification. Upon Edman degradation of this cross-linked peptide (CLP), the FGE sequence could be determined up to the position of the first proline (Pro-178). At the same time, the Bpa77 peptide sequence AAL(Bpa)TGR was clearly detectable, with the exception of the Bpa position, as a second sequence in the first 7 Edman cycles. This suggested that the cross-link involved the sequence ¹⁷⁹WWLPVK¹⁸⁴ in FGE. To further narrow down the site of cross-linking we applied MALDI LIFT-TOF/TOF mass spectrometry to analyze CLP. A series of a-, b-, y-, and internal ions in the fragment mass spectrum of CLP clearly restricts the site of cross-linking to the C-terminal sequence ¹⁸²PVK¹⁸⁴ (Fig. 9A). A cross-link to lysine 184 can be excluded, since trypsin would be unable to cleave at a lysine cross-linked to a bulky compound of 838 Da. Digestion of CLP by proteinase K resulted in a series of truncated peptides, among them a stable peptide of 1480.83 Da corresponding to the combined masses of ¹⁸⁰WLPVK¹⁸⁴ from FGE and AAL(Bpa)TGR from the Bpa peptide (Fig. 9B, fragment 1). MALDI LIFT-TOF/TOF analysis of this 1480.83 Da product provided clear evidence for a cross-link between Bpa and proline 182.
The ion pattern obtained consists of a series of internal fragments representing immonium ions of proline 182 cross-linked to Bpa in different neutral fragments of AAL(Bpa)TGR (Fig. 9C). In conclusion, the Bpa77 peptide is cross-linked to proline 182, which represents the fourth residue of subdomain II. Proline 182 is located close to the elastase-accessible residue 178 (see above, Fig. 4) suggesting a surface exposure of this region in the FGE molecule.
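The masses quoted for the cross-link products can be reproduced from standard monoisotopic residue masses. The following is a minimal sketch, not the authors' analysis: it assumes that the reported values are singly protonated [M+H]⁺ ions, that the photoadduct adds the full mass of the Bpa peptide without loss, and that the Bpa residue has the composition C16H13NO2 (251.0946 Da). The helper names `neutral_mass` and `crosslinked_mh` are hypothetical.

```python
# Monoisotopic residue masses (Da) for the amino acids that occur here.
MONO = {
    "G": 57.021464, "A": 71.037114, "V": 99.068414, "P": 97.052764,
    "T": 101.047679, "L": 113.084064, "I": 113.084064, "N": 114.042927,
    "Q": 128.058578, "K": 128.094963, "R": 156.101111, "W": 186.079313,
}
WATER = 18.010565
PROTON = 1.007276
BPA_RESIDUE = 251.094629      # assumed: p-benzoyl phenylalanine residue, C16H13NO2
INCREMENT_FROM_TEXT = 838.43  # mass increment of AAL(Bpa)TGR as given in the text


def neutral_mass(sequence):
    """Monoisotopic mass of an unmodified linear peptide."""
    return sum(MONO[res] for res in sequence) + WATER


def crosslinked_mh(fge_peptide, increment=INCREMENT_FROM_TEXT):
    """[M+H]+ of an FGE peptide carrying the Bpa peptide as an adduct."""
    return neutral_mass(fge_peptide) + increment + PROTON


# AAL(Bpa)TGR: the six standard residues plus one Bpa residue.
bpa_peptide_mass = neutral_mass("AALTGR") + BPA_RESIDUE

if __name__ == "__main__":
    print(f"AAL(Bpa)TGR neutral mass: {bpa_peptide_mass:.2f}")
    print(f"CLP (168-184) [M+H]+:     {crosslinked_mh('TNIQQAVAAAPWWLPVK'):.2f}")
    print(f"WLPVK adduct [M+H]+:      {crosslinked_mh('WLPVK'):.2f}")
```

Under these assumptions the calculated values fall at about 838.43, 2731.47, and 1480.83, in line with the mass increment, the precursor mass selected for CLP in the Fig. 9A legend, and the proteinase K fragment, respectively.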

DISCUSSION
Structural Properties of FGE-FGE is a soluble glycoprotein located in the endoplasmic reticulum. The oligosaccharide attached to the single N-glycosylation site at asparagine 141 is represented by a family of high mannose oligosaccharides containing a chitobiose unit and 4-9 mannose residues. The lack of additional hexosamine residues indicates that FGE does not escape from the endoplasmic reticulum beyond the cis Golgi.
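The high mannose series can be illustrated by summing residue masses. This sketch assumes monoisotopic anhydro-residue masses for hexose (162.0528 Da) and N-acetylhexosamine (203.0794 Da) and counts the glycan portion only, without peptide or terminal water; under that reading the largest form (two N-acetylhexosamines, nine hexoses) comes out near the 1864 Da quoted for the largest form in the glycopeptide analysis. The function name `glycan_residue_mass` is hypothetical.

```python
HEX = 162.052823     # monoisotopic mass of a hexose residue (e.g. mannose)
HEXNAC = 203.079373  # monoisotopic mass of an N-acetylhexosamine residue


def glycan_residue_mass(n_hexnac, n_hex):
    """Sum of residue masses for a glycan of the given composition."""
    return n_hexnac * HEXNAC + n_hex * HEX


# Chitobiose core (2 HexNAc) carrying 4-9 mannose residues.
high_mannose_series = {n: glycan_residue_mass(2, n) for n in range(4, 10)}

if __name__ == "__main__":
    for n_hex, mass in high_mannose_series.items():
        print(f"HexNAc2Hex{n_hex}: {mass:.2f} Da")
```

Neighboring members of the series differ by one hexose residue (162.05 Da), which is the spacing expected between the mass signals of such a glycoform family.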
The eight cysteine residues of FGE are fully (cysteines 50, 52, 336, 341, and 365) or nearly fully (cysteines 218, 235, and 346) conserved in all known eukaryotic FGEs.² In intracellular FGE they form an interesting disulfide bond pattern. In 70% of the FGE molecules the first two cysteine residues in positions 50 and 52 form an intramolecular disulfide bond. In the remaining 30% of FGE molecules one of the two cysteines is free, and the other is engaged in an intermolecular disulfide bond. In cells treated with NEM prior to harvesting, about 15% of intracellular FGE was recovered as a disulfide-bonded homodimer. The absence of homodimers in FGE lacking the N-terminal sequence harboring cysteine residues 50 and 52 indicated that cysteines 50 and/or 52 are involved in the formation of these homodimers. Although the present data cannot exclude that in vivo a fraction of cysteines 50 and 52 is also involved in the formation of disulfide-bonded heterodimers of FGE with some other ER protein, we assume that the cysteines at positions 50 and 52 either form an intramolecular disulfide bond (in 70% or more of FGE) or stabilize FGE homodimers by an intermolecular disulfide bond (up to 30% of FGE). This intermolecular disulfide bond is prone to cleavage by the remaining free cysteine residue in position 50 or 52 resulting in the formation of two FGE monomers. This cleavage occurs when FGE is secreted (see below) or during preparation of cell extracts.

A, CLP (for production, see text) was purified by reversed-phase chromatography and subjected to MALDI LIFT-TOF/TOF mass spectrometry selecting the mass of 2731 Da for isolation and analysis of fragments generated by laser-induced dissociation (LID). The mass spectrum shows all C-terminal FGE peptide ions (y3-y15) with the mass increment of 838 Da corresponding to a cross-link to the substrate peptide AAL(Bpa)TGR, whereas the N-terminal ions (b2-b10, a3, a7) and the internal fragments PW, PWW, and PWWL have the masses predicted from the FGE sequence only. Two internal fragments correspond to the combined masses of PVK and AAL(Bpa)TG (or AAL(Bpa)TGR). These data restrict the site of cross-linking to the C-terminal sequence PVK. B, CLP was digested with proteinase K and subjected to MALDI-TOF mass spectrometry. Proteinase K truncated the FGE peptide N-terminally up to tryptophan 180 leaving a stable C-terminal peptide of 1480.83 Da (fragment 1) which consisted of the FGE sequence ¹⁸⁰WLPVK¹⁸⁴ cross-linked to the substrate peptide AAL(Bpa)TGR. C, the 1480.83 Da peptide (see B) was analyzed by MALDI LIFT-TOF/TOF mass spectrometry using LID as well as collision-induced dissociation (CID). The LID mass spectrum (shown in this figure) consists of a series of a-, b-, and y-ions as well as internal fragments. Fragmentation in the FGE sequence ¹⁸⁰WLPVK¹⁸⁴ produced the ions y1, y2 (both only found in CID spectra), and b2 corresponding to the sequences K, VK, and WL, whereas a4, b3, b4, y3, and y4 correspond to fragments of WLPVK cross-linked to the substrate fragment. In addition to the occurrence of the ions b3 and y3, a series of internal fragments points to a cross-link between Bpa and proline 182. Because of their mass differences the ions of this series consist of AAL(Bpa)TG, AAL(Bpa)T, AAL(Bpa), AL(Bpa), L(Bpa), and Bpa, respectively. Compared with the neutral masses (HN-peptide-CO) the mass of each ion is shifted by +52 Da, which can be explained by a cross-link between Bpa and proline 182 as immonium ion (a-ion, Pi) with a strong tendency to lose water. The proposed structure for one of these ions is shown in the inset. The ions of a second series (AAL(Bpa), L(Bpa), and Bpa) with a shift by +367 Da compared with the neutral peptides can be interpreted in an analogous way as a-ions (−18) with a cross-link to WoxLP.
The remaining cysteines 3-8 form three intramolecular disulfide bridges arranged such that cysteine residues 3 and 8, 4 and 7, and 5 and 6 form disulfide bonds. An interesting feature was that only 80% of cysteine residues 336 and 341, forming the innermost of the three disulfide bonds, were disulfide-bonded. The remaining 20% were free, both in intracellular and secreted FGE.
Limited proteolysis with thermolysin and elastase revealed that FGE is made up of two protease-resistant domains. Phylogenetic sequence conservation within the FGE protein family had suggested that FGE contains three conserved subdomains separated by short linker sequences of up to 25 residues and preceded by an N-terminal extension of 57 residues (18). The first subdomain in fact coincides with the first protease-resistant domain and the cleavage site between the protease-resistant domains is located in the linker sequence separating the first and the second subdomain defined by sequence conservation. However, the second and third subdomains are in fact part of one protease-resistant folding domain that is stabilized by two intersubdomain disulfide bonds.
FGE is retained in the ER by a mechanism that is saturable. Upon overexpression FGE is secreted as a monomer. A saturable mechanism such as the KDEL receptor may be indirectly involved, via an interacting partner, as FGE itself is lacking a retention signal of the KDEL-type. Upon secretion, the N-linked oligosaccharide is processed to complex-type structures that are partially sialylated and fucosylated. Moreover, the majority of secreted FGE is recovered as truncated polypeptides lacking residues 34-72 of mature FGE. The full-length form of FGE is stable in the secretions indicating that the proteolytic cleavage occurs either in the secretory route or during secretion. Secretion of endogenous FGE from metabolically labeled skin fibroblasts was hardly detectable (data not shown). If secretion really occurs, as might be suggested by the observation that a small fraction of endogenous FGE paralog (pFGE) escapes from the ER into the medium (32), its physiological meaning is questionable, since FGE can generate FGly residues only on newly synthesized and still unfolded sulfatase polypeptides.
Enzymatic Activity of FGE-The majority of secreted FGE lacks most of the N-terminal extension (residues 34 -72) of mature FGE (34 -374). This extension carries the cysteine pair involved either in homodimer stabilization or in intramolecular disulfide bond formation (see above). A recombinant N-terminally truncated form of FGE lacking residues 34 -68 is active (this study). Moreover, the catalytic activity of secreted (mostly N-terminally truncated) and intracellular (N-terminally intact) FGE was indistinguishable. Thus, neither dimer formation nor presence of the N-terminal extension is required for the catalytic activity of FGE.
FGE is a Ca²⁺-containing polypeptide. Plasma mass spectrometry indicated that FGE monomers contain one or two Ca²⁺ ions. In earlier studies, in which the activity of a crude FGE preparation was monitored with the help of an in vitro translated sulfatase polypeptide, Ca²⁺ was shown to stimulate FGE activity up to 2.5-fold with a maximum at 15 µM Ca²⁺ (16). The activity of purified FGE was affected neither by treatment with EDTA/EGTA nor by addition of Ca²⁺, indicating that free Ca²⁺ is not critical for catalytic activity; instead it may be tightly bound and fulfill a structural function.
The defect of FGE in MSD is associated with a severe decrease of all sulfatase activities that have been measured so far (30). Presently the activity of eleven human sulfatases can be determined in vitro (arylsulfatases A, B, C, E, and F, iduronate 2-sulfatase, sulfamidase, galactose 6-sulfatase, N-acetylglucosamine 6-sulfatase, Sulf1 and Sulf2), while the sequence of the human genome predicts at least 16 sulfatases. Studies with arylsulfatase A and arylsulfatase B-derived substrates have allowed the definition of a linear sequence of 16 residues that harbors a core CXPSR motif and is sufficient to direct FGE-mediated FGly-modification of the cysteine within this motif (5,6,17). Based on this information a series of 23-mer peptides representing the region critical for FGly modification in the known and predicted human sulfatases was synthesized and assayed as substrates for FGE. All peptides served as substrates for FGE, although the sulfamidase- and the galactose 6-sulfatase-derived peptides showed a rather low turnover. Both sulfamidase and galactose 6-sulfatase are known as substrates of FGE in vivo, as MSD-causing mutations of FGE lead to inactivity of the two sulfatases (31). These data show that FGE is capable of generating FGly residues in all sulfatases. This is in line with the observation that in MSD patients all sulfatases that can be assayed are affected. In human as in other vertebrates and also in echinodermata a paralog of FGE, termed pFGE, exists that colocalizes with FGE in the ER (32). One of the functions discussed for pFGE is that it may serve as the FGly-generating enzyme for a subset of sulfatases (13,14). The observation that peptides derived from 16 sulfatases are modified by FGE (this study) and not by pFGE (32) strongly suggests that pFGE is not capable of oxidizing cysteine to FGly in any sulfatase.
Substrate Binding Site of FGE-Photocross-linking of arylsulfatase A-derived peptides demonstrated that proline 182, located close to the N-terminal end of subdomain II of FGE, participates in the binding of peptidic substrates. A yeast two-hybrid approach revealed that subdomain II is even sufficient for binding of a sulfatase fragment harboring the FGly-modification motif. Subdomain II forms together with subdomain III a protease-resistant domain. Subdomain III has been suspected to contain the catalytic site of FGE. This assumption is based on the observation that this small subdomain of only 40 residues represents the most highly conserved part of FGE (about 90% similarity among eukaryotic FGEs), contains 3 cysteine residues that are fully conserved among pro- and eukaryotic members of the FGE family and is a hotspot for mutations causing MSD. Of the known 20 missense mutations, located in mature FGE, 8 are confined to this small subdomain (13,14,15).⁴ The three-dimensional structure of FGE will show whether the putative substrate binding site in subdomain II, in particular proline 182, can cooperate with a putative catalytic site in subdomain III. The only partial oxidation of cysteines 336 and 341, which represent two of the three fully conserved cysteine residues in subdomain III, may point to a catalytic role of these residues in the oxidation of the critical cysteine in sulfatase substrates.
Note Added in Proof-Reaction of secreted FGE with iodoacetamide (10 mM/5 min/37°C, stopped with 50 mM DTT) led to modification of ~95% of cysteines 336 and 341, while ~5% were found to be disulfide-bonded. In contrast, reaction with NEM led to modification of only ~20% of cysteines 336 and 341 (see text). These divergent reactivities may reflect the redox activity of the cysteine pair. Presently it is not clear, however, whether differences in the FGE preparations or in the reaction conditions play a role.