Crystal Structure of Human pFGE, the Paralog of the Cα-formylglycine-generating Enzyme*

In eukaryotes, sulfate esters are degraded by sulfatases, which possess a unique Cα-formylglycine residue in their active site. The defect in post-translational formation of the Cα-formylglycine residue causes a severe lysosomal storage disorder in humans. Recently, FGE (formylglycine-generating enzyme) has been identified as the protein required for this specific modification. Using sequence comparisons, a protein homologous to FGE was found and denoted pFGE (paralog of FGE). pFGE binds a sulfatase-derived peptide bearing the FGE recognition motif, but it lacks formylglycine-generating activity. Both proteins belong to a large family of pro- and eukaryotic proteins containing the DUF323 domain, a formylglycine-generating enzyme domain of unknown three-dimensional structure. We have crystallized the glycosylated human pFGE and determined its crystal structure at a resolution of 1.86 Å. The structure reveals a novel fold, which we denote the FGE fold and which therefore serves as a paradigm for the DUF323 domain. It is characterized by an asymmetric partitioning of secondary structure elements and is stabilized by two calcium cations. A deep cleft on the surface of pFGE most likely represents the sulfatase polypeptide binding site. The asymmetric unit of the pFGE crystal contains a homodimer. The putative peptide binding site is buried between the monomers, indicating a biological significance of the dimer. The structure suggests the capability of pFGE to form a heterodimer with FGE.

Sulfatases hydrolyze various N- and O-sulfate esters of steroids, glycopeptides, glycolipids, and other compounds. In higher eukaryotes, they are engaged in the remodeling of heparin sulfate located at the cell surface or in lysosomal degradation, whereas in bacteria and lower eukaryotes, sulfatases are involved in sulfur scavenging (1-4). Sulfatases contain a unique post-translationally generated amino acid in the catalytic site, Cα-formylglycine (FGly) (5). In mammals, the modification of a specific cysteine to FGly occurs during transition of the nascent polypeptide into the lumen of the endoplasmic reticulum (6-9). FGly, in the form of an aldehyde hydrate, is the nucleophile for the hydrolysis of sulfate esters. Recently, the FGly-generating enzyme (FGE) has been identified independently by biochemical purification (10) and by functional complementation experiments (11) as the sole enzyme responsible for the generation of FGly from cysteines in human sulfatases.
The failure of mutated forms of FGE to generate FGly results in catalytically inactive sulfatases and causes multiple sulfatase deficiency, a rare disorder leading to death in infancy or childhood (10-12). Recently, the peptide binding site of FGE has been identified by chemical cross-linking of a photoreactive peptide (38).
Database searches of the human genome revealed a coding region highly similar to FGE (48% sequence identity, 62% similarity), therefore termed paralog of FGE (pFGE) (10,11,13). pFGE and FGE belong to a large protein family currently comprising 164 members sharing a common domain of unknown function and unknown three-dimensional structure (DUF323). These DUF323 domains exhibit a mean identity of 26% over an average length of 246 amino acids and are found in prokaryotic and eukaryotic proteins. Even more proteins, ~300, containing this domain are suggested by COG1262 (clusters of orthologous groups of proteins) (14). Interestingly, the domain resides mostly on its own and is presumed to fulfill divergent functions, such as an oxygenase involved in the synthesis of a β-lactam antibiotic (Erwinia carotovora) (15,16), in toxoflavin biosynthesis (Burkholderia glumae), as a transcriptional regulator (enhancement of xylanase A production in Bacillus halodurans), as an enhancer of xylanase production (Bacillus stearothermophilus), and as the FGly-generating enzyme FGE (in mammals) (10,11). In ~12% of the proteins, the DUF323 domain has been found to be fused to other domains that are thought to function as a serine/threonine kinase in bacteria (Chlamydia trachomatis, Trichodesmium erythraeum), as cytosine/adenosine deaminases in fungi, as a peptidase in archaea and bacteria (Nostoc sp. and Methanosarcina barkeri), as a methyltransferase (Nitrosomonas europaea), or as an Hsp60 chaperonin (Homo sapiens).
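The identity figures quoted above (48% between pFGE and FGE, 26% on average across DUF323 domains) are percentages of identical residues in aligned positions. The following is a minimal sketch of that arithmetic, assuming the two sequences are already aligned and that gapped columns are excluded from the denominator (other conventions exist); the toy sequences are hypothetical and are not taken from pFGE or FGE.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over aligned, ungapped columns.

    Assumption: both strings come from the same pairwise alignment and
    columns containing a gap in either sequence are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences have a residue.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != gap and y != gap]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment, for illustration only.
toy_a = "ACDE-FGH"
toy_b = "ACNEKFG-"
toy_identity = percent_identity(toy_a, toy_b)
print(f"identity over ungapped columns: {toy_identity:.1f}%")
```

Because the choice of denominator (ungapped columns, shorter sequence, or full alignment length) changes the result, a reported identity is only comparable to another when the same convention is used.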
In vitro studies by Mariappan et al. (39) demonstrate that pFGE, which together with FGE resides in the endoplasmic reticulum, has no FGly-generating activity, although it binds sulfatase-derived peptides. Comparison of gene expression reveals an equal pFGE/FGE ratio in all tissue types studied (39). These observations and the fact that pFGE is highly conserved in vertebrates strongly suggest a function different from, but also related to, that of FGE, such as regulation of FGE activity.
We determined the crystal structure of pFGE, which represents the first three-dimensional structure of a DUF323 domain. The pFGE structure reveals a single domain of a novel fold with a strikingly low degree of secondary structure. The tertiary structure is stabilized by two calcium cations and a disulfide bond. The pFGE structure was used to determine the crystal structure of FGE, revealing a high structural similarity (40). These structures allowed us to delineate the differences between the catalytically inactive pFGE and the active FGE. In addition, the structures point to a possible cellular function of pFGE.

EXPERIMENTAL PROCEDURES
Cloning, Expression, Purification, and Crystallization of pFGE-Cloning, expression, and purification of pFGE were performed as described by Mariappan et al. (39). Protein solutions of 7-10 mg/ml concentration were used for crystallization trials. Crystals were grown at 20 °C using the sitting drop vapor diffusion method in Cryschem plates sealed with clear tape (Hampton Research, Riverside, CA) by mixing equal volumes of pFGE with reservoir solution (16-20% polyethylene glycol 8000, 20% 1,3-methyl-pentane-diol (MPD), 0.1 M CaCl₂, 4 mM dithiothreitol, 0.1 M TRIS/HCl, pH 8.0), which also served as a cryoprotectant. Heavy atom derivatives were obtained by soaking the crystals in reservoir solution supplemented with 2 mM HgCl₂ and 2 mM K₂PtCl₄, respectively, for 45 min. Xenon-pFGE derivatives were obtained by incubation of the crystals for 25 min in a pressurized chamber at 15 bar (Xcell, Oxford Cryosystems, Oxford, UK).
Data Collection-X-ray diffraction data were collected in-house using a Micromax 007 rotating copper anode generator (Rigaku/MSC, The Woodlands, TX) operating at 40 kV and 20 mA, equipped with Osmic focusing mirrors and a Mar345dtb detector system (X-ray Research, Hamburg, Germany). Cryo-cooling was performed using the X-Stream system (Rigaku/MSC). The crystal-to-detector distances were 120-180 mm. Typical data sets contained more than 180 frames with 1° rotation and a 20-min exposure time/image. The native crystal diffracted to a maximum resolution of 1.86 Å with a mosaicity of 0.97°. The data were processed with DENZO/Scalepack (HKL Research, Charlottesville, VA) and revealed a primitive monoclinic lattice with unit cell parameters of a = 47.1 Å, b = 100.8 Å, c = 62.7 Å, and β = 101.7°. Systematic absences along the b axis revealed the space group to be P2₁. The Matthews coefficient (V_M = 2.3 Å³/Da) suggested two molecules in the asymmetric unit corresponding to a solvent content of ~41%. The data collection statistics are summarized below (see Table I).
Table I footnotes: I_i(h) is the ith measurement of the h reflection and ⟨I(h)⟩ is the average value of the reflection intensity. F_o and F_c are the structure factor amplitudes from the data and the model, respectively. R_free is R_cryst with 5% of test set structure factors. d, based on maximum likelihood. e, calculated using PROCHECK (28).
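The Matthews coefficient relates the unit cell volume to the protein mass it contains. The sketch below shows the arithmetic for a monoclinic cell, using the cell parameters reported above. The molecular mass is an assumed placeholder (the text does not state the mass used), and the count of two asymmetric units per cell follows from the P2₁ space group; the output therefore illustrates the method and need not reproduce the reported value exactly.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume in Å^3 of a monoclinic unit cell: V = a * b * c * sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(cell_volume, asym_units_per_cell, molecules_per_asu, mass_da):
    """V_M in Å^3/Da: cell volume divided by the total protein mass in the cell."""
    return cell_volume / (asym_units_per_cell * molecules_per_asu * mass_da)


# Cell parameters as reported in the text.
volume = monoclinic_cell_volume(47.1, 100.8, 62.7, 101.7)

# Assumption: ~32 kDa per glycosylated monomer (hypothetical value for illustration).
ASSUMED_MASS_DA = 32_000.0

# P2_1 has two asymmetric units per cell; compare one, two, and three molecules each.
for n_mol in (1, 2, 3):
    v_m = matthews_coefficient(volume, 2, n_mol, ASSUMED_MASS_DA)
    print(f"{n_mol} molecule(s) per asymmetric unit: V_M = {v_m:.2f} Å^3/Da")
```

Comparing the candidate values against the range commonly observed for protein crystals is what singles out one occupancy of the asymmetric unit as plausible.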
Structure Determination and Refinement-Initial phases were obtained with AUTOSHARP (Global Phasing Ltd., Cambridge, UK) using data sets obtained from heavy atom derivatized crystals with platinum, xenon, and mercury. The initial phases were improved by solvent flattening as implemented in AUTOSHARP and extended to 1.9 Å, which resulted in an overall figure of merit of 0.83 (see Table I for details). No noncrystallographic symmetry averaging was necessary due to the high quality of the initial electron density map. A partial model (452 residues out of 568 residues, 54 of which were built as alanines) was automatically built using the auto-build procedure implemented in RESOLVE (17) and completed by manual fitting into σA-weighted 2F_o − DF_c and F_o − F_c difference electron density maps (18) using the programs XTALVIEW (19) and O (20). The model was refined against the native data set (maximum resolution, 1.86 Å) with REFMAC5 (21) using standard parameters. A random set of 5% of reflections was excluded from refinement to monitor R_free (22,23). Water molecules were assigned automatically for >3σ peaks in F_o − F_c difference maps by cycling the REFMAC5 refinement with ARP/wARP (24) and retained if they obeyed hydrogen bonding criteria according to HBPLUS (25) and returned >1σ 2F_o − F_c density after refinement. The final model consists of two molecules (A and B) encompassing residues 28-294 with Thr-109 missing in molecule A. No electron density was visible for the N-terminal residue 26 of molecule A and the C-terminal residues 295-312, which were therefore not included in the model. Residues Gln-106 of molecule A and Lys-107 of molecule B were modeled as alanines due to lack of side chain density. Alternate side chain conformations were modeled for Arg-79, Arg-84, Met-89, Arg-127, Arg-253, and Arg-268 of molecule A and Arg-84, Gln-138, Lys-205, and Arg-250 of molecule B. The refinement statistics are summarized below (see Table I).
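The R_free monitoring described above rests on two simple operations: setting aside a random 5% of reflections before refinement and evaluating the conventional crystallographic R-factor, R = Σ| |F_o| − |F_c| | / Σ|F_o|, separately on the working and test sets. The following is a minimal sketch of both, assuming unscaled toy amplitudes; it is not the REFMAC5 implementation, and the function names and the data are hypothetical.

```python
import random


def r_factor(f_obs, f_calc):
    """Conventional R-factor: sum(| |Fo| - |Fc| |) / sum(|Fo|)."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def split_work_free(n_reflections, free_fraction=0.05, seed=0):
    """Randomly flag a fraction of reflection indices as the test (free) set."""
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = round(n_reflections * free_fraction)
    free = sorted(indices[:n_free])
    work = sorted(indices[n_free:])
    return work, free


# Hypothetical amplitudes, for illustration only.
toy_f_obs = [100.0, 80.0, 60.0, 40.0]
toy_f_calc = [90.0, 85.0, 60.0, 44.0]
print(f"R = {r_factor(toy_f_obs, toy_f_calc):.4f}")

work_set, free_set = split_work_free(100)
print(f"{len(work_set)} working reflections, {len(free_set)} test reflections")
```

The test set must be chosen once and kept out of every refinement cycle; otherwise R_free stops being an independent check on overfitting.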
Surface complementarity coefficients and solvent-accessible surface areas were calculated with SC using a 1.7 Å radius probe (26). Possible hydrogen bonds, salt bridges, and van der Waals contacts were detected with HBPLUS and CONTACSYM (27) using default parameters. Surface potentials were calculated with PyMOL using the implemented vacuum electrostatics function. The quality of the model was checked using PROCHECK (28) and WHATCHECK (29). The Ramachandran plots (30) showed that 89.7% of the residues are in the most favorable regions and 8.3% are in the additionally allowed regions, whereas 1.1% are in the generously allowed regions. Only 0.9% of the residues are in the disallowed regions, namely Leu-116 and Asn-233. Leu-116 of both molecules has φ/ψ combinations (chain A, φ = 47.2°, ψ = −118.3°; chain B, φ = 47.5°, ψ = −117.5°) that lie in the disallowed regions of the Ramachandran plot, but the residues are very well defined in the electron density map. Leu-116 is part of the contact area between the two molecules in the asymmetric unit. The distortion appears to be a consequence of this interaction. Asn-233 is located between 2 residues that are involved in calcium cation binding via their carbonyl oxygens. The calcium cation probably exerts strain on the peptide chain, resulting in the observed forbidden conformation of Asn-233 (chain A, φ = 72.2°, ψ = −87.3°; chain B, φ = 69.7°, ψ = −85.0°). The monomer contains two cis-prolines at positions 53 and 201.
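The φ and ψ values quoted above are backbone torsion angles: φ is defined by the atoms C(i−1), N(i), Cα(i), C(i) and ψ by N(i), Cα(i), C(i), N(i+1). The sketch below computes a signed torsion angle from four Cartesian points; it illustrates the geometry only and is not the PROCHECK code. The test coordinates are hypothetical points chosen so that the expected angles are obvious.

```python
import math


def _sub(p, q):
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])


def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond, in (-180, 180]."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = _dot(b2, _cross(n1, n2))
    x = math.sqrt(_dot(b2, b2)) * _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical points: a +90 degree twist and a planar trans arrangement.
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
print(abs(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))))
```

Applied to the backbone atoms of each residue, the resulting (φ, ψ) pairs are what a Ramachandran plot bins into favored, allowed, and disallowed regions.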

RESULTS AND DISCUSSION
Structure Determination-The crystal structure of pFGE was determined by means of MIRAS using several heavy atom derivatives (Table I). The initial electron density map obtained from experimental phases after solvent flattening was of excellent quality and allowed autotracing of ~450 out of 560 amino acids of the polypeptide chains (Fig. 1A). The quality of the initial MIRAS electron density map is exemplified by the presence of density for three sugar moieties attached to Asn-191 (Fig. 1B). A native data set was used for refinement to a resolution of 1.86 Å (Table I). The pFGE structure was refined to a free R-value of 17.2%, exhibiting good stereochemistry (Table I). The asymmetric unit of the crystal contains two pFGE molecules. Due to the high resolution, no noncrystallographic symmetry restraints were applied during refinement, allowing meaningful comparison of the two monomers. An overlay of both monomers reveals a root mean square deviation of 0.32 Å, indicative of the lack of significant structural differences between the two monomers (data not shown).
The attached trisaccharide, consisting of 1 fucose and 2 GlcNAc residues, was modeled into the electron density for molecule A (Fig. 1B), but only the 2 GlcNAc residues could be modeled in molecule B of the asymmetric unit. These GlcNAc residues protrude from the molecule into a solvent channel in the crystal. By contrast, the fucose residue of molecule A is involved in the crystal packing by interaction with a molecule of the neighboring asymmetric unit, thereby stabilizing the crystal structure. Usually, the flexibility of carbohydrate side chains is detrimental to ordered crystal packing, but not so in pFGE.
FIG. 1. Structure of the pFGE monomer. A, the ribbon model of the pFGE monomer is rainbow-colored from the N terminus in blue to C terminus in red (compare Fig. 2A). Secondary structure motifs (η denotes a 3₁₀ helix) are numbered consecutively from the N terminus to the C terminus. Calcium cations are depicted as red spheres. B, an experimental electron density map of the pFGE reveals the N-glycosylation site at Asn-191. Electron density maps were calculated using MIRAS phases modified by solvent flattening. From the γ-amino group of Asn-191, additional electron density emanates that fits a trisaccharide comprising GlcNAc(-Fuc)-GlcNAc. The refined model of the branched trisaccharide is superimposed as a stick representation with the residues colored according to the atom type.
Overall Fold-The fold of pFGE consists of five α-helices ranging from 5 to 11 amino acids in length and two short 3₁₀ helices (Figs. 1A and 2, denoted by η). The 11 β-strands are 2-7 amino acids in length and form four antiparallel β-sheets (Fig. 2A). The first two sheets, formed by β-strands 1/4/11 and 2/3, respectively, engulf α-helix 5, which on the opposing side is flanked by the third sheet (β-strands 7/8/9/10). The arrangement of sheet 3 and sheet 4 (β-strands 5/6) allows the formation of a deep cleft (see below). The surface of sheet 4 opposing this cleft is shielded from the solvent by two short helices, α-helix 3 and a 3₁₀ helix, η1. With the exception of the remaining two longer helices of perpendicular arrangement (helices 1 and 4) and one short helix (helix 2), the structure is devoid of secondary structure elements (Fig. 1A). Indeed, pFGE exhibits a surprising lack of secondary structure elements, 37% in total with 17.7% helices (2.3% 3₁₀ helices and 15.4% α-helices) and 19.2% β-strands. Remarkably, the few secondary structure elements present are distributed nonuniformly and concentrated on one side of the molecule, which in the following will be referred to as the "motif-rich" side in contrast to the "sugar side" (Fig. 1A). The pFGE monomer is of almost ellipsoid shape (50 × 30 × 30 Å) and compact with two larger protrusions, one at Asn-191, where the trisaccharide is attached ("sugar side"), and a bulge formed by 3₁₀ helix 2 on the opposing side of the molecule (Fig. 1A).
The crystal structure of pFGE reveals a novel fold with an interesting partitioning of the molecule into two halves, distinguishable by the amount of secondary structure elements they contain (see above). Thus, it was not surprising that no structures with significant homology were found when performing a search using DALI (36). The structurally closest protein is a fragment of a sugar-binding protein, the C-type carbohydrate recognition domain (Crd4) from the macrophage mannose receptor (1egg-A), with a DALI score of 5.5 (a score of <2.5 indicates structural dissimilarity, whereas the self-comparison score of pFGE is 51). However, this similarity extends over only 98 residues (of a total of 268), which share a sequence identity of 8%. Additionally, these 98 residues are dispersed over the complete pFGE as well as the Crd4 domain, which comprises 132 residues. The low DALI score and the dispersed arrangement of the structurally similar residues prompted us to conclude that pFGE does not share any structural homology with known proteins.
The FGE Fold as a Paradigm for the DUF323 Domain-The unusual paucity of secondary structure elements and their uneven distribution within the molecule are the main hallmarks of this new fold that has paradigmatic character for ~300 other proteins. The pFGE and FGE structures represent the first three-dimensional models of a DUF323/COG1262 domain-containing protein. Due to the close relationship of pFGE to the functionally well characterized FGE, the novel fold of the DUF323 domain is denoted "FGE fold." Sequence comparison of all known DUF323 domains suggests that all of them adopt the FGE fold (Fig. 2B).
The manifold functions associated with the DUF323 domain, such as oxygenase, transcriptional regulator, or as an enhancer of xylanase production (see the Introduction), indicate that this fold harbors an extremely versatile set of functions. This raises the question of how the fold of the DUF323 domain can fulfill this multitude of functions. Examples in the database showing that the DUF323/COG1262 domain is found in combination with other domains, like a serine/threonine kinase, a peptidase, or a chaperonin domain (see the Introduction), suggest that it has a predominant function in polypeptide binding/modification. The few biochemical data that are available on these proteins are insufficient to point to a common active site present in all FGE fold proteins. Thus, further biochemical and structural data are needed to clarify this question.
FIG. 3. The dashed green lines indicate the coordination of the calcium cations (red spheres). A, calcium cation 1 coordinates two water molecules (blue spheres) and the side chains of Asn-194 and Asp-208 as well as the carbonyl groups of Leu-195 and Phe-210. The conserved cis-peptide Phe-200/cis-Pro-201 introduces a sharp kink in the loop region that bridges the two parts of the calcium cation 1 binding site, thereby positioning Asp-204 to coordinate the water molecules from the opposing side. B, calcium cation 2 is coordinated by the carbonyl groups of Asp-229, Leu-230, Gly-232, and Val-234, the side chain of Glu-236, and one water molecule (blue sphere). C, the disulfide bridge of Cys-156 and Cys-290 links β-strand 11 to α-helix 4 to enable its proper orientation to allow for a β-sheet formation (sheet 1 is shown in Fig. 2A).
Fold-determining and Structure-stabilizing Features of pFGE-The tertiary structure of pFGE is stabilized by a conserved disulfide bond formed by the two cysteines present in the molecule (Fig. 3A). The bridge from Cys-156, located in helix 4, to Cys-290, located in close proximity to the last β-strand (β-11), holds this β-strand close to β-strand 4 so that both become part of the same β-sheet (Fig. 2A). This disulfide bridge is also conserved in FGE, underscoring its stabilizing function. Interestingly, FGE has 6 additional cysteines, 4 of which form two additional disulfide bridges, and the remaining 2 cysteines play a role in catalytic activity (13,38,40).
Two metal binding sites could clearly be identified in the electron density maps of pFGE. Both metal ions are buried in the core of pFGE and display low B-values, indicative of their structure-stabilizing function. As the presence of calcium cations is required for crystallization, and the endoplasmic reticulum, where pFGE is naturally localized, contains elevated calcium cation concentrations, calcium cations must be the natural metal ions in pFGE. Calcium binding site 1 (Fig. 3C) in pFGE is located in the center of a bipyramidal arrangement with the corners of the pyramid base formed by the carboxamide group of the side chain of Asn-194, the Asp-208 side chain carboxylate group, and two water molecules located at a distance of 2.3-2.6 Å from the calcium cation. Asp-204 coordinates these two water molecules, and its side chain conformation depends on a sharp kink in the polypeptide chain due to cis-Pro-201 (Fig. 3C). The tips of the bipyramid are located off center with respect to the pyramid base and are occupied by the carbonyl groups of Leu-195 and Phe-210 at a distance of 2.3 Å from the calcium cation (Fig. 3C). A much more irregular coordination is observed for the other calcium site. The coordination of calcium cation 2 (Fig. 3B) is achieved by the carbonyl groups of Asp-229, Gly-232, and Val-234, as well as the side chain of Glu-236, showing comparable distances between 2.5 and 2.9 Å. The calcium cation is shifted from the central plane of the bipyramid toward the tip occupied by Leu-230. The second tip harbors a water molecule (Fig. 3B). The distances from the calcium cation to the pyramid tips are similar, 2.5 Å to the water molecule and 2.6 Å to Leu-230 (Fig. 3B). Only Leu-230 is off center when compared with the axis defined by the water molecule and the calcium cation. The carbonyl group of Leu-230 is oriented almost perpendicular to this axis, which is in contrast to the other residues involved in the coordination of both calcium cations.
The sequence comparison of DUF323 domains reveals that the two calcium cation binding sites are conserved, but the cis-proline and the disulfide bridge in pFGE are predominantly found in those of mammalian origin (Fig. 2A).
Peptide Binding and Dimer Formation-It has been shown that human pFGE lacks formylglycine-generating activity (39). Nevertheless, the function of pFGE is likely to be related to FGE function for four reasons: (i) the comparison of expression levels revealed an equal pFGE/FGE ratio in all tissue types studied; (ii) pFGE compromises the FGE-mediated generation of catalytically active sulfatases in coexpression studies; (iii) pFGE has been shown to bind peptides bearing the recognition motif required for the generation of FGly in arylsulfatase A in vitro; and (iv) the three-dimensional structures of pFGE and FGE are closely related (39,40).
The active site of FGE has recently been identified by the analysis of the FGE crystal structure (40) and confirmed by chemical cross-linking studies (38). Pro-182 of FGE, which was shown to chemically cross-link with the bound substrate-related peptide, is structurally conserved in pFGE (Pro-120). These prolines border one end of a cleft thought to be involved in peptide binding (Fig. 4A, Pro-120 in blue). This cleft is conserved in FGE and pFGE and is formed by two β-sheets (Figs. 1 and 2A, S3 and S4). The shape of the cleft is highly similar in both proteins, but its chemical and electrostatic properties partially differ (Fig. 4). The changes are localized in two distinct regions located at the ends of the cleft. The first is a small region in the vicinity of Pro-120 reaching toward the bottom of the cleft, which results in slight changes in the surface potential. The second region of amino acid substitutions in pFGE is found at the opposite side of the cleft, changing the rather neutral surface charge in FGE into a negatively charged one. Importantly, the catalytically essential cysteine residues Cys-336 and Cys-341 of FGE are located at this end of the cleft (Fig. 4), further corroborating the notion that this cleft harbors the active site. The substitution of these 2 catalytically active cysteines by Ala-261 and Asn-266 in pFGE causes the lack of FGly-generating activity, raising the question as to which function pFGE may serve in vivo.
FIG. 4. Comparison of the putative peptide binding clefts of pFGE and FGE. Left panels, pFGE; right panels, FGE. A, left, pFGE in gray with the cleft most likely to be involved in the binding of the sulfatase polypeptide chain colored in yellow. Pro-120, colored in blue, corresponds to Pro-182 of FGE that has been shown to cross-link to a substrate peptide. Right, the FGE cleft is shown in the same color code used for pFGE. The amino acid changes are highlighted in red and are predominantly located in the region of the putative reaction center. B, the electrostatic surface potential of the corresponding clefts. The differences are almost restricted to the region of the putative reaction center. Areas colored in white, red, and blue denote neutral, negative, and positive potential, respectively.
A compelling notion with respect to the function of pFGE comes from the presence of a homodimer in the asymmetric unit of the pFGE crystal. In this homodimer, the monomers face each other with the putative binding cleft for the substrate peptide (Fig. 5A). The total surface area that is buried in the homodimer is 2559 Å², corresponding to 22.3% of total monomer surface area. However, the actual contact area amounts to only 1437 Å²/monomer, as a large cavity including the peptide binding clefts is formed by the two monomers. This relatively small interaction area is at the lower end of high affinity protein-protein contact areas (37). The rather small surface complementarity coefficient of 0.59 further points to a weak interaction within the putative pFGE dimer. Dimer formation could be significantly enhanced upon binding of the substrate polypeptide into the cavity located between two pFGE monomers.
With respect to the high structural similarity of pFGE and FGE, the formation of a heterodimer also appears to be possible. A model of the pFGE/FGE heterodimer, obtained by superimposing the FGE structure on a monomer of the pFGE dimer, reveals heterodimer formation to be stereochemically feasible (Fig. 5B). The low root mean square deviation value of 1.3 Å for 249 common Cα atoms of pFGE and FGE indicates that the proposed heterodimer could form, exhibiting a similar interaction surface and cavity size. The heterodimer formation would enhance binding of FGE to the substrate polypeptide and form a protected reaction environment. In line with this, a channel remains at the heterodimer interface that would still be able to accommodate the unfolded polypeptide chain of the sulfatases. The formation of a heterodimer would explain the observation of a decrease in FGE activity when pFGE is overexpressed in cells and also the constant expression ratio of endogenous mRNA levels found across different tissues (39). Indeed, using the yeast two-hybrid system, coexpression of pFGE and FGE led to the induction of three reporter genes, suggesting a physical interaction of pFGE and FGE (39). Furthermore, for FGE, homodimers and larger complexes with a sulfatase could be identified (38,39). With respect to the structures, pFGE and FGE have the capability to form homo- and heterodimers, respectively, although this oligomerization might only occur upon binding to the unfolded polypeptide chain of the sulfatase substrates.
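The root mean square deviation quoted for the pFGE/FGE comparison is the square root of the mean squared distance between paired Cα atoms. A minimal sketch of that final step follows, assuming the two coordinate sets are already superimposed and listed in matching order; the optimal superposition itself (for example by a least-squares fit) is not shown, and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation in Å between two paired coordinate lists.

    Assumption: the structures are already superimposed and the i-th entry
    of each list refers to the same structurally equivalent atom.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                  for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Hypothetical coordinates: every paired atom is displaced by exactly 1 Å.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_b = [(0.0, 0.0, 1.0), (1.0, 0.0, -1.0)]
print(f"RMSD = {rmsd(toy_a, toy_b):.2f} Å")
```

Because the value depends on which atoms are paired, an RMSD is only meaningful together with the number of atoms compared, as in the 249 common Cα atoms stated above.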
Conclusion-The crystal structures of pFGE and FGE represent the first three-dimensional structures of the DUF323 domain, which therefore was termed the FGE fold. Due to the high sequence similarity within the proteins belonging to the DUF323 family, we can expect that all DUF323 domains adopt the FGE fold. Despite the fact that pFGE exhibits no FGly-generating activity, it is expressed in an equal pFGE/FGE ratio in many tissue types, indicating an important cellular function. Interestingly, pFGE crystallizes as a dimer, with the peptide binding site buried between the monomers, indicating a biological significance of dimerization. In addition, the formation of a pFGE/FGE heterodimer appears to be possible, pointing toward a function of pFGE in assisting or regulating FGE activity.
FIG. 5. pFGE forms a homodimer. A, two pFGE monomers are present in the asymmetric unit facing each other with the clefts forming a cavity. One monomer is rainbow-colored from the N terminus in blue to the C terminus in red. The second monomer is shown in surface representation with the residues involved in formation of the pocket highlighted in yellow and Pro-120 highlighted in red. Within the pFGE dimer, the monomers interact by three H-bonds, one salt bridge, and 108 van der Waals interactions, i.e. predominantly hydrophobic. B, superposition of FGE (red) and a pFGE monomer (yellow) of the putative heterodimer revealing the high similarity of FGE and pFGE. The second pFGE of the homodimer is colored in blue. The loop in pFGE (indicated by an arrow) is missing in FGE, as it has been removed by limited proteolysis prior to crystallization.