Crystal Structure of HydF Scaffold Protein Provides Insights into [FeFe]-Hydrogenase Maturation*

Background: HydF is a GTPase essential for maturation of [FeFe]-hydrogenase.
Results: The first crystal structure of HydF has been determined.
Conclusion: The protein monomer comprises a GTP-binding domain, a dimerization domain, and a metal-cluster binding domain. Two monomers dimerize, and two dimers can aggregate to a tetramer.
Significance: The crystal structure of the latter furnishes several clues about the events necessary for cluster generation.
[FeFe]-hydrogenases catalyze the reversible production of H2 in some bacteria and unicellular eukaryotes. These enzymes require ancillary proteins to assemble the unique active site H-cluster, a complex structure composed of a 2Fe center bridged to a [4Fe-4S] cubane. The first crystal structure of a key factor in the maturation process, HydF, has been determined at 3 Å resolution. The protein monomer present in the asymmetric unit of the crystal comprises three domains: a GTP-binding domain, a dimerization domain, and a metal cluster-binding domain, all characterized by similar folding motifs. Two monomers dimerize, giving rise to a stable dimer, held together mainly by the formation of a continuous β-sheet comprising eight β-strands from two monomers. Moreover, in the structure presented, two dimers aggregate to form a supramolecular organization that represents an inactivated form of the HydF maturase. The crystal structure of the latter furnishes several clues about the events necessary for cluster generation/transfer and provides an excellent model to begin elucidating the structure/function of HydF in [FeFe]-hydrogenase maturation.

The [FeFe]-hydrogenase (HydA) catalytic site, the H-cluster, is a complex structure composed of a 2Fe center bridged to a [4Fe-4S] cubane (4,5). In the 2Fe subcluster, the two metal ions are bridged by what has been proposed to be a dithiomethylamine molecule (6,7) and are both coordinated to terminal CO and CN⁻ ligands and to an additional bridging CO ligand in the oxidized state (8). Assembly of the HydA catalytic site requires three conserved maturation proteins. HydE and HydG are radical S-adenosylmethionine FeS enzymes (9,10), and HydF is a GTPase containing an FeS cluster-binding motif (9,11,12). In vitro studies clearly demonstrate that HydE, HydF, and HydG (hydrogen maturase E, F, and G, respectively) are able to activate HydA produced in a genetic background lacking the maturases (HydAΔEFG) (13) and that HydF acts as a scaffold in which the 2Fe subcluster is built, exploiting its GTPase activity (12,14). A model has been proposed in which the role of the entire HydE/HydF/HydG maturation machinery is to synthesize and insert only a 2Fe subcluster together with its ligands into the structural enzyme containing a preformed [4Fe-4S] unit, rather than to transfer the entire 6Fe-containing H-cluster (15,16). The HydAΔEFG protein three-dimensional structure has been recently determined by x-ray crystallography (17) and indicated a binding interface that putatively interacts with HydF for the transfer of the appropriate organometallic center. Current data demonstrate that HydG synthesizes the CO/CN ligands (18,19), and the diatomic ligands are incorporated onto a 2Fe center on the HydF scaffold (12,20), prior to transfer to HydA (14). Despite the progress in elucidating the steps of H-cluster biosynthesis provided by the in vitro studies described above, significant gaps remain in our understanding of the precise steps involved in the maturation process, specifically the dynamic role of HydF.
Previously, the only available HydA maturase three-dimensional structure solved by x-ray crystallography was HydE from Thermotoga maritima (21).

EXPERIMENTAL PROCEDURES
TnHydF Expression and Purification-The Thermotoga neapolitana hydF gene was PCR-amplified using T. neapolitana genomic DNA as template and cloned into a pET-15b vector [...] Se-Met derivative was purified under anaerobic conditions, analogously to that described for the native protein. Site-directed Mutagenesis of TnHydF Gene-Site-directed mutagenesis of the TnHydFΔ1-36 gene was performed with the QuikChange II site-directed mutagenesis kit (Stratagene), using as template the pET-15b/TnHydFΔ1-36 recombinant plasmid. Oligonucleotides were designed according to the manufacturer's guidelines, and the mutant construct, TnHydFΔ1-36-C302S, was analyzed by DNA sequencing. The oligonucleotide sequences, with the modified bases underlined, were: mutC302Sfor, 5′-GTCATCATGGAAGGCAGCACCCACAGACCTC-3′; and mutC302Srev, 5′-GAGGTCTGTGGGTGCTGCCTTCCATGATGAC-3′.
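As a quick sanity check on the primer pair listed above, the following minimal Python sketch verifies that the reverse primer is the exact reverse complement of the forward primer, as expected for a QuikChange-style pair. The function and constant names are illustrative and not part of the original protocol; the sequences are copied from the text as contiguous strings.

```python
# Complement table for the four unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Primer sequences as listed in the text (5' to 3'), written as contiguous strings.
MUT_C302S_FOR = "GTCATCATGGAAGGCAGCACCCACAGACCTC"
MUT_C302S_REV = "GAGGTCTGTGGGTGCTGCCTTCCATGATGAC"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def primers_are_complementary(forward: str, reverse: str) -> bool:
    """True when `reverse` is the exact reverse complement of `forward`."""
    return reverse_complement(forward) == reverse.upper()


if __name__ == "__main__":
    print(len(MUT_C302S_FOR), len(MUT_C302S_REV))
    print(primers_are_complementary(MUT_C302S_FOR, MUT_C302S_REV))
```

The check only confirms that the two oligonucleotides anneal to the same site on opposite strands; it does not identify which bases were modified.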
UV-visible Spectroscopic Analysis-Dimeric and tetrameric HydF proteins, separated by size exclusion chromatography under anaerobic conditions, were diluted in 400 μl of gel filtration elution buffer at a protein concentration of 3 mg/ml, and the room temperature UV-visible absorption data were acquired using a Lambda Bio 40 UV-visible spectrometer (PerkinElmer Life Sciences). The spectra were collected from 250 to 900 nm at a data interval of 0.4 nm. Iron content was determined by the ferrozine method: 60 μl of each sample were mixed with 100 μl of hydrochloric acid, heated for 20 min at 353 K, and centrifuged, and the supernatant was treated with 10 μl of 10 mM ferrozine and 20 μl of 75 mM ascorbic acid, supplemented with 120 μl of oversaturated ammonium acetate solution. After 20 min at room temperature, the iron-ferrozine complex concentration was estimated by UV absorption at 562 nm (extinction coefficient, 27,900 M⁻¹ cm⁻¹).
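The conversion from absorbance at 562 nm to an iron-per-monomer ratio follows the Beer-Lambert law. The sketch below illustrates the arithmetic only. The extinction coefficient and the reagent volumes come from the text, whereas the 1-cm path length, the assumption that the whole supernatant is carried into the assay, the monomer mass of 48 kDa, and the absorbance value in the usage example are hypothetical inputs chosen for illustration.

```python
EPSILON_562 = 27_900.0  # M^-1 cm^-1, iron-ferrozine complex (value from the text)

# Assay volumes in microliters, as listed in the text.
SAMPLE_UL = 60.0
REAGENTS_UL = 100.0 + 10.0 + 20.0 + 120.0  # acid, ferrozine, ascorbate, acetate

# ASSUMPTION: the full supernatant is used, so the sample is diluted 310/60-fold.
DILUTION_FACTOR = (SAMPLE_UL + REAGENTS_UL) / SAMPLE_UL


def iron_concentration_molar(a562: float, path_cm: float = 1.0,
                             dilution: float = 1.0) -> float:
    """Beer-Lambert: c = A / (epsilon * l), corrected for assay dilution."""
    return a562 / (EPSILON_562 * path_cm) * dilution


def iron_per_monomer(iron_molar: float, protein_mg_per_ml: float,
                     monomer_mass_da: float) -> float:
    """Moles of iron per mole of protein monomer.

    mg/ml equals g/l, so dividing by the mass in g/mol gives mol/l.
    """
    protein_molar = protein_mg_per_ml / monomer_mass_da
    return iron_molar / protein_molar


if __name__ == "__main__":
    # Hypothetical reading, not a measured value from the paper.
    fe = iron_concentration_molar(0.20, dilution=DILUTION_FACTOR)
    print(iron_per_monomer(fe, protein_mg_per_ml=3.0, monomer_mass_da=48_000.0))
```

If only part of the supernatant is transferred, the dilution factor changes accordingly, so it is kept as an explicit parameter rather than hard-coded into the calculation.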
Crystallization and Structure Determination-Purified TnHydF protein was concentrated to 20 mg/ml and used for crystallization trials, partially automated by an Oryx 8 crystallization robot (Douglas Instruments). The protein crystallized in multiple conditions of the PACT screen (Molecular Dimension Ltd.) but could not be improved upon with standard optimization strategies. Crystals grown in the presence of 8% poly-L-glutamic acid (low molecular mass range), 0.2 M sodium formate, 0.1 M Tris buffer, pH 7.8, gave the best results (poly-L-glutamic acid screen solution number 20; Molecular Dimension Ltd.). The addition of reducing agents impaired the growth of any crystals, whereas optimization trials, performed under anaerobic conditions, resulted in very poor diffraction patterns. On the contrary, crystals that allowed the structure to be determined were grown aerobically. They were briefly soaked in a cryoprotectant solution containing the mother liquor components supplemented with 20% 2-methyl-2,4-pentanediol before flash freezing in liquid nitrogen for data collection.
Diffraction data could be processed as cubic, space group P23, with a = b = c = 138.26 Å. One monomer is present in the asymmetric unit, with V_M = 4.6 Å³/Da, corresponding to a solvent content of ~73%. The very low diffracting power of these crystals is explained by the extremely high solvent content, which makes the crystals fragile. Several hundred crystals were tested in different freezing conditions before a native protein data set could be measured at 3 Å resolution. The data set used in the final refinement was measured at the ID14eh4 beamline of the European Synchrotron Radiation Facility (Grenoble, France), whereas multiwavelength anomalous dispersion (MAD) data for the Se-Met derivative were measured at ID23eh1. The data were indexed and integrated with software XDS (22) or Mosflm (23) and merged and scaled with Scala (24), contained in the CCP4 crystallographic package (25).
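The relation between the Matthews coefficient and the solvent content quoted above can be reproduced with a few lines of arithmetic. The sketch below assumes the conventional approximation that the solvent fraction equals 1 - 1.23/V_M (with V_M in Å³/Da) and uses the fact that space group P23 has 12 asymmetric units per cell. The function names are illustrative, and the implied monomer mass is back-calculated from the reported V_M rather than taken from the paper.

```python
MATTHEWS_CONSTANT = 1.23  # Å^3/Da, conventional factor for a typical protein density
ASU_PER_CELL_P23 = 12     # number of asymmetric units in a P23 unit cell


def cubic_cell_volume(a: float) -> float:
    """Volume of a cubic unit cell with edge a (Å), in Å^3."""
    return a ** 3


def solvent_fraction(v_m: float) -> float:
    """Approximate solvent fraction from the Matthews coefficient V_M (Å^3/Da)."""
    return 1.0 - MATTHEWS_CONSTANT / v_m


def implied_mass_per_asu(a: float, n_asu: int, v_m: float) -> float:
    """Protein mass (Da) per asymmetric unit implied by the cell edge and V_M."""
    return cubic_cell_volume(a) / n_asu / v_m


if __name__ == "__main__":
    print(f"solvent fraction: {solvent_fraction(4.6):.3f}")
    mass = implied_mass_per_asu(138.26, ASU_PER_CELL_P23, 4.6)
    print(f"implied mass per asymmetric unit: {mass:.0f} Da")
```

A V_M of 4.6 Å³/Da gives a solvent fraction of about 0.73 under this approximation, in line with the ~73% quoted in the text, and the implied mass of roughly 48 kDa is consistent with a single monomer of about 400 residues in the asymmetric unit.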
The structure was solved using the MAD method with software SHARP/AUTOSHARP (26), followed by density modification. Model building was particularly difficult, because the anomalous signal did not extend to more than 4 Å resolution and was only possible because of the presence of 12 Se sites evenly distributed along the entire amino acid sequence. The GTP-binding domain was built with the assistance of a model constructed by sequence homology using the ERA protein N-terminal GTPase domain as a template (Protein Data Bank code 3IEV) (27). Several cycles of manual rebuilding, performed with graphic software Coot (28), were necessary to reach the final structure. Refinement was done using the simulated annealing procedure contained in CNS (29). Because of the low resolution, no solvent molecules were added.
The final model contains 2990 protein atoms. The crystallographic R factor is 0.274 (R free 0.309). The relatively high R factor is justified by the low quality of the diffraction data, which is mainly due to the high solvent content of the crystals. See Table 1 for complete statistics regarding data collection and the final model.
Despite this high R factor, the electron density is in general quite well defined even for side chains with the exception of a few portions, in particular the long stretch connecting domains I and II. The correctness of the model is demonstrated

RESULTS
The three-dimensional crystal structure of a recombinant His-tagged TnHydF has been determined and refined at 3 Å resolution. The polypeptide chain could be traced from amino acids 7 to 398 (numbering system of the truncated protein), with the exception of a stretch, residues 32-44, that is disordered. The asymmetric unit contains a monomer, but the biological unit is a dimer, generated by a crystallographic 2-fold axis, or a tetramer, generated by the dimerization of dimers. These three levels of enzyme organization are described in detail below, and their functional significance is discussed.
Monomer-The content of the asymmetric unit of the crystal is the HydF monomer, which is organized in three domains, each of them characterized by a common fold, i.e. a parallel β-sheet flanked on both sides by α-helices (Figs. 1 and 2A).
Domain I corresponds to the GTP-binding domain. Its fold is similar to other GTPases: five parallel and one anti-parallel β-strands compose a large sheet, with three α-helices flanking this sheet on one side and two α-helices on the other.
Domain I of HydF is structurally related to other characterized GTPases. For example, the root mean square deviation of the Cα atoms of the core residues from those of the tRNA modification GTPase (TRNE) from E. coli (Protein Data Bank code 2GJA) (31) is 2.3 Å, whereas for the cytosolic domain of the T. maritima FEOB GTPase (Protein Data Bank code 3A1S) (32) it is 2.6 Å. Conserved amino acids considered important for GTP binding and hydrolysis (11,12) are located within domain I and suggest the position of the GTP binding site, which includes the flexible loop region from residues 32 to 44. Because GTP is not bound in our structure, we suggest that the flexibility of this region is functionally significant and that the region could become ordered upon GTP binding.
Domain II, which comprises the central portion of the sequence, occupies the opposite side of the monomer and is connected to domain I through a long stretch of amino acids (Fig. 2A). The latter, composed of amino acids 172-185, runs on one side of domain III, which is located in the middle of the structure between the other two domains and interacts with both. Domain II is responsible for HydF dimerization (see below) and includes a four-stranded parallel β-sheet and three α-helices, two located on one side of the sheet and one on the other.
Domain III, the iron-sulfur cluster-binding domain, starts just after domain II. It includes a four-stranded parallel β-sheet and five α-helices, arranged in a more complex way than the other two domains (Figs. 1 and 2A). The three highly conserved cysteine residues (Cys-302, Cys-353, and Cys-356) that represent the FeS cluster-binding site, possibly along with a fourth residue such as His-352 or His-304, are spatially close together, forming a superficial pocket. Cys-302 is at the beginning of the loop connecting strand β1 to helix α3. Cys-353 and Cys-356 are in the loop connecting strand β3 to helix α4 (Fig. 1), and in our structure they form an intramolecular disulfide bridge (see below). The distance between the Sγ atom of Cys-302 and that of Cys-356 in the crystal structure is 7.8 Å, because the side chain of Cys-302 is not properly oriented to bind a cluster in the structure and points toward the same cysteine of another monomer. However, the two loops containing the cysteine residues can easily rearrange, in particular the loop that connects strand β1 to helix α3. The residues that separate Cys-353 and Cys-356 are two glycines that confer greater flexibility to this area, and this fact could have relevant implications in the binding and transfer of a 2Fe center to HydA.
Dimer-Our structure shows that HydF forms a stable dimer through domain II: β-strand 4 of the β-sheet pairs in an antiparallel way with the equivalent β-strand of another monomer, giving rise to a continuous sheet comprising eight strands (Fig. 2B). A crystallographic 2-fold axis, roughly parallel to the β-sheet, relates the two monomers. The interactions between the two domains, in addition to the H-bonds formed between the main chains of the two antiparallel strands, involve a large surface that includes helix α1 interacting with the long loop connecting β1 to α1 of the other monomer and helix α3 interacting with the loop connecting α2 to β3. These interactions are repeated twice because of the 2-fold symmetry. The total surface buried by dimerization corresponds to ~1800 Å²/monomer, a number that indicates the existence of a stable physiological dimer. The latter assumes a left-handed helical shape (Fig. 2B and supplemental Fig. S6), which leaves both the putative subcluster and the GTP-binding sites fully exposed to the solvent and offers a large protein surface for contacts with possible partners. The distance between the centers of mass of the two GTP-binding domains is ~75 Å, whereas that between the two domains III is ~55 Å. The two FeS cluster-binding motifs are located approximately in the middle of each monomer, whereas the two GTP-binding sites are at the two extreme ends, at an estimated distance of ~70 Å from each other. The distances between the intramolecular disulfide bridge, which can be taken as representative of the position of the cluster, and the two hypothetical GTP positions are more than 30 and 60 Å for the intra- and intermonomer cases, respectively (Fig. 2B).
Tetramer-Two dimers aggregate to form a supramolecular organization that is most appropriately denoted as a dimer of dimers but for brevity will be designated as a tetramer. The tetramer that we observe in the crystal lacks FeS clusters, at variance with the freshly purified tetramer, which binds iron; consequently, it likely has some unique structural features relative to the newly prepared tetramer in solution (see below). The crystal tetramer is highly symmetrical and is characterized by three perpendicular 2-fold axes, all corresponding to crystallographic symmetry axes in our case (Fig. 3). The interaction between the two dimers involves mainly a large area of the two FeS cluster-binding domains, in particular the two β2 strands, the initial part of the long loop that connects strand β2 to strand β3, and the loop that connects strand β1 to helix α3. All of these secondary structure elements belong to domain III, which is primarily responsible for tetramer formation. In addition, the loop region at the beginning of helix α3 of domain III comes in contact with the initial part of strand β2 of domain I, residues 48-50, allowing structural interactions between the metal binding and GTPase domains in the tetramer. It should be pointed out that the connection between the latter and helix α1 is flexible and could not be traced in our model, a feature common in several other GTPases. Interactions in the tetramer are apparently less specific than those of the dimer, but the tetramer in the crystal is stabilized by the two disulfide bridges between pairs of conserved Cys-302 of each monomer (Fig. 4). Formation of a tetramer results in a compact assembly, at variance with the isolated dimer, which appears as an open structure.
Cluster Binding Properties and Oligomeric State Studies in Solution-In gel filtration experiments, both HydF protein homologues from T. neapolitana and Clostridium acetobutylicum (not shown), expressed in E. coli, invariably elute as two distinct species, corresponding to the molecular masses of a dimer and a tetramer, in a ~2:1 weight ratio. The two separated species show long term stability when stored at 277 K and keep the same behavior in solution, as assessed by size exclusion chromatography analysis. Moreover, when produced in anaerobic conditions, both the tetrameric and dimeric HydF proteins show a UV-visible absorption spectrum typical of FeS cluster-containing proteins (supplemental Fig. S7). T. neapolitana HydF binds ~0.5 iron atom/monomer, and a slightly lower amount in the case of the tetramer. This value is not significantly different from that previously reported for an affinity-purified C. acetobutylicum HydF, where the two oligomeric forms are mixed together (19). These low substoichiometric ratios could be partially ascribed to the heterologous nature of the expression system, which results in very high expression levels of T. neapolitana HydF, or to an inherently low binding affinity that is a consequence of the role of HydF in iron transfer.
Intriguingly, during crystallization, the dimer invariably interconverts into an apo-tetramer. Even when starting from freshly purified dimeric species, the crystals obtained contained a tetramer (supplemental Fig. S2), deprived of any FeS cluster, at least in the best diffracting crystals.
A TnHydF protein carrying a point mutation in the cysteine residue involved in the intermolecular bridge between two dimers (i.e. TnHydF C302S) was obtained and purified in the same conditions. UV-visible absorption spectra showed that the affinity-purified mutant still binds iron (supplemental Fig. S7), but the iron content of the freshly purified mutant corresponds, in the best case, to ~0.35 iron atoms per monomer. Moreover, the TnHydF C302S mutant has a dimer/tetramer weight ratio of ~5:1 (supplemental Fig. S2), suggesting that two dimers can associate independently of these cysteine residues.
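Because the ratios above are given by weight, it can help to convert them into the fraction of monomers found in each oligomeric state and into molar ratios of particles. The following sketch does this with exact fractions, assuming only that a tetramer has twice the mass of a dimer. The function names are illustrative, and the inputs are the approximate ~2:1 (wild type) and ~5:1 (C302S) ratios reported in the text.

```python
from fractions import Fraction


def tetramer_monomer_fraction(dimer_weight: int, tetramer_weight: int) -> Fraction:
    """Fraction of all monomers that sit in tetramers.

    Mass is proportional to monomer count, so this equals the weight fraction.
    """
    return Fraction(tetramer_weight, dimer_weight + tetramer_weight)


def dimer_to_tetramer_molar_ratio(dimer_weight: int, tetramer_weight: int) -> Fraction:
    """Number of dimer particles per tetramer particle.

    Particle count is proportional to weight divided by subunits per particle.
    """
    return Fraction(dimer_weight, 2) / Fraction(tetramer_weight, 4)


if __name__ == "__main__":
    for label, ratio in (("wild type", (2, 1)), ("C302S", (5, 1))):
        print(label,
              tetramer_monomer_fraction(*ratio),
              dimer_to_tetramer_molar_ratio(*ratio))
```

Under these assumptions, about one third of wild-type monomers are in tetramers (four dimers per tetramer), compared with about one sixth for the mutant (ten dimers per tetramer).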

DISCUSSION
The precise nature of the FeS cluster(s) bound to HydF has not been clearly established. Literature data are somewhat inconsistent, which likely reflects the dynamic nature of HydF in the maturation process and the difficulties in isolating and studying intermediate states (11,12,14,17,20,33). Consensus is emerging that both [4Fe-4S] and [2Fe-2S] clusters are bound to HydF before the interaction with the other maturases, HydE and HydG, and that these clusters are bound to HydF in substoichiometric ratios in most as-isolated biochemical preparations (11,12,14,20). As demonstrated by UV-visible absorption spectra and the ferrozine method, both the dimer and the tetramer purified from E. coli under anaerobic conditions in this study contain FeS clusters at substoichiometric levels.
When expressed in similar heterologous systems (12,14), recombinant HydF protein from C. acetobutylicum has been reported to bind roughly 0.8-0.9 iron ion/monomer, a value in line with our results. This ratio is quite low relative to protein, if we consider that each monomer could bind up to six iron ions, including both the [4Fe-4S] and a binuclear subcluster (33), and indicates a heterogeneous sample preparation.
The structure we determined corresponds to an inactivated form of HydF, in which neither the cluster(s) nor GTP is bound. This indicates that the conditions and time necessary for crystal growth shift the equilibrium of the reaction toward an inactivated, tetrameric state, totally deprived of reactive species. These events most likely facilitated crystallization of the HydF maturase and allowed the first HydF crystal structure to be determined.
Although the lack of iron and GTP in our crystal form hinders a clarification of the complete mechanism of cluster formation/transfer, important insights regarding HydF function are suggested by the crystal structure. The dimer represents an open form of HydF, where both the cluster and the GTP-binding site are fully accessible. On the contrary, the cluster sites in the tetramer are concealed, with pairs of cluster-binding sites belonging to the two independent dimers located in front of one another, at a distance of ~18 Å. We are tempted to speculate that one cluster could be bridged in the tetramer between two cluster-binding sites or that somehow the two clusters facing each other in the tetramer are close enough to establish some interactions that could play a role in cluster maturation and delivery to HydA. Indeed, such a structural organization is similar to the distribution of multiple [4Fe-4S] clusters in the HydA enzyme, where at a similar distance to the active site, one of the accessory clusters (F cluster) is positioned for electron transfer to the active site (4).
Because HydE and HydG, in addition to GTP, are required for cluster maturation, it is plausible that the open HydF dimer interacts with one or both potential partners in its extended conformation when the cluster-binding sites are fully accessible. This interaction could be mediated by GTP binding and hydrolysis, as suggested by the observation that the presence of HydE and HydG increases the rate of hydrolysis of GTP by ~50% (19). Moreover, it has been reported that neither the binding of GDP nor that of a nonhydrolyzable GTP analogue apparently interferes with the ability of HydF to interact with HydA and to transfer a mature cluster (19), pointing toward a potential role of this domain in regulating protein-protein interactions with the other maturases.
The binding of GTP could order the region 32-44 of domain I, which on the contrary is flexible in the absence of ligand. These residues, which connect helix α1 to strand β2 of domain I, are on the surface of the model, close to one of the interaction areas of one dimer (supplemental Fig. S8). A theoretical model of the possible position of this long loop based on the structure of cytosolic T. maritima FeoB (Protein Data Bank code 3A1S) (32) with the nucleotide bound is shown in supplemental Fig. S9. Intriguingly, in the model, several residues of loop 32-44 of one monomer clash with residues belonging to domain III of a monomer involved in the tetramer organization, suggesting that this area must be disordered for the formation of the tetramer. If GTP binding coordinates interactions with the other maturases, the loss/hydrolysis of GTP after association with HydG/E may facilitate structural rearrangements promoting interactions with HydA and transfer of the 2Fe subcluster.
Once the final H-cluster precursor is formed on HydF, the protein may directly interact with one (or two) HydA monomer(s), where a [4Fe-4S] cluster is suggested to be already present (17), and a 2Fe subcluster is transferred to the latter. Interestingly, a superposition of Cα atoms of HydA to HydF shows that a significant portion of the HydA domain that binds the H-cluster presents the same fold and superimposes quite well to the core of the cluster domain of HydF (supplemental Fig. S10). The similarity with the HydA active site domain is limited to the tertiary structure motif and becomes rather poor in the loops and turns that are involved in defining the cluster-binding pocket, preventing the full modeling of a similar cluster bound to HydF.
As mentioned before, the two pairs of Cys-302 are at a distance close enough to form an interdimer disulfide bridge (Fig. 4) that stabilizes and favors the formation of covalent pairs of dimers. Cysteines 353 and 356 form an additional internal disulfide bridge, stabilizing the conformation of the inactivated form of HydF (Fig. 4 and supplemental Fig. S4A). It is intriguing to speculate as to whether redox reactions involving any of the three cysteines could play a physiological role in cluster biosynthesis/transfer. Close to these conserved cysteines, His-304 and His-352 represent possible candidates that are part of the cluster coordination sphere. The first one is exposed to the protein surface, whereas the second is positioned in a more bulky and shielded region, at less than 4 Å from the disulfide bridge between Cys-353 and Cys-356 (Fig. 4).
If a [4Fe-4S] subcluster can be hypothesized to be bound by the aforementioned residues and one of the conserved cysteines bridges to a subcluster, a coordination sphere similar to the H-cluster could be established. We are tempted to speculate that some HydF residues in close proximity to the conserved cysteines can accommodate the 2Fe subcluster after important rearrangements, in a manner similar to what is suggested in the mature (4) and apo-HydA (17) structures.
To further investigate the role of Cys-302 in FeS cluster binding and, eventually, in tetramer formation, we expressed and purified a C302S mutant of HydF from T. neapolitana. The resulting protein was still able to bind a FeS cluster, but the retention capacity of both mutated tetramers and dimers was weaker when compared with wild type protein. Indeed, once purified by gel filtration chromatography, the oligomeric forms produce a much weaker signal in the 320-550-nm range, even at very high protein concentration (supplemental Fig. S7). Moreover, the tetramer-to-dimer ratio was significantly reduced (dimer/tetramer weight ratio of ~5:1 versus ~2:1 for the wild type), confirming that the introduced mutation impairs, without totally abolishing, formation of the tetramer. The observed small fraction of tetrameric HydF may be mediated by other residues present in the interaction surface, as described previously, and corresponds to an inactive tetramer, deprived of the covalent disulfide bridge.
Although many mechanistic details remain to be defined, the first structural insights regarding the nature of HydF provide a valuable foundation for further examination of HydF structure/function. Convincing details have been provided regarding the dimerization interface and the spatial organization of the three distinct HydF domains. Despite the lack of conclusive evidence, we are tempted to speculate that a dimeric form of HydF interacts with HydE/G in a GTP-bound form. After interaction with HydE/G, GTP is hydrolyzed/dissociated, allowing conformational changes and interactions with HydA. A labile cluster could then be transferred to HydA, facilitated by the conserved His residues and perhaps HydF disulfide bond formation. Because our structure does not correspond to an intermediate in cluster assembly or transfer, it is plausible that the structure presented represents the most thermodynamically stable form of HydF after cluster transfer and that the loss of GTP, tetramer assembly, and disulfide bond formation are all involved in the transfer mechanism. A detailed understanding of the mechanism of [FeFe]-hydrogenase H-cluster maturation and subcluster transfer requires further investigation, but the crystal structure furnishes several insights for future experiments.