3-Keto-5-aminohexanoate Cleavage Enzyme

The exponential increase in genome sequencing output has led to the accumulation of thousands of predicted genes lacking a proper functional annotation. Among this mass of hypothetical proteins, enzymes catalyzing new reactions or using novel ways to catalyze already known reactions might still wait to be identified. Here, we provide a structural and biochemical characterization of the 3-keto-5-aminohexanoate cleavage enzyme (Kce), an enzymatic activity long known as being involved in the anaerobic fermentation of lysine but whose catalytic mechanism has remained elusive so far. Although the enzyme shows the ubiquitous triose phosphate isomerase (TIM) barrel fold and a Zn2+ cation reminiscent of metal-dependent class II aldolases, our results based on a combination of x-ray snapshots and molecular modeling point to an unprecedented mechanism that proceeds through deprotonation of the 3-keto-5-aminohexanoate substrate, nucleophilic addition onto an incoming acetyl-CoA, intramolecular transfer of the CoA moiety, and final retro-Claisen reaction leading to acetoacetate and 3-aminobutyryl-CoA. This model also accounts for earlier observations showing the origin of carbon atoms in the products, as well as the absence of detection of any covalent acyl-enzyme intermediate. Kce is the first representative of a large family of prokaryotic hypothetical proteins, currently annotated as the “domain of unknown function” DUF849.

One of the most striking outputs of genome sequencing is the high proportion of proteins without pertinent functional annotation. Predicted genes have been mainly annotated on the basis of sequence homology to already characterized proteins from other genomes, resulting, on average, in a fraction as high as 30-50% of predicted gene products that, in a given genome, have no accurate functional assignments. These hundreds of proteins without known function within a single organism prevent a deeper understanding of the cell as a biological system. Moreover, some of these proteins with as yet undiscovered functions may reveal significant novelty in chemical reactions.
The increasing numbers of proteins of unknown function from newly sequenced genomes have been grouped into families and classified within the Pfam database as "domain of unknown function" (DUF) (1). Besides grouping protein families by sequence similarities (2, 3) and gene context analyses (4), structural genomics has emerged as a promising tool for functional annotation, given the assumption that proteins with a similar fold can be related in function (5). To narrow down the possible functions of each family and to achieve a better perception of the protein universe, the NIH Protein Structure Initiative (PSI; www.nigms.nih.gov/Research/FeaturedPrograms/PSI) has made a systematic effort to produce representative structures of most DUF families (6). Our attention was drawn by DUF849, a family restricted to prokaryotes and found in ~850 bacterial proteins belonging to several different phyla where, according to gene predictions, it accounts in most (albeit not all) cases for the whole protein chain. Despite the fact that a few crystallographic structures of proteins associated with DUF849 have been recently deposited in the Protein Data Bank (PDB), no mechanistic or biochemical investigation is at present available. In a previous work, we identified a gene associated with DUF849 as coding for the 3-keto-5-aminohexanoate cleavage enzyme (Kce) involved in the lysine fermentation pathway (7), an intriguing enzyme whose catalytic mechanism has so far remained enigmatic. The reaction catalyzed by Kce involves the reversible condensation of a six-carbon molecule, 3-keto-5-aminohexanoate (KAH), derived from L-lysine through three different reactions, with acetyl-CoA to produce two four-carbon species, one of which is still esterified to CoA, i.e. acetoacetate and 3-aminobutyryl-CoA (8-10) (see Fig. 1A).
Such β-keto acid-degrading activity proceeding through cleavage at the β-carbon still has no analogues in the literature and therefore makes the characterization of this enzyme of particular interest.
Here, we investigated the catalytic mechanism of Kce from Candidatus Cloacamonas acidaminovorans by structural and biochemical means. We determined the crystal structures of the enzyme in its native state, as well as complexed with either one of its natural substrates ((S)-KAH) or one of the products (acetoacetate). Additionally, we modeled the assumed intermediates of the reaction into the binding pocket, resulting in poses that agree with the proposed catalytic mechanism. Taken together, our results suggest that the enzyme, although structurally similar to class II aldolases, proceeds through a Claisen-like condensation with the peculiarity that the CoA moiety is transferred in an intramolecular fashion to the keto group of the KAH moiety.

EXPERIMENTAL PROCEDURES
Experimental procedures for DNA cloning, protein production, purification and crystallization, biochemical assays, and site-directed mutagenesis are described in the supplemental material. Oligonucleotide sequences are provided in supplemental Table S1.
Data Collection, Structure Solution, and Refinement-X-ray diffraction data were collected either at Synchrotron Soleil (Saint-Aubin, France) or at the European Synchrotron Radiation Facility (ESRF) (Grenoble, France) from single crystals at 100 K. Diffraction frames were processed with XDS (11) followed by data reduction with either XSCALE (11) or SCALA within the CCP4 suite of programs (12) (see Table 1 and supplemental Table S1). The structure was solved by multiwavelength anomalous diffraction (MAD) phasing from Se-Met-labeled protein crystals. A total of 27 selenium sites in the asymmetric unit were located with the program SHELXD (13) on data from a redundant single anomalous dispersion dataset from a Se-Met-labeled crystal (supplemental Table S2); three-wavelength MAD phasing and density modification were then carried out on data collected from an isomorphous crystal with SOLVE/RESOLVE (14). The resulting electron density map showed clear secondary structure allowing the manual tracing of the polypeptide chain with Coot (15). This preliminary model was partially refined with Refmac5 (16) and then used to perform molecular replacement on higher resolution data from a different crystal form (space group P212121). All subsequent complexes with ligands were solved by molecular replacement with the program Molrep (17) using the structure of the Kce monomer, without ligands, as the search model. Each model was inspected and rebuilt with Coot and refined with auto-BUSTER (18). All models were validated through the Molprobity server (19). Final refinement parameters are shown in Table 1. Figures were generated and rendered with the PyMOL Molecular Graphics System, Version 1.3 (Schrödinger, LLC); electrostatic surfaces were rendered with the APBS plug-in within PyMOL, supplying atomic charges and radii calculated through the Poisson-Boltzmann pdb2pqr server (20) applying the CHARMM force field.
(21) and (Rs,R)-(−)-methyl N-(p-toluenesulfinyl)-3-amino-3-methylpropanoate, respectively. Detailed experimental procedures and compound characterization data, including 1H and 13C NMR data, are available in the supplemental material.

RESULTS AND DISCUSSION
Overall Structure-The structure of Kce was solved by multiwavelength anomalous diffraction on selenomethionyl-labeled protein crystals diffracting at 3.2 Å resolution (supplemental Table S2). The non-isomorphous structures of the native enzyme at higher resolution (1.6 Å), as well as the complexes with substrates and products, were solved by molecular replacement; a total of four separate models were refined at resolutions ranging from 1.75 to 1.28 Å (Table 1). In all the crystal structures determined, the enzyme is assembled as a homotetramer with point symmetry 222 (Fig. 1B), consistent with the observed tetrameric state of the recombinant enzyme in solution (7) and with earlier observations on purified bacterial homologs from other species (8-10). Each subunit, made of 276 residues, shows the canonical (β/α)8 TIM barrel fold, with a short additional C-terminal α-helix (α9) protruding at the N-terminal face of the barrel and three β-turn extensions coming out from the barrel core, inserted in between β2 and α2 (residues 49-58), β4 and α4 (residues 112-119), and β8 and α8 (residues 234-240) (Fig. 1C). The only missing part in the model is the loop connecting β3 to α3 (residues 85-92), due to the lack of supporting electron density in two out of the four monomers, as a consequence of its high conformational flexibility (Fig. 1B). Each protomer buries a total of ~2100 Å2 (18%) of the monomer surface in intersubunit contacts, which mainly involve helices α5 and α6, as well as the β-turn β4a-β4b, for one dyad, and α7 and α8 for the other dyad. As observed in the vast majority of TIM barrel enzymes, the active site is situated in each monomer at the C-terminal face of the central β-barrel and could be identified by the presence of a metal ion coordinated by His-46, His-48, and Glu-230 (Fig. 1D).
This ion was modeled as Zn2+ according to its coordination geometry, and its nature was then confirmed by the analysis of double difference anomalous maps calculated from data collected around the zinc K-edge (supplemental Fig. S1A and supplemental Table S3). The metal ion is situated at the bottom of a crevice accessible from two opposite openings, one of which is delimited by the mobile β3-α3 loop (Fig. 1D). A structural similarity search showed significant matches (Cα r.m.s. deviation ~1.2 Å) to seven structures of proteins of unknown function, all composed of the Pfam database DUF849 domain (PDB entries 3NO5, 3CHV, 3FA5, 3LOT, 3E02, 3E49, and 3C6C) and showing a conserved divalent metal ion and its coordinating environment. However, no complexes with substrates or products, nor biochemical data about these enzymes, were available, and no mechanistic clues on catalysis could thus be inferred from a simple structural comparison.
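The Cα r.m.s. deviations quoted in this section can be illustrated with a minimal sketch. It assumes that the two sets of Cα positions have already been superposed and paired residue by residue; the function name `ca_rmsd` and the coordinates are hypothetical and are not taken from any of the PDB entries cited above.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of
    (x, y, z) positions that are assumed to be already superposed.
    The result is in the same unit as the input (e.g. angstroms)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates, for illustration only.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
shifted = [(x + 1.0, y, z) for x, y, z in reference]  # rigid 1 Å shift along x

print(ca_rmsd(reference, shifted))
```

A real comparison would first compute an optimal superposition (for example with a least-squares fitting routine) before evaluating this quantity; that step is deliberately left out of the sketch.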

Structure and Catalytic Mechanism of Kce
Complexes with 3-Keto-5-aminohexanoate and Acetoacetate-To get insights into the catalytic mechanism, we crystallized Kce in complex with either the substrate (S)-3-keto-5-aminohexanoate ((S)-KAH) or the product acetoacetate in two different space groups (Table 1). Both molecules were found to bind at full occupancy and could be modeled unambiguously in the active site (Fig. 2). The binding of KAH did not promote significant overall structural changes (Cα r.m.s. deviation ~0.9 Å) as the main chain displacements are essentially limited to the β3-α3 loop and the α3 helix (residues 82-100). In particular, the β3-α3 loop, floppy in the native enzyme, becomes well structured upon KAH binding, closing one of the two entrances to the active site and leaving only one access to the Zn2+ crevice for the second substrate acetyl-CoA (Fig. 1D). Further, the side chains of most residues surrounding the active site pocket adopt different rotamers with respect to the unliganded enzyme, reorienting toward the bound KAH. The β-keto acid binds as a bidentate chelator of Zn2+, with a carboxylate oxygen as well as the keto group coordinating the divalent ion at distances of 2.0 and 2.2 Å, respectively (Fig. 2A). The binding of the substrate is further stabilized by the hydrogen bonds of the carboxylate of KAH with the hydroxyl groups of Ser-82 and Thr-106 and the amide group of Asn-108, therefore preventing the decarboxylative formation of enolate as observed in decarboxylating condensing enzymes of fatty acid and polyketide synthesis (25). The δ-amino group of KAH interacts with the main chain carbonyl of Gly-85 and the carboxyl of Glu-14 (Fig. 2A). These interactions play a key role in the proper positioning of the S-ligand, suggesting an S-stereopreference of the enzyme.
Indeed, the kinetic data show a moderate stereospecificity, the relative activity with (R)-KAH being 40% of the activity obtained with (S)-KAH (initial rates obtained with 600 μM KAH at a saturating concentration of acetyl-CoA). In addition to Gly-85, the β3-α3 loop also contributes to the binding of the substrate through Val-87, whose side chain makes hydrophobic interactions with the terminal methyl group of KAH. The side chains of Val-87, Phe-114, and Phe-119 define a small hydrophobic cleft in which the methyl group of KAH is inserted (Fig. 2A). A well ordered water molecule could be placed approximately halfway between the NH2 of the fully conserved Arg-226 and the carbon 2 of KAH, the predicted site of C-C cleavage in the course of the reaction, from which it is positioned at an average distance of 2.9 Å, suggesting a possible role in catalysis (see below).
The structure of Kce in complex with the reaction product acetoacetate, although obtained from crystals belonging to a different space group, shows a virtually identical enzyme fold (Cα r.m.s. deviation = 0.3 Å) when compared with the complex with KAH. Acetoacetate also binds to the Zn2+ ion as a bidentate ligand, with the carboxylate group superimposable to that of KAH, making the same interactions with Ser-82, Thr-106, and Asn-108 (Fig. 2B). However, the orientation of the molecule is approximately perpendicular to the six-carbon chain of KAH because the ketonyl oxygen of acetoacetate coordinates the metal from the Arg-226 side, with which it interacts firmly (Fig. 2B). This oxygen atom actually displaces the water molecule observed, in both the native enzyme and the KAH complex, to make a hydrogen bond with the guanidinium group of Arg-226 (see above), whereas the methyl group of acetoacetate makes van der Waals contacts with Val-172. Another water molecule now completes the Zn2+ coordination sphere in the position otherwise occupied by the β-keto oxygen of KAH.
Reaction Mechanism and Structural Models-From the two snapshot structures, (S)-KAH-Kce and acetoacetate-Kce, and considering the primary results on the origin of carbon atoms of reaction products through tracer experiments (10), we suggest a potential reaction mechanism, depicted in Fig. 3. We hypothesize that the condensation catalyzed by Kce is a Zn2+-dependent, Claisen-like mechanism (25) that proceeds through the formation of four intermediates: an enolate derived from (S)-KAH (here named I1), a first tetrahedral high-energy intermediate (I2), a second tetrahedral high-energy intermediate (I3), and eventually an enolate form of acetoacetate (I4). Computational modeling of the predicted reaction intermediates shows compatible positioning in the active site for all of them (Fig. 4), in agreement with our hypothesis. Indeed, it has been reported that high-energy intermediates are more likely to adopt catalytically competent poses in molecular docking than ground-state intermediates (26, 27). Moreover, this kind of Claisen-like mechanism is fully compatible with the catalyzed reaction being reversible, as reported earlier from other Kce homologs (8-10).
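As a simple bookkeeping check on the overall reaction (a six-carbon keto acid plus acetyl-CoA giving two four-carbon species), the following sketch verifies that atoms are conserved between substrates and products. The molecular formulas are those of the neutral species and are an assumption of this example, as they are not stated in the text; the helper `total_atoms` is hypothetical.

```python
from collections import Counter

# Neutral-form molecular formulas (assumed for this illustration).
KAH = Counter(C=6, H=11, N=1, O=3)                       # 3-keto-5-aminohexanoic acid
ACETYL_COA = Counter(C=23, H=38, N=7, O=17, P=3, S=1)    # acetyl-CoA
ACETOACETATE = Counter(C=4, H=6, O=3)                    # acetoacetic acid
AMINOBUTYRYL_COA = Counter(C=25, H=43, N=8, O=17, P=3, S=1)  # 3-aminobutyryl-CoA


def total_atoms(*species):
    """Sum the element counts of several molecular formulas."""
    total = Counter()
    for formula in species:
        total.update(formula)  # Counter.update adds counts element-wise
    return total


substrates = total_atoms(KAH, ACETYL_COA)
products = total_atoms(ACETOACETATE, AMINOBUTYRYL_COA)

print("substrates:", dict(substrates))
print("products:  ", dict(products))
print("balanced:  ", substrates == products)
```

The check only confirms overall stoichiometry; it says nothing about which carbon atoms end up in which product, which is what the tracer experiments address.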

FIGURE 2. View of the Kce active site in complex with either KAH (A) or acetoacetate (AAE) (B).
In both views, the σA-weighted difference electron density map (mFo − DFc), calculated before each ligand was added to the model and contoured at the 3σ level, is shown in the form of a green mesh. Neighboring side chains are depicted, as well as the water molecules likely to be involved in the reaction (Fig. 3).
[...] Fig. 3). The CoA pantetheine chain, which is orientated toward the outside of the pocket, has been omitted for clarity.
The crystal structure of the KAH-Kce complex shows a strong salt bridge connecting Asp-231 to Arg-226 (d = 2.9 Å), with the guanidinium group from the latter also being engaged in a hydrogen bond to a nearby water molecule (d = 2.9 Å), which, in turn, is close enough to the C2 of KAH (d = 2.9 Å) to abstract the proton in a series of sequential acid/base exchanges initiated by Asp-231 (28). We propose that the condensation starts by the abstraction of the pro-S proton from C2 of KAH, mediated by a catalytic water molecule activated by the Asp-231-Arg-226 charge relay dyad. The key role of this Asp-Arg dyad in catalysis is strengthened by the observation that its disruption, illustrated by the Arg-226 or Asp-231 to glycine variants, leads to a catalytically inactive enzyme (Table 2). According to our hypothesis, the stability of the resulting negatively charged KAH enolate (I1) is ensured by interactions with the Zn2+ cation and through hydrogen bonds with Thr-106 and Ser-82, as observed in the crystal structure of the Kce-KAH complex and further supported by the docking model (Fig. 4, I1). Steady-state kinetic data are also consistent with this scheme as the replacement of Ser-82 by a glycine significantly increases the Km for KAH and decreases the kcat (Table 2). Although initial attempts of docking acetyl-CoA onto the crystallographic structure of the KAH-Kce complex failed, suggesting that either KAH adopts a different conformation in the presence of acetyl-CoA or the active site undergoes further conformational changes to accommodate both substrates, a perfect match could be obtained with the two reactive entities, i.e. I1 and acetyl-CoA, in the binding pocket (Fig. 4). According to this model, the orientation of the pantetheine chain of acetyl-CoA is parallel to (S)-KAH (supplemental Fig. S2A).
The carbonyl carbon of acetyl-CoA is within van der Waals contact range to both C1 and C2 of KAH (respective distances are 3.4 and 3.1 Å) and is therefore well placed to react with the deprotonated carbon C2 of KAH (Figs. 3 and 4, I1). Nucleophilic addition of enolate I1 to the thioester carbon of the acetyl-CoA would lead to a tetrahedral oxyanion intermediate I2 strongly coordinated to the Zn2+ ion (Fig. 3). As suggested by docking calculations, the KAH-derived part of I2 is perfectly superimposed on the position of the substrate in the KAH-Kce crystallographic structure (Fig. 4, I2). The oxyanion from the acetyl moiety from acetyl-CoA takes exactly the same position as the oxygen atom of the well ordered water molecule and is stabilized by a hydrogen bond with Arg-226, whereas the three oxygen atoms (i.e. the oxyanion from acetyl-CoA and the carboxylic and the β-keto oxygens from KAH) form a tripod-like structure with the peak being the Zn2+ ion (Fig. 3, I2). In this conformation, the sulfur atom of acetyl-CoA makes polar interactions with Tyr-145 and van der Waals contacts with Phe-114, Val-172, and Arg-226, whereas the methyl group of the acetyl moiety is stabilized through van der Waals contacts with Glu-143 and Val-172. The short distance between the sulfur atom of the CoA moiety and the C3 keto atom of the former KAH (3.2 Å in the docking model), as well as the angle between the two reactive centers (~100°, close to the Bürgi-Dunitz trajectory (29)), suggest that the intermediate may collapse through an intramolecular SCoA nucleophilic addition to the C3 β-keto group (Fig. 3). According to molecular docking, the newly obtained tetrahedral oxyanion intermediate I3 is also expected to coordinate the Zn2+ ion in a similar fashion as I2 (Fig. 4). However, the sulfur atom of acetyl-CoA is here less strongly stabilized by Tyr-145 (d = 3.5 Å) and is mainly held by weak van der Waals contact with Phe-114.
This conformation is consistent with an intramolecular retro-Claisen reaction that would release 3-aminobutyryl-CoA as the leaving group and lead to the formation of acetoacetate in the enolate form, bound to the Zn2+ ion by a bidentate interaction mediated by the carboxyl and β-keto groups, as well as by hydrogen bonds to Ser-82 and Thr-106 (Figs. 3 and 4). It is worth noting that the predicted conformation of acetoacetate at the end of the catalytic process coincides with the binding mode observed in the high-resolution crystal structure of the enzyme-product complex (Fig. 2B) and is consistent with the origin of carbon atoms in acetoacetate as determined by tracer experiments with 14C made by Barker and co-workers (8-10). The protonation of the enolic form of acetoacetate, possibly by the action of another water molecule taking part in a proton relay with the Asp-231-Arg-226 dyad (in a reverted process with respect to the deprotonation of KAH), would finally release the acetoacetate (Figs. 3 and 4).
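The attack geometry discussed for the intramolecular SCoA addition (a nucleophile-to-carbon distance of 3.2 Å and an approach angle of about 100°) can be measured from atomic coordinates as sketched below. The function `angle_deg` and the coordinates are hypothetical: the positions are constructed from the quoted distance and angle together with an assumed typical C=O bond length of 1.23 Å, and are not taken from the docking models.

```python
import math


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b as the vertex."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    n1 = math.sqrt(sum(p * p for p in v1))
    n2 = math.sqrt(sum(p * p for p in v2))
    if n1 == 0.0 or n2 == 0.0:
        raise ValueError("vertex coincides with one of the other points")
    cos_theta = sum(p * q for p, q in zip(v1, v2)) / (n1 * n2)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding error
    return math.degrees(math.acos(cos_theta))


# Hypothetical geometry: carbonyl carbon at the origin, carbonyl oxygen along x
# (assumed C=O length 1.23 Å), sulfur placed 3.2 Å away at 100° to the C=O bond.
carbon = (0.0, 0.0, 0.0)
oxygen = (1.23, 0.0, 0.0)
theta = math.radians(100.0)
sulfur = (3.2 * math.cos(theta), 3.2 * math.sin(theta), 0.0)

print(round(math.dist(sulfur, carbon), 2))          # S...C distance in Å
print(round(angle_deg(sulfur, carbon, oxygen), 1))  # S...C=O approach angle
```

The Bürgi-Dunitz angle is commonly quoted as roughly 105 to 107°, so an approach angle near 100° lies close to, but not exactly on, the idealized trajectory.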
In all docking models (I1, I2, I3, and I4), the adenosine moiety from the CoA ester is situated inside or near the shallow pocket flanked by the side chains of Arg-209 and Lys-237 (supplemental Fig. S2). In contrast, it should be noted that the pantetheine chain is modeled differently in the different intermediates. Indeed, CoA can adopt widely varying conformations in different enzymes (30), making its mode of binding difficult to predict (31).
Although the first attempts at modeling acetyl-CoA onto the crystallographic structure of KAH-Kce suggested a role of Glu-143 in stabilizing this substrate in the active site, activity measurements of the Glu-143 to Gly and Gln mutants invalidated this hypothesis as none of these mutations notably altered the Km of the enzyme for acetyl-CoA. Rather, their main effect is to affect the kcat, and to a lesser extent, the Km for KAH (Table 2). Although unexpected, this effect may be explained by the interactions that, in the apo enzyme, Glu-143 makes with Arg-226 (d = 3.0 Å), Asn-108 (d = 3.4 Å), and Glu-141 (d = 2.5 Å) (supplemental Fig. S3A). As crystallographic data revealed that Asn-108 interacts with (S)-KAH (Fig. 2A), Glu-143 variants may alter or suppress the hydrogen bond with the carbamide group of Asn-108, and in turn, slightly lower the stabilization of KAH in the active site, leading to an increased Km. Moreover, Glu-143 is involved in the network of interactions that hold the catalytic water molecule in position, as observed in both the apo Kce structure and the complex with KAH (supplemental Fig. S3), in which it adopts a different rotamer upon substrate binding and structuring of the β3-α3 loop (see above). Substitutions of this residue are thus likely to alter the network that holds the key water molecule, decreasing the overall catalytic efficiency.
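To make the distinction between effects on kcat and on Km concrete, the sketch below evaluates a single-substrate Michaelis-Menten rate law and the catalytic efficiency kcat/Km. This is a simplification assumed for illustration (the second substrate is taken to be saturating), and all numerical values are hypothetical; they are not the values of Table 2.

```python
def mm_rate(kcat, km, enzyme, substrate):
    """Michaelis-Menten initial rate v = kcat * [E] * [S] / (Km + [S]).
    Concentrations must share one unit; the rate is in that unit per time."""
    if km + substrate <= 0:
        raise ValueError("Km + [S] must be positive")
    return kcat * enzyme * substrate / (km + substrate)


def catalytic_efficiency(kcat, km):
    """Specificity constant kcat / Km."""
    return kcat / km


# Hypothetical parameters: (kcat in s^-1, Km in mM), for illustration only.
variants = {
    "wild type": (10.0, 2.0),
    "lower kcat, higher Km": (1.0, 4.0),
}

for name, (kcat, km) in variants.items():
    rate = mm_rate(kcat, km, enzyme=0.001, substrate=0.6)  # [E], [S] in mM
    print(name, rate, catalytic_efficiency(kcat, km))
```

A variant that lowers kcat and raises Km reduces kcat/Km through both terms, which is the sense in which a substitution can decrease overall catalytic efficiency without changing the affinity for the other substrate.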
Concluding Remarks-The wide functional diversity of TIM barrel domains, which at present account for almost 9% of the whole PDB, is well known. The vast majority of these proteins are enzymes, spread ubiquitously in all kingdoms of life and most often involved in metabolic pathways (32). However, despite the already known large chemical variety of reactions catalyzed by this fold, new types of reactions or novel catalytic mechanisms performed by TIM barrel enzymes could still have escaped identification. Indeed, we provide here strong evidence pointing to Kce as the first representative of a new subclass of TIM barrel Zn2+-dependent condensing enzymes. Similarly to canonical class II aldolases, which share with Kce both the presence of the zinc cation and the same (β/α)8 barrel fold, Zn2+ is crucial to chelate both the substrate and the nascent acetoacetate product, as well as to stabilize their enolic forms, but the overall reaction course is mechanistically closer to a carbon-carbon Claisen condensation. Our model is fully consistent with earlier literature reports about the origin of carbon atoms in the products and the fact that no covalent acyl-enzyme intermediate could ever be observed (8-10). Further research should reveal whether or not Kce represents a member of a whole new class of enzymes catalyzing carbon-carbon condensations involving CoA esters and small organic β-keto acids.