Acetyl-CoA synthetase from the amitochondriate eukaryote Giardia lamblia belongs to the newly recognized superfamily of acyl-CoA synthetases (Nucleoside diphosphate-forming).

The gene coding for the acetyl-CoA synthetase (ADP-forming) from the amitochondriate eukaryote Giardia lamblia has been expressed in Escherichia coli. The recombinant enzyme exhibited the same substrate specificity as the native enzyme, utilizing acetyl-CoA and adenine nucleotides as preferred substrates and, less efficiently, propionyl- and succinyl-CoA. N- and C-terminal parts of the G. lamblia acetyl-CoA synthetase sequence were found to be homologous to the alpha- and beta-subunits, respectively, of succinyl-CoA synthetase. Sequence analysis of homologous enzymes from various bacteria, archaea, and the eukaryote Plasmodium falciparum identified conserved features in their organization, which allowed us to delineate a new superfamily of acyl-CoA synthetases (nucleoside diphosphate-forming) and its signature motifs. The representatives of this new superfamily of thiokinases vary in their domain arrangement, some consisting of separate alpha- and beta-subunits and others comprising fusion proteins in alpha-beta or beta-alpha orientation. The presence of homologs of acetyl-CoA synthetase (ADP-forming) in such human pathogens as G. lamblia, Yersinia pestis, Bordetella pertussis, Pseudomonas aeruginosa, Vibrio cholerae, Salmonella typhi, Porphyromonas gingivalis, and the malaria agent P. falciparum suggests that they might be used as potential drug targets.

Enzymes catalyzing this process play central roles in energy metabolism. Based on the reverse reaction, they are usually referred to as acid thiokinases or acyl-CoA synthetases (NDP-forming). The most studied representative of this enzyme group is succinyl-CoA synthetase (SCS; R = COOH-CH2-CH2-), a component of the tricarboxylic acid cycle (1,2). In eukaryotes this enzyme has a mitochondrial/hydrogenosomal location (3). The two eukaryotic isoenzymes, specific for either ATP (EC 6.2.1.5) or GTP (EC 6.2.1.4), are heterodimers composed of an α- and a β-subunit. In vertebrates, the α-subunit of both enzymes is coded by the same gene, whereas the β-subunits derive from separate genes (4). Gram-positive bacteria also harbor a heterodimeric enzyme, whereas the SCSs from Escherichia coli and other Gram-negative bacteria are heterotetramers of two α- and two β-subunits and have broader specificity for the nucleotide substrate, preferring adenine to guanine nucleotides (5).
A somewhat similar enzyme acting on acetyl-CoA instead of succinyl-CoA, acetyl-CoA synthetase (ACS) (ADP-forming; EC 6.2.1.13; R = CH3-), has been detected in certain amitochondriate eukaryotes without metabolic compartmentalization (type I amitochondriates) (6-8) and in several archaea (9). These organisms lack a complete tricarboxylic acid cycle, and substrate level phosphorylation of ADP or GDP by acetyl-CoA synthetase seems to play a role that is comparable with that of succinyl-CoA synthetase in organisms with functional tricarboxylic acid cycles.
Recent biochemical analysis of the eukaryotic (from the diplomonad Giardia lamblia) and archaeal (from the hyperthermophile Pyrococcus furiosus) acetyl-CoA synthetases revealed both similarities and differences between these enzymes. The two ACSs studied from P. furiosus are heterotetramers (α2β2) of broad substrate specificity that are able to utilize ATP, GTP, and several acyl-CoA esters with comparable efficiencies (9,10). In contrast, the G. lamblia ACS is composed of a single polypeptide chain, and the enzyme preferentially uses adenine nucleotides and acetyl-CoA (7,11).
When the sequence of a putative acetyl-CoA synthetase was cloned from G. lamblia (GenBank™ accession number AF107206) (11), it showed no detectable similarity to the major types of previously sequenced enzymes that are capable of acetyl-CoA synthesis, that is, acetyl-CoA synthetase (AMP-forming, EC 6.2.1.1) (12) and phosphotransacetylase (EC 2.3.1.8) (13). Instead it appeared to be similar to the succinyl-CoA synthetase sequences from a variety of sources. To verify that the gene cloned from G. lamblia indeed coded for an ACS, we have expressed it in E. coli and characterized the properties of the resulting recombinant protein. Here we report the substrate specificity of the recombinant G. lamblia ACS and compare it with a number of previously uncharacterized proteins identified in the course of microbial genome sequencing projects. Our results indicate that ACSs and ACS-like proteins form a hitherto unrecognized enzyme superfamily with succinyl-CoA synthetases, malate thiokinase, and ATP citrate lyase. We present here the definition of this superfamily and its signature sequence motifs.

EXPERIMENTAL PROCEDURES
Expression of the ACS Gene-The G. lamblia ACS gene was cloned in the plasmid vector pQE-32 (Qiagen) as described previously (11). E. coli M15(pREP4) strain containing the recombinant plasmid construct was grown at 37°C in LB broth containing 100 μg·ml⁻¹ ampicillin to an OD600 of 0.6 and induced by the addition of 1 mM isopropyl-1-thio-β-D-galactopyranoside for 5 h at 37°C. The cultures were centrifuged at 4000 × g for 15 min, washed with phosphate-buffered saline, and resuspended in 50 mM potassium phosphate, 10 mM Tris-HCl buffer, pH 8.0, containing 5 μg·ml⁻¹ leupeptin and 1 μg·ml⁻¹ lysozyme. After incubation for 30 min on ice the suspension was sonicated by 6 pulses of 10 s and centrifuged at 20,000 × g for 20 min. The supernatant was mixed with 5 volumes of 50% Ni-NTA agarose (Qiagen), pre-equilibrated with 300 mM NaCl in 50 mM potassium phosphate, pH 8.0 (Buffer A), incubated for 60 min at 4°C, and poured into a column. Unbound proteins were removed by washing the resin with 5 volumes of Buffer A and 5 volumes of 5 mM imidazole in Buffer A. The bound protein was eluted with 150 mM imidazole in Buffer A. Fractions containing the ACS activity were pooled, desalted, concentrated using a Centricon concentrator (Amicon, 50-kDa cutoff), and stored at −20°C.
Enzyme Assays-The ACS activity was measured in the direction of ATP synthesis (forward reaction) by following the ADP-dependent release of coenzyme A from acyl-CoA using the thiol reagent 5,5′-dithiobis(2-nitrobenzoic acid) (DTNB). The standard reaction mixture contained 1 mM MgCl2, 2 mM ADP, 40 mM potassium phosphate, 0.1 mM DTNB in 50 mM Tris-HCl, pH 7.5, and 0.05 mM of an acyl-CoA ester. The substrates tested were acetyl-, N-propionyl-, N-butyryl-, isobutyryl-, isovaleryl-, succinyl-, malonyl-, or glutaryl-CoA. The increase in absorbance at 412 nm was followed for 10 min at 30°C (ε412 = 13,600 M⁻¹·cm⁻¹). For the reverse reaction, in the thiokinase direction, the activity was determined by one of the following: 1) the formation of acylhydroxamate, measured at 505 nm, in an assay mixture containing 5 mM ATP, 0.5 mM CoA, 10 mM MgCl2, and 5 mM sodium acetate in 50 mM Tris-HCl, pH 7.5, to which 1 volume of 10% FeCl2 in 0.7 N HCl was added after incubation at 30°C; 2) the formation of ADP, by coupling the reaction with pyruvate kinase and lactate dehydrogenase and monitoring the oxidation of NADH; or 3) titration of the unreacted CoA with DTNB as described above.
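The conversion from an absorbance slope at 412 nm to a reaction rate follows from the Beer-Lambert law and the extinction coefficient quoted above. The sketch below shows that arithmetic only. The 1-cm light path, the 1-ml assay volume, and the function names are assumptions made for illustration and are not stated in the text.

```python
# Extinction coefficient of the DTNB product at 412 nm, as quoted in the text.
EPSILON_412 = 13_600.0  # M^-1 cm^-1


def coa_release_rate(delta_a412_per_min, path_cm=1.0, volume_ml=1.0):
    """Return CoA released (nmol per min) from the slope of A412.

    path_cm and volume_ml are assumed defaults, not values from the paper.
    """
    # Beer-Lambert: concentration change (mol/L per min) = dA / (epsilon * l)
    molar_per_min = delta_a412_per_min / (EPSILON_412 * path_cm)
    # mol/L -> mol in the assay volume (ml -> L), then mol -> nmol
    return molar_per_min * volume_ml * 1e-3 * 1e9


def specific_activity(delta_a412_per_min, protein_mg, path_cm=1.0, volume_ml=1.0):
    """Return specific activity in micromol CoA per min per mg protein."""
    nmol_per_min = coa_release_rate(delta_a412_per_min, path_cm, volume_ml)
    return nmol_per_min / 1000.0 / protein_mg


if __name__ == "__main__":
    # Hypothetical reading: A412 rises by 0.136 per minute with 2 micrograms of protein.
    print(coa_release_rate(0.136))
    print(specific_activity(0.136, protein_mg=0.002))
```

With a different cuvette or assay volume, the two default arguments would need to be changed accordingly.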
Sequence Similarity and Phylogenetic Analysis-Sequence similarity searches against the non-redundant protein data base maintained at the NCBI, National Institutes of Health, Bethesda, MD, were performed using PSI-BLAST (14) with the G. lamblia ACS sequence as a query. Based on the initial results, each identified domain was used as a separate query for a PSI-BLAST search run to convergence. The data base hits identified this way were used as queries for subsequent searches. Gapped BLAST (14) searches of the unfinished microbial genome sequences, generated in the course of genome projects at the Sanger Center, The Institute for Genome Research, Utah Genome Center, and the Pseudomonas sequencing project, were performed through the NCBI World Wide Web site. The sizes of the proteins in unfinished genome sequences were estimated using ORFinder at the NCBI World Wide Web site. The multiple sequence alignment was constructed using ClustalW (15) with subsequent manual refinement based on PSI-BLAST outputs (14).
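The search strategy described above, in which each hit is reused as a query until no new sequences turn up, amounts to computing a transitive closure over search results. The following is a minimal sketch of that control flow only. It does not call PSI-BLAST; the `search` callback, the identifiers, and the toy hit table are all hypothetical stand-ins.

```python
from collections import deque


def exhaustive_search(seed, search):
    """Collect every sequence reachable from `seed` by repeated searching.

    `search(query)` is any callable returning an iterable of hit identifiers;
    in practice it would wrap a similarity search, here it is a placeholder.
    """
    found = {seed}
    queue = deque([seed])
    while queue:
        query = queue.popleft()
        for hit in search(query):
            if hit not in found:      # only new hits become new queries
                found.add(hit)
                queue.append(hit)
    return found


# Hypothetical hit lists, invented purely to exercise the loop.
TOY_HITS = {
    "query": ["hitA", "hitB"],
    "hitA": ["query", "hitB", "hitC"],
    "hitB": ["query"],
    "hitC": ["hitA"],
    "unrelated": [],
}


def toy_search(query):
    """Look up the invented hit list for a query."""
    return TOY_HITS.get(query, [])


family = exhaustive_search("query", toy_search)

if __name__ == "__main__":
    print(sorted(family))
```

Here "hitC" is reached only through "hitA", which is the situation the iterative strategy is designed to catch.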
Phylogenetic relationships were analyzed using the maximum likelihood method (16) based on ClustalW and PSI-BLAST alignments performed with the PROTML program, version 2.3 (17). The maximum likelihood tree was obtained by local rearrangement of a neighbor-joining tree using the Jones, Taylor, and Thornton model of amino acid substitutions (18). Bootstrap support was calculated with the resampling estimated log-likelihood method (17).

Properties of Recombinant ACS-The sequence of the putative ACS cloned from G. lamblia (Ref. 11, GenBank™ accession number AF107206) showed no detectable similarity to either of the previously sequenced enzymes capable of acetyl-CoA synthesis, that is, acetyl-CoA synthetase (AMP-forming, EC 6.2.1.1) (12) and phosphotransacetylase (EC 2.3.1.8) (13). Instead it appeared to be similar to the SCS sequences from a variety of sources (Fig. 1). To verify that the gene cloned from G. lamblia coded for an ACS, we have expressed it in E. coli and characterized the properties of the resulting recombinant protein.
Indeed, the purified recombinant G. lamblia ACS expressed in E. coli was found to catalyze the forward reaction, the formation of acetate, ATP, and CoA, as measured by the release of CoA (Table I). Under standard assay conditions linear double reciprocal plots were obtained for acetyl-CoA, ADP, and orthophosphate, with apparent Km values of 0.06 mM, 0.20 mM, and 1.35 mM, respectively, similar to those obtained for the native enzyme isolated from G. lamblia (7). Purified recombinant G. lamblia ACS did not support CoA release when several C4-C5 mono- and dicarboxylic CoA esters were used as substrates (Table I). Under standard assay conditions the specific activity of the reverse reaction, acetyl-CoA formation, was about 4% of the rate in the forward direction. Thus the product of the cloned ACS gene showed essentially the same catalytic properties as the native enzyme purified from G. lamblia (Table I), which proved that this gene indeed coded for the G. lamblia ACS.
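For readers unfamiliar with double reciprocal (Lineweaver-Burk) plots, the sketch below shows how an apparent Km is read from a straight-line fit of 1/v against 1/[S]. The substrate concentrations and rates are synthetic, generated from the Michaelis-Menten equation using the acetyl-CoA Km reported above and an arbitrary Vmax of 1; they are not measurements from this study, and the function names are illustrative.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lineweaver_burk(substrate_mm, rates):
    """Estimate (Km, Vmax) from 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax."""
    inv_s = [1.0 / s for s in substrate_mm]
    inv_v = [1.0 / v for v in rates]
    slope, intercept = linear_fit(inv_s, inv_v)
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


# Synthetic, noise-free data (assumed values, for illustration only).
TRUE_KM_MM, TRUE_VMAX = 0.06, 1.0
concs_mm = [0.01, 0.02, 0.05, 0.1, 0.2]
rates = [TRUE_VMAX * s / (TRUE_KM_MM + s) for s in concs_mm]

km_est, vmax_est = lineweaver_burk(concs_mm, rates)

if __name__ == "__main__":
    print(f"apparent Km = {km_est:.3f} mM, Vmax = {vmax_est:.3f} (arbitrary units)")
```

With real data, the reciprocal transform weights low-substrate points heavily, so a direct nonlinear fit is often preferred; the linear form is shown here because it is the method named in the text.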
Acyl-CoA Synthetase Superfamily-The apparent sequence similarities between the Giardia ACS and SCSs prompted us to investigate possible relatedness of these enzymes. Data base searches using Giardia ACS as a query revealed its highly statistically significant similarity to the family containing succinyl-CoA synthetases, ATP citrate lyase, and malate thiokinase (19-21) (Fig. 1). These searches also revealed homology of the Giardia ACS to a number of previously uncharacterized proteins identified in the course of genome sequencing of the archaea Methanococcus jannaschii (MJ0590) and Archaeoglobus fulgidus (AF1211 and AF1511) and of the bacteria E. coli (YfiQ) and Streptomyces coelicolor (SC9B10.09; see Fig. 1). The N- and C-terminal parts of Giardia ACS turned out to be homologous to α- and β-subunits of SCS, respectively, and to a number of similar shorter proteins from a variety of organisms (Fig. 1). In addition, proteins consisting of similar α- and β-subunits fused in the opposite order (β-α) were identified in A. fulgidus (AF0932, AF1192, and AF1938) and S. coelicolor (SC8A6.03c) (Fig. 2). A search of the unfinished genome sequences recognized α- and β-subunits of the previously studied ACS isoenzymes of P. furiosus (9,10,22) and demonstrated the presence of fused ACS-related enzymes in Yersinia pestis, Shewanella putrefaciens, Bordetella pertussis, Pseudomonas aeruginosa, Clostridium difficile, and Porphyromonas gingivalis (Fig. 1). Partial sequences with significant similarity to the Giardia ACS were also found in Vibrio cholerae and Salmonella typhi (data not shown). Remarkably, the complete genome of Pyrococcus horikoshii contained five paralogous genes for the α-subunit of the ACSs but only two β-subunit-encoding genes (Fig. 1). The same number of ACS genes was found in the genomes of the closely related P. furiosus and Pyrococcus abyssi (not shown).
Finally, homologs of Giardia ACS were encoded on both the thirteenth and fourteenth chromosomes of the malaria pathogen Plasmodium falciparum (Fig. 1). An additional member of this new enzyme family was the recently characterized pimeloyl-CoA synthetase from Pseudomonas mendocina that catalyzes the conversion of pimelic acid into pimeloyl-CoA, a precursor for biotin biosynthesis (23). Thus, of all the proteins identified in these sequence data base searches, every one with experimentally determined activity turned out to be a thiokinase (acyl-CoA synthetase, GDP- or ADP-forming; see Table I).
We therefore refer to this superfamily of enzymes as acyl-CoA synthetases (NDP-forming), even though the exact activity and substrate specificity of each of these enzymes remains to be determined.
Sequence Conservation among Acyl-CoA Synthetases-Most of the amino acid residues that are known to be important for catalysis in the SCS (24-26) are conserved in the ACSs as shown by multiple alignments of ACSs with the α- and β-subunits of SCSs (Fig. 1). Thus, His246α of the α-subunit of the E. coli SCS, which is known to be phosphorylated in the course of the SCS-catalyzed reaction (24,26), is absolutely conserved in every protein of the ACS family (Fig. 1A). Of the two residues that likely interact with phosphorylated and dephosphorylated forms of this His residue (26), Glu208α is absolutely conserved in ACSs, whereas Glu197β is substituted by Asp in all ACSs (Fig. 1). Several important residues of the phosphohistidine loop that interact with other residues of the protein (Gly235α, Thr237α, and Gly248α) are also well conserved, whereas the residues interacting with them (Arg152α, Gly256α, Arg116β, and Asp274β) display varying degrees of substitution. Arg152α, for example, is changed into Gln in all ACSs, whereas Gly256α does not appear to be conserved at all (Fig. 1).
The amino acids forming the coenzyme A-binding site on the α-subunit of E. coli SCS such as Pro40α, Lys42α, Val72α, Pro73α, Ile95α, and Cys123α are generally conserved in the ACS sequences. Most substitutions are by closely related residues, e.g. Lys42α of SCSs is replaced by Asn in most ACSs, whereas various hydrophobic residues (Val, Leu, Phe, and Tyr) substitute for Ile95α (Fig. 1A). However, the substitution level is fairly high particularly for the residues that bind CoA with the carbonyl or amide groups of the protein backbone, e.g. Gln19α and Glu97α. The CoA-interacting residues of the β-subunit of E. coli SCS show even lower levels of conservation. Thus, Glu33β, Ser36β, and Lys66β are all substituted in the enzymes of the ACS family; the same is the case, however, in several SCSs (Fig. 1B).
FIG. 1 (legend, partially recovered). The numbers indicate the positions of the first and the last residues in the alignment and the sizes of the gaps between aligned segments. Conserved residues are shown in bold; the ones conserved throughout the superfamily are colored blue, and conserved Pro and small residues (G, A, or S) are shown in green. The residues implicated in catalytic activity of succinyl-CoA synthetase (24,26) are indicated by white letters on blue backgrounds and by their position numbers; the residues conserved among various ATP-grasp fold enzymes (25,26) [...]
Signature Motifs for the Acyl-CoA Synthetases-Despite the high degree of sequence similarity between ACSs and SCSs (Fig. 1) [...]
Domain Organization of the Acyl-CoA Synthetases-Sequence comparison shows that the N- and C-terminal regions of G. lamblia ACS are homologous to the α- and β-subunits of E. coli SCS (Fig. 1). A three-dimensional structure of the E. coli SCS reveals the presence of two domains (1 and 2) in the α-subunit and three domains (3-5) in the β-subunit (26). The structure of the members of the acyl-CS superfamily detected in this study can be best described with reference to this domain and subunit structure. The comparison of the G. lamblia ACS and the E. coli SCS shows that whereas the order of the domains of the SCS α-subunit is preserved in the N-terminal region of the ACS, the region corresponding to domain 5, the last domain of the β-subunit of SCS, is located in the central part of the ACS molecule. In essence, when compared with SCS the G. lamblia ACS can be regarded as an α-β fusion protein in which domains of the β-subunit were swapped (domain order 1-2-5-3-4).
Considering all homologs recognized, there exists an extremely complex picture with five different patterns of domain order and fusion (Fig. 2). In the group closely related to G. lamblia ACS (proven or putative ACS enzymes), fusion proteins with identical domain order (1-2-5-3-4) are found in M. [...] Fig. 1), and two γ-proteobacteria, E. coli (YfiQ) and Y. pestis. Pimeloyl-CoA synthetase of P. mendocina is also a fusion protein with the same domain structure. All homologs in Pyrococcus sp. consist of separate α- and β-subunits; however, the domain structure of these subunits differs from that in SCS. The α-subunit represents a combination of the SCS α-subunit with domain 5 of the SCS β-subunit (domain order α, 1-2-5 and β, 3-4). Three proteins in A. fulgidus (AF0932, AF1192, and AF1938) and one in S. coelicolor (SC8A6.03c) also represent fusion proteins in which the SCS α-subunit homolog is intercalated between the N-terminal part (domains 3 and 4) and the C-terminal part (domain 5) of the SCS β-subunit homolog (domain order 3-4-1-2-5). Whereas all SCSs and malyl-CoA synthetase exhibit the subunit and domain structure of the prototype E. coli SCS (α, 1-2 and β, 3-4-5), ATP citrate lyase comprises a fused protein with yet another structure (domain order 3-4-5-1-2), in essence a β-α fusion of the SCS subunits. To complicate the matter even further, several of these enzymes contain additional acyltransferase domains in their N-terminal (AF1511 and SC9B10) or C-terminal (E. coli YfiQ, enzymes from Y. pestis, S. putrefaciens, and some other γ-proteobacteria) regions (Fig. 2).
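The five arrangements listed in this paragraph can be written down as a small data structure, which makes it easy to check which domain neighbors survive every permutation. The sketch below encodes only the domain orders quoted in the text, with each polypeptide chain as a tuple; the labels and helper names are illustrative choices.

```python
# Each pattern is a list of polypeptide chains; each chain is a tuple of
# SCS domain numbers in N- to C-terminal order, as given in the text.
DOMAIN_PATTERNS = {
    "SCS prototype (E. coli)": [(1, 2), (3, 4, 5)],
    "ACS fusion (G. lamblia type)": [(1, 2, 5, 3, 4)],
    "Pyrococcus ACS subunits": [(1, 2, 5), (3, 4)],
    "fusion with intercalated alpha (AF0932 type)": [(3, 4, 1, 2, 5)],
    "ATP citrate lyase": [(3, 4, 5, 1, 2)],
}


def has_all_domains(chains):
    """True if domains 1-5 each occur exactly once across the chains."""
    return sorted(d for chain in chains for d in chain) == [1, 2, 3, 4, 5]


def domains_adjacent(chains, first, second):
    """True if `first` is immediately followed by `second` on one chain."""
    for chain in chains:
        for a, b in zip(chain, chain[1:]):
            if (a, b) == (first, second):
                return True
    return False


if __name__ == "__main__":
    for name, chains in DOMAIN_PATTERNS.items():
        print(name, "| 1-2 adjacent:", domains_adjacent(chains, 1, 2),
              "| 4-5 adjacent:", domains_adjacent(chains, 4, 5))
```

In this encoding the 1-2 pair is contiguous in all five patterns, whereas the 4-5 pair is kept only in the SCS prototype and ATP citrate lyase.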
Hinge Regions between the Domains-The presence of both α-β and β-α fused proteins (Fig. 2) in a single genome (A. fulgidus and S. coelicolor) suggests that positioning both subunits of an acyl-CoA synthetase on a single polypeptide chain should have its evolutionary advantages, e.g. ensuring that these subunits are transcribed and work in tandem. However, to assume the same three-dimensional organization as SCS, the corresponding domains of G. lamblia ACS have to be properly oriented and linked by a hinge region that is sufficiently long to allow that arrangement. Sequence comparison between ACSs and SCSs does not give any clues as to how this proper orientation is achieved, because, with the sole exception of the Thr237α-Asp274β pair, the residues that participate in dimer formation between α- and β-subunits in E. coli SCS (Asp103α-Arg225β, Tyr158α-Phe319β, Glu159α-Arg348β, Lys242α-Glu242β, and Leu276α-Leu374β (23)) are not conserved in G. lamblia ACS and related enzymes from M. jannaschii and A. fulgidus (Fig. 1). This could be because each of these enzymes is composed of a single polypeptide chain, which might reduce the need for exact recognition between the two subunits. Remarkably, these residues are not conserved even in the members of the acyl-CS group that, like SCS, are composed of two subunits. It seems likely that those additional amino acid residues that participate in inter-subunit interactions in these acyl-CSs are less constrained in their substitution rate than the active site residues. On the other hand, comparison of G. lamblia ACS and SCSs readily identifies in the G. lamblia ACS sequence a Pro-rich region (Pro442-Pro449), followed by a Lys/Arg-rich stretch (Lys464-Lys486), that is missing in the SCSs. These regions can be expected to form a turn and a connecting rod, respectively, for the proper positioning of the two domains of the ACS.
Indeed, assuming that the overall ACS structure is substantially similar to that of SCS, and that the homologous domains shown in Fig. 1 occupy the same positions, this region (indicated by an arrow in Fig. 2) should be sufficiently long to cover the distance of ~40 Å that separates Met1β from Lys388β in the SCS structure. Notably, whereas this predicted rod in G. lamblia ACS is rich in Lys and Arg and has an overall charge of +7, the corresponding region in the homologous protein MJ0590 from M. jannaschii is rich in Glu and Asp and has an overall charge of −4 (see Fig. 1C). This might indicate that amino acid residues in these regions could have been selected for the charge density and hence the rod shape, rather than for a particular sequence pattern.
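The overall charge of such a segment is a simple residue count, and its possible span can be bracketed with textbook rise-per-residue values. The sketch below illustrates both calculations. The two peptide strings are invented examples constructed to give charges of +7 and −4, not the actual G. lamblia or M. jannaschii sequences; the rise values (about 1.5 Å per residue for an α-helix, about 3.4 Å for a fully extended chain) are standard approximations and are not taken from this paper; and histidine is treated as neutral.

```python
POSITIVE = set("KR")  # Lys, Arg counted as +1 near neutral pH
NEGATIVE = set("DE")  # Asp, Glu counted as -1 near neutral pH


def net_charge(segment):
    """Net charge of a peptide segment from its one-letter sequence."""
    seq = segment.upper()
    plus = sum(1 for res in seq if res in POSITIVE)
    minus = sum(1 for res in seq if res in NEGATIVE)
    return plus - minus


def span_angstrom(n_residues, rise_per_residue):
    """Approximate end-to-end length of a straight segment, in angstroms."""
    return n_residues * rise_per_residue


# Invented sequences, for illustration only.
basic_example = "KKAERKLKRAKKG"
acidic_example = "EEDAKELDG"

# Lys464-Lys486 inclusive, the stretch named in the text.
rod_length = 486 - 464 + 1

if __name__ == "__main__":
    print(net_charge(basic_example), net_charge(acidic_example))
    print(span_angstrom(rod_length, 1.5), span_angstrom(rod_length, 3.4))
```

Under these assumptions a 23-residue stretch spans roughly 35 Å as a helix and up to about 78 Å when fully extended, which brackets the ~40 Å distance discussed above.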
Phylogenetic Relationships and Enzyme Activities-Alignments shown in Fig. 1 were used to construct the maximum likelihood phylogenetic trees for the α- or β-subunit homologs (data not shown) as well as a composite tree that included all identified sequences. This analysis showed that the members of the acyl-CS superfamily form two well separated groups (bootstrap values close to 100%), one that includes G. lamblia ACS and related proteins and the other that unifies the SCSs and malyl-CoA synthetase, as indicated in Fig. 1. This separation between the two clusters supported the notion that although these two groups derive from the same ancestral protein, their evolutionary paths separated early, leading to the enzymes being active mostly against dicarboxylate (SCS, malate thiokinase) or monocarboxylate substrates (ACS-like enzymes).
The composite tree for the ACS family of acyl-CoA synthetases (Fig. 3) revealed complex evolutionary relationships of the enzymes from different sources. In only a few cases was bootstrap support sufficient to make judgements regarding evolutionary history and, hence, likely functions of these enzymes. One such well separated clade comprises the products of the yfiQ gene of E. coli and nearly identical open reading frames from Y. pestis and S. putrefaciens (see Fig. 1). All these proteins contain an additional C-terminal domain with predicted acyltransferase activity (Fig. 2). Remarkably, the proteins from E. coli and Y. pestis, while showing a high level of overall sequence conservation with other ACS homologs, have the critical His246α residue changed into Asn. Changing this His residue into Asp has been shown to render SCS inactive (28). No such data, however, are available for the Asn substitution. Notably, E. coli contains an SCS and additional enzymes responsible for both acetate-activating and acetate-utilizing activities (phosphotransacetylase and AMP-forming acetyl-CoA synthetase, respectively (12, 13)); thus, a mutational change in the enterobacterial lineage could allow the yfiQ gene product to either lose the enzymatic activity altogether or acquire an alternative substrate specificity. The presence of the additional acyltransferase domain in these proteins (Fig. 2) seems to argue for the latter possibility.
The second strongly supported clade of bacterial enzymes includes the pimeloyl-CoA synthetase of P. mendocina, active toward dicarboxylic acids containing from 5 to 9 carbon atoms (23), and related proteins from P. aeruginosa and B. pertussis (Fig. 3). In an earlier report, pimeloyl-CoA synthetase was assumed to be an AMP-forming enzyme (23); however, the conditions of the enzyme assay used would not differentiate between AMP and ADP formation. The similarity of this protein to the other members of the acyl-CS superfamily (Fig. 1) and the absence of statistically significant sequence similarity to the previously described AMP-forming pimeloyl-CoA synthetases from Bacillus sphaericus and Bacillus subtilis (29,30) or any other AMP-forming acyl-CoA synthetases (data not shown) strongly suggest that the P. mendocina enzyme is in fact an ADP-forming enzyme. This case of non-related enzymes catalyzing the same biochemical reaction (conversion of pimelate to pimeloyl-CoA) is not that surprising, as this phenomenon has recently been found to be very common in the microbial world (31,32).
Among archaeal species, all the P. furiosus proteins grouped together, indicating relatively recent duplications of the respective genes in the pyrococcal branch. This correlated with the overlapping substrate specificities of the corresponding enzymes (Table I). On the other hand, the different sequences of A. fulgidus group largely according to their domain organization (Figs. 2 and 3). The proteins AF0932, AF1192, and AF1938 that have the domain order 3-4-1-2-5 group together, whereas AF1211 groups with the Giardia ACS that has the same 1-2-5-3-4 domain order, and AF1511 groups with the S. coelicolor protein SC9B10 that has the same domain order 1-2-5-3-4 with an additional N-terminal acyltransferase domain (see Figs. 2 and 3). Whereas the exact substrate specificities of these multiple ACS-like enzymes in archaea remain to be determined experimentally, their sheer number and apparently independent duplication in Archaeoglobus and Pyrococcus branches indicate their critical role(s) in the cellular metabolism.
[...]tase (both ADP- and GDP-forming variants), malyl-CoA synthetase, ATP citrate lyase, and several hypothetical proteins, possibly with distinct substrate specificities. The enzymes of this superfamily demonstrate a significant degree of conservation of the amino acid residues that form their likely active site and can be confidently predicted to act as NDP-forming acyl-CoA synthetases. The common theme in their reactions in the direction of acyl-CoA synthesis is thus the requirement for an organic acid (mono- or dicarboxylate), coenzyme A, a divalent cation, and the ability to utilize a purine nucleoside triphosphate. Moreover, several of the biochemically studied members of this group act in vivo in the opposite direction, conserving the energy of the thioester bond by substrate level phosphorylation of NDP. A comparison of the sequences of ACSs with the well characterized subunit and three-dimensional structure of the E. coli SCS (26) allowed us to pinpoint the conserved and varying features of this superfamily. The most striking differences concerned the relative order of the five domains that are recognized in the two subunits of SCS (α, 1-2 and β, 3-4-5) and the fusion or presence of the two subunits in the various gene products. Despite this marked permutation, domains 1 and 2 were always adjacent. This is the area in SCS that includes the active site His residue as well as some of the residues interacting with the phosphohistidine loop. In addition, conservation of the amino acid residues interacting with CoA and the magnesium-binding residues (Asn199-Pro200) suggests certain similarities in the reaction mechanism between ACS and SCS, that is, phosphorylation of the N-3 of the imidazole ring of the active site His residue.
The acyl-CoA synthetase superfamily is unique in the degree of domain rearrangements among its members. This can be compared only to the domain shuffling seen in various proteins of the P-enolpyruvate-dependent sugar:phosphotransferase system (see Refs. 33 and 34) and to the modular structure of polyketide synthases (35,36). In the latter case, however, individual enzyme domains are usually connected with long (~500 amino acid residues) linkers, whereas the hinge regions in acyl-CSs are surprisingly short (Figs. 1 and 2).
FIG. 2 (legend, partially recovered). [...] Fig. 1, B and C); the black box indicates the predicted acyltransferase domain; and the curved arrow indicates the hinge region (see Fig. 1C). The C-terminal citrate synthase-like domain of ATP citrate lyase is not shown.
FIG. 3. Phylogenetic relationships between different acyl-CoA synthetases. The maximum likelihood tree of the sequences homologous to the α-subunit of E. coli succinyl-CoA synthetase (see Fig. 1A) is shown. A total of 250 shared positions were analyzed after exclusion of nonoverlapping ends and gaps. Only bootstrap proportions higher than 85% are indicated at the nodes.
Phylogenetic distribution of members of the acyl-CS superfamily is unusual. Among the sequenced bacterial genomes, acyl-CSs other than SCS have been found so far only in γ-proteobacteria (E. coli, Y. pestis, V. cholerae, S. putrefaciens, and others), in one β-proteobacterium (B. pertussis), in one representative of the Cytophaga/Bacteroides group (P. gingivalis), and in two representatives of Gram-positive bacteria (C. difficile and S. coelicolor). Such enzymes are clearly not encoded in complete genomes of such Gram-positive bacteria as B. subtilis, Clostridium acetobutylicum, and Mycobacterium tuberculosis, in the β-proteobacterium Neisseria meningitidis, and the ε-proteobacterium Helicobacter pylori. The corresponding genes are also missing in the genomes of chlamydiae and the spirochetes Borrelia burgdorferi and Treponema pallidum (data not shown). Among eukaryotes, yeast contains only succinyl-CoA synthetase, whereas C. elegans has succinyl-CoA synthetase in the mitochondria and ATP citrate lyase in the cytoplasm. The absence of ACS in yeast and, probably, higher eukaryotes correlates with its absence in Rickettsia prowazekii, an α-proteobacterium and a close relative of the pre-mitochondrial symbiont (37).
Among eukaryotes, functional ACS, and thus the conservation of the energy generated in keto acid oxidations by a single enzyme, seems to be restricted to the Type I amitochondriate species, i.e. organisms without a separate, membrane-bounded organelle of energy metabolism (8). In organisms harboring typical mitochondria, SCS performs this function as part of a complete tricarboxylic acid cycle (1). In certain acetate-producing eukaryotes that contain modified mitochondria or hydrogenosomes, organellar SCS remains the enzyme of substrate level phosphorylation. In this case, however, it forms a two-enzyme pathway with a CoA acyltransferase (3,38,39). The identification of ACS homologs in the genome of P. falciparum was somewhat unexpected, because this organism has mitochondria, albeit deeply modified ones, that probably are not involved in a major way in core energy metabolism. These statements are, however, conjectures based on biochemical data on only two parasitic species that belong to two separate eukaryotic lineages. A much broader sampling of the protist diversity is necessary to test the proposed but as yet unexplained correlation between the presence of ACS and the lack of mitochondria and hydrogenosomes. The case of P. falciparum gives a clear warning that much remains hidden.
The unusual phylogenetic distribution of the acyl-CSs, other than SCS, in the microbial world makes them attractive as potential drug targets. Indeed, these enzymes appear to play a key role in the cell energy metabolism and should be indispensable for survival of such pathogens as the bacteria B. pertussis and P. gingivalis and the malaria agent P. falciparum. The function(s) of the γ-proteobacterial acyl-CSs with an additional acyltransferase domain are more obscure, but their remarkable sequence conservation indicates that they also play some important role in the cell metabolism. Most importantly, the apparent absence of such acyl-CS activity in humans (or any other metazoans) makes it possible to specifically target pathogens with little or no effect on the human host. Design of such new anti-infectious drugs would be a worthy outcome of genome studies like this one.