Escherichia coli FolC Structure Reveals an Unexpected Dihydrofolate Binding Site Providing an Attractive Target for Anti-microbial Therapy

In some bacteria, such as Escherichia coli, the addition of L-glutamate to dihydropteroate (dihydrofolate synthetase (DHFS) activity) and the subsequent additions of L-glutamate to tetrahydrofolate (folylpolyglutamate synthetase (FPGS) activity) are catalyzed by the same enzyme, FolC. The crystal structure of E. coli FolC is described in this paper. It shows strong similarities to that of the FPGS enzyme of Lactobacillus casei within the ATP binding site and the catalytic site, as do all other members of the Mur synthetase superfamily. The FolC structure reveals an unexpected dihydropteroate binding site very different from the folate site identified previously in the FPGS structure. The relevance of this site is exemplified by the presence of phosphorylated dihydropteroate, a reaction intermediate in the DHFS reaction. L. casei FPGS is considered a relevant model for human FPGS. As such, the presence of a folate binding site in E. coli FolC, which is different from the one seen in FPGS enzymes, provides avenues for the design of specific inhibitors of this enzyme in antimicrobial therapy.

Folate molecules (tetrahydrofolate and derivatives) are used as cofactors in various metabolic pathways. Tetrahydrofolate (vitamin B9) is essential for normal cell growth and replication, and its metabolic pathway has been the target for cytotoxic drugs in cancer therapy (1). Folates are synthesized de novo in many organisms (plants and most bacteria), but other organisms such as mammals or Lactobacillus casei must obtain folates through uptake. During the biosynthesis of folates, dihydrofolate synthetase (DHFS) activity adds L-glutamate to dihydropteroate to form dihydrofolate. After the reduction of dihydrofolate to tetrahydrofolate by dihydrofolate reductase, folylpolyglutamate synthetase (FPGS) adds a second and third glutamate by γ-linkage to form polyglutamates (Fig. 1). These polyglutamate products are the in vivo cofactors of folate-dependent enzymes.
FPGS activity seems to be ubiquitous. The polyglutamation of folates allows their retention in the cell and increases their affinity for some of the folate-dependent enzymes. In mammals, folate cofactors are also stored in mitochondria for use in various folate-dependent reactions such as glycine synthesis (2). The importance of FPGS in the folate pathway makes it interesting for cancer therapy, as folates are essential in cell growth, and some of the folate analogs used as inhibitors of the folate pathway increase their potency by being polyglutamylated in the cell (2). DHFS activity is present only in organisms that synthesize folates de novo. It can be associated on the same protein with FPGS activity as in Escherichia coli or Neisseria gonorrhoeae, and this protein will be called FolC in this paper. Different enzymes can also carry out DHFS and FPGS activities, sometimes in different compartments as in the plant Arabidopsis thaliana (3).
DHFS activity has been shown to be essential in Gram negative bacteria such as E. coli (4) and N. gonorrhoeae (5) as well as in Gram positive bacteria like Staphylococcus aureus or Streptomyces coelicolor (6), whereas it is not present in mammals. Because of its ubiquity in pathogenic bacteria, its essentiality for these organisms, and its absence in humans, the inhibition of DHFS activity would appear to be a promising target for antimicrobial therapy. There are strong similarities between DHFS and FPGS enzymes at both the sequence level and the mechanism level. A sequence search using the E. coli FolC sequence as a reference finds DHFS/FPGS/FolC enzymes, irrespective of their specificity, with at least 25% identity between any two of these sequences, and they cluster mainly according to their origin and not to their specificity. Furthermore, the reactions (Fig. 1a) are supposed to be identical and to follow an ordered Ter Ter mechanism (7). Phosphate is first added to the folate substrate upon hydrolysis of the ATP to form an acyl-phosphate intermediate. The second step of the reaction is suggested to be a nucleophilic attack by the amine of L-glutamate, producing a tetrahedral intermediate. In the last step, the tetrahedral intermediate collapses to yield ADP, the glutamylated folate, and inorganic phosphate (Fig. 1b).
The structure of the L. casei FPGS enzyme, free (3) and bound to its substrates (8), shows that it is made of two domains folded consecutively and joined by a six-residue linker. ATP binds in a channel sandwiched between the two domains. A significant rearrangement of the ATP binding mode is observed upon the binding of a folate molecule, from a folate-unbound form with the ribose ring in a C3′-endo conformation to a folate-bound form in anti conformation. A rigid body movement of the two domains is also observed, closing on the interdomain cleft into which the folate molecule binds.
Analysis of the L. casei FPGS active site reveals that it belongs structurally to the Mur ADP-forming ligase superfamily (9), which also comprises the structures of MurD (10) and MurE (11). MurD catalyzes the addition of D-glutamate to UDP-N-acetylmuramoyl-L-alanine in the presence of ATP. The proposed mechanism for this reaction is similar to that of FPGS. The superimposition of the structures reveals a very conserved catalytic site, the presence of a similar phosphate-binding loop (P-loop) in the nucleotide binding site, and very similar phosphate binding pockets. The β- and γ-phosphates of ATP are held in place by their interactions with a magnesium ion, which is further coordinated by a conserved lysine at the end of the P-loop, a conserved glutamate, and another residue which, in L. casei FPGS, is Ser 73 of the Ω-loop, a feature specific to FPGS enzymes. This loop contains residues whose properties are conserved among all DHFS/FPGS sequences and especially a Ser-cis-Pro motif (Ser 73 -Pro 74 in L. casei) in the active site. A second magnesium site is present in the substrate-containing structures. It is hexacoordinated by four water molecules arranged in a plane, an oxygen of the γ-phosphate on one side of the water plane, and a conserved histidine on the other side. Two of the water molecules are further coordinated by a conserved carbamylated lysine. This lysine seems to be present in all members of the Mur synthetase superfamily including MurD, MurE, MurF, cyanophycin synthetase, and FPGS enzymes (12); the precise role of the carbamate in the enzymatic activity is difficult to assess but is supposed to be important for spatial stabilization of the two water molecules that correctly position the Mg2+. Mutagenesis experiments indicate that the carbamylated lysine is essential for activity.
The sequence similarities between FPGS, DHFS, and FolC enzymes suggest that both the ATP binding site and the catalytic site are conserved at the three-dimensional level in all these enzymes. Hence, differences in specificity should mostly result from differences in the binding of the various folate molecules to the enzymes.
To verify this hypothesis, we decided to study the E. coli enzyme. The folC gene is an essential gene in E. coli (4), and its product (FolC) catalyzes both DHFS and FPGS reactions. Biochemical studies suggested that it was bifunctional (13). A number of independent mutants could be made that affect both activities similarly (14). These mutations can be mapped, using the L. casei FPGS structure, either to the ATP binding site or to the catalytic site. These two studies suggest that FolC has a single ATP binding site and a single catalytic site but might have several binding sites or binding modes for its various substrates. Presently, it remains unclear how FolC can perform the two reactions and whether it qualifies as an attractive target for designing selective inhibitors against the folate binding enzymes. Our work addresses these important issues.

EXPERIMENTAL PROCEDURES
Protein Production-The E. coli folC gene was cloned in plasmid pET29. The resulting plasmid, pVRC1432, was transferred into the E. coli BL21 (DE3) strain. This strain was grown at 37°C in 1 liter of Luria-Bertani medium in the presence of 50 mg/liter kanamycin. When A600 reached 0.8, 1 mM isopropyl-1-thio-β-D-galactopyranoside was added, and the culture was continued until A600 reached 1.5. After centrifugation, cell pellets (8 g of frozen cells) were resuspended in 50 mM Tris-HCl buffer, pH 7, containing 2 mM phenylmethanesulfonyl fluoride and lysed by sonication.
The supernatant (50 ml) was loaded on a Q-Sepharose Hi Load 16/10 column (Amersham Biosciences) equilibrated with 50 mM Tris-HCl buffer, pH 7. The column was washed with the same buffer. The proteins were eluted with a 200-ml gradient of 0-1 M NaCl in the buffer. Fractions (10 ml) were checked by SDS-PAGE. The fraction containing the largest amount of the FolC protein was applied to a Superdex 75 16/60 column (Amersham Biosciences) equilibrated with 50 mM Tris-HCl, 150 mM NaCl, pH 7. Elution was performed with the same buffer. Fractions of 2 ml were collected at a flow rate of 1 ml/min. Using this protocol, 95 mg of pure FolC protein was obtained.
Biochemical Data-The enzyme was incubated in 10 mM glycine-OH, 10 mM MgSO4, 50 mM KCl, pH 9.5, with DHP for measurement of DHFS activity or aminopterin for FPGS activity in the presence of Tris(2-carboxyethyl)phosphine, ATP, and glutamate (cf. Table I for concentrations). The reaction was stopped by the addition of HCl (20 µl, 1 N). 300 µl was transferred to a microplate injection system for separation by high performance liquid chromatography (Gilson apparatus). 250 µl was injected onto a Macherey-Nagel Nucleodur column (105-mm C8). A gradient between buffer A (H2O and 0.1% trifluoroacetic acid) and buffer B (acetonitrile plus 0.1% trifluoroacetic acid) was applied for 5 min 20 s with a flow rate of 2.2 ml/min. The peak of dihydrofolate (DHF) was measured by a fluorescence spectrophotometer (λEX, 285 nm; λEM, 420 nm). In the conditions of measurement of DHFS activity, the Km values of DHP, ATP, and glutamate were 150 nM, 11 µM, and 142 µM, respectively.
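To give a feel for what the reported Km values imply, the following minimal sketch evaluates the single-substrate Michaelis-Menten expression v/Vmax = [S]/([S] + Km) for each substrate. It is a simplification and not the rate law of the ordered Ter Ter reaction: it assumes that the two other substrates are saturating. The function name and the test concentrations are illustrative assumptions and are not taken from Table I.

```python
# Km values reported in the text for the DHFS reaction, converted to micromolar.
KM_UM = {"DHP": 0.150, "ATP": 11.0, "glutamate": 142.0}


def fraction_of_vmax(substrate_um, km_um):
    """Return v/Vmax = [S] / ([S] + Km) for one substrate (Michaelis-Menten).

    Assumes the other two substrates are saturating (a simplification).
    """
    if substrate_um < 0 or km_um <= 0:
        raise ValueError("concentration must be >= 0 and Km must be > 0")
    return substrate_um / (substrate_um + km_um)


# Hypothetical concentrations, expressed as multiples of Km, for illustration.
for name, km in KM_UM.items():
    for multiple in (1, 10):
        conc = multiple * km
        frac = fraction_of_vmax(conc, km)
        print(f"{name}: [S] = {conc:g} uM -> v/Vmax = {frac:.2f}")
```

At a concentration equal to Km the enzyme runs at half its maximal rate, and at ten times Km at about 91% of it, which is why the very different Km values of DHP (150 nM) and glutamate (142 µM) matter when choosing assay concentrations.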
Crystallization and Structure Determination-Protein was transferred and concentrated into a buffer containing 50 mM Tris-HCl, pH 8, 6 mM dithiothreitol, 10 mM ATP (or ADP), and 15 mM MgCl2 to ~50 mg/ml. It was then incubated overnight with the ligand of interest at 0.4 mM. Crystals were obtained by the hanging drop method in 1.2-1.7 M ammonium sulfate, 5 mM dithiothreitol, and Bicine, pH 8.7. The crystals were then frozen in a buffer containing 1.8 M ammonium sulfate, 25% glycerol, and Bicine, pH 8.7. Data were collected at the European Synchrotron Radiation Facility on line ID14. Molecular replacement was done using Amore (18) of the CCP4 suite (19); the model was the apo L. casei FPGS structure separated into its two domains. A solution was unambiguously found but was of poor quality. Intensive rebuilding had to be done using maps calculated with Buster/TNT (20-22). Refinement was then done using this same program (Table II).
Sequence Alignments-A sequence search was carried out using the E. coli FolC sequence as a reference, and an overall sequence alignment was done on the 47 bacterial sequences thus obtained using programs of the Vector NTI suite. Even though there is at least 25% identity between any two of these proteins, the overall sequence identity was very low (4%), with 39% overall sequence similarity. Important residues (ADP binding site, active site residues) are conserved among all sequences, validating the alignment. Sequences cluster in bacterial families, but this alignment does not distinguish between the different specificities of the enzymes.
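The gap between a pairwise identity of at least 25% and an overall identity of only 4% follows from how the two numbers are counted: pairwise identity compares two sequences at a time, whereas overall identity requires a column to be identical in every sequence at once. The sketch below illustrates this with three short made-up aligned sequences; they are hypothetical and do not come from the 47-sequence alignment, and the function names are our own rather than those of the Vector NTI suite.

```python
def pairwise_identity(seq_a, seq_b):
    """Fraction of identical residues over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)


def fully_conserved_fraction(alignment):
    """Fraction of alignment columns that hold the same residue in every sequence."""
    columns = list(zip(*alignment))
    conserved = sum(1 for col in columns if "-" not in col and len(set(col)) == 1)
    return conserved / len(columns)


# Hypothetical toy alignment (three sequences, eight columns).
TOY_ALIGNMENT = ["GLGGDKST", "GLGGEKTA", "GIAGERTT"]

for i in range(len(TOY_ALIGNMENT)):
    for j in range(i + 1, len(TOY_ALIGNMENT)):
        ident = pairwise_identity(TOY_ALIGNMENT[i], TOY_ALIGNMENT[j])
        print(f"sequences {i + 1} and {j + 1}: {ident:.1%} identical")
print(f"fully conserved columns: {fully_conserved_fraction(TOY_ALIGNMENT):.1%}")
```

In this toy case every pair shares at least 37.5% of its residues, yet only 25% of the columns are conserved in all three sequences. The effect grows with the number of sequences, which is consistent with the low overall figure reported for the 47 bacterial sequences.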

RESULTS
Three-dimensional Structure of E. coli FolC-The 25% identity in sequence between E. coli FolC and L. casei FPGS was also found at the three-dimensional level. The proteins share the same overall structure. Both individual N-terminal and C-terminal domains of E. coli FolC superimposed with those of L. casei FPGS with a root mean square fit of 1.6 Å for the N-terminal domains and 1.9 Å for the C-terminal domains, although their relative orientations are different in these two structures (Fig. 2a). The N-terminal domain presents some differences with its FPGS counterpart, namely an ordered loop between α1 and α2 (which, as will be seen later, has some functional role), shorter β-strands 8 and 9, and a shorter α8. All residues of the C-terminal domain are clearly visible, revealing an additional α-helix between β-strands 14 and 15 in a region that is disordered in FPGS (Fig. 2b).
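The root mean square fit quoted for each domain is the square root of the mean squared distance between equivalent atoms once the two structures have been superimposed. The sketch below computes that quantity for coordinates that are assumed to be already superimposed and paired atom by atom; it does not perform the superposition itself, and the coordinates shown are made up for illustration rather than taken from either structure.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (same unit as the input, e.g. angstroms).

    coords_a and coords_b are equal-length lists of (x, y, z) tuples for
    equivalent atoms, assumed to be superimposed beforehand.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates for two equivalent atoms in each structure.
structure_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
structure_2 = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(f"RMSD = {rmsd(structure_1, structure_2):.3f}")
```

Because the value is an average over all paired atoms, fitting each domain separately (as done here for the N- and C-terminal domains) gives a meaningful number even when the relative orientation of the domains differs between the two proteins.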
ATP Binding Site-E. coli DHFS crystals grow only in the presence of nucleotides, ADP, ATP, or AMP-PNP. Extensive crystallization trials were carried out to find crystallization conditions for the apo-form of the enzyme, but this approach remains unsuccessful. In all cases, in the presence or absence of another substrate (DHF) the nucleotide adopts the same conformation, which corresponds to the "activated" conformation found in FPGS.
In this conformation, the nucleotide molecule sits in a narrow channel sandwiched between the N and C domains (Fig. 3). It is covered by ordered water molecules on the adenosine site and is opened toward the catalytic site on the other side. The sides of the channel are made up by the N-terminal parts of α-helices 3 and 8, the inter-domain linker, and the loop joining β-strand 10 and α-helix 10. The adenosine moiety is stacked between the glycine-rich loop (P-loop, residues 56-59) and Tyr 312 of α-helix 10. The ribose moiety is in hydrogen-bond interactions with Asp 302 , equivalent to Asp 313 of L. casei FPGS and the major contributor to the ribose binding site. The α-phosphate makes interactions with the P-loop on one side and with the linker residue Arg 289 side chain on the other side. The β- and γ-phosphates interact as already described in L. casei. The Ω-loop, containing a cis-Pro, adopts the same conformation, thus positioning the carbonyl of Ser 83 to coordinate the first magnesium ion. The second magnesium ion is also coordinated in the same way as in FPGS, with the carbamylated Lys 188 fixing the position of two of the water molecules around the metal ion.
A significant rearrangement of ATP was described in L. casei FPGS upon the binding of a folate molecule (8). By superimposing the L. casei and E. coli ATP binding sites, the binding site of the C3′-endo conformation of ATP appeared blocked in E. coli, as Gly 314 of L. casei was replaced by Val 303 in a position that would clash with the adenosine moiety.
All attempts to obtain crystals of the complex between DHFS and ATP resulted in a structure in which the nucleotide was at least partly hydrolyzed. AMP-PNP shows full binding in the same "anti" conformation as ADP, even in the absence of substrate. From these observations, even though it is not possible to discard the possibility of a C3′-endo binding mode of ATP in E. coli FolC, such a movement seems ruled out because of the steric hindrance with Val 303 .
Folate Binding Site-E. coli FolC was crystallized in the presence of Mg-ADP or Mg-ATP and either DHF or thiopteroate at 1 mM (Fig. 5). DHF is the product of the DHFS reaction, and the unsaturated thiopteroate is not active against the enzyme. In the absence of a bound folate molecule, residues 24-26 are not visible in the electron density ("open" conformation; Fig. 4a). Upon the binding of DHF, additional electron density shows up unambiguously corresponding to a pteroic cycle, and the loop becomes ordered. The pteroic cycle fits into a narrow pocket, stacked on one side against Phe 124 of α-helix 5 and Ala 155 and on the other side against Ile 28 and Leu 30 . These two hydrophobic residues are further stabilized by their stacking against Trp 176 . All atoms from the pteroate moiety that have H-bond capabilities are engaged in productive interactions with neighboring residues (Fig. 4b). A water molecule, coordinated by the Thr 27 and Phe 124 main chain as well as the pteroic cycle, further occupies the pocket. It would seem that the binding of a pteroate molecule stabilizes residues 24-32; this induced fit mechanism is further supported by the structure obtained from the crystal grown in the presence of thiopteroate (weak binder and thus not present in the structure); in this case residues 24-26 are not visible, the rest of this
The main difference between the thiopteroate and dihydropteroate heterocycles is the protonation of N8 and C7 (Fig. 5). Because the pteroic cycle of thiopteroate did not show binding to the protein in the biochemical test, it is suggested that the interaction of the protonated N8 with Asp 154 is one of the leading forces in the specific binding of dihydropteroate molecules to FolC. Of all residues involved in pteroate binding in FolC, only the loop comprising residues 28-32 is not conserved among bacterial DHFS/FPGS sequences, even though groups of enzymes contain this pattern.
This observation suggests a possible sequence marker for DHFS activity. It is difficult to verify this hypothesis, as the function of most DHFS/FPGS/FolC enzymes is not fully characterized. Nevertheless, such work has been done in yeast (15), where both an FPGS and a DHFS enzyme have been identified, and in A. thaliana (16). In these DHFS enzymes a similar pattern of residues could be identified, namely hydrophobic (D/E)LGL. This pattern is present in the 13 enterobacteria FolC sequences analyzed, suggesting that this novel site might exist in a reduced number of bacteria.
The rest of the DHP molecule, following the pteroate moiety, is clearly visible. Loop 147-150 of the enzyme delineates a platform for the benzoate moiety, which points toward the active site, at the right distance for the addition of phosphate, the first step in the DHFS reaction. Loop 147-150 is a highly conserved sequence (G(L/I/M)(G/A)G) present in all DHFS/FPGS/FolC enzymes, just before residue Asp 154 .
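The two sequence patterns discussed here, the conserved G(L/I/M)(G/A)G loop and the proposed DHFS marker "hydrophobic (D/E)LGL", can be written as regular expressions and used to scan protein sequences. The sketch below is one way to do so. The set of residues counted as hydrophobic is an assumption, since the text does not define it, and the test sequence is made up and is not a real FolC sequence.

```python
import re

# Conserved loop 147-150 pattern as given in the text: G(L/I/M)(G/A)G.
LOOP_147_150 = re.compile(r"G[LIM][GA]G")

# Assumption: residues treated as "hydrophobic" (not specified in the text).
HYDROPHOBIC = "AVILMFWY"

# Proposed DHFS marker: one hydrophobic residue followed by (D/E)LGL.
DHP_LOOP = re.compile(rf"[{HYDROPHOBIC}][DE]LGL")


def find_motif(pattern, sequence):
    """Return (1-based start position, matched residues) for each non-overlapping match."""
    return [(m.start() + 1, m.group()) for m in pattern.finditer(sequence)]


# Hypothetical sequence for illustration only.
TOY_SEQUENCE = "MKTAIDLGLSRGLGGTNGK"

print("DHP loop pattern:", find_motif(DHP_LOOP, TOY_SEQUENCE))
print("Loop 147-150 pattern:", find_motif(LOOP_147_150, TOY_SEQUENCE))
```

A match to such a short pattern is only a hint and can occur by chance, so any hit would need to be checked against the position of the loop in an alignment before being read as evidence for DHFS activity.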
Catalytic Site-The catalytic site where addition of phosphate and subsequently glutamate takes place is identical to that of L. casei FPGS. In the absence of a pteroate molecule ATP has been hydrolyzed to ADP, and its γ-phosphate is held in place by two magnesium ions. The additional partners of the first magnesium ion are two water molecules, the main-chain oxygen of Ser 83 (whose position is constrained by the conserved cis-Pro 84 of the Ω-loop), a carboxylate oxygen of Glu 146 , and one of the oxygens of a putative sulfate molecule. Because of the presence of >1 M ammonium sulfate in the crystallization conditions, the density at this position was interpreted as a sulfate but could also be modeled as a phosphate resulting from the hydrolysis of ATP. The second magnesium ion (Mg2) is hexacoordinated by four water molecules, an oxygen of the γ-phosphate on one side of the water plane, and His 173 on the other side. Two of the water molecules are further coordinated by the conserved carbamylated Lys 188 .
Mechanism-The DHFS reaction is presumed to be a Ter Ter sequential reaction (7). Phosphate is first added to the dihydropteroate upon hydrolysis of the ATP (forming DHP-P), before the addition of the L-glutamate and the dissociation of the phosphate ion. E. coli FolC crystallized in the presence of 2 mM DHF and 10 mM Mg-ADP, and all residues are in an extremely well defined density. However, the glutamate moiety of DHF was not visible and was very likely hydrolyzed. The DHF was back-transformed into DHP-P, probably because of the presence of residual phosphate in solution. A control experiment was done using DHF and ATP, and the structure obtained contained a mixture of DHP-P/ADP and DHP/ATP (data not shown).
As described previously, the pteroic cycle lies in its binding pocket, sandwiched between α-helix 5 and the DHP binding loop (Fig. 4a). The benzoate moiety points toward the ADP binding site, and the phosphate moiety lies in well defined density. The phosphate is stabilized via interactions with the two magnesium ions, whereas one of the water molecules coordinating Mg2 has been displaced by the carbonyl oxygen of DHP-P (Fig. 6). The third oxygen of the phosphate is not in direct interaction with any atom, but it is ~3.4 Å from the NH of His 305 , suggesting a role of this residue in the next step of the reaction.

DISCUSSION
The ternary complex between E. coli FolC, ADP, and DHP-P reveals the existence of a specific binding site for the dihydropteroate, which is significantly different from the 5,10-methylene tetrahydrofolate (mTHF) binding site of L. casei FPGS. This DHP binding site is validated by the presence of the first reaction intermediate expected from the Ter Ter addition mechanism.
FolC can catalyze both the DHFS and the FPGS reactions. Both reactions are thought to follow the same mechanism of addition of glutamate molecules. The dihydrofolate synthetase activity adds a first L-glutamate to dihydropteroate, whereas the folylpolyglutamate synthetase activity adds the subsequent glutamates to the tetrahydrofolate.
The identified dihydropteroate binding site ideally positions the DHP molecule for the addition of a single glutamate. However, the exact location of the glutamate binding site has not yet been determined structurally in any member of the FolC-related enzymes. The closeness of DHP-P to ADP (Fig. 6) provides enough space for an L-glutamate molecule to get in contact with DHP-P for the second step of the Ter Ter reaction. Nevertheless, it appears unlikely that subsequent addition of L-glutamate molecules could take place at the same location without a conformational adjustment. This suggests that the DHP binding site is specific to the DHFS reaction, whereas another, presumably larger binding site may exist for the FPGS reaction. This would be in agreement with biochemical studies (13) demonstrating that the two activities can be inhibited independently from one another. A second hypothesis would be that the protein undergoes a conformational rearrangement to enlarge the DHP binding site to accommodate larger substrates.
To determine whether a second folate binding site corresponding to that of L. casei FPGS exists in E. coli FolC, the superimposition of both enzymes was done using either the N- or C-terminal domain as template. In L. casei FPGS, mTHF binds in the interdomain cleft adjacent to the Ω-loop (8), with the pteridine ring sandwiched between Phe 75 of the Ω-loop and Tyr 414 of the C-terminal domain. In this position, the amide nitrogen atom of the benzoyl group is ~10 Å from the active site, a distance that could accommodate a diglutamate moiety.
When the N-domains of L. casei FPGS and E. coli FolC were superimposed, there appeared a site that could possibly accommodate a folate molecule as indicated by a good overlap of the Ω-loops and His 85 substituting for Phe 75 . In contrast, the superimposition of the C-domains does not highlight clearly a common folate binding site. This binding site could be found if helix 13 adopted a different conformation. The mTHF molecule could then sit in a ridge between the two FolC domains with His 85 on one side and His 405 and Tyr 123 (Glu 120 in L. casei) on the other side (Fig. 7). The amide nitrogen atom of the benzoyl group would be positioned near the nitrogen of the aminobenzoate moiety of the DHP-P, whereas the carboxylate moiety would be right at the position of the water molecule present in the DHP binding site.
Cocrystallizing FPGS substrate analogs under E. coli FolC crystallization conditions has remained unsuccessful to date by always yielding the same crystal form, suggesting that the binding cleft for the FPGS substrate might result from an induced fit mechanism that cannot take place in this crystal form. This hypothesis awaits three-dimensional structure determination.
The pteridine moiety of dihydropteroate fits snugly into the observed binding site of FolC, and the only free space between the protein and the ligand is found in the vicinity of the N5 of the dihydropyridine. This cavity is occupied by a single water molecule. It highlights the small size of the pteridine binding site. All polar atoms of the pteridine ring are involved in hydrogen bond contacts with the protein except for N5, which makes a water-mediated contact (Fig. 4b). Interestingly, the loss of one hydrogen bond reduces dramatically the inhibitory activity. Taken together, these data indicate that there is an exquisite complementarity both at the shape and electrostatic levels between the protein and DHP. As a consequence, the scope for designing potent inhibitors that interact specifically in the pteridine binding site would appear rather limited in terms of chemical diversity. Interestingly, some isosteric modifications of the pteridine moiety are well tolerated at positions 5 and 8 of the ring for the FPGS binding (17). This would be consistent with the open three-dimensional structural data of L. casei FPGS complexed with mTHF, which shows that only the aminopyrimidine ring is involved in hydrogen bond contacts with the protein, whereas the second ring is exposed to the solvent. A homology model (data not shown) of human FPGS would suggest that human FPGS exhibits a very similar folate binding site. Actually the distance between the γ-carboxyl group of the glutamate side chain and the amide nitrogen of folyl analogs would appear to be the most critical for substrate activity (17).
The areas where the aryl moieties sit in FPGS and DHFS differ very significantly. These differences between this novel site and the FPGS folate binding site suggest that specific inhibitors could be identified that would not display significant activity against human FPGS. These selective inhibitors should be more amenable to drug development for antimicrobial therapy as they would be less likely to exhibit adverse side effects in humans.
The structural data reported within this paper have shown that there exists a novel pteroate binding site in E. coli, which appears to be conserved in enterobacteria, where the DHFS reaction takes place that is different from the site observed for the L. casei enzyme. This has implications on the drug potential ("druggability") of this therapeutic target for bacterial infections but also on the understanding of the mechanism of this bi-functional enzyme. It raises the question of how the second reaction (FPGS activity) is being performed. Can a second folate binding site be formed on this protein to accommodate larger substrates, presumably following a conformational rearrangement, or does the pteridine moiety remain in place and a subsequent conformation change enables the addition of glutamates to the substrate? These questions await further structural characterization of FolC/ligand complexes.