Phylogenetic Classification of Protozoa Based on the Structure of the Linker Domain in the Bifunctional Enzyme, Dihydrofolate Reductase-Thymidylate Synthase*

We have determined the crystal structure of dihydrofolate reductase-thymidylate synthase (DHFR-TS) from Cryptosporidium hominis, revealing a unique linker domain containing an 11-residue α-helix that has extensive interactions with the opposite DHFR-TS monomer of the homodimeric enzyme. Analysis of the structure of DHFR-TS from C. hominis and of previously solved structures of DHFR-TS from Plasmodium falciparum and Leishmania major reveals that the linker domain primarily controls the relative orientation of the DHFR and TS domains. Using the tertiary structure of the linker domains, we have been able to place a number of protozoa in two distinct and dissimilar structural families corresponding to two evolutionary families and provide the first structural evidence validating the use of DHFR-TS as a tool of phylogenetic classification. Furthermore, the structure of C. hominis DHFR-TS calls into question surface electrostatic channeling as the universal means of dihydrofolate transport between TS and DHFR in the bifunctional enzyme.

the problem of locating the root of the eukaryotic tree, one of the most challenging evolutionary problems. In several protozoa, including Alveolates and Euglenozoa, and in some plants, the genes for DHFR and TS are translated as a single polypeptide, forming a bifunctional enzyme (DHFR-TS), whereas in most animals, fungi, and bacteria, these two enzymes are monofunctional. The monofunctional form of DHFR is a monomer, and that of TS is a dimer. The currently held hypothesis is that the primordial form of DHFR and TS is the monofunctional form and that the genes for DHFR and TS became fused at a single evolutionary point. If the DHFR-TS gene fusion occurred just once, then the fused gene provides an excellent phylogenetic marker, since reversing the fusion would require multiple genetic events. Stechmann and Cavalier-Smith have used the derived gene fusion between DHFR and TS to place the root of the tree below the common ancestor of plants, Alveolates, and Euglenozoa (Fig. 1 shows the overview of the tree; classifications are described in greater detail below).
There has been significant controversy concerning the classification and evolutionary progression of the protozoa (2)(3)(4)(5). It is thought that many eukaryotic taxa arose during one explosive event into a cluster that has proved difficult to resolve (6). Sequences of rRNA and four other protein-encoding genes (5) have been used to place the protists into several groups. In this work, we investigated several protozoa with bifunctional DHFR-TS: Alveolata, including the Apicomplexans Cryptosporidium, Toxoplasma, and Plasmodium, and Euglenozoa, including kinetoplastids such as Leishmania and the trypanosomes.
To better understand whether bifunctional DHFR-TS is a good tool for the phylogenetic classification of some families of protozoa, we have solved the crystal structure of DHFR-TS from Cryptosporidium hominis (ChDHFR-TS), previously called Cryptosporidium parvum type 1 (7), to 2.8 Å and performed a structure-function analysis with previously solved structures of DHFR-TS from Plasmodium falciparum (PfDHFR-TS) (8) and Leishmania major (LmDHFR-TS) (9). The structures differ with respect to the docking of DHFR on TS, the length and interactions of the N terminus, and the structure of the linker domains between the DHFR and TS domains. Major differences exist between the Apicomplexan and the kinetoplastid DHFR-TS structures. The analysis reveals that there are dissimilar associations for DHFR and TS that correspond to two distinct families of protozoa. Furthermore, sequence analysis of additional DHFR-TS genes from various families of protozoa reveals that they, too, fall into these two families.
The structural differences we discovered between the linker domains of the Apicomplexan and kinetoplastid DHFR-TS families raised questions concerning the conservation of the electrostatic channeling mechanism between the families. The structure of LmDHFR-TS, the first DHFR-TS structure to be solved, enabled a hypothesis to explain the channeling of dihydrofolate from TS to DHFR (9). Knighton et al. reported that a series of positively charged residues lined the surface of the protein, extending from the TS active site to the DHFR active site. These residues were postulated to guide the negatively charged dihydrofolate from TS to DHFR in the sequential reaction. In fact, there is experimental evidence for channeling of dihydrofolate from TS to DHFR (10) in L. major, but it is not conclusive that this mechanism is based on electrostatics or extends to all species of DHFR-TS. Stroud (11) has previously postulated an alternative mechanism based on a rapid association of the DHFR and TS active sites by molecular dynamics. Our analysis of the electrostatic potential of ChDHFR-TS shows no pattern of positively charged residues between active sites that would indicate an electrostatic transfer of dihydrofolate.

EXPERIMENTAL PROCEDURES
Crystallization-Pure ChDHFR-TS was obtained from J. Vasquez and R. Nelson (7). The protein was purified on a methotrexate affinity column and eluted with 2 mM dihydrofolate. Fractions containing pure protein were concentrated to 7 mg/ml and incubated with 2 mM ligands (methotrexate, NADPH, CB3717, and dUMP) for approximately 1 h on ice. Using hanging drop vapor diffusion, a promising crystallization condition was refined to 10% polyethylene glycol 6000, 50 mM ammonium sulfate, 150 mM lithium sulfate, and 100 mM Tris, pH 8.0. Crystals grew in 2 weeks. The crystals were soaked in 15% ethylene glycol for 5 min and then in 25% ethylene glycol for another 5 min before being plunged into liquid nitrogen for cryogenic data collection.
Data Collection-Data were collected to 2.8-Å resolution at Brookhaven National Laboratory at beamline X12C on a B4 -2k CCD detector. The crystals belong to space group C2 with unit cell edges a = 214.9, b = 116.3, c = 219.7 Å and β = 95.23°. The data were indexed, integrated, and scaled with Denzo/Scalepack (12) and converted to structure factors with Truncate (13). A random set of reflections (10%) was set aside for the calculation of Rfree.
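As a rough plausibility check on these cell constants, the unit cell volume and the Matthews coefficient can be estimated in a few lines. The sketch below is illustrative only and is not part of the reported analysis. It assumes the five monomers per asymmetric unit and the 63-kDa monomer mass quoted elsewhere in the text, uses the fact that space group C2 has four asymmetric units per cell, and applies the common approximation that solvent fraction is about 1 - 1.23/V_M. The function names are hypothetical.

```python
import math

def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume (Å^3) of a monoclinic unit cell: V = a * b * c * sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))

def matthews_coefficient(volume, asu_per_cell, copies_per_asu, mass_da):
    """Matthews coefficient V_M in Å^3 per dalton of protein."""
    return volume / (asu_per_cell * copies_per_asu * mass_da)

def solvent_fraction(v_m, protein_factor=1.23):
    """Approximate solvent fraction, 1 - 1.23 / V_M (typical protein density)."""
    return 1.0 - protein_factor / v_m

# Cell constants from the text; space group C2 has 4 asymmetric units per cell.
volume = monoclinic_cell_volume(214.9, 116.3, 219.7, 95.23)
# Assumptions: five monomers per asymmetric unit, about 63,000 Da per monomer.
v_m = matthews_coefficient(volume, asu_per_cell=4, copies_per_asu=5, mass_da=63_000)
print(f"cell volume = {volume:.3e} Å^3")
print(f"V_M = {v_m:.2f} Å^3/Da, solvent fraction = {solvent_fraction(v_m):.2f}")
```

Under these assumptions the estimate points to a fairly loosely packed, high-solvent crystal, which would be in keeping with the moderate 2.8-Å resolution.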
Structure Determination-The structure was determined by molecular replacement using a model of thymidylate synthase from P. carinii (Protein Data Bank 1F28) (14), from which all ligands were removed, as a search model (15). A self-rotation map showed a large peak indicating 5-fold noncrystallographic symmetry. Cross-rotation peaks related by 72° were selected by a quaternion algorithm 2 and subjected to a translation search. Two full dimers of TS were placed in the asymmetric unit, and a third dimer of TS was located across the 2-fold symmetry axis, yielding five monomers per asymmetric unit. The molecules are arranged in an endless helix with a noncrystallographic 5₁ axis that runs throughout the entire crystal. Five models of DHFR from P. carinii (Protein Data Bank 1CD2) (16), without ligands, were placed using the fixed positions of the TS models as initial phase estimates. The starting R-factor was 52%.
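The idea of selecting cross-rotation peaks related by 72° can be illustrated with unit quaternions. The algorithm actually used is not described in the text, so the following is only a minimal sketch of one way to measure the rotation angle relating two orientations. The axis direction and the list of peaks are hypothetical.

```python
import math

def axis_angle_quaternion(axis, angle_deg):
    """Unit quaternion (w, x, y, z) for a rotation of angle_deg about axis."""
    norm = math.sqrt(sum(c * c for c in axis))
    half = math.radians(angle_deg) / 2.0
    s = math.sin(half) / norm
    return (math.cos(half), axis[0] * s, axis[1] * s, axis[2] * s)

def multiply(q, r):
    """Hamilton product q * r (rotation r applied first, then q)."""
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = r
    return (
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    )

def conjugate(q):
    """Inverse of a unit quaternion."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def relative_angle_deg(q, r):
    """Rotation angle (0 to 180 degrees) that takes orientation q to orientation r."""
    w = abs(multiply(r, conjugate(q))[0])
    return math.degrees(2.0 * math.acos(min(1.0, w)))

# Hypothetical peak list: five orientations spaced 72 degrees apart about one axis.
axis = (0.0, 0.0, 1.0)
peaks = [axis_angle_quaternion(axis, 72.0 * k) for k in range(5)]
for k in range(1, 5):
    print(k, round(relative_angle_deg(peaks[0], peaks[k]), 1))
```

Peaks whose pairwise angles come out near 72° or 144° are consistent with a shared 5-fold axis; a real peak search would also need to check that the rotation axes coincide.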
Residues for ChDHFR-TS were substituted in the models of P. carinii DHFR and TS and refined using noncrystallographic restraints between all monomers of DHFR and TS. Electron density for the ligands and the linker region was visible in the initial maps. The ligands were positioned, the linker region was built, and the entire complex was refined to an R-factor of 24.1% and Rfree = 25.8% using CNS. Water molecules were added where adequate hydrogen bonding to a donor or acceptor on the protein was visible and until there was no remaining difference density. The final R-factor was 22.5%, and Rfree was 24.5%. Data and refinement statistics are found in Table I.
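For readers unfamiliar with these statistics, the conventional crystallographic R-factor is the summed absolute difference between observed and calculated structure factor amplitudes divided by the summed observed amplitudes, and Rfree is the same quantity computed only over the reflections withheld from refinement. The sketch below illustrates that definition with toy numbers. The amplitudes, the seed, and the helper names are hypothetical, and scaling between observed and calculated amplitudes is omitted.

```python
import random

def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|)."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

def split_free_set(n_reflections, free_fraction=0.10, seed=0):
    """Randomly flag a fraction of reflection indices as the Rfree test set."""
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = round(n_reflections * free_fraction)
    return sorted(indices[:n_free])

# Toy amplitudes (hypothetical numbers, not data from this study).
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0, 90.0, 70.0, 50.0, 30.0, 10.0]
f_calc = [90.0, 85.0, 50.0, 45.0, 25.0, 80.0, 75.0, 40.0, 35.0, 12.0]

free = split_free_set(len(f_obs))
work = [i for i in range(len(f_obs)) if i not in free]
r_work = r_factor([f_obs[i] for i in work], [f_calc[i] for i in work])
r_free = r_factor([f_obs[i] for i in free], [f_calc[i] for i in free])
print(f"R = {r_work:.3f}, Rfree = {r_free:.3f}")
```

Because the test reflections never enter refinement, a small gap between R and Rfree, as reported here, suggests the model has not been overfitted.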
The electron density map for PfDHFR-TS was calculated using CNS and the structure factors for PfDHFR-TS (Protein Data Bank 1J3I) deposited in the Protein Data Bank. The deposited coordinates were used to calculate phase angles.

RESULTS AND DISCUSSION
Description of the Overall Structure-The structure of ChDHFR-TS illustrates that the enzyme is a homodimer based on the canonical dimer interface of TS (Fig. 2). Based on structural comparisons with other DHFR and TS proteins, the DHFR domain is 178 residues, the linker domain is 58 residues, and the TS domain is 284 residues, yielding a 63-kDa monomer. All residues from 3 to 521 are clearly defined in the electron density, allowing the entire protein model to be visualized. The bulk of the DHFR domains do not contact each other, but the linker polypeptide between the DHFR and TS domains crosses from one DHFR monomer (A) to the other DHFR monomer (B), noncovalently connecting the two DHFR domains, and then returns to form the TS monomer of A (see discussion below). In the view shown in Fig. 2, the DHFR active site of the left monomer faces the viewer, and the TS active site is on the side of the enzyme. The DHFR and TS active sites of the same monomer are ~74 Å apart along the surface of the enzyme and 45 Å apart if measured directly through the enzyme from the diaminopyrimidine ring of dihydrofolate to the quinazoline ring of CB3717. The molecules in the unit cell are arranged with a noncrystallographic quasi-5₁ screw axis, forming lines of molecules extending throughout the crystal.
Structure of the C. hominis DHFR Domain-The structure of the DHFR domain of ChDHFR-TS generally resembles that of several other DHFR proteins from eukaryotic organisms (16-18). However, DHFR from most other organisms, including L. major, has an eight-stranded β-sheet, whereas C. hominis DHFR has a nine-stranded β-sheet, in which the last four residues of the linker polypeptide form the ninth strand of the sheet.
The C. hominis DHFR active site reveals the conservation of catalytically important residues (17)(18)(19)(20). Electron density for the ligands, dihydrofolate and NADPH, was visible in the initial maps (Fig. 3a). In the ChDHFR-TS structure, dihydrofolate is bound in the active site in the same orientation as dihydrofolate in the human DHFR active site (21). NADPH is bound using similar interactions as noted in other structures of eukaryotic species of DHFR (17,18).
In ChDHFR-TS, the DHFR interface with the TS domain of the same monomer buries a combined surface area of ~1300 Å². The interface includes not only residues in the canonical TS and DHFR domains but also several residues that belong to the linker between the domains. In general, the TS residues in the interface are widely conserved across most species; the DHFR residues are not generally conserved. The interface between C. hominis DHFR and C. hominis TS is primarily hydrophobic and includes a number of van der Waals' interactions between hydrophobic side chains. There are two hydrogen bonds (Glu 139-Tyr 510 and Ser 169-Gln 486) between residues belonging to the canonical DHFR and TS domains. There are an additional three hydrogen bonds (Arg 210-Glu 276, Arg 215-Arg 275, and Arg 233-Asp 242) stabilizing the interface between residues in the linker and residues in TS. Overall, there are relatively few interactions between the residues in the canonical DHFR and TS domains, and the interactions that are present involve weak van der Waals' forces.

FIG. 1. A portion of the evolutionary tree (as described by Stechmann and Cavalier-Smith (2)). Protist species with available DHFR-TS sequence information are shown in color. Species in green belong to the short linker family, and species in red belong to the long linker family.
Structure of the C. hominis TS Domain-The overall structure of the TS domain of ChDHFR-TS strongly resembles that of human TS (22,23) and TS from other eukaryotic sources (24,25). The substrate, dUMP, and the cofactor mimic, CB3717, are bound to C. hominis TS in both active sites. The nucleophilic cysteine residue, Cys 402, is covalently attached to the C-6 position of dUMP. CB3717 is bound in a manner similar to that seen in other structures of monofunctional TS (24,26) and LmDHFR-TS (9).
Structure of the ChDHFR-TS Linker Domain-Several DHFR-TS proteins, notably from Apicomplexan protozoa, include a long linker between the DHFR and TS domains. The linker polypeptide, even within the Apicomplexans, varies significantly in length; the entire linker in C. hominis is 58 amino acids, the linker in Toxoplasma gondii is 72 amino acids, and the linker in P. falciparum is 89 amino acids. The linker in ChDHFR-TS plays an important role in the structure of the enzyme. In contrast to the structure of PfDHFR-TS (8), in which major portions of the linker domain are not built into the model, the entire linker domain of ChDHFR-TS is clearly visible in the electron density (Fig. 3b). The chain leaves DHFR of monomer A at residue 178, crosses to DHFR of monomer B, forms an 11-residue helix, termed the donated helix, that packs against the DHFR active site residues of monomer B, crosses back to DHFR of monomer A to form the last strand of the nine-stranded β-sheet of the DHFR domain, and then finally joins the TS domain of monomer A. The crossover is repeated for both monomers, resulting in four chains crossing the cleft between the DHFR domains: one leaving the A monomer and going to the donated helix (tether 1), one returning to A after forming the donated helix (tether 2), and the 2-fold related chains leaving and returning to B. There is one hydrogen bond between the linker polypeptides of the A and B monomers (Gln 192A with Gln 192B). There are several additional hydrogen bonds between tethers 1 and 2 within one monomer: Arg 230A with Gln 184A, Asp 201A with Arg 190A, and Asn 185A with Asn 187A. The donated helix (residues 196-207) packs tightly against the opposite monomer on the back side of its DHFR active site using mainly hydrophobic interactions (Table II). Clearly, all of the interactions of the donated helix are important in domain stabilization.
The donated helix may be implicated in the mechanism of the bifunctional enzyme. Genetic analysis, kinetic measurements, and molecular dynamics simulations identify a network of coupled motions that occur during DHFR catalysis (27). Molecular dynamics simulations suggest that Phe 31 (Escherichia coli numbering) approaches the bound dihydrofolate, directing the conformational changes that bring the substrate to the transition state (27,28). Phe 31 is strictly conserved, and mutations of this residue strongly decrease the rate of hydride transfer (29).
The donated helix in ChDHFR-TS contacts residues in C. hominis DHFR that are equivalent to those identified as being involved in the coupled motions of catalysis (see Fig. 4). Phe 36 in ChDHFR-TS is equivalent to Phe 31 in E. coli, and the helix containing Phe 36 and Phe 35 in the ChDHFR-TS active site packs against the donated helix from the linker of the opposite monomer. The closely fitting interaction between the donated helix and the catalytic residues implies that the donated helix may be affected by the conformational changes occurring during catalysis and may represent a signaling mechanism between the two monomers of the enzyme. The tethers crossing back to the originating monomer could possibly carry a signal revealing the catalytic status of the opposite monomer.
The human monofunctional DHFR positions several residues at the same location as the donated helix. Residues Tyr 162 , Pro 163 , Gly 164 , and Val 165 of human DHFR form a loop that extends away from the body of DHFR and makes very similar interactions with the back side of the active site. However, if the donated helix is crucial to a mechanism in the bifunctional enzymes, it must be significantly different from any mechanism in the monofunctional enzymes if, in fact, the monofunctional enzymes associate in the cell.
Comparison of All Bifunctional DHFR-TS Structures-The docking surface of C. hominis DHFR and C. hominis TS consists of mainly hydrophobic residues and does not contribute to the docking of DHFR on TS in a lock-and-key fashion. The docking area for DHFR on TS appears to provide only an attractive and not an orienting force. The fact that the contacts between DHFR and TS are nonspecific was previously recognized in an analysis of the LmDHFR-TS structure by Knighton et al. (9). In fact, the three structures of bifunctional DHFR-TS show different docking surfaces for DHFR on TS despite the conservation of TS residues in that area, giving further evidence that the TS surface provides no common orienting forces. A structural alignment of ChDHFR-TS, PfDHFR-TS, and LmDHFR-TS reveals that the linker polypeptides (58, 93, and 2 residues, respectively) play a crucial role in the orientation of the DHFR domain relative to TS. In the structures of all three enzymes, there are examples of dimers where the monomers are not related by crystallographic symmetry, providing evidence that the orientations of the monomers and the linker domains are determined within the molecule rather than by crystal packing forces. The short linker in LmDHFR-TS creates a taut tether, restricting the range of DHFR orientations relative to TS. In ChDHFR-TS, the linker creates an orienting force, composed of van der Waals' interactions and hydrogen bonds, that positions the DHFR monomers in relation to the TS domain (Fig. 5). In PfDHFR-TS, the donated helix is strongly negatively charged (Glu 285, Asp 284, and Asp 288), complementing the positively charged groove on DHFR into which it fits (Lys 56, Lys 187, Lys 180, Lys 160, Lys 227, and Lys 232) (Fig. 5). The electrostatic interactions of the donated helix in PfDHFR-TS are the primary determinants of the PfDHFR-TS orientation and are supplemented by the nonspecific van der Waals' interactions of the P. falciparum DHFR to P. falciparum TS interface.
A comparison of ChDHFR-TS and PfDHFR-TS reveals that the two linker domains are structurally similar despite the fact that C. hominis DHFR and P. falciparum DHFR do not contact TS in the same orientation. When the TS domains of ChDHFR-TS and PfDHFR-TS are aligned, P. falciparum DHFR is rotated 22.6° relative to C. hominis DHFR. The active site of DHFR, specifically the phosphate moiety near the adenine ring of NADPH, forms the pivot point of this rotation. The placement of the donated helix, at the back side of the opposite DHFR active site, is the same in ChDHFR-TS and PfDHFR-TS. When the orientation differences are taken into account, the structure of ChDHFR-TS is overlaid on PfDHFR-TS, and the electron density map for PfDHFR-TS is calculated from the structure factors deposited with the coordinates in the Protein Data Bank, electron density corresponding to tether 1 is evident in the PfDHFR-TS map, although its connectivity is poor. Overall, the location of the donated helix and the observation of electron density corresponding to tether 1 lead to the conclusion that the linker in PfDHFR-TS may also cross from one DHFR domain to the opposite and then cross back, similarly to ChDHFR-TS.
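A relative rotation such as the one quoted above is commonly read off the rotation matrix that relates two superposed domains, using the trace identity cos(theta) = (tr(R) - 1)/2. The text does not state how the angle was measured, so the sketch below is only an illustration of that identity. The two orientation matrices are hypothetical stand-ins chosen to differ by the quoted angle.

```python
import math

def rotation_z(angle_deg):
    """3x3 rotation matrix for a rotation of angle_deg about the z axis."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(m):
    """Transpose, which is also the inverse for a rotation matrix."""
    return [list(row) for row in zip(*m)]

def rotation_angle_deg(r):
    """Rotation angle of matrix r from its trace: cos(theta) = (tr(r) - 1) / 2."""
    cos_theta = (r[0][0] + r[1][1] + r[2][2] - 1.0) / 2.0
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))

# Hypothetical DHFR orientations after the TS domains have been superposed.
r_ch = rotation_z(10.0)
r_pf = rotation_z(32.6)
relative = matmul(r_pf, transpose(r_ch))  # takes the first orientation onto the second
print(f"relative rotation: {rotation_angle_deg(relative):.1f} degrees")
```

In practice the two matrices would come from least-squares superposition of the DHFR domains in a common TS frame; only the final angle extraction is shown here.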
Surprisingly, the distances between all active sites, measured directly through the enzyme, are conserved within 1 Å in both ChDHFR-TS and PfDHFR-TS (Table III). The impact of this conservation of active site distances is potentially important. If we hypothesize that a third conserved enzyme, perhaps serine hydroxymethyl transferase, is interacting in a complex composed of all of the folate cycle enzymes in the cell, the overall geometry of the interaction would be conserved across a wide variety of Apicomplexan species of the bifunctional enzyme.
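The comparison of active site distances can be expressed as a difference between two small distance tables, one per enzyme. The following is a minimal sketch of that bookkeeping. The coordinates are hypothetical placeholders, not values from any deposited model, and the site names are invented for illustration.

```python
import math
from itertools import combinations

def pairwise_distances(sites):
    """Return {(name_a, name_b): distance in Å} for every pair of named sites."""
    return {
        (a, b): math.dist(sites[a], sites[b])
        for a, b in combinations(sorted(sites), 2)
    }

def max_distance_difference(sites_1, sites_2):
    """Largest absolute difference between corresponding inter-site distances."""
    d1 = pairwise_distances(sites_1)
    d2 = pairwise_distances(sites_2)
    return max(abs(d1[pair] - d2[pair]) for pair in d1)

# Hypothetical active-site coordinates (Å) for one enzyme.
enzyme_1 = {
    "DHFR_A": (0.0, 0.0, 0.0),
    "TS_A": (45.0, 0.0, 0.0),
    "DHFR_B": (10.0, 40.0, 0.0),
    "TS_B": (50.0, 35.0, 10.0),
}
# Second enzyme: same layout, translated, with one site displaced by 0.5 Å.
enzyme_2 = {
    "DHFR_A": (5.0, 5.0, 5.0),
    "TS_A": (50.0, 5.0, 5.0),
    "DHFR_B": (15.5, 45.0, 5.0),
    "TS_B": (55.0, 40.0, 15.0),
}

worst = max_distance_difference(enzyme_1, enzyme_2)
print(f"largest difference: {worst:.2f} Å; within 1 Å: {worst < 1.0}")
```

Because only internal distances are compared, the check does not depend on how the two structures are positioned or oriented relative to each other.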
The orientation of L. major DHFR relative to L. major TS is drastically different from the orientation of DHFR to TS in ChDHFR-TS or PfDHFR-TS. LmDHFR-TS has a two-residue linker between the DHFR and TS domains. L. major DHFR is rotated upside down relative to C. hominis DHFR, causing the body of L. major DHFR to rest on the shoulder of the L. major TS domain. The DHFR and TS active sites are exposed on the same exterior side of the molecule in LmDHFR-TS. In contrast, in ChDHFR-TS, the DHFR and TS active sites are on orthogonal faces of the enzyme.
In addition to the different DHFR and TS orientations, ChDHFR-TS and LmDHFR-TS differ in the placement and function of the N-terminal amino acids of DHFR. ChDHFR-TS and PfDHFR-TS do not have an N-terminal extension, relative to the canonical fold of DHFR, which interacts with the TS domain. In LmDHFR-TS, the 22-residue N-terminal extension on DHFR fits into a groove on the exterior of TS and stabilizes the interaction between L. major DHFR and L. major TS. In contrast, the N terminus of ChDHFR-TS and PfDHFR-TS points upward away from TS and remains close to DHFR.
Sequence Analysis of DHFR-TS within Multiple Protist Families-Using the three structures analyzed above, we performed a functional analysis of all available complete sequences in the Euglenozoa and Alveolata families. This functional analysis aligns the residues corresponding to the canonical folds for DHFR and TS (Fig. 6). The beginning and the end of the DHFR and TS domains were determined by structural alignment for ChDHFR-TS, PfDHFR-TS, and LmDHFR-TS. The beginning and the end of the DHFR and TS domains of T. gondii (Fig. 6, TgDHFR-TS), Trypanosoma cruzi, and Trypanosoma brucei DHFR-TS were determined by sequence alignments, since the structures of these proteins have not yet been determined. DHFR-TS from Euglenozoa such as Leishmania, T. cruzi, and T. brucei all have an N-terminal extension relative to the start of the canonical DHFR fold and short, two-residue (Arg-Asn) linkers. The Apicomplexans Cryptosporidium and Toxoplasma do not have N-terminal extensions and generally have longer linkers. Secondary structure prediction suggests a helix within the linker of T. gondii DHFR-TS at positions 288-297. This helix may be, in fact, longer than nine residues, but has been predicted conservatively.
Plasmodium falciparum, an Apicomplexan, has a long linker and a partial (eight-residue) N-terminal extension, although this extension points away from TS and does not form an interaction with TS (8) similar to that seen in LmDHFR-TS. The functional analysis leads to the conclusion that there are two families of DHFR-TS structures: a long linker family that resembles ChDHFR-TS and PfDHFR-TS and includes a donated helix (Fig. 7, a and b) and a short linker family that resembles LmDHFR-TS and uses an N-terminal extension that appears to stabilize the DHFR-TS interaction (Fig. 7, c and d).

[Figure legend fragment: All surfaces are contoured from −5 kT/e (red) to +5 kT/e (blue). The A monomer is kept as a single surface, and the B monomer is divided between the DHFR domain and the TS domain (the linker is omitted for clarity in monomer B). The interface area between DHFR (B) and TS (B) is opened slightly to show details of the docking region. In ChDHFR-TS, Arg 210A and the area surrounding Glu 276B (including Glu 31B and the backbone oxygen of Cys 164B; these residues not shown for clarity) form hydrogen bonds. Inset 1 in PfDHFR-TS is defined as residues 20-36.]
Several partial sequences of DHFR-TS from other protist families, three of which are distantly related to the Alveolates or Euglenozoa, were analyzed and found to fall into one of the two structural families. The sequence of DHFR-TS from Tetrahymena pyriformis (all partial sequences, discussed in Ref. 2, include a portion of the DHFR domain, the entire linker domain, and a portion of the TS domain), a member of the Alveolates (see Fig. 1), but not an Apicomplexan, also has a long, 58-residue linker domain between DHFR and TS with helical content corresponding to the same region as ChDHFR-TS. The sequence of the DHFR-TS gene from Diplonema ambulator (2), another protozoan in the Euglenozoa family, but not a kinetoplastid like Leishmania or the trypanosomes, confirms that it has a short, two-residue linker (Val-Asn) between the DHFR and TS domains. Sequences of DHFR-TS from Chlamydaster sterni (from the Heliozoa phylum), Cercomonas longicauda (from the Cercozoa phylum), and Amastigomonas debruynei (from the Apusozoa phylum) also have short (2-6-residue) linker domains. Apusozoa is on the first branch after the DHFR-TS fusion event. Heliozoa and Cercozoa are on the second branch, the same branch off which the Euglenozoa originate, but are only distantly related to the Euglenozoa. Based on the length of these various linker domains, we hypothesize that the structure of Tetrahymena more closely resembles the Apicomplexan DHFR-TS and that the structures of DHFR-TS from Apusozoa, Heliozoa, and Cercozoa more closely resemble the structure of LmDHFR-TS.
The long linker family includes the Apicomplexans and may extend to the Alveolates (shown in red in Fig. 1); the short linker family includes the kinetoplastids and may extend to the Euglenozoa, Cercozoa, Heliozoa, and Apusozoa (shown in green in Fig. 1). The two classes appear to be evolutionarily distinct and appear to have evolved separately after the DHFR-TS gene fusion event. The two families of DHFR-TS structures confirm the separate branches shown in the evolutionary tree derived by Stechmann and Cavalier-Smith (2). Since Apusozoa represent the first branch after the DHFR-TS fusion event and appear to belong to the short linker family, the short linker family may represent the primitive condition for DHFR-TS.
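The two-family assignment rests largely on linker length, so it can be summarized as a simple rule applied to the lengths quoted in the text. The sketch below is illustrative only: the 20-residue cutoff is a hypothetical value chosen to separate the reported short (2-6 residue) and long (58 residues and above) linkers, and the first TS residue number in the usage example is an assumption implied by a 58-residue linker following residue 178. The text quotes both 89 and 93 residues for the P. falciparum linker; either value falls in the long class.

```python
# Linker lengths (residues) as quoted in the text.
LINKER_LENGTHS = {
    "Cryptosporidium hominis": 58,
    "Toxoplasma gondii": 72,
    "Plasmodium falciparum": 89,
    "Tetrahymena pyriformis": 58,
    "Leishmania major": 2,
    "Diplonema ambulator": 2,
}

# Hypothetical cutoff; any value between 6 and 58 separates the reported lengths.
LONG_LINKER_MIN = 20

def linker_length(dhfr_end, ts_start):
    """Number of residues strictly between the last DHFR and first TS residue."""
    return ts_start - dhfr_end - 1

def linker_family(length, cutoff=LONG_LINKER_MIN):
    """Assign a DHFR-TS sequence to the long or short linker family by length."""
    return "long linker" if length >= cutoff else "short linker"

for species, length in sorted(LINKER_LENGTHS.items()):
    print(f"{species}: {length} residues -> {linker_family(length)}")

# Domain boundaries for illustration: DHFR ends at 178; TS start of 237 is assumed.
print(linker_length(178, 237))
```

Length alone is a coarse criterion; the structural argument in the text also depends on the presence of a donated helix or an N-terminal extension, which this rule does not examine.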
Electrostatic Channeling-Electrostatic channeling, using several positively charged residues aligned between the TS and DHFR active sites, has been postulated to guide dihydrofolate from the TS active site to the DHFR active site in the LmDHFR-TS and PfDHFR-TS enzymes. In L. major TS, three basic side chains (Lys 282, Arg 283, and Arg 287) are proposed to bind the polyglutamylated tail of the folate. These basic residues are adjacent to a highly basic insertion in L. major DHFR (Lys 66, Lys 67, Lys 72, and Lys 73), forming a putative electrostatic "highway" (11) in LmDHFR-TS. In PfDHFR-TS, there are two positively charged grooves lined with residues conserved between plasmodial DHFR-TS enzymes and hypothesized to be involved in electrostatic channeling (8). Biochemical experiments show that the rates of substrate transfer in T. gondii DHFR-TS are altered in the presence of varying amounts of salt, lending evidence to the electrostatic channeling hypothesis (30). Rapid chemical quench assays (10) verify that, for bifunctional LmDHFR-TS enzymes, there is no lag in tetrahydrofolate production or accumulation of dihydrofolate, unlike the case for monofunctional enzymes. Brownian dynamics simulation experiments determined that almost all substrate molecules with charge −2 leaving the TS active site reached the DHFR active site in LmDHFR-TS (31,32). However, in recent experiments (33), charge reversal or charge neutralization mutants in the putative channeling region of LmDHFR-TS were shown not to interfere with the kinetic channeling of substrate.
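The salt dependence cited above is what one would expect if charge-charge attraction contributes to substrate transfer, because dissolved ions screen electrostatic interactions over a characteristic distance, the Debye length. The sketch below evaluates the standard textbook expression for that length. It is a generic estimate rather than an analysis from the cited work, and it assumes water at 25 °C with a relative permittivity of 78.5.

```python
import math

EPSILON_0 = 8.8541878128e-12   # vacuum permittivity, F/m
K_B = 1.380649e-23             # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19     # elementary charge, C
N_A = 6.02214076e23            # Avogadro constant, 1/mol

def debye_length_angstrom(ionic_strength_molar, temperature_k=298.15,
                          rel_permittivity=78.5):
    """Debye screening length (Å) for an aqueous electrolyte.

    lambda_D = sqrt(eps_r * eps_0 * k_B * T / (2 * N_A * e^2 * I)),
    with the ionic strength I converted from mol/L to mol/m^3.
    """
    ionic_strength_si = ionic_strength_molar * 1000.0
    numerator = rel_permittivity * EPSILON_0 * K_B * temperature_k
    denominator = 2.0 * N_A * E_CHARGE ** 2 * ionic_strength_si
    return math.sqrt(numerator / denominator) * 1e10

for salt in (0.01, 0.1, 0.5):
    print(f"I = {salt:.2f} M -> Debye length = {debye_length_angstrom(salt):.1f} Å")
```

At ionic strengths near physiological values the screening length is on the order of 10 Å, far shorter than the tens of angstroms separating the two active sites, so any electrostatic guidance would have to act through a continuous path of nearby charges rather than a single long-range attraction.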
There is currently no evidence supporting or denying substrate channeling in ChDHFR-TS. However, the structure of ChDHFR-TS does not show a pattern of positively charged residues like the one seen in LmDHFR-TS, giving no structural support to the theory of electrostatic channeling in ChDHFR-TS. Three of the basic residues implicated in the electrostatic highway, Lys 282 , Arg 283 , and Arg 287 in L. major TS, are conserved and located near the C. hominis TS active site (Lys 284 , Lys 285 , and Arg 289 in ChDHFR-TS). One of the four residues in the basic insertion found in L. major DHFR is conserved in C. hominis DHFR (Lys 48 ), but its location, 46 Å from the positively charged residues near the TS active site, is drastically different from that of the equivalent residue in L. major DHFR, due to the significantly different docking of DHFR on TS in ChDHFR-TS relative to that in LmDHFR-TS. An electrostatic potential map of ChDHFR-TS does not reveal any other pattern of positively charged residues that would create an alternate electrostatic highway if, in fact, ChDHFR-TS does channel substrate (Fig. 8). Therefore, a pattern of positively charged residues between the TS and DHFR active sites is not a common trait between all of the bifunctional DHFR-TS enzymes.
An alternative, previously proposed model of substrate channeling holds that dynamic motion can bring the two active sites close together (11). Although the ChDHFR-TS structure cannot rule out the possibility that the two active sites approach each other, this seems very unlikely, since the DHFR and TS active sites are on orthogonal faces of the enzyme and are cross-tied by the linker between the domains. In fact, the opening of the DHFR active site points toward the 2-fold axis between the two monomers and the opening of the TS active site points in the opposite direction, away from the 2-fold axis. The enzyme would need to undergo a major rearrangement in order to bring the two sites into proximity.
Conclusion-The crystal structure of DHFR-TS from C. hominis reveals a long linker and a new interaction, the donated helix, in the bifunctional DHFR-TS protein. The structures of the Apicomplexan DHFR-TS proteins, ChDHFR-TS and PfDHFR-TS, show the donated helix, and the structure of the kinetoplastid LmDHFR-TS does not. A functional alignment, partitioning several sequences of DHFR-TS proteins into DHFR, linker, and TS domains, suggests that there are two structural families of DHFR-TS proteins that evolved separately after the gene fusion event. The observation of these two families coincides with other genetic evidence that separates the Alveolates, of which the Apicomplexans are a member, and the Euglenozoa, of which Leishmania is a member. Furthermore, if the donated helix, at a conserved location with respect to TS and the opposite DHFR active site, confers an important structural or functional role to the Apicomplexans, it would confirm the advanced evolutionary placement of the Alveolates over the Euglenozoa in the phylogenetic tree (2) (see Fig. 1). Finally, the structure of ChDHFR-TS does not show a pattern of positively charged residues between the DHFR and TS active sites, suggesting that it belongs to a different class of enzymes that may not use electrostatic channeling in the transfer of dihydrofolate from the TS active site to the DHFR active site.