Structure of the Bifunctional dCTP Deaminase-dUTPase from Methanocaldococcus jannaschii and Its Relation to Other Homotrimeric dUTPases

The bifunctional dCTP deaminase-dUTPase (DCD-DUT) from Methanocaldococcus jannaschii catalyzes the deamination of the cytosine moiety in dCTP and the hydrolysis of the triphosphate moiety forming dUMP, thereby preventing uracil from being incorporated into DNA. The crystal structure of DCD-DUT has been determined to 1.88-Å resolution and represents the first known structure of an enzyme catalyzing dCTP deamination. The functional form of DCD-DUT is a homotrimer wherein the subunits are composed of a central distorted β-barrel surrounded by two β-sheets and four helices. The trimeric DCD-DUT shows structural similarity to trimeric dUTPases at the tertiary and quaternary levels. There are also additional structural elements in DCD-DUT compared with dUTPase because of a longer primary structure. Four of the five conserved sequence motifs that create the active sites in dUTPase are found in structurally equivalent positions in DCD-DUT. The last 25 C-terminal residues of the 204-residue-long DCD-DUT are not visible in the electron density map, but, analogous to dUTPases, the C terminus is probably ordered, closing the active site upon catalysis. Unlike other enzymes catalyzing

dUMP is the origin of thymidine precursors for DNA in all organisms. It is the substrate of thymidylate synthase, which catalyzes the formation of dTMP (1). The main pathway for dUMP generation differs markedly between different organisms. dUMP is produced on the monophosphate level by the action of dCMP deaminase in eukaryotes and Gram-positive bacteria, whereas in Gram-negative bacteria such as Escherichia coli and some Archaea, dCTP is deaminated and dephosphorylated. dCTP deaminase (EC 3.5.4.13) and dUTPase (EC 3.6.1.23) catalyze the two consecutive steps by which dUMP is formed in E. coli (dCTP → dUTP → dUMP). The hyperthermophilic archaeon Methanocaldococcus jannaschii produces a bifunctional dCTP deaminase-dUTPase (DCD-DUT) with the ability to catalyze both reactions (Fig. 1) (2,3). The gene that encodes this enzyme was first annotated as coding for a dCTP deaminase, but enzymatic studies revealed the bifunctionality, which implies that dUTP is never released from the enzyme.
dUTP is also formed upon spontaneous deamination of dCTP, and the reaction catalyzed by dUTPase is therefore essential in keeping the cellular concentration of dUTP low to suppress misincorporation of uracil into DNA. If such an event occurs, the DNA is restored by an inherent repair mechanism. However, when dUTPase is deficient, the organism will not survive (4,5). The dUTPases exist in different oligomeric forms. The homotrimeric form of dUTPases is by far the most common, but there are also monomeric and homodimeric variants of the enzyme. dCTP deaminases, together with DCD-DUT from M. jannaschii, display sequence similarities to homotrimeric dUTPases (6,7). The homotrimeric dUTPases contain five conserved motifs (8), four of which are also found in dCTP deaminases (7). The four matching motifs are part of the three identical active sites in homotrimeric dUTPases, and each active site harbors motifs 1, 2, and 4 from one subunit of the trimer and motif 3 from another subunit. The third subunit of the dUTPase trimer is also part of the active site by contributing the fifth motif. This motif is located at the C terminus, which is mobile in the absence of substrate, but, upon catalysis, it closes the active site (9-12).
Deamination of cytosine compounds also takes place in the pyrimidine salvage pathways, wherein preformed cytosine and (deoxy)cytidine are converted into uracil and (deoxy)uridine, respectively. The first of these reactions is catalyzed by cytosine deaminase, an enzyme that is present only in bacteria and fungi. Cytidine deaminase, which catalyzes the second reaction, is, on the other hand, present in almost all organisms, including higher eukaryotes. Deamination of a cytosine ring implies an attack by a water molecule (or hydroxide ion) and subsequent expulsion of the amino group. Cytidine deaminase, dCMP deaminase, and yeast cytosine deaminase utilize zinc ions, whereas bacterial cytosine deaminase makes use of an iron ion for formation of the nucleophilic hydroxide ion used in the reaction. dCTP deaminase has been shown not to contain any metal ions. Therefore, dCTP deaminase and the highly similar DCD-DUT must operate with a different catalytic machinery from the other enzymes studied that deaminate cytosine compounds. Nevertheless, as for dUTP hydrolysis by dUTPases, dCTP deaminase and DCD-DUT require magnesium ions in order to be catalytically active. Studies of Salmonella typhimurium dCTP deaminase suggest that the true substrate of the reaction is the magnesium-dCTP complex (13).
We have undertaken to determine the structure of DCD-DUT from M. jannaschii to obtain information on the structural relationship between this enzyme and the homotrimeric dUTPases, for which crystal structures are available from five different organisms (11,12,14,15). Moreover, the DCD-DUT structure provides information about the mechanism for generation of the nucleophile in this type of deaminase. There is still no crystal structure available for any monofunctional dCTP deaminase.

EXPERIMENTAL PROCEDURES
Crystallization-The MJ0430 gene from M. jannaschii was expressed in E. coli, and the protein was purified as described by Björnberg et al. (2). Crystal Screen I from Hampton Research (16) was used for initial crystallization screening using the hanging drop vapor diffusion technique. Small crystals were obtained in solution 36 (8% polyethylene glycol (PEG) 8000, 0.1 M Tris-HCl, pH 8.5), and the conditions were optimized. Crystals used for diffraction experiments were grown at 20 °C with a hanging drop of 2 μl of 3 mg/ml protein in 20 mM Tris-HCl, pH 8.5, mixed with 2 μl of mother liquor (5% PEG 8000, 0.1 M Tris-HCl, pH 8.5) and equilibrated over 1 ml of mother liquor. The crystals grew to a size of 0.3 × 0.3 × 0.3 mm in 3 days. The heavy atom derivative crystal was obtained by the addition of 2 mM lead acetate to the mother liquor.
Data Collection-Diffraction data were collected under cryogenic conditions (100 K) at beamline I711, MAX-lab, Lund University, Sweden (17) on a MAR Research CCD detector. One hour prior to data collection, the crystal was transferred to mother liquor to which ethylene glycol had been added to 25% for cryoprotection. Autoindexing, data reduction, and scaling were performed with programs from the HKL suite (18). The crystals belong to the cubic space group P2₁3 (a = b = c = 111.1 Å), and two protein chains per asymmetric unit give a reasonable Matthews coefficient of 2.44 Å³/Da, corresponding to ~50% solvent content.
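The Matthews coefficient quoted above can be reproduced from the cell edge and the subunit mass. The following is a minimal sketch rather than the calculation actually used by the authors: it assumes 12 asymmetric units per unit cell for space group P2₁3, takes the 23.4-kDa subunit mass stated under "Results," and uses the conventional protein specific-volume constant of 1.23 Å³/Da for the solvent estimate. All function and variable names are illustrative.

```python
# Matthews coefficient (V_M) and approximate solvent content for the
# DCD-DUT crystal, using values quoted in the text.

CELL_EDGE_A = 111.1             # cubic cell edge a = b = c in Å (from the text)
ASU_PER_CELL = 12               # asymmetric units per cell in P2(1)3 (assumption stated in lead-in)
CHAINS_PER_ASU = 2              # two protein chains per asymmetric unit (from the text)
SUBUNIT_MASS_DA = 23_400.0      # 23.4 kDa per subunit (from the Results section)
PROTEIN_SPECIFIC_VOLUME = 1.23  # Å^3/Da, conventional constant (assumption)


def matthews_coefficient(cell_volume, asu_per_cell, chains_per_asu, chain_mass):
    """Return V_M in Å^3/Da: cell volume divided by the protein mass in the cell."""
    return cell_volume / (asu_per_cell * chains_per_asu * chain_mass)


def solvent_fraction(v_m, specific_volume=PROTEIN_SPECIFIC_VOLUME):
    """Return the approximate solvent fraction, 1 - specific_volume / V_M."""
    return 1.0 - specific_volume / v_m


cell_volume = CELL_EDGE_A ** 3  # cubic cell, so V = a^3
v_m = matthews_coefficient(cell_volume, ASU_PER_CELL, CHAINS_PER_ASU, SUBUNIT_MASS_DA)
solvent = solvent_fraction(v_m)

print(f"V_M = {v_m:.2f} Å^3/Da")
print(f"solvent content = {solvent:.0%}")
```

With these inputs the sketch gives a V_M of about 2.44 Å³/Da and a solvent fraction close to one half, in line with the values reported above.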
Structure Determination and Refinement-The three-dimensional structure was determined using the method of single isomorphous replacement with anomalous scattering. The wavelength of the data collection (1.098 Å) was not ideal for obtaining an optimal anomalous signal from lead, but, nonetheless, the positions of two lead atoms were found with the program SOLVE (19), which may be attributed to the high redundancy and accuracy of the data. RESOLVE (20) was used for density modification and automatic tracing. The phases were extended from 2.5-Å resolution (lead derivative) to 1.88-Å resolution (native data) using the ARP-wARP program (21), and a free atom model was produced. From this model, 97% of the 356 amino acid residues contained in the final model could be automatically traced by ARP-wARP. After one step of refinement with REFMAC5 (22), the remaining amino acid residues of the model were manually built in O (23). Cycles of refinement with REFMAC5 and water picking with ARP-wARP (21) were performed. During refinement, non-crystallographic symmetry (NCS) restraints were applied to the two molecules in the asymmetric unit. Water molecules related by NCS were detected with WATNCS (24), and these were also added to the NCS restraints. The quality of the model was checked with PROCHECK (25) and WHATIF (26) as refinement progressed. The structure factors and coordinates have been deposited in the Protein Data Bank with accession code 1OGH.

RESULTS
Structure Determination-The structure of DCD-DUT from M. jannaschii was solved by single isomorphous replacement with anomalous scattering using a lead derivative. The quality of the initial electron density map was sufficient for automatic tracing of most residues of the final model. The model was refined to an R-factor of 14.6% (R_free = 18.4%). The statistics of the data and the refinement are summarized in Tables I and II, respectively.
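For readers less familiar with the refinement statistics, the conventional crystallographic R-factor can be sketched as follows. This is an illustration of the standard definition, R = Σ||F_obs| − |F_calc|| / Σ|F_obs|, and not the REFMAC5 implementation. R_free is the same quantity evaluated over a set of reflections excluded from refinement. The amplitudes below are hypothetical toy values, and the sketch assumes that the calculated amplitudes have already been placed on the same scale as the observed ones.

```python
def r_factor(f_obs, f_calc):
    """Return sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over paired reflections."""
    if len(f_obs) != len(f_calc):
        raise ValueError("observed and calculated lists must have equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Hypothetical toy amplitudes (not data from this study).
# "Working" reflections are those used in refinement.
work_obs = [100.0, 80.0, 60.0, 40.0]
work_calc = [90.0, 85.0, 55.0, 45.0]

# "Free" reflections are set aside and never used in refinement.
free_obs = [50.0, 30.0]
free_calc = [44.0, 34.0]

r_work = r_factor(work_obs, work_calc)
r_free = r_factor(free_obs, free_calc)

print(f"R      = {r_work:.3f}")
print(f"R_free = {r_free:.3f}")
```

Because the free reflections do not influence the model, R_free is normally somewhat higher than R, as is the case for the values reported above (18.4% versus 14.6%).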
The subunit in homotrimeric DCD-DUT is composed of 204 amino acid residues with a molecular mass of 23.4 kDa. The asymmetric unit in the crystal contains two subunits (chains A and B), which form two independent homotrimers by the crystallographic 3-fold symmetry. The A chain is composed of residues A 1-175 and the B chain of residues B 1-181. Of the 356 amino acid residues, 18 were modeled with double conformations of their side chains. Although there is clear electron density for all amino acid residues in the model, which could be refined to low R-values, the remaining C-terminal residues (A 176-204 and B 182-204) could not be located in the electron density map. The Cα atoms (1-175) of the two chains can be superimposed with a root mean square deviation of 0.30 Å using default parameters in the program O (23). The model also contains 400 water molecules, of which 192 are related by NCS. In the Ramachandran plot there are no nonglycine residues in disfavored regions except for residues Lys20 and Pro21, which form a cis-peptide bond. This proline residue is strongly conserved among dCTP deaminase amino acid sequences (Fig. 5, supplementary material).
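The root mean square deviation quoted for the two chains is the square root of the mean squared distance between equivalent Cα atoms after superposition. The sketch below shows only this final step and assumes that the two coordinate sets have already been superimposed. It does not reproduce the least-squares fitting performed in O, and the coordinates are hypothetical toy values rather than atoms from the deposited model.

```python
import math


def rmsd(coords_a, coords_b):
    """Return the RMSD (in the input units) between two pre-superimposed,
    equally long lists of (x, y, z) coordinates for equivalent atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical Cα positions in Å; each pair of atoms is displaced by 0.3 Å.
chain_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
chain_b = [(0.3, 0.0, 0.0), (3.8, 0.3, 0.0), (7.6, 0.0, 0.3)]

value = rmsd(chain_a, chain_b)
print(f"RMSD = {value:.2f} Å")
```

In practice the same function would be applied to the 175 equivalent Cα positions of chains A and B once a fitting program has brought them into a common frame.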
The subunits related by crystallographic 3-fold symmetry give rise to a homotrimeric structure, the presumed active form of the enzyme (Fig. 2b). The surface perpendicular to the 3-fold axis has the shape of an equilateral triangle with a side of ~40 Å. The thickness of the trimer along the 3-fold axis is 50 Å. The N-terminal residues of the three subunits are buried in the interior of the trimer, whereas the C-terminal residues protrude into the solvent region. The last five residues from chain B in the model form an additional β-strand on the β-arm S3 of the A-chain (A 56-61) in the crystal. These interactions are not part of the homotrimer contacts and may be considered a crystal-packing artifact. The long loop between β3 and β4 intrudes into the next subunit, forming an important intersubunit interaction. An analysis of the interactions of two of the subunits in the trimer with the Protein-Protein Interaction Server (28) gives a value of 1677 Å² for the interface-accessible surface area with residues from eight different segments. Of the residues in the interface, 65% are non-polar and 35% are polar. The analysis was performed including all the NCS-related water molecules in the asymmetric unit, and, of these, 79 form bridges between the subunits.
Structural Similarity to Homotrimeric dUTPases-Fig. 3a shows the superimposition of one subunit of DCD-DUT from M. jannaschii on one subunit of dUTPase from feline immunodeficiency virus (12). This dUTPase structure has been chosen as an illustration because it has an ordered C terminus. The two crystal structures superimpose with a root mean square deviation of 1.8 Å for 92 Cα atoms as determined using default parameters in the program O (23). Panels b and d in Fig. 2 display schematics of the same trimers in equivalent views. There are additional structural features in the DCD-DUT structure, as reflected by its additional 71 amino acid residues. This is made clearer in the topology diagrams in Fig. 3, b and c, as well as in the structure-based sequence alignment in Fig. 4, where the crystal structures of five dUTPases of different origins have been superimposed with the DCD-DUT structure. Structural elements that are present in DCD-DUT but not in dUTPase are the anti-parallel β-arm S3 and α-helices α1 and α2. The N-terminal α1 lies on top of the homotrimer as seen in the view of Fig. 2b. The β-arm S3, with β3 and β4 and their interjacent loop, gives rise to additional trimerization interactions, and the long α-helix α2 and β-strand β6 generate an extension of DCD-DUT along the 3-fold axis of the trimer, as compared with dUTPases.
Active Site-There are several crystal structures in the Protein Data Bank of dUTPases in complex with dUMP, dUDP, or dUTP, which all map to the same position in the trimer as illustrated in Fig. 2d. Fig. 2e shows a close-up of the presumed active site of DCD-DUT with dUDP and a strontium ion from the superimposed equine infectious anemia virus dUTPase (PDB code 1DUC). This structure has been chosen because it is the only one where a metal ion is bound in the active site (15). Like dUTPases, DCD-DUT requires a divalent metal ion such as magnesium to be active. dUDP and the strontium ion can be contained in DCD-DUT without any clashes with the protein.
The binding site for the pyrimidine ring is occupied by two well ordered water molecules (w359 and w361).
Residues from two of the subunits of the trimers contribute to the active site in DCD-DUT, as illustrated in Fig. 2e. Among these residues, three are conserved in dCTP deaminases but not in dUTPases, namely Arg122, Thr130, and Glu145 (Fig. 4; Fig. 5, supplementary material). In the active site, residue Phe138 seems to serve the same role as the corresponding residue in dUTPases, which is predominantly a tyrosine residue. The ring of this residue stacks with the deoxyribose moiety of the substrate (Fig. 2e). In a few cases, for example in the mouse mammary tumor virus dUTPase, this residue is also a phenylalanine (29) and, in other dCTP deaminases, it is a tryptophan residue, which agrees with the stacking ability.
Fig. 2. Ribbon views of the subunit (a) and the homotrimer (b) of DCD-DUT from M. jannaschii. The β-sheets of the subunit are displayed in different shades of blue. The three subunits of the trimer are shown in red, blue, and gray, respectively. c, topology diagram of the dCTP deaminase subunit with the colors of the β-sheets corresponding to those in panel a. d, ribbon view of the feline immunodeficiency virus

DISCUSSION
The overall fold of DCD-DUT is similar to that of homotrimeric dUTPases, although DCD-DUT harbors additional structural features (Fig. 3). The position of the active site could be localized based on superimpositions of DCD-DUT with structures of dUTPases of different origins (Fig. 2e). Four of the conserved regions from dUTPase sequence alignments are also found in the dCTP deaminase sequences (Fig. 5, supplementary material). The sequence conservation of motif 1 is less pronounced than that of the other three, but the structural correspondence of this motif is still unambiguous. Two of the motifs (2 and 3) are most likely, as is the case in dUTPase, involved in base and sugar recognition as well as phosphate binding. The other two motifs (1 and 4) are found at the far end of the substrate binding region, again as in dUTPases.
The fifth conserved sequence motif in the dUTPases (motif 5), located in the C terminus, is not preserved in dCTP deaminases. However, the C-terminal region in dCTP deaminases contains a different conserved amino acid sequence (residues 184-190 in M. jannaschii DCD-DUT) (Fig. 5, supplementary material). These conserved residues could be involved in closing the active site, as has been shown for dUTPases (9-12). The C terminus is indeed required for activity of DCD-DUT, as has been found for a C-terminally truncated form of the enzyme (2).
The trimer interface in DCD-DUT contains only a few invariant residues among dCTP deaminases (Asp49, Leu97, Thr130, and Glu145 in M. jannaschii DCD-DUT). Therefore, the nature of the interaction with 65% non-polar and 35% polar residues with 79 bridging water molecules for M. jannaschii DCD-DUT may not be typical for dCTP deaminases. A similar trend is observed in the dUTPases, wherein the character of the subunit interactions varies from exclusively hydrophobic (E. coli dUTPase) to alternating layers of positively and negatively charged residues that contain numerous water molecules (human dUTPase) (30).
dUTPases contain a proline residue that is considered to be a hinge, which is important for correct bending of the C-terminal tail so that it reaches across the trimer to the correct active site (Fig. 2d) (31). No such curvature is observed in the structure of M. jannaschii DCD-DUT, where the last visible residues in the C terminus continue in a relatively straight line (Fig. 2b). This may not represent the conformation in solution, and, as mentioned previously, the β-strand formed by the last residues in the C terminus of protein chain B may be regarded as a crystal-packing artifact. There are no indications that the C terminus of DCD-DUT could not cross around the adjacent subunit of the trimer and reach the active site farthest away, as is the case for dUTPases (Fig. 2d). On the other hand, the number of amino acid residues in the C terminus is even sufficient for going in the other direction (counterclockwise in Fig. 2b) and reaching the third active site. The cis-peptide bond between Lys20 and Pro21 in DCD-DUT is placed at the end of β-strand β1 before the peptide chain enters the 3₁₀-helix γ1. The bond is placed on the protein surface and, though there is no obvious reason for its presence, it may be involved in interactions with the C terminus upon catalysis.
Fig. 2 legend (continued): dUTPase homotrimer (PDB code 1F7R) that corresponds to the view in panel b. dUDP is shown in ball-and-stick representation. e, stereo view of M. jannaschii DCD-DUT active site with dUDP and strontium from the equine infectious anemia virus dUTPase complex structure (PDB code 1DUC) superimposed. Residues from the different subunits of the trimer are shown in yellow and orange, respectively. f, stereo view of the hydrogen bonding network, shown with broken lines, in the region of the active site in DCD-DUT where the deamination reaction is assumed to take place. Panels a, b, d, e, and f were prepared with MOLSCRIPT (36) and Raster3D (37).
In motif 3, the aspartate residue corresponding to Asp135 is strictly conserved in dUTPases and strongly conserved among the dCTP deaminases (Fig. 5, supplementary material). Crystal structures of dUTPases complexed with substrate analogues show that the carboxylate group of this aspartate is hydrogen bonded to the 3′-hydroxy group of deoxyribose. In the dUTPase from E. coli in complex with dUDP, this carboxylate group (Asp90) also interacts with a water molecule, appropriately positioned for a nucleophilic in-line attack on the α-phosphate. It has therefore been suggested that this residue may act as a general base in dUTP hydrolysis. Site-directed mutagenesis of the residue in other dUTPases has consistently generated an inactive enzyme (32,33). In DCD-DUT, Asp135 was recently mutated to an asparagine residue by Li et al. (3), and this enzyme had neither deaminase nor dUTPase activity. Whether the residue is essential for catalysis or substrate binding or both remains to be answered, as an aspartate or glutamate residue is also found at this position in monofunctional dCTP deaminases.
A crucial role as general acid/base catalyst has been ascribed to a glutamic acid residue in cytidine deaminases, namely Glu104 in dimeric cytidine deaminase from E. coli (34) and Glu55 in tetrameric cytidine deaminase from Bacillus subtilis (35). The structures of these enzymes were determined using transition state analogues of cytidine with a hydroxyl group attached to a tetrahedral C4 atom. Assisted by a firmly bound zinc atom, the glutamate residue is supposed to be involved in both the generation of the nucleophilic hydroxide ion and the protonation of N3. Li and co-workers have mutated the strongly conserved Glu145 to glutamine in DCD-DUT (3). The mutant enzyme was devoid of deaminase activity but showed 25% residual dUTPase activity. Glu145 may have a role analogous to that of the residue in cytidine deaminases, because it is found in the active site close to the plausible position of the pyrimidine ring (Fig. 2e).
In cytidine deaminases, dCMP deaminases, and cytosine deaminases, a metal ion such as zinc or iron is used for water activation prior to the nucleophilic attack. dCTP deaminase and DCD-DUT do not contain any metal ions, as has been examined by energy dispersive x-ray fluorescence for dCTP deaminase from E. coli and seen in the crystal structure described here. Hence, formation of the nucleophile that performs the attack on the C4 atom of the pyrimidine ring must occur in a different way. The region of the active site in DCD-DUT, where the deamination reaction is assumed to take place, contains a network of hydrogen bonds involving well defined water molecules and the side chains of five amino acid residues, Ser118, Arg122, His128, Thr130, and Glu145 (Fig. 2f). All of these amino acid residues, except for His128, are invariant among dCTP deaminases (Fig. 5, supplementary material). His128 is substituted by an asparagine residue in some dCTP deaminases, a replacement that may preserve the hydrogen bonding abilities of the histidine residue. There is a very narrow pocket formed by motif 3 in homotrimeric dUTPases, where the O4 atom of uracil is hydrogen bonded to a water molecule that is held in place by hydrogen bonds to main chain atoms (11,30). In DCD-DUT, the His128 side chain occupies the position equivalent to this water molecule, but, nevertheless, binding of uracil is possible (Fig. 2e).
The active site (Fig. 2f) contains three water molecules, i.e. w326, w350, and w359. Both w326 and w359 are hydrogen bonded to residues that are strictly conserved among the dCTP deaminases, which makes them candidates as nucleophiles for the deamination reaction. w359 is hydrogen bonded to a positively charged side chain (Arg122), which could favor its tendency toward deprotonation and conversion to a hydroxyl group. However, it is also hydrogen bonded to Glu145, which counterbalances the positive charge. If Glu145 interacts with the pyrimidine ring, as in cytidine deaminase (34,35), this could also favor the conversion of w359 to a nucleophilic hydroxide ion. Binding of the substrate may perturb the dynamic hydrogen bonding network, altering the positions of the protons. The structure of the enzyme in complex with an inhibitor or product is required to resolve this issue and may give structural information on the role of the flexible C terminus.
Fig. 4 legend (partial): ... (12), Mycobacterium tuberculosis (PDB code 1MQ7), and human (11). The numbering on top of the sequences is according to M. jannaschii DCD-DUT, as are the secondary structure elements that are shown above the sequences with twisted rods for α-helices and arrows for β-strands. The numbering of the secondary structure elements is according to that in Fig. 2, a and c. Residues with 100% sequence identity are shown in dark gray boxes, those with >80% identity are in gray boxes, and those with >60% identity are in light gray boxes. Equal signs (=) under the sequences represent amino acid residues with Cα positions considered to be equal. Observe that, for the dUTPase structures from equine infectious anemia virus, E. coli, Mycobacterium tuberculosis, and human, the 14, 16, 19, and 5 C-terminal residues are missing, respectively, and these are added in this alignment for the sake of completeness, as are the 23 last residues of DCD-DUT. The figure was prepared with INDONESIA.
The structure of DCD-DUT has confirmed that this bifunctional enzyme is not a fusion protein of a dCTP deaminase and a dUTPase. The evolutionary aspect of whether the bifunctional enzyme evolved before or after the dUTPases remains an open question.