Structural and Functional Insights into the DNA Replication Factor Cdc45 Reveal an Evolutionary Relationship to the DHH Family of Phosphoesterases

Background: Although Cdc45 is a key replication factor, there are no biochemical or structural studies on the isolated protein.
Results: We report the first purification and biochemical characterization of human Cdc45, as well as the first structural data on the isolated Cdc45 by small angle x-ray scattering.
Conclusion: Cdc45 is related to the RecJ/DHH family of phosphoesterases and binds single-stranded DNA.
Significance: The similarity has important evolutionary implications.

Cdc45 is an essential protein conserved in all eukaryotes and is involved both in the initiation of DNA replication and the progression of the replication fork. With GINS, Cdc45 is an essential cofactor of the Mcm2–7 replicative helicase complex. Despite its importance, no detailed information is available on either the structure or the biochemistry of the protein. Intriguingly, whereas homologues of both GINS and Mcm proteins have been described in Archaea, no counterpart for Cdc45 is known. Herein we report a bioinformatic analysis that shows a weak but significant relationship among eukaryotic Cdc45 proteins and a large family of phosphoesterases that has been described as the DHH family, including inorganic pyrophosphatases and RecJ ssDNA exonucleases. These enzymes catalyze the hydrolysis of phosphodiester bonds via a mechanism involving two Mn2+ ions. Only a subset of the amino acids that coordinate Mn2+ is conserved in Cdc45. We report biochemical and structural data on the recombinant human Cdc45 protein, consistent with the proposed DHH family affiliation. Like the RecJ exonucleases, the human Cdc45 protein is able to bind single-stranded, but not double-stranded DNA. Small angle x-ray scattering data are consistent with a model compatible with the crystallographic structure of the RecJ/DHH family members.

Cdc45 is an essential factor required for the establishment (1-3) and progression (4-7) of the DNA replication fork in eukaryotic cells. Like many other DNA replication factors, Cdc45 is more abundant in proliferating cells, whereas it is almost absent from long term quiescent, terminally differentiated, and senescent cells (8). Several genetic studies, two-hybrid screens, and co-immunoprecipitation analyses revealed that Cdc45 interacts with a number of other replication factors, including the Mcm2-7 complex, GINS, MCM10, replication protein A, DNA polymerases α, δ, and ε, the origin recognition complex subunit 2, and TopBP1 (for a review see Ref. 9). In a variety of eukaryotic organisms Cdc45 has been found to stably associate with Mcm2-7 and GINS to form a complex (the CMG), which is believed to act as the DNA helicase at the replication fork (7, 10-13). This hypothesis has been reinforced by the demonstration that the Drosophila melanogaster CMG complex can be reconstituted by co-producing its protein components in baculovirus-infected insect cells and is found to possess a robust DNA-unwinding activity in vitro, whereas the Mcm2-7 complex alone is almost completely unable to unwind duplex DNA (14). This analysis suggests that Cdc45 and GINS are helicase auxiliary factors whose association with the Mcm hetero-hexameric ring is absolutely required to reconstitute an active complex.
The critical biological function played by Cdc45 is underscored by its ubiquitous distribution and high degree of sequence conservation from yeast to man. Nevertheless, not much is known about the exact role of Cdc45 either in the preinitiation complex or the CMG helicase, due to the lack of biochemical studies on the isolated protein. The analysis of the primary structure of Cdc45 has failed to reveal any significant similarity to any known protein family or any characteristic sequence motif.
Due to the complexity of the eukaryotic DNA replication machinery, most of the information on the structure and biochemistry of replication proteins has been inferred from the study of the simpler archaeal system. In fact, Archaea possess a simplified version of the Mcm (15, 16) and GINS (17-19) complexes, but so far no archaeal homologue of Cdc45 has been identified.

EXPERIMENTAL PROCEDURES
Protein Expression and Purification-The nucleotide sequence encoding the human Cdc45 (hCdc45) was amplified using the Platinum Pfx DNA polymerase (Invitrogen), using a cDNA clone from the Mammalian Gene Collection (IMAGE ID 2964592) as a template, and the following primers: 5′-TTAAGAAGGAGATATACTATGTTCGTGTCCGATTTCCGCAAAG-3′ and 5′-GATTGGAAGTAGAGGTTCTCTGCGGACAGGAGGGAAATAAGTGCG-3′.
The amplified fragment was then inserted by ligation-independent cloning into the pNIC-CTHF vector (Structural Genomics Consortium, Oxford (20)) for protein expression as a fusion with a C-terminal His6-FLAG tag cleavable with tobacco etch virus (TEV) protease.
Expression of hCdc45 was carried out in Escherichia coli BL21 (DE3)-R3-Rosetta cells (SGC, Oxford) grown overnight at 25°C in TB broth, following 0.1 mM isopropyl 1-thio-β-D-galactopyranoside induction. A frozen cell pellet (corresponding to 1 liter of culture) was resuspended in 100 ml of buffer A (50 mM sodium phosphate buffer, pH 7.4, 1 M NaCl, 10% (v/v) glycerol, 2 mM β-mercaptoethanol) containing EDTA-free protease inhibitor mixture (Roche Applied Science), 50 units/ml benzonase nuclease (Novagen), and 1 mg/ml lysozyme (Sigma-Aldrich) and incubated for 2 h at 4°C. Cells were further disrupted by sonication (Bandelin Sonopuls HD3200 ultrasonic homogenizer). The insoluble material was removed by centrifugation (17,000 rpm for 1 h at 4°C, Beckman Allegra 64R), and the supernatant was loaded onto a 5-ml HiTrap chelating column (GE Healthcare) previously charged with Ni2+ (NiSO4) and equilibrated with buffer A. An imidazole step gradient (25 mM, 250 mM, and 500 mM) was applied, and protein elution was achieved using buffer A plus 250 mM imidazole. Fractions containing hCdc45 were pooled and subsequently loaded onto a Superdex 200 16/60 size-exclusion column (GE Healthcare) equilibrated with buffer B (50 mM sodium phosphate, pH 7.0, 150 mM NaCl, 5% (v/v) glycerol, 2 mM β-mercaptoethanol). The eluted protein was incubated with TEV protease overnight at 4°C for tag cleavage. 12% SDS-PAGE was carried out to check that the cleavage was complete. The hCdc45-TEV mix was loaded onto a 5-ml Heparin column (GE Healthcare), previously equilibrated with buffer B. The cleaved hCdc45 was eluted from the column by applying a linear (150 mM to 1 M) NaCl gradient. Finally, the protein was subjected to size-exclusion chromatography in buffer D (20 mM Tris-HCl, pH 7.9, 150 mM NaCl, 5% (v/v) glycerol, 2 mM β-mercaptoethanol).
EMSAs-The synthetic oligonucleotide used as single-stranded DNA had the following sequence: 5′-TCTACCTGGACGACCGGGTATATAGGGCCCTATATATAGGGCCAGCAGGTCCATCA-3′. A complementary synthetic oligonucleotide used to prepare the blunt DNA duplex had the following sequence: 5′-TGATGGACCTGCTGGCCCTATATATAGGGCCCTATATACCCGGTCGTCCAGGTAGA-3′. The first oligonucleotide was labeled using T4 polynucleotide kinase and [γ-32P]ATP and annealed to a 2-fold molar excess of the cold complementary strand to prepare the double-stranded DNA ligand. For the DNA mobility shift assays, 10-μl mixtures were prepared that contained 100 fmol of 32P-labeled DNA in 20 mM Tris-HCl, pH 7.5, and the indicated amounts of protein (0.5-5 μg). Following incubation for 20 min at 27°C, complexes were separated by electrophoresis through 5% polyacrylamide/bis gels (19:1) in 0.5× TBE (1× TBE: 89 mM Tris base, 89 mM boric acid, 2 mM EDTA, pH 8.3). Gels were dried down and analyzed by phosphorimaging. Experiments were performed in triplicate, and the results were averaged. The error bars on the graphs are the standard error.
Electrophoretic mobility shift assays (EMSAs) were also carried out after incubation of hCdc45-His-FLAG-DNA complexes with anti-FLAG monoclonal antibodies (Abcam). For these experiments 10-μl mixtures were prepared that contained 50 fmol of 32P-labeled oligonucleotide in 20 mM Tris-HCl, pH 7.5, and 5 μg of hCdc45-His-FLAG. Following incubation for 20 min at 27°C, anti-FLAG antibody was added (0.5, 1, and 2 μg in 2 μl of the following buffer: 10 mM sodium phosphate, pH 7.4, 150 mM NaCl, 50% glycerol); an equal volume of buffer (2 μl) was added to the samples where the antibody was omitted, and the incubation was continued for an additional 30 min. The mixtures were subjected to electrophoresis, as previously described, and the gels were analyzed by phosphorimaging.
Analytical Gel Filtration and DNA-binding Activity of hCdc45-An aliquot of purified hCdc45 (1.9 mg) was loaded onto an analytical gel-filtration column (Bio-Sil SEC-250, Bio-Rad). The column was developed with 50 mM Tris-HCl, pH 8.0, 150 mM NaCl, 2 mM β-mercaptoethanol, 5% (v/v) glycerol at a flow rate of 1.0 ml/min. Fractions (200 μl) were collected, and aliquots (5 μl) were run through an 8% (w/v) polyacrylamide-SDS gel. The gel-filtration column was calibrated using the following markers: thyroglobulin, bovine γ-globulin, chicken ovalbumin, equine myoglobin, and vitamin B-12. The DNA-binding activity of each peak fraction (aliquots of 1 μl) was analyzed by EMSAs using radiolabeled ssDNA as a probe, as previously described.
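As a quick consistency check on the two EMSA oligonucleotides, the following minimal Python sketch tests whether the second sequence is the exact reverse complement of the first, as required for a blunt duplex. It assumes that the hyphens within the printed sequences are line-break artifacts and can be removed; the names `SS_OLIGO`, `COMP_OLIGO`, `reverse_complement`, and `forms_blunt_duplex` are illustrative and not part of the original work.

```python
# Oligonucleotides from the EMSA section, written 5'->3'.
# Assumption: hyphens in the printed sequences are line-break artifacts.
SS_OLIGO = "TCTACCTGGACGACCGGGTATATAGGGCCCTATATATAGGGCCAGCAGGTCCATCA"
COMP_OLIGO = "TGATGGACCTGCTGGCCCTATATATAGGGCCCTATATACCCGGTCGTCCAGGTAGA"

# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def forms_blunt_duplex(top: str, bottom: str) -> bool:
    """True if the two strands have equal length and pair fully end to end."""
    return len(top) == len(bottom) and reverse_complement(top) == bottom


if __name__ == "__main__":
    print("length of ssDNA probe:", len(SS_OLIGO))
    print("blunt duplex:", forms_blunt_duplex(SS_OLIGO, COMP_OLIGO))
```

Under the stated assumption, both strands are 56 nucleotides long and pair over their full length, in agreement with the 56-mer probe described in the legend of Fig. 2.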
SAXS-The synchrotron scattering data were collected at the Austrian small angle x-ray scattering (SAXS) beamline of the electron storage ring ELETTRA (21) at a wavelength of 1.54 Å. A Pilatus 100K (Dectris, Baden, Switzerland) was used as a detector, and a sample distance of 0.75 m was set to resolve the momentum transfer, q (q = 4π sin(θ)/λ, with θ as half the scattering angle), in the range from 0.17 to 4.5 nm⁻¹. Samples of recombinant hCdc45 at concentrations of 0.44, 0.90, and 1.85 mg/ml in buffer D were used. All samples were kept and measured in a 1.5-mm glass capillary (Glass, Berlin, Germany) at 8°C. The three 30-s exposures were averaged, because the comparison of the first and last pattern did not show any effect of radiation damage.
The primary data reduction was performed using IGOR Pro (Wavemetrics, Lake Oswego, OR), which was also used to estimate the molecular mass of the solutes by comparing them to the scattering of water. The reduced data were treated further with the ATSAS program collection (22). The indirect Fourier transformation was calculated by using the program GNOM (23), which determines the distance distribution function, the radius of gyration Rg, the forward scattering I(0), as well as the maximal dimension Dmax of the proteins. The results are summarized in supplemental Table S1, in which additionally the molecular mass of each sample has been calculated. As sample c (1.85 mg/ml) showed some evidence of aggregation, as seen by the SAXS pattern change at q < 0.4 nm⁻¹ and the difference in the relative molecular mass, the low q-data of sample b (0.9 mg/ml) were merged with the high q-data of sample c (1.85 mg/ml; supplemental Fig. S1). The ab initio shape of the protein was reconstructed using DAMMIF (24) with the combined scattering curve bc. The initial volume was a sphere with a Dmax of 12.5 nm consisting of 5594 individual spheres (diameter 0.32 nm). The best 44 models have been averaged to obtain the final model with the program DAMAVER (25).
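To make the momentum-transfer definition concrete, the short Python sketch below converts between q and the scattering angle using q = 4π sin(θ)/λ, with θ taken as half the scattering angle and λ = 1.54 Å (0.154 nm) as stated above. The function names are illustrative, and the real-space length scale d = 2π/q is the usual rule of thumb rather than a quantity reported in this work.

```python
import math

WAVELENGTH_NM = 0.154  # 1.54 Angstrom, as stated in the text


def q_from_theta(theta_rad: float, wavelength_nm: float = WAVELENGTH_NM) -> float:
    """Momentum transfer q (nm^-1) for a half scattering angle theta (rad)."""
    return 4.0 * math.pi * math.sin(theta_rad) / wavelength_nm


def theta_from_q(q_inv_nm: float, wavelength_nm: float = WAVELENGTH_NM) -> float:
    """Half scattering angle theta (rad) corresponding to a given q (nm^-1)."""
    return math.asin(q_inv_nm * wavelength_nm / (4.0 * math.pi))


def length_scale_nm(q_inv_nm: float) -> float:
    """Approximate real-space distance d = 2*pi/q probed at a given q."""
    return 2.0 * math.pi / q_inv_nm


if __name__ == "__main__":
    # Limits of the q-range quoted in the text.
    for q in (0.17, 4.5):
        full_angle_deg = math.degrees(2.0 * theta_from_q(q))
        print(f"q = {q} nm^-1: 2*theta = {full_angle_deg:.3f} deg, "
              f"d = {length_scale_nm(q):.2f} nm")
```

With these relations, the quoted q-range corresponds to scattering angles from a fraction of a degree to a few degrees, and to real-space distances from roughly one to a few tens of nanometers, which brackets the reported Dmax of 12.5 nm.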

Cdc45 Shows Sequence Similarity to Archaeal Proteins Belonging to the DHH Family of Phosphoesterases-We carried out a bioinformatic analysis on Cdc45. Database searches with the Position-Specific Iterated BLAST (PSI-BLAST) algorithm, using the human Cdc45 sequence as the query and default parameters, identified weak but significant similarity to two archaeal sequences on the second iteration run. The proteins were annotated as "phosphoesterase domain-containing protein" from Candidatus Korarchaeum cryptofilum OPF8 (E-value: 2 × 10⁻⁴) and "putative single-stranded DNA-specific exonuclease RecJ" from Methanocella paludicola SANAE (E-value: 3 × 10⁻⁴). They both belong to the DHH family of phosphoesterases that was first described by Aravind and Koonin (26). Both sequences have been recently added to the databases, explaining why the similarity had not been previously identified. Although the sequence similarity detected by BLAST involved only the first 130-140 residues, multiple alignments between Cdc45 and putative archaeal orthologues show that some similarity can be noticed throughout the entire sequence (Fig. 1).
The DHH family is defined by a number of conserved sequence motifs and can be split into two distinct clusters: subfamily 1 (whose prototype is the Escherichia coli RecJ protein) is so far represented only in Archaea and Bacteria; subfamily 2 (which includes the Drosophila melanogaster Prune protein and the Saccharomyces cerevisiae exopolyphosphatase PPX1) is also found in Eukarya. Crystallographic analysis of some DHH proteins (27-29) has revealed the presence of a catalytic core formed by two domains: in the N-terminal domain invariant residues (aspartic acid and histidine residues) coordinate two metal ions (typically manganese ions), suggesting a two-metal-mediated hydrolysis reaction.
Because Archaea possess multiple RecJ-like proteins, we have selected the sequences displaying the closest match to the Candidatus and Methanocella homologues. To distinguish the archaeal RecJ/DHH proteins that may be putative Cdc45 orthologues from other members of the DHH family, we will use the notation RecJ Cdc45 . However, most Archaea seem to possess two close paralogues of the putative Cdc45 counterpart. A study focused on detecting the archaeal equivalent of E. coli RecJ identified two highly similar proteins from Methanocaldococcus jannaschii (MJ0977 and MJ0831). Although both were able to partially complement a RecJ mutant phenotype in E. coli, ssDNA nuclease activity could only be observed for the MJ0977 protein (30). Two highly similar RecJ Cdc45 proteins (AF0699 and AF0735) were also found associated with a replication protein network in Archaeoglobus fulgidus (31). More recently, a RecJ-like single-stranded 5′-3′ DNA exonuclease from Thermococcus kodakaraensis was found to physically associate with GINS, and this association was found to stimulate its exonuclease activity (32).
A putative RecJ homologue has been reported to co-purify with Mcm and GINS from cellular extracts of Sulfolobus solfataricus (SSO0295 (18)). This sequence is rather divergent from most of the archaeal proteins mentioned above and lacks some of the motifs; in particular, some of the conserved residues putatively responsible for the nuclease activity are absent. An open reading frame highly similar to SSO0295 is present in most Sulfolobales genomes. Divergent sequences are also detected in other Crenarchaeota, such as Aeropyrum pernix.
Both the eukaryotic Cdc45 proteins and the archaeal RecJ Cdc45 sequences match only to the RecJ catalytic core (domains I and II, as defined in the crystal structure of Thermus thermophilus RecJ (27)), while lacking the 50-residue N-terminal extension as well as domains III and IV. These additional elements form a closed-ring structure that is predicted to encircle ssDNA.
Cdc45 sequences only partially retain the motifs typical of the DHH family that are involved in metal binding. Whereas in motif 1 both aspartate residues coordinating the metal are conserved (²⁶DVD), the DHH motif 3, which gives the name to the family, is mutated to ⁹⁹DTH. Although a number of residues belonging to motifs 2 and 4 are conserved, suggesting that the protein fold is similar in those regions, the key residues for metal binding and catalysis are not conserved, with the aspartate of motif 2 mutated to asparagine, and the aspartate of motif 4 to a glutamine (Fig. 1). In contrast, most of the archaeal RecJ Cdc45 proteins comprise all of the motifs that are conserved in the bacterial RecJ exonucleases, with the exception of the putative orthologues from Sulfolobales, whose sequence is rather divergent and lacks most of the RecJ canonical residues.
Threading Algorithms Confirm the Presence of a RecJ/DHH-like Core Fold in hCdc45-A variety of threading/fold recognition algorithms were used to verify the similarity among hCdc45 and DHH family members. The Protein Fold Recognition Server Phyre (33) identifies a three-dimensional similarity between hCdc45 and a number of DHH protein structures such as the RecJ exonuclease from T. thermophilus (PDB code: 1IR6), as well as various inorganic pyrophosphatases (PDB codes: 1WPN, 1I74, and 2HAW), but the similarity was restricted to the first 110 amino acids. When a multiple sequence alignment, including a number of Cdc45 sequences, was used as input for the HHPRED server (34), a similarity (involving the first 315 residues) was detected to an inorganic pyrophosphatase from Bacillus subtilis belonging to the DHH family (PDB code: 1WPN). The profile-profile alignment and fold-recognition server FFAS03 (35) was also used both with the full-length hCdc45 sequence and a number of fragments corresponding to putative domains. A score below the recommended threshold (−9) was obtained when the first 140 amino acid residues of hCdc45 were used as input, matching the N-terminal domain of a number of manganese-dependent inorganic pyrophosphatases from the DHH family, with the B. subtilis inorganic pyrophosphatase giving the best agreement (−10.5, PDB code: 1K23). Using as input longer hCdc45 fragments still provides a match with the DHH proteins, although the scores get progressively higher, indicating a lower degree of confidence; using the full-length protein it is possible to detect some similarity to 1K23. The sequence-structure homology recognition server FUGUE (36) unambiguously identifies both the T. thermophilus RecJ (1IR6) as well as the Streptococcus gordonii inorganic pyrophosphatase (1K20) structures as the two best hits along the entire sequence with a degree of confidence higher than 99%. When using as input the sequences of a variety of archaeal RecJ Cdc45 proteins, all the servers predicted a strong structural similarity to the bacterial RecJ proteins, along the entire sequence, as expected from the significant sequence homology and the conservation of the characteristic motifs.

Structure and Function of Human Cdc45
FEBRUARY 3, 2012 • VOLUME 287 • NUMBER 6
A central region in both Cdc45 (residues 252-363 in hCdc45) and archaeal RecJ Cdc45 (residues 188-275 in the sequence from T. kodakaraensis) proteins appears as a long insertion into the RecJ/DHH core. This region is reasonably well conserved between eukaryotic and archaeal proteins. Secondary structure predictions suggest a helical fold, and threading algorithms tend to find matches with helical bundle proteins, such as acyl-CoA-binding proteins and helix-turn-helix transcription factors. An additional insertion is unique to the eukaryotic Cdc45 proteins (residues 108-175 in the human sequence), and the central part includes a large number of charged residues (aspartate, glutamate, arginine, and lysine), thus suggesting a partially structured region (Fig. 1).
Based on the results of both the sequence analysis and the threading algorithms, we have produced a sequence alignment that summarizes the putative relationships among bacterial RecJ ssDNA exonucleases, archaeal RecJ Cdc45 and the eukaryotic Cdc45 (Fig. 1). In the first half of the molecule (up to motif IV) the similarity is strong enough to be detected based on sequence alone, whereas for the second half the alignment relies on the threading data, which identify similarity patterns in the absence of high sequence conservation. Consistent with the threading results, Fig. 1 shows the conservation of patterns of hydrophobic residues and an excellent match between the experimentally determined α-helices and β-strands of RecJ and the predicted Cdc45 secondary structure elements.

FIGURE 1. Sequence alignment between hCdc45, RecJ Cdc45 from T. kodakaraensis, and RecJ from T. thermophilus. The alignment presented here is based on an extended multiple alignment using 15 eukaryotic Cdc45 sequences, 16 archaeal sequences, and 10 bacterial RecJ sequences, selected from evolutionarily diverse organisms. Only the RecJ core (residues 50-425, comprising domains I and II) has been used in the alignment. Residues that are conserved in more than 70% of the eukaryotic, archaeal, and bacterial sequences are highlighted in green, cyan, and yellow, respectively. The following groups of amino acid residues were considered similar: Asp/Glu, Lys/Arg, Phe/Tyr, Ser/Thr, Gly/Ala, and Val/Leu/Ile/Met. The position of the secondary structural elements in the T. thermophilus RecJ crystal structure (PDB code: 2ZXP) is indicated at the bottom, whereas the predicted secondary structure for human Cdc45 is shown at the top. Secondary structure elements are named according to the TthRecJ nomenclature (28). The position of the characteristic RecJ motifs is shown by red boxes, with the conserved residues highlighted in bold. The alignment was carried out using the multiple sequence alignment program MUSCLE (41) and manually modified to take into account the structural constraints and the results of the fold recognition/threading algorithms. Up to motif IV the similarity is strong enough to be detected based on sequence alone, whereas the second half of the alignment relies on the threading data, which identify similarity patterns in the absence of high sequence homology, as exemplified by the conservation of the patterns of hydrophobic residues and the excellent match between RecJ secondary structure elements and the prediction for Cdc45. An insertion unique to eukaryotic Cdc45 orthologues and containing many charged amino acid residues is shown in magenta. The putative helical insertion present in both archaeal and eukaryotic proteins is shown in yellow.
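The conservation criterion stated in the legend of Fig. 1 (residues conserved in more than 70% of the sequences of a group, with Asp/Glu, Lys/Arg, Phe/Tyr, Ser/Thr, Gly/Ala, and Val/Leu/Ile/Met treated as similar) can be illustrated with the following Python sketch. It is not the procedure actually used to prepare the figure: the function names are hypothetical, the input is assumed to be a list of equal-length aligned sequences with "-" or "." as gap characters, and counting gaps in the denominator is our assumption.

```python
from collections import Counter

# Similarity groups as listed in the Fig. 1 legend (one-letter codes).
SIMILARITY_GROUPS = ["DE", "KR", "FY", "ST", "GA", "VLIM"]
_GROUP_OF = {aa: group for group in SIMILARITY_GROUPS for aa in group}


def residue_class(residue: str):
    """Map a residue to its similarity group (or to itself); gaps map to None."""
    if residue in ("-", "."):
        return None
    residue = residue.upper()
    return _GROUP_OF.get(residue, residue)


def conserved_columns(aligned_seqs, threshold=0.70):
    """Flag alignment columns where one residue class exceeds the threshold.

    Assumption: gaps count toward the total number of sequences, so a
    column that is mostly gaps cannot be flagged as conserved.
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    flags = []
    for col in range(length):
        classes = (residue_class(seq[col]) for seq in aligned_seqs)
        counts = Counter(c for c in classes if c is not None)
        top = counts.most_common(1)[0][1] if counts else 0
        # "more than 70%" is read as a strict inequality.
        flags.append(top / len(aligned_seqs) > threshold)
    return flags


if __name__ == "__main__":
    toy_alignment = ["DKV-", "EKL-", "DRIA", "EAMG"]  # toy data, not real sequences
    print(conserved_columns(toy_alignment))
```

In the toy alignment, the first three columns are flagged because at least three of the four residues fall into a single similarity group, whereas the last column, which is half gaps, is not.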
Although the correspondence is good overall, there are a few uncertainties in the central region of the proteins. For example, it is difficult to establish unambiguously whether the first Cdc45 insertion occurs after β5 (as depicted in Fig. 1) or after α7. In the same way, the second insertion (common to eukaryotic and archaeal proteins) may slide from the current position in Fig. 1 (after α10) to an alternative position after α11. However, the match is very convincing in the first half of domain I as well as in domain II, including the long helix connecting the two domains (α14), suggesting that the relationship extends throughout the entire sequence.
While we were finalizing this report, a bioinformatic analysis (37) was published suggesting a similarity between Cdc45 and RecJ proteins, limited to the first 100 amino acid residues. Our report is consistent with that one (37) but further extends the analysis by showing that the sequence and structural similarity covers the entire length of the protein.
Biochemical Characterization of the Recombinant hCdc45-We produced hCdc45 in bacterial cells as a fusion protein with a C-terminal His6-FLAG tag, using the pNIC-CTHF vector (20). The tag could be cleaved with the TEV protease and the protein purified to homogeneity (Fig. 2A).

FIGURE 2. A, ..., starting from the protein obtained after the first step of Ni-affinity purification (IMAC (immobilized metal-ion affinity chromatography)), followed by size-exclusion chromatography (SEC); the protein after cleavage of the His6-FLAG tag using TEV protease and purification over a heparin column (Heparin) to a final round of SEC. B, DNA-binding activity of hCdc45. Example of an EMSA on single-stranded DNA is shown. The assay was carried out with increasing concentrations of hCdc45 (0.5, 1, 2, 3, 4, and 5 μg of protein were present in the mixtures loaded into the lanes from 2 to 7). A radiolabeled 56-mer DNA was used as a ligand (see the text for details). A control mixture without protein was run on lane 1. C, single-stranded versus double-stranded DNA binding. Shifted DNA (either in single-(•) or double-stranded (f) form) is reported versus the amount of hCdc45. Experiments were performed in triplicate, and the results are averaged. Curves represent best fits to the data points. The error bars on the graphs are the ±S.E. D, gel-filtration analysis of hCdc45 and EMSAs of the corresponding peak fractions. Gel-filtration chromatography of purified hCdc45 was performed using a Bio-Sil SEC-250 column (Bio-Rad) as described under "Experimental Procedures." Peak fractions were analyzed by SDS-PAGE (5 μl/fraction) and used in a gel shift experiment (1 μl/fraction). E, EMSA with hCdc45-His-FLAG in the presence of anti-FLAG antibody. The assays were carried out by adding increasing amounts of a monoclonal anti-FLAG antibody (0.5, 1, and 2 μg, lanes 2 and 6, 3 and 7, and 4 and 8, respectively) into mixtures containing the single-stranded DNA probe with hCdc45-His-FLAG (lanes 6-8; 5 μg of protein) or without the recombinant protein (lanes 2-4; see text for details). A black arrow indicates the Cdc45-DNA complex, whereas the white arrows identify the ternary complexes with the anti-FLAG antibody.

JOURNAL OF BIOLOGICAL CHEMISTRY 4125
Although many of the residues that in RecJ/DHH proteins are involved in Mn2+ binding and catalysis are not conserved in hCdc45, we examined the possibility that the few remaining aspartate and histidine residues (namely Asp-26, Asp-28, Asp-99, and His-101) may coordinate one metal ion. We have therefore used both inductively coupled plasma/atomic emission spectroscopy and atomic absorption spectroscopy to test whether the purified protein contains manganese, magnesium, or zinc, but we failed to confirm the presence of any of these metals (data not shown).
We also carried out activity assays to check whether hCdc45 displays either pyrophosphatase or exonuclease activity. Consistent with the absence of metal ions and some of the putative catalytic residues, we were unable to detect any of the above enzymatic activities (data not shown).
Both its role in DNA replication and the putative similarity with a single-stranded DNA exonuclease, such as RecJ, suggested that Cdc45 could be a DNA-binding protein. We used EMSAs to evaluate the DNA-binding properties of hCdc45 (Fig. 2B). The purified recombinant protein binds single-stranded synthetic oligonucleotides with a weak but detectable affinity, comparable with the one observed for the Drosophila GINS complex (14) or for the full-length T. thermophilus RecJ (27). No increase in affinity was observed when a fork-containing DNA molecule was used as a ligand in the EMSAs, and negligible binding was detected to short blunt double-stranded DNA (Fig. 2C). Addition of Mg2+, Mn2+, or Zn2+ ions into the buffer, as well as variation of pH in the range 5.5-8.5, were found to have no effect on the hCdc45 DNA-binding activity (data not shown).
Preliminary experiments indicated that the affinity of hCdc45 for ssRNA is weaker than for ssDNA. The protein shows a preference for oligo(dG) in comparison to oligo(dC), oligo(dA), and oligo(dT) (supplemental Fig. S2).
To demonstrate that the weak DNA-binding activity is indeed due to hCdc45, and not to trace amounts of a contaminating protein, we analyzed the DNA-binding activity of the protein fractions following size-exclusion chromatography. The DNA-binding activity profile precisely co-migrates with the protein peak (Fig. 2D and supplemental Fig. S3), suggesting that the ability to bind DNA is a truly intrinsic feature of hCdc45. Furthermore, we analyzed whether the protein-DNA complex could be super-shifted by a specific antibody. For this experiment, we used a FLAG-tagged version of hCdc45 (purified according to the same protocol as the untagged protein with the omission of the TEV cleavage step) and a monoclonal anti-FLAG antibody. This analysis (Fig. 2E) revealed that the anti-FLAG antibody was able to super-shift the protein-DNA complex.
The ssDNA-binding properties of hCdc45 are consistent with the role of the protein as inferred from the single particle electron microscopy structure of the Drosophila CMG complex (38). In the CMG context Cdc45 (together with GINS) contributes to the formation of a tracking channel for the lagging strand. Strong DNA binding may therefore not be a property of the isolated factor but may be achieved synergistically with Mcm2-7 and GINS.
SAXS Data from Recombinant hCdc45 Are Consistent with the Three-dimensional Structure of the RecJ Core-SAXS data were collected from highly purified samples of recombinant hCdc45 and corrected for the scattering from the buffer. Ab initio modeling was performed using the program DAMMIN (39), with 44 runs being averaged using the program DAMAVER (25). The model obtained can be described as a compact core with two lateral extensions (Fig. 3). We can fit the RecJ core (encompassing domains I and II, residues 50-421) from the T. thermophilus RecJ crystallographic structure (PDB code: 2ZXP) in the central part of the envelope (Fig. 4). One of the two lateral extensions is larger and better defined and can be assigned to the putative helical insertion domain that is common to both the eukaryotic Cdc45 and the archaeal RecJ Cdc45 proteins (residues 188-275 in hCdc45), as the putative insertion loop within the RecJ core is positioned in a manner compatible with this interpretation (Figs. 1 and 4). As an example, we chose to fit to this region of the map the helical bundle from the acyl-CoA-binding protein (PDB code: 2FDQ), as suggested by the results of the threading analysis. The second extension has a more elongated shape and can be allocated to the partially unstructured insertion that is unique to the Cdc45 sequences (residues 108-175 in hCdc45, Figs. 1 and 4).
As discussed above, there is some uncertainty in the exact positions of the two long insertions with respect to the RecJ core, and alternative insertion points can be suggested. However, in both cases the alternative loops are still positioned in a manner compatible with the interpretation of the SAXS data.
Evolutionary Implications-The results of these biochemical and structural analyses suggest an evolutionary link between Cdc45 and DHH proteins. The hypothesis that Cdc45 is derived from an ancestral pyrophosphatase cannot be completely ruled out. It has long been suggested that a pyrophosphatase activity may be associated with the replication machinery (40). Pyrophosphate hydrolysis could enhance the catalytic activity of replicative DNA polymerases, because removal of the pyrophosphate, a product of dNTP incorporation, would drive forward the polymerization reaction.
However, an important clue about the function and the evolutionary origin of Cdc45 comes from the finding that some of its putative archaeal homologues display a much closer similarity to the DHH subfamily 1 that includes the RecJ single-stranded DNA exonucleases. Moreover, the ab initio model obtained from SAXS data is more consistent with a compact RecJ core, rather than the more "open" conformation of inorganic pyrophosphatases. A number of RecJ-like proteins have been found associated with the DNA replication machinery in a variety of archaeal organisms, such as S. solfataricus (18), A. fulgidus (31), and T. kodakaraensis (32). In particular, the T. kodakaraensis RecJ Cdc45 homologue has been shown to possess a 5′-3′ exonuclease activity that is highly stimulated by physical interaction with the GINS15 subunit (32). It has been proposed that this enzyme could participate in maturation of the Okazaki fragments in a pathway that is redundant with the FEN1- and Dna2-dependent pathways. Out of the multiple RecJ-like proteins present in archaeal organisms, it is plausible that some are associated with the DNA replication machinery and have a direct role in genome duplication, whereas others are "true" RecJ homologues, being involved in DNA repair/recombination transactions.
Based on our biochemical and bioinformatic analyses we suggest that Cdc45 originated from an ancestral 5′-3′ exonuclease, losing its catalytic activity during evolution and only retaining the ability to bind single-stranded DNA. The proteins from Sulfolobales may have followed a similar evolutionary path, as suggested by the absence of key catalytic residues and their tight association with GINS15. Association with the replication fork and GINS, in particular, may have led Cdc45 to lose its enzymatic activity and to acquire the function of "chaperoning" the lagging strand that is sterically excluded from the central channel of the MCM2-7 complex, as suggested from the cryoelectron microscopy of the CMG complex (38).

The conserved residues in the seven RecJ motifs are shown in red. B, the ab initio calculated SAXS model for hCdc45 (depicted as light gray spheres) is superimposed on the crystal structure of the core of T. thermophilus RecJ, in blue. Highlighted in magenta and indicated by a magenta asterisk is the putative position of the insertion, which is unique to the eukaryotic Cdc45 orthologues; highlighted in yellow and indicated by a yellow asterisk is the position of the helical bundle insertion that is common to both archaeal and eukaryotic proteins (see Fig. 1). As an example, the helical domain of the acyl-CoA-binding protein (PDB code: 2FDQ) has been fitted to the map, consistent with the results of the threading algorithms. The two views are roughly related by a 90° rotation around a horizontal axis.