The NH2-terminal php Domain of the α Subunit of the Escherichia coli Replicase Binds the ϵ Proofreading Subunit*

The α subunit of the replicase of all bacteria contains a php domain, initially identified by its similarity to histidinol phosphatase but of otherwise unknown function (Aravind, L., and Koonin, E. V. (1998) Nucleic Acids Res. 26, 3746-3752). Deletion of 60 residues from the NH2 terminus of the α php domain destroys ϵ binding. The minimal 255-residue php domain, estimated by sequence alignment with homolog YcdX, is insufficient for ϵ binding. However, a 320-residue segment including sequences that immediately precede the polymerase domain binds ϵ with the same affinity as the 1160-residue full-length α subunit. A subset of mutations of a conserved acidic residue (Asp43 in Escherichia coli α) present in the php domain of all bacterial replicases resulted in defects in ϵ binding. Using sequence alignments, we show that the prototypical Gram(+) Pol C, which contains the polymerase and proofreading activities within the same polypeptide chain, has an ϵ-like sequence inserted in a surface loop near the center of the homologous YcdX protein. These findings suggest that the php domain serves as a platform to enable coordination of proofreading and polymerase activities during chromosomal replication.

The initial discovery of a proofreading 3′→5′ exonuclease within DNA polymerase I provided the prototype for proofreading among all polymerases (1). Editing activity was found in the same polypeptide chain as Pol I and required exchange of the 3′-primer terminus between the polymerase and proofreading exonuclease sites, which are separated by ~30 Å (2). The structure of the exonuclease site and functional studies with cross-linked substrates suggested that ~4 bases needed to be unwound to permit migration of the 3′ end of a primer from the Pol to exonuclease active site (2, 3). This provided a basis for the preference of the 3′→5′ exonuclease for mispaired over properly base-paired primer termini. Partitioning of the primer terminus between the exonuclease and polymerase sites is determined by both DNA structure and protein contacts (4). The kinetic basis for efficient proofreading is the very slow rate of elongation of primers containing a terminal mismatch, providing time for partitioning (5, 6). The exonucleolytic mechanism requires two Mg2+ ions tethered by acidic side chains within the polymerase active site (2), which demonstrates the close functional relationship between the two activities. Escherichia coli Pol II also contains an exonuclease activity in the same polypeptide chain as the polymerase (7).
The multisubunit core of the E. coli Pol III replicase was found to contain the polymerase subunit α bound to subunits ε and θ (8). ε is the product of the mutD gene and was initially identified as conferring high levels of spontaneous mutagenesis when defective (9). In contrast to Pol I and II, ε was shown to be the separate proofreading subunit of the Pol III replicase (10-13). Holoenzyme reconstituted without ε also exhibits lower fidelity in vitro (11, 14).
Similar proofreading activities are found in eukaryotic nuclear replicative Pol δ and ε and the mitochondrial Pol γ (15, 16). Structure of a prototypical B class replicase (17) revealed an exonuclease that exhibited strong similarity to A class polymerases (2) and the ε subunit of Pol III (C family of polymerases) (18). Sequence conservation also supports homology and a common mechanism (15, 19). Functional studies within Pol δ support a tight coordination of proofreading and polymerization activities (21).
The proofreading activity of ε and the polymerase activity of the α subunit of Pol III also appear to be closely coordinated. Binding of ε to α stimulates polymerase activity 2-fold and proofreading activity 10-80-fold, depending on the substrate (22). Studies with mutant ε subunits suggest that a functional interaction is required in addition to tight tethering for functional coordination of synthesis with proofreading (23), and the existence of ε and α as separate polypeptide chains provides a convenient experimental system for studying this integration. The usefulness of such an experimental system requires identification of the sites of subunit interaction. It is known that the primary binding energy of ε to α is provided by the COOH terminus of ε, which is separable from the exonuclease domain (24, 25). The position of the ε-binding site within α, however, is unknown. To identify regions important for binding, we exploited a series of α subunits from which varying lengths of NH2- and COOH-terminal sequences have been deleted. Using this approach, we localized the site of ε binding to the NH2-terminal php domain of α, a region of previously unknown function.

EXPERIMENTAL PROCEDURES
Proteins and Enzymes-The tagged α derivatives NΔ0, NΔ60, NΔ240, CΔ0, CΔ169, and CΔ794 have been described previously (26, 27). Plasmids expressing CΔ840 and CΔ905 were constructed by similar strategies using pET-11C. Expression occurred in BL21(DE3) with 2 h induction at room temperature. The resulting proteins were purified by Ni2+-NTA resin, as described (26, 27) except that the column, equilibrated in Buffer N, was washed with 20 mM imidazole, and the protein was eluted with a gradient of 20-120 mM imidazole. Each tagged protein contained only one significant biotinylated species that was selectively immobilized on streptavidin-coated chips for ε binding analysis (Fig. 1). NH2-tagged α subunits carrying point mutations D43A, D43N, and D43E have been described (28).
Purification of ε-The ε subunit of Pol III was obtained using the denaturation-renaturation protocol as described (12) starting with 10 g cells. The renatured protein (200 mg) was purified using a Q-Sepharose column (70 ml) pre-equilibrated with buffer Q. After loading, the column was washed with 2 column volumes of buffer Q containing 50 mM NaCl. The protein was eluted with 10 column volumes in a gradient of 50-150 mM NaCl in buffer Q. The major protein peak, identified as ε by SDS gels, was pooled (90 mg). The eluted protein was precipitated with ammonium sulfate (60% saturation). One quarter of the pellet (~22 mg) was dissolved in buffer S (0.5 ml final volume) and applied to a 24-ml Superose-6 column (Amersham Biosciences HR 10/30). The purity of the eluted protein (9.4 mg, 7.1 × 10^6 units) was ~98%, as assessed on a 4-20% SDS-polyacrylamide gel. The activity was determined using an exonuclease assay.
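As a rough check on the purification figures above, the specific activity and step recoveries can be computed directly from the reported masses and units. The sketch below is illustrative arithmetic only: the function names are hypothetical, and the gel-filtration recovery assumes that exactly one quarter of the 90 mg pool (22.5 mg) was loaded.

```python
# Illustrative arithmetic only; function names are hypothetical,
# input values are the masses and units reported in the text.

def specific_activity(units: float, mg: float) -> float:
    """Exonuclease units per mg of protein."""
    return units / mg


def recovery(mg_out: float, mg_in: float) -> float:
    """Fraction of protein mass recovered across one step."""
    return mg_out / mg_in


# Final Superose-6 pool: 9.4 mg carrying 7.1e6 units.
final_specific_activity = specific_activity(7.1e6, 9.4)

# Q-Sepharose step: 200 mg renatured protein in, 90 mg pooled.
q_sepharose_recovery = recovery(90.0, 200.0)

# Superose-6 step: assumes one quarter of the 90 mg pool was applied.
superose_recovery = recovery(9.4, 90.0 / 4)

print(f"specific activity: {final_specific_activity:.3g} units/mg")
print(f"Q-Sepharose recovery: {q_sepharose_recovery:.0%}")
print(f"Superose-6 recovery: {superose_recovery:.0%}")
```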
Assays-The gap-filling assay for Pol III was performed as described (29). Protein concentrations were determined by the method of Bradford using the Coomassie Plus Bradford assay reagent (Pierce) according to the manufacturer's instructions (bovine serum albumin as a standard). Exonuclease assays were performed by monitoring the removal of a mispaired 3′-[α-32P]ddAMP. A 29-mer oligonucleotide (AGGCTGGCTGACCTTCATCAAGAGTAATC) was labeled at the 3′ end with [α-32P]ddATP using terminal deoxynucleotidyl transferase. The reaction was carried out in the manufacturer's buffer with the addition of 1.5 μM primer, 1 μM [α-32P]ddATP, and 20 units of terminal deoxynucleotidyl transferase. The mixture, in a total volume of 20 μl, was incubated at 30°C for 40 min. After that, 1 mM non-radioactive ddATP and an additional 10 units of terminal deoxynucleotidyl transferase were added, and the mixture was incubated for another 40 min at 30°C. Enzyme was thermally inactivated (10 min, 70°C). The labeled primer was annealed to M13 Gori template as described (30). The substrate was purified on a Bio-Gel column equilibrated in buffer B. Concentration of the purified substrate was determined at 260 nm using a conversion factor of 1 A260 unit = 0.04 mg/ml. 15-μl aliquots of a premix that contained the following components were placed on ice: 50 mM HEPES (pH 7.5), 10 mM MgCl2, 4% glycerol, 80 μg/ml BSA, 10 mM DTT, and 114 fmol DNA. After the addition of ε, the samples were transferred to 30°C and incubated for 5 min. 10-μl aliquots were withdrawn, immediately spotted onto DE-81 filters, and washed (three times, 5 min each, in 0.3 M ammonium formate (pH 7.8) + 10 mM sodium pyrophosphate; once in deionized H2O; and once in 95% ethanol). The filters were dried, and the amount of radiolabel remaining was measured by scintillation counting. One unit of exonuclease activity was defined as the amount of enzyme catalyzing the removal of 1 pmol of nucleotides per minute.
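The unit definition above can be turned into a small calculation. The following is a minimal sketch, not the authors' data-reduction procedure: it assumes that a no-enzyme control gives the counts for 100% retained label, that release is linear over the incubation, and that the 114 fmol of DNA can be treated as the amount of labeled 3′ termini. The function name and the example counts are hypothetical.

```python
def exonuclease_units(cpm_control: float, cpm_sample: float,
                      substrate_pmol: float, minutes: float) -> float:
    """Units (pmol nucleotide removed per min) from filter-bound radiolabel.

    Assumes cpm_control (no-enzyme reaction) corresponds to 100% of the
    labeled 3' termini retained on the filter, and that label release is
    linear over the incubation time.
    """
    if cpm_control <= 0 or minutes <= 0:
        raise ValueError("control counts and incubation time must be positive")
    fraction_removed = 1.0 - cpm_sample / cpm_control
    # Clamp counting noise so the fraction stays within [0, 1].
    fraction_removed = min(max(fraction_removed, 0.0), 1.0)
    return fraction_removed * substrate_pmol / minutes


# Hypothetical counts; 0.114 pmol (114 fmol) and 5 min come from the text.
units = exonuclease_units(cpm_control=10_000, cpm_sample=6_000,
                          substrate_pmol=0.114, minutes=5.0)
print(f"{units:.5f} units in the reaction")
```

Dividing the result by the mass of ε added would give a specific activity in units per mg.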
Biotin Blots-Biotin blots were used to detect the presence of biotinylated α derivatives. Proteins (0.2-0.4 μg) were separated on a 4-20% SDS-polyacrylamide gel and transferred to a nitrocellulose membrane at 350 mA for 0.5 h using a Bio-Rad mini-apparatus. The membrane was blocked overnight with PBS containing 5% nonfat milk, washed two times with PBS-Tween 0.1% for 10 min, washed once with PBS for 10 min, and incubated in streptavidin-horseradish peroxidase conjugate for 1 h at 1:10,000 dilution. The wash steps were repeated, and the membrane was incubated for 1 min with ECL Western blotting detection reagents from Amersham Biosciences according to the manufacturer's instructions. The image was obtained by exposing the membrane to x-ray film (Hyperfilm ECL from Amersham Biosciences) for 30 s.
α-ε Binding Quantitation Using Surface Plasmon Resonance-A Biacore 3000 instrument was used to monitor binding of ε to immobilized α derivatives. Tagged α proteins were immobilized on a SA chip via biotin-streptavidin interaction. Typically, three individual flow cells were loaded with ~500, 1000, and 1500 response units of α subunit. Three different concentrations of ε (0.5, 1.0, and 1.5 μM) were run over the derivatized SA chip at 20 μl/min in HK buffer at 25°C. Control injections were performed on underivatized flow cells and were subtracted from the data. The dissociation constant KD for α-ε binding was determined using BiaEvaluation 3.2 software, and the curves were fit using the 1:1 Langmuir model.
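For readers unfamiliar with the 1:1 Langmuir model named above, the sketch below writes out its standard equations: dR/dt = k_on·C·(R_max − R) − k_off·R, with K_D = k_off/k_on. It is not a reimplementation of the BiaEvaluation fitting routine. The function names and the rate constants are hypothetical, chosen only so that K_D equals 5 nM; the analyte concentrations are those stated in the text.

```python
import math


def dissociation_constant(k_on: float, k_off: float) -> float:
    """K_D (M) for a 1:1 interaction: K_D = k_off / k_on."""
    return k_off / k_on


def equilibrium_response(conc: float, r_max: float, k_d: float) -> float:
    """Steady-state response: R_eq = R_max * C / (K_D + C)."""
    return r_max * conc / (k_d + conc)


def association_response(t: float, conc: float, r_max: float,
                         k_on: float, k_off: float) -> float:
    """Response during analyte injection, starting from R = 0:
    R(t) = R_eq * (1 - exp(-(k_on * C + k_off) * t))."""
    k_obs = k_on * conc + k_off
    r_eq = equilibrium_response(conc, r_max, k_off / k_on)
    return r_eq * (1.0 - math.exp(-k_obs * t))


# Hypothetical rate constants, chosen only so that K_D = 5 nM.
K_ON = 1.0e5    # M^-1 s^-1
K_OFF = 5.0e-4  # s^-1
k_d = dissociation_constant(K_ON, K_OFF)

for conc in (0.5e-6, 1.0e-6, 1.5e-6):  # analyte concentrations from the text (M)
    occupancy = equilibrium_response(conc, 1.0, k_d)
    print(f"{conc * 1e6:.1f} uM: fractional occupancy at equilibrium {occupancy:.3f}")
```

With a nanomolar K_D and micromolar analyte, the surface is nearly saturated at equilibrium, so in a measurement of this kind the affinity is constrained mainly by the shapes of the association and dissociation phases rather than by the plateau heights.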

RESULTS
Understanding the mechanism used to coordinate the editing activity of the ε subunit with the polymerization activity of the α subunit within the E. coli replicase requires knowledge of their respective binding sites. We have previously developed a technique that permits facile identification of protein binding domains. It involves creation of a series of proteins from which progressive deletions have been made from either the NH2 or COOH terminus. The deleted portion is replaced with a tag that is biotinylated in vivo followed by a His6 sequence. The His6 sequence facilitates purification, and the biotinylated tag permits expression analysis in crude mixtures, monitoring of the presence of inactive protein during purification, and tethering of purified protein to streptavidin-coated BIAcore chips, which permits quantification of the strength of their binding to protein partners (26, 31).
The α subunit of Pol III is a long (1160 residues), modular protein. Its central polymerase domain contains three essential catalytic residues that are presumably required to bind the two Mg2+ ions required for catalysis in other polymerases (28) (Fig. 2). The COOH-terminal end contains the binding sites for τ and two distinct sites for binding β, including the essential internal site (amino acids 920-924) (26, 32-34). Also contained within the COOH-terminal end are HhH and OB fold domains, which have only been predicted by bioinformatics approaches to date (35). These motifs are often involved in nucleic acid binding but may also serve as protein-binding sites (36-38). The NH2-terminal end of α contains a putative domain homologous to histidinol phosphatase and was thus termed the php domain (polymerase and histidinol phosphatase) (39). A function for this domain in replication has not been ascribed.

FIGURE 1. Series of tagged α subunits containing deletions used to localize the ε-binding site. All proteins were purified by Ni2+-NTA chromatography as described under "Experimental Procedures" and contain only one biotinylated band, permitting immobilization of a homogeneous population of protein on a BIAcore chip as demonstrated previously for another set of proteins purified by a similar procedure (31). Biotin blots were performed as described under "Experimental Procedures." The migration position of protein standards is indicated on the left, and the identity of the tagged α derivative is listed above each lane.
ε Binds to the php Domain of α Plus a Short COOH-terminal Extension-To permit preliminary identification of the region within α required for binding the ε 3′→5′ proofreading exonuclease, we initially surveyed a group of existing tagged α deletions obtained from other studies in our laboratory. We observed that the C-tagged full-length α bound ε tightly (KD = 5 nM) (Fig. 2). Deletion of half of the COOH-terminal domain, removing sequences essential for binding τ and the non-essential COOH-terminal β site, did not significantly impair ε binding. Similarly, deletion of the entire COOH-terminal domain and of the core of the polymerase domain that contained the essential catalytic Asp residues had no effect (Fig. 2). These observations localized the binding site to the NH2 terminus.
We observed that deletion of either 60 residues or 240 residues from the NH2 terminus of the php domain abolished detectable ε binding. We know these ε binding deficiencies are not due to global folding problems in α because these same mutants bind β and τ with normal affinity (26, 27). Placing a tag on the NH2 terminus of full-length α did not change the binding affinity for ε (Fig. 2). To carefully assess sequences required for ε binding, we constructed α derivatives with more targeted deletions. We exploited the recently determined structure of YcdX, a protein of unknown function that has higher sequence similarity to the php domain of α than the histidinol phosphatases upon which the domain assignment was originally made (40). Alignments suggested that the end of the php domain may be in the vicinity of residue 255 of the E. coli α sequence or earlier (Fig. 3). Alignments of diverse Pol III α sequences suggest the beginning of the polymerase domain may start near residue 320 of E. coli α (data not shown). Thus, we constructed COOH-tagged proteins containing the predicted minimal php domain (255 residues; CΔ905 α) and this domain plus a COOH-terminal extension of sequence preceding the polymerase domain (320 residues; CΔ840 α). BIAcore binding experiments indicated that the shorter CΔ905 α protein failed to bind ε with detectable affinity and that the php protein containing a short COOH-terminal extension, CΔ840 α, bound ε with approximately the same affinity as full-length α (Fig. 2). We take this as an indication that all energetically important ε-binding sequences are found within the NH2-terminal 320 residues of α.
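The construct names map onto residue ranges by simple arithmetic on the 1160-residue full-length protein. The sketch below makes that bookkeeping explicit. It assumes that the number in each name is the count of residues removed from the indicated terminus (consistent with the 255- and 320-residue lengths quoted above) and ignores the added biotin-His6 tag; the function name is hypothetical.

```python
FULL_LENGTH = 1160  # residues in full-length E. coli alpha, from the text


def retained_residues(terminus: str, deleted: int,
                      full_length: int = FULL_LENGTH) -> tuple[int, int]:
    """Return (first, last) residue retained after a terminal deletion.

    terminus: "N" or "C"; deleted: number of residues removed from that end.
    The tag that replaces the deleted segment is not counted.
    """
    if terminus not in ("N", "C"):
        raise ValueError("terminus must be 'N' or 'C'")
    if not 0 <= deleted < full_length:
        raise ValueError("deletion must leave at least one residue")
    if terminus == "N":
        return deleted + 1, full_length
    return 1, full_length - deleted


constructs = [("N", 60), ("N", 240), ("C", 169), ("C", 794), ("C", 840), ("C", 905)]
for terminus, deleted in constructs:
    first, last = retained_residues(terminus, deleted)
    print(f"{terminus}-delta-{deleted}: residues {first}-{last} "
          f"({last - first + 1} residues)")
```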

Many of the Zn2+ Ligands Observed in the YcdX Structure Are Conserved in the php Domain of α-Alignment of YcdX and α php domain sequences from diverse organisms indicates sequence similarity around the ligands that chelate Zn2+ 1 and 2 within the YcdX structure (40) (Fig. 3). The sequences that bind Zn2+ 3 in YcdX are less conserved in the php domain, but residues that might serve this function were identified and are noted in Fig. 3. In principle, any nitrogen, oxygen, or sulfur can serve as a ligand for Zn2+. Typically, these roles are served by His, Glu, Asp, or Cys, but examples of Lys, Tyr, Ser, Thr, the amino terminus of a protein chain, and the carbonyl oxygen of a peptide backbone are known (41-43). A survey of all completely sequenced bacteria indicates absolute conservation of an acidic residue in the position corresponding to Asp43 within the E. coli α php domain. We examined the binding properties of several Asp43 point mutants and found that this residue is important in binding ε. D43A and D43E mutants bound ε with 19- and 16-fold reduced affinities, respectively. A D43N mutant bound ε only 3-fold more weakly than wild-type (Table 1).
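To put the fold changes above on an energy scale, a fold increase in K_D can be converted to an apparent loss of binding free energy with the standard relation ΔΔG = RT ln(fold). The sketch below is an illustrative conversion and not an analysis reported in the paper; it assumes the 25°C temperature of the binding measurements, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant, kcal mol^-1 K^-1
T_KELVIN = 298.15  # 25 degrees C, the temperature of the binding measurements


def ddg_from_fold(fold_weaker: float, temperature: float = T_KELVIN) -> float:
    """Apparent loss of binding free energy (kcal/mol) for a mutant whose
    K_D is `fold_weaker` times that of wild type: ddG = R * T * ln(fold)."""
    if fold_weaker <= 0:
        raise ValueError("fold change must be positive")
    return R_KCAL * temperature * math.log(fold_weaker)


# Fold reductions in affinity as reported in the text.
for mutant, fold in [("D43A", 19), ("D43E", 16), ("D43N", 3)]:
    print(f"{mutant}: {ddg_from_fold(fold):.2f} kcal/mol")
```

On this scale the D43A and D43E substitutions each cost roughly 1.6 to 1.7 kcal/mol and D43N about 0.65 kcal/mol, all small compared with the total binding energy implied by a 5 nM K_D.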
The Structure of the php Domain and the Polymerase Activity of α Are Coupled-Modification of the php domain influenced the activity of the adjacent polymerase domain. Deletion of 60 residues from the amino terminus of php destroyed both polymerase catalytic activity and ε binding (Table 1). Other mutations affected ε binding and polymerase activity differentially. For example, the D43A mutation severely reduced polymerase activity but decreased ε binding more modestly. The D43E mutation, which did not affect polymerase activity, diminished ε binding to an extent similar to that of D43A. A D43N mutation reduced both ε binding and polymerase activity to the same small extent.
The Exonuclease Domain of Gram(+) Pol C Subunits Is Inserted within the php Domain-The prototypical Gram(+) Pol III, Pol C, contains its proofreading exonuclease within the same subunit as α (44). Alignments of the Pol C sequences with the Pol III α php domain sequences and ε sequences show that the php domain is conserved within Pol C sequences, even though the proofreading exonuclease does not need a protein-binding site to secure it to the polymerase (Fig. 4). Pol Cs only contain the NH2-terminal exonuclease domain of the proofreading exonuclease and lack the COOH-terminal α-binding domain found in bacteria that use α as the catalytic subunit of their replicase. Interestingly, the Pol C exonuclease is located internal to php sequences (Fig. 4). Comparing the break in the Pol C php sequence (Fig. 4) with the sequences that follow the acidic ligand that chelates both Zn2+ 2 and 3 in the YcdX structure (Fig. 3) suggests that the insertion may occur in a surface loop in the YcdX structure (Fig. 5).

Sy splice site indicates a highly conserved region near the COOH terminus of the polymerase domain that also happens to be the site of trans-splicing in the homologous Synechocystis proteins (20). Other annotated features are described under "Results." Each α subunit contains a biotin-His6 tag (26) placed on the terminus from which a deletion was made. The position of the tag is indicated by an asterisk. The KD for ε binding to each α derivative is indicated and was determined using BIAcore as described under "Experimental Procedures."

DISCUSSION
Proofreading and elongation activities within cellular replicases must be highly coordinated to permit efficient error correction and to maintain chromosomal integrity without interfering with fast elongation rates. We plan to exploit the existence of the proofreading exonuclease of the E. coli Pol III holoenzyme as a separate subunit to establish the communication channels required for the coordination of these activities in a prototypical multisubunit replicase. As a first step, we report here that the ε subunit binds to the php domain of α immediately upstream from the polymerase domain. This location may permit placement of the exonuclease in a position to permit efficient transfer of mismatched termini from the polymerase. The conserved php domain may help provide a channel for this transfer. Evidence for this hypothesis is provided by the observation of stimulation of proofreading activity up to 80-fold upon binding of ε to α (22) and by the presence of a php domain in Pol C-type Pol IIIs from low GC Gram(+) organisms. Pol C subunits contain polymerase and exonuclease activity as part of the same polypeptide chain (44). Thus, their php domain is not required for binding a separate ε subunit. Therefore, conservation of a php domain and insertion of the proofreading activity within the php domain of Pol Cs suggest a function other than a simple tether for the proofreading domain. Comparison to the homologous YcdX protein, for which a structure has been determined, suggests the insertion occurs in a surface loop that should not significantly perturb the structure of the php domain.

For php sequences, only the region aligning with YcdX is shown. The YcdX and php alignments were performed separately and manually aligned to yield maximum correspondence between Zn2+ ligands obtained from the YcdX structure (40) and putative ligands within the php sequences. The entire E. coli YcdX sequence is shown. For E. coli α, residues 6-255 are shown. The numbers above the YcdX alignment indicate which of three Zn2+ atoms are bound within YcdX. The numbers above the php sequences indicate the possible corresponding ligands. The ligands that bind Zn2+ atoms 1 or 2 can be assigned with confidence and align when YcdX and DnaE php sequences are aligned simultaneously. The ligands for Zn2+ 3 are more speculative and are based on identification of the best conserved potential ligands present in php. The position where the php sequence is interrupted by insertion of ε within Pol C (taken from Fig. 5) and the proposed corresponding loop within the YcdX structure is indicated (marked as region of ε insertion). Experimental support for a loop within php was obtained by observing trypsin hypersensitivity after position 87 of Tth (cleavage at GK/GLD) (R. C. Pope and C. McHenry, unpublished observation).

A black background indicates residue identity; gray indicates similarity. Numbers above residues indicate the Zn2+ coordinated within YcdX and the putative corresponding residues in α php. D43 indicates the position of an absolutely conserved acidic residue that was mutated as part of this study. The position of ε-like sequence insertion into the php domain of Pol C is indicated. The position was obtained from an alignment of DnaE and Pol C php sequences (Fig. 4).
The php domain was initially identified by sequence similarity between conserved domains of Pol III subunits and histidinol phosphatase (39). It was noted that a number of conserved residues within this domain could serve as metal ligands. However, the conservation was found not to be absolute, varying at specific positions between species. It was suggested that the domain might provide an enzymatic activity, perhaps a pyrophosphatase helping to drive the energetics of the elongation reaction. More recently, the structure has been determined for a small 245-residue protein, YcdX, that is of unknown function but is more closely related, by sequence, to the php domain of Pol IIIs than histidinol phosphatase. Sequence alignments between Pol III α subunits and YcdX suggest the homologous region within α ends around residue 255. We observed that this predicted minimal php domain fails to bind ε. An additional stretch of sequence between the minimal php domain and the polymerase domain is required for binding comparable with that observed for full-length α. This additional sequence element may cooperate separately in ε binding or may be part of a larger php domain in Pol III α. Supporting the latter hypothesis, we observe that the longer 320-residue sequence is expressed at higher levels than the minimal 255-residue php domain. It also exhibits lower levels of degradation products in crude cellular extracts, which is consistent with a more compact structure (data not shown).

A php- or histidinol phosphatase-like domain has previously been observed at the COOH-terminal end of bacterial Pol βs (39, 45). These enzymes are from the X polymerase family and share related polymerase domains with eukaryotic Pols β, λ, and μ and terminal deoxynucleotidyl transferase. We performed a comprehensive psi-BLAST search and confirmed that all bacterial candidates contain a COOH-terminal sequence that aligns with α php (data not shown). This raises the possibility that bacterial Pol βs may also be able to bind cellular ε subunits, enabling a coupled proofreading reaction. Such a mechanism would be distinct from eukaryotic β polymerases, which lack a proofreading subunit and exhibit low fidelity (46). Alternatively, as suggested solely from bioinformatics and structure comparisons (39, 40), the php domain may exhibit an independent activity in these enzymes. However, caution in their investigation is warranted to ensure that any activity observed is not due to trace levels of bound cellular ε. Deletion or other php domain mutants may not provide useful controls, since the mutation could abolish ε binding rather than any putative intrinsic activity observed.

Sequence alignments using Pol III α php domains, Pol C sequences, and ε sequences from a diverse set of organisms were prepared as described in the legend to Fig. 3.
A functional co-dependence of the polymerase and exonuclease activities of Pol III α and ε subunits has been observed previously (22, 23). Here, we note a similar relationship between the polymerase activity and the php ε-binding domain of α. Deletion of the NH2-terminal 60 residues of the php domain of α abolishes both ε binding and polymerase activity. Mutations in the conserved Asp43 affect the two functions to different extents, indicating that the two functions are not absolutely linked. This suggests that the ability to bind ε, per se, is not influencing polymerase activity but that the activity of the polymerase domain is influenced by the conformation of the php domain. By microscopic reversibility, this would suggest that the conformation of php could be influenced by the conformation of the polymerase domain. This could create an allosteric mechanism whereby the status of the polymerase active site, modulated by the state of the primer terminus, could activate proofreading activity by altering the conformation of the php domain to facilitate transfer of the primer terminus to the ε active site. Clearly, these speculations require further investigation, but the existence of the polymerase and proofreading subunits of Pol III as separate subunits should make such studies much more amenable than would be possible in a single subunit enzyme. Having established the ε-binding domain of Pol III and a potential communication mechanism, the Pol III holoenzyme stands to serve as a prototype for revealing how cellular replicases efficiently process replication errors while maintaining the high elongation rate required for chromosomal replication.