Structure of the Escherichia coli DNA Polymerase III ϵ-HOT Proofreading Complex*

The ϵ subunit of Escherichia coli DNA polymerase III possesses 3′-exonucleolytic proofreading activity. Within the Pol III core, ϵ is tightly bound between the α subunit (DNA polymerase) and θ subunit. Here, we present the crystal structure of ϵ in complex with HOT, the bacteriophage P1-encoded homolog of θ, at 2.1 Å resolution. The ϵ-HOT interface is defined by two areas of contact: an interaction of the previously unstructured N terminus of HOT with an edge of the ϵ central β-sheet as well as interactions between HOT and the catalytically important helix α1-loop-helix α2 motif of ϵ. This structure provides insight into how HOT and, by implication, θ may stabilize the ϵ subunit, thus promoting efficient proofreading during chromosomal replication.

The precise mechanisms by which cells are able to duplicate their DNA with both high accuracy (<1 error/10¹⁰ bases replicated) and speed (up to 1000 nucleotides/s) are of major interest. Chromosomal replication is performed by multisubunit replicases that conduct the simultaneous, coordinated replication of the leading and lagging strands. Among the model systems currently being investigated, the best understood is that of the bacterium Escherichia coli (1,2). In this organism, chromosomal replication is performed by DNA polymerase III holoenzyme (HE), a dimeric complex containing 10 distinct subunits (17 total). Within the HE complex (αϵθ)₂β₄τ₂(γδδ′χψ), there are two polymerase core assemblies (αϵθ), one for each strand, which are the primary determinants of replication fidelity. Each core consists of the α subunit (the polymerase, Mr = 135,000), the ϵ subunit (a 3′→5′ exonuclease that acts as a proofreader for polymerase misinsertion errors, Mr = 28,500), and the θ subunit (Mr = 8,000), connected in the linear order α-ϵ-θ.
The ϵ subunit, encoded by the dnaQ gene, plays a critical role within the Pol III core, both catalytically and structurally. Many dnaQ mutants exhibit strong mutator phenotypes, whereas a fully catalytically deficient mutation causes lethality due to excessively high mutation rates (error catastrophe) (3). Deletion mutants of dnaQ have been generated, but they also proved to be nonviable unless accompanied by a suppressing mutation in the polymerase (4). Based on these studies, ϵ is thought to have at least two functions: a fidelity function and a structural function, due to its tight and presumably stabilizing interaction with the polymerase. Recently, a potential second proofreading activity has been discovered residing in the N-terminal PHP domain of the α subunit (5,6), which also contains the binding site for ϵ. Interaction between ϵ and α is dependent on the C-terminal domain of ϵ (residues 187-243) (7,8). In contrast, the N-terminal domain of ϵ (residues 1-186) contains the exonuclease active site and retains binding affinity for the θ subunit.
The θ subunit does not have any known enzymatic function; its role within the Pol III core is presumed to be structural, through its interaction with the ϵ subunit. Strains lacking θ (ΔholE mutants), although viable, are modest mutators (9), suggesting that θ plays a role in the fidelity of the Pol III core. E. coli strains carrying certain mutant ϵ subunits (dnaQ mutants) demonstrate a dramatic sensitivity to the presence of the θ subunit. An increase of 1000-fold in the mutability of a dnaQ49 (V96G) strain was observed in the absence of θ (9). The θ subunit also stimulates the exonuclease activity of the ϵ double mutant I170T/V215A when present in the Pol III core complex (10). This observation supports a role for θ in coordinating the α-ϵ polymerase-exonuclease interaction (10).
Structural information is critical to understanding the mechanism by which θ modulates exonuclease activity of the Pol III core. Both NMR (11) and crystallographic (12) structures of ϵ186 (residues 1-186), the catalytic domain of ϵ that binds θ, have been obtained previously, as has the NMR solution structure of θ (13,14). As part of these ongoing studies, we have also investigated the bacteriophage P1 HOT protein, a functional homolog of θ (15,16). The precise role of HOT is unclear, but it may assist in phage replication, which relies on the E. coli replication machinery (15). Genetic experiments show that HOT protein can readily substitute for θ. For example, HOT can fully reduce the extreme mutability of the dnaQ49 mutant (16). As the NMR solution structures of θ and HOT are essentially identical (13,17), and as the HOT protein has shown greater stability in vitro than θ, we have chosen to study the HOT-ϵ186 complex.
Based on the crystal structure presented here, HOT does not contribute residues to the active site of ϵ but rather appears to stabilize the enzyme through interactions clustered around two regions. In particular, the N terminus of HOT becomes ordered upon binding ϵ186, forming interactions with backbone atoms of an exposed β-strand of the central β-sheet. The structure of this inter-species complex provides insight into the basis for the accurate replication of the bacteriophage genome in E. coli and also serves as a model for the ϵ-θ complex in the native Pol III holoenzyme.

EXPERIMENTAL PROCEDURES
HOT and ϵ186 were expressed and purified as previously described (7,17). The complex of ϵ186 and HOT was prepared by mixing 2 volumes of a 3.8 mg/ml solution of ϵ186 with 1 volume of 3 M Tris·HCl, pH 7.0, and then adding 0.028 volumes of 62 mg/ml HOT, yielding a ratio of HOT:ϵ186 of 1.03:1. As noted previously, ϵ186 is stabilized in the presence of high Tris concentrations (11).
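As a rough illustration of the mixing step, the sketch below computes the concentrations immediately after mixing from the volumes and stock concentrations quoted above. It assumes additive volumes and ignores any Tris already present in the protein stocks; the function name `mix` and the dictionary keys are hypothetical labels chosen for this example, not part of the published protocol.

```python
def mix(components):
    """Combine solutions given as (volume, {solute: concentration}) tuples.

    Volumes are in arbitrary but consistent units and are assumed to be
    additive. Returns (total_volume, {solute: concentration after mixing}).
    """
    total = sum(volume for volume, _ in components)
    amounts = {}
    for volume, solutes in components:
        for name, conc in solutes.items():
            amounts[name] = amounts.get(name, 0.0) + volume * conc
    return total, {name: amount / total for name, amount in amounts.items()}


# Volumes and stock concentrations as stated in the text.
total_volume, final = mix([
    (2.0, {"eps186_mg_per_ml": 3.8}),    # 2 volumes of 3.8 mg/ml epsilon186
    (1.0, {"tris_M": 3.0}),              # 1 volume of 3 M Tris-HCl, pH 7.0
    (0.028, {"hot_mg_per_ml": 62.0}),    # 0.028 volumes of 62 mg/ml HOT
])

for name, conc in sorted(final.items()):
    print(f"{name}: {conc:.3f}")
```

Under these assumptions the mixture is close to 1 M in Tris, consistent with the remark that ϵ186 is stabilized at high Tris concentrations.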
The mixture was concentrated to ~30 mg/ml using a Centriprep YM-3 filter unit (Millipore), and the HOT-ϵ186 complex was separated from the excess HOT by gel filtration chromatography using a 2.6 × 60-cm Superdex 75 column (Amersham Biosciences) that was eluted at 0.4 ml/min with 25 mM Tris·HCl buffer at pH 7.5, containing 100 mM NaCl. Fractions containing the complex were concentrated to 24 mg/ml using a Centriprep YM-3 filter unit. Protein concentrations were determined spectrophotometrically using extinction coefficients at 280 nm of 11544 M⁻¹ cm⁻¹ for HOT, 7292 M⁻¹ cm⁻¹ for ϵ186, and 18836 M⁻¹ cm⁻¹ for the ϵ186-HOT complex.
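The spectrophotometric determination follows the Beer-Lambert law, A = ϵ·c·l (here ϵ is the molar extinction coefficient, not the subunit). A minimal sketch using the coefficients quoted above is given below; the 1-cm path length default and the names `EXTINCTION_280` and `molar_concentration` are assumptions made for illustration.

```python
# Molar extinction coefficients at 280 nm (M^-1 cm^-1), as quoted in the text.
EXTINCTION_280 = {
    "HOT": 11544.0,
    "eps186": 7292.0,
    "complex": 18836.0,
}


def molar_concentration(a280, species, path_cm=1.0):
    """Return the molar concentration from an A280 reading (Beer-Lambert law).

    path_cm is the cuvette path length in cm; 1 cm is an assumed default.
    """
    return a280 / (EXTINCTION_280[species] * path_cm)


# Example: an A280 reading of 0.50 for the complex in a 1-cm cuvette.
print(f"{molar_concentration(0.50, 'complex') * 1e6:.1f} uM")
```

The coefficient quoted for the complex equals the sum of the two component values (11544 + 7292 = 18836), as expected for a 1:1 complex if the chromophores are assumed to be unperturbed by binding.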
Crystals of the ϵ186-HOT complex were obtained at 4°C by mixing, in a sitting drop tray, 1 μl of a 24 mg/ml complex in 25 mM Tris, pH 7.5, 100 mM NaCl, 5 mM MnSO4, and 5 mM TMP with 1 μl of the reservoir solution consisting of 0.1 M Tris·HCl, pH 8.0, and 22% polyethylene glycol 6000. Diffraction quality crystals were obtained by streak seeding. After 10 days, crystals reached their maximum size of ~0.2 × 0.2 × 0.2 mm. Crystals were harvested and transferred to a stabilization solution consisting of 0.1 M Tris, pH 7.5, 23% polyethylene glycol 6000, 100 mM NaCl, 5 mM MnSO4, and 5 mM TMP. Crystals were then transferred in four steps from the stabilization solution to the cryosolution consisting of 0.1 M Tris, pH 7.5, 25% polyethylene glycol 6000, 15% ethylene glycol, 200 mM NaCl, 5 mM MnSO4, and 5 mM TMP. Crystals were mounted in a loop, flash frozen in liquid nitrogen, and placed on the goniometer in a stream of nitrogen gas cooled to 93 K. Data were collected using a Rigaku 007HF Micromax generator equipped with VariMax HF mirrors and a Saturn 92 CCD detector. Diffraction data were collected at 2.1 Å resolution and processed using HKL2000 (18). To obtain phases, the ϵ186 model (Protein Data Bank accession code 1J53) was used for the molecular replacement using MolRep (19) in the CCP4i package (20). Two molecules of ϵ186 were found in the asymmetric unit. The two main helices of HOT from the Protein Data Bank coordinates 1SE7 were placed separately into the electron density manually using the pro- Structural alignments and r.m.s. deviations were done using Pymol (23). Figures were made using Pymol (23) or using Molscript (24) and Raster3D (25).
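For readers unfamiliar with the r.m.s. deviation figures quoted in this paper, the sketch below shows how the quantity is defined for two equal-length lists of matched α-carbon coordinates. It is only an illustration: it assumes the two structures have already been superimposed and paired atom by atom, and it does not perform the optimal fitting step that Pymol carries out. The function name `rmsd` and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between matched (x, y, z) coordinates in Å.

    The two lists must be the same length and already superimposed;
    no rotation or translation is applied here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy example: two atoms, each displaced by 0.4 Å along z.
model_1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
model_2 = [(0.0, 0.0, 0.4), (3.8, 0.0, 0.4)]
print(f"{rmsd(model_1, model_2):.2f}")
```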

RESULTS AND DISCUSSION
To better understand the role of θ in stabilizing the ϵ proofreading subunit, we have determined the crystal structure of the ϵ186-HOT complex in the presence of TMP, a product and competitive inhibitor of the exonuclease, and MnSO4 (Table 1). These crystals contain two ϵ186-HOT dimers in the asymmetric unit and diffract to a resolution of 2.1 Å. Each individual complex contains one molecule of ϵ186 and one molecule of HOT, in addition to one TMP molecule and two Mn²⁺ ions in the ϵ exonuclease active site. In the complex, ϵ186 adopts the familiar α/β structure of proofreading domains with the five-stranded twisted β-sheet surrounded by seven α-helices (Fig. 1A) (12). HOT in the complex is characterized by a three-helix bundle, as in solution (17), with an extensive hydrophobic core defined by residues on the three α-helices (Fig. 1A). The two ϵ186-HOT complexes in the asymmetric unit are very similar; the 156 α-carbon atoms of the two ϵ186 molecules (listed as molecules A and C) superimpose with a r.m.s.d. of 0.4 Å, and 73 α-carbons of the two HOT molecules (listed as molecules B and D) superimpose with a r.m.s.d. of 0.4 Å as well. However, one area of discrepancy arises for the ϵ186 loop immediately preceding α7 ϵ, a region of considerable interest because it includes the catalytically important H162 ϵ residue (12). In one molecule (molecule C), this loop is located away from the ϵ active site as residues Lys158-Leu161 form lattice contacts with residues Glu153-Asp155 from the other ϵ186 molecule (molecule A). Residues Lys158-Thr160 of molecule A are mostly disordered. This is shown in Fig. 2A, a superposition of the crystal structures of the two molecules of ϵ186 in the asymmetric unit of the ϵ186-HOT complex.
Intersubunit contacts occur in two separate areas of the complex, as shown in some detail in Fig. 1, B and C. First, a series of N-terminal HOT residues interact with ϵ186 strand β3 at the edge of the central β-sheet (Fig. 1B). These interactions include (i) hydrogen bonds from the backbone amides of W4 HOT and N5 HOT to the carboxylate group of E85 ϵ; (ii) a pair of hydrogen bonds connecting the N5 HOT amide side chain with the N47 ϵ carbonyl oxygen and the H49 ϵ backbone amide; (iii) the hydrogen bond from the backbone amide of I6 HOT to the backbone carbonyl of H49 ϵ; and (iv) a hydrophobic interaction of W4 HOT with P77 ϵ. There are also extensive water-mediated interactions between HOT residues of the C-terminal portion of helix α1, as well as the surrounding residues, with residues from the helix α1-loop-helix α2 region of ϵ186 (Fig. 1C). Additionally, there are hydrogen bonds between the R31 HOT, D70 ϵ, and E71 ϵ side chains, and a stacking interaction of Y28 HOT with F63 ϵ. This latter interaction is facilitated by the insertion of the F63 ϵ phenyl ring into a "notch" (26) in HOT helix α1 created by the lack of a side chain at G25 HOT. Fig. 3 illustrates this stacking interaction with a stereo view of electron density in the region.
The overall structure of ϵ186 in the complex is very similar to the previously determined structure of the TMP-complexed proofreading catalytic domain (12) obtained in the absence of θ or HOT, as shown in Fig. 2B. The α-carbon atoms of ϵ186 molecule C superimpose with the reported structures (Protein Data Bank codes 1J53, 1J54) with a r.m.s.d. of 0.4 Å for 162 atoms and 0.4 Å for 161 atoms, respectively. In ϵ186 molecule A from the ϵ186-HOT complex, the proposed general base His162 and other atoms in the active site superimpose well with the ϵ186 structures in complex with TMP only. While one ϵ186 molecule (molecule A) in our structure appears to be in a catalytically relevant structure based on the position of His162, the other (molecule C) does not. Although the conformation of the loop in molecule C is stabilized by lattice contacts and a possible hydrogen bond to N18 HOT, the inherent flexibility of this loop may be related to DNA substrate binding or regulation of catalysis.
The central core three-helical bundle in HOT (residues 11-62) is structurally similar to that of the solution NMR structure (r.m.s.d. = 3.2 Å over 52 α-carbon atoms; Fig. 4A). The two HOT structures superpose fairly well along the two long α-helices (α1 and α3) but not in the N- and C-terminal regions and in the area of helix α2 (Asn33 to Gln45) and the loop preceding α2. Importantly, a novel additional helical structure is seen at the N terminus of HOT in the complex that was not observed in solution (17). The extensive contacts in this region between HOT and ϵ186 indicate that this structure is probably strongly stabilized, if not induced, by the interaction between the two proteins (Fig. 1, B and C; Table 2). The published NMR structure of uncomplexed θ (13) was also aligned with the HOT crystal structure (Fig. 4B), and again the termini deviate somewhat but the region of α2 aligns considerably better than that of HOT (r.m.s.d. of 1.3 Å for 49 α-carbon atoms in the region of residues 11-62 of HOT). Interestingly, reevaluation of the solution structure of HOT using the original nuclear Overhauser effect data and the program CYANA, version 2.1 (27,28), resulted in significantly better agreement with the crystal structure and suggests a dimeric structure for HOT at the concentration used in the NMR study.
FIGURE 3. Stereo view of the electron density at the complex interface. Residue F63 ϵ is seen in a ring-stacking arrangement with Y28 HOT and occupies a "notch" in helix α1 of HOT left by the lack of a side chain from G25 HOT. ϵ is shown in blue, and HOT is shown in orange. Electron density map shown is a simulated annealing Fo − Fc omit map contoured at 3σ.
A recent NMR structure of θ in complex with ϵ186 was determined by labeling θ and not ϵ186. This work predicted the relative orientation and position of θ with respect to ϵ186 based on the observations of shift perturbations resulting from lanthanide ions bound to the active site of ϵ186 (14). Because of the nature of the NMR experiment, however, the structure lacked data on the ϵ residues of the ϵ186-θ interface as well as detailed structural information for the C-terminal and N-terminal residues, which in the ϵ186-HOT complex appear to be critical for stable dimer formation. The published NMR structure of θ in complex with ϵ186 (14) was aligned with the HOT crystal structure from the ϵ186-HOT complex (Fig. 5B), resulting in a r.m.s.d. of 1.7 Å for 48 α-carbon atoms of HOT (residues 11-62). Information as to how θ interacts with ϵ can be gained from the crystal structure of the ϵ186-HOT complex presented here. HOT and θ share 52% sequence identity and 61% similarity (Fig. 5A). Sequence alignments reveal that all the potential hydrogen bonding interactions between HOT and ϵ186 listed in Table 2 are likely to be conserved in the ϵ-θ complex with the exception of that formed by S64 HOT. In particular, structural and sequence alignment of N5 HOT with N4 θ (Fig. 5A) allows preservation of the set of hydrogen bonding interactions shown in Fig. 1B. Additionally, hydrophobic interactions in the complex are also likely to be conserved, such as Y28 HOT (F27 θ) stacking with F63 ϵ as well as the aliphatic side chain of K3 θ substituting for the hydrophobic interaction of W4 HOT with P77 ϵ.
The similarities in these important areas suggest that the ϵ186-HOT complex is a good model for the ϵ-θ complex. The structure of the ϵ186-HOT complex provides significant insight into the stabilizing role of HOT/θ on ϵ. First, formation of the HOT-ϵ186 complex removes 11% of the (mostly hydrophobic) ϵ186 surface area from solvent exposure. As ϵ is an intrinsically unstable protein, subject to proteolysis in vivo (29) and aggregation and precipitation in vitro (11,30), diminution of its hydrophobic surface is likely to be highly beneficial. Second, the interaction of the N-terminal HOT residues with ϵ186 strand β3 at the edge of the β-sheet must be considered a major stabilization factor, as unprotected edges of β-sheets generally represent adventitious interaction sites (31). As one example, the TREX2 nuclease, structurally very similar to ϵ186 (32), overcomes the negative effects of the corresponding β-sheet edge by using this edge as a dimerization interface. Third, the interaction with HOT likely leads to specific stabilization of the ϵ186 (α1-α2) helix-loop-helix. This region is particularly important because it contains pertinent residues, e.g. His66, which line the substrate binding pocket (11,12); the dnaQ923 mutant (H66Y) is a proofreading-impaired mutant whose mutator phenotype is greatly exacerbated by the lack of θ or HOT (9,16).
FIGURE 5. A, sequence alignment of HOT with θ showing conserved residues (red) and HOT residues involved in direct interactions with ϵ186 (cyan). Among the eight interacting residues, W4 HOT and S24 HOT represent non-conservative substitutions. B, superposition of NMR structure of θ (cyan) determined in complex with ϵ186 (14) and crystal structure of HOT (orange) complexed with ϵ186 (blue).
The present structure may provide a rationalization for several other dnaQ mutants with impaired proofreading capacity whose activity is negatively affected by the lack of θ or HOT (9,16). Many of the mutants (Fig. 6) involve residues located on the ϵ central β-sheet (Thr16, Arg56, Val96) or residues that interact directly with this sheet (Leu171). It is likely that destabilization of the central sheet presents a situation where stabilization of the outside edge by θ or HOT becomes increasingly critical. In addition, destabilization of ϵ by mutations of residues on helix α1 (His66) or of residues that interact with α1 (Gly17, Arg56) is also reversed/limited in the presence of θ/HOT, consistent with the proposed stabilization of this region by HOT. Finally, a notable aspect of the current structure is that it presents an example of a naturally occurring interspecies complex; as far as we know, it is the first such complex reported for a DNA replication assembly. Obviously, one interesting question is why bacteriophage P1 carries a homolog of θ while depending otherwise completely on the host replication machinery, as it carries no homolog for any other HE subunit (15). Data from our laboratory have shown that the HOT gene product is expressed from the phage genome during both lytic and lysogenic stages and that HOT and θ compete for incorporation into the HE. In view of the intrinsic instability of ϵ, it is possible that the amount of cellular θ is rate-limiting for incorporation of ϵ (or ϵ-θ) into the core.
Thus, increased expression of the θ homolog from the phage may assure sufficient HE for phage replication.