A Comparison of Eubacterial and Archaeal Structure-specific 5′-Exonucleases*

The 5′-exonuclease domains of the DNA polymerase I proteins of Eubacteria and the FEN1 proteins of Eukarya and Archaea are members of a family of structure-specific 5′-exonucleases with similar function but limited sequence similarity. Their physiological role is to remove the displaced 5′ strands created by DNA polymerase during displacement synthesis, thereby creating a substrate for DNA ligase. In this paper, we define the substrate requirements for the 5′-exonuclease enzymes from Thermus aquaticus, Thermus thermophilus, Archaeoglobus fulgidus, Pyrococcus furiosus, Methanococcus jannaschii, and Methanobacterium thermoautotrophicum. The optimal substrate of these enzymes resembles DNA undergoing strand displacement synthesis and consists of a bifurcated downstream duplex with a directly abutted upstream duplex that overlaps the downstream duplex by one base pair. That single base of overlap causes the enzymes to leave a nick after cleavage and to cleave several orders of magnitude faster than a substrate that lacks overlap. The downstream duplex needs to be 10 base pairs long or greater for most of the enzymes to cut efficiently. The upstream duplex needs to be only 2 or 3 base pairs long for most enzymes, and there appears to be interaction with the last base of the primer strand. Overall, the enzymes display very similar substrate specificities, despite their limited level of sequence similarity.

The 5′ nuclease domains of DNA polymerase I from Escherichia coli and Thermus aquaticus were the first extensively characterized members of a large class of structure-specific 5′-exonucleases (1,2). Initially it was proposed that these enzymes work as true exonucleases removing predominantly mono- or dinucleotides from the 5′ end of double-stranded DNA (3). More detailed studies have shown that 5′ nucleases of this type specifically recognize bifurcated ends of double-stranded regions and remove single-stranded 5′ arms by cutting the phosphodiester bond after the first base pair of the duplex, leaving a 3′ hydroxyl end (2). A mammalian enzyme with functional similarity to the 5′-exonuclease domain of E. coli polymerase I was isolated nearly 30 years ago (4). Later, additional members of this group of enzymes called flap endonucleases (FEN1) from Eukarya and Archaea were shown to possess a nearly identical structure-specific activity (5-8), although they have limited sequence similarity to the bacterial 5′-exonuclease proteins.
The substrate specificities of the FEN1 enzymes and the eubacterial and related bacteriophage enzymes have been examined and found to be similar for all enzymes (2, 5, 6, 8-11). The minimal requirement for cleavage is a bifurcated duplex with a free 5′ end. The presence of an upstream primer that directly abuts the downstream strand stimulates cleavage, but its precise effect on the site of cleavage remains unclear. In the majority of studies that were done with the flap substrate described in Harrington et al. (5), the enzymes leave predominantly a 1-nucleotide gap or 1-nucleotide overlap between the upstream primer and cleaved downstream DNA strand (5, 8, 10-13). When this substrate is modified to contain G-C base pairs at the cleavage site, human FEN1 and Methanococcus jannaschii FEN1 change the cleavage site, and cleavage by Archaeoglobus fulgidus and Pyrococcus furiosus FEN1s is very poor (8). These results clearly demonstrate enzyme- and sequence-specific cleavage of the flap substrate, despite the classification of the 5′-exonucleases as structure-specific enzymes.
The heterogeneity in cleavage position exhibited by the FEN1 enzymes seems inconsistent with their proposed role in DNA replication. The role of human FEN1 in DNA replication (14,15) has been examined, and FEN1 has been found to be necessary to complete Okazaki fragment processing in vitro. But these studies cannot answer the question of whether it is the FEN1 or the DNA polymerase that generates ligatable nicks, because both enzymes are required, along with RNase H and DNA ligase, to process Okazaki fragments in vitro.
The 5′ nuclease activity of E. coli DNA polymerase I is essential for both the synthesis and the repair of DNA (3). An early study reported that cleavage of displaced strands by E. coli DNA polymerase I creates a nick between the upstream and downstream strands (1). However, this result was contradicted by a study (13) showing that the cleavage of the flap substrate (5) by E. coli DNA polymerase I leaves a 1-nucleotide gap that would have to be filled by polymerase before ligation could occur.
In this paper we present analyses of the substrate requirements for cleavage by seven structure-specific 5′-exonuclease enzymes from Archaea and Eubacteria, which we purified in order to define clearly the substrate requirements at the site of cleavage. In particular, we focus on the consequences of changes in the 3′ end of the upstream primer strand. The minimal substrate for cleavage is a bifurcated double-stranded DNA, but cleavage is greatly stimulated by an upstream primer annealed adjacent to the bifurcation. In addition, efficient cleavage requires at least one base of overlap between the two duplexes. Such an overlap increases the cleavage rate several orders of magnitude over the rate reported for these enzymes using a nonoverlapping flap substrate (8). Cleavage by most enzymes is reduced by modifications of the 3′ end of the upstream primer, implying recognition of the end of the upstream primer duplex. All enzymes leave a ligatable nick upon cleavage, contrary to published data on identical and related 5′-exonucleases. These data demonstrate that the 5′ structure-specific nucleases are able to carry out the final step of DNA replication prior to ligation.
The amplified genes were cloned into the expression vector pTrc99a (Amersham Pharmacia Biotech) by standard techniques. Six-amino acid histidine tags were added onto the carboxyl termini of all enzymes by site-directed mutagenesis (TaqPol, TthPol, TaqExo, and PfuFEN) or by including the His tag sequence in the oligonucleotide used for PCR (Mja, Mth, and Afu FEN1 genes). A conserved aspartic acid at position 785 of TaqPol and position 787 of TthPol was mutated to an asparagine to create polymerase-deficient versions of the enzymes used in this study.

Materials
For expression, plasmids were transformed into the E. coli strain BL21 (Novagen), which is deficient in the lon and ompT proteases. Log phase cultures of BL21 were induced with 0.5 mM isopropyl-1-thio-β-D-galactopyranoside (Promega) and grown for an additional 16 h prior to harvest. Enzymes were purified as described (17,31) using His Bind resin metal chelation chromatography (Novagen) as a final step. Enzyme concentration was determined by measuring absorption at 279 nm as described (18). All enzymes were dialyzed and stored in 50% glycerol, 20 mM Tris-HCl, pH 8, 50 mM KCl, 0.5% Tween 20, 0.5% Nonidet P-40, 100 μg/ml bovine serum albumin.
Substrate Preparation-All oligonucleotide substrates were synthesized on a PerSeptive Biosystems instrument using standard phosphoramidite chemistries (Glen Research). The oligonucleotide with the d-spacer modification at the 3′ end was synthesized using 3′-phosphate CPG (Glen Research) followed by phosphate removal using calf intestinal alkaline phosphatase (Promega). The oligonucleotides were purified by separating the primary synthesis products on a 20% denaturing polyacrylamide gel and by excision and elution of the major band. The oligonucleotides labeled on their 5′ ends with 5′-fluorescein (6-FAM, Glen Research) were further purified by reverse phase HPLC using a Dionex DX 500 instrument and a Microsorb-MV C-18 column (Rainin).
Activity Assays-Unless otherwise indicated, 10-μl reactions contained 10 mM MOPS, pH 7.5, 0.05% Tween 20, 0.05% Nonidet P-40, 10 μg/ml tRNA, and 200 mM KCl for TaqPol and TthPol or 50 mM KCl for all other enzymes. Reactions with the hairpin substrates contained no KCl and 4 mM MnCl2. Substrates (2 μM) and varying amounts of enzyme were mixed with the reaction buffer indicated above and overlaid with Chill-out (MJ Research) liquid wax. Substrates were heat denatured at 90°C for 20 s and cooled to 50°C, then reactions were started by addition of MgCl2 or MnCl2 and incubated at 50°C for the specified length of time. Reactions were stopped by the addition of 10 μl of 95% formamide containing 10 mM EDTA and 0.02% methyl violet (Sigma). Samples were heated to 90°C for 1 min immediately before electrophoresis on a 20% denaturing acrylamide gel (19:1 cross-linked), with 7 M urea, and in a buffer of 45 mM Tris borate, pH 8.3, 1.4 mM EDTA. Unless otherwise indicated, 1 μl of each stopped reaction was loaded per lane. Gels were then scanned on an FMBIO-100 fluorescent gel scanner (Hitachi) using a 505-nm filter. The fraction of cleaved product was determined from intensities of bands corresponding to uncut and cut substrate with FMBIO Analysis software (version 6.0, Hitachi). The fraction of cut product did not exceed 20% to ensure that measurements approximated initial cleavage rates. The cleavage rate was defined as the concentration of cut product divided by the enzyme concentration and the time of the reaction (in minutes). For each enzyme three data points were used to determine the rate and experimental error.
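The rate definition above can be illustrated with a short calculation. This is a minimal sketch and not the analysis software used in the study: the function names, the example cut fractions, and the 5-min reaction time are hypothetical, and the use of a mean and sample standard deviation for the three data points is an assumption. The 2 μM substrate concentration, the 0.35 nM enzyme concentration, and the 20% ceiling on cut product are taken from the text.

```python
from statistics import mean, stdev

MAX_FRACTION_CUT = 0.20  # from the text: cut product kept at or below 20%


def cleavage_rate(fraction_cut, substrate_nM, enzyme_nM, minutes):
    """Return cleavages per enzyme molecule per minute.

    rate = [cut product] / ([enzyme] * time), where [cut product] is taken
    as fraction_cut * [substrate]. Concentrations are in nM.
    """
    if not 0.0 <= fraction_cut <= MAX_FRACTION_CUT:
        raise ValueError("fraction cut must lie between 0 and 0.20")
    if enzyme_nM <= 0 or minutes <= 0:
        raise ValueError("enzyme concentration and time must be positive")
    cut_product_nM = fraction_cut * substrate_nM
    return cut_product_nM / (enzyme_nM * minutes)


def summarize(rates):
    """Mean and sample standard deviation of replicate rates (assumed error model)."""
    return mean(rates), stdev(rates)


# Hypothetical triplicate: 2 uM (2000 nM) substrate, 0.35 nM enzyme, 5 min.
replicates = [cleavage_rate(f, 2000.0, 0.35, 5.0) for f in (0.09, 0.10, 0.11)]
avg, err = summarize(replicates)
print(f"{avg:.1f} +/- {err:.1f} cleavages/min/enzyme")
```

The guard on the cut fraction mirrors the stated requirement that measurements approximate initial rates; inputs above 20% are rejected rather than silently converted.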
Ligation Experiments-The ligation experiment was done using three separate oligonucleotides that anneal to form an overlapping flap substrate. The oligonucleotides have the following sequences: template oligonucleotide, 5′-GAAAGCGAGACAGCGAAAGACGCTCGTGAA; upstream primer oligonucleotide, 5′-ACGAGCGTCTTTC; and downstream oligonucleotide, 5′-AAACGCTGTCTCGCT. The downstream oligonucleotide and the mock product oligonucleotide (5′-ACGAGCGTCTTTCGCTGTCTCGCT) were labeled with fluorescein at the 3′ ends (Glen Research). All oligonucleotides were gel purified as described above. 10-μl reactions contained 10 mM MOPS, pH 7.5, 2 mM MgCl2, 20 μg/ml tRNA, 1 mM ATP, DNA substrate, and 5′ nuclease and were incubated for 5 min at 50°C, after which the temperature was shifted to 23°C, and 1 unit of T4 DNA ligase (Promega) was added and incubated for an additional 15 min. The reactions were stopped, and the samples were analyzed as described above except that samples were electrophoresed on a 15% polyacrylamide gel.
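The design of the ligation substrate can be checked directly from the sequences listed above. The sketch below is only an illustration: the helper names are hypothetical, and the 4-nucleotide length of the released 5′ arm is an assumption inferred from the 11-nucleotide cleaved product reported for this substrate. It confirms that joining the upstream primer to the cleaved downstream strand reproduces the 24-nucleotide mock product and that this product is fully complementary to the template.

```python
# Oligonucleotide sequences as given in the text (5' to 3').
TEMPLATE = "GAAAGCGAGACAGCGAAAGACGCTCGTGAA"
UPSTREAM = "ACGAGCGTCTTTC"
DOWNSTREAM = "AAACGCTGTCTCGCT"
MOCK_PRODUCT = "ACGAGCGTCTTTCGCTGTCTCGCT"

# Assumption: a 4-nt 5' arm is released, leaving the 11-nt product reported.
FLAP_LENGTH = 4

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def ligate_after_cleavage(upstream, downstream, flap_length):
    """Join the upstream primer to the downstream strand minus its 5' arm."""
    return upstream + downstream[flap_length:]


product = ligate_after_cleavage(UPSTREAM, DOWNSTREAM, FLAP_LENGTH)
template_partner = reverse_complement(TEMPLATE)

# The 3' base of the upstream primer and the last base of the released arm
# are the same nucleotide competing for one template position (1-base overlap).
one_base_overlap = UPSTREAM[-1] == DOWNSTREAM[FLAP_LENGTH - 1]

print(len(DOWNSTREAM) - FLAP_LENGTH, len(product))  # lengths of cleaved and ligated strands
print(product == MOCK_PRODUCT)
print(product in template_partner)
print(one_base_overlap)
```

With the sequences as printed, the cleaved strand is 11 nucleotides, the ligated strand is 24 nucleotides, and all three checks evaluate to true.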

Cloning and Purification of Seven Structure-specific 5′ Nucleases-Four archaeal FEN1 enzymes from A. fulgidus (AfuFEN), P. furiosus (PfuFEN), M. jannaschii (MjaFEN), and M. thermoautotrophicum (MthFEN), two eubacterial polymerase I enzymes from T. aquaticus (TaqPol) and T. thermophilus (TthPol), and the 5′ nuclease domain of TaqPol (TaqExo) (31) were cloned, expressed in E. coli, and extensively purified to study their substrate specificity (Fig. 1). All enzymes used in this study have six-histidine tags on their carboxyl termini to facilitate purification. To determine whether this modification affects enzyme specificity, five of the enzymes studied here (all except TaqExo and MthFEN) were cloned without the histidine tags and were purified by heparin affinity chromatography to a similar level of purity. These enzymes have no measurable differences in specificity compared with their His tag-containing counterparts (data not shown). TaqPol and TthPol each contain a single amino acid substitution of aspartic acid to asparagine (D785N for TaqPol and D787N for TthPol) in their polymerase domain, which eliminates polymerization activity; the equivalent mutation in E. coli DNA polymerase I has been shown to have no effect on DNA binding (19) and also has no effect on the cleavage rate or substrate specificity of TaqPol or TthPol (data not shown).
Optimal Substrate for the Structure-specific 5′ Nucleases-The presence of an upstream primer is known to be important in stimulating cleavage of 5′ arm-containing substrates by the structure-specific nucleases (2). Furthermore, overlap between the upstream and downstream duplexes stimulates cleavage of some nucleases even further (17,20). The identity of the 3′ end of the upstream primer may also play a role in the stimulation of cleavage (31). To investigate the role of the 3′-terminal nucleotide of the upstream primer in substrate recognition, we created substrates that differed in the identity at that position. In all substrates, the downstream oligonucleotide was labeled with fluorescein at its 5′ end and connected to the template strand via the exceptionally stable GAA hairpin loop (21). Quantitative HPLC analysis of the products of cleavage reactions indicates that the presence of fluorescein at the 5′ end of substrates slightly increases the cleavage rate compared with unlabeled substrates for the enzymes used in this study (data not shown). The 3′ arm of the hairpin is free to anneal to an upstream primer to form the flap structures shown in Fig. 2A. The upstream primer had dA, dC, dG, dT, dC with a 3′ phosphate, dideoxy C, or a d-spacer mimicking a deoxyribose sugar moiety at its 3′ end to create an overlap with the downstream duplex. We refer to the duplexes formed by the template strand and the downstream or upstream primers as downstream or upstream duplexes, respectively. We also created a shorter version of the upstream primer that forms an upstream duplex that abuts the downstream duplex to form a flap substrate that lacks overlap between the two duplexes, like those previously described in the literature (5).
Reactions with all seven enzymes were performed in the presence of an excess of substrate (2 μM) over enzyme (0.35 nM for TaqExo and FENs and 2.8 nM for TaqPol and TthPol). These substrate concentrations are much higher than the Km value for all studied substrates (data not shown), so we assume that the cleavage rates measured as described under "Experimental Procedures" are close to Vmax for each enzyme. For all substrate-enzyme combinations, the one major product observed corresponded to the product generated by cleavage after the first base pair of the downstream duplex. Cleavage generated five-nucleotide fragments, as shown in Fig. 2B for AfuFEN and TthPol. Release of a five-nucleotide arm should create a nick between the upstream and downstream primers rather than a gap or an overlap. Cleavage rates for all enzymes are summarized in Table I. For all enzymes a natural base at the 3′ end of the upstream primer supported the highest rate of cleavage (Fig. 2B, lanes 5-8 and 17-20). The archaeal FEN1 enzymes used all four natural bases with approximately equal efficiency, but the cleavage rates of the eubacterial enzymes were clearly dependent on the nature of the 3′-terminal nucleotide. For TaqPol and TaqExo, dT inhibited cleavage, whereas dA supported the highest level of cleavage among the natural bases; for TthPol, dA and dG inhibited cleavage compared with dT and dC. A 3′ phosphate or d-spacer group largely eliminated cleavage for all enzymes. The dideoxy C greatly inhibited the activity of all archaeal enzymes, whereas it reduced the level of cleavage for the eubacterial enzymes by only 10-30% relative to a substrate with dC at the end of the primer strand (Table I). Under these conditions, no cleavage was observed in the absence of an upstream primer (Fig. 2B, lanes 3 and 15).
For all enzymes the substrates with a nonoverlapping flap had a lower rate of cleavage than the substrate with an overlapping natural 3′ nucleotide. Surprisingly, the observed cleavage rate was higher than that reported for cleavage of a similar nonoverlapping type of substrate (17). We hypothesize, as previously suggested (31), that alternative flap structures can be generated to produce substrates with an overlap. In particular, one of the three consecutive As of the template strand could be bulged out to create overlapping structures by slippage (Fig. 2C). To test this hypothesis the sequence of the template strand was modified to prevent the potential slippage by substituting a G-C for an A-T in the middle of the AAA tract as shown in Fig. 3A, and the overlapping and nonoverlapping flap substrates were incubated with each of the seven enzymes in reaction conditions identical to those in Fig. 2B. As shown in Fig. 3B, cleavage of the nonoverlapping flap substrate was almost undetectable over the background level (no enzyme control) for all enzymes except AfuFEN, which had only 5% of the activity observed for the overlapping substrate (Fig. 3B, lanes 9 and 13). We conclude that overlap is required for efficient cleavage and that slippage can occur to create overlaps. It is unclear whether this slippage is induced by the enzyme itself.
Overlapping and Hairpin Substrates and the Effect of Mg and Mn Ions on Enzyme Activity-After establishing the key features of substrate recognition we designed two substrates to further investigate the specificity of the seven nucleases. The first (overlapping flap substrate, Fig. 4A) has upstream and downstream strands connected to the template strand by two GAA loops to form a "dumbbell" structure. The upstream and downstream duplexes overlap by 3 base pairs. The second substrate (hairpin substrate, Fig. 4B) was designed to study cleavage in the absence of the upstream primer; it has the same 5′ arm and substrate duplex region connected to the short 3′ TTT arm. To obtain similar levels of cleavage, the incubation time and enzyme concentration were varied as shown in Fig. 4.
The cleavage patterns for all enzymes are shown in Fig. 4, and the corresponding cleavage rates are summarized in Table II for both the overlapping and hairpin substrates in MgCl2 and MnCl2. The most striking aspect of these data is the increase in cleavage rate and cleavage accuracy conferred by the presence of the upstream primer duplex. For instance, in MgCl2, the presence of the upstream primer duplex caused the cleavage rate of TaqExo to increase 4 orders of magnitude. An even greater increase is seen for AfuFEN and PfuFEN, because cleavage of the hairpin substrate cannot be detected in MgCl2.
Using the background signal as the upper limit of hairpin cleavage, we estimate that the difference in cleavage rate between the two substrates for these enzymes is at least 60,000-fold. Substitution of MnCl2 for MgCl2 stimulated cleavage of the hairpin by about 1 or 2 orders of magnitude, depending on the enzyme, whereas it decreased the rate of cleavage of the overlapping flap by about 0.5 order of magnitude. Although the stimulatory effect of the upstream duplex is not as great in the presence of MnCl2, it is still significant, averaging 1.5-2 orders of magnitude. The differences between Tables I and II in cleavage rates for the overlapping substrates can be explained by differences in the MgCl2 and KCl concentrations as well as by subtle differences in the structures of the substrates (compare Figs. 2A and 4A).
The products of cleavage of the hairpin substrate differed between enzymes, although the major product of most enzymes is 5 nucleotides long, indicating that cleavage occurs after the first base pair of the substrate duplex. The exception is MthFEN, whose major product was 6 nucleotides long. In addition to significantly increasing the cleavage rate, the presence of the upstream primer duplex also made the cleavage pattern nearly identical for all enzymes. For both the overlapping flap and hairpin substrates, the positions of cleavage were not affected by the choice of divalent cation. The major cleavage product of the overlapping flap substrate is produced by cleavage at the position expected if the upstream primer is fully base paired. The appearance of a small amount of 3- and 4-nucleotide products indicates that branch migration (known as a three-stranded branch migration) occurs between the upstream and downstream duplexes and that the enzymes are able to cleave the substrate inefficiently when it has two or even three of the 3′-terminal nucleotides of the upstream primer unpaired.
pH, Salt, Divalent Ion, and Temperature Effects-The effects of pH, KCl concentration, and divalent cation concentration were investigated for the overlapping flap and hairpin substrates (Fig. 4) for each of the seven enzymes. In the range from 0.2 to 7 mM of MnCl2 and MgCl2 concentrations, the rates of cleavage of the overlapping flap substrate by most enzymes increased 3-10 times, reaching a plateau in the range from 4 to 7 mM MgCl2 and MnCl2. In the presence of MgCl2, the optimal KCl concentration was 200 mM for the polymerase enzymes TaqPol and TthPol, 100 mM for MthFEN and MjaFEN, and 20-50 mM for the others (data not shown). The higher optimal salt concentration for the eubacterial polymerases suggests that they bind more strongly than archaeal enzymes to the overlapping substrate and that at low salt concentration, product release is slow for the polymerase enzymes. Interestingly, in the presence of MnCl2, the optimal KCl concentration for the overlapping substrate increased 2-3 times for all enzymes (data not shown). KCl at concentrations above 10-30 mM was found to inhibit cleavage of the hairpin substrate by all enzymes (data not shown).
The cleavage reaction was highly dependent on the pH of the buffer, varying as much as 2 orders of magnitude in the range from pH 6 to 10. Most enzymes had optimal activity between pH 8 and 9, with the exceptions of MthFEN and TthPol, which reach maximum activity at pH 10 (Table III).
With the overlapping flap substrate in MgCl2, archaeal enzymes had fairly sharp temperature optima between 70 and 85°C, TaqExo and TaqPol enzymes had optimal activity at 65 and 70°C, respectively, and TthPol was most active at 85°C (Table IV). All enzymes were able to cut the overlapping flap substrate at both 40 and 90°C and survived incubation at 90°C for 15 min, except MthFEN, which is not thermostable above 75°C (Table IV and data not shown). All enzymes had lower temperature optima in MnCl2 than MgCl2 except for TaqPol and TaqExo, for which the optima were increased by 5-10°C (data not shown).
The Effect of Substrate Structure on Cleavage Efficiencies-Having established a standard set of reaction conditions, we undertook a comparison of how structural elements of the hairpin and overlapping flap substrates affected cleavage by the seven nucleases. The effect of the 3′ arm length in the hairpin substrate was determined with a series of hairpin substrates like the substrate in Fig. 4B but having 3′ arms of 0, 3, 8, or 15 dT nucleotides. Surprisingly, the 3′ arm length can have as much as a 10-fold effect on the cleavage rate. The archaeal FEN1 enzymes cleaved at the highest rate when the 3′ arm was 8 nucleotides long, whereas the eubacterial enzymes were most active on substrates lacking a 3′ arm (Table V).
To determine the minimal length of upstream duplex required for stimulation of cleavage, we used a set of overlapping flap substrates like the one shown in Fig. 4A, but with progressively shorter upstream duplexes generated by removing base pairs from their loop side. All enzymes were able to cleave at maximal or nearly maximal rates when the primer duplex was at least 5 base pairs long (data not shown). Because primer duplexes shorter than 5 base pairs consisted mainly of A-T base pairs, they were unstable at 50°C, so we synthesized a series of substrates with only G-C base pairs in the upstream duplex (Fig. 5A). To simplify the preparation of substrates, the labeled downstream strand was synthesized as a separate oligonucleotide, whereas the remainder of the substrate containing the upstream primer and template strand was on a separate single oligonucleotide. AfuFEN, MjaFEN, and TthPol cleaved these G-C-rich substrates at rates close to maximal even when the substrate had only a 3-base pair upstream duplex (Fig. 5B). But with the 2-base pair upstream duplex substrate, the rates decreased 6-fold for PfuFEN and 2-fold for AfuFEN, TaqPol, TthPol, and TaqExo; surprisingly, the cleavage rate increased for MjaFEN and MthFEN (Fig. 5B and data not shown). The activity of all enzymes dropped more than 10-fold when the upstream duplex length was reduced to 1 base pair, with the exception of MjaFEN, for which only a 30% decrease was observed. No significant cleavage was observed when the 3′ arm could not form a duplex (Fig. 5, lanes 3, 9, and 15).
The effect of downstream duplex length on cleavage efficiency was determined using substrates with downstream duplex lengths of between 8 and 16 base pairs for the hairpin substrate and 6 and 16 base pairs for the overlapping flap substrate (Fig. 6A), in which the upstream duplex was 6 base pairs.
All enzymes, except TaqPol, cleaved the overlapping flap substrates at rates independent of downstream duplex length in the range from 10 to 16 base pairs; TaqPol cleaved the 10 base pair substrate approximately five times slower than the 12 base pair substrate (Fig. 6B). For most enzymes, cleavage activities decreased for the 8 base pair downstream duplex substrate and significantly dropped when the duplex length was reduced to 6 bp. Only MjaFEN and MthFEN were able to cleave the substrate with the 6-base pair downstream duplex. For the hairpin substrates, cleavage rates decreased nearly linearly with decreasing duplex length and were only 5-10% of maximal (data not shown) with an 8-base pair hairpin for TthPol and the FEN enzymes. TaqPol and TaqExo had essentially no activity on that substrate.
Ligation after Cleavage by the Structure-specific 5′-Exonucleases-In vivo, strand displacement to remove RNA primers or damaged DNA should ultimately generate a structure that can be sealed by ligation. To determine whether the enzymes studied here are capable of generating such nicked duplexes, we determined whether the upstream and downstream primers could be ligated after cleavage in the absence of any polymerase activity. Ligation would also confirm that the deduced site of cleavage is juxtaposed to the 3′ end of the upstream primer. In these experiments, the substrate was generated by annealing 3 oligonucleotides and cleaving the resulting substrate with each of the enzymes followed by the addition of T4 DNA ligase. The oligonucleotide that was to be cleaved was labeled at its 3′ end to permit monitoring of its ligation to the upstream primer oligonucleotide (Fig. 7). The 3′ labeled substrate was cleaved at approximately the same rate as the 5′ labeled substrate for all seven enzymes within experimental error (data not shown), further supporting the conclusion that the 5′ fluorescein has little effect on the cleavage rate. For all seven nucleases, DNA ligase converted essentially all the cleaved 11-nucleotide product to a 24-nucleotide product that comigrated with a synthetic 24-nucleotide oligonucleotide with the same sequence. These data clearly demonstrate the ability of the structure-specific 5′-exonucleases to create a substrate for DNA ligase and strongly indicate that they carry out that same function in vivo.

DISCUSSION
During DNA replication, a large number of RNA primers must be removed, and the Okazaki fragments generated by those RNA primers must be joined by DNA ligase. The literature regarding the question of which enzyme acts immediately prior to DNA ligase is contradictory. The work by Lundquist and Olivera (1) demonstrated the ability of E. coli DNA polymerase I to leave a nick between the upstream and downstream strands after cleavage of displaced single-stranded overhangs generated during nick translation. However, studies with the whole DNA polymerase I or with its isolated 5′ nuclease domain alone gave a different result (13). Using a preformed substrate, the nuclease leaves a gap, leading the authors to speculate that the DNA polymerase must then act to fill in that gap to generate a ligatable nick. A number of other 5′ nucleases have been shown to leave a gap or overlap after cleavage of the same or similar flap substrates (5,8,22,23).
The principal finding of this work is that all the structure-specific 5′-exonucleases leave a nick after cleavage of a substrate that has overlap between the upstream and downstream duplexes (Figs. 2 and 3). When overlap exists, there can be branch migration of the two duplexes resulting in the 3′ end of the upstream primer being alternately paired and unpaired (1,17). Our data show that cleavage occurs on the substrate having the conformation where the last nucleotide at the 3′ end of the upstream strand is unpaired because the cleavage rate is essentially the same whether the end of the upstream primer is A, C, G, or T (Fig. 2). Thus, it is positional overlap between the 3′ end of the upstream primer and downstream duplex rather than sequence overlap that is required for optimal cleavage. The fact that the 3′ end of the upstream primer is unpaired indicates that, as in the case of the hairpin substrates (2), the cleavage occurs between the first two base pairs of the downstream duplex. It also explains why nonoverlapping flap substrates, such as those described (5), generate gaps. When the upstream duplex abuts the downstream duplex with no overlap, cleavage after the first base pair of the downstream duplex (described as proximal in Ref. 5) will create a gap.
Comparing the results from Figs. 2 and 3, it appears that the enzymes are able to tolerate slippage of the two strands of the upstream duplex to generate overlap and thus a more cleavable substrate (Fig. 2C), even when such slippage would cause a distortion in the DNA duplex. Alternative structures of the downstream duplex that create overlap with the upstream primer have been proposed to explain cleavage of nonoverlapping substrates by TaqPol and TaqExo. We believe the ability of the 5′ nucleases to support cleavage of the nonoverlapping flap structure used by Harrington and Lieber (5) may be due to similar rearrangements of the downstream duplex (distal cutting in Ref. 5). The fact that mutations that stabilize the 3′ end of the downstream duplex significantly reduce cleavage for some enzymes supports that idea (8).
Another aspect of cleavage affected by overlap is the cleavage rate. We can directly compare our results for PfuFEN, AfuFEN, and MjaFEN with data from the literature because cleavage rate data for these enzymes are available (8). We observe cleavage rates of ~100 cleavages/minute/enzyme with the overlapping flap substrate for all three enzymes at 50°C (Table I). When the same three enzymes were examined at 55°C with a nonoverlapping flap substrate (8) 0.04 nM/s combined for all products (19 and 21-mer) were observed for MjaFEN, AfuFEN, and PfuFEN, respectively. When converted to the cleavage rates as described under "Experimental Procedures," those numbers are equivalent to 0.032, 0.015, and 0.003 cleavages/minute for MjaFEN, AfuFEN, and PfuFEN, respectively. These rates are more than 3 orders of magnitude slower than the rates we observe for the same enzymes with the overlapping flap substrate. Part of the difference may be explained by differences in methodology and in the particular substrate used. However, a cleavage rate decrease of several orders of magnitude is consistent with what we observe for most enzymes except AfuFEN. The cleavage rate of AfuFEN decreases by only a single order of magnitude when overlap is eliminated for the substrate used in Fig. 3. Further experimentation will be required to determine whether this result is reproducible with different substrates. The stimulatory effect of overlap has been reported for human FEN1 (20), PfuFEN and AfuFEN (17), and TaqPol (31), but quantitative analysis of this effect has not been done. Interestingly, the length of the overlap between the two duplexes seems not to affect the cleavage rate (Refs. 17 and 20 and data not shown), which indicates that branch migration is fast compared with enzyme binding and cleavage.
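The size of the rate difference quoted above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only; the function and variable names are hypothetical, and the inputs are the approximate figure of 100 cleavages/minute/enzyme and the converted literature rates of 0.032, 0.015, and 0.003 cleavages/minute stated in the text.

```python
import math

# Approximate rate with the overlapping flap substrate (Table I, as quoted in the text).
OVERLAPPING_RATE = 100.0  # cleavages/min/enzyme

# Converted literature rates for the nonoverlapping flap substrate (ref. 8),
# in cleavages/min, as quoted in the text.
NONOVERLAPPING_RATES = {"MjaFEN": 0.032, "AfuFEN": 0.015, "PfuFEN": 0.003}


def orders_of_magnitude(fast, slow):
    """Base-10 logarithm of the fold difference between two rates."""
    return math.log10(fast / slow)


differences = {
    enzyme: orders_of_magnitude(OVERLAPPING_RATE, slow)
    for enzyme, slow in NONOVERLAPPING_RATES.items()
}

for enzyme, orders in differences.items():
    fold = OVERLAPPING_RATE / NONOVERLAPPING_RATES[enzyme]
    print(f"{enzyme}: {fold:,.0f}-fold ({orders:.1f} orders of magnitude)")
```

Each of the three differences exceeds 3 orders of magnitude, in line with the statement in the text; because the two data sets were collected at different temperatures and with different substrates, the comparison is approximate.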
Although the upstream primer stimulates and focuses cleavage by the 5′ nucleases, the downstream duplex alone is sufficient to support cleavage by the 5′ nucleases (Fig. 4B). The three-dimensional structures for five members of the 5′ nuclease family have been solved (24-28), and all five show two divalent metal binding sites. It has been proposed for human FEN1 (29) and for T4 RNase H (30) that one metal ion is involved primarily in binding to the DNA substrate. It has also been proposed that the structure-specific 5′ nucleases contain the helix-hairpin-helix motif, which is thought to bind to double-stranded DNA in a nonspecific manner (16). When we examined the three-dimensional structure of Taq polymerase determined by x-ray diffraction (24), we noted that the distance between the metal ion binding site and the helix-hairpin-helix motif (aspartic acid 142 to lysine 197) is about 30 Å or roughly 9 base pairs of B-form helix. A contact length of 12 base pairs between PfuFEN and a DNA substrate was recently proposed based on the length of a prominent groove on the surface of that enzyme (28). It is interesting to note that cleavage by TaqExo drops 10-fold going from a 10- to an 8-base pair downstream duplex length (Fig. 6). We hypothesize that one of the bound divalent cations and the helix-hairpin-helix motif are the primary binding sites for the downstream duplex of DNA substrates on the 5′ nucleases.
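The conversion of the 30 Å distance into base pairs can be made explicit. The following sketch assumes the textbook axial rise of about 3.4 Å per base pair for B-form DNA, a value that is not stated in the text; the function name is hypothetical.

```python
# Assumption: textbook axial rise per base pair of B-form DNA, in angstroms.
B_FORM_RISE_ANGSTROM = 3.4


def base_pairs_spanned(distance_angstrom, rise=B_FORM_RISE_ANGSTROM):
    """Number of base pairs of helix that fit along a given axial distance."""
    if rise <= 0:
        raise ValueError("rise per base pair must be positive")
    return distance_angstrom / rise


# Distance between the metal ion binding site and the helix-hairpin-helix motif.
span = base_pairs_spanned(30.0)
print(f"{span:.1f} base pairs")
```

Under that assumption the distance corresponds to slightly under 9 base pairs, which agrees with the rounded figure given in the text.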
The presence of the upstream duplex is important for both activating and positioning cleavage by the structure-specific 5′-exonucleases (Fig. 2), but only a small portion of that duplex is recognized by the enzymes (Fig. 5). Most enzymes still show a significant amount of cleavage when the upstream duplex is as small as two base pairs, although the stability of this duplex at 50°C is questionable. In particular, recognition of the last nucleotide of the upstream primer strand appears to be important. Removal of a single oxygen atom by substitution of a deoxyribose with a dideoxyribose at the 3′ end of the primer duplex significantly reduces cleavage by archaeal enzymes, and substitution of a 3′ OH with a 3′ phosphate affects all enzymes (Fig. 2 and Table I). Coupled with the fact that the 3′ nucleotide of the upstream primer is unpaired during cleavage, this leads us to propose that there is a region or pocket in the enzymes that specifically recognizes that nucleotide. Co-crystallization or other structure determination studies will be needed to identify this region and to elucidate the molecular basis of enzyme recognition.
Many of the aspects of substrate recognition are identical or very similar between the enzymes examined here. We have observed the cleavage rate increase caused by overlap between the upstream and downstream duplexes of the flap substrate for all enzymes examined and using many different substrate sequences (Ref. 17 and data not shown). Furthermore, the cleavage rate does not depend on the GC or AT content of the substrates and is remarkably constant between different substrates (data not shown). We have also found that all enzymes leave a nick after cleavage when overlap exists. We expect these conclusions to be true for all other 5′ structure-specific exonucleases because the members of the group examined here are evolutionarily distant from one another, spanning two kingdoms. Further studies will be needed to determine whether in fact that is true.