Creating Directed Double-strand Breaks with the Ref Protein

The bacteriophage P1-encoded Ref protein enhances RecA-dependent recombination in vivo by an unknown mechanism. We demonstrate that Ref is a new type of enzyme: a RecA-dependent nuclease. Ref binds to ss- and dsDNA but does not cleave any DNA substrate until RecA protein and ATP are added to form RecA nucleoprotein filaments. Ref cleaves only where RecA protein is bound. RecA functions as a co-nuclease in the Ref/RecA system. Ref nuclease activity can be limited to the targeted strands of short RecA-containing D-loops. The result is a uniquely programmable endonuclease activity, producing targeted double-strand breaks at any chosen DNA sequence in an oligonucleotide-directed fashion. We present evidence indicating that cleavage occurs in the RecA filament groove. The structure of the Ref protein has been determined to 1.4 Å resolution. The core structure, comprising residues 77–186, consists of a central 2-stranded β-hairpin that is sandwiched between several α-helical and extended loop elements. The N-terminal 76 amino acid residues are disordered; this flexible region is required for optimal activity. The overall structure of Ref, including several putative active site histidine residues, defines a new subclass of HNH-family nucleases. We propose that enhancement of recombination by Ref reflects the introduction of directed, recombinogenic double-strand breaks.

The bacterial recombinase, RecA protein, plays a central role in maintenance of genome stability. It generally functions as a RecA nucleoprotein filament formed on DNA (1). RecA has three identified functions in the cell. First, it participates directly in all recombination processes. Filaments of Escherichia coli RecA protein form most readily on ssDNA; these filaments can then catalyze DNA pairing and strand exchange with a homologous duplex DNA (2). A RecA filament complexed with an oligonucleotide can invade duplex DNA and pair the oligonucleotide with a complementary sequence within a much longer duplex, forming a displacement loop (D-loop) (2). Second, RecA filaments play a central role in the induction of the SOS response (3). In brief, RecA filaments assembled on ssDNA generated by stalled replication forks bind and stimulate the autocatalytic cleavage of the LexA repressor. This co-protease function for RecA inactivates LexA, which leads to the induction of SOS. Third, RecA filaments play two roles in activating the translesion DNA synthesis function of DNA polymerase V. If cellular genomic replication is not restored after 30–60 min of SOS response, polymerase V opens a final, mutagenic phase of SOS. The UmuD subunit of polymerase V is autocatalytically cleaved, again facilitated by interaction with RecA filaments, to create UmuD′. The weakly active UmuD′₂·UmuC complex undergoes a final activation step with the transfer of a RecA subunit from the 3′-proximal end of a RecA filament to form the activated polymerase V enzyme consisting of UmuD′₂·UmuC/RecA (4,5). The presence of RecA as a subunit of active polymerase V is the only known activity where RecA exhibits a function when it is not part of a filament.
Many other proteins interact with RecA protein filaments on DNA, and many of these serve to regulate almost every aspect of RecA function (6). The regulators include DinI (7), RecX (8–10), RdgC (11), PsiB (12), RecFOR (13–15), UvrD (16), and RecBCD (17–19), a list that will doubtless grow. In all, more than a dozen known proteins interact with RecA and help coordinate its functions with many aspects of DNA metabolism. A number of RecA partner proteins bind within the helical groove of active RecA nucleoprotein filaments. These include LexA (20,21), UmuD (22), RecX (20), and DinI. The LexA and UmuD proteins (along with the bacteriophage λ repressor (23)) undergo autocatalytic cleavage in this environment. The regulation of RecA is not limited to bacteria-encoded functions. The PsiB protein is a product of conjugative F-plasmids, expressed early in conjugation to suppress the SOS response in the recipient cell (12,24,25). Additional proteins that interact with RecA are encoded by bacteriophages, including the bacteriophage P1.
Almost from the moment it was described in 1951 (26), bacteriophage P1 has been a workhorse of molecular biology. It is now used largely for generalized transduction applied to strain construction (27,28). Its genome (93.6 kbp; ~117 genes (27)) is packaged in phage particles as a terminally redundant linear DNA molecule that is cyclized once it enters the cell. The cyclization can use the phage-encoded Cre-lox site-specific recombination system. Cyclization can also utilize homologous
* This work was supported, in whole or in part, by National Institutes of Health Grants GM32335 (to M. M. C.) and GM068061 (NIGMS; to J. L. K.). This work was also supported by a William R. and Dorothy E. Sullivan Wisconsin Distinguished Graduate Fellowship (to M. C. G.). The atomic coordinates and structure factors (

EXPERIMENTAL PROCEDURES
Proteins-The native E. coli wild type RecA and single-stranded DNA binding proteins were purified as described previously (12). The concentration of the purified RecA and single-stranded DNA binding proteins was determined from the absorbance at 280 nm using the extinction coefficients of 2.23 × 10⁴ M⁻¹ cm⁻¹ (33) and 2.83 × 10⁴ M⁻¹ cm⁻¹, respectively (34). The non-cleavable UmuD1 protein (35) was a generous gift from M. Goodman. LexA S119A (36) was a generous gift from J. Little. Wild type LexA was purified as described (37). DinI was purified as described (7).
The RecA E38K mutant was purified like wild type RecA with the following modifications. The polyethyleneimine pellet containing RecA E38K was washed with R buffer (20 mM Tris-Cl (80% cation; pH 7.5), 0.1 mM EDTA, 1 mM DTT, 10% glycerol), not R plus 150 mM (NH₄)₂SO₄. The protein was then purified by successive chromatographic steps using a butyl-Sepharose column, an SP-Sepharose column, a ceramic hydroxyapatite column, and a Sephacryl S-300 gel filtration column. RecA E38K K72R and RecA K72R were purified as described previously (37). The concentrations of all three RecA mutant proteins were determined from the absorbance at 280 nm using the extinction coefficient 2.23 × 10⁴ M⁻¹ cm⁻¹, and the proteins were stored in R buffer.
DNA Substrates-The circular ssDNA from bacteriophage M13mp18 (7249 nucleotides) was prepared essentially as described (38,39). The linear single-stranded DNA was prepared by annealing an oligonucleotide complementary to the BamHI site followed by a BamHI restriction digest. The oligonucleotide was then removed from the linear ssDNA using a YM-100 Centricon. The concentration of circular and linear ssDNA was determined by absorbance at 260 nm using 36 μg ml⁻¹ A₂₆₀⁻¹ as the conversion factor. The M13mp18 circular dsDNA was prepared as described in Refs. 38–40. The M13mp18 linear dsDNA substrate was prepared by digesting M13mp18 circular dsDNA with PstI. All DNA concentrations are given in total nucleotides. Oligonucleotides were purchased from Integrated DNA Technologies. Sequences of oligonucleotides used in this study are presented in Table 1. For better codon usage in E. coli, the GGG coding for Gly at amino acid (aa) 2 (codon 77) was changed to GGT, the AGA coding for Arg at aa 3 was changed to CGT, the ACA coding for Thr at aa 4 was changed to ACC, the ACG coding for Thr at aa 6 was changed to ACC, and the CGG coding for Arg at aa 10 was changed to CGT. The downstream primer was the same as that used to clone pEAW584. The PCR product was digested with NdeI and BamHI and ligated to pET21A (Novagen) digested with the same enzymes. The resulting plasmid was designated pEAW685. The structure of both mutant ref genes was confirmed by direct sequencing.
Competent cells of E. coli strain BL21(DE3) were transformed with plasmid pEAW584. Ten liters of culture were grown in LB broth to an A₆₀₀ of 0.51. Ref protein expression was induced by the addition of isopropyl 1-thio-β-D-galactopyranoside to 0.4 mM. After a 3-h 10-min outgrowth at 37°C, 24.6 g of cells were harvested by centrifugation, flash-frozen in liquid N₂, and stored at −80°C. The protein expressed is the native, 186-amino acid polypeptide without tags.
All purification steps were carried out at 4°C. Purification entailed polyethyleneimine precipitation and pellet extraction, precipitation with (NH₄)₂SO₄, and chromatography successively using butyl-Sepharose, Source 15 Q, ceramic hydroxyapatite columns, and Sephacryl S-100 gel filtration columns. This was followed by another butyl-Sepharose chromatography step. The protein was concentrated using Amicon Centricon-Plus 20 and dialyzed against Ref storage buffer (R plus 200 mM potassium glutamate), flash-frozen in liquid N₂, and stored at −80°C. The protein was >99% pure and free of detectable nuclease activity when incubated at 37°C for 2 h with different DNA substrates (circular ssDNA, linear and supercoiled dsDNA, and labeled oligonucleotides) in a buffer containing 25 mM Tris-OAc (80% cation, pH 7.6), 1 mM DTT, 3 mM potassium glutamate, 10 mM Mg(OAc)₂, and 5% (w/v) glycerol (buffer A). A more detailed purification protocol is available on request.
The concentration of the Ref protein was determined from the absorbance at 280 nm using the extinction coefficient 2.851 × 10⁴ M⁻¹ cm⁻¹. The Ref extinction coefficient (ε_nat,280nm = 2.851 × 10⁴ ± 0.108 M⁻¹ cm⁻¹) was determined during the course of the present work using procedures described elsewhere (41,42). The identity of the purified protein was confirmed by mass spectrometry. The measured mass of the protein was 21,326 Da, in very good agreement with the calculated mass of Ref protein of 21,329 Da (with the initiator Met residue still present). In the course of these studies, a higher molecular weight band on SDS-PAGE gels was observed that corresponded to the approximate size of a Ref dimer. The identity of this band was confirmed to be Ref by mass spectrometry.
The Ref H153A and ΔN76 mutant proteins were purified with procedures using very similar growth, induction, cell harvesting, and early fractionation steps. Ref H153A was purified with successive chromatographic steps employing butyl-Sepharose, ceramic hydroxyapatite, and Sephacryl S-100 followed by additional ceramic hydroxyapatite and butyl-Sepharose steps. The protein was >99% pure by SDS-PAGE and free of any detectable nuclease contamination. Ref H153A is folded the same as wild type Ref as confirmed by CD spectroscopy (supplemental Fig. S1). The Ref ΔN76 protein was purified with successive butyl-Sepharose, ceramic hydroxyapatite, Sephacryl S-100, and Source 15 Q-Sepharose steps. The protein was >99% pure and free of detectable nuclease activity. The concentration of the Ref ΔN76 protein was determined from the absorbance at 280 nm using a calculated extinction coefficient of 15,220 M⁻¹ cm⁻¹. The identity of the Ref ΔN76 protein was confirmed by excising the protein band from a gel and digesting with trypsin. The products were subjected to MALDI-TOF/TOF mass spectrometry (Applied Biosystems/MDS SCIEX 4800) for identification of peptides. Peptides detected and sequenced (many repeatedly) were all consistent with the predicted sequence of Ref ΔN76 and included 92% of the amino acid residues in the protein.
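The absorbance-based concentration determinations described above follow the Beer-Lambert relation, c = A / (ε · l). The short Python sketch below illustrates that arithmetic using the 280-nm extinction coefficients quoted in the text. The function name, the 1-cm default path length, the dilution argument, and the example reading are illustrative assumptions, not part of the published protocol.

```python
# Molar extinction coefficients at 280 nm (M^-1 cm^-1), as quoted in the text.
EXTINCTION_280 = {
    "RecA": 2.23e4,
    "SSB": 2.83e4,
    "Ref": 2.851e4,
    "Ref dN76": 1.522e4,  # calculated coefficient for the N-terminal deletion
}


def molar_concentration(a280, protein, path_cm=1.0, dilution=1.0):
    """Return the stock concentration in mol/L from an A280 reading.

    Beer-Lambert: c = A / (eps * l), multiplied by the dilution factor
    applied before the reading (hypothetical convenience argument).
    """
    if a280 < 0 or path_cm <= 0 or dilution <= 0:
        raise ValueError("absorbance, path length and dilution must be positive")
    eps = EXTINCTION_280[protein]
    return a280 * dilution / (eps * path_cm)


if __name__ == "__main__":
    # Hypothetical reading: A280 = 0.57 measured on a 10-fold diluted Ref sample.
    conc = molar_concentration(0.57, "Ref", dilution=10.0)
    print(f"Ref stock: {conc * 1e6:.1f} uM")
```

Keeping the coefficients in one table makes it harder to apply the wild type Ref value to the ΔN76 protein, whose coefficient is roughly half as large.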
To generate labeled blunt-ended double-stranded DNA, the labeled 50-mer was annealed to an unlabeled complementary 50-mer, den10 (Table 1). The two labeled DNA substrates were used at 2.5 or 5 μM (in total nucleotides) for the ssDNA or dsDNA oligos, respectively. The DNA binding reactions also contained buffer A. The reactions (10 μl) containing the indicated concentrations of Ref protein (replaced in controls with Ref storage buffer) were incubated at 37°C for 10 min. Then 5 μl of 6× loading buffer (18% (w/v) Ficoll, 20 mM Tris-OAc (80% cation)) was added to 10 μl of the reactions, and the reactions were loaded onto a native 8% polyacrylamide gel and subjected to electrophoresis in TBE buffer (90 mM Tris borate and 10 mM EDTA). The longer M13mp18 circular ssDNA or linear dsDNA substrates were used at a concentration of 10 μM nt in reactions (40 μl) containing otherwise the same components as described above. Various concentrations of Ref were added to the reactions and incubated for 40 min at 37°C. Then 5 μl of 6× loading buffer (18% (w/v) Ficoll, 20 mM Tris-OAc (80% cation)) was added to the reactions, and the entire reactions were loaded onto a native 0.5% agarose gel and subjected to electrophoresis in TE buffer (10 mM Tris acetate (80% cation) and 1 mM EDTA).
Nuclease Assay-The reactions were carried out at 37°C and contained buffer A, an ATP regeneration system (10 units/ml pyruvate kinase and 3.5 mM phosphoenolpyruvate), and 4 μM nt M13mp18 circular ssDNA. The aforementioned components were incubated with 2.4 μM RecA for 10 min. ATP (3 mM) was added followed by a 15-min incubation to allow for RecA filament formation before Ref was added at the concentrations indicated. After 20 min, 20 μl of the reaction was stopped by incubation with 5 μl of 20 mg/ml proteinase K for 60 min at 37°C followed by the addition of 10 μl of a solution containing 9% Ficoll, 0.25% bromphenol blue, 0.25% xylene cyanol, and 4% SDS and another 60-min incubation at 37°C. Samples were subjected to electrophoresis in 0.8% agarose gels with TAE buffer (40 mM Tris acetate, 1 mM EDTA), stained with SYBR-Gold nucleic acid stain (Invitrogen), and exposed to UV light.
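Because DNA concentrations are given in total nucleotides, it can help to check how the RecA concentration compares with the amount needed to coat the ssDNA. The sketch below does this for the nuclease assay values (RecA at 2.4 and ssDNA at 4, both read here as micromolar). It assumes the commonly cited RecA binding stoichiometry of one monomer per three nucleotides, which is not stated in this section; the function name is hypothetical.

```python
# Assumption (not stated in this section): one RecA monomer covers 3 nt of ssDNA.
NT_PER_RECA = 3


def reca_saturation(reca_um, dna_um_nt, site_size=NT_PER_RECA):
    """Return RecA present divided by RecA needed to coat all of the ssDNA.

    A value of 1.0 means exactly enough RecA to cover every binding site;
    values above 1.0 mean RecA is in excess.
    """
    if dna_um_nt <= 0 or site_size <= 0:
        raise ValueError("DNA concentration and site size must be positive")
    needed_um = dna_um_nt / site_size
    return reca_um / needed_um


if __name__ == "__main__":
    ratio = reca_saturation(reca_um=2.4, dna_um_nt=4.0)
    print(f"RecA is present at {ratio:.2f}x the saturating amount")
```

Under the stated assumption, the assay conditions supply RecA in excess of the available binding sites, which is consistent with the intent of forming filaments along the circular ssDNA before Ref is added.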
Nuclease Site-specific Targeting Assay-The reactions were carried out at 37°C and contained buffer A and an ATP regeneration system (10 units/ml pyruvate kinase and 3.5 mM phosphoenolpyruvate). Four μM nt of a 150-nt-long oligonucleotide (rlb1) and 0.67 μM RecA E38K were incubated with the components mentioned above for 10 min followed by the addition of 3 The digest was stopped by the addition of 10 μl of loading dye (9% Ficoll, 0.25% bromphenol blue, 0.25% xylene cyanol, and 4% SDS) followed by another 30-min incubation at 37°C. Samples were subjected to electrophoresis in 0.8% agarose gels with TAE buffer, stained with SYBR-Gold nucleic acid stain (Invitrogen), and exposed to UV light.
Nuclease Site-specific Targeting Assay to Define Cut Sites-Targeting assays were carried out as described above up until the restriction digest, with the following exception. The 150-nt oligonucleotide used (mcg3) is homologous to bases 6070–6219 of M13mp18 DNA and is 54 nt (top strand) and 50 nt (bottom strand) away from the PstI restriction site. Reactions were digested with PstI at 37°C for 3 h. The digest was stopped by phenol chloroform extraction and ethanol precipitation. The resulting pellet was resuspended in T4 DNA ligase buffer. Ligations to fluorescently labeled linkers (0.65 pmol) were carried out using 2 μl of T4 DNA ligase and incubating for 1 h at room temperature. Linker 1 (L1) consisted of den7 annealed to mcg2, and linker 2 (L2) consisted of den4 annealed to mcg1. Den7 and den4 were phosphorylated at the 5′ OH using polynucleotide kinase (Promega) at 37°C for 30 min according to the manufacturer's instructions before annealing. The annealing reactions were carried out as described under electrophoretic mobility shift assays. The linkers were designed to have a sticky end complementary to the end created by PstI and a label on the blunt end. The ligation reactions were stopped by ethanol precipitation of the samples.
The resulting pellets were resuspended in 90% formamide, 10% EDTA, heated at 95°C for 10 min, and quick-cooled in an ice-water slurry for 10 min. The samples were then loaded on a 10% denaturing acrylamide sequencing gel and run at 30 watts (1600 V) for 4.5 h in TBE buffer. The fluorescently labeled DNA was then visualized using a Typhoon 9410 Variable Mode Imager (Amersham Biosciences) blue laser at 488 nm.
Structure Determination-P1 Ref (22 mg/ml in 20 mM Tris-HCl, pH 8.0, 200 mM NaCl) was mixed with mother liquor (0.2 M ammonium nitrate, 20% PEG 3350) in a 1:1 ratio and suspended over 1 ml of mother liquor in hanging-drop vapor diffusion experiments to generate crystals. Crystals were transferred to a cryoprotectant solution (0.15 M ammonium nitrate, 22% PEG 3350, 25% ethylene glycol) and flash-frozen in liquid nitrogen before data collection. The structure of Ref was determined by single-wavelength anomalous dispersion phasing that took advantage of the anomalous scattering of the bound Zn²⁺ ions (Table 2). Diffraction data were collected at a suboptimal wavelength for Zn²⁺ anomalous scattering because data collection preceded the discovery that Ref binds Zn²⁺. However, the single-wavelength anomalous dispersion phases calculated from the dataset were sufficient to produce an excellent experimental electron density map for model building. The data were indexed and scaled using HKL2000 (43). Zinc positions were identified, and an initial structure was built using Phenix (44).
The structure was improved by rounds of manual fitting using Coot (45) and refinement against the native data set using REFMAC5 (46). Coordinate and structure factor files have been deposited at the Protein Data Bank (PDB ID 3PLW).
Assay (Fig. 2). All four of the components in the reconstituted system were required for nuclease activity. In particular, the nuclease activity depended on both Ref and active RecA filaments (Fig. 2). ATP is needed to form active RecA nucleoprotein filaments, whereas Mg²⁺ may be used by both RecA and Ref. (Fig. 3B).
The RecA dependence of the nuclease activity implies an interaction between RecA and Ref. In principle, the RecA-dependent nuclease reaction could occur at a RecA filament end. Alternatively, it could involve DNA strands exposed in the RecA filament groove. We first examined the effect of RecA-mediated ATP hydrolysis. DNA degradation is reduced when ATP is not hydrolyzed by RecA. This is true for wild type RecA protein incubated with the non-hydrolyzable ATP analog, ATPγS (Fig. 4, lane 9), and when ATPase-deficient RecA mutants such as RecA K72R and RecA E38K K72R are used (Fig. 4, lanes 10–11). Inasmuch as ATP hydrolysis is coupled to RecA dissociation at the 5′-proximal filament end, this could indicate that Ref-mediated cleavage occurs at the disassembling end of a filament. However, the RecA E38K mutant protein (also called recA730 (48)), which disassembles much less than the wild type protein (supplemental Fig. S4), promoted the Ref cleavage reaction at least as well and sometimes better (20–30% increase in some assays (data not shown)) than the wild type RecA. The RecA E38K mutant, thus, replaced the wild type RecA in many assays with Ref. We also examined the effect of proteins that bind in the RecA filament groove. Ref-mediated nuclease activity is blocked by a non-cleavable variant of the LexA protein (S119A) that binds well enough to active RecA filaments to inhibit DNA strand exchange (49), but not by the DinI protein, the non-cleavable UmuD1 protein, or the wild type LexA protein (Fig. 5). All of these proteins appear to bind in distinct (and sometimes transient) ways to the major groove of a RecA filament (20). When single-stranded DNA binding protein is incubated with DNA before RecA is added, so as to inhibit RecA binding, Ref does not exhibit nuclease activity (Fig. 4, lane 12).
The X-ray Crystal Structure of Ref Reveals a Novel HNH-family Nuclease-A sequence database search revealed only a small number of uncharacterized Ref homologs. These included six close homologs (>90% identity) encoded by bacteriophages or prophages related to P1 and three additional bacterial genes encoding more distantly related proteins. A sampling of these is presented in Fig. 6A. The bacterial proteins, such as those from Salmonella enterica subsp. enterica serovar Newport strain SL317 (~65% identical) and Bordetella avium, may be parts of cryptic prophages. It is not known if any of the detected homologs are active.
Comparison of this small sampling of Ref sequences revealed the presence of invariant Cys-Xaa-Xaa-Cys, Cys-Xaa-Xaa-His, and His-His motifs. Similar motifs are known to be involved in  (Table 3). Because chelating agents were included during purification of Ref, these Zn²⁺ ions are likely to be stably bound to the protein.
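The invariant motifs noted above (Cys-Xaa-Xaa-Cys, Cys-Xaa-Xaa-His, and His-His) are simple enough to locate with regular expressions over a one-letter protein sequence. The following Python sketch shows one way to do so. The function name and the toy sequence are hypothetical; the toy sequence is not the Ref sequence and is used only to exercise the patterns.

```python
import re

# Lookahead patterns so that overlapping occurrences are all reported.
MOTIFS = {
    "Cys-Xaa-Xaa-Cys": re.compile(r"(?=(C..C))"),
    "Cys-Xaa-Xaa-His": re.compile(r"(?=(C..H))"),
    "His-His": re.compile(r"(?=(HH))"),
}


def find_motifs(seq, first_residue=1):
    """Return {motif name: [(residue number, matched text), ...]}.

    Residue numbers are 1-based by default; pass first_residue to match
    the numbering of a truncated construct.
    """
    seq = seq.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        hits[name] = [
            (m.start() + first_residue, m.group(1)) for m in pattern.finditer(seq)
        ]
    return hits


if __name__ == "__main__":
    toy = "MACIACGGHHKCRWH"  # hypothetical sequence, not P1 Ref
    for motif, found in find_motifs(toy).items():
        print(motif, found)
```

Applied to an alignment of homologs, the same scan run per sequence would show whether each motif is retained at a conserved position.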
A crystallographic approach was taken to better understand the structure and function of Ref. Crystals of Ref that diffracted to 1.4 Å resolution were produced, and the structure was determined by single-wavelength anomalous dispersion phasing that took advantage of anomalous scattering by the bound Zn²⁺ ions (Table 2). Consistent with the solution Zn²⁺ binding studies, the crystallographic asymmetric unit contained a single Ref protein bound to 2 Zn²⁺ ions (Fig. 6, B and C).
Ref folds as a globular protein, with a central two-stranded β-hairpin that is sandwiched between several α-helical and extended loop elements (Fig. 6B). Electron density for the N-terminal 76 residues was absent, indicating that it may be a flexible element. A careful search for difference density did not reveal any additional structural information about the missing N-terminal amino acids. To test whether the N terminus of the protein remained intact after crystallization, we washed and dissolved several crystals and subjected them to SDS acrylamide electrophoresis. The crystallized protein was completely intact (supplemental Fig. S5).
To test the functional importance of the N terminus, we constructed a deletion mutant of Ref that lacked the 76 N-terminal amino acids and purified it. It proved to be devoid of Ref-mediated nuclease activity on ssDNA under normal reaction conditions (Fig. 7, lanes 11–13). Some RecA-dependent ssDNA cleavage was seen at much higher concentrations of the mutant Ref protein (10× and 100×; Fig. 7, lanes 14–21). Ref ΔN76 was completely deficient in DNA binding to a 50-mer oligonucleotide up to a concentration of 2.4 μM using the assay shown in Fig. 1 (data not shown). This indicates that the disordered N-terminal domain plays an essential role in the DNA binding activity of Ref and also enhances the RecA-dependent nuclease activity. We note that these N-terminal 76 amino acid residues represent a very highly charged part of the protein. Of the 76, 25 are amino acids with a positive charge (Arg or Lys (Fig. 6A)), and another 9 are negatively charged (Glu or Asp). Interestingly, the apparent homolog from B. avium has lost 75 of these 76 amino acid residues (Fig. 6A).
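The charge composition quoted above amounts to a net charge of +16 (25 positive minus 9 negative) over 76 residues, with about 45% of the residues charged (34 of 76). A tally of this kind can be reproduced for any sequence with the short Python sketch below. The function name and the toy sequence are hypothetical, and, following the counting convention used in the text, only Arg/Lys and Asp/Glu are scored (His is not counted).

```python
POSITIVE = set("KR")  # Lys, Arg
NEGATIVE = set("DE")  # Asp, Glu


def charge_summary(seq):
    """Count formally charged side chains in a one-letter protein sequence."""
    seq = seq.upper()
    positive = sum(residue in POSITIVE for residue in seq)
    negative = sum(residue in NEGATIVE for residue in seq)
    length = len(seq)
    return {
        "length": length,
        "positive": positive,
        "negative": negative,
        "net": positive - negative,
        "charged_fraction": (positive + negative) / length if length else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical peptide, used only to show the output format.
    print(charge_summary("KKRDEA"))
    # Arithmetic for the counts reported in the text (25 positive, 9 negative, 76 total).
    print("net:", 25 - 9, "charged fraction:", round((25 + 9) / 76, 3))
```

A strongly positive net charge of this size is in keeping with the proposal that the N-terminal region contributes to DNA binding.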
The central β-hairpin element of Ref is threaded through the core of the protein and presents ligands that define the Zn²⁺-binding sites. The first binding site is composed of three His residues that are presented by the first β-strand of the hairpin, an adjacent α-helix, and a loop that is C-terminal to the hairpin (Fig. 6B). Interestingly, this Zn²⁺ is also liganded by a SO₄²⁻ ion, indicating that the Zn²⁺ bound at this site is solvent-exposed. The second site is composed of three Cys and one His residues from helical and loop elements in the structure. This second Zn²⁺ ion is entirely buried within the protein core.
Comparison of the Ref structure to other proteins in the Protein Data Bank using automated structure-comparison software failed to reveal any similar structures. However, we noticed that a motif in Ref bore similarity to an element in the HNH family of nucleases. HNH family members are defined by the presence of ββα-metal core elements in which residues from the β-hairpin/α-helical structure form a metal-binding site. Outside of this core, HNH enzymes are structurally diverse. The HNH family includes a number of bacteriophage-encoded homing endonucleases among others (50). Overlaying the β-hairpin core of Ref and the colicin E9 HNH DNase (51) revealed a striking similarity in the ββα-metal core between the two proteins (Fig. 8A). In both cases, the three Zn²⁺ binding His residues superimpose nearly identically, and a fourth Zn²⁺ ligand (SO₄²⁻ or PO₄³⁻) is also nearly identically positioned. Outside of the ββα-metal core, the Ref and the E9 DNase folds lack tertiary or topological similarity. Even within the ββα-metal core there are differences. For E9 DNase, the second and third Zn²⁺ binding His residues are both presented from the α-helix of the ββα fold. In contrast, for Ref the first of these His residues is from the helix, but the second (His-153) is presented   with Ala was constructed, purified, and found to lack all nuclease activity (Fig. 2). The inactivity of this Ref H153A mutant is not due to a loss of the Zn²⁺ ion in the active site (Table 3), suggesting that the His-153 residue itself is important for catalysis. In addition, Ref H153A appears to be properly folded, as determined by circular dichroism (supplemental Fig. S1). Interestingly, elements that are important for DNA binding in E9 (52) are absent in the core Ref structure, further suggesting that the DNA binding activity is embedded in the N-terminal amino acids.
HNH-family nucleases are defined by the ββα-metal core fold in which a central His or Asp residue is immediately flanked by an N-terminal Asp or His residue and at some distance by a C-terminal His, Asp, or Glu residue (53). For P1 Ref, this sequence is -His-His-(9 residues)-His- for residues 112–123. HNH-family members generally also have conserved Cys-Xaa-Xaa-Cys sequence motifs N- and C-terminal to this central sequence cluster that are used for binding metals; a His residue can substitute for one of the Cys residues in these motifs. P1 Ref also has such motifs (Cys-Ile-Ala-Cys (residues 96–99) and Cys-Arg-Trp-His (residues 130–133)). We note that the apparent homolog encoded by B. avium (Fig. 6A) (2,54). As shown in Fig. 9, Ref will cleave both strands of a duplex DNA at the site of a D-loop formed by a RecA-coated oligonucleotide complementary to that site. In this experiment, the RecA E38K mutant protein replaced the wild type RecA protein as it consistently produced a 20–30% enhancement of activity relative to the wild type RecA protein (data not shown). Lanes 9–12 of Fig. 9, employing four oligonucleotides targeted to three different sites in the duplex DNA circle, make two points. The best cleavage is seen with the two 150-mers. A 100-mer targeted to the same site as one of the 150-mers is nearly as effective. Cleavage efficiency drops off substantially for the 60-mer, corresponding to the decreased stability of RecA filaments on the shorter oligonucleotide (55). Like the ssDNA nuclease activity, targeted dsDNA nuclease activity depends on RecA, ATP, Mg²⁺, and the presence of a homologous oligonucleotide (Fig. 9). The Ref H153A variant was again inactive.
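Since targeting is directed by an oligonucleotide homologous to the chosen site, selecting a target reduces to extracting a window of the duplex sequence. The Python sketch below shows the coordinate handling for a circular molecule, using 1-based inclusive positions of the kind quoted for the mcg3 oligonucleotide (bases 6070–6219, which gives 150 nt). The function names are hypothetical, the placeholder sequence is not M13mp18, and the length check only restates the qualitative trend reported above (150-mers and a 100-mer cleaved well, a 60-mer much less so); behavior between 60 and 100 nt was not tested and is not implied.

```python
def targeting_oligo(circular_seq, start, end):
    """Return bases start..end (1-based, inclusive) of a circular sequence.

    If end < start the window is taken across the origin of the circle.
    """
    n = len(circular_seq)
    if not (1 <= start <= n and 1 <= end <= n):
        raise ValueError("coordinates must lie within the sequence")
    if start <= end:
        return circular_seq[start - 1:end]
    return circular_seq[start - 1:] + circular_seq[:end]


def in_tested_efficient_range(oligo):
    """True if the length falls in the 100-150 nt range that cleaved well."""
    return 100 <= len(oligo) <= 150


if __name__ == "__main__":
    placeholder = "ACGT" * 2000  # hypothetical 8000-nt circle, not M13mp18
    oligo = targeting_oligo(placeholder, 6070, 6219)
    print(len(oligo), in_tested_efficient_range(oligo))
```

Which strand of the window is synthesized, and the orientation of its 3′ end relative to the desired cut positions, are design decisions that this sketch does not address.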
We determined sites of DNA cleavage within the region targeted by one of the 150-mer oligonucleotides, cutting the reaction product at a nearby restriction site (PstI) and ligating a labeled linker to the resulting fragment. A gel allowing dideoxy sequencing-like resolution reveals that Ref-mediated cuts are found at multiple locations, all relatively close to the 3′ end of the invading oligonucleotide (Fig. 10). We note that the method employed would eliminate signals from distal cleavage sites for any DNA that was cleaved twice on the same strand of the D-loop and, thus, may bias the results toward detection of 3′ end-proximal events. However, the results clearly indicate that cleavage can occur at multiple locations within the D-loop. Preferred cleavage sites were evident, although the mechanistic basis of the pattern is not yet understood. The patterns also indicate that Ref-mediated introduction of double-strand breaks does not necessarily generate blunt ends.

DISCUSSION
There are four major conclusions to this work. First, the bacteriophage P1-encoded Ref protein is an endonuclease, with the novel property that nuclease function is entirely dependent on the presence of active RecA nucleoprotein filaments. Second, Ref is not simply activated by RecA. Where cleavage locations can be correlated with RecA locations (Fig. 9)  Multiple experiments indicate that the observed nuclease activity is due to Ref and not to a contaminating nuclease activity. All protein preparations used in this study were carefully tested and found free of exo- or endonuclease activity on circular or linear single- or double-stranded DNA under standard reaction conditions. The observed nuclease activity co-elutes with the Ref protein on a size exclusion column. Finally, the structure of the Ref protein identifies it as a novel HNH endonuclease.
These results further broaden the already long list of functions for RecA filaments to include co-nuclease. The Ref protein cleaves only where RecA protein is bound to DNA. The reaction is enhanced if the RecA filament actively hydrolyzes ATP. In principle, Ref could cleave at a filament end or in the RecA filament groove. The enhancement by ATP hydrolysis might suggest a link to end-dependent RecA filament dissociation. However, we currently favor a mechanism in which cleavage occurs in the filament groove for two major reasons. First, the Ref-mediated cleavage reaction is also enhanced when the RecA E38K mutant protein replaces the wild type RecA. The filaments formed by the E38K mutant protein are much less dynamic than those of the wild type protein, exhibiting no measurable dissociation in a standard challenge assay when they are bound to ssDNA. Second, the Ref-mediated cleavage reaction is completely inhibited by the LexA S119A mutant protein, which binds stably in the RecA filament groove. RecA-mediated ATP hydrolysis is not limited to the ends of RecA filaments (56) but occurs throughout the filament. Conformation changes associated with the ATP hydrolytic cycle may play some role in the Ref nuclease reaction. A structure of RecA bound to DNA in the presence of ATP has appeared (57). However, the core domain of RecA is closely related to helicases in which substantial conformational changes are associated with ATP hydrolysis (58–61). For RecA, little is known about the structural changes associated with ATP hydrolysis or the status of bound DNA strands at different stages of that cycle.
Cleavage of RecA-created D-loops has previously been reported for the S1 and Bal-31 nucleases (62), both normally specific for single-stranded DNA. The unwound DNA at the ends of D-loops offers a target for such nucleases. The effects of the Ref nuclease differ from this earlier report in several important ways. First, neither S1 nor Bal-31 has any RecA-dependent phenotype in vivo. Second, both S1 and Bal-31 will cleave single-stranded DNA wherever it occurs, including at nicks and small gaps; Ref will cleave only where RecA protein is bound. It is likely that Ref evolved to work with RecA filaments. There is no such indication for the S1 and Bal-31 nucleases. Finally, the more proficient of the two nucleases, S1, requires a non-physiological set of reaction conditions to carry out its reaction efficiently.
The efficiency of oligonucleotide-directed Ref cleavage of DNA appears to be strongly dependent on the efficiency of RecA-mediated D-loop formation. This efficiency in turn can be modulated by the length of the DNA oligonucleotide, RecA mutant proteins with enhanced DNA binding and/or DNA pairing properties (the RecA E38K mutant protein is the most effective variant we have tested to date) and likely other factors that remain to be elucidated. Efficient targeted cleavage by the Ref/RecA system may eventually provide a convenient and inexpensive method to introduce targeted double-strand breaks in a range of biotechnology applications.