Regions of Endonuclease EcoRII Involved in DNA Target Recognition Identified by Membrane-bound Peptide Repertoires

Target sequence-specific DNA binding regions of the restriction endonuclease EcoRII were identified by screening a membrane-bound EcoRII-derived peptide scan with an EcoRII recognition site (CCWGG) oligonucleotide duplex. Dodecapeptides overlapping by nine amino acids and representing the complete protein were prepared by spot synthesis. Two separate DNA binding regions, amino acids 88–102 and amino acids 256–273, which share the consensus motif KXRXXK, emerged. Screening 570 single substitution analogues obtained by exchanging every residue of both binding sites for all other amino acids demonstrated that replacing basic residues in the consensus motifs significantly reduced DNA binding. EcoRII mutant enzymes generated by substituting alanine or glutamic acid for the consensus lysine residues in DNA binding site I expressed attenuated DNA binding, whereas corresponding substitutions in DNA binding site II caused impaired cleavage, but enzyme secondary structure was unaffected. Furthermore, Glu96, which is part of a potential catalytic motif and also locates to DNA binding site I, was demonstrated to be critical for DNA cleavage and binding. Homology studies of DNA binding site II revealed strong local homology to SsoII (recognition sequence, CCNGG) and patterns of sequence conservation, suggesting the existence of functionally related DNA binding sites in diverse restriction endonucleases with recognition sequences containing terminal C:G or G:C pairs.

Type II restriction endonucleases (ENases) 1 are ideal models for investigating the molecular basis of specificity in the interaction of proteins with their specific DNA recognition sites. More than 2900 type II ENases isolated from different sources have been described, representing more than 200 individual DNA sequence specificities (1). They form one of the most comprehensive groups of functionally similar proteins with distinct DNA binding specificities. As phylogenetically diverse enzymes coded by eubacteria, archaebacteria, and viruses evolved to recognize identical DNA sequences, a variety of structures and mechanisms involved in DNA recognition can be expected.
EcoRII belongs to the IIE type of ENases characterized by their essential interaction with two copies of the recognition site for DNA cleavage. This cooperative mode of action limits their efficiency of DNA restriction and may reflect additional biological functions besides the defense of the host cell against invading foreign DNA (for reviews, see Refs. 2 and 3). In bridging two DNA recognition sites in cis or trans, type IIE ENases resemble proteins involved in DNA replication and recombination, as well as in transcription control, in pro- and eukaryotes (2-6). The dimeric EcoRII ENase associates with two DNA sites (7-9) in a distance-dependent fashion through a DNA bending/looping mechanism (10). Up to now, functional domains of EcoRII have not been identified, and it is not known how DNA recognition is realized at the amino acid level. The crystal structure of an EcoRII-DNA complex remains to be determined.
Restriction ENases are remarkably recognition site-specific enzymes. This is borne out in a high affinity (association constants, K_a ≈ 10^8-10^9 M^-1) to their recognition sequence, a low affinity to sites differing by only 1 base pair, and the precise coupling of DNA recognition and site-specific cleavage. The structural elements that make specific contacts with DNA target sites differ in all ENase-DNA co-crystal structures analyzed so far, whereas their tertiary and quaternary structure and the residues involved in catalysis show a degree of conservation (cf. Refs. 11 and 12 for review). The structural data base of ENases is still too limited to infer rules of protein-DNA recognition and, in particular, to decide whether these may define a protein-DNA recognition code. The theoretical interest and practical impact of such recognition rules make faster methods of identifying DNA binding sites in proteins most desirable.
We used spot synthesis (13) to prepare EcoRII-derived peptide scans bound C-terminally to continuous cellulose membranes. Peptide scans have been applied successfully to the investigation of linear or discontinuous protein-protein (14-16) or protein-metal (17) contact sites. We have now adapted the peptide scan approach to investigate protein-DNA contacts.
The study presented here led to the identification of two EcoRII peptides capable of sequence-specific DNA binding. EcoRII mutants constructed with mutations in these DNA binding sites had altered DNA binding and scission properties. The sequence of one of the two DNA binding sites led to the discovery of a putative family of DNA binding peptides in diverse restriction enzymes.

EXPERIMENTAL PROCEDURES
Determination of the EcoRII-encoding Nucleotide Sequence-At the onset of this study, there were two GenBank™ entries for the complete nucleotide sequence of the ecoRII ENase gene, X16025 (18) and M26404 (19), which differ in six nucleotides and include two frameshifts. Before synthesizing peptide scans, it was necessary to clarify these uncertainties. We sequenced our EcoRII expression plasmid (10) originating from pR209 (19) in both directions. By using the Thermo Sequenase de-AzaG-Kit (Amersham Pharmacia Biotech), it was possible to resolve a track of four guanines, otherwise appearing as three, corresponding to nucleotide positions 676-678 of the sequence M26404. Except for this one additional G and the deletion of a C at nucleotide position 710, our results match sequence M26404. The revised nucleotide sequence was submitted to GenBank™ (accession number AJ224995).
Synthesis of Peptide Scans-All cellulose-bound EcoRII-derived peptides were prepared semiautomatically on a spot synthesizer (Abimed GmbH, Langefeld, Germany) as described previously in detail (20).
Oligonucleotide Design and Labeling-For testing the specific binding of DNA to EcoRII peptides the following synthetic oligonucleotide duplex (30-mer) was used: 5′-CGTAACGAATATCCAGGGTTACGACGTCGA/5′-TCGACGTCGTAACCCTGGATATTCGTTACG (the EcoRII-specific recognition site, CCAGG/CCTGG, is underlined in the original). Competition experiments were performed in the presence of an excess of the above specific duplex or the following unspecific oligonucleotide duplex: 5′-CGATCGACGATCGCGTATTATACGCGATCG/5′-CGATCGCGTATAATACGCGATCGTCGATCG. The oligonucleotides were end-labeled with [γ-32P]ATP (ICN Biomedicals GmbH) using T4 polynucleotide kinase (MBI Fermentas) and purified by gel filtration on ProbeQuant columns (Amersham Pharmacia Biotech).
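The design of the two duplexes can be checked mechanically. The following is a minimal sketch, not part of the original methods: it confirms that each pair of strands is mutually reverse-complementary and locates CCWGG sites (W = A or T). The function and variable names are illustrative assumptions.

```python
import re

# Complement table; W (A or T) is its own complement under IUPAC rules.
COMPLEMENT = str.maketrans("ACGTW", "TGCAW")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq: str, site: str = "CCWGG") -> list[int]:
    """Return 1-based start positions of every occurrence of the site."""
    pattern = site.replace("W", "[AT]")
    # A lookahead is used so that overlapping occurrences are also reported.
    return [m.start() + 1 for m in re.finditer(f"(?={pattern})", seq)]


# Strands as given in the text (both written 5'->3').
SPECIFIC_TOP = "CGTAACGAATATCCAGGGTTACGACGTCGA"
SPECIFIC_BOTTOM = "TCGACGTCGTAACCCTGGATATTCGTTACG"
UNSPECIFIC_TOP = "CGATCGACGATCGCGTATTATACGCGATCG"
UNSPECIFIC_BOTTOM = "CGATCGCGTATAATACGCGATCGTCGATCG"

if __name__ == "__main__":
    print(find_sites(SPECIFIC_TOP), find_sites(UNSPECIFIC_TOP))
```

The specific duplex carries a single site, whereas the unspecific duplex contains no CC dinucleotide at all on either strand.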
DNA-Peptide Binding Experiments-The cellulose-bound peptide scan was rinsed for 5 min in methanol and then preincubated for 1 h in 100 mM maleic acid, 150 mM NaCl, 1% blocking reagent (Boehringer Mannheim), pH 7.5, followed by three washes with EcoRII binding buffer (33 mM TrisOAc, 66 mM KOAc, 10 mM Mg(OAc)2; pH 7.6) for 10 min. Usually the peptide scan was incubated with 25 pmol of 5′-32P-labeled substrate DNA in a total volume of 10 ml of EcoRII binding buffer for 1 h at room temperature. The membrane was washed three times for 5 min with EcoRII binding buffer and air-dried. The bound radioactivity per peptide spot was quantitated by phosphorimaging (PhosphorImager, type SI) and processed with ImageQuant software (Molecular Dynamics GmbH). To compare data from different experiments, a serial dilution of a 14C standard was included with each experiment, and the absolute amount of radioactivity was taken into account. Peptide scans were stripped of bound radioactivity by electroblotting and/or extended washing steps in EcoRII binding buffer supplemented with 2 M NaCl or 1 M K2HPO4, pH 8.0.
Site-specific Mutagenesis-Oligonucleotide-directed mutagenesis was carried out according to Vandeyar et al. (21). The 1294-base pair EcoRI/PstI fragment from expression vector pQE-30 containing the coding region for ENase EcoRII (10) was cloned in M13mp18. Potential active site mutants were obtained by replacement of Glu at positions 96 and 234 and Asp at position 130 by Ala. DNA binding site I double mutants were obtained by substituting both lysines 92 and 97 by either Ala or Glu. Analogously, Ala or Glu substituents were introduced into DNA binding site II for lysines 263 and 268. The mutations were verified twice by sequencing, after mutagenesis and again after recloning into the expression vector pQE-30. The sequence of the whole ecoRIIR gene was determined as described above. The quadruple mutations in the DNA binding sites were constructed by ligating suitable KpnI/EcoRV DNA fragments of the four double mutants. After DNA transformation and propagation in Escherichia coli JM109 (pDK1 R− M+), the cloning sites, as well as the mutations themselves, were confirmed by resequencing.
Protein Expression and Purification and Western Blots-Mutant and wild-type N-terminally His6-tagged EcoRII proteins were purified by affinity chromatography on nickel-nitrilotriacetic acid-agarose columns as described (10). In a further purification step, the most concentrated protein fractions were loaded onto HiTrap® Heparin Sepharose (Amersham Pharmacia Biotech). EcoRII of ≥95% purity eluted at 600 mM NaCl in 20 mM potassium phosphate buffer (pH 7.0) with 1 mM EDTA. Protein-containing fractions were analyzed on 12% polyacrylamide gels, and protein concentrations were determined spectrophotometrically at 280 nm. After electroblotting of polyacrylamide gels for 1 h, the nitrocellulose membrane was rinsed with phosphate-buffered saline (137 mM NaCl, 2.7 mM KCl, 8.1 mM Na2HPO4, 1.5 mM KH2PO4; pH 7.3) and blocked for 1 h in phosphate-buffered saline with 10% neonatal calf serum and 0.2% Tween. The blot was incubated for 1 h with primary antibody (polyclonal rabbit EcoRII antiserum) diluted 1:2000 in blocking solution. After five 10-min washes with phosphate-buffered saline, 0.2% Tween, peroxidase-labeled secondary antibody (donkey-anti-rabbit peroxidase, Pierce) was added 1:2500 for 1 h in blocking solution followed by five washes with phosphate-buffered saline, 0.2% Tween. EcoRII bands were visualized by chemiluminescence using SuperSignal™ ULTRA (Pierce) diluted 1:5 in water.
Enzyme Activity Assay-The plasmid pBR322 (0.1 pmol; six EcoRII sites per molecule) was cleaved to completion with BamHI, extracted with phenol and chloroform, and precipitated with ethanol. The redissolved DNA was incubated with 0.3 pmol of EcoRII wild-type or mutant proteins in 20 μl at 37°C. Cleavage was monitored on 0.9% agarose gels.
The capacity of the enzyme variants to be activated to cleave resistant DNA was tested by incubating 200 ng of phage T3 DNA (3 sites per 40,000-base pair genome) in the presence of an approximately 200-fold molar excess of an EcoRII site-containing oligonucleotide duplex and 1 pmol of the enzyme in 20 μl for 90 min at 37°C. DNA fragments were separated on 0.7% agarose (22).
Assays for Protein-DNA Interaction: Gel Retardation and K_D Determination-A radioactively labeled 191-base pair polymerase chain reaction fragment (0.025 pmol) containing one EcoRII-specific DNA recognition sequence was incubated with increasing protein concentrations (0-5 pmol) in a 20-μl reaction volume containing 33 mM TrisOAc (pH 7.6), 66 mM KOAc, 10 mM CaCl2, 10% glycerol, bovine serum albumin (10 μg/ml) for 20 min at room temperature. Reactions were immediately loaded on 5% polyacrylamide gels, and electrophoresis was run at 4°C. Results were analyzed with a PhosphorImager (Molecular Dynamics GmbH). The ratio of protein-bound to total radioactivity was calculated for each lane, and the apparent K_D was determined as the enzyme concentration at which 50% of the total radioactivity was bound.
Comparison of DNA-protein complexes formed by EcoRII and its mutants, shown in Fig. 5b, was made after incubating 0.025 pmol of polymerase chain reaction fragment with 2 pmol of the respective protein for 20 min at room temperature.
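The apparent K_D definition used above (the enzyme concentration at which 50% of the labeled DNA is bound) can be illustrated with a short sketch. Linear interpolation between the two bracketing titration points is an assumption made here for simplicity; the text does not state how the 50% point was read off, and the numbers in the usage example are invented.

```python
def apparent_kd(concentrations, fractions_bound, target=0.5):
    """Estimate the concentration at which the bound fraction equals `target`.

    Uses linear interpolation between the two titration points that bracket
    the target. Returns None if the target is never reached, i.e. the value
    exceeds the highest tested enzyme concentration.
    """
    points = sorted(zip(concentrations, fractions_bound))
    for (c0, f0), (c1, f1) in zip(points, points[1:]):
        if f0 <= target <= f1 and f1 > f0:
            return c0 + (target - f0) * (c1 - c0) / (f1 - f0)
    return None


if __name__ == "__main__":
    # Hypothetical titration: enzyme concentration (arbitrary units) versus
    # fraction of total radioactivity found in the shifted complex.
    conc = [0, 1, 2, 4]
    bound = [0.0, 0.2, 0.6, 0.9]
    print(apparent_kd(conc, bound))
```

Returning `None` when the 50% point lies outside the titration range mirrors the legend convention of flagging values that exceed the highest tested enzyme concentration.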
CD Spectrometry of Purified EcoRII and Its Mutants-For circular dichroism spectroscopy, the proteins purified to ≥95% were used at concentrations of about 1 mg/ml except for the quadruple mutants K92E/K97E/K263E/K268E (0.26 mg/ml) and K92E/K97E/K263A/K268A (0.15 mg/ml). Three successive runs from 250 to 190 nm were performed in a cuvette with a 0.1-mm light path in a Jobin Yvon Dichrograph III at a bandwidth of 5 nm, a scanning rate of 0.01 nm/s, and a time constant of 1 s.
Phage Titration on E. coli Strains Expressing Wild-type and Mutant EcoRII-Phage grown on a Dcm− E. coli strain was diluted in SM buffer (23), and 0.5 ml was incubated with 0.3 ml of E. coli JM109 carrying plasmids coding for the different EcoRII mutants or the wild-type enzyme for 20 min at 37°C. To improve phage adsorption, the indicator cells were grown in the presence of 0.2% maltose. The virus-host cell mixture plus 3 ml of top agar was then poured on LB plates and incubated at 37°C overnight.
Homology Studies-A BLAST search was conducted of the sequences of the EcoRII DNA binding peptides against the SwissProt data base. Sequences of isoschizomers of EcoRII, SsoII, NgoPII and LlaII were accessed through REBASE (1) and GenBank™. The multiple alignment (Higgins-Sharp) and the homology matching (Lipman-Pearson FASTA) routines of the PC-based sequence analysis package DNAsis 2.5 (Hitachi) were initially used for alignments of the total EcoRII sequence to other ENases in the data base. Taking account of the unusual length of the EcoRII sequence and the locations of the two DNA binding sites, the search was repeated for the left and the right halves of the EcoRII sequence. Sets of ENase sequences were submitted to the multiple alignment program DIALIGN 2.0 (24) at the Web site http://bibiserv.TechFak.Uni-Bielefeld.DE/dialign/.

Sequence-specific DNA Binding Regions Identified in EcoRII Peptide Scans-The entire amino acid sequence presented as a set of 132 covalently cellulose-bound dodecapeptides overlapping by 9 amino acids was screened for its ability to bind EcoRII recognition site-containing oligonucleotide duplexes. The length and overlap of the peptides in the scan were chosen because individual segments of discontinuous DNA binding sites generally do not exceed 9-12 amino acids. Fig. 1a shows that peptides from two regions distant in the primary sequence exhibit high affinity to the oligonucleotide. The influence of Mg2+ ions on DNA-peptide binding is evident by comparing Fig. 1a and Fig. 1b. The quantitative effect of Mg2+ is shown in Fig. 1c. Because Mg2+ ions clearly diminished unspecific DNA binding to several EcoRII peptides, all subsequent DNA binding experiments were carried out in the presence of 10 mM Mg(OAc)2.
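The layout of such a scan (12-residue peptides, 9-residue overlap, hence a 3-residue step) can be sketched as follows. This is an illustration only: the function name is hypothetical, and the handling of the C-terminus (adding one end-anchored peptide when the step does not land exactly on the last window) is an assumption, since the text does not say how the final peptide was chosen.

```python
def peptide_scan(sequence: str, length: int = 12, overlap: int = 9):
    """Return (start, peptide) pairs covering the sequence; start is 1-based."""
    step = length - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than the peptide length")
    last_start = max(len(sequence) - length, 0)
    starts = list(range(0, last_start + 1, step))
    # Assumption: if the regular step misses the C-terminus, add one
    # end-anchored peptide so that the final residues are represented.
    if starts[-1] != last_start:
        starts.append(last_start)
    return [(s + 1, sequence[s:s + length]) for s in starts]


if __name__ == "__main__":
    # Placeholder sequence, not the EcoRII sequence.
    demo = "ABCDEFGHIJKLMNOPQRSTUVWX"
    for start, peptide in peptide_scan(demo):
        print(start, peptide)
```

With a 3-residue step, 132 dodecapeptides span at most 131 × 3 + 12 = 405 residues, which is the scale of scan described here.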
To examine whether the interaction with the DNA substrate was specific to the EcoRII recognition sequence CCWGG, the following series of competition experiments was carried out: (i) binding of the labeled specific oligonucleotide duplex alone, (ii) binding as in i with saturating amounts of an unlabeled, unspecific oligonucleotide, and (iii) binding as in i with saturating amounts of an unlabeled, specific oligonucleotide. The unspecific competitor was designed so as to avoid any homology with the EcoRII recognition sequence at the dinucleotide level. Competing oligonucleotide duplexes were saturating with respect to total peptide in the spots and represented a 10^4-10^5 molar excess of unlabeled over labeled substrate. Data from series of competition experiments were compared quantitatively on the basis of a 14C standard.
Under stringent competitive conditions, only five dodecapeptide spots specifically bound oligonucleotides containing the EcoRII DNA recognition site (Fig. 2). The binding of labeled specific oligonucleotide duplex to the peptides was not reduced by saturating concentrations of unlabeled unspecific oligonucleotides. In contrast, the same concentration of unlabeled specific competitor DNA decreased the binding of labeled specific substrate below 1%. In comparison to an EcoRII-specific substrate, the labeled unspecific oligonucleotide duplex itself bound to these peptides with an efficiency between 1% and, maximally, 10% (data not shown).
Two specific DNA binding regions spanning 15 and 18 amino acids, located between EcoRII amino acid positions 88-102 (binding site I) and 256-273 (binding site II), were thus identified under highly competitive binding conditions. Both potential binding regions share the minimal consensus motif KXRXXK (Fig. 2), which does not occur elsewhere in the EcoRII sequence.
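A search for the KXRXXK motif (X = any residue) is easy to express as a regular expression. The sketch below is illustrative; the helper name and the demo string are assumptions, and the demo string is not an EcoRII fragment.

```python
import re

# K, any residue, R, any two residues, K. The lookahead wrapper lets
# overlapping occurrences be reported as well.
KXRXXK = re.compile(r"(?=(K.R..K))")


def find_motif(sequence: str):
    """Return (1-based start, matched hexapeptide) for every KXRXXK hit."""
    return [(m.start() + 1, m.group(1)) for m in KXRXXK.finditer(sequence)]


if __name__ == "__main__":
    print(find_motif("MAKARLLKGG"))  # hypothetical sequence
```

The spacing of the motif agrees with the residues named later in the text: a lysine at position 1, an arginine at position 3, and a lysine at position 6 correspond to residues 92, 94, and 97 in binding site I and 263, 265, and 268 in binding site II.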
Substitution Analysis Reveals Residues Critical for Peptide-DNA Binding-Substitution analogues of both potential DNA binding sites of EcoRII were synthesized in which every amino acid of the original sequence is replaced by all others. Fig. 3 depicts the layout of the peptide scan, where each binding site is represented by 15 rows and 21 columns. The original (wild-type) sequence occurs twice in each row: in the left-most spot and in the column of the substituent corresponding to the original residue. To evaluate the influence of single substitutions on DNA binding, we calculated the initial binding efficiency (i.e. to the wild-type peptide) as the average over all 30 original peptide spots plus or minus three times the S.D. All binding efficiencies outside this range were considered significant. It is evident that certain basic amino acids had a significant influence on DNA binding to both sites. Considerable effects of amino acid exchanges on DNA binding are seen at the consensus motif lysine and arginine residues at positions 92, 94, 97, and 98, as well as tryptophan 102 in the first binding region and, more pronounced, at positions 263, 265, and 268 in the second binding region (cf. Fig. 2).

Three CD spectra for each EcoRII mutant enzyme were accumulated, and the secondary structure was calculated according to Chen et al. (25). Fig. 4 shows that the CD spectra of the wild-type and the DNA binding site mutants do not reveal obvious differences in secondary structure of the proteins. Because of the lower protein concentrations available for the spectrometric measurements of the quadruple mutants K92E/K97E/K263E/K268E and K92E/K97E/K263A/K268A, their spectra exhibited a lower signal-to-noise ratio than those from the other proteins.
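The significance criterion used in the substitution analysis (mean of the wild-type spots plus or minus three times the S.D.) can be sketched as below. The use of the sample standard deviation is an assumption, as the text does not specify which estimator was applied, and the signal values are invented.

```python
from statistics import mean, stdev


def significance_window(wild_type_signals, k: float = 3.0):
    """Return (low, high) = mean -/+ k * S.D. of the wild-type spot signals."""
    m = mean(wild_type_signals)
    s = stdev(wild_type_signals)  # sample S.D.; an assumption, see lead-in
    return m - k * s, m + k * s


def classify(signal: float, window) -> str:
    """Label a substitution analogue relative to the wild-type window."""
    low, high = window
    if signal < low:
        return "reduced"
    if signal > high:
        return "increased"
    return "unchanged"


if __name__ == "__main__":
    wild_type = [100, 110, 90, 105, 95]  # hypothetical spot intensities
    window = significance_window(wild_type)
    for value in (50, 100, 130):
        print(value, classify(value, window))
```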

Replacement of Critical Lysine Residues Alters DNA Binding
The DNA binding behavior of wild-type EcoRII and its derivatives mutated in one or both binding sites was compared by gel retardation assays in the absence of Mg2+ ions to prevent DNA cleavage (Fig. 5). Constant low DNA concentration and varying protein concentrations, covering at least 2 orders of magnitude, were employed to determine the apparent dissociation constants K_D(app) of EcoRII-DNA complexes. For the calculation of K_D(app) values, we made the simplifying assumption that, given the excess of enzyme over DNA, only one substrate (S) binding site of the enzyme dimer (E) was occupied. The constants shown in Fig. 5a were therefore calculated for the simple equilibrium E + S ⇌ ES.

The replacement of both K residues by A in DNA binding site I (K92A/K97A) led to a 6-fold decrease in DNA binding and the formation of slower migrating complexes in comparison to wild-type EcoRII, whereas the introduction of E in place of K (K92E/K97E) nearly abolished DNA binding. In contrast, both binding site II mutants showed dissociation constants comparable to wild-type. The K_D values of the quadruple mutants substituted in both binding sites exceeded those of the individual binding site mutants from which they were composed. Furthermore, the gel mobility shift assay with the quadruple mutant enzymes reproducibly separated a number of complexes distinct in size and/or conformation from the EcoRII wild-type-DNA complex (Fig. 5b). It was evident from competition experiments with a 1000-fold molar excess of unspecific or specific unlabeled DNA that the residual binding capacity of the EcoRII mutants was still recognition sequence-specific (data not shown).
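For the single-site assumption stated above (enzyme in excess, one substrate site of the dimer occupied), the standard textbook relations are given here as a sketch; they are not necessarily the authors' exact expression.

```latex
E + S \rightleftharpoons ES, \qquad
K_{D(\mathrm{app})} = \frac{[E][S]}{[ES]}, \qquad
\frac{[ES]}{[S] + [ES]} = \frac{[E]}{[E] + K_{D(\mathrm{app})}}
```

With the enzyme in large excess, the free enzyme concentration [E] is approximately the total enzyme concentration, so half of the labeled DNA is bound when the enzyme concentration equals K_D(app). This matches the operational definition given under "Experimental Procedures".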
Impaired Restriction Activity of EcoRII DNA Binding Site Mutants-The catalytic activity of the eight mutant enzymes was tested on linearized pBR322 DNA and compared with the wild-type enzyme (Fig. 6a). Alanine substitutions in DNA binding site I at positions 92 and 97 did not reduce cleavage activity compared with the EcoRII wild-type ENase. However, lysine exchanges to glutamic acid caused a strong decrease in DNA cleavage activity, without a concomitant change in recognition site specificity (Fig. 6a). This was apparently correlated to the extremely weak substrate binding of the mutant enzyme K92E/K97E (cf. Fig. 5), consistent with a repulsion of DNA phosphates by the introduced negative charges. Monitoring the time dependence of pBR322 cleavage confirmed the low but still measurable cleavage rate of K92E/K97E (data not shown). Both mutations in DNA binding site II and all of the quadruple mutations resulted in catalytically inactive enzymes (Fig. 6a). Even after 2 h, no DNA hydrolysis could be detected under standard EcoRII reaction conditions (data not shown).

Table I summarizes the properties of all constructed EcoRII mutants and additionally includes the ability of the enzymes to be activated by site-containing oligonucleotides to cleave resistant DNA sites, as well as their efficiency of restricting phage reproduction in E. coli host cells. EcoRII can be activated in trans to cleave single, resistant recognition sites, e.g. on viral T3 and T7 DNA, by co-incubation with another, susceptible DNA (26) or even by short synthetic oligonucleotide duplexes carrying the canonical recognition sequence (22).
The alanine substitutions in binding site I yielded a mutant with a 6-fold increased K_D(app) value for DNA that behaved like the wild-type in the other investigated parameters. This is not surprising, because the assays were run under optimal conditions. For example, the ratio of enzyme dimers to recognition sites of 1:2 employed in the cleavage reactions was sufficient to counteract the reduced substrate affinity of K92A/K97A. For the glutamic acid mutant in the same binding site, DNA binding was hardly detectable, and catalytic activity was very weak. These in vitro features of K92E/K97E match a significant decrease of phage restriction to only 1 order of magnitude, as opposed to 4 orders of magnitude by the EcoRII wild-type and the K92A/K97A mutant. Both binding site II mutants were catalytically inactive in vitro and could not be transactivated but were still capable of restricting phage by approximately 1-2 orders of magnitude. Conceivably, the physiological environment offers more appropriate conditions for the mutant enzymes, because K263A/K268A can be stimulated to cleave DNA in vitro in the presence of 5-50 mM Mn2+ (see Footnote 2). Restriction in vivo may also be related to tight repressor-like binding of the binding site II mutants to their very frequent recognition sites in the genome. The four enzyme variants with mutations in both binding sites exhibited in vitro characteristics similar to those of the binding site II mutants but with significantly reduced DNA binding, and they no longer restricted phage in the host cell.
2 M. Reuter and P. Mackeldanz, unpublished results.

FIG. 5. Apparent dissociation constant determination for EcoRII-DNA complexes. a, DNA binding affinities of EcoRII wild-type and mutants were evaluated after incubating varying concentrations of the respective enzyme with ≈1 nM 32P-labeled 190-base pair polymerase chain reaction fragment with a single cognate recognition site. After polyacrylamide gel electrophoresis at 4°C, results were quantitated by phosphorimaging. The K_D(app) value was determined by gel shift assays as described by Carey (53). *, arithmetic means of 3-6 independent determinations; #, value exceeds highest tested enzyme concentration. b, gel retardation of the EcoRII-DNA complexes after separation on a 5% polyacrylamide gel (for reaction conditions, see under "Experimental Procedures").

DNA Binding Site I Overlaps a Potential Catalytic Motif-A search for catalytically relevant amino acid residues based on the crystal structures of ENase-DNA complexes (cf. Refs. 11, 27, and 28 for review) detected at least three potential catalytic consensus motifs P(D/E)Xn(D/E)X(K/R), two of which are in the vicinity of binding site I (Fig. 7a). Both acidic amino acid residues in the consensus motif were found to be important for cleavage in the ENases EcoRI and EcoRV (cf. review in Ref. 11). We have constructed mutants E96A, D130A, and E234A in the three potential catalytic motifs of EcoRII. Substituting glutamic acid at position 96 by alanine eliminated DNA cleavage (Fig. 6b) and binding (data not shown), whereas when alanine was introduced into equivalent positions of the two other potential catalytic motifs, PEX17DCK (residues 111-132) and PDX18ELH (residues 214-236), it was tolerated, and DNA cleavage activity was almost unaffected (Fig. 6b). These results clearly implicated Glu96 of the first potential catalytic motif located in DNA binding site I as critical for catalysis.

Sequence Alignments Identify DNA Binding Site II Analogues in Other Restriction Endonucleases-A BLAST search of EcoRII DNA binding sites I and II produced three ENases with similarities to binding site II: SsoII, NgoPII, and LlaII. All available amino acid sequences of their isoschizomers (1) were then screened to yield ScrFI, MthTI, and DpnII (Fig. 7b). Further close relatives of LlaII (MboI and MJ0600) and of NgoPII (FnuDI) are not shown here because the respective sequence elements are nearly identical. Sequences related to DNA binding site II were also found in CglI and CviAII (Fig. 7b). The core sequences of the alignment (Fig. 7b, between vertical lines) were compared for every pair of enzymes, revealing similarities not only among truly homologous enzymes with identical recognition sites, such as LlaII and DpnII, but also, for example, between NgoPII and DpnII, the recognition sites of which share only the outer G:C pairs (Fig. 7c). However, the greatest similarity between EcoRII DNA binding site II and all other analyzed ENase sequences was observed to SsoII (Fig. 7c). It appears especially significant that the homology to SsoII as well as ScrFI is focal to DNA binding site II, as both enzymes are specific to CCNGG, a degenerate version of the EcoRII DNA recognition sequence.
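Pairwise comparisons of this kind, scored as in the Fig. 7c legend (identical residues plus 0.5 times similar residues), can be sketched as follows. The similarity classes below are hypothetical placeholders, since the text does not define which residues count as similar, and the peptides in the usage example are invented.

```python
# Hypothetical similarity classes, NOT taken from the paper.
SIMILAR_GROUPS = ["KRH", "DE", "ST", "NQ", "AVLIM", "FYW"]


def _similar(a: str, b: str) -> bool:
    """True if two different residues fall into the same assumed class."""
    return a != b and any(a in g and b in g for g in SIMILAR_GROUPS)


def pairwise_score(pep1: str, pep2: str) -> dict:
    """Compare two aligned peptides of equal length ('-' marks a gap)."""
    if len(pep1) != len(pep2):
        raise ValueError("aligned peptides must have equal length")
    n = len(pep1)
    identical = sum(a == b and a != "-" for a, b in zip(pep1, pep2))
    similar = sum(_similar(a, b) for a, b in zip(pep1, pep2))
    return {
        "identical_pct": 100 * identical / n,
        "similar_pct": 100 * similar / n,
        "score": identical + 0.5 * similar,
    }


if __name__ == "__main__":
    print(pairwise_score("KARLLK", "KGKLIK"))  # hypothetical peptides
```

For each enzyme, the partner with the highest `score` would then be the one highlighted in a table such as Fig. 7c.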
To see whether the identified DNA binding site II analogues would be juxtaposed by multiple alignment of the whole set of enzymes (Fig. 7b), their total sequences were submitted to DIALIGN 2.0, a program designed to identify local similarities in functionally related proteins, the sequences of which are not necessarily globally related. DIALIGN 2.0 is particularly suitable for the task as it was the only one of five alignment programs to identify all DNA binding sites in a set of 11 diverse helix-turn-helix proteins of vastly different size (24). DIALIGN 2.0 aligned the DNA binding site II analogues in EcoRII, SsoII, ScrFI, NgoPII, MthTI, LlaII, and DpnII as the regions of maximal similarity of these enzymes. The same maximal alignment score for the binding site II analogues of these enzymes, as well as of FnuDI and MboI (see above), resulted when a larger set of sequences was analyzed that additionally included LlaI, NgoFVII, HpaII, and Cfr10I, enzymes selected for resemblances to the enzymes shown in Fig. 7b. NgoFVII is a CglI homolog with a known recognition sequence (GCSGC) but with less similarity to DNA binding site II than CglI itself; HpaII is another type IIE enzyme (1); and LlaI (open reading frame 3) encodes a subunit of a complex restriction endonuclease (29) that, like CviAII, contains the KXRXXK motif.
Surprisingly, three different peptides of the type IIE ENase NaeI were identified by independent alignment routines. NaeI 58-75 (Fig. 7b) emerged from a FASTA homology search (cf. the procedure described under "Experimental Procedures") of the C-terminal half of the EcoRII sequence, NaeI 188-205 emerged from the DIALIGN 2.0 analysis of the enzyme set shown in Fig. 7b, and NaeI 147-164 emerged from the alignment of the enlarged set of enzymes. All three NaeI peptides include a leucine corresponding to Leu274 and a glycine corresponding to Gly267 of EcoRII DNA binding site II, the most conserved residue of the alignment in Fig. 7b. By random mutagenesis, Holtz and Topal (30) located functionally essential residues in all three of these NaeI peptides, including Thr63, Glu70, Gly155, and Gly196, and a recent domain analysis (31) placed the latter two peptides in the C-terminal DNA binding domain of NaeI, whereas peptide 58-75 is in the N-terminal part of the enzyme predicted to be involved in catalysis and dimerization. NaeI 58-75 was included in Fig. 7b because the Thr and Glu residues (boxed) implicated in catalysis (30) belong to the consensus residues of the alignment.
Although the BLAST search of EcoRII DNA binding site I did not pick up any other ENases, a similar peptide in SfiI was located by a Higgins-Sharp alignment (cf. the procedure described under "Experimental Procedures") (Fig. 7b). SfiI is functionally related to EcoRII in requiring, and simultaneously cleaving, two copies of its recognition site; however, it does this not as a dimer but as a tetramer (6). A mutational analysis of SfiI is not available.

DISCUSSION
Deconstruction of the EcoRII-DNA Interaction-We were interested in the structural basis of target recognition by the restriction ENase EcoRII. Because crystallographic information was not available, we explored an alternative approach to the study of DNA-protein affinity, synthetic peptide scans. The procedure that allowed the identification of sequence-specific DNA binding to matrix-bound peptides comprised the following elements: unspecific DNA binding was lowered in the presence of Mg2+ ions (Fig. 1, b and c) and suppressed in the presence of an excess of unlabeled unspecific competitor DNA, designed to differ from the target site-containing DNA at the dinucleotide level. Specific binding was only suppressed by an excess of unlabeled specific competitor DNA (Fig. 2). The peptide scan approach allowed us to delineate two potential DNA binding sites of EcoRII.

TABLE I (footnotes)
b The ability of the enzyme mutants to be activated by the addition of activator DNA (e.g. EcoRII site-containing oligonucleotide duplexes) was determined by digestion of primarily resistant T3 phage DNA (cf. the procedure described under "Experimental Procedures").
c Relative efficiency of plating (e.o.p.) was calculated as the ratio of the phage titres on the nonpermissive EcoRII-coding host and an isogenic restriction-deficient E. coli strain and is the mean value of three independent experiments.
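The relative efficiency of plating described in the Table I footnote (titre on the restricting host divided by titre on the restriction-deficient host, averaged over three experiments) amounts to the following calculation. The titres in the usage example are invented for illustration.

```python
from statistics import mean


def eop(titre_restricting: float, titre_nonrestricting: float) -> float:
    """Efficiency of plating: titre on the restricting (EcoRII-coding) host
    relative to the titre on the isogenic restriction-deficient host."""
    return titre_restricting / titre_nonrestricting


def mean_eop(experiments) -> float:
    """Mean e.o.p. over (restricting, nonrestricting) titre pairs."""
    return mean(eop(r, n) for r, n in experiments)


if __name__ == "__main__":
    # Hypothetical plaque-forming units per ml from three experiments.
    runs = [(1e5, 1e9), (2e5, 1e9), (3e5, 1e9)]
    print(mean_eop(runs))
```

An e.o.p. near 10^-4 corresponds to restriction by about 4 orders of magnitude, the level reported in the text for wild-type EcoRII.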
Type II DNA restriction ENases generally bind palindromic double-stranded DNA recognition sites in a symmetrical fashion (normally as dimers); a number of residues within one or two 5-15-amino acid regions, and shorter peptides or individual amino acids from different parts of the ENase monomer establish direct or water-mediated contacts with bases and phosphate groups of the DNA target site. Can any degree of specificity of such a complex DNA-protein interaction be preserved after breaking up a three-dimensional binding site into short peptide modules? Attempts to model DNA binding specificity of a protein by peptides in solution often fail (cf. Ref. 32). One exception is the dodecapeptide WDGMAAGNAEIER comprising the extended chain region (underlined) of EcoRI. This peptide binds specifically to GAATTC with K_a = 3 × 10^4 M^-1, i.e. about 5000 times weaker than EcoRI itself. Judging by inhibition studies of enzymes with related recognition sequences, the target specificity of this peptide was retained (33).
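As a quick consistency check on these numbers (a worked calculation, not a value quoted from the cited study), the implied association constant of intact EcoRI is:

```latex
K_a(\mathrm{EcoRI}) \approx 5000 \times 3 \times 10^{4}\ \mathrm{M^{-1}}
                    = 1.5 \times 10^{8}\ \mathrm{M^{-1}}
```

This falls within the range of 10^8-10^9 M^-1 given in the introduction for the affinity of restriction ENases to their recognition sequences.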
In general, incorporation of DNA binding peptides into macromolecular frameworks appears to be necessary to overcome the loss of conformational and translational entropy accompanying sequence-specific association of a peptide with DNA (cf. Ref. 32). Thus, incorporation of peptides involved in sequence-specific DNA recognition, e.g. of zinc fingers into a phage display coat protein (34), or of a recognition helix from a helix-loop-helix protein into an antibody Fab domain (35), reproduced the DNA sequence specificities of the original proteins. Stanojevic and Verdine (32) introduced a new experimental approach to the quest they defined as the deconstruction of sequence-specific DNA-protein interactions by tethering DNA binding domains covalently to DNA. The close proximity of the peptide to its target sequence overcomes the entropic and energetic barriers to peptide-DNA recognition.
Johnson et al. (36) extended the concept of induced fit from purely macromolecular to peptide-DNA interactions by demonstrating that the intrinsically flexible DNA binding peptides of basic helix-loop-helix and leucine zipper proteins assume α-helical conformation after binding DNA. Compared with peptides in solution, peptide scans offer two features that potentially facilitate sequence-specific induced fit to oligonucleotides: 1) the terminal covalent attachment of the peptide to the membrane reduces conformational entropy, and 2) the high local
FIG. 7. DNA binding sites in the restriction ENase EcoRII: conserved residues and similar motifs in other ENases. a, location of DNA binding sites and of potential catalytic motifs in the sequence of EcoRII. Amino acid residues marked with a "c" belong to potential catalytic consensus motifs identified by computer search. b, homology studies of EcoRII DNA binding sites I and II. Alignments of the two binding sites (between vertical lines) and nine flanking residues are shown. All EcoRII residues and similar residues of other enzymes are in capital letters. Identical residues are in boldface. Homologies spanning three residues or more are on a gray background. Gaps are indicated by -. Residues known to be essential for catalysis are boxed (cf. details described under "Discussion"). A consensus amino acid is defined at positions where ≥60% of the residues are homologous (* above the predominant resi- where known) at the right. The recognition site of CglI has not been determined. c, pairwise comparison of peptides aligned to DNA binding site II. The percentages of identical (first number) and similar (second number) residues of the peptides aligned in b (excluding the flanking sequences) are shown. For each enzyme, the highest scoring alignment, as judged by the sum of the identical plus 0.5 times the similar residues, is identified by a gray background, and the enzyme number is in superscript.
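The ranking rule named in the legend to Fig. 7c (identical residues plus 0.5 times the similar residues) can be sketched as follows. This is a minimal illustration, not the procedure actually used for the figure: the similarity groups in `SIMILAR_GROUPS`, the decision to skip gap columns, the function names, and the two example peptides are all assumptions.

```python
from typing import Tuple

# Hypothetical similarity groups; the source does not state which were used.
SIMILAR_GROUPS = ["KRH", "DE", "ST", "NQ", "ILVM", "FYW", "AG"]


def is_similar(a: str, b: str) -> bool:
    """True if two different residues fall into the same assumed group."""
    return a != b and any(a in group and b in group for group in SIMILAR_GROUPS)


def alignment_score(p: str, q: str) -> Tuple[int, int, float]:
    """Return (identical, similar, score) for two pre-aligned peptides."""
    if len(p) != len(q):
        raise ValueError("aligned peptides must have equal length")
    identical = similar = 0
    for a, b in zip(p, q):
        if a == "-" or b == "-":
            continue  # assumption: gap columns count as neither identical nor similar
        if a == b:
            identical += 1
        elif is_similar(a, b):
            similar += 1
    # Score as described in the figure legend: identical + 0.5 * similar.
    return identical, similar, identical + 0.5 * similar


# Toy aligned peptides (hypothetical, not taken from Fig. 7b).
print(alignment_score("KARLEK", "KSRIDK"))
```

Weighting similar residues at half the value of identical ones lets conservative exchanges contribute to the ranking without outweighing exact matches.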
The target-recognition domains of (cytosine-C5) DNA MTases have a conserved general structure (47,48). Several attempts have been made at finding common DNA recognition principles in cognate ENases and MTases. Sequence similarities between the region of EcoRII encompassing Arg 265 and Lys 268 (part of KXRXXK in binding site II) and MTases specific to CCWGG or CCNGG, in an RXXK motif of their predicted target recognition domains (49), were already discovered by Kossykh et al. (50). In earlier, related studies, Janulaitis et al. (51) identified conserved motifs shared by other cognate ENases and MTases, whereas Swaminathan et al. (52) found scattered, short fragments of homology between several ENases and MTases sharing recognition sites with central CG.
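Motifs written in this notation, such as KXRXXK and RXXK, can be located in a protein sequence with a simple pattern search. The following sketch assumes that X stands for any residue; the helper `find_motif` and the toy sequence are hypothetical and are not taken from the EcoRII or MTase sequences.

```python
import re
from typing import List, Tuple


def find_motif(sequence: str, motif: str) -> List[Tuple[int, str]]:
    """Return (1-based position, matched segment) for every motif occurrence.

    'X' in the motif is treated as a wildcard for any single residue.
    A lookahead is used so that overlapping occurrences are also reported.
    """
    pattern = "".join("[A-Z]" if c == "X" else re.escape(c) for c in motif)
    return [(m.start() + 1, m.group(1))
            for m in re.finditer(f"(?=({pattern}))", sequence)]


toy_sequence = "MSKARLEKGG"  # hypothetical example sequence
print(find_motif(toy_sequence, "KXRXXK"))
print(find_motif(toy_sequence, "RXXK"))
```

Every KXRXXK occurrence necessarily contains an RXXK occurrence two residues downstream, which is consistent with Arg 265 and Lys 268 of binding site II matching the shorter MTase motif.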
Our homology study (Fig. 7, b and c) identified sequences related to DNA binding site II in several ENases with terminal G:C or C:G base pairs as the only common feature of their diverse recognition sequences (Fig. 7b). These peptides may constitute modules for sequence-specific DNA recognition, specialized to read C:G or G:C in different contexts. Their overall similarity is striking, and it will be interesting to establish how the differences between them relate to their specific DNA targets and whether other families of DNA recognition modules exist. Peptide scans, which have proved their utility for identifying DNA binding regions in proteins specifically interacting with DNA, could provide further clues.