Crystal structures of type II restriction endonuclease EcoO109I and its complex with cognate DNA.

EcoO109I is a type II restriction endonuclease that recognizes the DNA sequence of RGGNCCY. Here we describe the crystal structures of EcoO109I and its complex with DNA. A comparison of the two structures shows that the catalytic domain moves drastically to capture the DNA. One metal ion and two water molecules are observed near the active site of the DNA complex. The metal ion is a Lewis acid that stabilizes the pentavalent phosphorus atom in the transition state. One water molecule, activated by Lys-126, attacks the phosphorus atom in an S(N)2 mechanism, whereas the other water interacts with the 3'-leaving oxygen to donate a proton to the oxygen. EcoO109I is similar to EcoRI family enzymes in terms of its DNA cleavage pattern and folding topology of the common motif in the catalytic domain, but it differs in the manner of DNA recognition. Our findings propose a novel classification of the type II restriction endonucleases and lead to the suggestion that EcoO109I represents a new subclass of the EcoRI family.

Bacteria have evolved a mechanism to protect themselves from viral infection. Restriction endonucleases (REases) provide anti-viral protection for bacteria by degrading the foreign DNA of invading bacteriophages. The enzymes recognize specific nucleotide sequences and cleave both strands of DNA. To date, more than 3,500 REases have been characterized and classified into four types, I, II, III, and IV (1). Of these, type II REases are widely used in genetic technology and are the best studied.
Type II REases generally recognize 4-8 base pairs of double-stranded DNA and hydrolyze phosphodiester bonds within the recognition sequences. Their amino acid sequences are not homologous, with the exception of the active site sequence. The active sites of the type II enzymes have a signature sequence PD(X)nDXK, in which four residues (Pro, Asp, Asp, and Lys; n = 1 (PvuII) to ~49 (Bse634I) (2)) are weakly conserved and two separated acidic residues are usually followed by a basic residue. The active site structures thus share a common structural motif consisting of a five-stranded β-sheet flanked by α-helices (3-5). Specific divalent metal cations such as Mg2+ are required for enzymatic activity and are coordinated to the conserved acidic residues during the catalytic reaction. However, details of the catalytic mechanism are not fully understood (2).
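The signature described above can be illustrated with a short pattern search. The following is a minimal sketch, assuming the strict form P-D-(X)n-D-X-K with n between 1 and 49; the function name and the toy sequence are hypothetical, and because the residues are only weakly conserved, real enzymes (including EcoO109I) need not match this strict pattern.

```python
import re

# Strict form of the signature: P, D, a spacer of 1-49 residues, D, any
# residue, K. The spacer bounds follow the range quoted in the text.
# A lookahead is used so that overlapping candidates are all reported;
# the lazy quantifier returns the shortest spacer for each start position.
MOTIF = re.compile(r"(?=(PD.{1,49}?D.K))")


def find_signature(seq):
    """Return (start, matched_text) for each strict match in seq."""
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(seq.upper())]


# Hypothetical toy sequence, not taken from any real enzyme.
toy = "MKTAPDGSLLDVKAAR"
print(find_signature(toy))
```

A strict regular expression of this kind is useful only as a first filter; a relaxed pattern or a structure-based alignment would be needed to locate weakly conserved variants.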
Type II REases have been classified into two families, EcoRI and EcoRV, based on their DNA cleavage pattern. The EcoRI family produces 5'-overhang DNA, whereas the EcoRV family produces blunt-end DNA. From a structural viewpoint, the EcoRI family approaches DNA from the major groove side, whereas the EcoRV family approaches from the minor groove side. Moreover, the folding topology of the five-stranded β-sheet in the common structural motif in the active site differs between the two families (7): the topology of four β-strands is absolutely conserved in the common motif, but the fifth β-strand is oppositely oriented (8).
To date, 15 three-dimensional structures have been determined for type II REases by x-ray crystallography (REBASE; rebase.neb.com/cgi-bin/crylist). Most of these enzymes recognize a continuous 6-bp-long palindromic sequence, and no three-dimensional structure is available for an enzyme that recognizes a discontinuous 7-bp-long palindromic sequence. Among the type II REases that recognize degenerate base pairs, BsoBI (C↓YCGRG, where ↓ indicates the cleavage position) (9), Bse634I (R↓CCGGY) (10), Cfr10I (R↓CCGGY) (11), and HincII (GTY↓RAC) (12) have been subjected to x-ray crystallographic analysis, and the three-dimensional structures of BsoBI and HincII have been determined as complexes with their respective cognate DNAs. For the type II enzymes that recognize discontinuous sequences, only the three-dimensional structure of BglI (GCCNNNN↓NGGC) (13) has been determined as a complex with its cognate DNA. However, so far, no three-dimensional structure of a REase that recognizes a sequence that is both degenerate and discontinuous has been determined.
EcoO109I is a type II REase isolated from Escherichia coli H709c. It recognizes double-stranded DNA with a seven-base pair motif that is both degenerate and discontinuous, RG↓GNCCY (where R = A or G and Y = T or C), and cleaves the phosphodiester bond between the second and the third bases to produce 5'-overhang DNA. Here we describe the crystal structures of both DNA-free EcoO109I and DNA-bound EcoO109I and show the structural basis for the catalytic mechanism at the atomic level. Furthermore, we suggest that EcoO109I represents a new subclass of the EcoRI family.
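The recognition rule stated above (RGGNCCY, with cleavage between the second and third bases) can be made concrete with a small sequence search. This is an illustrative sketch only: the function names and the test sequence are hypothetical, and the overhang length is derived by assuming that the cut on the complementary strand is symmetric, as expected for a palindromic site.

```python
import re

# IUPAC degenerate codes used in the recognition sequence RGGNCCY.
IUPAC = {"R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
         "A": "A", "C": "C", "G": "G", "T": "T"}

SITE = "RGGNCCY"
CUT_OFFSET = 2  # cleavage between the second and third bases


def find_sites(dna, site=SITE):
    """Return 0-based start positions of every match, overlaps included."""
    pattern = "".join(IUPAC[b] for b in site)
    return [m.start() for m in re.finditer(f"(?=({pattern}))", dna.upper())]


def top_strand_fragments(dna, site=SITE, cut=CUT_OFFSET):
    """Split the top strand at the cut position of each matched site."""
    cuts = [s + cut for s in find_sites(dna, site)]
    bounds = [0] + cuts + [len(dna)]
    return [dna[a:b] for a, b in zip(bounds, bounds[1:])]


def overhang_length(site=SITE, cut=CUT_OFFSET):
    """Single-stranded overhang length, assuming symmetric cuts on both strands."""
    return len(site) - 2 * cut


dna = "TTAGGACCTAA"  # hypothetical test sequence containing AGGACCT
print(find_sites(dna))
print(top_strand_fragments(dna))
print(overhang_length())
```

Under the symmetric-cut assumption, the sketch gives a three-nucleotide 5' overhang (GNC), which is consistent with the 5'-overhang product described in the text.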

MATERIALS AND METHODS
Site-directed Mutagenesis, Expression, Purification, and Enzymatic Assay-EcoO109I was overexpressed and purified as described previously (14). Site-directed mutagenesis was performed by using a QuikChange kit (Stratagene). Mutant enzymes were overexpressed and purified in the same way as the wild type. The activities of the wild-type and mutant enzymes were assayed with T4GT7 DNA (Nippon Gene Co. Ltd.) as follows. T4GT7 DNA (0.31 μg) was digested with 6 ng of each enzyme in 20 μl of reaction solution containing 10 mM Tris-HCl, pH 7.5, 10 mM MgCl2, and 1 mM dithiothreitol at 37°C for 1 h, and the digestion patterns were checked by agarose gel electrophoresis.
Preparation of Heavy Atom Derivatives and Data Collection-Crystallization and native data collection for DNA-free EcoO109I and DNA-bound EcoO109I crystals have been reported previously (14). The ethylmercurithiosalicylate (EMTS) derivative of the DNA-free crystal was prepared by soaking the native crystals for 4 days in a reservoir solution containing 10 mM EMTS. The crystal was then washed in the reservoir solution, exchanged into the reservoir solution containing 10% glycerol, and frozen directly in liquid N2. The multiple wavelength anomalous dispersion data set of the EMTS derivative was collected at 100 K on SPring-8 BL45PX with a Rigaku MSC Jupiter 210 CCD detector. The diffraction data were processed with the program HKL2000 (15). The EMTS derivative of the DNA-bound EcoO109I crystal was prepared by soaking the native DNA complex crystal for 2 days in a reservoir solution containing 10 mM EMTS. The crystal was then washed in the reservoir solution, exchanged into the reservoir solution containing 20% ethylene glycol, and flash-frozen in a N2 gas stream at 100 K. Diffraction data were collected at 100 K on an R-AXIS IV++ mounted on a Rigaku FR-D rotating anode x-ray generator. The data were processed with the program CrystalClear (16). Data collection statistics are given in Table I.
Structure Determination and Refinement-The structures of DNA-free and DNA-bound EcoO109I were solved by multiple wavelength anomalous dispersion and SIRAS, respectively, with the programs SOLVE and RESOLVE (17). Model building and fitting were performed with the program O (18). Structure refinement was performed with the programs CNS (19) and REFMAC (20). In DNA-bound EcoO109I, the DNA binds in two orientations because the DNA used for crystallization was not palindromic, and the two orientations cannot be distinguished. Therefore, the DNA structures in the two orientations were superimposed and each was refined with an occupancy of 0.5. The final refinement statistics for the two structures are summarized in Table II. Final coordinates and structure factors of DNA-free EcoO109I and the DNA complex were deposited in the Protein Data Bank. The Protein Data Bank identification codes of the DNA-free and complex structures are 1WTD and 1WTE, respectively.
X-ray Solution Scattering Measurements and Ab initio Low Resolution Structure Analysis of DNA-free EcoO109I-To eliminate the effect of interparticle interference on the scattering profile, the measurements were repeated at four protein concentrations of DNA-free EcoO109I (10, 7.5, 5.0, and 2.5 mg/ml) prepared in the same buffer used for crystallization, and these data points were extrapolated to zero protein concentration. Details of the data collection method and the design of the equipment have been reported elsewhere (21). Ab initio low resolution modeling of DNA-free EcoO109I was carried out using the program DAMMIN (22), in which the experimental I(q) data were fitted in a q-range from 0.012 to 0.25 Å⁻¹ (q = 4π sin θ/λ, where 2θ is the scattering angle and λ is the x-ray wavelength) with those calculated from the dummy atom model by a simulated annealing minimization procedure. Repetitive ab initio runs yielded superimposable dummy atom models that fitted the experimental I(q) data closely.
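The two numerical steps in this paragraph, the definition of q and the extrapolation to zero concentration, can be sketched as follows. The text does not state how the extrapolation was performed; the linear least-squares fit of concentration-normalized intensity used here is an assumption, as are the function names and the intensity values, which are invented for illustration.

```python
import math


def momentum_transfer(two_theta_deg, wavelength_angstrom):
    """q = 4*pi*sin(theta)/lambda, where 2*theta is the scattering angle."""
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength_angstrom


def extrapolate_to_zero(concs, values):
    """Fit a least-squares line through (c, value) and return its intercept at c = 0."""
    n = len(concs)
    mean_c = sum(concs) / n
    mean_v = sum(values) / n
    sxx = sum((c - mean_c) ** 2 for c in concs)
    sxy = sum((c - mean_c) * (v - mean_v) for c, v in zip(concs, values))
    slope = sxy / sxx
    return mean_v - slope * mean_c


concs = [10.0, 7.5, 5.0, 2.5]        # mg/ml, the four concentrations in the text
i_over_c = [0.80, 0.85, 0.90, 0.95]  # hypothetical I(q)/c at a single q value
print(extrapolate_to_zero(concs, i_over_c))
```

In practice the extrapolation would be repeated at every q value to build the interference-free scattering curve that is passed to the ab initio modeling step.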
Figs. 1 and 4 were prepared with the programs MOLSCRIPT (23) and RASTER3D (24). Figs. 2 and 6a were prepared with the program PYMOL (pymol.sourceforge.net). The helical parameters of the DNA structure were calculated with the program CURVES (25).

RESULTS AND DISCUSSION
Overall Structure-EcoO109I forms a homodimer (molecules A and B) in the crystallographic asymmetric unit of both its DNA-free and DNA-bound forms (Fig. 1, a and b). Ab initio low resolution structure analysis of x-ray solution scattering data has shown that DNA-free EcoO109I in solution has a shape and a size similar to those of the dimer formed in the crystal (Fig. 2), indicating that the dimer formed in the crystal is the functional unit of the enzyme in solution. The monomer of EcoO109I consists of two domains, the dimerization and the catalytic domains (Fig. 1, c and d). Fig. 3a shows the amino acid sequence of EcoO109I together with an assignment of the secondary structures in the DNA-free and DNA-bound enzymes.
The dimerization domain consists of only α-helices (residues 1-86 and 210-272), which interact with those of the other molecule to form the functional dimer. Buried accessible surface areas of the dimer, calculated with the program MSMS (26), are 5,608 and 7,139 Å² for DNA-free and DNA-bound enzymes, respectively, showing that the dimerization domain of the DNA-bound enzyme interacts more tightly than that of the DNA-free enzyme. The catalytic domain (residues 87-219), in which the active site is located, folds into an α/β-structure. The catalytic domain of the DNA-free enzyme adopts a flexible conformation with relatively high average temperature factors (74 Å²) as compared with the dimerization domain (46 Å²). A comparison of the structures of the DNA-free and DNA-bound enzymes shows that a large conformational change occurs in the catalytic domain. Indeed, α6 and a loop region between α7 and β4 move into the major groove by 10 Å to wrap around the DNA (Fig. 1, c and d). The regions from residues 93 to 100 in molecule A and from residues 178 to 182 in molecule B are disordered in the DNA-free enzyme, whereas they are well ordered in the DNA-bound enzyme. The average temperature factor of the catalytic domain (22 Å²) is now lower than that of the dimerization domain (26 Å²) in the DNA-bound enzyme. Furthermore, binding of DNA induces new helical structures of α5 and two 3(10) helices (Fig. 3a).
DNA Duplex Is Distorted by a π-π Interaction with EcoO109I-The DNA sequence used in this study is not palindromic (Fig. 3b); thus, DNA is bound to EcoO109I in two opposite directions in the crystallographic asymmetric unit of the DNA complex. These two indistinguishable DNA structures, in which the recognition pattern of EcoO109I is identical, are well superimposed on each other. Therefore, we based our discussion on only one of these structures.
The [...] of normal B-DNA (11.6 Å). Because replacement of Trp-130 with alanine decreases endonuclease activity, the structural changes induced by the π-π interaction are responsible for the sequence-specific hydrolysis activity of this enzyme (Fig. 5).
Specific Recognition of DNA-Base pair recognition of Gua5(X):Cyt9(Y) is the same as that of Cyt9(X):Gua5(Y), and [...] (Fig. 4a) where the main-chain carbonyl oxygen of Trp-130 forms a hydrogen bond with N4 of Cyt9 and the main-chain amide nitrogen of Leu-134 forms a hydrogen bond with N7 of Gua5. Furthermore, Nζ of Lys-173 is hydrogen-bonded to O6 of Gua5 through a water molecule. In the outer GC base pairs, a base-specific interaction occurs between the main-chain carbonyl oxygen of Trp-130 and N4 of Cyt9, by which EcoO109I precisely recognizes Cyt9 from the major groove side.
By contrast, the inner GC base pairs of Gua6(X):Cyt8(Y) and Cyt8(X):Gua6(Y) are recognized from both the major and the minor groove sides (Fig. 4b), where Oγ1 of Thr-66 interacts with O6 of Cyt8 through two water molecules from the minor groove side, Nε2 of Gln-133 interacts with O6 of Gua6 from the major groove side, and Oγ1 of Thr-70 interacts with N2 of Gua6 from the minor groove side. In the inner base pairs, base-specific interactions occur between Nε2 of Gln-133 and O6 of Gua6 as well as between Oγ1 of Thr-70 and N2 of Gua6, by which EcoO109I precisely recognizes Gua6 from both the major and the minor groove sides. The C5 position of cytosine in the inner GC pairs has been shown to be methylated by EcoO109I methyltransferase (27). The C5 atom of Cyt8 corresponds to this position, and the distance between the C5 atom and the Cα atom of Thr-131 corresponds to the normal van der Waals distance (3.6 Å). Therefore, methylation of the C5 position gives rise to steric hindrance and is likely to result in a considerable loss of the protein-DNA interaction, to the extent that the DNA is protected from cleavage.
Both of the degenerate base pairs, Gua4(X):Cyt10(Y) and Thy10(X):Ade4(Y), are recognized from only the major groove side (Fig. 4, c and d). Nζ of Lys-173 (in molecule A) interacts with N7 of Ade4(Y) directly and with N6 of Ade4(Y) through a water molecule (Fig. 4c), and Nζ of Lys-173 (in molecule B) interacts with N7 of Gua4(X) directly and with O6 of Gua4(X) through a water molecule (Fig. 4d). Of the type II REases that recognize degenerate sequences (BsoBI (C↓YCGRG), Bse634I (R↓CCGGY), Cfr10I (R↓CCGGY), and HincII (GTY↓RAC)), the structures of BsoBI and HincII have been determined as complexes with their cognate DNAs. Direct and water-mediated indirect hydrogen bonds like those of Nζ of Lys-173 are also observed in BsoBI, where Nζ of Lys-81 forms direct and water-mediated indirect hydrogen bonds with N7, N6, and O6 of a purine base, but are not observed in HincII, where Nε2 of Gln-109 interacts directly with O2 of a cytosine base from the minor groove side. This difference in the recognition pattern of degenerate base pairs may be attributable to differences in the DNA cleavage pattern. Indeed, BsoBI of the EcoRI family produces 5'-overhang DNA, whereas HincII of the EcoRV family yields blunt-end DNA. Therefore, the purine base-specific direct and indirect hydrogen bonds of lysine may be a general recognition pattern of degenerate base pairs for type II REases of the EcoRI family.
In addition to the degenerate base pair R:Y, DNA with the sequence RG↓GNCCY includes a non-recognized base, N. As expected, no direct or water-mediated indirect hydrogen bonding occurs between protein and DNA in the nonspecific base pair of Cyt7(X):Gua7(Y).
Active Site Structure-Among the type II REases, the sequence PD(X)n(D/E)XK is conserved in the active site, although the overall amino acid sequences of the enzymes are not homologous. In EcoO109I, the conserved sequence corresponds to Ile(109)-Asp-(X)13-Ser-Leu-Lys(126) (Fig. 3a). The electron density map around the active site of the DNA-EcoO109I complex clearly shows that the phosphodiester bond is not cleaved and that a metal ion is located near the active site (Fig. 6a). A REase requires a divalent metal cation such as a Mg2+ or Mn2+ ion to cleave DNA. Given that the crystallization solution contains Na+ and K+ ions and that the metal ion is coordinated square-bipyramidally with six oxygen atoms (the side-chain carboxyl oxygen of Asp-110, a phosphate oxygen of DNA, the main-chain carbonyl oxygen of Leu-125, and three water oxygens) with bond distances ranging from 2.3 to 2.5 Å, the metal ion near the active site can only represent a Na+ ion (28). Most recently, the crystal structure of HincII bound to Ca2+ and cognate DNA has been reported. The DNA is uncleaved, and one calcium ion is bound per active site (6). The Ca2+ in the structure of HincII is also coordinated square-bipyramidally with six oxygen atoms including the scissile phosphate, similar to the structure of EcoO109I. The direct ligation of the metal ion to the scissile phosphate might be a ubiquitous interaction in the metal ion-assisted reaction of type II restriction endonucleases that produce 5'- or 3'-overhang DNA (6).
Fig. 6 (caption, partially recovered). a, the metal ion, which is considered to be a Na+ ion, is shown by a gray sphere, and water molecules are shown by red spheres. The scissile bond is indicated by the red arrow. The potential nucleophilic water molecule is labeled NW. The water molecule that may be a proton donor is labeled PDW. b, schematic representation of the active site structure and possible reaction mechanism. The divalent metal ion (M2+) is indicated along with the two water molecules, NW and PDW.
A water molecule (designated as NW for nucleophilic water), which is a ligand of the metal ion, is hydrogen-bonded to Nζ of Lys-126. It also forms a hydrogen bond with the phosphorus atom of the DNA backbone at Gua6 with an angle across the O3'(Gua5)-P-O(NW) atoms of 172° (Fig. 6a). This almost linear arrangement of the three atoms is suitable for nucleophilic attack on the phosphorus atom by the water molecule, thereby indicating that the water molecule is the nucleophile for hydrolysis of the phosphodiester bond and that the hydrolysis of DNA proceeds by an S(N)2 mechanism.
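The in-line geometry argument rests on a simple three-atom angle, which can be computed from atomic coordinates as sketched below. The coordinates are hypothetical placeholders chosen to give a nearly linear arrangement; they are not taken from the deposited structure, and the function name is illustrative.

```python
import math


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with atom b at the vertex."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cos_t = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos_t))


# Hypothetical coordinates in angstroms (leaving O3', phosphorus, water oxygen).
o3_prime = (-1.6, 0.0, 0.0)
phosphorus = (0.0, 0.0, 0.0)
o_nw = (3.0, 0.4, 0.0)
print(round(angle_deg(o3_prime, phosphorus, o_nw), 1))
```

An angle close to 180° places the attacking oxygen opposite the leaving O3' atom, which is the geometry expected for an in-line displacement at phosphorus.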
A Proposed Catalytic Mechanism-The hydrolysis of a phosphodiester bond catalyzed by EcoO109I has been shown to require a general base to activate the NW molecule, a Lewis acid such as a divalent metal cation to stabilize the negatively charged pentavalent phosphorus in the transition state, and a general acid to donate a proton to the 3'-hydroxyl leaving group (5). On the basis of these three requirements, the hydrolysis reaction starts with the binding of a water molecule to the metal ion near the active site, resulting in a suitable positioning of the water molecule for nucleophilic attack on the phosphorus atom (Fig. 6). Asp-110 is a key residue for localizing a metal ion near the active site, and Lys-126 converts the water molecule to a hydroxyl ion, which is more nucleophilic than a water molecule, through a general base activation mechanism (Fig. 6b). Indeed, D110A and K126A mutants of EcoO109I were found to have lost endonuclease activity (Fig. 5). The general base activation is also aided by polarization of the water molecule through its coordination to the divalent metal ion. The putative nucleophilic water molecule was also observed in the structure of HincII and was within hydrogen-bonding distance of the conserved active site lysine (Lys-129) as well as the oxygen of the 3'-phosphate group of the scissile phosphate (6). The importance of the 3'-phosphate during the activation of the water molecule was discussed (6). NW in EcoO109I also interacts with the oxygen of the 3'-phosphate group of the scissile phosphate. Therefore, the interaction might be involved in the activation of the water molecule.
Nucleophilic attack by the hydroxyl ion on the phosphorus atom releases a 3'-hydroxyl group through the pentavalent phosphorus found in the transition state, showing that the reaction proceeds through an S(N)2 mechanism (Fig. 6b). More than two metal ions, acting as Lewis acids, may contribute to further stabilization of the pentavalent phosphorus. A general acid is finally required to donate a proton to the 3'-hydroxyl leaving group. A water molecule other than the nucleophile is also observed in the active site. This proton-donating water molecule interacts with O3' of Gua5, Nζ of Lys-89, and Oδ1 of Asp-77 and is likely to play a significant role in proton donation to the 3'-hydroxyl leaving group (Fig. 6). The water molecule is thought to be protonated by Asp-77 and changed into an oxonium ion (Fig. 6b). This proton is relayed from Asp-77 to the 3'-hydroxyl leaving group via the water molecule. Indeed, the D77A mutant of the enzyme has no detectable enzymatic activity (Fig. 5). In addition, Asp-77 might be a ligand of other metal ions. However, it is still unknown how many divalent metal cations are required for catalysis. The number of divalent metal cations required may differ for each endonuclease. Further experiments are in progress to clarify the metal ion-assisted hydrolysis reaction catalyzed by EcoO109I.
EcoO109I Represents a Novel Subclass of EcoRI Family-Although the topology of the β-strands of the common structural motif places EcoO109I in the EcoRI family, there are marked differences between EcoO109I and EcoRI in their interactions with DNA. EcoO109I binds DNA from the minor groove side (Fig. 1), which is similar to EcoRV, whereas EcoRI binds from the major groove side. The mutual arrangement of the dimer interface, DNA, and the common motif region in the DNA-EcoO109I complex also differs from that in the DNA-EcoRI complex. Indeed, the interface and the common motif regions of EcoO109I are positioned opposite the DNA, whereas those of EcoRI are not. The mutual arrangement in the DNA-EcoO109I complex is analogous to that in the DNA-BsoBI complex. Among EcoRI family enzymes whose three-dimensional structures have been determined as a complex with DNA, only EcoO109I and BsoBI bind DNA from the minor groove side. Both enzymes recognize degenerate base pairs and produce 5'-overhang DNA. From these findings, we suggest that EcoO109I and BsoBI constitute a novel subclass of the EcoRI family that recognizes a degenerate R:Y pair and produces 5'-overhang DNA. However, further structural analyses of type II REases in complexes with DNA will be required to confirm and generalize this classification.