Generation and Analysis of Mesophilic Variants of the Thermostable Archaeal I-DmoI Homing Endonuclease*

The hyperthermophilic archaeon Desulfurococcus mobilis I-DmoI protein belongs to the family of proteins known as homing endonucleases (HEs). HEs are highly specific DNA-cleaving enzymes that recognize long stretches of DNA and are powerful tools for genome engineering. Because of its monomeric nature, I-DmoI is an ideal scaffold for generating mutant enzymes with novel DNA specificities, similarly reported for homodimeric HEs, but providing single chain endonucleases instead of dimers. However, this would require the use of a mesophilic variant cleaving its substrate at temperatures of 37 °C and below. We have generated mesophilic mutants of I-DmoI, using a single round of directed evolution that relies on a functional assay in yeast. The effect of mutations identified in the novel proteins has been investigated. These mutations are located distant to the DNA-binding site and cause changes in the size and polarity of buried residues, suggesting that they act by destabilizing the protein. Two of the novel proteins have been produced and analyzed in vitro. Their overall structures are similar to that of the parent protein, but they are destabilized against thermal and chemical denaturation. The temperature-dependent activity profiles for the mutants shifted toward lower temperatures with respect to the wild-type activity profile. However, the most destabilized mutant was not the most active at low temperatures, suggesting that other effects, like local structural distortions and/or changes in the protein dynamics, also influence their activity. These mesophilic I-DmoI mutants form the basis for generating new variants with tailored DNA specificities.

Homing endonucleases (HEs),5 also known as meganucleases, produce double strand breaks in DNA. These breaks induce the transposition of mobile intervening sequences, either introns or inteins, that contain the endonuclease open reading frame into cognate alleles that lack these sequences, in a process known as homing (1). HEs are highly sequence-specific enzymes, with recognition sites 12-45 bp long, and therefore cleave very infrequently even in complex genomes. This rare cutting property makes HEs very powerful tools for manipulating the genomes of mammalian cells and plants (2-4).
Sequence homology has been used to classify HEs into four families, the largest having the conserved LAGLIDADG sequence motif (5). Homing endonucleases with only one such motif, such as I-CreI (6), function as homodimers. By contrast, larger HEs containing two motifs, such as I-SceI (7) or I-DmoI (8), are single chain enzymes. The three-dimensional structures of several LAGLIDADG endonucleases have been solved (9-19). These proteins adopt a similar active conformation; the single chain LAGLIDADG proteins display two distinct domains with 2-fold pseudo-symmetry, similar to the perfect 2-fold symmetry of the homodimeric proteins. The two LAGLIDADG motifs form structurally conserved α-helices tightly packed at the center of the interdomain or intermonomer interface. On either side of the LAGLIDADG α-helices, a four-stranded β-sheet provides a DNA-binding interface that drives the base-specific interaction of the protein with each one of the half-sites of the target DNA sequence (11,13). The last acidic residue of the LAGLIDADG motif, located at the C-terminal end of the corresponding helix, participates in DNA cleavage by metal-dependent phosphodiester hydrolysis (13).
Several hundred HEs have been identified to date, but the probability of finding an HE cleavage site in a chosen gene is still very low. Thus, generating artificial meganucleases with tailored specificity by rational design or mutagenesis and screening and/or selection is a field of intense research (20-25). Recently, we used a semi-rational mutagenesis approach coupled with high throughput screening to derive hundreds of novel endonucleases with locally altered specificity from I-CreI, an HE from the LAGLIDADG family (22). We also used a combinatorial and rational approach for engineering the overall specificity of these proteins and for designing endonucleases that cleave sequences from the human RAG1 and XPC genes (26,27).
However, these meganucleases are heterodimers, consisting of two engineered monomers that need to be co-expressed in the targeted cell, a characteristic stemming from the homodimeric nature of I-CreI. As a consequence, we actually produce two undesired homodimers in addition to the heterodimer. We know that natural and engineered meganucleases tolerate a certain degree of degeneracy (5,22,27), and these molecules may contribute to a loss of specificity, as has been observed with zinc finger nucleases, another class of engineered endonucleases (28). One possible way of bypassing this issue is to use single chain scaffolds. Previous studies have described the development of single chain I-CreI versions (29), or of hybrid homing endonucleases, by fusing two LAGLIDADG nucleases, I-DmoI and I-CreI. DmoCre (29) and E-DreI (30) are two similar proteins, differing only in the linker region; both cleave novel hybrid DNA targets consisting of two moieties, one from the I-CreI cleavage site, the other from the I-DmoI cleavage site. However, both proteins have shown intrinsic limits: the single chain I-CreI molecule is imperfectly folded (29) and, probably as a consequence, is difficult to engineer;6 and the I-DmoI/I-CreI chimeras have maintained I-DmoI thermophilic properties (29,30).
Another limit of the I-CreI scaffold is the number of sequences that can be targeted by engineered derivatives. One of the most elusive factors in DNA binding and cleavage by I-CreI is the role of the four central nucleotides of the DNA target molecule. No direct contact has been reported between the protein residues and the four central bases of the DNA; however, several nucleotide substitutions in this region result in a total loss of cleavage by I-CreI and its engineered derivatives (27,31). Previous attempts at modifying the specificity of homing endonucleases were generally based on the prior identification of direct interaction patterns (21,22,25,26,32). In the absence of such information, it may be very difficult to change the recognition pattern in the central part of the I-CreI target. Again, using other scaffolds may address this issue, and they would ideally be monomeric (see above).
I-DmoI is a well characterized HE, encoded by an archaeal intron from the hyperthermophilic archaeon Desulfurococcus mobilis (8,10,33-35). I-DmoI is an attractive scaffold for engineering novel DNA sequence specificities, as its monomeric nature allows for the generation of single chain mutants with altered specificities along the whole DNA recognition sequence. Screening for novel specificities in living cells requires that the activity is detectable at mesophilic temperatures. However, I-DmoI, as expected for an endonuclease from a hyperthermophilic organism, is active essentially only at high temperatures, displaying no activity or only residual activity at 37°C (8). In this study, we generated I-DmoI variants with enhanced cleavage activity at physiological temperatures in yeast. Two of the proteins were characterized in vitro and were less stable and more active at lower temperatures than the wild type. These novel I-DmoI variants could be used as initial scaffolds for engineering proteins that cleave new DNA targets, using a strategy similar to that described previously for I-CreI (22,26,27).

Target Vectors-A library of I-DmoI variant open reading frames was generated as follows: the first 300 bp of the wild-type I-DmoI-encoding gene were amplified by mutagenic PCR (36,37), whereas the second half of the gene was amplified with a high fidelity enzyme. For mutagenic PCR, we used classic PCR conditions known to enhance the natural mutation rate of a Taq polymerase. The MgCl2 concentration was increased to stabilize DNA mismatches, MnCl2 was added to decrease the fidelity of the enzyme, and the concentrations of dCTP, dTTP, and polymerase were increased to favor misincorporation, as described in previous reports (36,37). Under these conditions, the mutation rate was measured to be 6.6 × 10⁻³ per nucleotide, resulting in 96% of mutated molecules, each one with one to six mutations (36). The two PCR products overlapped, which allowed a second, conservative amplification that reconstituted a full-length I-DmoI gene; the first part contained the mutated domain (from the ATG start codon to the beginning of the second LAGLIDADG domain), and the second part contained the wild-type encoding sequence (from the second LAGLIDADG domain to the stop codon). The resulting PCR products were introduced into the 2-μm-based replicative vector pCLS542 marked with the LEU2 gene and transformed into the Saccharomyces cerevisiae selection strain (see below) as described previously (22). Yeast reporter vectors were constructed as described previously (22).
Selection in Yeast-A yeast strain, specific for selecting endonucleases that cleave the I-DmoI-related targets, was constructed from strain FYC2-6A (MATα, trp1Δ63, leu2Δ1, his3Δ200), as described previously (39). The LYS2 endogenous gene was disrupted using the "pop-in" transformation method (40), resulting in two truncated but overlapping lys2 genes, separated by a cassette containing a kanamycin resistance gene and the I-DmoI target site. The ADE2 gene was similarly disrupted, resulting in two truncated ade2 copies separated by an I-DmoI site and a TRP1 gene (39). Cleavage of the I-DmoI sites resulted in functional LYS2 and ADE2 genes by tandem repeat recombination (Fig. 1a). The final selection yeast strain was G418-resistant, Trp+, Ade−, and Lys−. For selection, yeast cells from the selection strain were transformed with the I-DmoI mutant library, marked with the LEU2 gene. Cells were plated onto a selective medium lacking leucine with glucose (2%) as a carbon source. After 3 days of growth at 37°C, colonies were suspended in water, and aliquots of the suspension were plated onto selective media lacking leucine, adenine, and lysine, and containing galactose (2%). Leu+, Ade+, and Lys+ isolates were picked and rearrayed for screening.
Screening in Yeast-After selection, Leu+, Ade+, and Lys+ isolates were screened for their ability to induce tandem repeat recombination in a LacZ reporter plasmid carrying an I-DmoI cleavage site. For this, isolates were mated with the FYBL2-7B (MATa, ura3Δ851, trp1Δ63, leu2Δ1, lys2Δ202) yeast strain carrying the reporter system described in Fig. 1b, and diploids were grown at 37°C on selective media containing galactose (2%). Meganuclease-induced recombination of the LacZ reporter system restores a functional β-galactosidase gene, which can be detected by X-gal staining. Mating, cleavage induction, and X-gal staining were performed as described previously (22).
Sequencing of Positive Clones-DNA was extracted from positive clones using Lyticase (Sigma L2524); 1.5 ml of saturated cultures (microtubes or deep well plates) were centrifuged, and the pellet was resuspended in 50 μl of 33.5 mM KH2PO4, pH 7.5. Lyticase (20 μl of 2.5 units/μl) was added, and the samples were incubated at 37°C for 1 h. Spheroplasts were lysed by hypo-osmotic shock by adding 10 μl of SDS (20% stock). The debris was removed by centrifugation. An aliquot of the supernatant was used to transform Escherichia coli and amplify plasmid DNA. Two E. coli isolates were selected for each yeast-positive isolate. Plasmid DNA was then extracted and sequenced.
Protein Expression and Purification-For expression in E. coli, the I-DmoI gene was obtained as described (29), and the I-DmoI mutants, D1 and D2, were generated using the QuikChange XL site-directed mutagenesis kit (Stratagene). This construct is 200 residues long and differs from the native I-DmoI protein by an extra Ala residue at the N terminus and an AAALEHHHHHH sequence at the C-terminal end for affinity purification. This recombinant enzyme is active (29) and corresponds to the sequence coded by the linear form of the intron after excision; this linear form is 6 residues shorter at the C-terminal end than that coded by the circularized intron (8). Both forms have very similar biochemical properties (38). In the crystal structure of the short enzyme, the last 9 residues are not observed (41), indicating that the C-terminal end is flexible. The genes were inserted into pET-24d(+) vectors (Novagen) and overexpressed in E. coli Rosetta(DE3)pLysS at 25°C. The protein purification protocol was similar to that described for I-CreI homing endonucleases (27,42). Pure proteins in phosphate-buffered saline (PBS, 137 mM NaCl, 10 mM sodium phosphate, 2.7 mM KCl, 2 mM potassium phosphate, pH 7.4) were flash-frozen with liquid nitrogen and stored at −80°C until use.

FIGURE 1. Yeast assays. a, yeast selection assay principle. The selection strain contains two chromosomal reporter systems, based on overlapping truncated copies of the ADE2 or LYS2 genes. An I-DmoI cleavage site is placed between the two truncated repeats. The selection strain is transformed with a plasmid harboring a LEU2 marker and expressing the library. Upon cleavage of the I-DmoI site by a meganuclease (gray oval), tandem repeat recombination restores functional ADE2 and LYS2 genes, by a process referred to as single strand annealing (SSA). The resulting Ade+ Lys+ cells can be selected on an appropriate synthetic medium lacking adenine and lysine. Selection strain construction and details of the selection process have been described in a previous report (39). b, yeast screening assay principle. A strain harboring the expression vector encoding a single mutant is mated with a strain harboring a reporter plasmid. In the reporter plasmid, a lacZ reporter gene is interrupted with an insert containing one of the target sites of interest, flanked by two direct repeats. Upon mating, the meganuclease (gray oval) generates a double strand break at the site of interest, allowing restoration of a functional lacZ gene by SSA between the two flanking direct repeats. This screening step can be carried out after a selection step; in this case, the selected Ade+ Lys+ cells are mated with the strain harboring the reporter plasmid. c, characterization of mutants obtained by random mutagenesis. I-DmoI, I-DmoI mutants, and empty vector are tested against the I-DmoI target at 37°C as described in the text. I-SceI is tested against an I-SceI target site. 1, I-SceI; 2, pCLS0542 (empty expression vector); 3, I-DmoI wild type; 4, I-DmoI K49R, I52F, L95Q; 5, I-DmoI I52F, A92T, F101C (D2); 6, I52F, L95Q (D1); 7, I52F, F101C; 8, A92T, L95Q.
Overloaded Coomassie-stained SDS-polyacrylamide gels showed that the proteins were almost 100% pure. The identity of the proteins was confirmed by mass spectrometry, which indicated that the initial methionine was not present in the purified proteins. Protein concentrations were measured by ultraviolet absorbance using an extinction coefficient of 25,900 M⁻¹ cm⁻¹, calculated from the amino acid sequence using the ProtParam web server.
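The concentration measurement above follows the Beer-Lambert law, c = A/(εl). A minimal Python sketch is given below; the function name, the assumption that the absorbance is read at 280 nm (the wavelength for which ProtParam reports extinction coefficients), and the 1-cm default path length are illustrative choices, not details stated in the text.

```python
def protein_concentration_uM(absorbance, path_cm=1.0, epsilon=25_900.0):
    """Return the protein concentration in micromolar from a UV absorbance reading.

    Beer-Lambert law: c = A / (epsilon * l).
    epsilon is the extinction coefficient quoted in the text (M^-1 cm^-1);
    path_cm is the cuvette path length (assumed 1 cm by default).
    """
    if path_cm <= 0 or epsilon <= 0:
        raise ValueError("path length and extinction coefficient must be positive")
    molar = absorbance / (epsilon * path_cm)  # concentration in mol/l
    return molar * 1e6                        # convert to micromolar


if __name__ == "__main__":
    # Hypothetical reading: an absorbance of 0.259 in a 1-cm cuvette.
    print(f"{protein_concentration_uM(0.259):.2f} uM")
```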
Circular Dichroism (CD) Analysis-Far-UV CD spectra (250-190 nm) were recorded at 20°C on a Jasco-810 dichrograph equipped with a Peltier thermoelectric temperature controller, previously calibrated with d-10-camphorsulfonic acid. The spectra were acquired in the continuous mode with 2 nm bandwidth, 4-s response, and a scan speed of 100 nm/min. We used 0.1-cm path length quartz cuvettes (Hellma) and protein samples with a concentration of 150 μM in PBS for CD analysis. Ten scans were accumulated to obtain the final spectra. Thermal denaturation curves were obtained at a protein concentration of 10 μM in a 2-mm cuvette. The ellipticity at 222 nm was recorded from 5 to 95°C at a heating rate of 1°C/min.
NMR Spectroscopy-1H NMR spectra were recorded at 25°C in a Bruker AVANCE 600 spectrometer equipped with a cryoprobe. The concentrations of protein samples were 500 μM in PBS plus 5% 2H2O. 2,2-Dimethyl-2-silapentane-5-sulfonate sodium salt was used as internal proton chemical shift reference.
Analytical Ultracentrifugation-Sedimentation velocity and equilibrium experiments were performed on 200 μM protein samples in PBS at 20°C, as described previously for I-CreI proteins (42).
Chemical Denaturation-I-DmoI, D1, and D2 proteins were unfolded using guanidine hydrochloride (GdnHCl, Sigma). A stock solution of 8 M GdnHCl was gravimetrically prepared at 20°C in a calibrated volumetric flask. For each point of the titration curve, the protein in PBS was mixed with the desired amount of denaturant and concentrated buffer to obtain a 4 μM final protein concentration, and it was left to equilibrate overnight at room temperature. The ellipticity at 222 nm was recorded at 25°C in a Jasco-810 dichrograph using a 10-mm cuvette (Hellma). The reversibility of the chemical denaturation was checked by similarly measuring the refolding curves starting with the proteins denatured in 8 M GdnHCl. The overlap of the two curves, with differences within the experimental uncertainty, indicated that the chemical denaturation was a reversible process. The unfolding data could thus be analyzed according to the following equation:

$$E = \frac{(E_F + m_F[\mathrm{GdnHCl}]) + (E_U + m_U[\mathrm{GdnHCl}])\,\exp\big((m[\mathrm{GdnHCl}] - \Delta G_{H_2O})/RT\big)}{1 + \exp\big((m[\mathrm{GdnHCl}] - \Delta G_{H_2O})/RT\big)}$$

In the equation, $E$ is the measured ellipticity; $E_F$ is the ellipticity of the folded state; $E_U$ is the ellipticity of the unfolded state; $m_F$ and $m_U$ are the changes in the ellipticity of the folded and unfolded states, respectively, as a function of the GdnHCl concentration and account for sloped base lines; $m$ is the slope in the transition region representing the dependence of the unfolding free energy on denaturant concentration; $\Delta G_{H_2O}$ is the free energy for unfolding; $R$ is the gas constant (8.314472 J/K/mol), and $T$ is the absolute temperature, 298 K (43).
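The two-state analysis described above can be illustrated with a short Python sketch. It assumes the standard linear extrapolation model, in which the unfolding free energy decreases linearly with denaturant concentration (ΔG = ΔG_H2O − m[GdnHCl]) and both base lines are straight lines. The function names and the numerical values in the usage example are hypothetical and are not the fitted parameters of the study.

```python
import math

R = 8.314472  # gas constant in J/K/mol, as quoted in the text


def two_state_ellipticity(d, e_f, m_f, e_u, m_u, dg_h2o, m, temp_k=298.0):
    """Ellipticity predicted for a two-state unfolding transition.

    d       denaturant concentration (M)
    e_f/m_f intercept and slope of the folded base line
    e_u/m_u intercept and slope of the unfolded base line
    dg_h2o  unfolding free energy without denaturant (J/mol)
    m       dependence of the free energy on denaturant (J/mol/M)
    """
    x = (m * d - dg_h2o) / (R * temp_k)  # ln([U]/[F])
    # Fraction unfolded, computed in an overflow-safe way.
    if x >= 0:
        f_u = 1.0 / (1.0 + math.exp(-x))
    else:
        k = math.exp(x)
        f_u = k / (1.0 + k)
    folded = e_f + m_f * d
    unfolded = e_u + m_u * d
    return folded * (1.0 - f_u) + unfolded * f_u


def midpoint(dg_h2o, m):
    """Denaturant concentration at which half of the protein is unfolded."""
    return dg_h2o / m


if __name__ == "__main__":
    # Hypothetical parameters chosen only to draw a curve.
    params = dict(e_f=-20.0, m_f=0.5, e_u=-2.0, m_u=0.1, dg_h2o=40_000.0, m=10_000.0)
    for conc in (0.0, 2.0, 4.0, 6.0, 8.0):
        print(conc, round(two_state_ellipticity(conc, **params), 3))
```

In practice such a function would be passed to a nonlinear least-squares routine to extract ΔG_H2O and m from the measured curve; the fitting step is omitted here to keep the sketch dependency-free.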
In Vitro Cleavage Assay Conditions-Cleavage assays were performed from 20 to 86°C in 10 mM Tris-HCl, pH 8, 50 mM NaCl, 10 mM MgCl2, and 1 mM dithiothreitol. The I-DmoI target sequence was contained in the 3-kb plasmid pGEM-T, which, after linearization by cleavage with XmnI endonuclease and cleavage by I-DmoI, generated 1- and 2-kb fragments. The 25-μl reaction mixtures contained 2 nM DNA target (linearized with XmnI) and 1.7 nM I-DmoI protein (or its variants). The mixtures were prepared on ice and incubated simultaneously for 1 h at the desired temperatures (in the corresponding positions of an Eppendorf Mastercycler with gradient capability). The reactions were then stopped by adding 5 μl of 60% glycerol, 60 mM EDTA, pH 8, and 0.045% (w/v) bromphenol blue. The samples were loaded onto a 1% agarose gel for analysis by gel electrophoresis. The gels were stained using SYBR Safe DNA gel staining (Invitrogen), and the intensity of the bands under UV light was quantified using ImageJ software (rsb.info.nih.gov/ij/). The percentage of cleavage was calculated with the following equation: $\%\,\mathrm{cleavage} = 100 \times (I_{2kb} + I_{1kb})/(I_{3kb} + I_{2kb} + I_{1kb})$, in which $I_{1kb}$, $I_{2kb}$, and $I_{3kb}$ are the intensities of the 1-, 2-, and 3-kb bands, respectively.
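The cleavage percentage defined above is a simple ratio of band intensities. The following Python sketch shows the calculation; the function name and the example intensities are hypothetical.

```python
def percent_cleavage(i_1kb, i_2kb, i_3kb):
    """Percentage of substrate cleaved, from gel band intensities.

    i_1kb and i_2kb are the intensities of the two product bands;
    i_3kb is the intensity of the uncleaved, linearized 3-kb substrate.
    """
    total = i_1kb + i_2kb + i_3kb
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return 100.0 * (i_1kb + i_2kb) / total


if __name__ == "__main__":
    # Hypothetical intensities in arbitrary units.
    print(percent_cleavage(10.0, 20.0, 70.0))
```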
Fluorescence Measurement of Binding Affinities-The dissociation constants for the I-DmoI proteins were obtained from the change in the intrinsic fluorescence of the proteins on binding DNA. Thus, mixtures consisting of 500 nM I-DmoI with various concentrations (0-2000 nM) of the 25-bp-long natural I-DmoI target DNA duplex (35) were prepared in 10 mM Tris-HCl, pH 8, 50 mM NaCl, and 10 mM CaCl2 buffer and incubated for 15 min at 37°C. The DNA duplex was prepared previously from two complementary synthetic oligonucleotides (Operon Biotechnologies); equimolar amounts of the oligonucleotides were mixed in 10 mM Tris-HCl, pH 8.0, 50 mM NaCl, heated for 15 min at 95°C, and left to cool down at room temperature. Initial tests indicated that the fluorescence of I-DmoI decreased on binding to DNA and that the maximum change occurred at 334 nm. Fluorescence emission spectra were then recorded at 37°C in a QuantaMaster QM-2000-7 model spectrofluorometer (Photon Technology International), with excitation fixed at 280 nm and emission in the 325-340 nm range; each spectrum was the average of five consecutive scans. Three independent titration curves were carried out for each protein, which were used to extract the average fluorescence values and the standard errors at each titration point. The dissociation constants ($K_D$) were determined from the change in fluorescence at 334 nm by data fitting (Origin, Microcal) to the following equation:

$$\Delta F = \Delta F_{max}\,\frac{([P] + [DNA] + K_D) - \sqrt{([P] + [DNA] + K_D)^2 - 4[P][DNA]}}{2[P]}$$

in which $\Delta F$ is the change in protein fluorescence at 334 nm on DNA addition; $\Delta F_{max}$ is the maximum fluorescence change (corresponding to the saturated protein); $[P]$ is the total protein concentration (500 nM), and $[DNA]$ is the total concentration of the DNA duplex. The adjustable parameters during the fitting were $\Delta F_{max}$ and $K_D$.
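Because the titration is described in terms of total protein and total DNA concentrations, a 1:1 binding model expressed with total concentrations (the quadratic binding isotherm) is a natural way to relate the fluorescence change to the dissociation constant. The Python sketch below assumes that model; the function names and the example value of the dissociation constant are hypothetical and are not results from the study.

```python
import math


def bound_complex_nM(dna_nM, kd_nM, protein_nM=500.0):
    """Concentration of the 1:1 protein-DNA complex (nM).

    Smaller root of the mass-action quadratic written in terms of
    total protein and total DNA concentrations.
    """
    b = protein_nM + dna_nM + kd_nM
    return (b - math.sqrt(b * b - 4.0 * protein_nM * dna_nM)) / 2.0


def delta_f(dna_nM, kd_nM, delta_f_max, protein_nM=500.0):
    """Fluorescence change, proportional to the fraction of protein bound."""
    return delta_f_max * bound_complex_nM(dna_nM, kd_nM, protein_nM) / protein_nM


if __name__ == "__main__":
    # Hypothetical K_D of 100 nM; DNA range as in the titration (0-2000 nM).
    for dna in (0, 250, 500, 1000, 2000):
        print(dna, round(delta_f(dna, kd_nM=100.0, delta_f_max=1.0), 3))
```

In a fit, `kd_nM` and `delta_f_max` would be the two adjustable parameters, with the protein concentration held fixed at 500 nM.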

RESULTS AND DISCUSSION
Generation and Selection of I-DmoI Variants-The structure of the I-DmoI protein shows that its fold is similar to that of other LAGLIDADG endonucleases and that it is organized into two structural α/β domains with similar topology (10). The N-terminal domain spans the first 100 residues and is slightly longer than the C-terminal domain (the natural untagged protein is 188 residues long). To generate I-DmoI mutants with a putative enhanced activity at 37°C, we generated a library of randomly mutated genes carried by a replicative yeast vector marked with a LEU2 gene. We limited the amount of sequence for analysis by mutating only the first half of the gene, which roughly corresponds to the N-terminal domain (Fig. 2a).
After transformation into S. cerevisiae, 1.8 × 10⁶ independent Leu+ transformants were recovered, pooled, and submitted to a single round of selection and screening following a general process described previously (39). The selection process depends on the induction of two tandem repeat recombination events by the meganuclease, which result in the restoration of functional ADE2 and LYS2 genes (Fig. 1a). As these recombination events can also occur spontaneously, selected transformants were then tested for their ability to induce recombination in a plasmid-borne LacZ-based reporter system (Fig. 1b). Basically, the transformants were individually crossed with a strain of the opposite mating type, carrying the reporter plasmid. This plasmid contains two overlapping truncated β-galactosidase genes flanking an intervening sequence that includes the I-DmoI site. Upon cleavage of the I-DmoI site, tandem repeat recombination restores a functional β-galactosidase gene that can be monitored by X-gal staining. This assay results in bright blue staining on effective cleavage (see I-SceI tested with its own DNA target in Fig. 1, c1), and no coloration in the absence of cleavage (Fig. 1, c2). Although the optimal temperature for yeast growth is 30°C, our yeast-based assays aimed at identifying variants cleaving at 37°C. Indeed, for both selection and screening assays, cells were grown for several days at 37°C during the last steps. No clear staining was observed with I-DmoI in this assay, but a very faint blue color was sometimes observed (Fig. 1, c3), probably as a consequence of residual cleavage that occurred at 37°C. Altogether, about 10⁸ Leu+ cells were plated and 1800 Ade+ Lys+ isolates were obtained following the selection step, of which 150 were picked and screened as described above. X-gal staining revealed 11 positives, which, after sequencing of the meganuclease open reading frames, were found to correspond to 9 I-DmoI variants (Table 1).

FIGURE 2. a, the orientation shown depicts the protein with the proposed active site, the region around the C-terminal ends of the paired helices, one from each domain, at the center of the figure. N and C denote the N- and C-terminal ends of the chain, respectively. The side chains of the mutated residues in D1 and D2 mutants are shown in green; the other residues found mutated in the mesophilic variants identified in yeast are in magenta. b, far-ultraviolet circular dichroism of the three proteins. Under the experimental conditions used, the major source of uncertainty comes from the protein concentration measurements, estimated to be not higher than 5%. c, one-dimensional 1H NMR spectra of the three I-DmoI proteins. The spectrum of the wild-type protein was acquired with presaturation, whereas the spectra of the two mutants were acquired with a double watergate sequence, causing the region in the center of the spectrum to differ markedly between the two sets of spectra. The precision of the resonance frequency in the spectra is better than 0.007 ppm.
Altogether, 11 distinct substitutions were found in the variants. Moreover, eight of these variants displayed more than one mutation. Thus, new I-DmoI variants were created by directed mutagenesis to test the impact of individual substitutions. Mutations such as K49R, A92T, L95Q, and F101C, identified in the selected variants, were introduced into the I-DmoI open reading frame by site-directed mutagenesis, alone or in combination, and the resulting I-DmoI variants were tested by screening for I-DmoI activity in yeast. Thus, we showed that the K49R and F101C mutations had no detectable effect by themselves. By contrast, the I52F, A92T, and L95Q mutations conferred significant cleavage activity to I-DmoI in yeast, alone and in conjunction with other mutations such as N4K, D7V, I19F, M49L, I60V, and L55Q. These additional mutations did not appear to increase significantly the effect of I52F, A92T, and L95Q mutations in our screening assay; however, we could not exclude that they may have positively affected the selection step. Thus, the major impact on cleavage activity was likely because of three substitutions, I52F, A92T, and L95Q. Also, the other mutations had little or no effect on cleavage at 37°C in yeast.
Although most positive mutants resulting from random and directed mutagenesis resulted in a light blue coloration (Table 1, and for examples see Fig. 1, c5, c7, and c8), two mutants caused an intense blue coloration in our yeast assay (Table 1, and see Fig. 1, c4 and c6), similar to the significant staining obtained with I-SceI and its target DNA (Fig. 1, c1). Both mutants displayed the I52F and L95Q mutations, alone or in conjunction with K49R, suggesting that I52F and L95Q have an additive or synergistic effect.
It has been shown that, after random mutagenesis, a very high mutation rate decreases the number of active isolates, whereas inefficient mutagenesis results in almost no variant isolates (44). Our mutagenesis was balanced enough to allow for the identification of independent positives bearing distinct sets of mutations. These conditions resulted in 1-2% of mutations at the nucleotide level. It was difficult to infer the proportion of true positives precisely because of the selection protocol. Nevertheless, this proportion was evaluated as 1.8 × 10⁻⁴, this number being the ratio of Ade+ Leu+ colonies divided by the number of Leu+ colonies plated on the selective medium (see above). Similar conditions were also used to improve the activity of engineered meganucleases derived from I-CreI (27), and have since been applied routinely for other similar processes.6

Structural Characterization of I-DmoI and Two of Its Variants-We selected two active mesophilic variants for further in vitro analysis and for comparison with the wild type to better understand their behavior. Variant I-DmoI I52F, L95Q (one of the two mutants resulting in intense blue staining in our yeast assay; Fig. 1, c6; hereafter named D1), variant I-DmoI I52F, A92T, F101C (one of the other category of mutants; Fig. 1, c5; hereafter named D2), and the wild-type protein (Fig. 1, c3) were overexpressed and purified from E. coli as described under "Experimental Procedures." To check the structural integrity of the D1 and D2 mutants, we analyzed their structure and stability by circular dichroism, NMR, and thermal and chemical denaturation. The similarity in the far-UV circular dichroism spectra of the three proteins (Fig. 2b) indicated that the two mutants have essentially the same secondary structure content as wild-type I-DmoI.
The existence of a well defined tertiary structure in a protein can be quickly inspected by nuclear magnetic resonance, in which a dispersed and narrow set of signals in its 1H spectrum is diagnostic of a well folded protein. The I-DmoI spectrum showed dispersed signals both in the upfield and downfield regions, and the spectra of the two mutants showed a pattern of signal dispersion similar to that of the wild-type protein (Fig. 2c). There are small differences close to the signal of the remaining water protons (at around 4.7 ppm) because of the effect of solvent suppression. There are also shifts in the signals at around 0.0 ppm, probably because of the changes in the methyl groups and in the shielding caused by aromatic groups as a result of mutations in the D1 and D2 variants. Although local conformational changes cannot be excluded and are possibly responsible for the changes in the frequencies of some signals, the spectra indicated that the mutants are folded into a tertiary structure that is essentially the same as that of the wild type. The NMR signals of the three proteins also had very similar linewidths, indicating that the mutations did not change the oligomerization state of the proteins nor induce aggregation of the monomers. We analyzed the proteins by analytical ultracentrifugation to confirm the monomeric state of our I-DmoI construct and the mutants. Sedimentation equilibrium and velocity experiments demonstrated that the three proteins are monomeric (Fig. 2d). Together, these results show that the mutants are folded and conserve the I-DmoI scaffold. However, the mutations that increase the cleavage activity in vivo may also have a destabilizing effect on the proteins.

TABLE 1 (only the column headings and footnotes are recoverable). Columns: I-DmoI protein (a); Origin; Activity in yeast (b).
a K49R mutation refers to amino acid 49; K is the residue in wild-type I-DmoI, and R is the residue in the mutant.
b Cleavage activity of the I-DmoI mutants in yeast at 37°C classified as absent (−, faint or no blue color), low (+, light blue color), or high (++, intense blue color).
c sg indicates synthetic gene obtained by PCR of overlapping oligonucleotides and inserted into pCLS542.
d rm indicates mutants resulting from random mutagenesis and selection/screening.
e dm indicates mutants resulting from directed mutagenesis.
We carried out thermal denaturation experiments followed by circular dichroism analysis to measure the relative stability of the three proteins. The three proteins showed cooperative, sigmoidal transitions (Fig. 3a). This result was consistent with the presence of a folded tertiary structure. However, there was clear destabilization of the two mutants in relation to the wild-type protein, with differences in the apparent mid-point temperatures of about 10°C for D1 and 16°C for D2.7 We measured the free energy of the unfolding reaction for I-DmoI, D1, and D2 proteins in chemical denaturation experiments to quantify the destabilization of the mutants. We used guanidinium hydrochloride because the wild-type protein was so stable that it still showed spectroscopic features of the folded protein in 6 M urea (data not shown). The addition of GdnHCl resulted in the loss of secondary structure, which was followed by circular dichroism at 222 nm (Fig. 3b). The data could be fit to a two-state unfolding transition using the equation given under "Experimental Procedures," resulting in the free energies of unfolding for the three proteins (Table 2). The I-DmoI D2 protein was the least stable, with a free energy of unfolding ~47 kJ/mol lower than that of the wild-type protein; D1 was more stable than D2 but was still destabilized by 37 kJ/mol in relation to the wild-type protein. The slope of the curve in the transition region (m) measures the dependence of the change in free energy on denaturant concentration, reflects the cooperativity of the unfolding reaction, and is correlated with the change in accessible surface area on unfolding (45). The unfolding reaction of the two mutants is less cooperative than the unfolding of the wild type, as shown by their smaller m values. These differences are much larger than the experimental uncertainty observed in the value of m.
In this respect, we assumed that the experimental uncertainty was the same as that determined for the RNase barnase, for which a 6% variation in repeated measurements of chemical denaturation curves was obtained (46). One possible explanation for the reduced m values is that the wild-type protein unfolds to a greater extent than the two mutants, whose unfolded states may be more compact and expose less surface area. Another explanation is that the unfolding of the mutants departs from a pure two-state process, with the possible appearance or stabilization of an intermediate species that lowers the value of m (47).
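The relationship between the free energy of unfolding, the m value, and the shape of a chemical denaturation curve can be sketched numerically. The equation actually used for the fits is given under "Experimental Procedures" and is not reproduced in this section, so the sketch below assumes the standard linear extrapolation model for a two-state transition, in which the free energy of unfolding decreases linearly with denaturant concentration. The function names and all parameter values are hypothetical and are not the values reported in Table 2.

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol*K)


def fraction_unfolded(denaturant_molar, dg_water, m_value, temp_kelvin=298.15):
    """Fraction of unfolded protein for a two-state transition.

    Assumes the linear extrapolation model (an assumption of this sketch):
    dG([D]) = dG(H2O) - m * [D], with dG in kJ/mol and m in kJ/(mol*M).
    """
    dg = dg_water - m_value * denaturant_molar
    # f_U = K / (1 + K) with K = exp(-dG/RT), rewritten as 1 / (1 + exp(dG/RT))
    return 1.0 / (1.0 + math.exp(dg / (R * temp_kelvin)))


def midpoint(dg_water, m_value):
    """Denaturant concentration at which half of the protein is unfolded."""
    return dg_water / m_value


def transition_width(m_value, temp_kelvin=298.15):
    """Denaturant span between 10% and 90% unfolded (in M)."""
    # f_U = 0.1 at dG = +RT ln 9 and f_U = 0.9 at dG = -RT ln 9
    return 2.0 * R * temp_kelvin * math.log(9.0) / m_value


if __name__ == "__main__":
    # Hypothetical parameters: same midpoint, different cooperativity.
    for label, dg_water, m_value in [("high m", 40.0, 10.0), ("low m", 20.0, 5.0)]:
        print(label,
              "midpoint (M):", midpoint(dg_water, m_value),
              "10-90% width (M):", round(transition_width(m_value), 2))
```

In this model a smaller m value broadens the transition region, which is the sense in which the unfolding of a protein with a reduced m value is described as less cooperative.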
7 These temperatures are apparent and not real mid-point denaturation temperatures because the thermal denaturation resulted in an irreversible precipitation of the protein inside the quartz cuvette. The decrease in the circular dichroism signal is due both to the heat-induced denaturation, with its concomitant loss of secondary structure, and to the decrease in the concentration of soluble protein, and hence that of its chromophores.
In Vitro Cleavage Activity of I-DmoI Variants-We used a cleavage assay on the I-DmoI target DNA sequence over a wide range of temperatures to evaluate the cleavage activity of the enzymes as a function of temperature. The wild-type I-DmoI enzyme had barely detectable activity at 37°C, whereas both mutants already showed cleavage activity starting at 32°C (Fig. 4a); the I-DmoI D1 mutant was more efficient than D2 at 37°C, which was consistent with the results of the in vivo selection experiment (Fig. 1c and Table 1). The profiles of the percentage of cleavage versus temperature for the mutants were shifted toward lower temperatures relative to that of the wild-type protein (Fig. 4b); 50% cleavage was obtained at 58, 51, and 53°C for the wild-type, D1, and D2 I-DmoI proteins, respectively, and 22, 71, and 59% cleavage was observed at 54°C for the same three proteins. Interestingly, and despite the different stabilities and activities at mesophilic temperatures, the three proteins reached their maximal activities at around 65°C, and these were maintained up to 75°C. Other mesophilic meganucleases also show an increased activity at high temperatures; I-CreI has a maximal activity between 50 and 70°C (6), and I-CeuI is more active at 50°C than at lower temperatures (48). The drop in activity of D2 at 86°C was possibly because of its denaturation at this temperature (Fig. 3a), with the measured activity occurring during the time necessary for complete denaturation, a time that is likely to be longer for the more stable D1, which would explain its higher activity at 86°C. However, the activity at high temperatures should be interpreted with caution, as both protein and DNA denaturation may occur. We determined the effect of the mutations on the binding affinity for the I-DmoI target DNA sequence by measuring the dissociation constants of the I-DmoI proteins bound to a 25-bp-long DNA duplex with the natural I-DmoI target sequence. The dissociation constants measured at 37°C were in the 50-120 nM range but were lower for the mutants than for the wild type, indicating that the mutants have a higher affinity for the DNA (Fig. 5). The I-DmoI D1 variant is the one that binds the DNA with the highest affinity, which is consistent with the highest cleavage activity also shown by D1 at 37°C.
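To illustrate why a lower dissociation constant corresponds to a higher affinity, the following minimal sketch computes the fraction of DNA bound at a given protein concentration. It assumes simple 1:1 binding with the DNA at a concentration well below the dissociation constant, so that the free protein concentration approximates the total; this is an assumption of the sketch and not necessarily the model used to obtain the values in Fig. 5. The two dissociation constants are simply the bounds of the 50-120 nM range quoted above and are not assigned to any particular protein, and the protein concentration is hypothetical.

```python
def fraction_dna_bound(protein_nM, kd_nM):
    """Fraction of DNA in complex for 1:1 binding with protein in excess.

    Assumes free protein is approximately equal to total protein
    (DNA concentration much lower than Kd), an assumption of this sketch.
    """
    return protein_nM / (kd_nM + protein_nM)


if __name__ == "__main__":
    protein_nM = 100.0  # hypothetical protein concentration
    for kd_nM in (50.0, 120.0):  # bounds of the range quoted in the text
        bound = fraction_dna_bound(protein_nM, kd_nM)
        print(f"Kd = {kd_nM:.0f} nM -> fraction bound = {bound:.2f}")
```

At any fixed protein concentration, the protein with the lower dissociation constant occupies a larger fraction of the target DNA, and half of the DNA is bound when the protein concentration equals the dissociation constant.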
Structural Insights into the Thermophilic-Mesophilic Characteristics of I-DmoI Variants-The structure of I-DmoI bound to its target DNA is unknown, but the structure of the free protein, its comparison with other LAGLIDADG meganucleases, and the biochemical and genetic data available for the I-DmoI/DNA interaction suggest that the active site is in the region of the C-terminal ends of the two α-helices that lie at the interface of the two structural domains, whereas the putative DNA-binding surface is formed by the two antiparallel four-stranded β-sheets (10). These conclusions are strongly supported by the structure of the E-DreI-DNA complex, which contains the N-terminal domain of I-DmoI bound to its DNA target half-site (30). Our strategy for random mutation concentrated the residue changes in the N-terminal domain (Fig. 2a), which contains the first LAGLIDADG motif and part of the active site; however, the in vivo selection process excludes mutants with residue changes at the active site or its surroundings that severely impair DNA binding and/or cleavage activity, and it also excludes mutations that produce misfolded, inactive proteins.
As can be seen in Fig. 2a, most mutations in the mesophilic variants occurred in the α-helix and the loop at the end of the N-terminal domain (A92T, M94L, L95Q, F101C) in combination with mutations in the second α-helix (K49R, I52F). The nature of the residue changes in the active mutants suggests that only conservative or slightly different residue changes are compatible with the activity in vitro. The only two mutations in active variants that introduced (N4K) or eliminated (D7V) a charged residue were at the N-terminal end of the protein, which was not observed in the x-ray structure and hence is probably highly flexible. These two residue changes appeared together with other mutations, which are the ones responsible for the increased activity at 37°C, suggesting that the flexible N-terminal end does not play a major role in DNA cleavage. The other major change occurred in the F101C mutant, in which we observed a large reduction in the size of the side chain as well as an increased polarity. This mutation also appeared together with other residue changes that enhanced the activity at 37°C. The rest of the mutations (I19F, K49R, I52F, L55Q, I60V, A92T, L95Q, and M94L) produced small changes in the size of the side chain, slight increases in its polarity, or both. Mutation K49R is unlikely to have a large effect on the structural properties of the protein, as it affects a solvent-exposed residue that does not participate in specific interactions, and it maintains the positive charge. The other residues are buried to a varying extent, especially I52, which is completely buried (0% accessible surface area) in the hydrophobic core. In summary, the enhanced activity at mesophilic temperatures of the selected active and stable variants can be attributed to the partial destabilization introduced by residue changes that alter the size and polarity of residues in the hydrophobic core, both of which are well known determinants of protein stability (49).
However, the most destabilized protein (D2) was not the one with the highest activity at lower temperatures (D1), suggesting that other effects, for example local structural distortions and/or changes in the protein internal dynamics, may also influence the cleavage activity at various temperatures.
The incorporation of biological catalysts in industrial processes and products has generated the need for higher yields and fostered enzyme engineering (50). Manipulating protein stability was and still is one of the major goals. The earliest attempts were essentially aimed at enzyme thermostabilization (51), but a growing number of studies aimed at lowering functional temperatures have been reported. As observed for I-DmoI, improvement was in several cases observed over a wide range of temperatures. For example, the hyperthermophilic xylose isomerase from Thermotoga neapolitana, with an optimal temperature of 95°C, was engineered for increased activity at 60°C (52), but an increased activity was also reported at higher temperatures, with optimal activity at 90-95°C. In a related experiment, the thermostable glycosylhydrolase CelB from Pyrococcus furiosus was optimized for activity at low temperatures (53). Again, for one mutant, activity was increased 2-fold at 20°C but was also enhanced at 90°C. In addition to thermophilic proteins, mesophilic enzymes have also been optimized for lower temperatures (54,55) or even converted into psychrophilic proteins (56).
Proteins are stabilized primarily by several noncovalent interactions, which must compensate for the enormous decrease in chain entropy upon folding (57). These interactions provide only marginal stability, with free energies of unfolding on the order of 10-60 kJ/mol. Identifying the molecular origins of the extra stability shown by proteins in thermophilic organisms has been the subject of several studies that tried to define rules for protein stabilization (58). In fact, most single amino acid exchanges result in protein destabilization, particularly if they occur in the protein interior (59). Thus, destabilizing a protein should be easier than stabilizing it, and lowering the optimal temperature of an enzyme would in theory be easier than increasing it. In the engineered versions of the P. furiosus CelB protein (see above), mutations were found in the active site region, at subunit interfaces, at the surface, and buried in the protein core (53). In a related experiment in which a mesophilic subtilisin-like protease was converted into a psychrophilic one, the cold-activating mutations were located in both surface-exposed and buried positions, both close to and distant from the active site, as inferred from a homology-based model structure (56). In this latter case, the thermodynamic stability of the mutant proteins was not measured; however, the half-times of inactivation at high temperatures indicated that although a trend existed, there was no strict correlation between destabilization and enhanced low temperature activity. This observation is consistent with our analysis of the I-DmoI proteins and further suggests that the mutations may also cause local conformational changes and/or affect the local dynamics of the chain, thereby modulating the cleavage activity at different temperatures.
Strategies for Engineering Enzyme Stability-In this study, we used a random mutagenesis procedure followed by a selection/screening procedure. This strategy, assisted or not by gene recombination, remains the most effective, as has been shown in the engineering of enzymes for use in unusual environments (60,61). More rational approaches have also been explored, including the introduction of possible stabilizing residues based on three-dimensional structures (62,63). Furthermore, the growing number of available enzyme sequences has allowed for novel strategies based on sequence alignments. Sequence comparison of thermophilic, mesophilic, and psychrophilic enzymes with similar activities often suggests a common ancestor that could theoretically be re-engineered to adjust its optimal temperature (64,65). By comparing two related genes, Serrano et al. (66) identified the contribution of individual substitutions to protein stability. In a more elaborate approach, a synthetic phytase gene was designed, based on the comparison of 13 mesophilic and low thermophilic fungal enzymes, allowing for a 30°C increase in its mid-denaturation temperature (up to 90°C), in comparison with its most thermostable parent (67,68). Although several mesophilic and thermophilic LAGLIDADG endonucleases have now been identified (5,69), the low level of similarity between these proteins appears to preclude the use of similar strategies for this family of proteins. However, the identification of engineered derivatives with altered properties, which are correlated with a limited number of known mutations (such as I-DmoI D1 and D2), may allow for the identification of stabilizing or destabilizing patterns. As LAGLIDADG proteins are relatively conserved at the structural level, mutations analogous to the ones identified in this study may be transposed to other proteins in order to assay their impact on temperature requirements.
More recently, the development of powerful computational methods has opened new strategies for engineering protein stability (70). Similar methods were also used for predicting the specificity of engineered LAGLIDADG meganucleases (20,22). There are no theoretical reasons why such methods could not be used to engineer the stability of LAGLIDADG proteins as well, or at least to massively decrease the complexity of the libraries to be screened experimentally by sorting mutations in a primary in silico screening.

CONCLUSIONS
The two I-DmoI variants obtained by one round of directed evolution behave as mesophilic homing endonucleases as a result of mutations that destabilize the proteins by changing the packing and the polarity balance of the hydrophobic core. This destabilizing effect does not compromise the structural integrity of the proteins. Indeed, the proteins may be further modified by mutation and selection to engineer novel specificities along the whole DNA recognition sequence in a single chain meganuclease, overcoming the limitations inherent in homodimeric meganucleases.