Evolutional Design of a Hyperactive Cysteine- and Methionine-free Mutant of Escherichia coli Dihydrofolate Reductase*

We developed a strategy for finding adapted variants of enzymes and applied it to dihydrofolate reductase (DHFR), using catalytic activity as the criterion. We thereby obtained several hyperactive cysteine- and methionine-free variants of DHFR in which all five methionyl and two cysteinyl residues were replaced by other amino acid residues. Among them, the variant M1A/M16N/M20L/M42Y/C85A/M92F/C152S, named ANLYF, has an approximately seven times higher kcat value than wild type DHFR. Enzyme kinetics and crystal structures of the variant were investigated to elucidate the mechanism of the hyperactivity. Steady-state and transient binding kinetics of the variant indicated that the kinetic scheme of the catalytic cycle of ANLYF was essentially the same as that of wild type, showing that the hyperactivity was brought about by an increase of the dissociation rate constants of tetrahydrofolate from the enzyme-NADPH-tetrahydrofolate ternary complex. The crystal structure of the variant, solved and refined to an R factor of 0.205 at 1.9-Å resolution, indicated that an increased structural flexibility of the variant and an increased size of the N-(p-aminobenzoyl)-l-glutamate binding cleft induced the increase of the dissociation constant. This was consistent with a large compressibility (volume fluctuation) of the variant. A comparison of folding kinetics between wild type and the variant showed that the two enzymes fold similarly, suggesting that the activity enhancement of the enzyme can be attained without drastic changes of the folding mechanism.

tein with the desired properties is found, because the structure and function of a protein are determined by its amino acid sequence (1). When we set the goal of protein design "to create a protein of a desired property consisting of a given number of amino acid residues n," the solution can be obtained by the complete search of all the possible amino acid sequences with n amino acid residues, the total number reaching 20^n. Therefore, the protein design problem can be reduced to a searching problem in the sequence space of polypeptides with n amino acid residues. The solution to this searching problem should include a reliable process that can be performed within a realistic time span; otherwise it is of no use in practical protein design. In this regard, the size of the sequence space to be searched is a critical factor (2). The size of the sequence space of a polypeptide with as few as 100 amino acids is 20^100 (~10^130), a number that is unreachable even within an astronomical time span if one were to examine all the amino acid sequences. To avoid this problem, the evolutionary algorithm is a powerful and useful strategy that has been used in biotechnology to search amino acid sequence space for proteins with functional properties that are superior to the natural products (2,3). Adaptive walking in the sequence space by DNA shuffling or by error-prone PCR has resulted in the significant improvement of naturally occurring enzymes, as demonstrated by Stemmer and co-workers (3,4), Chen and Arnold (5), and others (6,7). The keys to success in these evolutionary algorithms are how one can create a large mutant pool and how one can develop a powerful selection system for the target protein from the large mutant pool (2). The former limits the reproducibility and the latter limits the generalization.
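The size of the sequence space quoted above can be checked with a few lines of Python. This is only an illustrative sketch; the function names are hypothetical and are not part of the original work.

```python
import math

def sequence_space_size(n_residues: int, alphabet: int = 20) -> int:
    """Exact number of sequences of length n over the given alphabet (20^n)."""
    return alphabet ** n_residues

def log10_sequence_space(n_residues: int, alphabet: int = 20) -> float:
    """Order of magnitude (log10) of the sequence space size."""
    return n_residues * math.log10(alphabet)

# For a 100-residue polypeptide the exponent is about 130,
# in line with the estimate of ~10^130 given in the text.
exponent_100 = log10_sequence_space(100)
```

Python integers are arbitrary precision, so `sequence_space_size(100)` returns the exact value of 20^100 rather than a floating-point approximation.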
Here we propose a novel picking-up process in the sense of protein design, a reproducible process that can be performed within a realistic time span by greatly reducing the number of mutant proteins to be tested, which is a useful strategy for the improvement of naturally occurring proteins. The outline of the proposed process, "quasi-additive adaptive walking with mutant data base" (QAW), is shown in Fig. 1 (see also under "Results"). To test the effectiveness of our proposed method, we applied it to an enzyme, dihydrofolate reductase (DHFR, EC 1.5.1.3) from Escherichia coli, to create a cysteine- and methionine-free mutant DHFR while retaining an enzymatic activity as high as that of the wild type. In this study, we successfully obtained several hyperactive variants of DHFR in which all five methionyl and two cysteinyl residues were replaced by other amino acid residues. DHFR is a monomeric protein with two domains, which catalyzes the reduction of dihydrofolate to tetrahydrofolate by using the reducing cofactor NADPH (8). DHFR is a clinically important enzyme not only as the target of a number of antifolate drugs, such as trimethoprim and methotrexate (8), but also as the enzyme used to produce l-leucovorin, an anti-cancer drug, in a stereospecific manner (9). Because overproduction of DHFR in cells makes E. coli trimethoprim-resistant (tmpR), DHFR has been used not only as a selectable genetic marker, tmpR (10), but also as a handle protein (affinity handle) for the production of peptides (11). The catalytic reaction of the wild type DHFR has been reported to occur along a preferred reaction pathway with several intermediate states (12), whose structures have already been related to ligand-bound forms by extensive crystallographic studies (13)(14)(15)(16). In the course of such studies of DHFR, a number of variants of DHFR with a point mutation(s) at a site(s) within or in the vicinity of the active site were constructed (17)(18)(19)(20)(21)(22)(23).
Such mutational studies uncovered the roles of the residues around the active site. However, the role in catalytic activity of residues outside the active site, such as those located in the hydrophobic core of the molecule, remains unclear. One of the hyperactive Met- and Cys-free variants obtained in this study (M1A/M16N/M20L/M42Y/C85A/M92F/C152S), named ANLYF, provided a suitable example to consider the role of amino acid residues outside the active site, because the replaced residues (except for  have little or no direct contact(s) with the substrate or coenzymes in DHFR. It may also lead to an understanding of the essence of the quasi-additivity of the recombination of single mutations. Additive effects of single mutations on catalytic activity mean that each single mutation affects the catalytic activity independently, so that analysis of the effects of each mutation would explain the total enhancement of the catalytic activity (24). In contrast, quasi-additivity of the mutational effects implies a mutual relationship between the mutated sites (25,26). Thus, it is necessary to elucidate the enzymatic mechanisms not only in detail but also as a whole to reveal the essence of the quasi-additive effects on the catalytic activity. For that purpose, we also performed enzyme kinetics and structural studies of ANLYF, which were compared with those of the wild type. Our results indicate that the increase of flexibility of the overall molecule as well as restriction of the available conformation induced by the mutations is important for the enhancement of the catalytic activity. Furthermore, we describe the structural formation of ANLYF from its urea-induced unfolded state, suggesting that the folding kinetics would be conserved despite the significant change of the catalytic activity.

EXPERIMENTAL PROCEDURES
Materials-E. coli JM109 was used as the cloning and expression host. Methotrexate (MTX) affinity resin was purchased from Sigma. DEAE-Toyopearl 650 M was from Tosoh (Tokyo, Japan). Restriction enzymes, T4 DNA ligase, Taq polymerase, and Achromobacter protease I were purchased from Takara (Kyoto, Japan). DNA primers for mutagenesis were synthesized by JbioS Ltd. (Saitama, Japan). All other chemicals were of reagent grade.
Plasmid Construction and Protein Purification-The genes for the wild type DHFR and a cysteine-free double mutant of DHFR (AS-DHFR) were from pTZwt1-3 and pDHFR20, respectively (27,28). The following two DNA primers were commonly used for amplifying an entire gene with a ribosome-binding site, overexpression promoter, and BamHI sites at both ends: 5′-GCGGGGATCCTCTTGACAATTAGTTAACTATTTGTTATAATGTAT-3′ (P35-Bam) and 5′-GGGGGATCCTTAACGACGCTCGAGGATTTCGAA-3′ (DHFR-C). Site-directed random mutagenesis was carried out by a procedure similar to the one described previously (29) using appropriate mixed primers for the mutagenic PCR. The resulting genes were inserted into the BamHI site of a high copy vector, pUC18, which was used to transform E. coli JM109 cells to trimethoprim resistance. Recombinant plasmids were isolated from colonies on agar plates containing 200 μg ml⁻¹ ampicillin and 10 μg ml⁻¹ trimethoprim. DNA sequences of the BamHI insert of the isolated plasmids were determined, and the plasmids with mutant sequences were selected. The original colony with the selected plasmid was used to determine DHFR activity and was used for protein purification.
The activity of the mutant proteins was estimated as follows: each E. coli transformant was grown in 2 ml of 2× YT medium containing 200 μg ml⁻¹ ampicillin, at 37°C, until the absorbance at 660 nm was ~1.5. The cells from 1 ml of the culture were collected and suspended in 0.2 ml of 10 mM potassium phosphate, pH 7.8, 0.2 mM EDTA containing 1 mg ml⁻¹ lysozyme. After gentle mixing for 30 min, cells were disrupted by sonication for 10 s. Cell debris was removed by centrifugation, and the resultant supernatant (1-10 μl) was used for the DHFR assay. To normalize activity to the number of cells used, the measured activity (ΔA340/min) was divided by the absorbance at 660 nm ((ΔA340/min)/A660). The enzyme activity relative to the wild type DHFR is defined as 100 × ((ΔA340/min)/A660 for the mutant protein)/((ΔA340/min)/A660 for the wild type DHFR).
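The normalization defined above can be written as a short calculation. The following is a minimal sketch; the function names and the example readings are hypothetical and serve only to illustrate the definition.

```python
def normalized_activity(delta_a340_per_min: float, a660: float) -> float:
    """Activity per unit of cell density: (ΔA340/min) / A660."""
    if a660 <= 0:
        raise ValueError("A660 must be positive")
    return delta_a340_per_min / a660

def relative_activity(mutant: tuple, wild_type: tuple) -> float:
    """Relative activity in percent of wild type.

    Each argument is a (ΔA340/min, A660) pair measured for one culture.
    """
    return 100.0 * normalized_activity(*mutant) / normalized_activity(*wild_type)

# Hypothetical readings: the mutant extract is three times as active
# as the wild type extract at the same cell density.
example_percent = relative_activity((0.30, 1.5), (0.10, 1.5))
```

Dividing by A660 before taking the ratio means that cultures harvested at slightly different densities can still be compared directly.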
Purification of mutant DHFRs was carried out primarily by MTX affinity chromatography, following adequate pre-purification steps from cell-free extracts (11). Purified protein was stored in a precipitated form in 10 mM phosphate buffer, pH 7.0, containing 1 mM EDTA, 14 mM 2-mercaptoethanol, and saturated ammonium sulfate. Protein concentrations of the wild type and AS-DHFR were determined from the absorbance at 280 nm using the molar extinction coefficient (ε280 = 31,100 M⁻¹ cm⁻¹) (30) for the wild type DHFR. Protein concentrations of the other DHFRs were determined by the Bradford method (31) using wild type DHFR as the standard protein.
Enzyme Assay and Steady-state Kinetic Parameters-The activity of DHFR was determined spectrophotometrically at 15°C by following the disappearance of NADPH and DHF at 340 nm (ε340 = 11,800 M⁻¹ cm⁻¹) (32). The standard assay mixture contained 50 μM DHF, 100 μM NADPH, 14 mM 2-mercaptoethanol, 1× MTEN buffer (50 mM MES, 25 mM Tris, 25 mM ethanolamine, and 100 mM NaCl, pH 7.0) (33), and the enzyme in a final volume of 2.0 ml. The reaction was started by the addition of DHF. Michaelis parameters (kcat and Km) were determined by measurements with various concentrations of DHF and nonlinear least squares fitting of the data.
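The data reduction described above can be sketched in two steps: conversion of the absorbance slope to a rate by the Beer-Lambert law, and a least squares fit of the Michaelis-Menten equation. The sketch below is not the fitting program used in this study. The 1-cm path length, the function names, and the synthetic data are assumptions, and the fit uses a simple grid refinement over Km with Vmax solved in closed form at each trial value.

```python
def rate_um_per_min(delta_a340_per_min, epsilon=11800.0, path_cm=1.0):
    """Convert an absorbance slope (ΔA340/min) to a rate in μM/min.

    epsilon is the value quoted in the text (M^-1 cm^-1); the 1-cm path
    length is an assumption.
    """
    return delta_a340_per_min / (epsilon * path_cm) * 1e6

def _sse_and_vmax(km, s, v):
    """For a trial Km, return the residual sum of squares and the best Vmax."""
    f = [si / (km + si) for si in s]
    vmax = sum(vi * fi for vi, fi in zip(v, f)) / sum(fi * fi for fi in f)
    sse = sum((vi - vmax * fi) ** 2 for vi, fi in zip(v, f))
    return sse, vmax

def fit_michaelis_menten(s, v, km_lo=1e-3, km_hi=1e3, rounds=6, points=50):
    """Least squares fit of v = Vmax * s / (Km + s); returns (Vmax, Km)."""
    best_km = km_lo
    for _ in range(rounds):
        # Scan a log-spaced grid of Km values, then narrow around the best one.
        ratio = (km_hi / km_lo) ** (1.0 / (points - 1))
        grid = [km_lo * ratio ** i for i in range(points)]
        sses = [_sse_and_vmax(km, s, v)[0] for km in grid]
        i = sses.index(min(sses))
        best_km = grid[i]
        km_lo = grid[max(i - 1, 0)]
        km_hi = grid[min(i + 1, points - 1)]
    return _sse_and_vmax(best_km, s, v)[1], best_km

# Synthetic, noise-free data generated with Vmax = 10 and Km = 2
# (arbitrary units), used only to check that the fit recovers them.
substrate = [0.5, 1.0, 2.0, 5.0, 10.0, 50.0]
rates = [10.0 * x / (2.0 + x) for x in substrate]
vmax_fit, km_fit = fit_michaelis_menten(substrate, rates)
```

Given the total enzyme concentration, kcat follows as Vmax divided by that concentration.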
Crystallization and Data Collection-ANLYF was crystallized as a folate complex. Folate was added to the sample at a 3-fold molar ratio, and the sample was concentrated to 20 mg ml⁻¹ by centrifugation using a centricut (Kurabo, Osaka, Japan). The crystal was grown in drops of the sample, 0.1 M Tris, pH 8.5, 0.2 M MgCl2, and 35% PEG 6000. The crystal was grown at 4°C, and the size of the resulting crystal was 0.5 × 0.6 × 0.7 mm.
Preliminary crystal data were determined using a DIP2030 imaging plate diffractometer installed on an MX18HF rotating anode generator (MAC Science). Afterward, synchrotron radiation was used to confirm that the crystals belonged to the space group C2 with unit cell parameters a = 79.58 Å, b = 56.69 Å, c = 85.14 Å, and β = 106.81°. The diffraction spots in the synchrotron data extended to considerably higher resolution than those obtained with the rotating anode x-ray source. Thus, the intensity data were collected using a Weissenberg camera for macromolecules, which was installed at the beam line BL18B in the Photon Factory (Tsukuba, Japan). The camera radius was set at 430 mm. Diffraction patterns, recorded on an imaging plate (Fuji Film, Japan), were digitized using the reading system for this beam line. The wavelength was fixed at 1.00 Å, and all data collections were carried out at 17°C. To avoid scattering of the diffracted beam by air, helium gas was constantly flowed into a sealed area consisting of the collimator, the crystal, and the detector. The data processing was carried out using the programs DENZO and SCALEPACK. The reflections with F > 1σ(F) were used for structural determinations, and the Rmerge based on the intensity was 0.030. The resolution for refinement was determined to be 1.9 Å. In total, 53,835 observed reflections were merged to 25,648 unique reflections with a completeness of 90.4%.
Structure Analysis-Initial phases for the structure factors were obtained using the molecular replacement method. The wild type and folate complex structure (Protein Data Bank entry 1DYI (15)) was used as a starting model for the molecular search. The model structure has two molecules per asymmetric unit, only one of which, molecule A, was employed as the starting model. The molecular replacement was performed using the program AMoRe, which is implemented in the CCP4 suite system. The resolutions of rotation and translation peaks were found, and the solution of the second molecule was also significant. The (2|Fo| − |Fc|) difference Fourier maps showed continuous electron densities, and the boundary between the enzyme and the solvent regions was clear. Then the crystallographic refinement was performed using the program XPLOR. During the refinement, difference electron density maps and omit maps were checked using the interactive display program XtalView. Finally, the crystallographic R-factor converged to 0.205. Data collection and crystallographic refinement statistics for binary complexes of ANLYF are shown in the supplemental Table SI. In ANLYF, the three-dimensional structures of molecules A and B were almost identical to each other, in contrast to those of the wild type, which have different structures of the M20 loops (15). The r.m.s.d. between molecules A and B of ANLYF was 0.477 Å, whereas the r.m.s.d. between the wild type and ANLYF was 0.999 Å. Atomic coordinates have been deposited in the Protein Data Bank (accession code 2D0K).
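The two statistics quoted above follow standard definitions, which can be illustrated with a short sketch. The code below is not the implementation used by XPLOR or CCP4. It assumes the conventional R-factor, the sum of ||Fo| − |Fc|| divided by the sum of |Fo|, and an r.m.s.d. over atom pairs that have already been superimposed; the function names and toy values are hypothetical.

```python
import math

def r_factor(f_obs, f_calc):
    """Conventional crystallographic R-factor from structure factor amplitudes."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of (x, y, z).

    The coordinate sets are assumed to be superimposed already; no fitting
    is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Toy values for illustration only.
toy_r = r_factor([10.0, 20.0], [9.0, 22.0])
toy_rmsd = rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 1), (1, 0, 1)])
```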
Compressibility Measurements-The partial specific volume, v°, and the adiabatic compressibility, βs°, of ANLYF at infinite dilution were determined by sound velocity and density measurements in 10 mM phosphate buffer, pH 7.0, containing 0.1 mM EDTA and 0.1 mM dithiothreitol at 15°C. The apparatus and experimental procedures were essentially the same as those used previously (34).
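Sound velocity and density are linked to the adiabatic compressibility of a solution by the Newton-Laplace relation, β = 1/(ρu²). The sketch below shows only this first step. The extrapolation to infinite dilution that yields the partial quantities follows the procedure of Ref. 34 and is not reproduced here; the function name and the example values are assumptions.

```python
def adiabatic_compressibility(density_kg_m3: float, sound_velocity_m_s: float) -> float:
    """Adiabatic compressibility of a liquid in Pa^-1 (Newton-Laplace relation)."""
    return 1.0 / (density_kg_m3 * sound_velocity_m_s ** 2)

# Round, water-like example values (not measured data from this study).
beta_example = adiabatic_compressibility(1000.0, 1500.0)
```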
Equilibrium Dissociation Constants-Equilibrium dissociation constants (Kd) of ligands were determined from fluorescence emission spectra of enzyme-ligand complexes with an Aviv ATF 105 spectrofluorometer. The solution contained various concentrations of ligand (folate, DHF, THF, NADPH, and NADP⁺) and 0.2 μM enzyme in MTEN buffer. The emission spectra were scanned from 500 to 300 nm at 25°C, with an excitation wavelength of 290 nm. The Kd values were determined by measurements with various concentrations of ligands and nonlinear least squares fitting of the data.
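The binding model used for the fit is not stated in the text. One common choice for a 1:1 complex, shown here purely as an assumption, is the quadratic binding equation, which accounts for depletion of free ligand when the enzyme concentration is comparable with Kd. All names and numbers below are hypothetical.

```python
import math

def complex_concentration(e_total: float, l_total: float, kd: float) -> float:
    """Concentration of the 1:1 enzyme-ligand complex at equilibrium.

    Solves (E_t - C)(L_t - C) = Kd * C for C; all inputs share one unit.
    """
    b = e_total + l_total + kd
    return (b - math.sqrt(b * b - 4.0 * e_total * l_total)) / 2.0

def fluorescence(l_total, e_total, kd, f_free, f_bound):
    """Fluorescence signal interpolated between free and fully bound enzyme."""
    fraction_bound = complex_concentration(e_total, l_total, kd) / e_total
    return f_free + (f_bound - f_free) * fraction_bound

# Example with 0.2 μM enzyme (as in the text) and assumed values elsewhere.
c_example = complex_concentration(0.2, 1.0, 0.5)
```

A Kd estimate would then be obtained by minimizing the squared difference between `fluorescence(...)` and the measured intensities over the ligand concentrations used.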
Transient Kinetics-Transient binding and pre-steady-state kinetics experiments were carried out on an Aviv 202 SF stopped-flow spectrometer in the fluorescence mode. The dead time of mixing was ~10 ms in the absorbance mode, and this value was used for all the experiments in this study. Ligand association and dissociation rate constants were measured by monitoring either the quenching of the intrinsic enzyme fluorescence above 305 nm with a cut-off filter provided by Aviv instruments, or the enhancement of coenzyme fluorescence above 420 nm by energy transfer with a cut-off filter provided by Hoya (Japan), with an excitation wavelength of 290 nm in MTEN buffer at 25°C. The final enzyme concentrations were 0.5-0.8 μM. Pre-steady-state rates were measured by monitoring coenzyme fluorescence above 420 nm with the cut-off filter, with an excitation wavelength of 290 nm. The enzyme was pre-incubated with saturating concentrations of NADPH (70 μM) (35), and the reaction was initiated by mixing with various concentrations of DHF in MTEN buffer at 25°C. The final enzyme concentration was 1 μM.
Sample Preparation for Folding Measurement-Equilibrium and kinetic folding measurements were carried out at 15°C and pH 7.8 in the presence of 10 mM potassium phosphate, 0.2 mM EDTA, and 1 mM 2-mercaptoethanol. The enzyme solution also contained appropriate concentrations of urea as denaturant, or MTX, depending on the design of the experiments.
Equilibrium Unfolding Transition-Far- and near-UV CD spectra were obtained with an Aviv 62 DS spectropolarimeter with a 30-s averaging time by scanning from 250 to 190 nm for the far-UV region and from 310 to 250 nm for the near-UV region. The path lengths of sample cells were 2.0 and 10.0 mm for the far- and near-UV regions, respectively. Enzyme concentrations were 1 and 55 μM for far- and near-UV CD measurements, respectively. The data obtained from the measurements were converted to mean residue ellipticity using the following equation: mean residue ellipticity = (Θ × 100)/(C × D × NA), where Θ is the ellipticity value in degrees; C is the molar protein concentration; D is the path length of the sample cell in cm; and NA is the number of residues in the enzyme. The equilibrium urea-induced unfolding transition was assessed by monitoring ellipticity at 220 nm with the above spectropolarimeter or by scanning the fluorescence spectra from 500 to 300 nm with an Aviv ATF 105 spectrofluorometer. The sample solutions were incubated for at least 2 h before the data collection. Final enzyme concentrations were 1 μM for CD measurements and 5 μM for fluorescence measurements. The obtained data were analyzed based on the two-state approximation as described previously (27,36).
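The conversion to mean residue ellipticity and a two-state analysis can be sketched as follows. The first function implements the equation given above. The second assumes a linear dependence of the unfolding free energy on urea concentration, which is a common form of the two-state approximation but is not spelled out in the text; the function names, parameter values, and the use of 15°C (288.15 K) are illustrative assumptions.

```python
import math

R_GAS = 1.987e-3  # gas constant in kcal mol^-1 K^-1

def mean_residue_ellipticity(theta_deg, conc_molar, path_cm, n_residues):
    """Mean residue ellipticity = (Θ × 100) / (C × D × NA)."""
    return (theta_deg * 100.0) / (conc_molar * path_cm * n_residues)

def fraction_unfolded(urea_molar, dg_water, m_value, temp_k=288.15):
    """Two-state fraction unfolded, assuming ΔG = ΔG(H2O) − m × [urea].

    dg_water is in kcal mol^-1 and m_value in kcal mol^-1 M^-1.
    """
    dg = dg_water - m_value * urea_molar
    return 1.0 / (1.0 + math.exp(dg / (R_GAS * temp_k)))

# Illustrative inputs: 1 μM enzyme, 2.0-mm (0.2-cm) cell, 159 residues.
mre_example = mean_residue_ellipticity(-0.01, 1e-6, 0.2, 159)
# Midpoint of the transition for assumed ΔG(H2O) = 6 kcal/mol and m = 2.
midpoint_fraction = fraction_unfolded(3.0, 6.0, 2.0)
```

At the transition midpoint, where the urea concentration equals ΔG(H2O)/m, the fraction unfolded is one half by construction.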
Kinetic Folding Measurements-The slow steps in folding were monitored by the time-dependent change in ε293 using the manual mixing method on an Aviv model 14 NT absorbance spectrophotometer. The final enzyme concentration using the manual mixing method was 18 μM. For faster steps, kinetic folding measurements were performed with an Aviv 202 SF stopped-flow spectrometer in the CD, fluorescence, or absorbance mode. The path length of the cell was 1 mm for the CD and absorbance modes. In the CD measurements, the time-dependent ellipticity change was monitored at wavelengths from 245 to 215 nm. In the fluorescence measurements, the time-dependent fluorescence intensity above 320 nm, obtained with a cut-off filter from Aviv instruments, was monitored with an excitation wavelength of 292 nm. Refolding experiments in the presence of MTX were carried out by monitoring the change in absorbance at 380 nm using the above stopped-flow spectrometer in the absorbance mode (28,30). Upon binding to DHFR, the change in ε380 for MTX is −4,942 M⁻¹ cm⁻¹ (30). This wavelength is well outside the absorbance region of the enzyme and allowed for the direct detection of the binding of inhibitor during folding. The final enzyme concentration was typically 10, 0.5, and 25 μM in the CD, fluorescence, and absorbance measurements, respectively.
Sample Preparation for the Incubation in 0.1 M H2O2-The sample solution for the oxidation-resistance test contained appropriate concentrations of wild type or ANLYF and 0.1 M H2O2 in 10 mM potassium phosphate and 0.2 mM EDTA, pH 7.8, for measuring the thermodynamic stability, or in MTEN buffer, pH 7.0, for the enzymatic assay. The resultant solutions were stored at 4°C for 24 h. The activity assay and urea-induced unfolding measurements were carried out as described above.
LC/MS Measurements-LC/MS measurements were performed as described previously (37). Separation of protein was carried out with an acetonitrile gradient from 35 to 55% containing 0.1% trifluoroacetic acid on an L-column ODS (2.1 × 150 mm; Kagaku-hin Kensa Kyoukai, Tokyo) on a Shimadzu LC-10A high pressure liquid chromatography system.
Other Methods-DNA sequencing was performed on an ABI PRISM 310 genetic analyzer. N-terminal amino acid sequences were determined by Edman degradation on a Beckman System Gold LF3000 protein sequencer equipped with an on-line phenylthiohydantoin amino acid analyzer.

Proposal of Novel Picking Up Process in the Sense of Protein Design-
The outline of the QAW process is depicted in Fig. 1. For many proteins, local fitness landscapes are well described by a model of the Mt. Fuji-type landscape proposed by Aita and Husimi (38). They constructed the model landscape based on additivity of the free energy contributed by each residue on a biopolymer, introducing a substitution at each site to functional tolerant residues. The fitness spectrum among a random mutant population around a wild type sequence was theoretically obtained as the probability density distribution function of fitness. Statistical properties of real landscapes for current proteins, including general shapes and the mean slope of a landscape (4, 39-41), are compatible with those calculated using the Mt. Fuji-type landscape model (38,42). Therefore, the mutation effect seems "fuzzy" for any combination of amino acid replacements; the "Mt. Fuji-type" landscape is the first approximation of a real landscape regarded as an "ideal landscape," and the "rough Mt. Fuji-type landscape," which has small random components on its surface, is more accurate and a "real landscape" (38,43). The next major problem in a process of adaptive walking is how we can systematically avoid nonadditive mutations. Our proposed strategy consists of two parts as follows: construction of mutant data bases, and combinatorial searches by taking into account the fuzzy relationship in additivity of multiple mutations (quasi-additivity). Depending on the project, single mutant proteins at each site of the wild type sequence are examined for improvement of the functions such as activity, specificity of the catalytic reaction, conformational stability, thermal stability, and so on. Then a mutant data base at each position can be constructed as shown in Fig. 1. Double mutant proteins are constructed by combining the three (or fewer) most preferred single mutants at positions A i and A j from each single mutant data base (the first quasi-additive walking).
The resulting nine (or fewer) double mutant proteins are tested for improvement of the functions. The three (or fewer) most preferred double mutants are then used for further mutation at the A k position using the top three (or fewer) from the corresponding mutant data base (the second quasi-additive walking). The resulting nine (or fewer) triple mutants are then examined for improvement of the functions. Similarly, combinatorial mutations and functional tests are repeated over all the data bases. The mutant protein finally obtained would be the best one or close to the best in terms of the improvement of the functions for the searched sequence space. The QAW method provides reproducible results; the same mutant protein should be obtained when the same walking strategy and the same mutant data bases are used. It is only necessary to construct 19 × n mutants (for a protein with n amino acid residues) for the mutant data base and 3^2 × (n − 1) for quasi-additive walking. For example, the total number is 19 × 100 + 3 × 3 × 99 = 2,791 for a polypeptide with 100 amino acid residues. These numbers are realistic for experimentally characterizing all the mutant proteins and are much lower than the total number of proteins in the sequence space (20^n). Therefore, the QAW method can be applied to many proteins even if there is no applicable selection system other than the characterization of isolated proteins. To simplify the explanation, only the top three (or fewer) mutations in each data base are considered in the quasi-additive walking; the variable numbers depend on the characteristics of each data base. For example, when the top four mutants have similar values to one another, we would use the top four for the adaptive walking. When only one mutant is available as an active mutant, we would use the top one for the walking.
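The walking procedure and the mutant count described above can be summarized in a short sketch. It is an illustration under stated assumptions rather than the experimental workflow: `measure` stands in for the laboratory characterization of each combined mutant, and the toy data base, its scores, and the interaction term are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Variant = Tuple[Tuple[int, str], ...]  # ((position, residue), ...)
Database = Dict[int, Dict[str, float]]  # position -> residue -> score

def qaw_mutant_count(n_residues: int, top: int = 3) -> int:
    """Mutants to construct: 19 × n for the data base plus top^2 × (n − 1)."""
    return 19 * n_residues + top * top * (n_residues - 1)

def top_residues(database: Database, position: int, top: int = 3) -> List[str]:
    """Best-scoring single replacements at one position (at most `top`)."""
    scores = database[position]
    return sorted(scores, key=scores.get, reverse=True)[:top]

def quasi_additive_walk(database: Database, order: List[int],
                        measure: Callable[[Variant], float],
                        top: int = 3) -> Variant:
    """Combine the top variants with the top residues at each new position."""
    current = [((order[0], r),) for r in top_residues(database, order[0], top)]
    for position in order[1:]:
        candidates = [v + ((position, r),)
                      for v in current
                      for r in top_residues(database, position, top)]
        # Each candidate is "measured"; only the best `top` are carried on.
        candidates.sort(key=measure, reverse=True)
        current = candidates[:top]
    return current[0]

# Toy single-mutant data base with hypothetical relative scores.
demo_db: Database = {1: {"A": 1.2, "P": 1.0, "S": 0.9, "T": 0.5},
                     2: {"F": 2.0, "N": 1.8, "S": 1.5}}

def demo_measure(variant: Variant) -> float:
    """Multiplicative scores plus one non-additive interaction (assumed)."""
    score = 1.0
    for position, residue in variant:
        score *= demo_db[position][residue]
    if dict(variant) == {1: "A", 2: "N"}:
        score *= 1.5
    return score

best = quasi_additive_walk(demo_db, [1, 2], demo_measure)
```

In the toy example the best double mutant is not the combination of the two top single mutants, because the second-ranked residue at position 2 carries a favorable interaction. Carrying several candidates through each step is what allows the walk to find it.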
Creation of Methionine- and Cysteine-free DHFR with High Enzymatic Activity-To demonstrate the effectiveness of the QAW strategy, we used DHFR as a model protein to create a mutant protein free of both methionine and cysteine residues, which has higher activity than the wild type. Fig. 2 shows the mutant data base created for all cysteine and methionine residues of E. coli wild type DHFR. The wild type DHFR contains 2 cysteine and 5 methionine residues: Cys-85, Cys-152, Met-1, Met-16, Met-20, Met-42, and Met-92. Because we have already constructed a fully active Cys-free DHFR (C85A/C152S double mutant DHFR, AS-DHFR) (27), we have explored alternatives for methionine residues in this study even though the mutant data base was created for wild type DHFR. Because the signal for initiation of translation of protein synthesis in vivo is methionine, the amino acid replacement of Met-1 to another amino acid is not as simple as replacing an amino acid by site-specific mutagenesis. In an E. coli expression system, the Met residue resulting from the initiation codon is removed by the enzyme methionyl-aminopeptidase (44). Amino acid replacements of Met-1 → Met-Ala (M1MA), Met-1 → Met-Pro (M1MP), Met-1 → Met-Ser (M1MS), and Met-1 → Met-Thr (M1MT) have been examined in this study because these sequences are expected to form a Met-free N terminus with the gene expression process in E. coli (44). The relative activities of the four mutants in cell-free extracts are shown in Fig. 2, and the characteristics of the purified proteins are shown in Table 1. Among the four mutants, almost all of the protein (more than 98%) had the Met-free N terminus in the M1MA, M1MP, and M1MS mutants as detected by N-terminal sequencing and liquid chromatography/mass spectrometry (LC/MS) measurements, whereas only ~60% of the M1MT mutant had the Met-free N terminus. Based on the activity and conformational stability, M1MA was chosen for further mutations. For the other Met sites, site-specific random mutagenesis was carried out using mixed oligonucleotides containing a substitution of the Met codon (ATG) to NNK in the mutagenic primers. In this way, it was possible to convert each of the Met residues to 19 different amino acids as shown in Fig. 2. Some of the single mutants were not obtained in this study, probably because we used tmpR as a selectable marker to reduce the background resulting from the vector plasmid pUC18 with no inserts. Nevertheless, even using the incomplete data base, we could still obtain a hyperactive methionine- and cysteine-free DHFR.
FIGURE 1. Schematic representation of QAW strategy. A 1 , A 2 , A i , A j , A k , and A n represent the 1st, 2nd, i-th, j-th, k-th, and n-th amino acids of the protein sequence with n amino acids, respectively. The mutant data base at the i-th position (i = 1, 2, i, j, k, and n) represents the effect of amino acid replacement at the corresponding i-th position on a specific function to be improved, such as activity, conformational stability, thermal stability, etc. By looking at each data base, three amino acids (at most) that represent the top three scores in terms of the desired function are considered for further mutations. The term quasi-additive is based on the postulation of a rough (not strict) additivity of the adaptive walking by combinatorial search for better mutant proteins.
As described above, the first walking step was carried out at Met-1 starting from AS-DHFR, and the Ala-1 created by amino acid replacement of Met-1 → Met-Ala (M1MA mutant) was found suitable. The next walking step was carried out at Met-42 and Met-92 by starting with the M1MA mutant, considering that the two residues may weakly interact with each other. From the mutant data base, Phe, Val, and Tyr for the Met-42 position and Leu and Phe for the Met-92 position were chosen for the adaptive walking. Fig. 3A shows relative activities of the six mutants. As seen, strict additivity in terms of activity was not observed, although the combination of the top ones in the single mutants (M42Y and M92F) produced the preferred one in the combined sequence space. In this walking step, we chose the preferred mutant (AYF mutant) because it was three times more active than the wild type DHFR. The next, and final, walking step for methionine- and cysteine-free DHFR was carried out at Met-16 and Met-20 by starting with the AYF mutant. Fig. 3B shows the relative activities of the nine mutants. Again, strict additivity was not observed. In this case, the combination of the top ones in single mutants (M16F and M20L) did not result in the most active mutant in the combined sequence space. The order of activity for mutations based on the wild type sequence was M16F > M16N > M16S (Fig. 2), although the corresponding order in Fig. 3B was M16N (ANLYF) > M16F (AFLYF) > M16S (ASLYF) based on the M20L mutation. Mutations other than the top three amino acid residues at the Met-16 position based on the M20L mutant were also created and resulted in hyperactive methionine- and cysteine-free DHFRs. However, they were less active than ANLYF; for example, the activities relative to the wild type of M16A (AALYF) and M16R (ARFYF) were 379 and 286%, respectively, whereas that of ANLYF was 989%.
Therefore, with respect to the activity in cell-free extracts, the ANLYF seemed to be the best mutant in the limited sequence space that was searched.
The hyperactive sulfur-free DHFRs (ANLYF, AFLYF, and ASLYF) obtained by the QAW strategy were purified to homogeneity, and their activity and conformational stability were studied. The molecular masses of the three mutants agreed with the calculated mass expected if the N-terminal Met derived from the initiation codon had been removed (Table 1). N-terminal sequencing analysis also confirmed the absence of Met at the N terminus. The three mutants showed hyperactivity in terms of kcat (Table 1). In particular, the kcat value for the ANLYF mutant was 7.6 times larger than that of the wild type DHFR at 15°C and pH 7.0. This is in good agreement with the activity of cell-free extracts that were used for characterization in each adaptive walk. Although Km values of the mutants were larger than that of the wild type, kcat/Km values, which reflect the catalytic efficiency of an enzyme, were still greater than that of the wild type. In particular, the kcat/Km value (8.6 μM⁻¹ s⁻¹) of the AFLYF mutant was two times higher than that of the wild type DHFR (4.4 μM⁻¹ s⁻¹). Because our adaptive walk was based solely on the enzymatic activity, the conformational stabilities of the three mutant proteins were decreased by 1-2 kcal mol⁻¹ (Table 1).
Tertiary Structure of ANLYF Determined by X-ray Crystallography-The crystallographic structure of an ANLYF binary complex with folate was solved. The backbone structure was approximately identical to the corresponding structure of wild type (Protein Data Bank code 1DYI) (Figs. 4 and 5). Supplemental Fig. S1 shows the displacements of Cα along the primary sequence. Although relatively large displacements (>1 Å) were found around three major flexible loops, i.e. the M20 loop (residues 9-23), the F-G loop (residues 117-131), and the G-H loop (residues 142-149), and a loop between strands C (residues 59-62) and D (residues 73-75), and helix C (residues 44-50), the rest of the molecule has small displacements of up to 0.5 Å. This small constant displacement of about 0.5 Å resulted from a difference in the mutual positions of two subdomains of DHFR, i.e. the loop subdomain (residues 1-37 and 107-159) and the adenosine binding subdomain (residues 38-106) between wild type and ANLYF (Fig. 4, A-C).
During the catalytic cycle of DHFR, the two subdomains undergo hinge motions by which the active site cleft changes the size of the N-(p-aminobenzoyl)-L-glutamate (pABG) binding pocket (13,16). The axis of this motion is located between strands A (residues 2-8) and E (residues 91-95) (16). In its vicinity, Met-42 and Ile-5 sandwich Ile-94, and Met-92 is located between Ser-3 and Val-40, so that there is little free room in this region of the wild type structure (Fig. 4D). In ANLYF, however, the substitution of Met-42 and Met-92 with the bulkier residues Tyr and Phe may prevent the hinge between the two domains from closing as it does in the wild type (Fig. 4E). This probably accounts for the difference in the mutual position of the two domains, through a change in the angle between them, which would result in the hyperactivity of the mutant.
The pABG moiety of folate and of its derivatives, such as DHF and THF, is bound between helices B (residues 25-35) and C, located near the boundary of the two subdomains. The position of the pABG moiety relative to these helices strongly affects the affinity of the folates for the enzyme because the contacts of pABG with the enzyme are made by two residues in helix B (Leu-28 and Phe-31) and two residues at and near helix C (Ile-50 and Leu-54) (17, 18, 21-23). As shown above, these two helices of ANLYF are more distant from each other than those of the wild type. Indeed, almost all the distances related to the binding of pABG to the enzyme were increased in ANLYF (Table 2). Accordingly, as shown in Fig. 5, A and B, the temperature factor of folate bound to ANLYF, especially that of the pABG moiety, is higher than that of folate bound to the wild type, which is consistent with the increase of the distances between pABG and ANLYF. Furthermore, the number of cavities in the molecule was increased (Fig. 5, C and D). Judging from the surface potential (Fig. 5, E and F), the pABG binding cleft of the wild type is almost filled up by the bound folate, whereas that of ANLYF with folate still has a small space between helices B and C. These results could be related to an increase of the flexibility when the wild type was mutated to yield ANLYF (see under "Discussion").
In contrast, contacts between the pteridine ring and the enzyme (Table 2) and the size of the NADP binding pocket (supplemental Table SII) were almost conserved between the wild type and ANLYF. This is reasonable considering that the binding sites of all the moieties of NADP, except the nicotinamide mononucleotide moiety, are located in the adenosine binding subdomain, whose structure was almost identical between them (Fig. 5, E and F).
Adiabatic Compressibility-Among the various experimental techniques, compressibility is a novel measure of protein flexibility because it is directly linked to the volume fluctuation (45). Although compressibility is a thermodynamic quantity, it sensitively reflects structural characteristics of proteins through the contributions of atomic packing or internal cavities and surface hydration (46). To diagnose the flexibility of ANLYF, we measured the partial specific adiabatic compressibility, βs° (34). Because surface hydration is only slightly affected by mutations, these differences between the variant and the wild type would be mainly attributed to an increase in the number and the volume of cavities. Thus, these volumetric data are compatible with the fact that the structure of ANLYF is more flexible than that of the wild type.
Equilibrium Dissociation Constants-The tryptophan fluorescence of DHFR is markedly quenched upon binding various ligands, such as folate, DHF, THF, NADPH, and NADP⁺. The equilibrium dissociation constants, Kd, of ANLYF for these ligands were determined by monitoring the fluorescence as a function of the ligand concentration at 25 °C (Table 3). The Kd values of ANLYF were larger than those of the wild type, although the extent of the increase differed between the coenzymes and folate and its derivatives; for example, the Kd values of ANLYF for NADP⁺ and DHF were four and seven times larger than those of the wild type, respectively. To obtain further insight into the mechanism of the hypercatalytic activity of ANLYF, the pre-steady-state kinetics of the enzyme reaction were analyzed.
Binding and Pre-steady-state Kinetics-The rates of association and dissociation of various ligands to the enzyme and its binary complexes were measured by following the time-dependent intrinsic fluorescence changes as a function of the ligand concentration. As in the case of the wild type, binary complex formation of ANLYF with each ligand was well described by two exponential reactions: a rapid, ligand concentration-dependent phase followed by a slower phase independent of ligand concentration. The rate of the faster phase (kobs) increases linearly with the ligand concentration and, under pseudo first-order conditions, is described by kobs = kon L + koff, where kon and koff are the association and dissociation rate constants, respectively, and L is the concentration of the ligand. The slower phase has been attributed to the isomerization between two conformers (E1 and E2) as shown in the left half of Scheme 1 (47). The relative amplitude of the faster phase is ~2-fold larger than that of the slower phase. On the other hand, the binding reactions forming ternary complexes consisted of a single exponential with a rapid and ligand concentration-dependent rate constant, described by the same equation shown above. The mechanism of the reaction is well explained by the right half of Scheme 1 (47).
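As an illustration of the pseudo first-order relation kobs = kon L + koff, the following minimal Python sketch extracts kon and koff from a series of observed rates by an ordinary least-squares line. The function name `fit_kon_koff` and the numerical values are hypothetical and are not taken from Table 4; the actual fitting procedure used for the measurements is not specified here.

```python
def fit_kon_koff(ligand_conc, k_obs):
    """Fit k_obs = k_on * L + k_off by ordinary least squares.

    ligand_conc: ligand concentrations (e.g. in uM)
    k_obs: observed rates of the fast phase (s^-1)
    Returns (k_on, k_off): slope and intercept of the fitted line.
    """
    n = len(ligand_conc)
    mean_l = sum(ligand_conc) / n
    mean_k = sum(k_obs) / n
    sxx = sum((l - mean_l) ** 2 for l in ligand_conc)
    sxy = sum((l - mean_l) * (k - mean_k)
              for l, k in zip(ligand_conc, k_obs))
    k_on = sxy / sxx                  # slope: association rate constant
    k_off = mean_k - k_on * mean_l    # intercept: dissociation rate constant
    return k_on, k_off


# Hypothetical, noise-free data generated with k_on = 10 uM^-1 s^-1
# and k_off = 5 s^-1 (illustrative values only).
conc = [1.0, 2.0, 4.0, 8.0]
rates = [10.0 * c + 5.0 for c in conc]
k_on, k_off = fit_kon_koff(conc, rates)
kd_kinetic = k_off / k_on  # kinetic estimate of the dissociation constant
```

The ratio koff/kon gives a kinetic estimate of Kd for a simple one-step binding reaction, which can be compared with the value obtained from equilibrium titration.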
The kon and koff values obtained are shown in Table 4. The kon values of ANLYF were 2-4 times smaller, and its koff values were larger than those of the wild type. In particular, the koff values of ANLYF for THF were much larger than those of the wild type, although those of ANLYF for NADPH were only slightly larger than those of the wild type. This is consistent not only with the Kd values obtained by the equilibrium measurements but also with the fact that the size of the folate binding pocket of ANLYF was larger than that of wild type, although the size of the NADP binding pocket was similar between ANLYF and wild type. The binding reactions of NADPH to the ANLYF·THF binary complex and of NADP⁺ to ANLYF reached equilibrium too fast for the kinetics to be monitored by the stopped-flow method.
SCHEME 1
Other color codes are sequentially following the spectrum from blue to red. The replaced residues (1, 16, 20, 42, 85, 92, and 152) are also labeled. C and D, distributions of the internal cavities, which are calculated using a 1.2 Å probe radius by the definition of Richards (67). The cavities are colored blue. E and F, electric charge distributions on the molecular surface. The positive regions (blue) and the negative regions (red) are highlighted on the surface. A and B were drawn using the program MOLSCRIPT (68), and C-F were drawn using GRASP (60). The Protein Data Bank code used for the representation of the wild type is 1DYI (15).
To confirm that the preferred kinetic pathway of the catalytic reaction of ANLYF parallels that of the wild type, pre-steady-state kinetic measurements of ANLYF were performed by mixing the enzyme with DHF under the NADPH-saturating condition and by measuring the rate of product formation by fluorescence energy transfer (12). At low concentrations of DHF, where the binding of DHF to ANLYF·NADPH is rate-determining, the pre-steady-state rate, kobs, depends linearly on the DHF concentration (slope, 20 μM⁻¹ s⁻¹), whereas it became independent of the DHF concentration at high DHF concentrations (kobs = 140 s⁻¹) at 25 °C and pH 7.0. The kinetic scheme of ANLYF under this condition is shown in Fig. 6, assuming the scheme established for the wild type (12). The scheme strongly suggests that the enhancement of the activity of ANLYF is because of the increase of the dissociation rate constant of THF from the enzyme-THF-NADPH ternary complex.
CD Spectra of Wild Type and ANLYF-Far-UV CD spectra of the wild type and ANLYF in the absence of urea (the native condition) and in the presence of 8 M urea (the unfolded condition) at 15°C and pH 7.8 are shown in Fig. 7. Although the spectrum in the native condition of ANLYF is similar to that of the wild type, detailed spectral features in the native condition are different from those of the wild type. The ellipticity values of ANLYF in the native condition are more intense, from 228 to 237 nm, and less intense, from 219 to 227 nm, than those of the wild type. Similar spectral difference was reported previously in the relation of far-UV CD spectrum of wild type DHFR with that of W74L mutant of DHFR, and was attributed to the lack of exciton coupling in the W74L mutant, which consists of Trp-47 and Trp-74 in the wild type (48). Also in ANLYF, the spectral difference was at least partially due to the difference of the relationship between the positions of Trp-47 and Trp-74 (see below).
As reported previously (49,50), the binding of the enzyme with NADPH induces a structural change large enough to be detected by CD measurements, and this was also the case for ANLYF. Fig. 7 also shows the far-UV CD spectra of the wild type and ANLYF in the presence of 50 μM NADPH under the native condition. Under this condition, both proteins exist essentially as NADPH-bound forms. Fig. 7, inset A, shows the near-UV CD spectra of the wild type and ANLYF in the native condition. The spectrum of ANLYF is similar to that of the wild type but with more negative values from 250 to 270 nm and more positive values around 280 nm. ANLYF has two more aromatic residues (Tyr-42 and Phe-92) than the wild type (Met-42 and Met-92), which would cause these differences in the near-UV CD spectrum.
Equilibrium Unfolding Transition-Urea-induced unfolding transition curves of the wild type and ANLYF were measured by monitoring the ellipticity change at 220 nm as a function of urea concentration at 15 °C and pH 7.8. The thermodynamic parameters (ΔG_H2O and m value) of the unfolding transition were obtained using the two-state approximation by nonlinear least squares fitting and are shown in Table 1.
Evolutional Design of DHFR MAY 12, 2006 • VOLUME 281 • NUMBER 19
Folding Kinetics of ANLYF-The folding of DHFR has been shown to involve a series of intermediates in four parallel channels (28,30). The first intermediates appear within 5 ms with an intense far-UV CD signal and no evidence for specific tertiary contacts (28). The second ones appear during the next 200 ms and are related to the native-like packing at or near Trp-47 and Trp-74 (48). These intermediates fold through four parallel, rate-limiting steps to a corresponding set of native conformers that ultimately relax to the equilibrium distribution.
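The two-state approximation used for the urea-induced equilibrium unfolding transition described above can be sketched as follows. This minimal Python example assumes the common linear extrapolation form, ΔG(urea) = ΔG_H2O − m·[urea], which is not stated explicitly in the text; the function names and the parameter values are hypothetical and are not those of Table 1.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def fraction_unfolded(urea, dg_h2o, m, temp_k=288.15):
    """Fraction of unfolded protein in a two-state N <-> U equilibrium.

    Assumes the linear extrapolation model (an assumption of this sketch):
        dG(urea) = dg_h2o - m * urea
    urea: urea concentration (M)
    dg_h2o: unfolding free energy in the absence of urea (kcal mol^-1)
    m: dependence of dG on urea (kcal mol^-1 M^-1)
    temp_k: temperature in kelvin (288.15 K corresponds to 15 degrees C)
    """
    dg = dg_h2o - m * urea
    k_unf = math.exp(-dg / (R_KCAL * temp_k))  # equilibrium constant [U]/[N]
    return k_unf / (1.0 + k_unf)


def midpoint(dg_h2o, m):
    """Urea concentration at which half of the protein is unfolded."""
    return dg_h2o / m


# Hypothetical parameters for illustration only.
cm = midpoint(6.0, 2.0)
f_native_buffer = fraction_unfolded(0.0, 6.0, 2.0)
f_at_midpoint = fraction_unfolded(cm, 6.0, 2.0)
```

In a nonlinear least squares fit, the observed ellipticity would be modeled as a population-weighted sum of the native and unfolded baselines, with ΔG_H2O and m as the adjustable parameters.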
The refolding and unfolding kinetics were initiated by urea concentration jumps from 4.5 and 0 M to various concentrations of urea, respectively, and were measured by monitoring the time-dependent fluorescence intensity above 320 nm with an excitation wavelength of 292 nm at 15 °C and pH 7.8 using a stopped-flow apparatus. The refolding progress curve was qualitatively identical to that of the wild type; the fluorescence intensity increased within several hundred milliseconds from the initiation of the refolding reaction and then decreased to the native level over a time range of several hundred seconds. The refolding and unfolding kinetics of ANLYF obtained with a stopped-flow apparatus consisted of four exponential terms (see below).
The slow steps of the refolding and unfolding kinetics were measured by monitoring the ε293 change under the same conditions by the manual mixing method (30). The refolding and unfolding kinetics obtained consisted of two exponential terms, and the relaxation time of the slowest phase from the stopped-flow technique was similar to that of the fast phase from the manual mixing method. The refolding kinetics of ANLYF was thus described by a five-exponential function, as was that of the wild type (28,30). The urea concentration dependence of the relaxation times of the refolding and unfolding is shown in Fig. 8. Each relaxation time was similar to that of the wild type, although the chevron was shifted to lower urea concentration, reflecting the lower stability of ANLYF compared with the wild type. In the unfolding kinetics of ANLYF, the τ4 phase of the unfolding, which was absent at lower urea concentrations for the wild type (28), could be detected continuously at urea concentrations ranging from 2 to 7 M and was smoothly connected with the corresponding τ4 phase of the refolding. The existence of four phases in both the refolding and unfolding, smoothly connected with each other, strongly suggests the presence of multiple native conformers under equilibrium conditions, as in the wild type.
The Refolding Kinetics Monitored by MTX Binding-MTX, a tight binding, competitive inhibitor of DHFR, has been used for choosing between parallel and sequential folding pathways in the wild type and mutant DHFR (28). The refolding kinetics of ANLYF were measured at a urea concentration of 1.1 M at 15 °C and pH 7.8 using a stopped-flow apparatus by monitoring the absorbance change at 380 nm, where the difference spectrum between bound and free MTX has a maximum, without any contribution from the enzyme itself. The time course of the refolding involved a lag phase for several hundred milliseconds followed by three phases at the saturating concentrations of MTX (Fig. 9). The slowest τ1 phase could not be observed because of the instability of the stopped-flow apparatus in the long time range. The relaxation times of the three phases agreed well with those of the τ4, τ3, and τ2 phases obtained from the measurements without MTX (Fig. 8). The relative amplitudes were 53, 15, and 32% for the τ4, τ3, and τ2 phases, respectively. The lag phase in the first several hundred milliseconds shows that the intermediate formed in the τ5 phase cannot bind MTX, as was similarly observed for the wild type.
The refolding kinetics in the presence of substoichiometric amounts of MTX resulted in the selective loss of the slow binding phase (Fig. 9) (28). This result strongly supports the parallel folding channels in the refolding of ANLYF because, at substoichiometric levels, the available MTX is bound to the rapidly formed native species, and the MTX-enzyme complexes deplete the supply available to the slower folding species.
The Refolding Kinetics Monitored by CD-The refolding kinetics of ANLYF and the wild type were measured at a urea concentration of 0.4 M at 15 °C and pH 7.8 by monitoring the time-dependent ellipticity change at various wavelengths from 215 to 245 nm using a stopped-flow apparatus. The traces obtained were fitted by a five-exponential function, with the relaxation time of each phase fixed to the value obtained by the fluorescence and absorbance measurements under the same conditions, by the same method as reported previously for the wild type (48). Based on the kinetic parameters obtained, the ellipticity values extrapolated to time 0, θ(0), of the wild type and ANLYF at these wavelengths were calculated and are also plotted in Fig. 7. An ellipticity change within the dead time of the stopped-flow apparatus, which is called a burst phase, was observed in both the wild type and ANLYF. Fig. 7B shows the difference spectrum of the τ5 phase amplitude between the wild type and ANLYF. This spectrum is similar to the difference CD spectrum between the wild type and ANLYF in the native condition.
Effects of Oxidization by an Incubation in H2O2-ANLYF contains neither methionyl nor cysteinyl residues, i.e. it does not contain sulfur in its molecule, which should make ANLYF highly resistant to oxidization. To confirm the chemical resistance of ANLYF to oxidization and to evaluate it in comparison with that of the wild type, the effects of oxidization on the stability and the enzymatic activity were tested by comparing the thermodynamic parameters (ΔG_H2O and m value) and the Michaelis parameters (kcat and Km) (Table 1). The thermodynamic parameters of the wild type could not be obtained because it aggregated after the incubation. The kcat of the wild type decreased by 40%, probably because of chemical modifications of methionyl and/or cysteinyl residues. On the other hand, the ΔG_H2O and kcat of ANLYF were not affected by the incubation, which shows that ANLYF is highly resistant to oxidization. The Km values were unaffected by the oxidization.

DISCUSSION
Protein Design by Adaptive Walking in Searching Sequence Space-To demonstrate the effectiveness of the QAW strategy, we used DHFR as a model protein to create a mutant protein with a diminished amino acid component, i.e. a methionine- and cysteine-free protein, which has higher activity than the wild type. By using a model of the Mt. Fuji-type landscape (38), we have already analyzed and reported the fitness landscape where the fitness was defined as the natural logarithm of the enzymatic activity (51). The fitness of each mutant was partitioned into an additive component and a nonadditive residual by fitting to the model landscape. Based on the previous analysis (52), the number of mutants required for each quasi-additive walking was 2 with a 1% risk of escape of the optimal multiple mutant. This indicates that the top three mutants used in this study are adequate for the quasi-additive walking.
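The partition of fitness into an additive component and a nonadditive residual can be illustrated with a minimal Python sketch. It assumes the simplest additive form, in which the fitness of a multiple mutant is predicted by summing the single-mutant fitness changes relative to the wild type; this is a simplification for illustration and not the fitting to the Mt. Fuji-type model landscape used in the analysis (51,52). All function names and activity values are hypothetical.

```python
import math


def fitness(activity):
    """Fitness defined as the natural logarithm of the enzymatic activity."""
    return math.log(activity)


def additive_prediction(wt_activity, single_activities):
    """Additive fitness of a multiple mutant (simplifying assumption):
    wild type fitness plus the sum of the single-mutant fitness changes."""
    f_wt = fitness(wt_activity)
    return f_wt + sum(fitness(a) - f_wt for a in single_activities)


def nonadditive_residual(observed_activity, wt_activity, single_activities):
    """Observed fitness of the multiple mutant minus its additive prediction."""
    predicted = additive_prediction(wt_activity, single_activities)
    return fitness(observed_activity) - predicted


# Hypothetical relative activities (wild type = 1.0); not measured values.
wt = 1.0
singles = [2.0, 3.0]                                 # two single mutants
res_additive = nonadditive_residual(6.0, wt, singles)   # purely additive case
res_synergy = nonadditive_residual(12.0, wt, singles)   # positive residual
```

A residual of zero corresponds to independent mutational effects; a nonzero residual indicates an interaction between the mutation sites.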
The merit of the QAW strategy is that it greatly reduces the number of mutants to be tested while retaining a high success rate. In our experiments, the number of mutants was only 109, resulting in 9 hyperactive cysteine- and methionine-free DHFRs (all the combined mutations at Met-16 and -20 based on the AYF mutant) (success rate >8%), where the size of the sequence space to be considered was 20⁷ = 1.28 × 10⁹. Based on the properties of all purified single mutant proteins at each Met and Cys site, the number of 7-fold mutants expected to be more active than the wild type in terms of kcat value is estimated to be 1.35 × 10⁶ (= 1 (the number of active mutants at Met-1 with a kcat value similar to or higher than that of the wild type) × 13 (at Met-16) × 7 (at Met-20) × 8 (at Met-42) × 13 (at Cys-85) × 13 (at Met-92) × 11 (at Cys-152)). The corresponding success rate is only about 0.1%. Similarly, the number of 7-fold mutants expected to be more active than the wild type in terms of kcat/Km value is estimated to be 2,340 (success rate <0.0002% on the basis of kcat/Km), and worse, it is impossible to investigate 1.35 × 10⁶ enzymes (on the basis of kcat). These estimations are consistent with the following results. When we attempted, before starting the experiments described in this paper, to obtain a methionine- and cysteine-free DHFR with activity comparable with that of the wild type by creating libraries of 7-fold mutants by random mutagenesis, such active mutant proteins could not be obtained even after testing more than 100 proteins obtained from tmpR colonies. As mentioned above, the QAW strategy should be significantly potent and effective in cases where it is inefficient to construct mutant pools and screen them by directed evolution strategies.
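The arithmetic behind these estimates can be reproduced with a short Python sketch. The per-site counts and totals are those quoted in the text; the variable names are chosen here for illustration.

```python
import math

# Size of the sequence space for seven replaced sites, 20 residues each.
sequence_space = 20 ** 7

# Number of single replacements per site with kcat similar to or higher
# than the wild type, in the order Met-1, Met-16, Met-20, Met-42,
# Cys-85, Met-92, Cys-152 (values quoted in the text).
active_per_site = [1, 13, 7, 8, 13, 13, 11]
n_active_kcat = math.prod(active_per_site)

# Success rates expressed as fractions.
rate_random_kcat = n_active_kcat / sequence_space   # random 7-fold mutants
rate_random_kcat_km = 2340 / sequence_space         # on the basis of kcat/Km
rate_qaw = 9 / 109                                  # QAW: 9 hits in 109 mutants

print(f"sequence space: {sequence_space:.3g}")
print(f"active 7-fold mutants (kcat): {n_active_kcat:.3g}")
print(f"random success rate (kcat): {100 * rate_random_kcat:.2f}%")
print(f"random success rate (kcat/Km): {100 * rate_random_kcat_km:.5f}%")
print(f"QAW success rate: {100 * rate_qaw:.1f}%")
```

The comparison shows that the QAW success rate exceeds the expected random success rate on the basis of kcat by roughly two orders of magnitude.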
If we tried to construct a methionine- and cysteine-free DHFR with higher activity than the wild type based solely on structural information of each amino acid, what would be produced? If one makes rational replacements of the sulfur-containing amino acids by non-sulfur amino acids, the Cys residues would generally be replaced by Ala or Ser, which have approximately the same shape and molecular weight (53,54), and Met residues would be replaced by Ile, Leu, or Lys residues, which have a shape similar to that of Met (55,56). As for the two Cys residues at sites 85 and 152 of DHFR, we previously replaced them by Ala and Ser, respectively, so as to retain kinetic and folding properties similar to those of the wild type DHFR (27). In this study, the Met residues at sites 16, 20, 42, and 92 of DHFR were replaced by Asn, Leu, Tyr, and Phe, respectively. These residues are different from those expected to be chosen by the similarity of the side chain shape. We purified all single mutants at each Cys and Met site and characterized their activities. At site 16 8 s⁻¹)). In consideration of the effects of a single mutation and the smoothness of the landscape in the DHFR case, it may be hard to anticipate a mutant more hyperactive than ANLYF from the combination of less effective mutations. Moreover, if one were to construct mutant proteins with multiple replacements using the rationally idealized amino acids at least at sites 16, 20, 42, and 92, the total number of such mutants would be 81 ((M16I, M16K, and M16L) × (M20I, M20K, and M20L) × (M42I, M42K, and M42L) × (M92I, M92K, and M92L)), comparable with the number in this work (109 mutants). Therefore, our QAW method proves to be effective and useful.
Meaning of Free of Methionine and Cysteine Residues-Among the 20 natural amino acids, only methionine and cysteine residues contain a sulfur atom in the side chain. Because of the high reactivity of the sulfur of thiomethyl and sulfhydryl groups, proteins are easily oxidized and show molecular heterogeneity because of the formation of inter- and intramolecular disulfide linkages and changes in isoelectric points caused by the formation of methionine sulfoxide. Therefore, sulfur-free proteins should be resistant to oxidative damage and could be used in highly oxidizing environments such as those found in pollution treatment facilities. Additionally, the availability of a sulfur-free protein allows one to selectively introduce a cysteine or methionine residue at specific positions for the purposes of immobilization, modification, and fragmentation (57-59). Because almost all naturally occurring proteins contain cysteine and methionine residues, it would be quite useful to develop a reliable strategy to create a sulfur-free protein that is as active as the wild type.
Dynamic Behavior for the Hyperactivity of ANLYF-For catalytic activity, both the static structure and the dynamic behavior are important. From the viewpoint of the dynamic behavior of the overall molecule, ANLYF showed more flexibility than the wild type in terms of temperature factors and cavities. ANLYF shows a higher temperature factor and has more cavities spread over the molecule than the wild type (Fig. 5, A-D). When a probe radius of 1.2 Å was used in the GRASP program (60), the total volume and number of cavities were calculated to be 81.5 Å³ and 5 for ANLYF, and 29.3 Å³ and 2 for the wild type, respectively. These cavities would increase the room for dynamic motions of side chains and even the backbone, making the molecule flexible, which would be related to the higher temperature factor of ANLYF.
Raising the catalytic activity of an enzyme is most simply attained by accelerating the rate-limiting step of the catalytic cycle of the enzyme. In the case of DHFR, the rate-limiting step is the release of THF from the enzyme·NADPH·THF ternary complex (Fig. 6), and the rate of this reaction, namely the koff of ANLYF, is 60 s⁻¹, five times larger than that of the wild type (12.5 s⁻¹). The increase of the flexibility would increase the chance of the dissociation of THF, which would lead to more efficient turnover. Previous studies of DHFR showed that the catalytic efficiency is increased by the increase of the flexibility (34,61). To examine this with ANLYF, we plotted the kcat values of ANLYF and the various mutants at sites 67, 121, and 145 relative to that of the wild type against their βs° values in Fig. 10 (34). Evidently, the turnover rate of these mutants increases with increasing βs° values, and ANLYF shows the highest catalytic efficiency, as expected from its large βs° value. Thus, it is concluded that ANLYF is more flexible than the wild type, which is caused by the increase of the cavities spread more widely over the molecule, and that the resulting flexibility of ANLYF would make the catalysis more efficient.
Structural Description for the Hyperactivity of ANLYF-The increase of the koff value of THF from the THF·NADPH·enzyme ternary complex played a significant role in the enhancement of the catalytic activity. The result seems to be attributable to the pABG binding cleft of ANLYF being wider than that of the wild type. The pABG moiety is held by two "walls," of which one side is composed of Leu-28 and Phe-31 in the loop subdomain, and the other side is composed of Ile-50 and Leu-54 in the adenosine binding subdomain, through van der Waals contacts critically important for binding to the substrate (17, 18, 21-23, 62). In addition, the temperature factor of folate was shown to be much higher in ANLYF than in the wild type. This suggests that the binding of folate was loose, probably because of the increase of the bond length between ANLYF and the pABG moiety, although it was located at a stable position (see Table 2). These structural factors suggest that the increase of the koff value of THF from THF·NADPH·enzyme for ANLYF is because of its wider pABG binding cleft, mainly brought about by the replacements of two methionyl residues, Met-42 and -92, with bulkier tyrosyl and phenylalanyl residues, respectively (Fig. 4, D and E). Most likely these two replaced residues in the axis of the hinge sterically hinder the larger hinge motion of the two domains, as if the side chains of these two residues were props against the motion. For the methionyl residues in DHFR, the replacements with the highest improvement in the single mutant data base were identical to the replacements selected in ANLYF for Met-1, -20, -42, and -92, that is M1A, M20L, M42Y, and M92F. However, for Met-16, the M16N mutation showed the second highest improvement, not the highest one, in the single mutant data base. The replacement by the phenylalanyl residue improved the activity in the single mutant data base.
It is difficult to explain the inverted order of the activity enhancement at residue 16 in the multiple mutants by elucidating the effects of each mutation on the catalytic activity independently, i.e. in a simple additive manner. Met-16 is located in the flexible Met-20 loop in the vicinity of another methionyl residue, Met-20. Because Met-16 and -20 can contact each other in the catalytic cycle, these two residues would be considered as a pair. In general, simple mutational additivity is attained where each mutational effect is independent from each other, and such independency is not always guaranteed under the conditions where some interactions can take place between the two mutation sites. The Met-16 and Met-20 contact in the catalytic cycle would therefore affect the simple additivity, and the inverted result was observed. It should be emphasized that the QAW method is able to reach more improved proteins than that obtained by using the replacements with the highest improvement in the single mutants, where the simple additive condition is not applicable.
Structural Formation of ANLYF-Previous studies using DHFR as a model protein indicated that the information for the ability to fold properly, namely foldability, is held in a complete set of the folding elements and that the recognition of the folding elements occurs in the early stages of the folding (50). The replacements of the side chains can affect both the foldability and the structural formation process itself (63-65). ANLYF is a 7-fold mutant with all the cysteinyl and methionyl residues of the wild type replaced by other amino acid residues, two of which, namely Met-42 and Met-92, are located in the assigned folding elements (50, 63-65). Each of the M42Y and M92F mutations results in only a small perturbation of the respective folding elements, as observed by the small difference in the burst phase spectrum between the wild type and ANLYF (Fig. 7), and therefore, ANLYF is fully foldable.
Trp-47 is in contact with Trp-74, forming the exciton coupling shown by the specific far-UV CD pattern (48). The slight spectral difference between ANLYF and the wild type in the native condition is most likely caused by either or both of these replacements (Fig. 7, inset B). In fact, the comparison of the structure around Trp-47 and -74 showed that the relative distance and angle were different between ANLYF and the wild type (data not shown). The exciton coupling formation is fully completed within the first several hundred milliseconds (τ5 phase) (28,48). The spectral difference between the wild type and ANLYF in the native condition was similar to the difference spectrum of the τ5 phase between the wild type and ANLYF. The τ5 phase of ANLYF thus includes the process of the exciton coupling formation, and the formation of the native-like structure around Trp-47 and -74 occurs at this folding stage, as in the case of the wild type. The later stages of the folding kinetics of ANLYF were qualitatively the same as those of the wild type, as shown in Figs. 7 and 9; namely, the folding mechanism was essentially conserved in ANLYF. This suggests that an improvement of enzyme activity by amino acid replacements can be attained without a drastic change of the folding mechanism.