Improved Catalytic Efficiency and Active Site Modification of 1,4-β-D-Glucan Glucohydrolase A from Thermotoga neapolitana by Directed Evolution*

Thermotoga neapolitana 1,4-β-D-glucan glucohydrolase A preferentially hydrolyzes cello-oligomers, such as cellotetraose, releasing single glucose moieties from the reducing end of the cello-oligosaccharide chain. Using the directed evolution techniques of error-prone PCR and mutant library screening, a variant glucan glucohydrolase has been isolated that hydrolyzes the disaccharide, cellobiose, at a 31% greater rate than its wild type (WT) predecessor. The mutant library, expressed in Escherichia coli, was screened at 85°C for increased hydrolysis of cellobiose, a native substrate rather than a chromogenic analog, using a continuous, thermostable coupled enzyme assay. The Vmax for the mutant was 108 ± 3 units mg⁻¹, whereas that of the WT was 75 ± 2 units mg⁻¹. The Km for both proteins was nearly the same. The kcat for the new enzyme increased by 31% and its catalytic efficiency (kcat/Km) for cellobiose also rose by 31% as compared with the parent. Nucleotide sequencing of two positive clones and two null clones identified 11 single base shifts. The nucleotide transition in the most active clone caused an isoleucine to threonine amino acid substitution at position 170.

The multistep transformation of cellulosic biomass from the biopolymer to its glucose components is mediated in nature by the enzymatic hydrolysis of the polymer and its glucan intermediates. The potential of this microbial bioprocessing for production of oxychemicals and transportation fuels has intensified the biotechnological focus on and basic research into the enzymology of cellulases (1-6). The large scale research initiatives underway to significantly improve biomass conversion yields (7) are supported by previous work that developed mutant organisms with enhanced activity against cellulosic substrates (8) and harnessed β-glucosidases, or β-glycosidases, for industrial application (9). Thermotoga, a Eubacterial genus diverging near the root of the Bacteria domain, evolved or received through lateral gene transfer (10,11) a unique set of glycosyl hydrolases to degrade the variety of polymeric glucans found in its environment (12). In Thermotoga neapolitana, a Gram-negative, fermentative hyperthermophile (13), 1,4-β-D-glucan glucohydrolase (GghA) (EC 3.2.1.74) is one of three enzymes comprising the cello-oligosaccharide utilization pathway (14,15). With optimal activity at 85°C and low affinity for cellobiose (its preferred substrate is cellotetraose), GghA is an excellent target for a directed evolution effort. "Redesigned" as a thermostable cellobiase with increased catalytic efficiency (kcat/Km) for cellobiose hydrolysis, GghA could play a significant role in a high temperature model cellulase system.
Directed molecular evolution, a widely used, dynamic tool for protein engineering (16,17), draws its power from iterative cycles of random nucleotide-sequence mutagenesis, colony expression, and screening of mutants (18,19). Its feasibility depends on a dynamic screening assay to identify enhanced mutant clones against a background of (tens of) thousands of normal and null clones. To screen for an evolved β-glucosidase (GghA) with increased activity toward a native substrate, cellobiose, we developed a thermostable (85°C) coupled enzyme assay with glucokinase and glucose-6-phosphate dehydrogenase from Thermotoga maritima (20). Although β-glucosidase activity is rapidly and routinely assayed with chromogenic substrates such as various p-nitrophenyl-β-D-glycosides, using the pNP-β-D-glucoside (pNPG) analog in determining increased hydrolysis of cellobiose by GghA mutants might have resulted in evolving an "improved" enzyme that lacked a useful industrial application. As pNPG has both glycone and aglycone moieties, greater mutant enzyme activity might indicate that an amino acid substitution had occurred in a residue responsible for aglycone recognition or binding. Such substitutions might have a wholly different effect on cellobiose, which lacks the steric and electrostatic characteristics of the aglycone moiety.
The first step in the directed evolution of GghA, expressed in Escherichia coli, has produced an enzyme with β-glucosidase-like character. The GghA mutant, based on a single amino acid substitution six positions from the catalytic residues conserved in all family 1 β-glucosidases, has a Vmax for cellobiose 31% higher than its wild type parent and shows a 31% increase in catalytic efficiency when assayed at 85°C, the temperature optimum of the wild type enzyme. In the absence of a published crystal structure for WT GghA, a three-dimensional molecular model, based on sequence homology to Paenibacillus polymyxa (21), was derived for both wild type (WT) and mutant GghA proteins. Our analysis of the parent and mutant apoenzymes was facilitated by the breadth of structural data available for family 1 glycosyl hydrolases (22): the conservation of the eight-stranded TIM barrel-fold (23); the known residues involved in positioning and substrate recognition (24-26); and the mechanism of hydrolysis by general acid catalysis and the condition of specific residues in the active site microenvironment (27,28). Comparing the enhanced GghA apoenzyme to known family 1 structures enabled us to suggest how its single amino acid substitution produces the observed functional improvement.

EXPERIMENTAL PROCEDURES
pTNGghA1 Expression System-Isolation, purification, and kinetic characterization of WT T. neapolitana GghA from E. coli have been described previously (15,20). Plasmid pTNGghA1, constructed from a pET-24d vector (Novagen, Madison, WI), contains the gghA gene under the control of an inducible T7lac promoter and includes a His6 sequence tag at the C terminus of the gene. Plasmids were harvested with the QIAfilter plasmid purification kit (Qiagen, Valencia, CA) and separated by agarose gel electrophoresis. DNA was run on the gel with 1 kb and supercoiled DNA ladders (Invitrogen, Carlsbad, CA) to provide concentration estimates. The WT library was grown from a single colony in 96-well microtiter plates with each well containing Luria-Bertani (LB) broth (200 µl) with kanamycin selection (30 µg/ml).
Vector Preparation-A pET-24d vector was constructed with a 900-bp spacer sequence flanked with HindIII restriction sites at the 5′ and 3′ ends inserted into the multiple cloning site of the vector. The construct was transformed into E. coli DH5α competent cells, plated on LB agar with 30 µg/ml kanamycin selection. A single colony from the plate, grown overnight (37°C), was used to inoculate 50 ml of LB broth (as above) (500 ml flask, shaking, 37°C). One ml of the overnight broth culture was removed for glycerol stocks; from the remainder, the pET-spacer plasmid was harvested using a QIAfilter plasmid purification kit. As needed, the pET-spacer plasmid was double digested with NheI and XhoI (New England Biolabs, Beverly, MA). The linear vector DNA was gel purified, cleaned up with the GeneClean Spin kit (Qbiogene, Bio 101 Systems, Carlsbad, CA), and concentrated by SpeedVac vacuum centrifugation (Savant, Farmingdale, NY).
Random Mutagenesis-Primers, 5′-GAAGGAGATATACCATGGCTAGCGTGAAAAAG and 3′-GTGGTGCTCGAGTCCGTTGTTTTTG, were designed for error-prone PCR amplification (18,19,29,30) of the gghA gene (1335 bp) in pTNGghA1 to include regions of the vector upstream and downstream of the original NheI-XhoI gghA insertion. The increase in the gghA amplicon to 1378 bases allows for the random mutagenesis of the gghA open reading frame and for the re-ligation of the subsequent mutant amplicon library back into the same plasmid with the His6 sequence tag in-frame. A 500-µl error-prone PCR mixture of mutagenic buffer (7 mM MgCl2, 50 mM KCl, 10 mM Tris-HCl (pH 8.5), 0.01% gelatin (w/v)), an asymmetrical mixture of dNTPs (0.2 mM dATP, 0.2 mM dGTP, 1.0 mM dCTP, and 1.0 mM dTTP), 250 pmol of each primer, 500 ng of pTNGghA1 template DNA, and a range of MnCl2 (0.075-0.15 mM final concentration) was aliquoted into ten 50-µl reactions. After denaturation for 3 min at 96°C, all PCR mixtures were amplified for 20 cycles: 6 s at 94°C, 30 s at 61°C, and 1.45 min at 72°C with 0.7 units of recombinant Taq polymerase per 50-µl reaction (31). To determine the appropriate mutation frequency, 3 parallel 500-µl reactions were set up with different MnCl2 concentrations and amplified, producing 3 libraries of mutants.
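As a quick check on the primer design, the recognition sites of the two cloning enzymes can be located in the primer sequences. The following is an illustrative sketch only: the helper name `find_sites` is hypothetical, the primers are taken as printed above (with the line-break hyphen in the first primer removed), and GCTAGC (NheI) and CTCGAG (XhoI) are the standard recognition sequences for these enzymes.

```python
# Locate restriction-enzyme recognition sites in the mutagenic PCR primers.
# Sequences are copied from the text; the helper itself is hypothetical.
RECOGNITION_SITES = {"NheI": "GCTAGC", "XhoI": "CTCGAG"}

FORWARD_PRIMER = "GAAGGAGATATACCATGGCTAGCGTGAAAAAG"
REVERSE_PRIMER = "GTGGTGCTCGAGTCCGTTGTTTTTG"


def find_sites(sequence, sites=RECOGNITION_SITES):
    """Return {enzyme: [0-based start positions]} for every site found."""
    seq = sequence.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    print("forward primer:", find_sites(FORWARD_PRIMER))
    print("reverse primer:", find_sites(REVERSE_PRIMER))
```

Each primer should carry exactly one of the two sites, which is what allows the mutant amplicon to be re-ligated directionally into the NheI/XhoI-digested vector.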
Ligation and Transformation-PCR products were cleaned up with the GeneClean spin kit, quantified by agarose gel electrophoresis, and double digested with NheI and XhoI. Amplicon digests were also cleaned up, then concentrated (as above). Variant library sequences were ligated into the linear vector (described above) using rapid DNA ligation (Roche Applied Science) and transformed, using standard protocols (32), into commercially available E. coli non-expression host strains with high transformation efficiencies. Each transformant library was plated out as follows: 75 µl in one plate and the remaining volume (~225 µl) of the transformation reaction in a second plate. Transformed cells were incubated overnight at 37°C. Transformation efficiencies were determined by comparing colony numbers on a 75-µl mutant library plate and on a 75-µl negative control plate (no insert). To harvest the library, 1.5 ml of LB was added to each mutant plate. Colonies were gently removed with a cell scraper, collected, and combined. Plasmid preparations were made from each mutant library. Ten nanograms of mutant library plasmid DNA was used to transform chemically competent E. coli strain BL21(DE3) cells (Invitrogen) following standard protocols. A portion (140 µl) of each BL21(DE3) transformation was plated out on 22-cm² bioassay trays (Genetix, New Milton, UK) with the addition of 250 µl of SOC (rich broth medium) to aid in spreading. Clones were incubated overnight at 37°C.
Harvesting Mutant Clones-Microtiter plates were prepared with 200 µl of LB including kanamycin selection (30 µg/ml) per well. Transformants of the mutant libraries were picked from the bioassay trays using a Genetix Qpix robotic colony picker. The picking head inoculated 96 wells of a microtiter plate with 96 individual clones. Clones in 96-well plates were grown overnight at 28°C, shaken on a Pulsating Vortexer (Glas-Col, Terre Haute, IN). In fresh 96-well plates, individual wells were inoculated with 2.5 µl from each unique well of the overnight culture. During transfer from overnight growth to expression plates, two wells on each new plate were left un-inoculated and were designated as reference wells. One of the reference wells was then inoculated with WT culture; the other was left un-inoculated as a blank. Microtiter cultures were allowed to grow (shaking, 37°C) to an A of 0.9-1.0 at 595 nm. Absorbances, adjusted to a cell-free reference well, were measured on a Sunrise Absorbance Reader (Tecan USA, Research Triangle Park, NC). Protein expression was induced by the addition of isopropyl-β-D-thiogalactopyranoside (1 mM per well), followed by 4 additional hours of growth. Heat-labile host proteins were denatured by heating plates at 75°C for 20 min. After cooling to room temperature, cells were lysed by the addition of lysozyme (with shaking, 20 min). Disintegrated cells from each unique well were transferred to a 96-well MultiScreen-BV 1.2-µm filter plate atop a vacuum manifold (Millipore, Bedford, MA). Filtrates of heat-treated, partially purified protein were collected in fresh plates by applying a vacuum (18 p.s.i.) sustained by a pump, and were stored at 4°C until assayed. The protein concentrations of the filtrates were estimated by the Bradford dye binding method (Bio-Rad) with bovine serum albumin as the standard. Control experiments using gel loading dye indicated the revolutions/min at which there was no well to well contamination during shaking.
Screening for Increased GghA Activity-Wild type and mutant clones were screened for activity in 96-well plates at 85°C using a coupled enzyme assay derived from T. maritima (20). The coupled assay, linking the formation of NADPH (measured at 340 nm) to the hydrolysis of cellobiose by GghA or GghA mutants (20), was scaled down to a 200-µl reaction. Thermostable glucokinase (1 unit ml⁻¹) and glucose-6-phosphate dehydrogenase (0.4 units ml⁻¹) in triethanolamine (TEA) buffer (80 mM TEA, 4 mM MgCl2, pH 7.8) were aliquoted into assay plate wells; the cell filtrate of individual clones (15 µl) was transferred column-wise to the assay plate, bringing the volume in each well to 90 µl. Filtrate volume in the assay was arrived at empirically. Plates were then placed on a heatblock designed for microtiter plates and heated (85°C, 3 min). Prior to use, cellobiose was dissolved in phosphate citrate buffer (pH 5.2) (for the 1-ml cuvette assay for WT and mutant GghA, the cellobiose was prepared in 50 mM NaAcO buffer, pH 5.2), and incubated with glucose oxidase. Because of the sensitivity of the coupled assay, all cellobiose had to be pretreated with glucose oxidase to remove glucose impurities (20). The pH of the cellobiose preparation was then adjusted with TEA base to pH 7.3. To start the reaction, 90 µl of pre-heated (85°C) substrate/cofactor mixture (cellobiose, 20 mM; ATP, 3.5 mM; and NADP, 1 mM) in TEA buffer was added with a multichannel pipettor to the hot assay plate on the heatblock. To prevent condensation on the lid of the plate during preheating and the assay, a second heat block (85°C) was inverted and placed on top of the assay plate lid. The reaction was stopped after 3 min by the addition of 25 µl of ice-cold stop buffer (50% assay buffer, 50% ethanol (95%)). In a control experiment, the stop buffer showed no distortion of absorbance at 340 nm. After the addition of the stop buffer, plates were allowed to cool (1 min).
Changes in absorbance (A340) because of differential NADPH concentrations were measured against an enzyme-free reference well. To ensure that increases in activity were based on random mutagenesis effects, and not on media evaporation/condensation or well to well cross-talk, potential positive mutants were re-grown and re-screened in a range of column/row arrangements; un-inoculated columns separated the WT wells from potential positives. Re-screened positive mutants were streaked on LB kanamycin plates, and single colonies were grown overnight (3 ml of LB, as above) for glycerol stocks and plasmid purification.
Sequencing and Analysis-The nucleotide sequences of recombinant gghA and gghA mutants produced by random mutagenesis were determined using fluorescent dye terminator sequencing chemistry on an ABI 3100 capillary sequencer (Applied Biosystems, Foster City, CA). Nucleotide sequence alignment and analysis was performed using Vector NTI 7.1 (Informax Inc., North Bethesda, MD). Primary structure comparisons were made using WU-BLAST (33) and MView (34). Secondary structure estimations were provided by PredictProtein from the EMBL.
Purification of Wild Type and Mutant Glucan Glucohydrolase-The growth, expression, and purification of recombinant, wild type GghA from Thermotoga sp. has been described previously (15,20,35). The same protocols were followed for GghA mutants. Briefly, a loopful of glycerol stock was inoculated into 3 ml of LB containing 30 µg/ml kanamycin and grown to an A of 0.5 at 595 nm. The 3-ml culture was used to inoculate 500 ml of LB medium containing 30 µg/ml kanamycin and allowed to grow to an A of 0.9-1.0 (Fernbach flask, shaking, 37°C); protein expression was induced with the addition of isopropyl-β-D-thiogalactopyranoside (1 mM final concentration), and the culture was grown for an additional 4 h. Cells were pelleted by centrifugation (6,000 × g, 4°C, 15 min) and resuspended in 50 mM Tris-HCl (pH 7.5). Whole cells were lysed with lysozyme (4 mg ml⁻¹ of resuspension buffer) and shaken (room temperature, 15 min); cell lysates were heat treated (75°C, 15 min) to denature heat-labile proteins, then sonicated on ice (4 bursts, 45 s). Supernatants were recovered after centrifugation (15,000 × g, 30 min, 4°C), equilibrated with 300 mM NaCl, 50 mM NaH2PO4, and 10 mM imidazole, and loaded onto a Hi-Trap chelating column (Amersham Biosciences). Bound protein was eluted by a linear imidazole gradient (10-500 mM imidazole). Peak fractions were assayed for activity and analyzed by SDS-PAGE (36) stained with Coomassie Blue.
Concentrations of pure protein were determined by measuring absorbance at 280 nm (εGghA = 11.9 × 10⁴ M⁻¹ cm⁻¹) (37).
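The absorbance-based concentration estimate follows the Beer-Lambert relation A = ε·c·l. The sketch below is a minimal illustration, not the authors' procedure: the function names are hypothetical, the 1-cm path length is an assumed default, and the molecular mass of 52.5 kDa is an assumption taken as the midpoint of the 52-53 kDa SDS-PAGE estimate reported in the results.

```python
# Beer-Lambert estimate of GghA concentration from absorbance at 280 nm.
EPSILON_GGHA = 11.9e4   # M^-1 cm^-1, extinction coefficient quoted in the text
ASSUMED_MW = 52_500.0   # g/mol, ASSUMPTION: midpoint of the 52-53 kDa estimate


def molar_concentration(a280, path_cm=1.0, epsilon=EPSILON_GGHA):
    """Protein concentration in mol/l from A = epsilon * c * l."""
    return a280 / (epsilon * path_cm)


def mass_concentration(a280, path_cm=1.0, epsilon=EPSILON_GGHA, mw=ASSUMED_MW):
    """Protein concentration in mg/ml (numerically equal to g/l)."""
    return molar_concentration(a280, path_cm, epsilon) * mw


if __name__ == "__main__":
    a280 = 0.595  # hypothetical reading, chosen only for illustration
    print(f"{molar_concentration(a280) * 1e6:.2f} uM")
    print(f"{mass_concentration(a280):.4f} mg/ml")
```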
Enzyme Assays-Assays of purified wild type and mutant GghA were performed at 85°C, the temperature optimum previously established for the wild type (15), using a Shimadzu UV-265 spectrophotometer. Assay temperature was maintained in the cuvette by a thermostatically controlled circulating water bath connected to the cuvette holder and monitored with a traceable temperature probe (VWR, West Chester, PA) placed in the cuvette. Cellobiose from 1.5 to 200 mM (pretreated as described above), 3.5 mM ATP, and 1 mM NADP in TEA assay buffer (as described above) were added to the cuvette and allowed to reach 85°C (2 min). The purified enzymes, GghA (0.2 unit), glucokinase (1 unit ml⁻¹), and glucose-6-phosphate dehydrogenase (0.4 unit ml⁻¹), were injected together into the cuvette and immediately mixed by pipetting. Rates, measured by continuous coupled enzyme assay monitoring formation of NADPH at 340 nm, were determined from the slope of the linear segment of the time course curve observed at the start of the reaction (5-15 s). For the substrate, cellobiose, 1 unit of GghA activity corresponds to 2 µmol of glucose produced per minute at 85°C. All coupled assays were done in triplicate. Qualitative arylglycoside activity was evaluated in 96-well plate assays using pNPG at 5 mM final concentration in a 5-min assay at 85°C (38). Assays were performed in 100 mM phosphate-citrate buffer (pH 6.8) using 2.5 µl of cell filtrate.
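The conversion from an initial A340 slope to GghA units can be sketched as follows. This is an illustration under stated assumptions rather than the authors' calculation: the function name is hypothetical; the NADPH extinction coefficient of 6220 M⁻¹ cm⁻¹ is the commonly used textbook value and is not given in the text (its value at 85°C may differ); and the coupling enzymes are assumed to convert each glucose released into one NADPH without limiting the rate.

```python
# Convert an initial A340 slope from the coupled assay into GghA units.
NADPH_EPSILON_340 = 6220.0  # M^-1 cm^-1, ASSUMPTION: standard textbook value


def ggha_units(slope_per_min, volume_ml=1.0, path_cm=1.0,
               epsilon=NADPH_EPSILON_340):
    """Units of GghA from the A340 slope (absorbance units per minute).

    One unit is defined in the text as 2 umol of glucose produced per
    minute from cellobiose at 85 C.
    """
    # Rate of NADPH formation in mol l^-1 min^-1 (Beer-Lambert).
    molar_rate = slope_per_min / (epsilon * path_cm)
    # umol of NADPH, taken as umol of glucose, formed per minute in the cuvette.
    umol_glucose_per_min = molar_rate * 1e6 * (volume_ml / 1000.0)
    return umol_glucose_per_min / 2.0


if __name__ == "__main__":
    # Hypothetical slope of 1.244 absorbance units per minute in a 1-ml cuvette.
    print(f"{ggha_units(1.244):.3f} units")
```

Dividing the result by the milligrams of GghA in the cuvette would give a specific activity in units mg⁻¹, the quantity reported for Vmax.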
Kinetic Comparison of Wild Type and Mutant GghA-Kinetic parameters for wild type and mutant GghA were determined by measuring enzyme activity over a range of cellobiose concentrations (see above), using the coupled enzyme assay. Standard error was calculated and ranged from 2.5 to 9%. The Km and Vmax were derived using the non-linear regression program of SigmaPlot (SPSS, Chicago, IL) based on the Michaelis-Menten model equation; the resulting curves represent best-fit values for the data. The proteins were assayed side by side under identical conditions.
Molecular Modeling and Analysis-Structural models of WT and mutant GghA proteins were based on the sequence homology of the protein to the reported crystal structure of native β-glucosidase from Bacillus (now Paenibacillus) polymyxa from the Protein Data Bank. A protein-protein BLAST (39) comparison showed 46% sequence identity between T. neapolitana GghA and the P. polymyxa enzyme (Protein Data Bank code 1BGA). Structural models were constructed using Modeler version 1.5 version 1 (40) on a Silicon Graphics model O2 computer running IRIX 6.4. The model structures were analyzed using Insight 2000 (Accelrys, San Diego, CA); their stereochemical qualities were assessed using the Biotech Validation Suite for Protein Structures. The application suite provides the following checks: PROCHECK V3.5, PROVE V2.3 (41), and WHAT IF V4.99. Ramachandran plot data derived in PROCHECK (42) provided verification of the acceptability of the models. The physicochemical environment of both WT Ile-170 and mutant Thr-170 within the molecule with regard to the neighboring active site cleft was determined. Initially, both protein models were run with H-STRIP (software application authored by P. C. Kahn); solvent-accessible surface areas and volumes were then calculated by the method of Lee and Richards (43) using the programs ACCESS and VOLUME.

RESULTS
Directed Evolution: Mutagenesis and Screening-Prior to the first round of mutagenesis, a WT gghA library (Fig. 1A) was constructed in 96-well microtiter plates from E. coli BL21(DE3) glycerol stocks containing the pTNGghA1 plasmid. The library was used to establish standard conditions for culture growth, gghA expression, and screening in 96-well plates. In Fig. 1A, the shape of the graph is characteristic of WT profiles in other directed evolution experiments (44). Cell filtrate preparation and assay conditions, including filtrate clarity/turbidity, well loading with electronic pipetting, and maintenance of reagent and enzyme temperatures during preheating and assaying, were standardized to be straightforward and reproducible.
FIG. 1 (legend fragment). ... produces a differential mutation frequency. The percentage of null clones per total clones in each library is a useful indicator of the mutation frequency. In A and B, pNPG absorbances were normalized by concentration of filtrate per well. Activity data were sorted in ascending order. Assays were performed at 85°C.
Three first generation gghA libraries were constructed to determine which mutagenic PCR conditions would produce the desired mutation frequency (number of base shifts per gene). For screening a small library (≤2000 clones), the desired mutation frequency for a gene the length of gghA (1335 bases) was 0.2-0.3%. The mutation frequency can be estimated from the percent of null clones that arise in a mutant library (44,46), with 30-40% null clones corresponding to a 0.2-0.3% mutation frequency. Fig. 1B shows the activity profile for three mutant clone libraries (192 clones each); the graph shows the data near zero activity to emphasize the differential percentages of each library. The libraries were designated IC, ID, and IE. With ~35% null mutants, library IE (MnCl2, 0.075 mM) provided the appropriate mutation frequency and was chosen for further screening.
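The null-clone criterion used to choose a library can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the cutoff that defines a "null" clone (here, activity below 5% of the wild type reference) is an assumption not stated in the text, and the example data are invented. Only the 30-40% target window comes from the text.

```python
# Estimate the fraction of null clones in a library and compare it with the
# 30-40% window that the text associates with a 0.2-0.3% mutation frequency.

def null_fraction(activities, wt_activity, threshold=0.05):
    """Fraction of clones with activity below threshold * WT activity.

    ASSUMPTION: the 5% cutoff is illustrative; the text does not define it.
    """
    if not activities:
        raise ValueError("no clones supplied")
    cutoff = threshold * wt_activity
    nulls = sum(1 for activity in activities if activity < cutoff)
    return nulls / len(activities)


def in_target_window(fraction, low=0.30, high=0.40):
    """True when the null fraction falls inside the desired window."""
    return low <= fraction <= high


if __name__ == "__main__":
    # Hypothetical normalized activities for a 20-clone sample.
    library = [0.0] * 7 + [0.4, 0.6, 0.8, 0.9, 1.0, 1.0,
                           1.1, 0.7, 0.5, 0.95, 1.05, 0.85, 1.2]
    fraction = null_fraction(library, wt_activity=1.0)
    print(f"null fraction = {fraction:.2f}, "
          f"within window: {in_target_window(fraction)}")
```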
In the first generation of directed evolution, 1500+ colonies were screened for increased activity toward cellobiose at 85°C using the coupled enzyme assay. Activity corresponds to absorbance of NADPH at 340 nm. Seven clones showed increased absorbance as compared with wild type. Three potential positive mutants (Fig. 2) were re-grown in 96-well plates, and the cell filtrates were re-screened. A second cycle of growth and screening confirmed that all three were more active than the wild type. Mutant IE1E10 (IE, first generation, library E; 1, plate 1; E10, well identifier) had 33% greater absorbance; IE1D5 showed a 41% increase, and IE8A9 (I170T) had an increase in absorbance of 45% as compared with the wild type. I170T, showing the highest absorbance, will be used for the next generation of mutagenesis.
Sequence Analysis-Complete DNA gene sequences were obtained for two positive (IE1D5 and I170T (IE8A9)) and two null mutants (IE1A5 and IE2H10). Eleven base shifts were identified over the 5,300+ bases sequenced (Table I). The actual mutation frequency of 0.21% was within the desired range of 0.2-0.3%. There were 8 transitions and 3 transversions. The T → C base shift in the most active mutant, I170T, caused an I170T change in the amino acid sequence. The wild type Ile-170 is 6 amino acids from Glu-164, the conserved catalytic acid-base residue. Along with the adjacent Asn-163, involved in hydrogen bonding with the substrate at the −1 position, the pair of residues are strictly conserved in family 1 and 5 glycosyl hydrolases (47). Comparative sequence analysis by WU-BLAST (33,34) indicated that much of the sequence from residue 156 to 171, VKHWITLNEPWVVAIV, including the conserved catalytic glutamate and asparagine residues, is also highly conserved (bold face) in β-glucosidases and thioglucosidases in the Brassicales. Even isoleucine 170 (underlined) is conserved in a β-glucosidase from the fungus, Humicola grisea var. thermoidea, and in one from the plant, Arabidopsis thaliana. As indicated in Table I, the substituted residues of both the positive and the null clones have been identified with elements of secondary structure in which they might participate. It is not surprising that the A → T transversion in IE1A5 produced a null mutant, as the N163I substitution replaces the conserved active site asparagine involved in catalysis with a bulky hydrophobic residue.
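The mutation frequency and the transition/transversion classification reported above can be reproduced with a short calculation. The sketch is illustrative: the function names are hypothetical, and the sequenced length is assumed to be four complete 1335-base genes (5340 bases), which is consistent with the "5,300+" figure in the text but is not stated there.

```python
# Classify base changes and compute the observed mutation frequency.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        raise ValueError("reference and variant bases are identical")
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def mutation_frequency_percent(n_changes, n_bases):
    """Base changes per base sequenced, expressed as a percentage."""
    return 100.0 * n_changes / n_bases


if __name__ == "__main__":
    # The two changes named in the text: T -> C (I170T) and A -> T (N163I).
    for ref, alt in [("T", "C"), ("A", "T")]:
        kind = "transition" if is_transition(ref, alt) else "transversion"
        print(f"{ref} -> {alt}: {kind}")
    # ASSUMPTION: four full-length genes of 1335 bases were sequenced.
    print(f"{mutation_frequency_percent(11, 4 * 1335):.2f}%")
```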
Comparative Kinetic Characterization-Purified WT GghA and evolved I170T proteins each ran electrophoretically as a single band with an approximate molecular mass of 52-53 kDa, correlating with the DNA sequence data for gghA with the addition of the coding sequence for six histidines. Although WT GghA kinetics have been previously characterized (15,20,35), the current results are from identical and/or side by side experiments with both proteins performed under standard conditions. Fig. 3 shows the relative rates of the hydrolysis of cellobiose by wild type GghA and I170T proteins over a range of cellobiose concentrations. Kinetic parameters for both proteins were determined, and Table II summarizes the changes in kinetic behavior toward cellobiose arising from the I170T substitution. Although the Km values for both proteins were nearly the same, the maximum enzyme activity (Vmax) of I170T, as compared with its parent, increased by 31% (108 ± 3 versus 75 ± 2 units mg⁻¹). The kcat and the catalytic efficiency (kcat/Km) for I170T correspondingly increased 31%.
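A non-linear least-squares fit of the Michaelis-Menten equation, v = Vmax·[S]/(Km + [S]), can be sketched without any external library. The code below is not the SigmaPlot routine used in this work; it swaps in a simple alternative in which Vmax is solved in closed form for each trial Km and Km is located by a golden-section search, which assumes the residual has a single minimum over the search interval. All function names are hypothetical, and the data are synthetic, generated from a Vmax of 75 units mg⁻¹ and a Km of 42.6 mM (the wild type values quoted in the text) purely to show that the procedure recovers known parameters.

```python
import math


def michaelis_menten(s, vmax, km):
    """Rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def best_vmax(km, s_values, v_values):
    """Closed-form least-squares Vmax for a fixed Km."""
    f = [s / (km + s) for s in s_values]
    return sum(v * fi for v, fi in zip(v_values, f)) / sum(fi * fi for fi in f)


def sse(km, s_values, v_values):
    """Sum of squared residuals at a given Km, with Vmax optimized."""
    vmax = best_vmax(km, s_values, v_values)
    return sum((v - michaelis_menten(s, vmax, km)) ** 2
               for s, v in zip(s_values, v_values))


def fit_michaelis_menten(s_values, v_values, km_low=1e-3, km_high=1e3, tol=1e-6):
    """Return (Vmax, Km) minimizing the squared error.

    Golden-section search over Km; assumes a single minimum in the interval.
    """
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = km_low, km_high
    while abs(b - a) > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if sse(c, s_values, v_values) < sse(d, s_values, v_values):
            b = d
        else:
            a = c
    km = (a + b) / 2.0
    return best_vmax(km, s_values, v_values), km


if __name__ == "__main__":
    # Synthetic, noise-free rates over part of the 1.5-200 mM cellobiose range.
    substrate = [1.5, 5.0, 10.0, 20.0, 50.0, 100.0, 200.0]
    rates = [michaelis_menten(s, 75.0, 42.6) for s in substrate]
    vmax, km = fit_michaelis_menten(substrate, rates)
    print(f"Vmax = {vmax:.2f} units/mg, Km = {km:.2f} mM")
```

With real, noisy triplicate data the same routine would be applied to the measured rates; standard errors on the parameters would need a separate estimate.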
Protein Modeling-Crystal structures have been determined for a number of family 1 β-glucosidases (21,24,48). The enzyme from B. (now Paenibacillus) polymyxa, determined at 2.4-Å resolution (Protein Data Bank file 1BGA), was used to construct models of both WT GghA and I170T apoenzymes. The structure models (Figs. 4, 6, and 7) suggested the relative locations of the substituted residues in the tertiary structure of the proteins (Table I). The models were refined by energy minimization using the Insight 2000 package, and their stereochemical qualities were assessed. Ramachandran plots obtained from PROCHECK showed that, of the non-glycine, non-proline residues in wild type GghA, 98.2% were within the allowed regions with 70.2% in the most favored regions. The values for the I170T mutant were similar, with 97.4% of the non-glycine, non-proline residues falling within the allowed regions of the plot and 73.8% found in the most favored regions. Solvent accessible surface area estimates (43) indicated that both the wild type isoleucine and the mutant threonine are buried at position 170. In the model structures of the apoenzymes, residue 170 is seen as a building block in the side wall of the active site cleft just beneath two partially accessible aromatic residues, Tyr-175 and Trp-66. The derived models of GghA and I170T (Fig. 4) fit well into the TIM barrel (β/α)₈ structures characteristic of the glycosidase 4/7 superfamily (23). The location of the conserved catalytic residues Asn-163, Glu-164, and Glu-349 at the C termini of the fourth and seventh β-strands, respectively, further classifies GghA and I170T with other family 1, 2, 5, 10, and 17 glycosidases in the 4/7 superfamily of glycosyl hydrolases (49,50).

DISCUSSION
Using protein engineering techniques of directed evolution, we have identified a first generation GghA mutant from T. neapolitana that, bearing a single amino acid substitution, I170T, significantly raises the Vmax of the enzyme and increases its catalytic efficiency for cellobiose (Table II). The mutant, I170T, was picked from a library of 1,500 clones using a screening protocol based on the recently reported thermostable, 96-well plate coupled enzyme assay. Developed with T. maritima glucokinase and glucose-6-phosphate dehydrogenase enzymes, the assay made it possible to screen mutant libraries for increased β-glucosidase activity at 85°C (20). The protocol was initially designed as a continuous assay using a spectrophotometer where the path of the light is horizontal and the heat, which is required for the assay and maintained by a circulating water bath (85°C), rises vertically. However, when clone libraries were screened using a microtiter plate spectrophotometer, the assay had to be reconceived as an end point assay. In the confines of the detector, the heat produced by the assay interfered with the scanning light beam and distorted the absorbance readings. For 96-well plate screening, ice-cold buffer was added to stop the reaction and cool the plates; absorbances were then read. Fortunately, the thermostable coupled assay proved flexible and reliable, both as an end point and a continuous assay: the enhanced mutant, I170T, identified in the 96-well plate assay (Fig. 2) was confirmed and characterized in the 1-ml continuous assay (Fig. 3).
To track the evolution of the WT GghA enzyme, we chose to screen our mutant library using cellobiose. Although chromogenic substrates such as pNPG have been widely used to monitor ␤-glucosidase activity in cellulase systems (38,51), not so subtle differences in monomer conformation between the reducing sugar glycone of the cellobiose and the aglycone of the pNPG might be recognized by the evolving enzyme as essentially different substrates (52). The recent characterization of the mechanism of substrate specificity in a family 1 ␤-glucosidase from maize (53) and sorghum (26,54) provided insights into enzyme binding of a glycone-aglycone complex and confirmed our doubts about using an aryl analog. Czjzek et al. (53) noted that the maize ␤-glucosidase had "specialized aglycone and glycone binding pockets" that were each well suited to binding the disparate parts of the substrate. Whereas the pocket for the aryl portion of the molecule was formed with four aromatic residues, the glycone pocket contained six residues, Surface No a Activity as compared to wild type GghA. b Secondary structure prediction (15). c Secondary structure comparison to ␤-glycosidase from Sulfolobus solfataricus (24).   highly conserved in family 1 ␤-glucosidases, that form hydrogen bonds with the hydroxyl groups of the non-reducing glucose moiety in the active site (53). In T. neapolitana GghA, those residues are Gln-18, His-119, Glu-164, Glu-349, Glu-403, and Trp-404. A well to well comparison of two microtiter plates further suggested there was little correlation between enzyme assays for cellobiose and those for pNPG (Fig. 5). Based on data collected in establishing the mutation frequency of given GghA libraries, where pNPG activity assays were used to rapidly determine the percentage of active clones on a 96-well plate, there were a number of instances where an identical plate, albeit grown at different times, was assayed for both for cellobiose and pNPG hydrolysis. 
The qualitative analysis shown in Fig. 5 indicated that there was only a 25% correlation between the cellobiose-coupled assay and the pNPG assay for any given mutant (well assignment) on the 96-well plate.
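A well-by-well comparison of this kind can be illustrated with a minimal Python sketch. It is only one possible way to score agreement between two plate assays and is not the procedure used to obtain the 25% figure, which was a qualitative assessment. The function names, the threshold, and the relative activities below are hypothetical and are not data from this study.

```python
def improved_wells(relative_activity, threshold=1.0):
    """Return the wells whose activity relative to WT exceeds the threshold."""
    return {well for well, value in relative_activity.items() if value > threshold}


def overlap_fraction(assay_a, assay_b, threshold=1.0):
    """Of the wells flagged by either assay, return the fraction flagged by both."""
    hits_a = improved_wells(assay_a, threshold)
    hits_b = improved_wells(assay_b, threshold)
    flagged = hits_a | hits_b
    if not flagged:
        return 0.0
    return len(hits_a & hits_b) / len(flagged)


# Made-up relative activities (mutant / WT) for five wells; NOT measured values.
cellobiose = {"A1": 1.3, "A2": 0.9, "A3": 1.1, "A4": 0.2, "A5": 1.2}
pnpg = {"A1": 1.2, "A2": 1.4, "A3": 0.8, "A4": 0.1, "A5": 1.1}

print(overlap_fraction(cellobiose, pnpg))
```

A low overlap fraction would mean that clones flagged as improved on one substrate are seldom flagged on the other, which is the concern raised above about screening with an aryl analog.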
Family 1 1,4-β-D-glucan glucohydrolases exhibit a substrate preference for longer cello-oligomers; that is, as the degree of substrate polymerization (DP) increases, the catalytic efficiency (k cat /K m ) of the enzyme increases (15,21). In the barley β-glucosidase βII, multiple surface residues, Trp, Tyr, Phe, and His, have been characterized as the binding subsites of the glucosyl units of the longer chain cello-oligomers (47), enabling the conserved catalytic mechanism of the enzyme to cleave single glucose units off the reducing end of the polymer in an energetically efficient manner. Breaking the scissile bond between the first and second glucoses eases the distorted conformation of the reducing glucosyl residue, as has been suggested for other glycosyl hydrolases (55), and provides the energy to propel the substrate into position for the next cut. It is not surprising then that GghA shows little preference for cellobiose. Indeed, there is a 20-fold difference in the wild type K m values of the enzyme for cellotetraose and cellobiose, 2.15 versus 42.6 mM, respectively (15,20,35); furthermore, T. neapolitana expresses another catabolic enzyme, cellobiose phosphorylase, to utilize the disaccharide (15). The significant increase in the maximum rate of cellobiose hydrolysis and overall efficiency of the mutant enzyme, without a change in its K m value for cellobiose (Table II), suggests that the I170T substitution has affected the substrate-enzyme entry and/or exit interactions.
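The kinetic reasoning in this paragraph can be checked with a short Python sketch based on the Michaelis-Menten equation. It assumes, for illustration only, that the K m of the mutant is exactly equal to that of the wild type and that V max is expressed in arbitrary normalized units; the function and constant names are hypothetical.

```python
def mm_rate(s, vmax, km):
    """Michaelis-Menten initial rate: v = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)


# Wild type Km values quoted in the text (mM).
KM_CELLOTETRAOSE = 2.15
KM_CELLOBIOSE = 42.6

# Ratio behind the "20-fold" statement.
fold_difference = KM_CELLOBIOSE / KM_CELLOTETRAOSE
print(round(fold_difference, 1))

# Assumption: normalized Vmax; the mutant value is 31% higher, Km unchanged.
VMAX_WT = 1.0
VMAX_MUTANT = 1.31 * VMAX_WT

# With Km unchanged, the mutant/WT rate ratio is the same at every
# cellobiose concentration, so kcat/Km rises by the same factor as kcat.
ratios = [
    mm_rate(s, VMAX_MUTANT, KM_CELLOBIOSE) / mm_rate(s, VMAX_WT, KM_CELLOBIOSE)
    for s in (5.0, 42.6, 200.0)  # mM cellobiose, arbitrary test points
]
print(ratios)
```

Because the K m term cancels in the ratio, a change confined to V max scales the rate uniformly at low and high substrate concentrations alike, which is consistent with an effect on turnover or product release rather than on apparent substrate affinity.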
Structural changes in the mutant protein, I170T, as compared with WT GghA were investigated by molecular modeling. These models (Figs. 4, 6, 7), derived from β-glucosidase A from P. polymyxa (Protein Data Bank code 1BGA), enabled us to envision possible adjustments in three-dimensional structure and enzymatic mechanism that led to the 31% increase in the catalytic efficiency of the mutant. As the changes in I170T were relatively substantial, but the observed increases in catalytic activity, although significant, were not large, it is important to re-emphasize that the structural models were based on the apoenzyme conformations of the WT and I170T. The most prominent feature of the I170T apoenzyme model is the oval, crater-like shape of the active site. More characteristic of family 1 β-glycosidases (53,56), the I170T active site opening is considerably different from that of the wild type model in which aromatic residues, the sugar binding subsites (see above), protrude into the narrow cleft. At the bottom of both are the highly conserved glutamates, Glu-164 and Glu-349, the catalytic acid and nucleophile, respectively, of family 1 glycosyl hydrolases (22,23,53,57). The threonine for isoleucine substitution at residue 170, replacing a hydrophobic residue with a noncharged polar side chain, also changes the geometry of its immediate vicinity, presumably making new hydrogen bonds and "tightening" the turns of the short coil structure of which it is a part. This short helical motif, seen in the crystal structure representations of several β-glucosidases (21,24,56), runs parallel to the inside of the TIM barrel (Fig. 4). Starting just beyond the N terminus of the β4 strand, it includes not only the substituted I170T, but also conserved catalytic residues Asn-163 and Glu-164. The tightening of this coil (Fig. 6) results in a repositioning of several catalytic residues in the active site. 
Asn-163 is rotated completely away from its role in stabilizing one of the reducing-sugar hydroxyls. His-119, its side chain rotated 180° (Fig. 7), appears to provide both its nitrogen atoms for hydrogen bonding with that same sugar moiety. The two conserved glutamates, Glu-164 and Glu-349, are also repositioned by the threonine substitution: in the three-dimensional space of the model, the distance between their catalytic oxygens has been compressed. Other single amino acid substitutions in this conserved T(L)NEP region of the β4 strand have led to nearly complete loss of activity: the null mutant N163I reported here and in Verdoucq et al. (54). By superimposing large stretches of amino acids and looking for shifts in the areas not included in the superimpositions, qualitative comparisons can be made between the native and mutant model proteins. Two other tertiary structural changes appear to play a role in the geography of the active site of I170T. Upstream of Thr-170, in the loop between helix 3 and the β4 strand (Fig. 7), a vertical shift in the backbone angles repositions the side chain of Lys-157 at the back of the active site pocket. Its relocation suggests the formation of new hydrogen bonds with residues of the β5 strand. Another change in backbone angles, far downstream of Thr-170, causes the rings of Trp-396 to rotate up 45°, which in turn repositions Trp-322 up against the long wall of the crater. This shift appears to make the highly conserved catalytic core of the enzyme (Glu-20, His-119, Asn-163, Glu-164, Asn-220, Glu-349, Glu-403, and Trp-404) more accessible to the substrate. Indeed, the increase in k cat /K m of I170T could be the result of the more open conformation of the mutant and be indicative of a higher on rate for cellobiose. The new conformation also suggests a faster dispersion of the product (glucose) away from the catalytic microenvironment after hydrolysis, a conformational change in the enzyme, as Rignall et al. (5) also noted, that, in effect, makes glucose a better leaving group.
By combining error-prone PCR mutagenesis with the high-temperature coupled assay, we have developed a robust, reliable method for the directed evolution of a thermostable glycosyl hydrolase. The single amino acid substitution I170T, so close to the catalytic center of the enzyme, provides an excellent opportunity to consider the conserved strength and evolutionary diversity of family 1 glycosyl hydrolases. Despite the "remodeling" of the entry to the active site of the enzyme to better accommodate the smaller substrate, cellobiose, the conserved catalytic tools of the core β-glucosidase were retained. Within the framework of the national efforts to improve the efficiency of, and lower the costs associated with, the transformation of biomass cellulosics to biofuels, the results of this first step in the evolution of a thermophilic β-glucosidase from a glucan glucohydrolase hold great promise for the development of a key component of the model cellulase system of the future.