Expanding the Nucleotide and Sugar 1-Phosphate Promiscuity of Nucleotidyltransferase RmlA via Directed Evolution

Directed evolution is a valuable technique to improve enzyme activity in the absence of a priori structural knowledge, which can typically be enhanced via structure-guided strategies. In this study, a combination of both whole-gene error-prone polymerase chain reaction and site-saturation mutagenesis enabled the rapid identification of mutations that improved RmlA activity toward non-native substrates. These mutations have been shown to improve activities over 10-fold for several targeted substrates, including non-native pyrimidine- and purine-based NTPs as well as non-native D- and L-sugars (both α- and β-isomers). This study highlights the first broadly applicable high throughput sugar-1-phosphate nucleotidyltransferase screen and the first proof of concept for the directed evolution of this enzyme class toward the identification of uniquely permissive RmlA variants.

The availability of diverse sugar nucleotides is key to this process, and given the challenges associated with their chemical synthesis (13-15), the enzymatic synthesis of sugar nucleotides is an attractive alternative (16,17). Within this context, certain sugar-1-phosphate nucleotidyltransferases display some flexibility toward non-native sugar 1-phosphates and, in some cases, even non-native NTPs (18-25), the promiscuity of which can be further expanded via protein engineering (26-28). One of the most permissive and best studied nucleotidyltransferases to date is the glucose-1-phosphate thymidylyltransferase from Salmonella enterica typhimurium LT2 (RmlA; Fig. 1B) (24,27,28). Specifically, wild type RmlA was demonstrated to turn over a wide variety of sugar 1-phosphates, including deoxy, amino, and azido sugars, and even displayed weak activity with all eight ribo- and deoxyribonucleotides. Although structure-guided mutagenesis has led to some improvements in RmlA substrate range, this method requires access to accurate ligand-bound models and provides only a static representation (i.e. does not account for protein dynamics as a factor contributing to catalysis).
As a complement to structure-guided engineering, the directed evolution of enzymes has enabled studies to increase activity, enhance enantioselectivity, improve stability, broaden promiscuity, increase tolerance to organic solvents, and even engineer metabolic pathways (29-31). The process has also been specifically applied to two enzymes relevant to glycosylation, including those utilized for glycorandomization (32-35). To date, the main limitation for sugar-1-phosphate nucleotidyltransferase directed evolution has been a lack of a facile high-throughput screen. The one previous report of directed evolution for this enzyme class was restricted to native substrates and a substrate-inflexible biological readout (35). Herein we describe the application of a recently developed general high-throughput screen for sugar 1-phosphate toward RmlA directed evolution (36). In this study, epPCR and site-saturation techniques were combined to identify a number of improved variants for both the NTP and sugar 1-phosphate substrates. By combining the mutations identified, variants with up to 40-fold improvement in activity toward six non-native nucleotides, galactose-1-phosphate, and both the α- and β-anomers of L-fucose 1-phosphate were identified. Subsequent x-ray crystallography of key single amino acid mutants with enhanced NTP tolerance also revealed the mode of binding for non-natural NTPs. This study highlights the first application of nucleotidyltransferase directed evolution for expanding the activity toward non-native substrates.

EXPERIMENTAL PROCEDURES
General Materials and Methods-Chemicals were purchased from Sigma-Aldrich or Fisher and used without further purification. Oligonucleotides were obtained from either Integrated DNA Technologies (Coralville, IA) or the University of Wisconsin-Madison Biotechnology Center (Madison, WI) as "desalted" grade and used without any purification.
Liquid handling during high throughput screening was performed on a Beckman Coulter (Fullerton, CA) Biomek FX liquid handling system, and microtiter plate absorbance was obtained with a BMG Labtech (Durham, NC) FLUOstar Optima plate reader. Analytical and preparative HPLC was performed using a Varian ProStar HPLC system (Varian Inc., Palo Alto, CA).
Mutagenesis-Error-prone PCR was performed with the GeneMorph II kit from Stratagene (La Jolla, CA), using the manufacturer's protocols. For a mutation rate of 1-3 nucleotides/kb, 500-2000 ng of plasmid (75-325 ng amplified DNA) was used. Mutated genes were subsequently cloned into vectors via standard restriction enzyme digestion (NdeI/EcoRI) and ligation (T4 DNA ligase) or with the MEGAWHOP protocol, as described (37,38). Site saturation and site-directed mutagenesis (supplemental Tables SI and SII) were performed using the megaprimer-based protocol of Tseng et al. (39).
Microtiter Plate Assay-The DNA from error-prone or site saturation libraries was desalted and directly transformed into electrocompetent BL21(DE3) cells (Lucigen, Middleton, WI), using the manufacturer's protocols. The obtained colonies, along with empty vector and wild type controls, were inoculated into LB + kanamycin medium (50 mg/liter) in a 96-well deep well (2 ml) culture plate (MidSci, Valley Park, MO) for an overnight culture at 37°C with mixing (350 rpm). Ten microliters from each well were transferred to 1 ml of LB + kanamycin medium in a fresh 96-well deep well plate and incubated at 37°C for 3 h, at which time 100 μl/well of 4 mM isopropyl-β-D-thiogalactopyranoside was added. After overnight (20-22 h) incubation at 28°C and 350 rpm, the cells were pelleted (2000 × g, 10 min), decanted, and frozen at -20°C.
Cells were subsequently resuspended in 500 μl/well 50 mM sodium phosphate (pH 7.5) containing 2 mg/ml lysozyme, and the cell suspension was frozen at -80°C. Frozen solutions were thawed in a warm water bath, and the insoluble portions were removed via centrifugation (6000 × g, 10 min). Ni²⁺-NTA-agarose slurry (Qiagen, Valencia, CA), washed and equilibrated to 25% (v/v) in a 50 mM sodium phosphate (pH 7.5) buffer containing 1.2 M NaCl, was added (100 μl/well) to a fresh 96-well deep well plate, followed by the cleared supernatant. Resin was incubated with protein for 10-20 min at room temperature, with occasional gentle mixing. The bound resin was centrifuged at 1000 × g for 5 min, followed by careful aspiration of the supernatant. Resin was washed twice with 1 ml/well each of 10 mM MOPS (pH 7.5). On the final wash, 400 μl/well was left behind from the aspiration.
Enzyme Kinetics-Assays with purified enzymes were performed using the previously described HPLC assay (28). Steady-state kinetic parameters were obtained by fixing either glucose 1-phosphate, UTP, or dTTP at a saturating concentration of 10 mM (34,35) and titrating the compound of interest. At least six different concentrations in the range of 0.25 × Km to 4 × Km (1-40 mM for NTP, 1.5-50 mM for galactose 1-phosphate, 3-50 mM for fucose 1-phosphates) were assayed in replicate on at least three separate days for each titration.
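As a rough illustration of how Km and Vmax can be extracted from a titration of this kind, the sketch below fits the Michaelis-Menten equation through the Hanes-Woolf linearization ([S]/v = [S]/Vmax + Km/Vmax). This is not necessarily the fitting procedure used in this study, which is not specified in this section; the function name and the synthetic, noise-free data are hypothetical and serve only to show the arithmetic.

```python
def hanes_woolf_fit(s, v):
    """Estimate (Km, Vmax) from substrate concentrations s and initial rates v.

    Ordinary least squares on the Hanes-Woolf form
    s/v = s/Vmax + Km/Vmax, so slope = 1/Vmax and intercept = Km/Vmax.
    """
    if len(s) != len(v) or len(s) < 2:
        raise ValueError("need at least two paired (s, v) points")
    y = [si / vi for si, vi in zip(s, v)]
    n = len(s)
    mean_s = sum(s) / n
    mean_y = sum(y) / n
    sxx = sum((si - mean_s) ** 2 for si in s)
    sxy = sum((si - mean_s) * (yi - mean_y) for si, yi in zip(s, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_s
    vmax = 1.0 / slope
    km = intercept * vmax
    return km, vmax


# Hypothetical example values (not data from this study): six concentrations
# spanning 0.25 x Km to 4 x Km for an assumed Km of 10 mM.
true_km, true_vmax = 10.0, 100.0
conc = [2.5, 5.0, 10.0, 20.0, 30.0, 40.0]
rates = [true_vmax * c / (true_km + c) for c in conc]
km, vmax = hanes_woolf_fit(conc, rates)
```

With real, noisy replicate data a nonlinear least-squares fit of the untransformed equation is generally preferred, because linearizations weight the measurement error unevenly.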
X-ray Crystallography of RmlA Variants-Purified RmlA samples were concentrated to 12 mg/ml (Q83S) and 15 mg/ml (Q83D), and dATP or dGTP was added to a final concentration of 25 mM for co-crystallization. Initial screens were performed with a local screen UW192 utilizing a Mosquito dispenser (TTP LabTech, Royston, UK) by the sitting drop method. Crystal growth was monitored by Bruker Nonius Crystal Farms at 20 and 4°C.
Diffraction quality crystals of RmlA Q83S with dATP were obtained by mixing 2 μl of the sample solution and 2 μl of a reservoir solution (20% methyl ether PEG 5000, 120 mM MgCl₂, 100 mM Tris, pH 8.5, and 1 mM suramine) at 20°C using the hanging drop method. Diffraction quality crystals of RmlA Q83D with dGTP were generated by mixing 2 μl of the sample solution and 2 μl of a similar reservoir solution (10% methyl ether PEG 5000, 120 mM MgCl₂, 100 mM Tris, pH 8.5, and 1 mM suramine) at 20°C using the hanging drop method. Crystals were cryoprotected with 20% ethylene glycol and were flash-frozen in liquid nitrogen.
X-ray diffraction data were collected by the Life Science Collaborative Access Team at the Advanced Photon Source at Argonne National Laboratory (Argonne, IL) using an x-ray wavelength of 0.9794 Å. Data sets were indexed and scaled using HKL 2000 (41). Molrep was utilized for molecular replacement (42). For the RmlA Q83D/dGTP structure (Protein Data Bank entry 3PKQ), molecular replacement was used with the previous RmlA mutant structure (Protein Data Bank entry 1MP4) as a starting model. For the RmlA Q83S/dATP structure (Protein Data Bank entry 3PKP), molecular replacement was used with the RmlA Q83D/dGTP structure as a starting model. The structures were completed with alternating rounds of manual model building with COOT (43) and refinement with phenix.refine (44). The final round of Q83S structure refinement included eight translation-libration-screw rigid groups (45,46). Noncrystallographic symmetry restraints were applied to both structures. Structure quality was assessed by Procheck (47) and Molprobity (48).

RESULTS AND DISCUSSION
Library Creation-Purely random mutagenesis techniques, such as epPCR, have several limitations. Random mutagenesis spreads mutations evenly throughout the protein, whereas productive mutations more often occur near the active site (49,50). In addition, epPCR causes almost exclusively single nucleotide changes in any one codon, thereby restricting the possible amino acid substitutions for each position (51). As a result, researchers have increasingly turned to "semirational" techniques, such as structure or homology-guided saturation mutagenesis, to enhance the outcome of enzyme evolution studies (51,52).
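The restriction imposed by single nucleotide changes can be made concrete with the standard genetic code. The sketch below (the function name is hypothetical) lists the amino acids reachable from a codon by changing exactly one base, using the two glutamine codons as an example.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def single_nt_neighbors(codon):
    """Amino acids reachable from `codon` by changing exactly one nucleotide.

    Stop codons and synonymous changes are excluded.
    """
    original = CODON_TABLE[codon]
    reachable = set()
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            aa = CODON_TABLE[mutant]
            if aa != "*" and aa != original:
                reachable.add(aa)
    return reachable


# Union over both glutamine codons (CAA and CAG).
gln = single_nt_neighbors("CAA") | single_nt_neighbors("CAG")
print(sorted(gln))
```

For glutamine, only six of the nineteen possible replacements (Glu, His, Lys, Leu, Pro, and Arg) are one base change away; substitutions such as Ser, Thr, Ala, or Val require two or more changes within the same codon.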
A similar strategy was adopted for this study. Of the 61 amino acid residues within 8 Å of the bound product, 32 were eliminated from consideration because they either orient away from the active site or are occluded by more proximal amino acids. Three additional residues, which appear to be directly involved in binding the invariant phosphates, were also eliminated. Based upon this assessment, 22 of the remaining residues were submitted to saturation mutagenesis using an NNK degenerate codon.
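For reference, the composition of an NNK library (N = A, C, G, or T; K = G or T) can be enumerated directly from the standard genetic code. The following is a minimal sketch, not part of the original protocol; variable names are illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# NNK: any base at positions 1 and 2, G or T at position 3.
nnk_codons = ["".join(c) for c in product("ACGT", "ACGT", "GT")]
amino_acids = {CODON_TABLE[c] for c in nnk_codons} - {"*"}
stop_codons = [c for c in nnk_codons if CODON_TABLE[c] == "*"]

print(len(nnk_codons), len(amino_acids), stop_codons)
```

The degenerate codon yields 32 codons that cover all 20 amino acids with a single stop codon (TAG), which is why it is a common choice for saturation mutagenesis.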
As a complement, epPCR libraries were also generated with a mutation rate of 1-3 nucleotides/gene. Higher mutation rates were found to be detrimental to protein folding/production, consistent with prior epPCR-based studies (53,54).
Sensitivity Enhancements of the Plate-based Assay-Although the previously published phosphatase/pHBH assay for residual sugar phosphates (36) works well for RmlA crude extracts and good substrates, contaminating phosphatase activity within the crude lysate is detrimental in the context of poor substrates. Specifically, various sugar 1-phosphates simply treated with crude lysates (without the addition of exogenous alkaline phosphatase) revealed nearly quantitative conversion of the sugar 1-phosphate to reducing sugar in 30-60 min (Fig. 2A). Thus, for non-native substrates requiring longer incubation times or more enzyme (i.e. more crude extract), the pHBH assay had to be adapted to a purified enzyme format.
Initial attempts toward this goal utilized standard 96-well plate-based nickel affinity chromatography; however, although this method enabled the purification of the enzyme (and corresponding removal of contaminating phosphatase activity), salt in the elution buffer inhibited RmlA activity. Lowering salt concentration or changing the type of salt in the elution buffer failed to identify a satisfactory alternative (data not shown), and standard methods for buffer exchange (such as dialysis and gel filtration), although successful, are not conducive to high throughput applications.
Consistent with prior precedent for immobilized active nucleotidyltransferases (55), an examination of immobilized RmlA revealed specific activities similar to those of purified enzyme in solution (~4-fold reduction; Fig. 2B). Initial pHBH assays, however, revealed that the presence of the Ni²⁺-NTA-agarose greatly reduced assay sensitivity. Consistent with the well documented detrimental effect of metal ions (such as Ni²⁺) on the pHBH reaction (36,56), the addition of 50 mM EDTA to the sodium hydroxide reagent was sufficient to restore sensitivity (Fig. 2C).
Microtiter Plate Screening Conditions-To maximize sensitivity, reaction conditions that resulted in ~25-50% turnover of the WT RmlA were selected for all screening. In this fashion, mutants that demonstrate ≥2-fold improvements (compared with WT RmlA) were readily discernible. Fig. 2D illustrates a typical reaction, highlighting the activities of both WT RmlA and inactive controls.
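One way to read the choice of a 25-50% turnover window is in terms of assay headroom: a variant cannot exceed complete conversion, so the higher the wild type turnover, the less of a true improvement the endpoint readout can report. The sketch below illustrates this under the simplifying assumption (ours, not the authors') that conversion scales linearly with activity until the substrate is exhausted; the function names are hypothetical.

```python
def apparent_turnover(wt_turnover_pct, fold_improvement):
    """Turnover (%) expected for a variant, capped at complete conversion.

    Assumes conversion scales linearly with activity until substrate runs
    out, which is a simplification for illustration only.
    """
    return min(100.0, wt_turnover_pct * fold_improvement)


def observed_fold(wt_turnover_pct, fold_improvement):
    """Fold improvement the endpoint readout could report under that assumption."""
    return apparent_turnover(wt_turnover_pct, fold_improvement) / wt_turnover_pct


# A true 2-fold improvement at different (hypothetical) WT turnover levels.
for wt in (25.0, 50.0, 80.0):
    print(wt, observed_fold(wt, 2.0))
```

Under this assumption a 2-fold improved variant is fully resolved when the wild type turns over 25% or 50% of the substrate, but appears only 1.25-fold improved when the wild type already reaches 80%.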
To identify mutations that specifically improve activity toward nucleotide substrates, ATP, CTP, and GTP with glucose-1-phosphate were tested, with the expectation that the ribo forms would serve as reasonable surrogates for the corresponding deoxyribo forms. To limit the number of preliminary screens, the target NTPs were screened as a combined equimolar mixture. Galactose 1-phosphate was selected as a primary sugar target for two reasons: (i) WT RmlA displays only moderate activity with galactose 1-phosphate (28,34), and (ii) the sugar kinases commonly employed for glycorandomization are most proficient at generating galacto-configured sugar 1-phosphate analogs (34,57). To further challenge the outcome in the context of galactose 1-phosphate, UTP was selected for the countersubstrate.
Although L-configured isomers are often found attached to microbial natural products (4-7), all reported RmlA sugar 1-phosphate substrates to date have been D-configured isomers. To explore the ability of engineered RmlA variants to accommodate L-configured sugars, two commercially available L-sugar 1-phosphates, α-L-fucose 1-phosphate and β-L-fucose 1-phosphate, differing only at their anomeric positions (supplemental Fig. S1), were also included as screening targets in the presence of dTTP (to maximize activity). Although β-L-fucose 1-phosphate has previously been shown to be a substrate for another glucose-1-phosphate thymidylyltransferase (20), both α and β forms were tested because the position of the anomeric phosphate is critical for RmlA catalysis, and it is unknown if the absolute ((R) versus (S)) or relative (axial versus equatorial) stereochemistry is most important in substrate positioning for RmlA.
Screening Results-To minimize the chances of a false negative and to correct for potential variability in expression levels, a multitiered screening approach was used. Specifically, the four most active "hits" from each plate, identified using the primary pHBH screen, were reinoculated from the reserved culture plates into fresh growth plates and were re-expressed and reassayed (supplemental Fig. S2). During the secondary assay, the active mutants were tested against each NTP individually. Completion of this screening program revealed a fair overlap between hits, especially for the NTP targets. A final cohort of 76 hits, representing all of the unique starting colonies, were tested against each substrate and sequenced. This analysis revealed many of the hits from the site saturation library to be degenerate, leaving a total of 37 distinct variants (supplemental Table SIII), with 49 distinct mutations at 38 different sites within the protein (supplemental Fig. S3).
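The primary selection step described above (carrying forward the four most active wells from each plate) can be sketched as follows. The data layout, the function name, and the example values are hypothetical; in particular, the sketch assumes that each well has already been converted to an activity score for which a higher value means more turnover.

```python
from collections import defaultdict


def top_hits_per_plate(readings, n_hits=4):
    """Return the n_hits most active wells for each plate.

    readings: iterable of (plate_id, well_id, activity) tuples, where a
    higher activity score means more turnover (an assumption of this sketch).
    """
    by_plate = defaultdict(list)
    for plate, well, activity in readings:
        by_plate[plate].append((activity, well))
    hits = {}
    for plate, wells in by_plate.items():
        ranked = sorted(wells, reverse=True)  # most active first
        hits[plate] = [well for _, well in ranked[:n_hits]]
    return hits


# Hypothetical example scores, not data from this study.
readings = [
    ("P1", "A1", 0.10), ("P1", "A2", 0.80), ("P1", "A3", 0.55),
    ("P1", "A4", 0.30), ("P1", "A5", 0.95), ("P1", "A6", 0.20),
    ("P2", "A1", 0.40), ("P2", "A2", 0.60),
]
hits = top_hits_per_plate(readings)
```

Selecting a fixed number of wells per plate, rather than applying one global threshold, is one simple way to limit the influence of plate-to-plate variation in expression on which colonies are re-expressed and reassayed.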
Preliminary Hit Characterization-To confirm identified hits, a selection of the top hits was produced in E. coli, purified, and tested in an HPLC-based specific activity assay at the substrate concentrations employed during screening (Tables 1 and 2). Although not all of the putative hits held up under this final analysis (~40% had activities less than WT), at least one mutation per substrate was identified with a demonstrable improvement in activity. With regard to NTP improvements, Table 1 reveals the most productive substitutions to occur at a specific series of glutamines (Gln27, Gln83, and Gln91) wherein small polar (Ser and Thr) or small hydrophobic (Ala and Val) substitutions provide the greatest desired improvement. Although subtle amino acid substitutions at one or more of these positions lead to some purine (ATP/GTP) versus pyrimidine (CTP)- and/or ribose (NTP) versus deoxyribose (dNTP)-based preferences, this analysis suggests that the best NTP "generalist" derives from the Gln91 series with small side chain substitutions (Q91T > Q91S > Q91A). It should be noted that a mutation equivalent to Q27S in a homologous enzyme was previously reported to improve activity toward UTP (26). In addition, Gln83 mutations (including Q83S) had previously been identified as important for enhancing RmlA activity with purine nucleotides (28). Table 2 suggests sugar 1-phosphate improvements to be more case-specific, with the most productive L-sugar 1-phosphate substitutions occurring at Leu89, Tyr146, and Trp224 with small (Val) or planar (Phe) hydrophobic substitutions preferred. In contrast, the double mutant M22V/L89I provides the best "general" C-4 epimer selectivity switch from glucose to galactose wherein the single mutations are roughly additive. In addition, galactose-directed mutants may be nucleotide-sensitive, with some improvements being specific to UDP/Gal 1-phosphate (Q27V) and some to dTDP/Gal 1-phosphate (Y146F).

Specific activity of selected NTP mutants
Shown is specific activity in min⁻¹ (values greater than wild type underlined with the two best mutants for each category in boldface type) using the following assay conditions: 2 mM sugar 1-phosphate, 2.5 mM NTP, 100 mM MOPS, pH 7.5, 7.5 mM MgCl₂, 1 unit/ml inorganic pyrophosphatase, 30

Targeted Mutant Combinations-Key active mutations for targeted substrates were combined via site-directed mutagenesis to test for further gains in activity. Specifically, this study focused upon two series of combinations: (i) improving the ATP/CTP activities of the "NTP generalist" (substituted at Gln91) via combining with advantageous Gln27 mutations and (ii) improving targeted sugar 1-phosphate activities through combining the most productive mutations at Gln27, Leu89, Tyr146, and Trp224. In most cases, the combination of two mutations enhanced activity as expected (Figs. 3 and 4). One notable exception was Q91S, which failed to combine profitably with either Q27V (in the case of GTP) or Q83D (for GTP and dGTP). It should be noted that the specific activity observed for β-L-fucose-1-phosphate and dTTP with Y146F/W224F (1.6 min⁻¹; Fig. 4) compares favorably with other Glc 1-phosphate thymidylyltransferases reported to turn over this L-sugar 1-phosphate (0.45-1.4 min⁻¹) (24), and this level of activity is sufficient for bulk production of dTDP-L-fucose-1-phosphates (supplemental material).
To examine if anomeric flexibility is a general feature of RmlA, the specific activities of WT RmlA and Y146F/W224F with β-D-Glc 1-phosphate and dTTP were compared. Y146F/W224F displayed a small amount of turnover (0.043 min⁻¹; 16-fold lower than the anomerically similar α-L-Fuc 1-phosphate and 46,000-fold lower than wild type RmlA with dTTP and α-D-Glc 1-phosphate). No activity was detected for the wild type enzyme and β-D-glucose 1-phosphate to the limit of sensitivity (0.007 min⁻¹).
Table 3 highlights the single substrate kinetic parameters for key mutants. Notably, no appreciable change in Km was observed for any of the mutants characterized, despite the fact that the screening conditions were in the Km regime. Instead, the improvements in activity were confined primarily to kcat and kcat/Km, with overall improvements in kcat/Km (for the screened reactions) ranging from 6-fold (for Q27V/Q91T and CTP) to 40-fold (for Q27V/Q91T and GTP). Determined parameters for non-screened dNTPs revealed more modest improvements in kcat/Km, ranging from 1.4-fold (for dCTP and Q91T) to 9-fold (for dATP and Q83S/Q91T). Notably, the observed kcat values for dATP and dGTP are in the range for WT RmlA and native dTTP substrate (1850 ± 40 min⁻¹) (28). Likewise, kcat improvements for the targeted sugar 1-phosphates ranged from 11- to 28-fold, with the greatest improvement observed for β-L-Fuc 1-phosphate/dTTP (Y146F/W224F).
Structural Basis for Sugar 1-Phosphate Improvements-From a structural perspective, most of the key mutations have straightforward explanations. Two of the identified mutations for Fuc 1-phosphate (Y146F and W224F) are located near C6 of the hexose in the active site (based upon bound NDP-Glc).
Although the specific structural impacts of Y146F or W224F have not been determined, previous structural characterization of the W224H mutant revealed an astonishing active site side chain rearrangement and main chain distortion that creates an expansive gap in the substrate-binding pocket surrounding C6. The precedent for such mutants to accommodate significant C6 steric bulk, in conjunction with a bias for β-L-Fuc 1-phosphate over α-L-Fuc 1-phosphate, is consistent with β-L-Fuc 1-phosphate adopting the less favored α-D-like ⁴C₁ chair conformation (with the 6″-methyl in the axial position) within the active site. A similar induced fit mechanism has been put forth to explain the ability of engineered anomeric sugar kinases to accept both D- and L-sugars (34). That identical mutations also provide an advantage for α-L-Fuc 1-phosphate activity may suggest contributions from a median chair conformation potentially accessible from both α- and β-L-Fuc 1-phosphate (e.g. boat, skew, or envelope) or even the possibility that the sugar may completely flip over in the active site.
Although the specific structural contributions of L89V or Q27V have not been determined, Leu89 and Gln27 both interact with the nucleoside substructure of the NTP/NDP-sugar. In addition, previous structural characterization of the L89T mutant revealed this mutation to relieve sugar 1-phosphate C2, C3, and C4 steric constraints via a potential adjustment or "slipping" of the sugar base in the enlarged active site pocket (58).
Structural Basis of Nucleotide Improvements-Gln83 is specifically involved in binding the nucleoside base substructure of the NTP/NDP-sugar and thereby contributes to base specificity. In contrast, Gln91 is found within the secondary shell, where it hydrogen-bonds with Gln83 but does not directly contact the substrate. To further probe the effect of the Gln83 mutations on nucleotide specificity, x-ray structures of RmlA Q83S with dATP and Q83D with dGTP were solved to a resolution of 2.6 and 2.4 Å, respectively (supplemental Table SIV). The overall D2 symmetric tetrameric conformations were found to be similar to previous wild type and other mutant RmlA structures (58). Q83S was found to have two tetramers per asymmetric unit compared with one tetramer per asymmetric unit for Q83D. Although the Gln83-containing loops (residues 83-87) of both mutant structures were not as clearly defined due to high B-factors and perhaps incomplete occupancy of dGTP in the Q83D structure (80-86%), key contacts and the mutational influence upon non-native nucleotides can be confirmed in both structures (Fig. 5). In the Q83S mutant with dATP bound, the serine hydroxyl group forms weak hydrogen bonds to N1 and N6 of adenine (distance: 3.1 Å with N1, 3.2 Å with N6). In the Q83D mutant with dGTP bound, the side chain oxygens of aspartate show weaker hydrogen bond interactions with N1 of guanine (distance: Oδ1, 2.9 Å with N1).
There are several features that distinguish the binding from the previous wild-type structures (Protein Data Bank entry 1IIM). First, in the mutant proteins, the purine rings of dATP and dGTP are tilted about 45° toward the mutated residue due to the new interactions. Second, the loop containing residues 83-89 shows a variety of conformations and possesses a higher average B-factor than other regions. In the Q83S structure, Pro84, Ser85, and Pro86 shift outward toward the position of the nucleotide, creating more space for a larger substrate. Ser85 shows a completely different rotamer conformation compared with the wild type. Also, the backbone oxygens of Ser85 and Pro86 interact with adenine N6 via hydrogen bonding. Meanwhile, in the Q83D structure, we assigned alternative conformations for Ser85, Pro86, and Asp87 due to the apparent disorder. One conformer has the loop shifted outward toward the position of the nucleotide, creating more space for a larger substrate, and Pro86 has a flipped conformation. The other conformer is similar to the wild type and Q83S structures but has a cis-proline (Pro86). In both conformers, the hydroxyl group of the Ser85 side chain interacts with O6 of guanine. The Q83S and Q83D structures also have different rotamer conformations of Glu87. Furthermore, the loop with residue 83 lacks interactions with Gln91, consistent with the lack of an additive effect within certain Gln83/Gln91 double mutants that have been assayed (e.g. Q83D/Q91S). Based on these observations, introduction of the Gln83 mutation not only creates hydrogen bonds to the new substrates but also makes a wider active site and contributes to a more flexible binding loop.

CONCLUSION
This study highlights the first application of nucleotidyltransferase directed evolution for expanding the activity toward non-native substrates. This work further illuminates the plasticity of this unique catalyst and enables, for the first time, the preparative syntheses of a set of non-native fucose nucleotides. The high throughput tools presented and key nucleotidyltransferase residues identified are likely to contribute to the future engineering of nucleotidyltransferases and related biocatalysts.