Phylogenetic sequence analysis and functional studies reveal compensatory amino acid substitutions in loop 2 of human ribonucleotide reductase

Eukaryotic class I ribonucleotide reductases (RRs) generate deoxyribonucleotides for DNA synthesis. Binding of dNTP effectors is coupled to the formation of active dimers and induces conformational changes in a short loop (loop 2) to regulate RR specificity among its nucleoside diphosphate substrates. Moreover, ATP and dATP bind at an additional allosteric site 40 Å away from loop 2 and thereby drive formation of activated or inactive hexamers, respectively. To better understand how dNTP binding influences specificity, activity, and oligomerization of human RR, we aligned >300 eukaryotic RR sequences to examine natural sequence variation in loop 2. We found that most amino acids in eukaryotic loop 2 were nearly invariant in this sample; however, two positions co-varied as nonconservative substitutions (N291G and P294K; human numbering). We also found that the individual N291G and P294K substitutions in human RR additively affect substrate specificity. The P294K substitution significantly impaired effector-induced oligomerization required for enzyme activity, and oligomerization was rescued in the N291G/P294K enzyme. None of the other mutants exhibited altered ATP-mediated hexamerization; however, certain combinations of loop 2 mutations and dNTP effectors perturbed ATP's role as an allosteric activator. Our results demonstrate that the observed compensatory covariation of amino acids in eukaryotic loop 2 is essential for its role in dNTP-induced dimerization. In contrast, defects in substrate specificity are not rescued in the double mutant, implying that functional sequence variation elsewhere in the protein is necessary. These findings yield insight into loop 2's roles in regulating RR specificity, allostery, and oligomerization.

Ribonucleotide reductase (RR) catalyzes the reduction of ribonucleotides to generate the 2′-deoxynucleotides required for DNA synthesis, and at least one type of RR is found in all cellular organisms. RR is responsible for maintaining balanced pools of dNTPs, and its activity is regulated at the levels of transcription, localization, and allostery (1,2). Eukaryotic class I RR is composed of a large subunit that binds nucleotides (R1) and a small subunit that carries a tyrosyl radical essential for catalysis (R2). The R1 subunit contains the active site for NDP reduction (the catalytic site, or C-site) and two allosteric sites: the activity site (A-site) and the specificity site (S-site) (Fig. 1A). Binding of an NTP/dNTP effector to the S-site results in formation of active R1 dimers and controls enzymatic specificity, i.e. the relative kcat/Km values for the four NDP substrates (3). ATP/dATP binding promotes reduction of primarily CDP and UDP, dGTP binding favors ADP reduction, and dTTP binding favors GDP reduction (4,5). The A-site is ~40 Å from the active site and binds ATP or dATP. Binding of ATP or dATP at the A-site induces formation of activated or inactive hexamers, respectively. The ATP-activated form is generally accepted as the physiologically relevant species (6-10). Together, the actions of ligand binding at the A-site and S-site ensure that total dNTP pools remain at appropriate levels (5). dNTP pools are so finely tuned that they vary significantly even among cellular compartments (11).
RR is a key target for human disease treatment including cancer chemotherapy, and interactions between small molecules and hRR can be measured by observing changes in oligomeric state (12). The nucleotide analog gemcitabine diphosphate binds in the enzyme active site (C-site) and acts as a suicide inhibitor (13). Recently, it was shown that the triphosphate form of clofarabine, a deoxyadenosine analog, induces formation of inactive R1 hexamers. The adenosine analogs cladribine and fludarabine can also act as inhibitors in their diphosphate forms by inducing the formation of inactive hexamers (12). Although the functional connection between effector S-site binding, dimerization, and RR activity up-regulation is well understood, there are fewer studies examining the mechanistic link between dATP binding, hexamerization, and inhibition (6, 9, 14-16). Moreover, current models do not yet account for ATP's activating effect on activity. Achieving a better understanding of nucleotide binding and associated enzyme conformations responsible for allosteric regulation of specificity and activity is, therefore, essential both to our basic understanding of RR function and to development of further therapeutics. Current X-ray crystal structures of bacterial and eukaryotic RR together with in vitro functional studies provide valuable and detailed insight into the potential mechanisms by which nucleotide binding regulates eukaryotic class I RR activity and substrate specificity (Fig. 1A) (9, 14-16, 18, 19). Binding of dNTP effectors results in formation of active R1 dimers in which the C-site and S-site of opposite monomers are closely apposed at the dimer interface. Structure models of the R1 subunit from Saccharomyces cerevisiae show that the active site is only large enough to accommodate the substrate after dNTP effector binding (15).
Kinetic studies of murine RR also support the idea that effector-dependent dimerization is necessary for activity, preventing substrate reduction from taking place without allosteric information transfer from the S-site (20). Structures of RRs from bacteria and eukaryotes further reveal that the C-site and S-site are joined by a short loop called loop 2 (Fig. 1B). Structural models of RRs bound to different substrate/effector pairs show that binding of a dNTP effector in the S-site directs C-site specificity by changing the conformation of loop 2. The general features of this model are supported by the available biochemical data and contribute to the overall paradigm for enzyme regulation by allostery.
Although some features of loop 2 appear to be conserved, others vary between species, but how such differences influence function is not known. Furthermore, crystal structures of RRs from various species show differences in loop 2 conformations in the presence of common ligands (9, 14-16), which can leave the precise roles of individual amino acids unclear. Mutation of Arg-293 or Gln-288 in S. cerevisiae RR to alanine results in impaired substrate binding, reduced catalytic activity, and altered loop 2 conformations (21). Mutation of Arg-293 or Gln-288 (human numbering) to alanine in Escherichia coli RR results in inactivation and decreased activation by dATP, respectively. Mutation of the less conserved Asp-287 to alanine in hRR leads to an inability to tune substrate specificity to effector identity (22). Mutating any of residues Tyr-285, Asp-287, Gln-288, or Arg-293 in loop 2 results in a mutator phenotype in S. cerevisiae (23,24). Thus, conserved residues in loop 2 are clearly important for function, but additional information is needed to further our understanding of how conserved and variable residues in loop 2 work together to transmit allosteric information between the S-site and C-site.
To further this goal, we systematically investigated patterns of phylogenetic sequence variation in eukaryotic RR enzymes and used this information to guide structure-function studies of loop 2 in vitro. A sequence alignment of 310 R1 subunits from Eukarya reveals two amino acids in loop 2 that consistently covary (N291G and P294K, human numbering). Individual mutations at these positions in hRR affect NDP discrimination, consistent with the canonical role of loop 2 in directing substrate specificity. However, the results also reveal that these positions are essential for dNTP-induced RR dimerization and suggest a surprising new role for loop 2 in mediating the long-range effects of ATP on activity.

Experimental procedures
Phylogenetic comparative sequence analysis
cDNA sequences of eukaryotic RR large subunits were retrieved from GenBank™. GenBank™ entries of putative and predicted proteins as well as proteins under 500 amino acid residues were excluded (25). In the two instances when duplicate entries existed for a single organism, the protein sequence that was most conserved within the organism's kingdom was chosen for further analysis. Amino acid sequence alignment was carried out using the Multiple Sequence Alignment feature of Clustal Omega (26,27). This multiple sequence alignment was used to produce conservation scores for hRR in ConSurf (28) and visualized using the PyMOL Molecular Graphics System, Version 1.8, Schrödinger, LLC. A phylogenetic tree of all 310 organisms queried was generated using phyloT and Interactive Tree of Life based on their taxonomy data from the National Center for Biotechnology Information (NCBI) (29).
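The exclusion criteria described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Record` type, the keyword list, and the function names are hypothetical, and the records are assumed to have been parsed from GenBank entries beforehand. The duplicate-entry rule (choosing the sequence most conserved within the organism's kingdom) is not implemented here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Record:
    """Hypothetical container for one retrieved R1 entry."""
    organism: str
    description: str
    sequence: str  # translated amino acid sequence


# Assumed keywords marking putative or predicted proteins.
EXCLUDED_WORDS = ("putative", "predicted")
MIN_LENGTH = 500  # proteins under 500 residues are excluded


def keep_record(rec: Record) -> bool:
    """Return True if the entry passes both exclusion criteria."""
    description = rec.description.lower()
    if any(word in description for word in EXCLUDED_WORDS):
        return False
    return len(rec.sequence) >= MIN_LENGTH


def filter_records(records: List[Record]) -> List[Record]:
    """Keep only entries that pass the annotation and length filters."""
    return [rec for rec in records if keep_record(rec)]
```

In practice the surviving records would then be written to a FASTA file and passed to the alignment program.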

Mutagenesis and purification of human ribonucleotide reductase
Protein purification was conducted as described previously (22). Site-directed mutagenesis of hR1 was conducted essentially as described previously (22). The N291G mutant was generated using primers 5′-GATCAAGGTGGGGGCAAGC-GTCCTGG-3′ and 5′-CACATATCTAGCTGTGTTGTT-

Circular dichroism (CD) spectroscopy
CD spectroscopy was carried out essentially as described in Davis et al. (30). Briefly, stocks of hR1 in 50 mM Tris, pH 8.0, 5 mM MgCl2, 5% glycerol, and 10 mM DTT were diluted into a >10-fold excess of 25 mM potassium phosphate dibasic, pH 7.5, for a final volume of 800 µl and a final protein concentration of 0.1 mg/ml. Initial CD spectra were recorded at 20°C on an Applied Photophysics PiStar 180 spectrophotometer in a quartz cuvette with a path length of 0.5 mm. The sample was heated in 2°C increments and held at each temperature for 30 s before recording each spectrum. Samples were heated to at least 86°C to ensure that thermal melting was complete. Fitting was performed using Origin 8. Melting curves were fit to Equation 1, where Y is the measured ellipticity, ΔHm is the enthalpy at the unfolding transition, Tm is the melting temperature in kelvin, and R is the universal gas constant (31). The pre-transition baseline is described by its slope and intercept, m_n and y_n, respectively. Likewise, m_d and y_d describe the slope and intercept of the post-transition baseline.
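Equation 1 is not reproduced in this text. The sketch below shows a commonly used two-state unfolding model with linear baselines that matches the parameters defined above. It is offered as an assumption about the general form, not as the authors' exact equation. It treats m_n and m_d as baseline slopes, y_n and y_d as baseline intercepts, and ΔHm as independent of temperature.

```python
import math

R_GAS = 8.314  # universal gas constant, J/(mol*K)


def two_state_ellipticity(T, dH_m, T_m, m_n, y_n, m_d, y_d):
    """Ellipticity Y at temperature T (kelvin) for a two-state transition.

    Assumed form: the unfolding equilibrium constant follows the
    van't Hoff relation K = exp[(dH_m / R) * (1/T_m - 1/T)], and the
    signal is the population-weighted sum of two linear baselines.
    """
    K = math.exp((dH_m / R_GAS) * (1.0 / T_m - 1.0 / T))
    native = y_n + m_n * T        # pre-transition baseline
    denatured = y_d + m_d * T     # post-transition baseline
    return (native + denatured * K) / (1.0 + K)
```

At T = Tm the model returns the midpoint between the two baselines, which is how Tm is identified from the fitted curve. A nonlinear least-squares routine would adjust the six parameters to match the measured ellipticity at each temperature.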

Analyses of hRR catalytic activity and NDP substrate specificity
Kinetic assays were conducted as described previously (22). Assays were performed at 37°C in 50 mM glycylglycine, pH 7.7, 15 mM MgCl2, and 20 mM DTT. Protein concentrations were 0.5 µM R1 and 5 µM R2. Substrates ADP, CDP, GDP, and UDP were typically present at 0.6 mM. Effectors were present at concentrations indicated in the individual figures. Two aliquots were removed before initiating the reaction. These samples were used to empirically measure the substrate concentrations by high performance liquid chromatography (HPLC) and to verify the efficacy of subsequent boronate chromatography to remove unreacted substrate. Reactions were incubated for 30 min. All aliquots were quenched by rapid freezing to −80°C. Substrates were removed by boronate chromatography essentially as described in Hendricks and Mathews (32). In this procedure, the reaction is stopped by rapid freezing. The aliquot is then thawed by dilution into acidic chromatography buffer on ice and promptly injected, precluding additional product formation. Substrate and product concentrations were quantified by anion exchange HPLC as in Hendricks and Mathews (32). Slow hydrolysis of dNTP effectors occasionally precludes accurate quantification of rates of product formation for poor substrates. For example, dGTP degrades to dGDP, which is chemically identical to dGDP produced through reduction of GDP. Such cases are noted where applicable in individual figure legends.
Reactions were kept under steady-state conditions (fraction of reaction <10%). Substrate concentration and vobs data were combined via Equation 2 (33-36) to yield rk, the ratio of each substrate's kcat/Km to that of the reference substrate.
The specificity of hRR is thus quantified by comparison of rk values. In all cases, specific activity is reported as the enzymatic activity of the R1 subunit across all four NDP substrates.
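Equation 2 is not reproduced in this text. For substrates competing for the same enzyme at steady state, the ratio of observed rates equals the ratio of (kcat/Km) × [substrate], so a relative kcat/Km can be obtained from rates and concentrations alone. The sketch below assumes that standard relation. The function name and the example numbers are hypothetical.

```python
def relative_specificity(v_obs, conc, reference):
    """Relative kcat/Km (rk) for each substrate versus a reference.

    v_obs: dict of substrate -> observed rate of product formation
    conc:  dict of substrate -> measured substrate concentration
    Assumes v_A / v_B = (kcat/Km)_A [A] / ((kcat/Km)_B [B]),
    which holds for competing substrates at low fractional conversion.
    """
    ref_term = v_obs[reference] / conc[reference]
    return {s: (v_obs[s] / conc[s]) / ref_term for s in v_obs}


# Hypothetical rates (arbitrary units) and concentrations (mM).
rates = {"CDP": 2.0, "GDP": 1.0}
concs = {"CDP": 0.5, "GDP": 1.0}
rk = relative_specificity(rates, concs, reference="CDP")
```

By construction the reference substrate has rk = 1, so the other values read directly as fold preference relative to it.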

Analysis of hRR oligomerization by size exclusion chromatography (SEC)
SEC was carried out essentially as in Fairman et al. (9). Briefly, 100 µl of 4 µM hR1 in 50 mM Tris, pH 7.6, 5 mM MgCl2, 100 mM KCl was mixed with 0-4 µl of stock NTP/dNTP solution. All nucleotide stocks were 100 mM from New England BioLabs. When dGTP or dTTP was used, the final nucleotide concentration was 10 µM. When ATP was used, the final nucleotide concentration was 1 mM. The mixture was centrifuged for 10 min at 20,000 × g and 25 ± 3°C. The mixture was injected onto a Superdex 200 10/300 size exclusion column on a Shimadzu LC-20AB chromatograph equilibrated with 50 mM Tris, pH 7.6, 5 mM MgCl2, 100 mM KCl, and a concentration of ligand equal to that in the sample. SEC was carried out at 25 ± 3°C. The flow rate was 0.25 ml/min, and absorbance was recorded at 290 nm. Peak retention time was correlated with apparent molecular weight using the retention times of a set of standards (Sigma Molecular Weight Marker kit). Peak fitting to multiple Lorentzian peaks was performed using Origin 8.
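The correlation between retention time and apparent molecular weight is commonly made with a calibration line of log10(molecular weight) against retention time. The sketch below assumes that form, which the text does not specify. The function names are hypothetical, and the standards in the usage example are invented numbers, not values from the marker kit.

```python
import math


def fit_calibration(standards):
    """Least-squares line through (retention time, log10 MW) points.

    standards: list of (retention_time_min, molecular_weight_kDa).
    Returns (slope, intercept) of log10(MW) = slope * t + intercept.
    """
    times = [t for t, _ in standards]
    logs = [math.log10(mw) for _, mw in standards]
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
    slope = sxy / sxx
    return slope, mean_l - slope * mean_t


def apparent_mw(retention_time, slope, intercept):
    """Apparent molecular weight (kDa) at a given retention time."""
    return 10 ** (slope * retention_time + intercept)


# Invented standards lying exactly on a line, for illustration only.
slope, intercept = fit_calibration([(40.0, 1000.0), (50.0, 100.0), (60.0, 10.0)])
```

Larger species elute earlier, so the fitted slope is negative. Apparent molecular weights from SEC also depend on molecular shape, so they are best read as assignments to oligomeric states rather than exact masses.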

Phylogenetic comparative sequence analysis revealed loop 2 positions that covary in eukaryotic RR
To better understand the extent of sequence variation in eukaryotic RR and the loop 2 region in particular, we conducted an alignment of 310 eukaryotic RR sequences retrieved from GenBank™ (Figs. 2 and 3; supplemental data). Putative or predicted proteins as well as those under 500 amino acids were excluded. With respect to the overall conservation of the eukaryotic RR sequence, the hydrophobic core, the active site (C-site), and the ligand-binding sites involved in allosteric regulation (S-site and A-site) showed strong conservation as expected, whereas most surface-exposed residues were less conserved (Fig. 2, A and B). These patterns are consistent with the expectation of strong selection pressure to maintain the overall three-dimensional structure and functional ligand-binding properties of the protein.

Previous alignments of RRs have shown that loop 2 is well conserved, consistent with its known role in allosteric regulation (16, 19, 37-39). However, the current analysis revealed important differences among enzymes from different groups of eukaryotes. The alignment showed that three major types of eukaryotic loop 2 sequences exist. We designated them as type I, type II, and type III (Fig. 2, C and D). Type I and type II loop 2 sequences contained both the highly conserved Gln-288 and Arg-293 residues and Asp-287, which is conserved in eukaryotes but variable in RR enzymes from bacteria (22). In contrast, type III sequences did not show significant homology to type I or type II RR and in particular did not include the hallmark Gln-288 or Arg-293 residues. The lack of significant homology between type III and I/II loop 2 sequences makes it difficult to draw functional correlations, particularly because no structural information exists for these enzymes. Therefore, more detailed inquiry was directed at type I and type II sequences.
Figure 2. A, crystal structure model of the hRR large subunit homodimer (PDB ID 3HND, 3.21 Å resolution; ATP was modeled into the figure by aligning this structure to PDB ID 3HNE). The protein backbone is shown as ribbons colored by conservation score (see the key). Yellow is used to indicate that insufficient data exist for that position. The S-site ligands (dTTP) are shown as blue spheres. The C-site ligands (GDP) are shown as green spheres. The A-site ligand (ATP) is shown as red spheres. B, as in panel A but with the protein represented as a van der Waals surface. C, representative examples of loop 2 diversity in eukaryotes. The group to which each organism belongs is listed to the right of its species name (A = Animalia; F = Fungi; Pl = Plantae; Pr = Protista). The loop 2 sequence of each organism is shown to the right of its group name. Positions 291 and 294 are indicated and in bold (human numbering). Deviations from the human loop 2 sequence are shown in red. The species name and loop 2 sequence are surrounded by a blue box if the loop 2 sequence is of type I (identical to hRR), a green box if the loop 2 sequence is of type II (N291G and P294K), and an orange box if the loop 2 sequence has the N291G substitution only (human numbering). D, pie charts showing the percentage of organisms from each group with each major type of loop 2 sequence. Type I and type II loop 2 sequences are as described in C. GP denotes loop 2 sequences with the N291G substitution only. Other denotes sequences with the conserved glutamine and arginine residues that have some other sequence at the remaining residues. Type III denotes putative loop 2 sequences that lack the conserved glutamine and arginine residues.
Importantly, type I and II sequences were distinguished by two sites of surprising non-conservative amino acid substitutions in loop 2. hRR is representative of type I sequences, which have an asparagine at position 291 and a proline at position 294 (human numbering). Type II sequences had a glycine at position 291 and a lysine at position 294 (human numbering). These changes were highly non-conservative, and the fact that the observed variation took place at glycine and proline residues is particularly surprising. The biophysical properties of these two residue types are markedly different from those of the other 18 canonical amino acid residues, and they can be considered to be "punctuation marks" in protein structure (40).
To facilitate comparison between loop 2 diversity and the eukaryotic tree of life, we divided the sampled organisms into animals, plants, fungi, and protists. We recognize that Protista is an obsolete phylogenetic group, and when mapping organisms onto a eukaryotic tree of life we made use of the most current information (Fig. 3). However, in Fig. 2 we made use of the term Protista for convenience because of the relative undersampling of microbial eukaryotes. Most animals and fungi have type I loop 2 sequences, whereas most plants and protists have type II loop 2 sequences (Fig. 2, C and D) (25). A notable exception to the covariation at positions 291 and 294 is Candida albicans, whose RR has a glycine at position 291 but a proline at position 294 (the so-called "GP" loop 2). Furthermore, almost all flies in this data set have "GP" loop 2 sequences, suggesting that this variant arose independently in their common ancestor (Fig. 3). Conversely and importantly, loop 2 sequences that harbor an asparagine at position 291 and a lysine at position 294 (human numbering) were not observed (supplemental Fig. S1). These data suggest that the common ancestor of eukaryotes may have had a type II loop 2 sequence. The data also raise the possibility that mutation of Lys-294 to proline occurred relatively early in the common ancestor of Unikonts (the group that includes animals and fungi), which then allowed for Gly-291 to mutate to asparagine (human numbering).
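The loop 2 categories used in this analysis can be expressed as a simple classification over four aligned positions. The sketch below is a simplification under stated assumptions: it inspects only positions 288, 291, 293, and 294 (human numbering), whereas type I is defined in Fig. 2 as identical to the hRR loop 2 across the whole loop. The function names and the input format are hypothetical.

```python
from collections import Counter


def classify_loop2(residues):
    """Classify a loop 2 from residues at human-numbered positions.

    residues: dict mapping position (288, 291, 293, 294) to a
    one-letter amino acid code taken from the alignment column.
    """
    # Type III lacks the hallmark Gln-288 and Arg-293 residues.
    if residues.get(288) != "Q" or residues.get(293) != "R":
        return "type III"
    pair = (residues.get(291), residues.get(294))
    if pair == ("N", "P"):
        return "type I"
    if pair == ("G", "K"):
        return "type II"
    if pair == ("G", "P"):
        return "GP"
    # Includes the N291/K294 combination, which was not observed.
    return "other"


def tally(loops):
    """Count how many sequences fall into each loop 2 category."""
    return Counter(classify_loop2(loop) for loop in loops)
```

Tallying the (291, 294) pairs per taxonomic group in this way is what exposes the covariation: N/P and G/K dominate, G/P is rare, and N/K is absent.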
The metabolic properties and genomic GC contents of the organisms included in the present analysis do not appear to correlate with loop 2 sequence in a way that suggests an adaptation to an environmental pressure on evolutionary fitness. Plasmodium falciparum, the malarial parasite, is a member of Apicomplexa and is known for having an unusually low GC content in its DNA (41). Trypanosoma brucei, the cause of African sleeping sickness, is another parasitic protist whose RR reduces a comparatively large amount of UDP in the presence of dATP (42). It is possible that a type II loop 2 allows for adaptations in RR specificity regulation that are advantageous for the lifestyles of particular organisms. However, many plants have GC contents that are closer to average and would thus likely require RR enzymes with specificity similar to those present in animals and fungi (43).
Available structural data offer clues into the potential roles of amino acid residues Asn-291 and Pro-294. For example, Asn-291 interacts with neither the substrate nor the effector in the structural model but appears to participate in an H-bonding network with the other amino acid residues in the loop. However, Asn-291 also has the potential to participate in a crystal contact in the structural model of hRR, potentially confounding interpretation of its role in enzyme function (supplemental Fig. S2). In S. cerevisiae RR, Pro-294 forms part of the active site and appears to hold it in a conformational ensemble that is permissive to substrate ingress (15). Yet it is not obvious from inspection of current structures how the observed variation at these two positions would affect RR activity nor how covariation would result in a new and compensatory structure-function relationship (19).
Because positions 291 and 294 are phylogenetically linked, determining the properties of single and double mutants at these positions in the background of the biomedically important and comparatively well-studied human enzyme is likely to provide insight into these amino acid contributions to RR activity, substrate specificity, and allosteric regulation by oligomerization. Accordingly, we generated N291G, P294K, and N291G/P294K mutants of hRR. Because of the highly non-conservative nature of the substitutions involved in the present study, we considered the possibility that one or more of the mutant human proteins is misfolded and/or thermally unstable. However, at 20°C, wild-type hRR had a CD spectrum consistent with significant α-helical content and similar to the CD spectrum of murine RR (Fig. 4) (30). The three hRR loop 2 mutants had essentially identical CD spectra and Tm values within ~5°C of the wild-type enzyme (55-60°C). Thus, the functional effects of these mutations are likely to reveal only the results of local changes in loop 2 geometry, as opposed to gross effects on overall protein structure.

N291G and P294K point mutations in human RR loop 2 disrupted specificity but had complementary effects on activity
We determined the effects of loop 2 mutation on catalytic activity and NDP substrate specificity using alternative substrate kinetics as described previously (22). Briefly, the relative kcat/Km value (rk) for each NDP substrate was measured in reactions containing all four substrates by using boronate chromatography to remove unreacted NDP substrates and anion exchange HPLC to separate and quantify the resulting dNDP products. Using this approach, we found that N291G hRR is ~5-fold more active than wild-type hRR in the presence of dGTP or dTTP alone (Fig. 5, A and C). In contrast, the P294K mutation reduced catalytic activity by ~10-fold when either dGTP or dTTP was used as the effector. In the N291G/P294K double mutant these effects were offset, and its activity was comparable with that of the wild-type enzyme. Strikingly, both the N291G and P294K enzymes were inactive in the presence of 1 mM ATP, a concentration of S-site ligand that was sufficient to drive robust activity in the native enzyme. Under the same conditions the N291G/P294K double mutation rescued catalytic activity to ~10% of wild-type levels, with essentially the same NDP substrate specificity as the wild-type enzyme.
Comparison of the relative kcat/Km values for the four NDP substrates shows that N291G hRR processes significantly more CDP than wild type when dGTP is the effector. The P294K mutation also results in a higher proportion of CDP reduction than wild-type hRR under these conditions. However, compared with the effect of the P294K mutation on RR activity, the effect on specificity is relatively modest. The effects of each single mutation on dGTP-directed activity and specificity are additive in the double mutant. N291G also has decreased specificity for GDP when dTTP is used as the effector. P294K hRR processes primarily GDP like the wild-type enzyme, although quantitative analysis of specificity is limited due to effects of this mutation on activity. Similar to the results observed in the presence of dGTP, the effects of each mutation on dTTP-directed activity and specificity are additive in the N291G/P294K double mutant. Thus, N291G, and to a lesser extent P294K, result in effects on specificity that are additive in the N291G/P294K double mutant. In addition, the P294K mutation causes a defect in overall activity that is rescued in the N291G/P294K enzyme.

The P294K mutation caused defects in oligomerization that are rescued by the N291G mutation
To further investigate the basis for the defect in catalytic activity induced by P294K mutation in hRR and the robust compensatory rescue in the double mutant protein, we tested the effects of loop 2 mutations on effector-induced dimerization. It is known that the minimal active form of RR is a dimer, so observed dimerization defects could potentially explain variations in activity (3). Alternatively, the P294K mutant could be capable of dNTP binding and dimerization but have perturbed local structure resulting in loss of NDP binding or catalysis. In this case the N291G mutation could allow for compensatory interactions that correct local geometry at the active site. To distinguish between these possibilities, we used SEC to determine the oligomeric state of hRR and the three loop 2 mutants in the presence of 10 µM dGTP or dTTP and in the presence of 1 mM ATP (Fig. 6). dGTP and dTTP were included at a concentration of 10 µM because this concentration is approximately 1 order of magnitude greater than the dissociation constants for these effectors at the S-site (44). ATP was included at a concentration of 1 mM for the sake of consistency with enzyme activity assay conditions and because resolution suffered at higher concentrations. The wild-type elution profile was essentially unchanged at 3 mM ATP (supplemental Fig. S3).
Figure 4. CD of hRR variants. A, overlaid CD spectra for all enzyme variants used in this study. These spectra were taken at 20°C. B, thermal melting of hRR mutants. All variants of hRR were subjected to thermal melting in 2°C increments, and the mean residue ellipticity at 210 nm was monitored. Tm values are displayed with each transition.
Figure 5. [...] (22). A, activity of the hRR variants in the presence of 50 µM dGTP. The specific activity of wild-type hRR is 0.0055 ± 0.0017 mol/(s × mol of hRRM1) (9). Data are reported as the ratio of the specific activity of a given enzyme variant to that of wild type. B, specificity of the hRR variants in the presence of 50 µM dGTP. Specificity is defined as in Knappenberger et al. (22) and is the ratio of the kcat/Km for a given substrate to that of a reference substrate. ‡ indicates a product that was not present in sufficient quantity to accurately measure its formation. † indicates a product that coelutes with a hydrolysis product from an effector (here dGDP from slow dGTP hydrolysis). C, activity of the hRR variants in the presence of 50 µM dTTP. The specific activity of wild-type hRR is 0.0066 ± 0.00052 mol/(s × mol of hRRM1) (9). Data are reported as the ratio of the specific activity of a given enzyme variant to that of wild type. D, specificity of the hRR variants in the presence of 50 µM dTTP. Specificity is defined as in Knappenberger et al. (22) and is the ratio of the kcat/Km for a given substrate to that of a reference substrate. ‡ indicates a product that was not present in sufficient quantity to accurately measure its formation. E, activity of the hRR variants in the presence of 1 mM ATP. The specific activity of wild-type hRR is 0.070 ± 0.008 mol/(s × mol of hRRM1) (9). Data are reported as the ratio of the specific activity of a given enzyme variant to that of wild type. Wild-type hRR prefers CDP over UDP by >10-fold (22). N291G/P294K hRR only appreciably reduced CDP. The limit of detection for this assay is ~1 × 10⁻⁴ mol/(s × mol of hRRM1). Results are also available in numerical form (supplemental Table S1).
Figure 6. SEC of hRR variants in the presence of dGTP, dTTP, or ATP. A, a simple model depicting S-site ligand binding and oligomerization in hRR. The protein is represented by large circles; the ligand is represented by small black circles. hRR converts from a monomeric form (red circles) to a dimeric form (green circles) in the presence of dGTP, dTTP, or ATP. ATP also binds at the A-site to trigger formation of hexamers (purple) and intermediate species that migrate as apparent tetramers (blue) (3). The species are shown in the order in which they elute from a size exclusion column. B, representative SEC chromatograms of hRR variants. Each square represents a representative SEC run. The rows represent the protein variants, whereas the columns represent the running conditions. The raw data are shown as black dots. The fitted peaks are shown in colors that correspond with oligomeric states in A. The extent of partitioning into each non-monomer state, if appreciable, is indicated in each square. The P294K mutant is indicated by a dotted gray outline.
In the absence of effectors, all enzyme variants exhibit retention times consistent with the monomeric form of the enzyme, as expected from previous studies (5,9). In the presence of 10 µM dGTP or dTTP, all four enzymes dimerize. However, the extent of P294K dimerization was reduced ~3-fold relative to wild type. Importantly, the results demonstrate that the N291G mutation fully rescues dimerization in the N291G/P294K double mutant, restoring the extent of its dimerization to that of wild type. This observation is consistent with the kinetic results in the presence of dGTP or dTTP in which all enzyme forms showed activity near wild-type levels except for P294K (Fig. 5, A and C). The P294K mutant also has deficient oligomerization in the presence of 1 mM ATP; the extent of hexamerization and the formation of intermediate molecular weight species are reduced ~2-fold. The N291G mutation restores P294K oligomerization to wild-type levels in the presence of ATP, but this result did not correlate with activity levels in the presence of 1 mM ATP (Fig. 5E). The correlation between SEC and activity results in the presence of dGTP or dTTP suggests deficient dimerization as a plausible contributor to the P294K variant's deficient activity under those conditions. In contrast, the lack of correlation between the two techniques in the presence of 1 mM ATP excluded oligomerization as a potential explanation for the two single mutants' deficient activity. Instead, the data point to potential local perturbations in the cross-talk between the S-site and the active site. Thus, these results show that multiple systems within hRR are perturbed when these two positions in loop 2 are mutated.
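One way to turn fitted Lorentzian peaks into an extent of partitioning is to compare peak areas. The sketch below assumes that approach, which the text does not spell out. A Lorentzian with height h and half-width at half-maximum w has area π·h·w. Because absorbance tracks protein mass, the area fractions here approximate the fraction of total protein in each state, not the mole fraction of each oligomer. All names and numbers are hypothetical.

```python
import math


def lorentzian(x, height, center, hwhm):
    """Value of a Lorentzian peak at x (hwhm = half-width at half-max)."""
    return height * hwhm ** 2 / ((x - center) ** 2 + hwhm ** 2)


def lorentzian_area(height, hwhm):
    """Analytical area under a Lorentzian peak."""
    return math.pi * height * hwhm


def partition_fractions(peaks):
    """Fraction of total fitted area under each named peak.

    peaks: dict of state name -> (height, center, hwhm) from the fit.
    """
    areas = {name: lorentzian_area(h, w) for name, (h, _, w) in peaks.items()}
    total = sum(areas.values())
    return {name: area / total for name, area in areas.items()}


# Hypothetical fitted parameters: (height, retention time in min, hwhm).
fractions = partition_fractions({
    "monomer": (1.0, 56.0, 1.0),
    "dimer": (3.0, 52.0, 1.0),
})
```

With these invented parameters, three quarters of the fitted area falls under the dimer peak. A fold reduction in dimerization between two variants would then be the ratio of their dimer fractions under the same ligand condition.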

N291G hRR recognized the same effector nucleobase functional groups as wild-type hRR
Only a limited subset of effector nucleobase functional groups is recognized by the S-site in order to drive specificity at the C-site (22). To test whether the altered specificity of the N291G mutant arises due to new adventitious contacts with the dNTP effector, we assayed the specificity of this protein in the presence of three effector analogs: 2-aminopurine deoxyribonucleotide triphosphate (2-aminopurine-drTP), deoxyinosine triphosphate (dITP), and deoxyzebularine triphosphate (dZeb) (Fig. 7). The N1 of ATP drives specificity, but the N6 is dispensable in wild type. In the presence of 2-aminopurine-drTP, N291G hRR has specificity similar to wild-type hRR with ATP or 2-aminopurine-drTP. The N2 of dGTP is also dispensable for allosteric regulation, and wild-type hRR recognizes dITP as if it were dGTP. In the presence of dITP, N291G hRR directs reduction of CDP followed closely by ADP, similar to its specificity in the presence of dGTP. In the presence of dZeb, wild-type hRR reduced GDP and CDP with approximately equal efficiency. The N291G variant gave a similar result but with a greater preference for CDP than wild type. N291G hRR also reduced more CDP than wild type in the presence of dTTP. Thus, the effects of introducing the N291G mutation and introducing the chemical mutation of dTTP to dZeb are roughly additive. This additivity suggests that specificity perturbations do not derive from effector recognition via new adventitious contacts but through impaired information transfer once canonical contacts are established.

Figure 7. [...] (22). The chemical structures for each effector analog are listed. R denotes the deoxyribose triphosphate moiety. Specificity is defined as in Fig. 5. ‡ indicates a product that was not present in sufficient quantity to accurately measure its formation. † indicates a product that coelutes with a hydrolysis product from an effector (here 2′-deoxyzebularine 5′-diphosphate from slow dZeb hydrolysis). * indicates a substrate that was not included. Δ r k denotes the difference between the specificity of wild-type and N291G hRR under identical conditions. Negative values indicate that the N291G mutant prefers the substrate less than wild type; positive values indicate that the opposite is true. For example, if a substrate is not appreciably processed by wild-type hRR but is the favored substrate for N291G hRR, the difference has a value of 1. Activity data are reported as the ratio of the specific activity of N291G hRR to that of wild type. The specific activity of N291G hR1 in the presence of 100 µM 2-aminopurine-drTP is 0.061 ± 0.011 mol/(s × mol of hRRM1); 100 µM dITP, 0.034 ± 0.0073 mol/(s × mol of hRRM1); dZeb, 0.0074 ± 0.0023 mol/(s × mol of hRRM1).

dGTP binding to N291G/P294K hRR and dTTP binding to P294K hRR converted the allosteric activator ATP into a dATP-like negative allosteric regulator
To investigate the effects of loop 2 mutations on allosteric activation by ATP, we measured the activity and substrate specificity of the mutants in the presence of 1 mM ATP and either 0.75 mM dGTP or 1.6 mM dTTP (Fig. 8) (22). Although hRR can bind ATP at both the S-site and the A-site, under these conditions the higher affinity of the S-site for deoxynucleotides (K D values of ~1 µM versus ~150 µM) results in the dNTPs dominating occupancy of the S-site (5). This conclusion is also supported by the present observation that the presence of 1 mM ATP did not result in increased specificity for CDP when compared with reactions containing either dGTP or dTTP alone (compare Figs. 5 and 8).
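The occupancy argument above can be checked with a back-of-the-envelope calculation. The following is a minimal sketch, assuming simple competitive binding of two ligands at a single site and free ligand concentrations approximately equal to the totals; the function name `s_site_occupancy` is hypothetical, and the K D values are the approximate figures quoted in the text rather than fitted constants.

```python
def s_site_occupancy(conc_uM, kd_uM):
    """Fractional occupancy of a single site by competing ligands.

    Assumes simple competitive binding and that free ligand concentrations
    are close to the totals (ligand in large excess over enzyme).
    """
    # Each ligand contributes a statistical weight of [L] / K_D.
    weights = {name: conc_uM[name] / kd_uM[name] for name in conc_uM}
    denom = 1.0 + sum(weights.values())
    occupancy = {name: w / denom for name, w in weights.items()}
    occupancy["empty"] = 1.0 / denom
    return occupancy


# Assay concentrations from the text (1 mM ATP, 0.75 mM dGTP), in micromolar.
# K_D values are the approximate ones quoted above (~1 uM dNTP, ~150 uM ATP).
occupancy = s_site_occupancy(
    conc_uM={"dGTP": 750.0, "ATP": 1000.0},
    kd_uM={"dGTP": 1.0, "ATP": 150.0},
)
for ligand, fraction in occupancy.items():
    print(f"{ligand}: {fraction:.3f}")
```

Under these assumptions dGTP accounts for roughly 99% of S-site occupancy and ATP for less than 1%, which is in line with the statement that the dNTPs dominate the S-site under these conditions.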
As expected, the overall activity of wild-type hRR is stimulated 5- to 10-fold by ATP binding to the A-site. In contrast, the N291G mutant was inhibited ~5-fold. The P294K mutant was stimulated by ATP when dGTP was the S-site effector, similar to wild-type hRR. Unexpectedly, although the N291G/P294K double mutant had robust catalytic activity in the presence of dGTP or ATP alone, the reaction was quenched when both were present. This effect was clearly not due to competition for ligand binding at the S-site, as the N291G/P294K double mutant had activity within ~10-fold of wild-type hRR in the presence of 1 mM ATP or 50 µM dGTP. Thus, the presence of both N291G and P294K mutations in hRR converted ATP, which stimulated enzyme activity in wild-type hRR, into an inhibitor. P294K mutant hRR was inactive in the presence of ATP and dTTP, thus revealing a second set of conditions in which ATP binding was interpreted as a negative allosteric signal. In contrast to the results in the presence of 1 mM ATP alone (Fig. 5E), results from the present set of experiments were best explained by a mechanism in which long-range communication between the A-site and the active site was disrupted by mutations in loop 2.

Discussion
Despite the wide range of lifestyles and genomic DNA GC content among eukaryotes, RR enzymes are well-conserved across this domain of life (Fig. 2, A and B) (43). Analysis of the alignment of eukaryotic RRs reported here (Figs. 2 and 3, supplemental data) reveals that eukaryotic loop 2 sequences primarily occur as one of two major types. Type I sequences are exemplified by hRR loop 2 and have an asparagine at position 291 and a proline at position 294 (human numbering). Type II loop 2 sequences are identical at almost all other positions yet harbor the highly non-conservative substitutions N291G and P294K. Most plants and protists have type II loop 2s, whereas most animals and fungi have type I loop 2s. The present bioinformatics and biochemical data together suggest that type II was the ancestral form of RR loop 2 in eukaryotes (Fig. 3). The glycine-to-asparagine and lysine-to-proline mutations likely occurred in the common ancestor of Unikonta before the group differentiated into its earliest members. A small number of loop 2 sequences of an intermediate genotype, exemplified by most members of the Candida genus, have a glycine at position 291 but retain a proline at position 294 (human numbering) (Figs. 2C and 3). The fact that C. albicans has an intermediate type of loop 2 sequence is of particular interest because of the clinical significance of the organism (45,46). Almost all species of flies that were sampled also have the single N291G mutation in the absence of a corresponding P294K substitution (Fig. 3). A small group of algae including Nannochloropsis gaditana also evolved [...] is less favorable to hRR enzyme function, it is thus likely that the type I loop 2 arose from mutation of the lysine residue followed by mutation of the glycine residue rather than the converse (Fig. 9A).

Figure 8. [...] (22). A, activity of the hRR variants in the presence of 1 mM ATP and 0.75 mM dGTP. The specific activity of wild-type hRR is 0.030 ± 0.0041 mol/(s × mol of hRRM1) (9). Data are reported as the ratio of the specific activity of a given enzyme variant to that of wild type. The limit of detection for this assay is ~1 × 10⁻⁴ mol/(s × mol of hRRM1). B, activity of the hRR variants in the presence of 1 mM ATP and 1.6 mM dTTP. The specific activity of wild-type hRR is 0.021 ± 0.0040 mol/(s × mol of hRRM1) (9). Data are reported as the ratio of the specific activity of a given enzyme variant to that of wild type. C, the extent of activation or inhibition of dGTP-bound hRR by ATP. Data represent the ratio of the activity in the presence of ATP (panel A) to the activity in the presence of dGTP but not ATP (Fig. 5A). D, the extent of activation or inhibition of dTTP-bound hRR by ATP. Data represent the ratio of the activity in the presence of ATP (panel B) to the activity in the presence of dTTP but not ATP (Fig. 5C). In panels C and D, numbers greater than unity represent activation, whereas numbers less than unity represent inhibition. E, specificity of the hRR variants in the presence of 1 mM ATP and 0.75 mM dGTP. Specificity is defined as in Fig. 5 and is the ratio of the k cat /K m for a given substrate to that of a reference substrate. ‡ indicates a product that was not present in sufficient quantity to accurately measure its formation. † indicates a product that coelutes with a hydrolysis product from an effector (here dGDP from slow dGTP hydrolysis). F, specificity of the hRR variants in the presence of 1 mM ATP and 1.6 mM dTTP. Specificity is defined as in Fig. 5 and is the ratio of the k cat /K m for a given substrate to that of a reference substrate. ‡ indicates a product that was not present in sufficient quantity to accurately measure its formation. Results are also available in numerical form (Table S-
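The activation ratios described for panels C and D of Fig. 8 are simple quotients: activity with ATP divided by activity with the same dNTP effector but without ATP. The sketch below illustrates that definition only. The function name `atp_effect` and all numerical inputs are hypothetical placeholders, not measurements from this study.

```python
def atp_effect(activity_with_atp, activity_without_atp):
    """Return (ratio, label) describing ATP's effect on a dNTP-bound enzyme.

    A ratio greater than unity is read as activation and a ratio less than
    unity as inhibition, following the convention stated for panels C and D.
    """
    if activity_without_atp <= 0:
        raise ValueError("reference activity must be positive")
    ratio = activity_with_atp / activity_without_atp
    if ratio > 1.0:
        label = "activation"
    elif ratio < 1.0:
        label = "inhibition"
    else:
        label = "no effect"
    return ratio, label


# Hypothetical specific activities (with ATP, without ATP) for two
# illustrative enzyme variants; these are NOT values from the paper.
examples = {
    "variant_a": (0.030, 0.005),
    "variant_b": (0.002, 0.010),
}
for name, (with_atp, without_atp) in examples.items():
    ratio, label = atp_effect(with_atp, without_atp)
    print(f"{name}: {ratio:.1f}-fold ({label})")
```

Activities near the assay's limit of detection would make such ratios upper or lower bounds rather than point estimates; the sketch does not attempt to handle that case.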
The bioinformatic, biochemical, and biophysical results reported here provide insight into the functional roles of two key residues in hRR loop 2. Asn-291 is not proposed to contact either the substrate or the effector in current structural models of eukaryotic RR, yet the N291G mutant of hRR is impaired in its ability to recognize purine substrates (Fig. 5). Similarly, Pro-294 apparently plays a limited role in substrate recognition; its only contact with a substrate is a second-sphere water-mediated contact with ADP in S. cerevisiae (15). Nonetheless, the conformation of Pro-294 is altered by binding of different effector-substrate pairs. The substrate specificity observed in the presence of nucleotide effector analogs rules out the possibility that N291G recognizes different effector functional groups to drive specificity. In particular, results in the presence of dZeb exclude that possibility because the N291G substitution and the dZeb substitution independently perturb specificity. If the protein and effector substitutions perturbed the same set of contacts, introducing both substitutions would be expected to produce a specificity similar to that produced by either substitution alone. The results reported here are, therefore, consistent with previous proposals that Asn-291 is involved in a network of contacts among loop 2 residues that helps transmit information from the S-site to the C-site (9, 14-16).
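The additivity reasoning in the preceding paragraph can be expressed as a comparison between two simple predictions. The following is a minimal sketch, assuming specificity is expressed as a fraction between 0 and 1 for one substrate; the function name `classify_combination`, the tolerance, and the example numbers are all hypothetical and are not taken from the data in this study.

```python
def classify_combination(baseline, mutant_only, analog_only, both, tol=0.05):
    """Compare a combined perturbation with two simple expectations.

    baseline    : specificity for wild-type enzyme with the natural effector
    mutant_only : specificity with the protein substitution alone
    analog_only : specificity with the effector analog alone
    both        : specificity with both changes present
    """
    d_mut = mutant_only - baseline
    d_analog = analog_only - baseline
    d_both = both - baseline

    # Independent perturbations: the individual shifts simply add.
    additive_prediction = d_mut + d_analog
    # Overlapping perturbations: the combined shift resembles the larger
    # of the two individual shifts rather than their sum.
    overlapping_prediction = max(d_mut, d_analog, key=abs)

    if abs(d_both - additive_prediction) <= tol:
        return "additive (independent perturbations)"
    if abs(d_both - overlapping_prediction) <= tol:
        return "non-additive (overlapping perturbations)"
    return "neither simple model"


# Hypothetical CDP specificities, for illustration only.
print(classify_combination(0.10, 0.30, 0.45, 0.65))
print(classify_combination(0.10, 0.30, 0.45, 0.45))
```

The first call mimics the pattern reported for N291G combined with dZeb, where the shifts roughly sum; the second shows what a shared set of contacts would be expected to look like under this simplified model.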
Loop 2 is clearly the primary structural motif involved in differential substrate recognition (9, 14-16). However, a body of literature strongly suggests that RRs with divergent loop 2 sequences are able to discriminate among potential substrates in ways that are functionally equivalent and give rise to similar specificities (22,38,42,47,48). The results presented here further show that loop 2's function is intimately tied to the surrounding protein context. If loop 2 were perfectly modular with respect to allosteric communication between the S-site and C-site, the type II sequence (i.e. the N291G/P294K mutant) should necessarily serve as a perfect functional equivalent of the type I sequence (i.e. wild type). Because the defects in the N291G/P294K double mutant are additive, loop 2 function must also depend on other regions of the protein that work in concert to specify the correct substrate in the context of S-site ligand binding. Examination of phylogenetic data makes clear that functional equivalency in loop 2 is achieved with a variety of sequences and that the type I loop 2 asparagine and proline are not obligate requirements. Complementary biochemical data reveal the apparent necessity for additional mutations elsewhere in the protein that allow for variation within loop 2. Importantly, whenever hRR structure is changed through site-directed mutagenesis or effector structure is changed through chemical mutagenesis, the predominant effect on specificity is increased reduction of CDP (22). It follows that the most thermodynamically stable conformation of loop 2 favors reduction of CDP and that a key role of dGTP or dTTP binding is to contribute binding energy that RR harnesses to perturb the loop 2 conformation away from this "default" state.

Figure 9. [...] Type I loop 2 dominates Unikonta (animals, fungi, and some protists), whereas type II loop 2 dominates the rest of Eukarya (Fig. 3). GP loop 2 evolved independently in representatives of animals, fungi, and protists. NK loop 2 was not observed in this study of 310 eukaryotic RRs. B, compensatory effects of N291G and P294K mutations in hRR loop 2. The two mutations affect specificity in an additive way, but features of the enzyme including oligomerization, specific activity, and ATP stimulation are fully or partially rescued by addition of both substitutions in the same enzyme. N291G has a less severe effect on enzyme function when present individually, which corresponds with the patterns of loop 2 evolution observed in A.

Both loop 2 mutants retain some catalytic activity and are capable of oligomerization to some extent in the presence of S-site ligands. Although the effects of the N291G and P294K mutations on NDP specificity are additive, several key aspects of enzyme function disrupted by these point mutations are rescued in the double mutant (Fig. 9B). First, the catalytic activities of the single mutants in the presence of dGTP or dTTP offset one another in the double mutant, yielding activity similar to that of wild type (Fig. 5, A and C). Second, P294K's decreased ability to oligomerize in the presence of allosteric ligands was rescued in the double mutant (Fig. 6). Third, both single mutants are catalytically inactive in the presence of 1 mM ATP, whereas the double mutant showed activity rescued to ~10% of wild-type levels (Fig. 5E). The fact that rescue was observed for some, but not all, aspects of RR functionality suggests that loop 2 plays a variety of roles in RR. Those aspects that show rescued functionality are likely governed within loop 2; for aspects that fail to show rescue, it is likely that other regions of the protein are involved.
An important negative result is the observation that, for the most part, the loop 2 mutations described here do not appear to have large effects on ATP-induced oligomerization. Several groups have independently measured the oligomeric states of RRs from several species under a variety of conditions using a diverse array of techniques, including SEC as employed in the current study. Ando et al. (6) employed SAXS and electron microscopy to study the human enzyme. They found that both dATP and ATP cause formation of hexamers in the presence of substrates and that, notably, R2 can change the structure of dATP hexamers but not ATP hexamers. ATP can even induce formation of R1 filaments at high concentrations. Aye and Stubbe (49) and Wang et al. (17) showed using SEC that hRR can form hexamers, dimers, or a mix of oligomeric states depending on the precise reaction and chromatographic conditions used. Fairman et al. (9) employed SEC/MALS to observe that hRR forms dimers and hexamers in the presence of dATP. Kashlan et al. (5) examined murine RR via dynamic light scattering and sedimentation velocity experiments. They observed that murine RR may form tetramers or hexamers. Rofougaran et al. (10) examined the oligomeric state of human RR using SEC in a method that is similar to our own. They found that hRR forms a mixture of hexamers, dimers, and monomers in the presence of 3 mM ATP, which is only partially consistent with the results in the present study. There are several potential explanations that can account for this apparent discrepancy. The two sets of experiments make use of different methods of protein purification and different divalent ion concentrations. They may also differ in other variables such as flow rate and temperature. Part of the hexamer population in this study may be dissociating during the course of the SEC run, leading to the formation of a peak with the apparent molecular weight of a tetramer. However, the two studies agree on the point that ATP binding triggers formation of an ensemble of oligomers in hRR. Importantly, all comparisons made in this study use wild-type protein under identical conditions as a reference, and interpretation is restricted to examination of the differences in the behavior of wild-type and mutant proteins. Although potential dissociation during the chromatographic run precludes quantification, the fact that no significant observable differences are induced in the N291G or N291G/P294K mutants is consistent with minimal perturbation of ATP-induced oligomerization under steady-state reaction conditions. In contrast, the P294K mutation slightly reduces the accumulation of higher-order oligomeric states induced by ATP. This attenuation was rescued when the ATP concentration was increased to 3 mM (supplemental Fig. S3), suggesting that the dissociation constant for ATP at the S-site or A-site is slightly increased.
In sum, this diverse set of bioinformatic and in vitro experiments leads to the following conclusions. Loop 2 is well-conserved, yet variation is tolerated at positions 291 and 294. This variation tends to take the form of two major types of loop 2: type I and type II. Introducing mutations into hRR loop 2 to convert it from type I to type II did not disrupt hRR's overall secondary structure. In the presence of dGTP or dTTP alone, all enzyme variants are active, yet the P294K mutant has deficient activity. This is likely due to a partial failure to dimerize under similar conditions. In the presence of ATP alone neither single mutant is active, and this may be due to aberrant conformational ensembles in loop 2 that are partially restored in the double mutant. The presence of dGTP or dTTP in the S-site and ATP in the A-site has complex effects on the mutant enzymes, which may be due to disruption in long-range communication between the two allosteric sites. In all activity assays, the mutants showed small detrimental effects on specificity, suggesting that loop 2 works together with other regions of the enzyme to dictate substrate specificity. Throughout the biochemical experiments, P294K was more detrimental than N291G or N291G/P294K mutations, which correlates well with the observed bioinformatics data. Loop 2 is clearly involved in a wide range of processes in hRR, and mutations in loop 2 can have effects on both short-range and long-range allostery within the protein.
The results of this study implicate two amino acids in loop 2 as having important roles in diverse areas of enzyme function including overall activity, substrate specificity, allosteric regulation, and oligomerization. Some of these aspects were anticipated from prior studies, whereas others could not have been predicted from examination of previous biochemical or structural data. Because the entirety of loop 2 is well-conserved, it is likely that each individual residue plays a unique and key role in some aspect of enzyme function that is vital to biological fitness. The roles of Gln-288 and Arg-293 have been well-examined, and studies from our group have shed some light on residues Asp-287, Asn-291, and Pro-294. Future studies should examine the remaining positions of loop 2; e.g. Lys-292 and the almost universally conserved Gly-289. Further bioinformatics analysis of RR may also search for positions that covary similarly to positions 291 and 294. Continued thorough phylogenetic and structure-function analysis will likely reveal many surprising roles for individual amino acid residues in regulation of eukaryotic RR enzymes.
Author contributions: A. J. K. and M. E. H. conceived of the study, interpreted the data, and wrote the manuscript. A. J. K. performed the experiments with assistance from R. S.; S. G. and A. J. K. conducted the bioinformatics analysis. S. G., R. S., R. V., and M. F. A. participated in experiment planning and contributed to intellectual development of the manuscript.