Aminoacylating Urzymes Challenge the RNA World Hypothesis*

Background: RNA World scenarios require high initial fidelity, greatly slowing lift-off.
Results: Class I TrpRS and Class II HisRS Urzymes (120–130 residues) both acylate tRNAs ∼10⁶ times faster than the uncatalyzed peptide synthesis rate.
Conclusion: Urzymes appear highly evolved, implying that they had even simpler ancestors.
Significance: High Urzyme catalytic proficiencies imply that translation began in a Peptide·RNA World.

We describe experimental evidence that ancestral peptide catalysts substantially accelerated development of genetic coding. Structurally invariant 120–130-residue Urzymes (Ur = primitive plus enzyme) derived from Class I and Class II aminoacyl-tRNA synthetases (aaRSs) acylate tRNA far faster than the uncatalyzed rate of nonribosomal peptide bond formation from activated amino acids. These new data allow us to demonstrate statistically indistinguishable catalytic profiles for Class I and II aaRSs in both amino acid activation and tRNA acylation, over a time period extending to well before the assembly of full-length enzymes and even further before the Last Universal Common Ancestor. Both Urzymes also exhibit ∼60% of the contemporary catalytic proficiencies. Moreover, they are linked by ancestral sense/antisense genetic coding, and their evident modularities suggest descent from even simpler ancestral pairs also coded by opposite strands of the same gene. Thus, aaRS Urzymes substantially pre-date modern aaRS but are, nevertheless, highly evolved. Their unexpectedly advanced catalytic repertoires, sense/antisense coding, and ancestral modularities imply considerable prior protein-tRNA co-evolution. Further, unlike ribozymes that motivated the RNA World hypothesis, Class I and II Urzyme·tRNA pairs represent consensus ancestral forms sufficient for codon-directed synthesis of nonrandom peptides.
By tracing aaRS catalytic activities back to simpler ancestral peptides, we demonstrate key steps for a simpler and hence more probable peptide·RNA development of rapid coding systems matching amino acids with anticodon trinucleotides.

The RNA World Hypothesis (1,2) holds that RNA molecules, simultaneously playing both informational and catalytic roles, evolved extensively prior to the appearance of proteins. However, no biological ribozymes that might support the hypothesis participate in either replication or tRNA aminoacylation. Moreover, Koonin estimates that assembling the replicative fidelity necessary for an RNA-only origin would require multiple universes (3). tRNA aminoacylation, the defining reaction in codon-dependent translation, has not previously been demonstrated for aaRS fragments smaller than intact catalytic domains (4,5). Experimental acylation of mini- and microhelices derived from tRNAs (6,7) has been shown, however, for several intact aaRSs, suggesting that aaRS catalytic domains might have acylated cognate tRNAs using an earlier "operational RNA code" (8) focused on bases in the 3′ acceptor stems. We previously showed that ancestral peptides afford a realistic alternative to ribozymal catalysis by characterizing 120–130-residue Urzymes from both classes (Fig. 1) that accelerate cognate amino acid activation by ATP ∼10⁸-fold (9–11).
However, to have helped implement genetic coding (8), Urzyme-like peptide catalysts must also have acylated cognate tRNAs. Our previous work did not address that question. We demonstrate here that Class I TrpRS and Class II HisRS Urzymes indeed catalyze tRNA acylation at a far faster rate than is required for spontaneous, ribosome-independent peptide synthesis (12). These new data challenge the notion that genetic coding was established entirely by RNA molecules.

EXPERIMENTAL PROCEDURES
TrpRS (9) and HisRS (11) Urzymes were expressed and purified as described. Bacillus stearothermophilus tRNATrp and Escherichia coli tRNAHis in vitro transcripts were labeled with 32P at the 3′-terminal internucleotide linkage using the exchange reaction catalyzed by E. coli tRNA nucleotidyltransferase, as described (13). Labeled tRNA was purified on a Bio-Rad P30 spin column and then mixed with unlabeled tRNA to the desired final concentration. tRNA was refolded by heating to 80 °C for 2–3 min, followed by the addition of MgCl2 to a final concentration of 10 mM and slow cooling to ambient temperature over a period of 15–30 min. Aminoacylation reaction mixtures contained 50 mM HEPES, pH 7.5, 20 mM KCl, 10 mM MgCl2, 5 mM DTT, 10 mM ATP, 250 μM amino acid, enzyme (5 nM wild type; 1–5 μM Urzyme), and a range of tRNA concentrations (0, 0.5, 1.0, 2.5, 5.0, 10.0, and 15.0 μM). Reactions were run at 37 °C for variable lengths of time determined to be in the linear range; 1-μl aliquots were removed at varying time points and quenched in 400 mM sodium acetate, pH 5.2. Aminoacylated tRNA was digested with 0.1 mM P1 nuclease (Sigma), spotted on prewashed polyethyleneimine-cellulose TLC plates (Sigma), and developed in 100 mM ammonium acetate, 5% acetic acid. Dried TLC plates were quantified by phosphor imaging analysis. TLC profiles afford quantitative estimates for the amounts of acylated and unacylated A76 bases. The amount of acylated product at a given time is thus the initial tRNA concentration multiplied by the fraction of acylated A76. Steady-state kinetic parameters were determined using JMP (14) and corrected for the fraction of active enzyme (0.35–0.7) and aminoacylatable tRNA (0.2–0.55).

* This work was supported by NIGMS/National Institutes of Health Grants
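The quantification steps described above can be illustrated with a short Python sketch. It is not the authors' JMP workflow: the function names are hypothetical, the Hanes-Woolf linearization merely stands in for whatever fitting procedure was actually used, and the way the active-fraction corrections are applied (scaling the enzyme concentration by the active enzyme fraction and the Km by the aminoacylatable tRNA fraction) is an assumption.

```python
import numpy as np

def acylated_product(trna_uM, fraction_acylated):
    """Acylated tRNA (uM) = initial tRNA concentration x fraction of acylated A76."""
    return trna_uM * np.asarray(fraction_acylated, dtype=float)

def initial_rate(times_min, product_uM):
    """Slope of a straight-line fit of product vs. time (linear range only)."""
    slope, _intercept = np.polyfit(np.asarray(times_min, dtype=float),
                                   np.asarray(product_uM, dtype=float), 1)
    return float(slope)

def hanes_woolf_fit(s_uM, v):
    """Estimate Vmax and Km from the linearization s/v = s/Vmax + Km/Vmax.

    Points with s = 0 are dropped because s/v is undefined there.
    """
    s = np.asarray(s_uM, dtype=float)
    v = np.asarray(v, dtype=float)
    keep = s > 0
    slope, intercept = np.polyfit(s[keep], s[keep] / v[keep], 1)
    return 1.0 / slope, intercept / slope  # (Vmax, Km)

def corrected_parameters(vmax, km_uM, enzyme_uM, f_enzyme, f_trna):
    """Apply active-fraction corrections (assumed form, see lead-in)."""
    kcat = vmax / (enzyme_uM * f_enzyme)   # only active enzyme turns over
    km = km_uM * f_trna                    # only acylatable tRNA is substrate
    return kcat, km, kcat / km
```

For example, `acylated_product(10.0, [0.0, 0.1, 0.2])` converts TLC fractions into micromolar product, `initial_rate` turns a time course into a velocity, and the velocities collected across the tRNA concentration series feed `hanes_woolf_fit`.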
Experimental differences between Urzyme and full-length catalytic rate accelerations, between amino acid activation and acylation rates, and between Class I- and Class II-derived Urzymes were compared quantitatively by fitting linear regression models for transition-state stabilization free energies. Activation free energies for the uncatalyzed rates of amino acid activation and acyl transfer from Table 1 were subtracted from corresponding experimental values for the catalyzed rates, ΔG‡(kcat/Km), giving estimates for transition state (TS) stabilization, ΔΔG‡. TS stabilization energies were then assigned binary codes (1 or 0) according to which enzyme and which reaction they belong to, whether the catalyst includes the Urzyme, and whether it includes enhancements found in the full-length enzymes. TS stabilization energies are collected with these codes in Table 2. The codes in Table 2 are potential predictors of catalytic rate enhancement. The presence of the Urzyme in a catalyst may contribute consistently to the observed rate enhancement; the activation reaction may be accelerated consistently by a greater amount than tRNA aminoacylation; enhancements found in contemporary enzymes but not in Urzymes may accelerate rates by a consistent amount; and Class I catalysts may be consistently faster or slower than Class II catalysts. The extent to which each statement is validated by the data can be assessed from the main-effect and two-way interaction coefficients, β_i and β_ij, of the linear model for transition state stabilization energy, in which the predictors X_i and X_j represent entries in the third through sixth columns of Table 2. This treatment differs from principal components analysis in that we explicitly postulate the identity of the principal components, rather than identifying them a posteriori from the data. It represents multidimensional thermodynamic cycles (see Fig. 5A). These calculations were also performed with JMP (14).
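A minimal Python sketch of this kind of analysis follows. It assumes the model has the form ΔΔG‡ = Σ β_i X_i + β_ij X_i X_j with a single activation × full-length interaction term, omits the Class predictor, and uses made-up coefficients rather than the values in Tables 2 and 3. The function names, the row layout, and the choice of 37 °C for the free energy conversion are all assumptions for illustration; the authors used JMP.

```python
import numpy as np

R_KCAL = 1.987204e-3  # gas constant, kcal/(mol*K)

def ts_stabilization(k_cat_over_km, k_uncat, temp_K=310.15):
    """ddG = dG(catalyzed) - dG(uncatalyzed) = -RT ln[(kcat/Km) / k_uncat].

    Both rate constants must be expressed in the same units.
    Negative values mean the catalyst stabilizes the transition state.
    """
    ratio = np.asarray(k_cat_over_km, dtype=float) / k_uncat
    return -R_KCAL * temp_K * np.log(ratio)

def design_matrix(urzyme, full_length, activation):
    """Columns: Urzyme, full-length enhancement, activation, activation x full-length."""
    u = np.asarray(urzyme, dtype=float)
    f = np.asarray(full_length, dtype=float)
    a = np.asarray(activation, dtype=float)
    return np.column_stack([u, f, a, a * f])

def fit_linear_model(X, y):
    """Ordinary least squares; returns coefficients and R^2."""
    y = np.asarray(y, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, r2

# Synthetic illustration: two classes x (Urzyme, full-length) x (activation, acylation).
# The Urzyme code is 1 in every row, so its coefficient acts as the baseline.
urzyme      = [1, 1, 1, 1, 1, 1, 1, 1]
full_length = [0, 0, 0, 0, 1, 1, 1, 1]
activation  = [1, 0, 1, 0, 1, 0, 1, 0]
X = design_matrix(urzyme, full_length, activation)
beta_made_up = np.array([-10.0, -6.0, -3.0, 1.0])  # NOT the paper's values
y = X @ beta_made_up
beta_fit, r2 = fit_linear_model(X, y)
```

Because the synthetic responses are generated without noise, the fit recovers the made-up coefficients exactly; with real data, the residuals would supply the error estimates used for the t tests.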
The validity of the model can be evaluated from the degree to which it accounts for the variance of the TS stabilization (R² = 0.99) and from Student's t test probabilities of the coefficients (Table 3). The four predictors in Table 3 account for virtually all of the variance in the observed values (see Fig. 5B); errors are defined with five degrees of freedom, and Student's t test probabilities for all but the activation*full-length enhancements interaction term are highly significant, as discussed previously (11,15).

RESULTS AND DISCUSSION
TrpRS and HisRS Urzymes Both Retain Binding Sites for Cognate tRNA Acceptor Stems. Models of aaRS·tRNA complexes suggest that aaRS Urzymes of both classes retain interactions with cognate tRNA acceptor stems (Fig. 2). Surface area calculations show that complex formation with tRNA buries ∼300–500 Å², positioning the acceptor stems for acylation and aligning previously identified active site and acceptor stem recognition elements. Urzymes HisRS-1, HisRS-2, HisRS-3, and HisRS-4 differ in the presence (HisRS-3, HisRS-4) or absence of signature Motif 3, and the presence (HisRS-2, HisRS-4) of a 6-residue N-terminal extension preceding Motif 1 (11). All four HisRS Urzymes contain the identity elements shown in Fig. 2B. Thus, such recognition potentially confers on both Urzymes the rudimentary acceptor-stem discrimination necessary to interpret an operational RNA code (8).
TrpRS and HisRS Urzymes Accelerate tRNA Aminoacylation ∼10⁶-Fold. Assays using 32P-3′ adenosine-labeled cognate tRNA (13) (Fig. 3) readily allowed derivation of steady-state kinetic parameters, documenting that both Urzymes catalyze cognate tRNA aminoacylation. Notably, whereas previously published second-order rate constants (kcat/Km) for amino acid activation were decreased 10⁵-fold, those for aminoacylation are decreased by only three orders of magnitude relative to the full-length enzymes (Table 1). Moreover, despite lacking anticodon binding domains, Urzyme tRNA Km values are within an order of magnitude of those for contemporary WT enzymes.
Model bio-organic reactions (12,16,17) summarized in Table 1 provide plausible estimates of the uncatalyzed rates of amino acid activation, acyl transfer, and peptidyl-transfer (denoted by "Uncat" in the Table). These rates likely constrained the evolution of living systems prior to the advent of biological catalysts. Contemporary enzymes accelerate essential reactions differentially so that they can all proceed at approximately the same rate.  All three reactions necessary for protein synthesis therefore must have proceeded at comparable rates for protein synthesis to evolve. The significant differences among uncatalyzed rates of these three reactions mean that efficient protein synthesis demanded greater acceleration of amino acid activation than of tRNA aminoacylation. Moreover, to preserve uniformity of translation across a variety of messages, different aaRS classes must have accumulated parallel evolutionary rate enhancements.
Class I and II Aminoacyl-tRNA Synthetase Catalytic Profiles Evolved in Parallel. Experimental catalyzed rates (Table 1 and Figs. 4 and 5) quantitatively confirm both expectations outlined in the previous section. The new aminoacylation data allow quantitative assessment of catalytic proficiencies of Class I and II Urzymes and contemporary enzymes in both amino acid activation and tRNA acylation. The uncatalyzed rate of amino acid activation is ∼3 orders of magnitude slower than the other two steps (Fig. 4A); hence, it is the most significant of the three chemical barriers to protein synthesis.
Class I and II Urzymes accelerate amino acid activation 10⁸–10⁹-fold (Fig. 4B). Previously, we demonstrated the authenticity of this catalytic activity (9–11), showing that the observed activity arises from major percentages of molecules, that amino acid Km values are increased relative to those of full-length enzymes, and that activities are modified by mutation. Evolutionary enhancement of both families, to create catalytic domains and then full-length enzymes, led to equivalent catalytic rate increases for both amino acid activation (Fig. 4B) and tRNA aminoacylation.
Regression modeling (Fig. 5) affords a convenient assessment of the statistical significance of patterns evident in averages of different groupings within the data. It reinterprets the TS stabilization values in the space spanned by the four predictor columns in Table 2 and does not introduce new information. The analysis (Table 3 and Fig. 5C) confirms that both Urzymes reduce transition state free energies of both reactions by an average of −9.7 ± 0.7 kcal/mol (p < 0.0001).
Catalytic enhancements in full-length HisRS and TrpRS enzymes further stabilize all transition states by −6.1 ± 0.6 kcal/mol (p < 0.0001). The Urzymes therefore have only 15–25% of the mass but exhibit 61% of the catalytic proficiency of contemporary aaRSs. Quantitative comparison suggests that the Urzymes, which comprise the oldest modules of both synthetase superfamilies, are to the full-length contemporary enzymes approximately what the chassis and drive train are to an automobile.
The Urzymes accelerate 32Pi exchange, a surrogate for cognate amino acid activation, and tRNA aminoacylation in approximately the same proportion as do the full-length enzymes. Transition states for amino acid activation are preferentially stabilized in all four cases, by −2.6 ± 0.6 kcal/mol more than those for acylation (p = 0.0045), consistent with the fact that uncatalyzed amino acid activation is slower. Thus, all four catalysts tend to equalize the two reaction rates by preferentially accelerating amino acid activation. Both Urzymes have essentially modern catalytic profiles.
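The arithmetic behind these statements can be checked with a few lines of Python. The sketch takes the three free energies quoted in the text, computes the Urzyme share of the total TS stabilization, and converts free energies into orders of magnitude of rate acceleration. It assumes the conversion is made at the 37 °C assay temperature; the function and variable names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3   # gas constant, kcal/(mol*K)
TEMP_K = 310.15        # 37 C; assumed temperature for the conversion

def log10_fold(ddg_kcal, temp_K=TEMP_K):
    """Orders of magnitude of rate change for a TS stabilization (negative ddG = faster)."""
    return -ddg_kcal / (R_KCAL * temp_K * math.log(10))

urzyme_ddg = -9.7            # average TS stabilization by the Urzymes (kcal/mol)
full_length_extra_ddg = -6.1 # additional stabilization in full-length enzymes
activation_bias_ddg = -2.6   # extra stabilization of activation over acylation

# Share of the full-length TS stabilization already present in the Urzymes.
urzyme_share = urzyme_ddg / (urzyme_ddg + full_length_extra_ddg)

print(f"Urzyme share of TS stabilization: {urzyme_share:.0%}")
print(f"Urzyme acceleration: ~10^{log10_fold(urzyme_ddg):.1f}")
print(f"Activation preference: ~10^{log10_fold(activation_bias_ddg):.1f}")
```

Under these assumptions, 9.7 of 15.8 kcal/mol gives the 61% quoted above, and the 2.6 kcal/mol preference for activation corresponds to a bit less than two orders of magnitude, which partially offsets the ∼3 orders of magnitude by which uncatalyzed activation lags the other steps.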
Class I and II Urzyme-like aaRSs Date from Substantially Earlier than Modern aaRSs or the Last Universal Common Ancestor (LUCA) (Fig. 6). Consensus (18,19) holds that protein synthesis in the LUCA was supported by a nearly complete set of essentially modern aaRSs, with fully developed accessory domains for anticodon recognition and editing of incorrectly activated amino acids. In contrast, the TrpRS and HisRS Urzymes consist exclusively of the invariant structural cores of their respective superfamilies.
To the extent that we have characterized them, their amino acid specificities are quite modest (9). Whereas Urzymes greatly reduce the structural complexity necessary to implement translation of a rudimentary genetic code, that limitation would have prevented them from translating modern proteins. Thus, the Urzymes exhibit catalytic properties from an era well before the full expression of the genetic code and hence from a much earlier evolutionary stage than modern aaRSs or LUCA.
In light of their limited specificity, the exquisitely matched catalytic repertoires of both Urzymes may initially appear puzzling. Conservation of domains and catalytic residues involved in amino acid activation by aaRSs and variability of those required for tRNA acylation have led others (20,21) to argue that evolution of amino acid activation activity preceded acquisition of the capability to catalyze acylation. Building on their catalytic cores, Urzymes could subsequently have accumulated anticodon binding and insertion domains in response to evolutionary selective pressure to enhance amino acid and tRNA specificity as the genetic code expanded during later stages of evolution. Integrating these new modules likely required mechanistic changes to exploit and tune conformational cycles in multidomain aaRSs (15), selecting for more recent catalytically important residues stabilizing transition states for tRNA acylation (21). These questions notwithstanding, uniform integration of both activities into the same Urzymes is consistent with their participation in implementing an operational RNA code (8).

FIGURE 4. Rates of 32Pi exchange with cognate amino acids by Class I (TrpRS, LeuRS) and Class II (HisRS) aaRS Urzymes, catalytic domains, and native enzymes, compared with uncatalyzed and fully catalyzed rates. A, rate accelerations estimated from experimental data for single substrate (red) and bi-substrate (black, bold) reactions adapted to include uncatalyzed and catalyzed rates of bi-substrate reactions of the ribosome (12), amino acid activation (16), and kinases (40). Second-order rate constants (black bars) were converted into comparable units by multiplying by 0.002 M, which is the ATP concentration used to assay the catalysts shown in B. B, experimental rate accelerations estimated from steady-state kinetics as kcat/Km for a series of catalysts derived from Class I and Class II aaRSs (5,6,11). Vertical scales in A and B are the same. Histograms in B were normalized by subtracting the logarithm of the uncatalyzed rate of amino acid activation (AAact in A). Red bars denote Class I tryptophanyl- and leucyl-tRNA synthetase constructs, blue bars denote Class II histidyl-tRNA synthetase constructs, and green denotes the ribozymal catalyst (18).

FIGURE 5. A, edges represent contributions to the overall rate acceleration that can be attributed to each predictor in the third through sixth columns of Table 2. B, transition state stabilization free energies from steady-state kcat/Km values for activation (open) and acylation (shaded) by Urzymes and full-length contemporary enzymes. C, regression model relating transition state stabilization free energies to contributions from Urzymes themselves, evolutionary enhancements in the full-length enzymes, whether the reaction is amino acid activation or tRNA acylation, and the two-way interaction between enhancements and activation (Table 2). The aaRS Class distinction does not contribute significantly.

Class I and II Urzymes Are Highly Evolved and Had Simpler Sense/Antisense-encoded Ancestors. By the same token, the unexpectedly sophisticated Urzyme catalytic proficiencies nevertheless reduce the largest kinetic barrier to protein synthesis ∼10⁵-fold more than necessary to assemble peptides from activated monomers in the absence of ribosomes (Fig. 4A). We conclude that the Urzymes represent a more advanced catalytic proficiency and evolutionary development than the earliest peptides catalyzing aminoacylation. Extended prior co-evolution with ancestral tRNAs implies, in turn, that both Class I and II Urzyme-like enzymes descended from still simpler, less proficient polypeptide catalysts that may have faced a stronger selective pressure to accelerate amino acid activation, as proposed earlier for different reasons (20).
An evident modularity of both Urzymes strengthens the implication that simpler catalysts preceded their emergence. The strongest published evidence for modularity is the widespread conservation of a nonpolar packing motif in the first β-α-β crossover of >120 Rossmannoid superfamily members (22), together with the critical role played by this motif in tryptophan activation by TrpRS (15). We explicitly used this modularity to identify strong vestigial traces of sense/antisense coding of Class I and II Urzymes (23). It therefore is probably more than coincidental that the Class I ATP binding site containing this motif at the C terminus and a modified P-loop (PXXXXHIGH) at the N terminus of the helix aligns precisely antisense to the Class II aaRS Motif 2 ATP binding site (10).
These observations afford strong circumstantial evidence that genetically coded, ATP-mobilizing catalysts only ∼46 residues long preceded both Urzymes. Substantial evidence that these earlier catalysts were also encoded by opposite strands of the same gene (23) constitutes an important leitmotif connecting these antecedents through the Urzymes to the contemporary enzymes (Fig. 6). Specifically, the evidence for sense/antisense complementarity implies that ancestral Class I and II aaRSs emerged simultaneously, not sequentially. That evidence reinforces the case that Urzyme-like molecules were actually on the evolutionary path to modern protein synthesis.

Class I and II Urzyme Capabilities May Have Been Sufficient to Initiate Natural Selection of Globular Proteins. Although Class I and II aaRSs differ markedly in primary, secondary, and tertiary structure (24), both Urzymes accelerate activation and acylation in the proportions observed for full-length enzymes (Fig. 5B). Remarkably, none of the variance in TS stabilization free energies can be attributed to whether the catalyst is derived from TrpRS or from HisRS. By this test, Class I and Class II Urzymes have statistically indistinguishable catalytic profiles for cognate amino acids. This parity seems especially significant in view of the underlying preference (10) of Class I aaRSs for large aliphatic amino acids (Leu, Val, Ile, and Met) and that of Class II aaRSs for smaller, hydrophilic amino acids (Ala, Gly, Ser, Thr, His, and Pro). The chief difference between Class I and II appears to be that they process complementary (i.e. nonpolar versus polar) amino acid types.
The contrasting amino acid specificities of Class I and II aaRSs for nonpolar and polar amino acid substrates (10) lend additional significance to their parallel emergence and subsequent evolution. High Urzyme ATP and cognate tRNA affinities combined with weaker amino acid affinities appear suited to insert amino acids with appropriate water solubility at positions specified by a succession of increasingly specific codes (25), generating diverse, nonrandom, and yet functional molten globular translation products.
These specificities suggest that the first operational RNA code (8) may have been as simple as two amino acid types, activated by a single pair of Class I and II aaRSs. Such a code may therefore have sufficed to initiate synthesis of nonrandom peptides with the potential to form molten globules (26–28) with a variety of secondary structures, depending on the binary pattern of polar versus nonpolar side chains specified by the code. Finally, such an evolving translation system could thus have continuously participated in correlated natural selection of improved proteins and more specific codes, without disrupting the integrity of existing proteins.
Urzyme Catalytic Proficiencies, Modularities, and Sense/Antisense Genetic Coding Imply That Translation Began in a Peptide·RNA World. According to the RNA World Hypothesis (1), RNA catalysts alone launched coded peptide synthesis, eventually producing superior protein catalysts that replaced RNA in virtually every sphere of biochemistry. Two considerations weaken the argument that decoding and protein synthesis arose uniquely from RNA.
First, whereas the contemporary ribosomal catalyst for peptidyl transfer is RNA, there is no evidence in extant biology for ribozyme-catalyzed nucleic acid synthesis, amino acid activation, or tRNA acylation. Thus, the hypothesis lacks clear ancestral lineages for most of the necessary components, relying instead on hypothetical ribozymal catalysts for which there is scant phylogenetic support. Yarus (2) and others adduced considerable apparent support for the hypothesis using SELEX exponential enrichment (29,30) to produce contemporary surrogates for these hypothetical ribozymes. Specifically, RNA aptamers have been isolated that accelerate amino acid activation (31) and RNA aminoacylation (32) and that promote specific amino acid recognition (33) and peptide synthesis (34). However, selecting these aptamers required synthetic technology (e.g. the polymerase chain reaction) that was absent when life began.
A second, more serious weakness is that the ribozymal path to complexity depends critically on the early emergence of accurate replication, tying complexity tightly to fidelity. Such a path offers no mechanism to transcend what Koonin (3) calls "Eigen's Cliff," which arises from the need for nearly all progeny to closely resemble their parents to sustain function. For the RNA World, this is essentially impossible to envision. Whatever mechanisms polymerized nucleotides, elaborating the requisite ∼1800 nucleotides of RNA thought necessary for accurate replication and protein synthesis has been estimated to be prohibitively slow, even requiring multiple universes (3). Mechanisms involving complementary, less specific peptide catalysts would doubtless have accelerated the process and represent more probable paths to biology.
The scenario that peptide catalysts and RNAs co-evolved from simpler precursors addresses both weaknesses. (i) We emulated ancestral gene reconstruction (35) to produce the Urzymes. Multiple three-dimensional structure alignments of superfamily members revealed earlier phylogenetic relationships than are accessible from multiple sequence alignments (Fig. 1); protein design stabilized and solubilized invariant core peptides as putative ancestral forms; and experimental catalytic activities are quantitative metrics for the likelihood of ancestry.
Ancestral sequence reconstructions demonstrating sense/antisense ancestry of Class I and II Urzymes (23) further reinforce the three-dimensional structural homologies (Fig. 1 and Ref. 18). Together with the high Urzyme catalytic proficiencies and functional sufficiency, they afford strong phylogenetic evidence that close relatives of these Urzymes participated in the early evolution of translation. (ii) We postulate that catalysis and coding co-evolved from an initial state with minimal information via the simultaneous growth of stereochemically complementary peptides and RNAs. Antiparallel polypeptides and polynucleotides adopt complementary helical conformations (36,37), suggesting rudimentary catalytic templating mechanisms for mutual assembly of both polymers from resources thought to be present on primitive earth. The striking reciprocity of proteins and RNA in biology is consistent with our proposal: proteins exclusively catalyze nucleic acid synthesis; RNA catalyzes protein synthesis; and genetic messages are interpreted by the small ribosomal subunit (38), a ribonucleoprotein.
Our proposal avoids the Eigen Cliff because its most important restriction is not sequence fidelity but appropriate stereochemistry and relative chirality. Models (36) suggest that hydrogen bonding between L peptide carbonyl oxygen atoms and D ribose 2′-OH groups (or the opposite relative configuration) can orient both the 3′-OH group for polymerization, accounting for 5′-3′ linkages in polynucleotides, and C-terminal peptide carboxyl oxygens for nucleophilic attack on activated amino acids. We suggest that structural and catalytic repertoires of short polypeptide hairpins were likely critical to eventually establishing the code, and perhaps even to the emergence of RNA itself (36).
Reciprocal templating of peptide and RNA double helices by stereochemical complementarity entails a natural form of sense/antisense genetic coding. As each RNA strand would have influenced the amino acid sequence primarily of one of the two peptide strands, both strands would have been templates for functional peptides, and vice versa. Complementarity enforces nonrandom but distinct sequences for peptides assembled by opposite RNA strands.
Under this alternative hypothesis, the earliest "polymerases" were naturally peptides, and the earliest ribosomes were short RNA oligonucleotides. Sense/antisense coding therefore connects this simpler framework to the realm of phylogenetically meaningful and experimentally testable genetic constructs described here. It is consistent with a peptide-accelerated and hence more probable scenario than the exclusive RNA world for the origins of codon-directed translation (Fig. 5). Thus, rather than an RNA world, life more likely began in a Peptide·RNA World (36,38,39), in which evolutionary development of catalysis and coding were intimately coupled via stereochemical thermodynamics.