Spontaneous emergence of self-replication in chemical reaction systems

Explaining the origin of life requires us to explain how self-replication arises. To be specific, how can a self-replicating entity develop spontaneously from a chemical reaction system in which no reaction is self-replicating? Previously proposed mathematical models either supply an explicit framework for a minimal living system or only consider catalyzed reactions, and thus fail to provide a comprehensive theory. We set up a general model for chemical reaction systems that properly accounts for energetics, kinetics and the conservation law. We find that (1) some systems are collectively-catalytic where reactants are transformed into end products with the assistance of intermediates (as in the citric acid cycle), while some others are self-replicating where different parts replicate each other and the system self-replicates as a whole (as in the formose reaction); (2) many alternative chemical universes often contain one or more such systems; (3) it is possible to construct a self-replicating system where the entropy of some parts spontaneously decreases, in a manner similar to that discussed by Schr\"odinger; (4) complex self-replicating molecules can emerge spontaneously and relatively easily from simple chemical reaction systems through a sequence of transitions. Together these results start to explain the origins of prebiotic evolution.

S elf-replication is one of the central properties of life (1), and to explain life's origins we need to explain how selfreplication arose. It is widely accepted that before DNA, life replicated through RNA molecules (2). However, it remains unclear how the building blocks of RNA, such as nucleotides, became available on the primitive Earth, and even if these building blocks were abundant, it is unclear how they were assembled into the first RNA (3,4). It is plausible that selfreplication did not originate from a single complex independent self-replicating molecule. In the early stage of evolution, the "precursor life" could be very different from what we see today (3). For example, in Wächtershäuser's iron-sulfur world hypothesis, the precursor life does not have nucleic acids but consists of a self-replicating (or autocatalytic, in his words) metabolic network (5). Another proposal, by Szathmáry, is that life evolved from "holistic limited hereditary replicators" such as the formose reaction-in which sugar is replicated from formaldehyde-to "modular unlimited hereditary replicators" such as RNA and today's DNA (6)(7)(8).
There are lots of biological examples of self-replicating systems that do not rely on a single complex template molecule (as is the case for DNA and RNA molecules). These include: the malic acid cycle (a metabolic path of some bacteria and plants for synthesis of malates); the Calvin cycle in photosynthesis (9); the reductive citric acid cycle for a certain group of chemoautotrophs (10,11); the metabolic pathways of ATP (as well as some other coenzymes such as NAD + and CoA) found in many different living organisms (12); and the whole metabolic reaction network of E. coli (13). Self-replication has also been identified in non-living systems such as the formose reaction (9,14) and experiments in labs (15)(16)(17), including self-replication of nucleotide-based oligomers (18).
Several theoreticians have tried to explain the origin of self-replication in terms of a system of coupled chemical reactions (19). For example, the chemoton model describes a system composed of three coupled quasi-self-replicating subsystems (metabolism, membrane, and template), which as a whole is able to self-replicate (9). The chemoton can be considered as a model of a minimal living system, but cannot explain how this system spontaneously develops from a soup of simple molecules. The RAF (reflexively autocatalytic and food-generated) theory (20)(21)(22), extended from Kauffman's autocatalytic sets theory (23), is also a step in the direction of understanding origins of self-replication. A set of chemical reactions is RAF if (1) every reaction in this set is catalyzed by at least one molecule involved in this set, and (2) every molecule involved in this set can be produced from a small food molecule set. RAF sets are shown to be able to readily emerge from a set of chemical reactions, and always consist of smaller RAF sets, demonstrating the capability to evolve (24,25). Other similar models also contributed to this theory (26)(27)(28)(29), and many of the biological observations mentioned in the previous paragraph can be put into the framework of RAF theory (13,(15)(16)(17)(18).
Although RAF theory is interesting, it has limitations as an explanation of self-replication. Firstly, the theory stipulates that every reaction in the set is catalyzed. Even though catalyzed reactions are very common in living systems, not every reaction involves a catalyst (e.g., condensation reactions (7,30)). In the early stages of biological evolution, probably no reaction required sophisticated biotic catalysts (e.g., enzymes) (10,13), so there is a strong motivation not to include these enzymes (18,31,32) or even catalysts as given in a model of the origins of self-replication. Uncatalyzed reactions have significant effects on the dynamics of the whole system, e.g., "innovating" new species of molecules and then triggering other RAF sets (33,34). The second concern with RAF theory is that it is a purely graph-theoretic approach. As a result, there is no constraint on how chemical reactions are constructed and coupled (although it gives the theory much freedom, the construction of reaction systems is too arbitrary to investigate the model systematically). Extra assumptions need to be made about chemical kinetics in order to investigate the dynamics of how populations of molecules change over time (35). Here we see another reason to relax the as- sumption of studying only catalyzed reaction: catalyzed reactions are never elementary reactions, so the kinetics cannot be simply calculated from the reaction equations (36).
Any theoretical approach to the origin of self-replication should explicitly include energetics (13,37), an aspect which is missing from all these models and theories above. There are several reasons for this. Firstly, energetics (e.g., Gibbs energy) determines whether a chemical reaction is spontaneous, so to investigate the spontaneity of the emergence of life, it has to be considered. Secondly, in order to concretely discuss the issue-famously put forward by Schrödinger-that life maintains its order by feeding on "negative entropy" (38), energetics has to be explicitly taken into account: entropy is negatively related to the thermodynamic free energy (which is Gibbs energy in the scenario of constant pressure and temperature). It should be noted that the relationship between life and entropy is investigated in different ways and contexts (39)(40)(41). For example, Branscomb explained the specific mechanisms to increase thermodynamic free energy in two realworld biochemical scenarios (40): the hypothesized "alkaline hydrothermal vent" (42) on the prebiotic Earth, and the system where the Ferredoxin I protein translocates protons. In the context of statistical physics, a lower limit was derived for the amount of heat generated in a non-equilibrium system where a process of self-replication occurs (41).
In this paper, we set up a general model for chemical reaction systems that properly accounts for energetics, kinetics and the conservation law. Catalysts are not explicitly included in the model, but we later find that catalysis can emerge, along with self-replication and potentially complex molecules, in our system.

Model
Chemical reaction system. We model a well-mixed soup of molecules, each of which is defined by its integer mass, i. A molecule's type is thus denoted i. Only two types of reaction are possible: synthesis of two molecules to create a molecule of greater mass (e.g., 2 + 4 → 6) and decomposition into two molecules to create two molecules of lower mass (e.g., 6 → 1 + 5). All reactions that conserve mass-the total mass on the left-hand side of the equation adds up to those on the right-hand side-can occur. For convenience, we define a reaction pair to be a reaction and its corresponding reverse reaction.
Each type of molecule i has its own standard Gibbs energy of formation G • i . Then, as illustrated in Fig. 1, a reaction is either spontaneous-meaning that the total standard Gibbs energy of formation of the reactants is greater than that of the products-i.e., i+j . If one reaction in a reaction pair is spontaneous, the other is non-spontaneous, and vice versa.
According to transition state theory (36), the reactants have to overcome the Gibbs energy of activation (namely ∆G ‡ +ij in Fig. 1) in order for the reaction to occur. In the model, any reaction pair is either low-barrier-the energy barrier (corresponding to ψij in Fig. 1) is low and the reaction rate is thus high-or high-barrier. In our system all reactions are possible, but to define a specific chemical reaction system we write a list of low-barrier (and spontaneous) reactions only.
For example, is a model of the citric acid cycle in cellular respiration, simplified so that only carbon-changing reactions are included (9). Specifically, molecule 1 stands for carbon dioxide, 2 for acetyl-CoA, 4 for oxaloacetic acid, 5 for α-Ketoglutaric acid, and 6 for citric acid, respectively. It is also possible to model the formose reaction (9) which involves the formation of sugars from formaldehyde, Again, only carbon-changing reactions are considered. Specifically, molecule 1 stands for formaldehyde, 2 for glycolaldehyde, 3 for glyceraldehyde, and 4 for tetrose, respectively. These two reaction systems will serve as special cases in our study, but we will cover a wide range of different systems.
Note that some chemical reaction systems are not physically possible, e.g., where by adding up these low-barrier reactions, all molecules are cancelled out. As a consequence, one cannot find proper Gibbs energy for each molecule such that each reaction above is spontaneous. We are only interested in physically possible systems.

Kinetics.
We now specify a general model for the kinetics of our chemical system, under the following assumptions: (1) All molecules are ideally gaseous; (2) the whole system is kept at constant pressure and temperature; (3) every possible reaction is elementary. The derivation follows the rate law and transition state theory (36). Here we cover the key points, and a full derivation is given in Supplementary Information (SI) section S1.
For any synthesis reaction i + j → i + j, the reaction rate in unit s −1 is where the subscript +ij stands for the synthesis reaction, β and κ are constants, Ni is the number of molecules i in the system, Nj is the number of j, S is the number of solvent molecules, determining the global rate at which molecules interact, N is the number of all the molecules except for solvent molecules, and ∆G ‡ +ij , as shown in Fig. 1, is defined as When implementing the model, we set values of ψij (positive and finite) for each reaction pair and G • i for each molecule. Together, these give a unique value for ∆G ‡ +ij . Likewise, for any decomposition reaction i + j → i + j, the reaction rate in unit s −1 is where the subscript −ij stands for this decomposition reaction, and Due to the fact that the transition state of a reaction pair i + j → i + j and i + j → i + j is identical, by setting ψij both ∆G ‡ +ij and ∆G ‡ −ij are uniquely determined. In addition, by setting ψij large or small, we can easily make the reaction pair low-barrier or high-barrier.

Simulation of dynamics.
We take the citric acid cycle Eq. (1) as an example to illustrate how to set up the simulation experiment.
(1) Set up all the constants. Assume that the constant pressure the system is kept at is 100 kP a and the constant temperature is 298.15 K, so we obtain that β ≈ 6.21×10 12 s −1 and κ ≈ 0.403 mol/kJ.
(2) Set G • i for each type of molecule (up to 6 in this case), to make sure that the three reactions are spontaneous, that is, These values are set in the range of normal chemical substances' Gibbs energy of formation (36), roughly in the range (−1500, 300). Note that the choice is not unique and a wide range of choices can be made to allow the system to work.
(4) Assume that there is an unlimited reservoir of resource molecule 2. That is, whenever a molecule 2 is consumed or produced, it is replenished or removed so that the number of 2 always keeps as a constant Q = 1000. This setting makes biological sense if we consider a system separated from the unlimited reservoir by a "wall". As long as some resource molecules are consumed, more will enter the system driven by the chemical gradient.
We then use the standard Gillespie Algorithm to simulate the dynamics (see details in SI section S2). In addition, we also construct ordinary differential equations (ODEs) to describe the mean-field dynamics (see details in SI section S3).

Results
Collectively-catalytic system. We first simulate the citric acid cycle Eq. (1) and observe that N1(t) increases linearly. Molecules 4, 5 and 6 are involved in a cycle of reactions, but the total number is constant. The SI section S4 shows this dynamics and ODEs solutions in more detail. We also observe that each reaction in Eq. (1) occurs approximately the same number of times. We can add up these reactions, cancel out the molecules appeared on both sides of the reaction (in this case, molecule 4, 5 and 6), and then obtain the overall reaction of the system 2 → 1 + 1. In itself, the reaction 2 → 1 + 1 is high-barrier, so its reaction rate is extremely low, but through the whole system the actual rate of the overall reaction is several billion times larger (SI section S4). We call system Eq. (1) collectively-catalytic system, since the overall reaction 2 → 1 + 1 is catalyzed by molecule 4, 5 and 6. Note that this outcome is consistent with the biological observation that the citric acid cycle consumes acetyl-CoA and produces carbon dioxide as a waste product.
The linear growth of molecules 1 is because: (1) 1 is an end product which cannot be used by these low-barrier reactions; (2) the number of resource molecules 2 is constant; and (3) no additional molecule 4, 5 and 6 can be produced through these low-barrier reactions (by noting that the number of 4, 5 and 6 on the right-hand side of the reaction system Eq. (1) is the same as the number on the left-hand side). We can use these observations to give a rigorous set of criteria for collectively-catalytic systems. We start by defining an intermediate molecule to be any molecule that appears on both the reactant side and the product side. In general, the following stoichiometric criteria are sufficient (but not necessary) to show that a physically possible chemical reaction system, given supplies of resource molecules, is collectively-catalytic: (1) For every low-barrier reaction, at least one type of its reactants comes from the products of other low-barrier reactions (called the criterion for self-driven); (2) by adding up all the low-barrier reactions, for every type of intermediate molecule, the number of times it appears on the reactant side and on the product side is the same (called the criterion for balancedcancelling).
The citric acid cycle Eq. (1) thus satisfies all these criteria, but we also find that other systems fulfill these criteria too (in fact, any single catalytic reaction can be written as a collectively-catalytic system: see SI section S5 for details). The criteria above give us a way of discerning whether or not a system is collectively-catalytic based on stoichiometry alone, without the need to investigate its dynamics. Self-replicating system. We now look at the dynamics of the formose reaction Eq. (2) given resource molecule 1. The result of a simulation is shown in Fig. 2. All three intermediate molecules (N2(t), N3(t) and N4(t)) increase exponentially. The solutions of the corresponding ODEs are consistent with the simulations (SI section S6). The fact that 2 grows exponentially can be seen directly by adding up the three low-barrier reactions to obtain 1+1+2 → 2 + 2. It indicates that if one 2 is present beforehand, one extra 2 can be produced, by transforming two of 1. Then, the additional 2 is further used by the system, and more 2s are produced. Although the molecules 3 and 4 are canceled out when we add up the reactions, they also grow exponentially (with 4 increasing much slower than the other two). The reason is that the actual reaction rate of each low-barrier reaction is not the same (see SI section S6). This observation is important since it illustrates that it is not just one type of molecule that grows exponentially in such systems, but all the intermediate molecules.
We define a self-replicating system, of which Eq. (2) is an example, to be a system in which at least one type of molecule is replicated. By investigating various self-replicating systems, we find that not only exponential but also super-exponential growth is observed (see examples in SI section S7). The dynamics indicates that the reactions in self-replicating systems become faster and faster. This is a very special property compared to the collectively-catalytic system, where the overall reaction rate keeps constant, e.g., the citric acid cycle Eq. (1).
In general, the following stoichiometric criteria are sufficient (but not necessary) to show that a physically possible chemical reaction system, given supplies of resource molecules, is self-replicating: (1) The criterion for self-driven (mentioned in last section) is satisfied; (2) there are some types of intermediate molecules, and the number of times it appears on the reactant side is less than that on the product side (called the criterion for overproduction); (3) there is no type of intermediate molecules that the number of times it appears on the reactant side is larger than that on the product side (called the criterion for no-overintake). The formose reaction Eq. (2) satisfies all these criteria.
Collectively-catalytic and self-replicating systems are common. A natural question to ask is how common these reaction systems are, such as the citric acid cycle and the for-mose reaction. Specifically, if we construct alternative chemical universes where arbitrarily choose which reaction is lowbarrier, how many of the resulting chemical universes contain collectively-catalytic or self-replicating systems? To answer this question we start by using the criterion shared by both of them, that they are self-driven. Table 1 shows the numbers for different L, which is set to be the mass of the largest molecule in the chemical universe in question, in order to have a measurement of the number of all chemical reaction systems. For example, when L = 6, there are in total 1, 2, · · · , 6 six types of molecules, and thus 9 reaction pairs. By choosing which reaction is low-barrier, we can construct 9 l=0 ( 9 l ) · 2 l = 19683 alternative chemical universes, 16825 of which turn out to be physically possible. Using the criterion for self-driven given above, we find that 6886 (41%) of all the physically possible universes contain self-driven systems. This percentage increases with L, which indicates that self-driven systems are common, and more common in systems involving more types of molecules.
However, we cannot be sure that these self-driven systems are collectively-catalytic or self-replicating, and there is a third type of self-driven system that is non-sustaining system (see SI section S8 for details of non-sustaining system and section S9 for more details of classification for self-driven system). Nevertheless, we can use the stoichiometric criteria mentioned above (note that these criteria are sufficient but not necessary) to give a lower bound on the number of chemical universes that contain collectively-catalytic or selfreplicating systems. That is, for a self-driven system, a system is collectively-catalytic if it satisfies the criterion for balancedcanceling, or self-replicating if it satisfies both the criteria for overproduction and no-overintake.
Using the stoichiometric criteria we find that the number of chemical universes containing collectively-catalytic or selfreplicating systems increases with L, although the percentage decreases. This is a lower bound, so does not mean that the actual number of chemical universes containing collectivelycatalytic or self-replicating systems decreases. Establishing a firm relationship between the number of chemicals (L) and self-replication will involve simulating dynamics of all systems.
How can life maintain low entropy?. According to the second law of thermodynamics, the total entropy of an isolated system never spontaneously decreases over time. Life, thought of as an open system as opposed to an isolated one, is able to maintain order, i.e., maintain a relatively low entropy level. Schrödinger suggested that this is achieved by life "feeding on negative entropy" (38). His question is how this can happen spontaneously. But before we answer this we first need a way to discuss this question concretely and more quantitively.
Under the framework of our model, if we simply consider life as some self-replicating entity, we should then ask: Is it possible that a self-replicating system spontaneously increases its Gibbs energy or at least keeps it unchanged (we first note that in the scenario of constant pressure and temperature, the decrease of entropy corresponds to the increase of Gibbs energy as G = H −T S where G is Gibbs energy, H is enthalpy, T is temperature and S is entropy)? Let us consider the selfreplicating system Eq. (3), given the resource molecule 2: The simulation shows that N1(t), N3(t) and N4(t) increase exponentially, as well as N5(t) and N6(t) (although they are very small, see SI section S10). Molecule 1 is the end product, while 3, 4, 5 and 6 are replicated through the whole system. We then consider that the "living" system consists of the selfreplicating part (namely all of molecules 3, 4, 5 and 6) and the resource molecules in the system. Now we investigate how Gibbs energy of the system changes. We set Gibbs energy of the initial system to zero (as the reference point), since only relative quantity matters. Therefore, Gibbs energy of the self-replicating part is G replicating (t) = 6 i=3 Ni(t) · G • i . Gibbs energy of the resource molecules in the system is where F2(t) is the number of resource molecule 2 ever consumed till time t. Gibbs energy of the waste is Gwaste(t) = N1(t) · G • 1 . Then, Gibbs energy of the living system is G living (t) = G replicating (t) + Gresource(t). As shown in Fig.  3, G living (t) increases while G total (t) = G living (t) + Gwaste(t) decreases. We have thus given an explicit example of a self-replicating "living" system that spontaneously consumes the resources to increase its own Gibbs energy. Note that our system, as defined, is a well-mixed gas system. So the waste molecules 1 are not automatically separated from other molecules, and Gibbs energy of the gas-mixing process is neglected in the calculation above. However, the contribution of the fact of wellmixed gas is relatively small (see details in SI section S10). So it is still possible for a self-replicating system to spontaneously increase Gibbs energy or at least keep it unchanged.
Spontaneous evolution from simple towards complex. Nature provides many examples of the spontaneous evolution from simple towards complex. Is it possible to construct a system showing the similar process? Imagine there is a chemical reaction system composed of the following low-barrier reactions, given an infinite reservoir of only resource molecules 1. The first three reactions constitute the formose reaction A self-replicating system (i.e., formose reaction Eq. (2)): A collectively-catalytic system: A self-replicating system: · · · 1 + 23 → 24 24 → 12 + 12 [5] Eq. (2), given the resource molecule 1. The three reactions in Eq. (4) constitute a collectively-catalytic system, given the resource 3. The thirteen reactions in Eq. (5) constitute a self-replicating system, given the resource 1. Is it possible that lots of complex molecules, such as 12, 13 and 14, are produced in the end?
The answer in this case is yes. If the first 12 is produced, the self-replicating system Eq. (5) will be triggered, and consequently N12 will grow exponentially, as well as N13, N14, · · · , N23. But how is the first 12 produced? There are three stages: (1) Initially when there are only lots of 1 but nothing else, the system stays almost unchanged for a very long time since no low-barrier reaction could occur. Occasionally, by the high-barrier reaction 1 + 1 → 2, one molecule 2 is produced. The self-replicating system Eq. (2) is triggered, and then N2 and N3 grow exponentially. Very quickly there are lots of 2 and 3. (2) After a relatively long "boring" period, the first 5 is produced by the high-barrier reaction 2 + 3 → 5. Then the collectively-catalytic system Eq. (4) is triggered, and N6 grows. Very soon there are lots of 6. (3) After that, occasionally one 12 is produced by the high-barrier reaction 6+6 → 12.
One might naively believe that the first 12 can be produced by other reactions without the need for self-replicating and collectively-catalytic systems. This is not the case. Despite an abundance of molecules 1 which could be used to "assemble" an initial 12, the production of the first 12 requires high-barrier reactions, e.g., 6 + 6 → 12. It is only when both the other self-replicating and collective-catalytic systems are producing 6 in sufficient numbers that one such reaction is sufficiently likely to occur. At this point a 6 + 6 → 12 can occur, despite it being a high-barrier reaction, because of the abundance of 6 molecules.
The previous stage-by-stage procedure could be a general model of how chemical reaction systems evolve towards complex: a relatively simple innovation triggers some selfreplicating or collectively-catalytic systems and then a large number of new types of molecules are produced, paving the way for other innovations.
Meanwhile, the more types of molecules, the more probable to have reactions which are high-barrier before becoming practically low-barrier. For example, in the formose reaction Eq. (2), if the reaction 1 + 3 → 4 is high-barrier, the system is not self-replicating. But imagine that the following three reactions are low-barrier, They constitute a collectively-catalytic system. Then the reaction 1 + 3 → 4 which is the overall reaction of the three can still be considered as low-barrier, and the formose reaction system is still self-replicating. The only question is that we have to wait for the complex molecule 30 to appear.

Discussion
We have set up a general model for chemical reaction systems that properly accounts for energetics, kinetics and the conservation law. Although our model did not explicitly include catalysts, as other models did (20)(21)(22), catalysis and autocatalysis emerge in a number of systems.
We found three distinct types of self-driven system, i.e., systems which "feed" themselves. Both collectively-catalytic and self-replicating systems are vital in biology, while the third (non-sustaining) system appears less important. In terms of generating complexity, self-replicating system plays a more important role, since it is able to replicate innovations. In the self-replicating formose reaction Eq. (2), after the first molecule 2 is produced by a high-barrier reaction, more 2s are easily replicated. In a biological setting, if this molecule spread to other places, it can trigger more self-replicating system. In contrast, in the collectively-catalytic citric acid cycle Eq. (1) for example, after the innovation (the first molecule 5), the second 5 will not appear until the responsible high-barrier reaction occurs once again.
By arbitrarily constructing alternative chemical universes, we found that lots of them contain self-driven systems, and the lower bounds on the number of chemical universes containing collectively-catalytic or self-replicating systems increase with more types of molecules. This result suggests that in a random chemical universe, it would not be too surprising to observe the emergence of self-replication, one of the central properties of life (1). Although it is not the first theory to propose that self-replication is relatively easy to emerge, as the RAF theory did (23)(24)(25), it is the first one requiring no catalyst.
We provided a general model explicitly showing that high thermodynamic free energy molecules can be produced exponentially from low free energy molecules, while specific mechanisms in specific real-world scenarios have been investigated before (39,40). The example system we showed is a metaphor of why high free energy ATP molecules are constantly produced in organisms (12). In addition, as our model takes energetics (corresponding to entropy) into account, it provides a more concrete way to discuss the issue-famously put forward by Schrödinger (38)-why life is able to spontaneously maintain a relatively low entropy level, although it cannot give the full answer: Answering the questions of how the molecules in living systems can be placed in an ordered structure (namely, a low entropy state) would require extending our model to include spatial effects.
Our model explicitly shows that complexity evolves from extreme simplicity stage by stage. It gives insights into three issues related to the origin of life. Firstly, the first RNA molecule is much more likely to be produced de novo by this stage-by-stage procedure, rather than a magic event (3). It provides a theoretical support to "metabolism-first" theories (3), such as Wächtershäuser's iron-sulfur world hypothesis (5) and Szathmáry's theory (6)(7)(8). Secondly, before life, Earth should have gone through many stages in which different selfreplicating systems existed and consequently Earth's compositions were different in each stage. The raw materials for life we should look for are those for the first self-replicating system, rather than those for the extant life (5). That is why, in the current theoretical framework, the raw materials for life (e.g., nucleotides) seem not to be available on the primordial Earth (3,4). Thirdly, collectively-catalytic and self-replicating systems generate more types of new molecules, and in return, more types of molecules make more reactions feasible (in the form of catalysis and autocatalysis). This could explain why metabolic reactions in extant life always require sophisticated enzymes (13), while no reaction is expected to involve catalysts in the very early stage of life (18,31,32).
The model on its own provides a convenient platform to construct alternative chemical universes and investigate general properties of chemical reaction systems. It may provide a theoretical guideline for systematically searching for other chemical paths towards life (or at least self-replicating entities), as pursued in astrobiology (43,44) and xenobiology (45) for example. However, there are limitations. To be as simple as possible, our model assumes molecules with the same mass identical. As a result, in principle, every type of molecule can be produced from other molecules. But in reality it is not always the case, e.g., organometallic compounds can never be produced by a chemical reaction system involved with only carbohydrates. Nonetheless, the current model can be considered as a simple version of a more general model where conservation of different "atoms" is considered.