Neural Model of the Genetic Network*

,

Recently developed analytical techniques such as gene chips and proteomics generate data reflecting the status of an entire cell or organism at a given time and therefore provide a snapshot of all the pathways that compose the genetic network of a cell or organism. A set of these snapshots taken at short time intervals forms time series of mRNA or protein amounts. Such time series reflect regulatory interactions among the members of the system. The availability of such kind of data makes it possible to utilize existing, or to design new mathematical models of the regulatory processes, which make it possible to analyze the dynamics of the processes and identify interactions among genes and their products. Knowledge of the principles of the system control can lead to the design of simulation models that can predict situations that can not be achieved experimentally and thus provide a feedback to experimental biologists.
A number of different mathematical models of cell regulatory processes have appeared in recent years. One of the approaches formalizes gene regulation into a network with nodes representing genes and connections among genes, which define the regulatory action of one gene product on another gene. Mathematically such processes can be expressed by a neural network. A number of models were suggested based on two-stage Boolean or linear (feed forward) representation (1)(2)(3)(4)(5)(6)(7)(8). Recurrent neural networks used for the description of dynamics of genetic networks, in particular of eukaryotic systems (9,10), or suggested as a general model of transcription and translation (11) form a special case of this type of model. The advantages and disadvantages of this approach were discussed in Ref. 11. Comparative analysis of performance of the neural network-based models can be found in Ref. 12. Recurrent neural networks can also deal with feedback, which can occur in natural gene control processes, and they are flexible enough to fit experimental data. If the components of the system and their regulatory interactions are known, recurrent neural networks can be used to model and simulate the dynamics of gene expression in such systems.
The decision circuit, which operates within a single Escherichia coli cell and controls one phase of single phage life cycle, is probably the most completely characterized complex genetic network. Because the qualitative behavior of the phage switch is known, it can serve as a test for the validity of a mathematical model. The goal is to describe by a model the dynamics of such a system under known physiological conditions. The model must first satisfy observed behavior, and if so it can be used to simulate the system. Here a model of the genetic network used by bacteriophage to choose between lytic and lysogenic pathways, based on the recurrent neural network principle, is presented.

The Model
The model is based on the assumption that the regulatory effect on the expression of a particular gene can be expressed as a neural network (Fig. 1). Each node of the network represents a particular gene, and the wiring between the nodes defines regulatory interactions (11). It can be assumed that the state of gene expression at time t ϩ dt depends on the state of expression at time t and the connection weights (Fig. 1). Generalization of this principle leads to the definition of the rate of expression as a change of accumulation of gene i product over time (dz i /dt). The rate of expression of gene i can be derived from the expression levels (y j ) at the time t and connection weights (w ij ) of all genes connected to the given gene. Thus g i , a regulatory effect on gene i, is This regulatory effect is transformed by a sigmoidal transfer function f to the interval with a bias vector b, which can be interpreted as a reaction delay parameter. In our case its value represent timing of the events in the network. The actual rate of expression is then modulated by a multiplicative constant k 1i that represents the maximal expression rate of a given gene.
Let the rate of expression of a target gene i (dz i /dt) be given by the regulatory effects of other genes i minus degradation x I , where i runs over the members of the set shown in Figs. 1 and 2. The degradation effect x i is modeled by the kinetic equation of a first order chemical reaction x i ϭ k 2i z i , and i represents the regulatory effect reflected in a variable g i ( i ϭ k 1i f(g i )). The constant k 2 represents the rate constant of degradation of the gene product i, and k 1 is its maximal rate of expression. The model for the control of the ith gene has the form, where f represents the sigmoid transfer function and the y j are concentrations of the components of the system (for j ϭ i; z i ϭ y i ). Equation 3 describes the dynamics of accumulation of the gene i product and represents a "node" function. Each node can be connected with other nodes to form a neural network (Fig. 1). The wiring between the nodes of the network is defined by the weight matrix w. A nonzero value of w ij for the nodes i and j means that the connection between the nodes exists. The magnitude of the weight w ij represents a strength of interaction, or a regulatory effect, between the two nodes. The neural network is fully described by the differential equations corresponding to the particular nodes. The number of equations is given by the number of nodes, and the amounts of gene products at arbitrary time t can be computed by solving the set of differential equations. Equation 3 represents a special case of a class of recurrent neural networks described by a general formula, which is a special case of analog Hopfield neural networks (see e.g. Ref. 19), where f i is a nonlinear transfer function, w represents connection weights, and is an external input to the node i. These types of networks have been used as associative memories and as models of brain activity, and their dynamics have been studied thoroughly. Depending on the weight matrix, the state transition of the network can lead to a point attractor (stationary state), can become oscillatory (limit cycle), or can even become chaotic depending on the weights and the complexity of the network. Several methods of "training," i.e. computation of the weight matrix and the other parameters from experimental time series, have been suggested (20,21).

Phage Regulatory Networks
Bacteriophage is a virus consisting of protein particle and double-stranded DNA of ϳ50 kilobases with cohesive single strands at the ends of the molecule. Soon after infection, the DNA circularizes, and lytic or lysogenic growth is chosen. In the lytic pathway, viral DNA is replicated many times and then packed into viral coats, causing lysis of the host cell and the release of up to 100 viruses that infect new bacteria. In some cases, the lytic cycle is aborted, and the phage switches to the lysogenic pathway, in which the phage genome is integrated into the bacterial host DNA and is replicated and transmitted as any other chromosomal DNA.
The Decision Network-A simplified schematic representation of the phage genes having a regulatory role in determining lysis or lysogeny in the chromosome of infected E. coli is shown in Fig. 2. Other genes involved in the process of lysis and lysogeny are not involved in the process of regulation and are therefore not considered.
Transcribing RNA polymerase molecules stop just at the end of N and cro, respectively. The N and cro mRNAs are translated into their respective proteins (22)(23)(24). The N protein is a positive regulator that enables the transcription of genes to the left of N including cIII and the recombination genes and genes to the right of cro including cII and the DNA replication genes O, P, and Q (25). N works as a positive regulator by enabling RNA polymerase to transcribe through the regions of DNA that would otherwise cause the mRNA to terminate. CIII protects CII from proteolytic degradation (26). Because CII works as an activator, it enables RNA polymerase to bind and begin transcription at two promoters that would otherwise remain silent: P RE and P I . cI can be transcribed from two promoters, P RM and P RE . Transcription from P RM is maintained by CI in a self-inducing feedback loop, and transcription from P RE depends on the concentration of CII. Starting at P RE , polymerase transcribes leftward to the end of the cI gene. When this mRNA is translated it produces CI but not Cro, which is encoded on the other strand. CI binds to O L and O R . CI bound to these two sites turns on transcription of its own gene and blocks transcription of lytic genes and early ones necessary for phage construction.
In the meantime Cro binds to O R 3 to block synthesis of CI. It then binds to the second operator, O L , to turn off transcription initiated at P L and to the remaining sites of O R , as described above. Both CI and Cro bind to O R to stop their own transcription at higher concentrations. This occurs far after the decision has been made and does not influence the process of decision and therefore is not considered in the model. This analysis defines both the timing of the process and regulatory interactions among the members of the system. Formalization of the process to a neural network is shown in Fig. 1. The network comprises all crucial elements acting in the lytic/lysogeny pathway choice. The arrows indicate interactions between genes and their products. ϩ and Ϫ, positive and negative controls, respectively. The CI node represents free repressor concentration and is calculated as the sum of unbound CI transcribed from P RE and P RM , respectively. b, the connection weight matrix. ϩ, Ϫ, and 0, positive, negative, and no control, respectively. The matrix is ordered such that columns "control" rows, e.g. the ϩ in the first column and second row is interpreted as "cII is positively controlled by N." For modeling purposes the signs were replaced by numbers expressing the "strengths" of the interactions.
The Decision-The scenario of the process is not fully understood but it can be deduced that the decision, which of the two alternative pathways is to be chosen, depends on the activity of CII (26,27). It has been found that if CII is highly active, the infecting phage lysogenizes; otherwise it grows lytically. In cells in which CII is rapidly degraded, no CI is synthesized. If Cro binds to O R 3 before CI is bound to O R 1 and O R 2, the phage will not establish lysogeny. This never occurs if the CII protein is highly active. It has been shown that the rate of expression of cI depends on the activity of the P RE promotor, which is controlled by CII. The CIII protein also helps to establish lysogeny; its role is to protect CII from degradation (the halflife of CII is about 5 min in the presence of CIII but Ͻ1 min in the absence of CIII (26)). If CIII is absent, CII is virtually always inactivated, and the phage can grow only lytically. In the cells where protease levels are high, no CI is synthesized, the Cro protein is synthesized, and lytic growth ensues. Therefore the rate of synthesis of CI depends on the activity of CII, which depends on the environmental conditions. CII is the crucial element of the decision network and its activity determines which of the two pathways will be chosen.

Simulation of the Lytic/Lysogenic Decision
The elements of the decision circuit have been identified above. The "influence" matrix, which is just the formalization of the analysis of the decision network made above, can then be derived (Fig. 1b). The kinetic profiles of individual gene products were computed numerically from the set of differential equations defined in Equation 3 with the weight matrix from Fig. 1b. There, ϩ and Ϫ signs were replaced by values satisfying constraints derived in the previous paragraph. Constants were calculated as shown in Fig. 3.
The Role of CII-It has been found that the variable influencing the decision to which side the "switch" will flip is the stability of CII. The simulation of the two principal situations (low and high stability of CII) is shown in Fig. 3. Results of simulations shown in Fig. 3, a and b, revealed that when the CII half-life is short, practically immediately after the network initiation Cro dominates the field and represses expression of all other members of the network. Fig. 3b shows the time course of the changes in number of molecules of N, CII, and CI transcribed from P RE and P RM . First of all, N is synthesized followed by CII, which initiates transcription of cI from P RE , but CI never reaches the amount necessary to activate the self-inducing loop of expression of cI from P RM . 1 Fig. 3, c and d, shows the simulation of the situation when CII is well protected from cleavage. CI from P RE reaches a sufficiently high value to start up the self-inducing expression from P RM , which immediately starts to block the synthesis of N, CII, and Cro and consequently of its own transcription from P RE . The amount of cI product transcribed from P RM is high enough to maintain the self-inducing loop. CI reaches a concentration that maintains its dominance, represses all other members of the network, and establishes lysogeny.
Dependence of the Lytic/Lysogenic Decision on Multiplicity of Infection-It has been observed that probability of lysogeny increases markedly with increasing MOI 2 up to an MOI of ϳ7 (28). As the simulation reveals (Fig. 4), at low MOI both CII and CIII are synthesized at low levels, and the CII concentration is too low to activate P RE and initiate transcription of cI from P RM . Concentration of Cro is therefore high enough to establish lysis. With increasing MOI, the concentration of CII increases until it reaches an amount necessary to initiate the self-inducing production of CI. Any further increase of MOI does not substantially change the concentration of CII. The peak concentration of CII is maintained by the balance between its production and the concentration of the CI repressor, which, after establishing the lysogenic path, blocks the synthesis of CII.
The infection of a population of cells with an average MOI produces a distribution of MOI across the population. Because the concentration of CII markedly depends on MOI, the result is a distribution of lytic/lysogeny outcomes, which has been observed.
Effect of Ultra Violet Irradiation-Ultraviolet light stimulates proteases that cleave CI, breaking the P RM feedback loop, which maintains the transcription of cI (29). In the prophage stage, promoters P R and P L are no longer repressed by CI and cro, other lytic genes can be expressed, and phage replication and lysis are initiated. This situation is principally analogous to the simulation shown in Fig. 3a.
Sensitivity of Lytic/Lysogenic Switch-The question remains: how sensitive is the switch? In other words, does there exist a transition region of CII half-life where the decision is uncertain? To investigate this question, a set of simulations was run, in which the value of the half-life of CII was gradually increased, and the terminal state of the network was checked until it switched from lysis to lysogeny. It was found that there exists a point at which an almost infinitesimal change of the half-life of CII caused inversion of the terminal state of the 1 The curve of the cI product transcribed from P RM remains close to zero. Nevertheless, detailed inspection of the curve shows that very little CI from P RM is being synthesized. 2 The abbreviation used is: MOI, multiplicity of infection.

FIG. 2. Simplified version of the phage decision network that determines whether an infected E. coli cell follows the lytic or lysogenic pathway.
Dashed arrows indicate the direction of transcription, and bold arrows indicate regulatory interactions between a gene product and particular DNA region.
network. This is illustrated in Fig. 5, in which the simulated time courses of the members of the decision network are plotted for different increments of the CII half-life. It can be seen that there exists a point of inversion at which a change in less than a few molecules of CII causes a change in the terminal state of the network (lytic or lysogenic).
The parameters and weight matrix of the model were chosen to give values in the range of the experimentally observed one. To test whether the switch works the same way using other combinations of parameters, the behavior of ϳ100 networks with randomly generated parameter values was simulated. The range of parameter values was constrained only by the known and observed limits, which were mentioned above. The "switch point" moved to different positions, and the amplitude of time courses of components of the network was different; but in all cases the principle of the bifunctional switch remained the same. The transition region was always almost infinitesimally narrow. Such behavior has never been reported before. This is of course not a mathematical proof that the system always relaxes to this terminal state, but it at least supports the notion that this is the natural behavior of the phage switch.
A similar phenomenon is exhibited by the dependence of the lysis/lysogeny decision on MOI. At low MOI the concentration of CII is never high enough to establish lysogeny. With increasing MOI the concentration of CII increases as well, up to a point when it is high enough to initiate transcription of cI from P RM . This flips the switch and establishes lysogeny. A further increase of MOI does not substantially increase the concentration of CII, which is already sufficient to start up the lysogenic path.
Values of the weight matrix in all simulations were set to follow constraints found during qualitative analysis of the behavior of the system to give outputs in a range roughly corresponding to the experimentally observed ones. Biochemically correct values can only be obtained from experimental time series, which are not presently available.

DISCUSSION
In a single cell all biochemical reactions occur on a nanomolar scale, i.e. with countable amounts of molecules. The direction the reaction pathway takes depends on the amounts of reagents at a particular time in a particular place. The fluctuations of these amounts are stochastic, and the output of a regulatory process can be stochastic as well. This topic has been analyzed thoroughly in the work of McAdams and Arkin (14,16). In principle, such a stochastic nature of cell molecular processes can lead to instability. Any instability is potentially dangerous because of the resonance effect the instability can propagate in time and finally can cause breakup of the system. It implies that in the evolutionary process, the regulatory systems in the cell evolved such that they were capable of eliminating such stochastic behavior and thus inherited instability. Very often the stability of natural processes is maintained by redundancy. The phage decision network comprises redundancy as well. The self-regulatory circuit of cII and cro, which by binding to the O R 1-3 region controls its own expression, is an example. From a simplistic point of view it would be sufficient to control the amount of these compounds just by the balance between synthesis and degradation that under normal circumstances would maintain a stable terminal amount of both CI and Cro. The self-regulation is redundant and ensures that the stable terminal state will be reached in any case.
It would be possible to design a circuit, much simpler than the one, that would be capable of flipping a switch. The reason for the complexity of the decision network could be that it evolved to ensure maximal stability of the output. To ensure that the system will reach a terminal state, the range of values of the variable that determines the fate of the system (in this case CII half-life) must be very broad such that the probability that the value will fall into the region of uncertainty is as low as possible. This result is in perfect agreement with the predictions made from the model described here. The width of the sensitive region corresponds to an amount smaller than a single molecule of CII. Thus in all cases the terminal state of the network is predictable and leads to the two observed states, lysis or lysogeny. From this point of view, the model not only corresponds perfectly to the observed behavior but also predicts a situation that cannot be reached experimentally and offers an explanation coherent with the observed principles. The Boolean-like behavior of the phage genetic network was not expected, and there was no direct or indirect evidence available that a genetic network of this type can display such behavior. Modeling and simulation of the genetic network revealed the dynamics of the transition from one stage to another and showed how sensitive it is on the molecular level. Such results can not be achieved experimentally. In this sense mathematical modeling is a very powerful tool for the analysis of the behavior of regulatory systems in the cell.
In this paper the ability of a neural network model to simulate and predict the behavior of already quite complex system is presented. The model generates time series of amounts of components of the system. With progress in the development of techniques such as quantitative and functional proteomics (11,30), mainly mass spectroscopy-based (31,32) or DNA chip technology, the problem can be turned around, and the problem of identifying the network from experimental time series arises. This means the determination of the values of the weight matrix and other parameters from experimentally measured time series 3 (training of the network in the neural network terminology). The weight matrix can then be translated in terms of mutual control of the elements of the system. Values close to zero mean no control, and high positive or negative 3 It is possible to take advantage of the theory of recurrent neural networks, which suggests appropriate methods (e.g. Ref. 33 or 19). Such methods were designed for recurrent neural networks in general. The use of these approaches for the modeling of control of gene expression is discussed in Ref. 11 or 12. In reality, increasing MOI also increases the expression of cIII, which protects CII from degradation. Therefore the lysogenic path is reached for even lower MOI than in the case shown here. ones mean positive or negative control, respectively. The weight matrix directly determines the connections in the neural network. Together with other independent observations the neural network can be translated to a pathway. This reverse engineering can become a base for a very efficient method for reconstructing regulatory pathways from experimentally measured data.
The modeling approach has an advantage of any mathematical model; it can predict situations that cannot be reached experimentally. If quantitative data obtained from a well designed experiment allow a model to be constructed, the functionality of which is tested by comparison with experiments, the model can serve to investigate an infinite number of different situations such as those caused by mutagenesis, changes in activity of the elements of the system, etc. The model allows the dynamics of the process and the terminal states of the system to be investigated. It can bypass experimental limitations and thus explore biological situations currently restricted by experimental accessibility. It also brings deeper understanding of the nature of the whole process. The simulation capabilities are unlimited and can provide a check on the intuitive understanding of a process.
Neural networks are known as models of brain activity or tools capable of recognizing an input pattern. A neural network, out of a possible infinite number of states, tends to relax to a limited number of terminal configurations depending on the initial state of the variables forming the network. This resembles the principles of complex coordination of the functions within a cell. The cell is an extremely complex system and without tight control would reach an infinite number of states. In practice it reaches a very limited number of configurations that maintain the healthy state of the cell and its overall stability. Many control systems exist in a cell that eliminates any perturbations that would lead to cell instability and its subsequent breakdown. I am convinced, and in this paper I am trying to bring the evidence, that the neural networks are capable of simulating natural processes and can be used to model cell regulatory systems.