The mRNA sequence is thus used as a template to assemble, in order, the chain of amino acids that form a protein.
Figure 2: The amino acids specified by each mRNA codon. Multiple codons can code for the same amino acid. The codons are written 5' to 3', as they appear in the mRNA.
But where does translation take place within a cell? What individual substeps are a part of this process? And does translation differ between prokaryotes and eukaryotes?
The answers to questions such as these reveal a great deal about the essential similarities between all species. Within all cells, the translation machinery resides within a specialized organelle called the ribosome. In eukaryotes, mature mRNA molecules must leave the nucleus and travel to the cytoplasm, where the ribosomes are located. On the other hand, in prokaryotic organisms, ribosomes can attach to mRNA while it is still being transcribed.
In all types of cells, the ribosome is composed of two subunits: a large subunit and a small subunit (50S and 30S, respectively, in prokaryotes; 60S and 40S in eukaryotes). S, for svedberg unit, is a measure of sedimentation velocity and, therefore, mass. Each subunit exists separately in the cytoplasm, but the two join together on the mRNA molecule. The tRNA molecules are adaptor molecules: they have one end that can read the triplet code in the mRNA through complementary base-pairing, and another end that attaches to a specific amino acid (Chapeville et al.).
The idea that tRNA was an adaptor molecule was first proposed by Francis Crick, co-discoverer of DNA structure, who did much of the key work in deciphering the genetic code (Crick). The rRNA catalyzes the attachment of each new amino acid to the growing chain.
Interestingly, not all regions of an mRNA molecule correspond to particular amino acids. In particular, there is an area near the 5' end of the molecule that is known as the untranslated region (UTR) or leader sequence. This portion of mRNA is located between the first nucleotide that is transcribed and the start codon (AUG) of the coding region, and it does not affect the sequence of amino acids in a protein (Figure 3).
So, what is the purpose of the UTR? It turns out that the leader sequence is important because it contains a ribosome-binding site. In bacteria, this site is known as the Shine-Dalgarno sequence. A similar site in vertebrates was characterized by Marilyn Kozak and is thus known as the Kozak box.
If the leader is long, it may contain regulatory sequences, including binding sites for proteins, that can affect the stability of the mRNA or the efficiency of its translation.
Figure 4: The translation initiation complex. When translation begins, the small subunit of the ribosome and an initiator tRNA molecule assemble on the mRNA transcript. The small subunit of the ribosome has three binding sites: an amino acid site (A), a polypeptide site (P), and an exit site (E). Here, the initiator tRNA molecule is shown binding after the small ribosomal subunit has assembled on the mRNA; the order in which this occurs is unique to prokaryotic cells. In eukaryotes, the free initiator tRNA first binds the small ribosomal subunit to form a complex.
Although methionine (Met) is the first amino acid incorporated into any new protein, it is not always the first amino acid in mature proteins: in many proteins, methionine is removed after translation. In fact, if a large number of proteins are sequenced and compared with their known gene sequences, methionine (or formylmethionine) occurs at the N-terminus of all of them. However, not all amino acids are equally likely to occur second in the chain, and the second amino acid influences whether the initial methionine is enzymatically removed.
For example, many proteins begin with methionine followed by alanine. In both prokaryotes and eukaryotes, these proteins have the methionine removed, so that alanine becomes the N-terminal amino acid (Table 1). However, if the second amino acid is lysine, which is also frequently the case, methionine is not removed (at least in the sample proteins that have been studied thus far).
These proteins therefore begin with methionine followed by lysine (Flinta et al.). Table 1 shows the N-terminal sequences of proteins in prokaryotes and eukaryotes, based on a sample of prokaryotic and eukaryotic proteins (Flinta et al.). In the table, M represents methionine, A represents alanine, K represents lysine, S represents serine, and T represents threonine.
Once the initiation complex is formed on the mRNA, the large ribosomal subunit binds to this complex, which causes the release of IFs (initiation factors).
The large subunit of the ribosome has three sites at which tRNA molecules can bind. The A (amino acid) site is the location at which the aminoacyl-tRNA anticodon base pairs with the mRNA codon, ensuring that the correct amino acid is added to the growing polypeptide chain.
The P (polypeptide) site is the location at which the amino acid is transferred from its tRNA to the growing polypeptide chain.
ODE models provide an easier framework for analysis, but do not naturally incorporate certain features such as strict exclusion. They can be analyzed much more easily than other models, especially when additional regulatory complexity is present, and this advantage becomes more pronounced as the model size increases.
In addition, different tools from control engineering can be brought to bear here, something which is relevant in synthetic biology. Moreover, depending on the question under investigation, either fine-grained (perhaps locally) or coarse-grained models may be employed [50, 51], and it is therefore important to be able to systematically fine-grain and coarse-grain models.
Relatively coarse-grained models have been shown to be useful, successfully making predictions in multiple contexts. The multiple-model methodology may be useful here as well. For instance, certain coarse-grained ribosome flow models can be cast and analyzed as probabilistic Boolean models, and their stationary distributions determined exactly by numerical computation.
This can be combined with models which incorporate more detailed resolution, which may be analyzed by simulation. The above points highlight the tradeoff between the complexity of the model and the effectiveness in analyzing it.
A basic question in systems biology is what role intrinsic factors and parameters play, and how they combine with extrinsic factors, in regulating protein synthesis. One way to approach this is to employ suitable representations of the protein synthesis process and analyze them in silico. In synthetic biology it is desired to build robustly functioning circuits to meet particular objectives [10, 65, 66]. Detailed TASEP-type models (possibly with significant extensions, incorporating finite pools of ribosomes along with other factors) are analyzed primarily by simulation.
A probabilistic Boolean network (PBN) type model (possibly coarse-grained) can offer a simplified middle-ground model: it incorporates some of the essential features of translation, is stochastic, and can be used to perform multiparametric sensitivity analysis.
This sensitivity can be determined directly mathematically once the stationary state is computed, and needs only matrix-vector product computations. The results of such analysis can be used in conjunction with those of ODE models and detailed stochastic simulations. In general, using multiple methodologies in conjunction on a specific problem allows us to understand more clearly how different assumptions in the model, including those implicit in the modelling methodology, affect the conclusions and predictions.
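To illustrate how a stationary distribution can be obtained with nothing more than matrix-vector products, here is a minimal sketch for a deliberately tiny case: an mRNA with a single codon, so the state is either free or occupied. The rates alpha and beta, the uniform choice between two events, and all function names are assumptions made for this illustration; they are not taken from the models discussed here.

```python
def transition_matrix(alpha, beta):
    """Column-stochastic matrix P for a hypothetical one-codon mRNA.

    State 0 = codon free, state 1 = codon occupied. Each step picks entry
    or exit with probability 1/2 and then fires it with its rate
    (alpha or beta), so P[i][j] is the probability of moving from j to i.
    """
    a, b = alpha / 2.0, beta / 2.0
    return [[1.0 - a, b],
            [a, 1.0 - b]]


def mat_vec(P, v):
    """Plain matrix-vector product."""
    return [sum(p * x for p, x in zip(row, v)) for row in P]


def stationary(P, tol=1e-12, max_iter=100000):
    """Power iteration: repeat v <- P v until v stops changing."""
    v = [1.0 / len(P)] * len(P)
    for _ in range(max_iter):
        w = mat_vec(P, v)
        if max(abs(x - y) for x, y in zip(w, v)) < tol:
            return w
        v = w
    return v


pi = stationary(transition_matrix(alpha=0.3, beta=0.6))
occupancy = pi[1]  # long-run probability that the codon is occupied
```

For this toy case the result can be checked by hand: balancing the flow between the two states gives an occupancy of alpha / (alpha + beta). Larger state spaces would use the same iteration with a larger matrix.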
This, in turn, would allow a tighter set of conclusions to be drawn, and the effects of stochasticity, crowding and their interplay with regulatory complexity to be systematically elucidated with an effective use of available tools.
This approach allows for predictions and extrapolations to be made with greater confidence. It may be anticipated that in some situations a hybrid modelling approach can be useful: to model the mRNA translation process for a specific problem, those parts that are not directly related to the problem can be modelled with relatively simple descriptions, while the parts which are the focus of interest are modelled in more detail.
For example, to understand the autoregulation mechanism considered, the elongation and termination stages can be modelled with the TASEP assumptions (stochastic event rates) or simplifications thereof, while the initiation stage can be modelled in much more detail (biochemical reactions). This is equally relevant to understanding the natural coupling of translation with other processes.
Finally, it is important to be able to systematically and appropriately coarse-grain models of translation. The use of multiple models in conjunction would be very helpful in making the transition from the individual process to the systems description.
Translation is a basic genetic process which is widespread, and controlled in a multitude of ways in cells. Further, the advent of synthetic biology suggests that there will be additional ways in which this basic process is artificially regulated and manipulated [12, 67].
The characteristic of translation is that it has a basic process (ribosome movement on mRNA) upon which are overlaid various additional regulatory and other complexities. Examples include regulatory mechanisms at initiation [5, 20, 36, 68, 69] and termination [70-73], nonsense-mediated decay [39, 74], the regulation of elongation steps by tRNA and the detailed mechanochemical steps involved in ribosomal movement [22, 40, 75, 76], and feedback [77].
Many of these aspects are being actively investigated experimentally. It is clear that modelling and computational frameworks need to be deployed in a systematic way to investigate many of these issues and to elucidate other issues such as the role of stochasticity in translation and its contribution to phenotypic noise. Existing models of translation already span a broad spectrum, from the single ODE model to the detailed computational model of translation incorporating the effects of many factors [28].
The models we have examined and analyzed exhibit an intermediate level of complexity, but are codon-based. These models are built on different assumptions about the mRNA translation process, which makes it important to clearly recognize the underlying assumptions and to select the right model(s) for specific problems. The different insights brought by different models also make the multiple-model methodology and hybrid modelling approaches desirable choices for modelling and understanding the mRNA translation process.
The multiple model methodology allows us to obtain a handle on process complexity on one hand and combine it with effective tools of analysis on the other. This is of relevance to both systems and synthetic biology. The understanding of translation and protein synthesis, and regulatory mechanisms therein, is an important theme in systems biology.
Multiple data-driven models have been proposed for mRNA translation, where one is usually satisfied as long as the model matches the available experimental data and possibly makes a few additional predictions successfully.
However, it is often the case that arbitrarily many models can be defined for the same data set and perform the same task, and, further, the applicability and limitations of the models are not systematically assessed. This means that the extent to which the models developed can be further employed is not clear. Nor is it clear how such models, describing different facets of the system, actually fit together in effectively describing the full system.
This calls for a careful investigation of the modelling methodologies and highlights the need for a systematic modelling approach involving multiple models and levels of description. In addition, the mRNA translation process is regulated at multiple levels and is linked to other parts of the cellular system.
Therefore, a detailed understanding of this process will require system-level models such as those discussed in this work. A key aspect of synthetic biology is the precise control of gene expression and protein synthesis, and translation is an emerging area of focus.
Synthetic biology is now engineering riboswitches, ribozymes, small RNAs [78, 79], and other possible regulatory molecules to regulate protein synthesis, suggesting that sophisticated dynamic regulation of protein synthesis may be possible in the future.
Thus far, the design has been done in a somewhat ad hoc and case-by-case manner, focusing on individual bio-blocks while lacking a system-level understanding of the whole process. However, the mRNA translation process is closely regulated at multiple levels and is also subject to noise; further, synthetic circuits may be subject to extraneous interactions in the host cell(s).
Therefore, a system-level understanding of the translation process itself, and of the different levels of regulation, is vital [14, 15]. In addition, the models used for understanding, design and control purposes should be at an appropriate level of complexity, maintaining a balance between model complexity and the ability to analyse it (note that ODE models benefit from additional tools of control engineering), while making it possible to systematically account for and predict the effects of inherent regulatory effects and stochasticity.
The modelling methodology comparison and analysis tools presented in this work provide powerful tools for this purpose, providing a useful foundation for synthetic biology. Different models and different formalisms have been used in specific contexts to elucidate different aspects of translation in systems biology and design circuits in synthetic biology, and different levels of coarse and fine graining have been performed, all on a more-or-less ad hoc basis.
Since in many cases the models describe different facets of the same system, it is important to have a more unified and systematic framework which allows for a genuine systems understanding of the translation process as well as reliable simplifications thereof. The approaches outlined above, possibly combined with tools such as Bayesian inference, will allow for reliable and systematic frameworks. Such frameworks can effectively distill the intrinsic complexity of translation and its interaction with, and control by, extrinsic factors. They can also be used with greater confidence for predictive purposes, as tools to complement experimental investigations, and for systems-level descriptions.
All these aspects provide substantial new challenges for modelling and computation of this basic genetic process, which itself combines different scales and levels of complexity.
In this section, we discuss different aspects of the models and simulation algorithms we employ.
We discuss in turn (i) some ODE models, (ii) simulation algorithms and variants for stochastic simulation of translation, and (iii) the formulation of Boolean rules to describe different events in translation. The variation of the concentration of those mRNA whose i-th codon is occupied by the head of a ribosome is determined by the following equations. The fluxes c_i can be determined as follows. The ribosome can exit whenever its head is at the last codon of the mRNA. These fluxes can be generally written as.
Without making further assumptions, W_i cannot be determined. Then the steady state solution is determined by the following equations. The relation between the elongation, termination and initiation rates and the translation rate is given by.
In this subsection, we briefly discuss simulation algorithms and their variants for simulating the basic translation process. All the following simulation algorithms are based on the rate law in (1) and the random-sequential update rule.
With this update rule, no particular update order is predetermined: at each time step, the update event is chosen randomly with equal probabilities. For the sake of exposition, in what follows we assume that all the event rates are no more than one and can thus be interpreted as probabilities.
We now discuss the algorithms and their variants. We begin with what may be regarded as a conventional algorithm [44].
Conventional algorithm. The definition of the random-sequential update rule naturally leads to Algorithm 4, which clearly does not assume any particular update order. This algorithm picks one candidate update event from the set of events with equal probability, checks whether this event is possible in the current state, and, if it is, carries it out with a probability commensurate with the event rate. This is described in more detail below.
An alternative algorithm equivalent to Algorithm 4. Although Algorithm 4 is probably the most widely used algorithm in the literature, its structure does not readily allow modifications.
We therefore consider a variant, Algorithm 5, as the basis for further discussion. The two algorithms differ in their structure: Algorithm 4 first selects the update event with equal probability and then carries it out according to its rate, while Algorithm 5 combines these two steps by making the selection of the next update event depend directly on the event rates.
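The structural difference between the two schemes can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of Algorithms 4 and 5: each ribosome is taken to cover a single codon, the state is a list of 0/1 occupancies, `rates` holds one rate per event (entry, n-1 hops, exit), and every function name is hypothetical.

```python
import random


def is_possible(state, j):
    """Event 0 = entry, event n = exit, event j = hop from codon j-1 to j."""
    n = len(state)
    if j == 0:
        return state[0] == 0
    if j == n:
        return state[n - 1] == 1
    return state[j - 1] == 1 and state[j] == 0


def apply_event(state, j):
    n = len(state)
    if j == 0:
        state[0] = 1
    elif j == n:
        state[n - 1] = 0
    else:
        state[j - 1], state[j] = 0, 1


def conventional_step(state, rates, rng):
    """Two stages: uniform choice of event, then a draw against its rate."""
    j = rng.randrange(len(rates))
    if is_possible(state, j) and rng.random() < rates[j]:
        apply_event(state, j)
        return j
    return None


def rate_weighted_step(state, rates, rng):
    """One stage: event j is selected with probability rates[j] / (n + 1)."""
    share = 1.0 / len(rates)
    u = rng.random()
    cumulative = 0.0
    for j, rate in enumerate(rates):
        cumulative += rate * share
        if u < cumulative:
            if is_possible(state, j):
                apply_event(state, j)
                return j
            return None
    return None  # leftover probability mass: no event selected this step


def run(step, n=10, steps=20000, seed=0):
    """Simulate and count entries and exits (illustrative rates)."""
    rng = random.Random(seed)
    state = [0] * n
    rates = [0.5] + [0.8] * (n - 1) + [0.5]
    entries = exits = 0
    for _ in range(steps):
        j = step(state, rates, rng)
        if j == 0:
            entries += 1
        elif j == n:
            exits += 1
    return state, entries, exits
```

In both functions a given event fires in one step with probability rates[j] / (n + 1) whenever the state allows it, which is why the two structures produce the same statistics. Calling `run(conventional_step)` and `run(rate_weighted_step)` gives two independent realisations that can be compared.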
As in Algorithm 4, Algorithm 5 does not predefine any particular update order, and it is therefore still random-sequentially updated. With the above discussion it is readily seen that Algorithms 4 and 5 are equivalent to each other.
The efficient fixed-time-step algorithm. This is an obvious source of inefficiency.
The more efficient algorithm is given in Algorithm 6.
The efficient varying-time-step algorithm. In Algorithms 5 and 6 the selected update event may not actually occur, as the state may not allow it. Notice that the sum of the probabilities of defining the next event index is always no more than one.
Any improvement of the algorithm efficiency has to be made by modifying the algorithm structure. This is done by switching the fixed time steps in Algorithms 5 and 6 to time-varying ones in Algorithm 1, as follows. I_e^x is entirely determined by the state x and is thus time-varying. Therefore, the new algorithm, given in Algorithm 1, still agrees with the rate law and the random-sequential update rule. Note that the time steps in Algorithm 1 vary with the current mRNA state.
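A varying-time-step scheme of this kind can be sketched as below. The listing of Algorithm 1 is not reproduced in this text, so the details are assumptions: ribosomes cover one codon, only events the current state allows are candidates, one of them is drawn in proportion to its rate, and the elapsed time is taken as the mean number of fixed steps a rate-weighted scheme would have needed in that state, (n + 1) divided by the sum of the enabled rates. Function and variable names are hypothetical.

```python
import random


def enabled_events(state, rates):
    """List (event, rate) pairs that the current state allows."""
    n = len(state)
    events = []
    if state[0] == 0:
        events.append((0, rates[0]))        # entry
    for j in range(1, n):
        if state[j - 1] == 1 and state[j] == 0:
            events.append((j, rates[j]))    # hop from codon j-1 to j
    if state[n - 1] == 1:
        events.append((n, rates[n]))        # exit
    return events


def varying_step(state, rates, rng):
    """Fire one enabled event; return (event, elapsed fixed-size steps)."""
    events = enabled_events(state, rates)
    total = sum(rate for _, rate in events)
    if total <= 0:
        raise ValueError("no enabled event has a positive rate")
    j = rng.choices([e for e, _ in events],
                    weights=[r for _, r in events])[0]
    n = len(state)
    if j == 0:
        state[0] = 1
    elif j == n:
        state[n - 1] = 0
    else:
        state[j - 1], state[j] = 0, 1
    # Assumed time step: mean waiting time, in fixed steps, until some
    # event fires in this state.
    return j, (n + 1) / total


rng = random.Random(1)
state = [0] * 10
rates = [0.5] + [0.8] * 9 + [0.5]           # illustrative values only
elapsed, exits = 0.0, 0
for _ in range(5000):
    j, dt = varying_step(state, rates, rng)
    elapsed += dt
    exits += (j == len(state))
translation_rate = exits / elapsed           # exits per fixed-size step
```

Every call changes the state, so no iteration is spent on an event that cannot occur; the cost is that the set of enabled events has to be rebuilt (or updated incrementally) after each move.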
The statistical equivalence of the algorithms. Although the update time intervals and update mechanisms differ, all the algorithms ensure that, within their individual update time interval, the probability of event occurrence is exactly as given in (1), and the update order is not predetermined (random-sequential).
Therefore, in the long run all these algorithms are equivalent in the statistical sense, so that statistical characteristics such as the translation rate and codon density are the same for all of them.
Only in Algorithm 1 does every time step definitely result in an actual update event.
The derivation of the PBN model is based on Algorithm 6 with the random-sequential update rule. As shown above, all the simulation algorithms with the same random-sequential update rule are statistically equivalent, and therefore the choice of the underlying algorithm does not limit the PBN model in any sense. The PBN model with the parallel update rule is ongoing work and will not be discussed here.
We first present some background and preliminary details on the PBN model and then discuss how the various events are represented in this setting. In order to derive the PBN model, one must first be able to express the mRNA state as the state of a Boolean network and then the update events as Boolean functions governing the dynamics of that network. Conceptually this can be readily done, as the mRNA codon state is indeed Boolean (a codon being covered or uncovered by a ribosome constitutes its two Boolean states).
However, historically no efficient tools for Boolean networks have been available, which may be a reason why this seemingly straightforward PBN model for mRNA translation has not been discussed before. A Boolean network describes the dynamic interactions of multiple Boolean nodes, where each node can be in one of only two possible states, thus giving 2^n different states for a network of n nodes.
A Boolean network with n nodes can be generally represented as follows. Boolean networks are typically analysed using truth tables. This approach to analysis is evidently not a powerful one, but it was the only one available for a long time before the matrix representation of Boolean networks, based on the so-called semi-tensor product, was introduced in recent years [82, 83].
The semi-tensor product is an extended matrix product. Its most interesting aspect in this context is that it allows Boolean networks to be represented in matrix form. That is, any Boolean function can be uniquely identified by, and equivalently treated with, its structure matrix [83]. This provides a succinct and systematic way of representing the network, which can be systematically augmented.
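As a small illustration of what a structure matrix is, the sketch below evaluates NOT and AND purely by matrix algebra. It uses the common convention of writing True and False as the unit column vectors (1, 0) and (0, 1); for two such column vectors the semi-tensor product reduces to the Kronecker product. The helper names are hypothetical, and the example is not taken from the derivation in this work.

```python
TRUE, FALSE = [1, 0], [0, 1]   # Boolean values as unit column vectors

# Structure matrices: column k is the output for the k-th input combination.
M_NOT = [[0, 1],
         [1, 0]]
M_AND = [[1, 0, 0, 0],
         [0, 1, 1, 1]]


def kron(x, y):
    """Kronecker product of two vectors (the semi-tensor product of two
    column vectors reduces to this)."""
    return [a * b for a in x for b in y]


def mat_vec(M, v):
    """Plain matrix-vector product."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]


def to_vec(b):
    return TRUE if b else FALSE


def logical_not(a):
    """Evaluate NOT a as M_NOT times the vector form of a."""
    return mat_vec(M_NOT, to_vec(a)) == TRUE


def logical_and(a, b):
    """Evaluate a AND b as M_AND times (a kron b)."""
    return mat_vec(M_AND, kron(to_vec(a), to_vec(b))) == TRUE
```

Because every Boolean function has such a matrix, composing functions becomes a matter of multiplying matrices, which is what makes a whole network expressible as a single linear map on the 2^n-dimensional state vector.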
Additionally, this representation provides new tools for analysis of attractors of deterministic analogues of such networks. That is, a Boolean network based on the logical rules is equivalent to a linear system in (18) and is completely described by the structure matrix L. Finally, a Boolean network becomes a probabilistic one when its update rules are selected at random according to given probabilities.
In this subsection, we discuss how the various events in the translation process may be described in Boolean terms.
The three types of update events in Algorithm 6 can be described as Boolean functions associated with the Boolean network consisting of the mRNA state x. As mentioned earlier, it is enough to formally describe the update events as Boolean functions; the remaining calculations follow within the semi-tensor product framework.
Therefore in what follows we focus only on the Boolean expression of the update events but not their further calculations within the semi-tensor product framework. Then the following discussions are for the mRNA states in only.
Entry: A new ribosome may attach to the leftmost end of the mRNA if and only if the first r codons are free. This may be succinctly encoded in Boolean terms. Therefore, this Boolean function agrees with the dynamics of the entry event e_0 in Algorithm 6.
Exit: A ribosome dissociates from the rightmost end of the mRNA if and only if the last r codons are occupied. Therefore, this Boolean function agrees with the dynamics of the exit event e_n in Algorithm 6.
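The entry and exit conditions stated above translate directly into Boolean expressions over the codon state. The sketch below represents the mRNA as a list of booleans (True = codon covered) and a ribosome footprint of r codons. The hopping rule is not spelled out in this text, so `hop_possible` and `apply_hop` are an assumption (a head at codon j advances iff codon j+1 is free); all function names are hypothetical.

```python
def entry_possible(x, r):
    """NOT x[0] AND ... AND NOT x[r-1]: the first r codons are free."""
    return all(not covered for covered in x[:r])


def exit_possible(x, r):
    """x[n-r] AND ... AND x[n-1]: the last r codons are covered."""
    return all(x[-r:])


def hop_possible(x, j):
    """Assumed rule: x[j] AND NOT x[j+1], i.e. codon j is the head of a
    ribosome and the next codon is free."""
    return x[j] and not x[j + 1]


def apply_entry(x, r):
    """A new ribosome covers the first r codons."""
    return [True] * r + x[r:]


def apply_exit(x, r):
    """The ribosome at the 3' end releases the last r codons."""
    return x[:-r] + [False] * r


def apply_hop(x, j, r):
    """Move the ribosome whose head is at codon j one codon forward."""
    y = list(x)
    y[j + 1] = True        # head moves onto codon j+1
    y[j - r + 1] = False   # tail leaves codon j-r+1
    return y
```

For example, with six codons and r = 2, `apply_entry([False] * 6, 2)` covers the first two codons, after which `entry_possible` is False until the ribosome has hopped twice.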
Hopping: The corresponding Boolean function likewise agrees with the dynamics of the hopping event e_j in Algorithm 6.
Translation is the process by which a protein is synthesized from the information contained in a molecule of messenger RNA (mRNA).
During translation, an mRNA sequence is read using the genetic code, which is a set of rules that defines how an mRNA sequence is to be translated into the 20-letter code of amino acids, which are the building blocks of proteins. The genetic code is a set of three-letter combinations of nucleotides called codons, each of which corresponds with a specific amino acid or stop signal.
Translation occurs in a structure called the ribosome, which is a factory for the synthesis of proteins.
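The codon-by-codon reading described above can be sketched in a few lines. The table below is only a small excerpt of the standard genetic code (a full table has 64 entries), and starting at the first AUG is a simplification that ignores the ribosome-binding context; the function name is hypothetical.

```python
# Excerpt of the standard genetic code, codons written 5' to 3'.
CODON_TABLE = {
    "AUG": "Met",
    "GCU": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "AAA": "Lys", "AAG": "Lys",
    "UUU": "Phe", "UUC": "Phe",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def translate(mrna):
    """Read codons from the first AUG until a stop codon.

    Codons missing from the excerpt above are reported as "?".
    """
    start = mrna.find("AUG")
    if start == -1:
        return []
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE.get(mrna[i:i + 3], "?")
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


peptide = translate("GGAUGGCUAAAUUUUAAGG")
# peptide == ["Met", "Ala", "Lys", "Phe"]
```

Note that GCU and the other three GC-codons all map to alanine, which is the redundancy referred to when several codons specify the same amino acid.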