Results 1 - 20 of 520
1.
Biosystems ; 243: 105273, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39033972

ABSTRACT

TSPO protein is known to be involved in various cellular functions, and dysregulation of TSPO expression has been found to be associated with the pathology of various human diseases, including cardiovascular disease, cancer, and neuroinflammatory, neurodegenerative and neoplastic disorders. However, there are limited studies in the literature on the effects of sequence variations in the TSPO gene on the function of the protein and their relationship with human diseases. Evaluating the pathogenicity of genetic variants is crucial for prioritizing their functional importance and clinical use. Therefore, various in-silico prediction tools have been developed that combine different algorithms to predict the effects of sequence variations on protein function or gene regulation. In this study, the p-adic distance approach to modeling the genetic code, proposed and developed by Dragovich and Dragovich, is discussed as an alternative to the existing in-silico prediction tools. The Dragovichs' approach is as follows: a 5-adic space of codons is constructed, and 5-adic and 2-adic distances between codons are taken into account. As a result, the two codons separated by the smallest 5-adic and 2-adic distances code for the same amino acid or stop signal. This model describes the degeneracy of the genetic code well. This study combined the data obtained from in-silico prediction tools and used a bioinformatics approach to determine the functional relevance of coding SNPs in TSPO. Overall, we evaluate the potential utility of the Dragovichs' approach by comparing it with other existing prediction tools for variant classification and prioritization.
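The distance computation summarized in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the digit assignment C=1, A=2, U=3, G=4 and the convention that the first nucleotide is the lowest-order 5-adic digit are assumptions drawn from the usual presentation of the Dragovich model, and all function names are hypothetical.

```python
from fractions import Fraction

# Assumed digit assignment (not stated in the abstract): C=1, A=2, U=3, G=4.
DIGIT = {"C": 1, "A": 2, "U": 3, "G": 4}


def codon_value(codon):
    """Read a codon as a 3-digit base-5 integer, first nucleotide lowest order."""
    return sum(DIGIT[base] * 5**i for i, base in enumerate(codon))


def p_adic_distance(x, y, p):
    """Return |x - y|_p = p**(-v), where p**v is the largest power of p dividing x - y."""
    diff = abs(x - y)
    if diff == 0:
        return Fraction(0)
    v = 0
    while diff % p == 0:
        diff //= p
        v += 1
    return Fraction(1, p**v)


def codon_distances(c1, c2):
    """Return the (5-adic, 2-adic) distances between two codons."""
    a, b = codon_value(c1), codon_value(c2)
    return p_adic_distance(a, b, 5), p_adic_distance(a, b, 2)
```

Under these assumptions, UUC and UUU share their first two nucleotides, so their 5-adic distance is 1/25, and their third digits differ by 2, so their 2-adic distance is 1/2. UUC and UUA are equally close 5-adically but lie at 2-adic distance 1.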


Subject(s)
Receptors, GABA , Receptors, GABA/genetics , Receptors, GABA/metabolism , Humans , Algorithms , Codon/genetics , Computational Biology/methods , Computer Simulation , Genetic Code/genetics , Models, Genetic
2.
Biosystems ; 243: 105263, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38971553

ABSTRACT

In this work we present an analysis of the dinucleotide occurrences in the three codon sites 1-2, 2-3 and 1-3, based on a computation of the codon usage of three large sets of bacterial, archaeal and eukaryotic genes using the same method that identified a maximal C3 self-complementary trinucleotide circular code X in genes of bacteria and eukaryotes in 1996 (Arquès and Michel, 1996). Surprisingly, two dinucleotide circular codes are identified in the codon sites 1-2 and 2-3. Furthermore, these two codes are shifted versions of each other. Moreover, the dinucleotide code in the codon site 1-3 is circular, self-complementary and contained in the projection of X onto the 1st and 3rd bases, i.e. the set obtained by cutting the middle base from each codon of X. We prove several results showing that the circularity and the self-complementarity of trinucleotide codes are induced by the circularity and the self-complementarity of their dinucleotide cut codes. Finally, we present several evolutionary approaches for an emergence of trinucleotide codes from dinucleotide codes.
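The "cut" operation and the self-complementarity property mentioned in this abstract can be illustrated with a short sketch. The 20 trinucleotides listed are the code X of Arquès and Michel (1996). The function names are hypothetical, and the acyclicity test is the graph criterion for circularity of dinucleotide sets as it is usually stated in the circular-code literature, taken here as an assumption rather than derived from this abstract.

```python
# The 20 trinucleotides of the circular code X (Arquès and Michel, 1996).
X = {"AAC", "AAT", "ACC", "ATC", "ATT", "CAG", "CTC", "CTG", "GAA", "GAC",
     "GAG", "GAT", "GCC", "GGC", "GGT", "GTA", "GTC", "GTT", "TAC", "TTC"}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word):
    """Complement every base and reverse the word."""
    return word.translate(COMPLEMENT)[::-1]


def is_self_complementary(code):
    """True if the code contains the reverse complement of each of its words."""
    return all(reverse_complement(w) in code for w in code)


def cut(code, i, j):
    """Project each trinucleotide onto codon sites i and j (1-based)."""
    return {w[i - 1] + w[j - 1] for w in code}


def has_no_cycle(dinucleotides):
    """Assumed circularity test: the graph with edges b1 -> b2 must be acyclic."""
    edges = {(w[0], w[1]) for w in dinucleotides}
    nodes = {b for edge in edges for b in edge}
    while nodes:
        # A sink has no outgoing edge to a node that is still present.
        sinks = {n for n in nodes
                 if not any(a == n and b in nodes for a, b in edges)}
        if not sinks:
            return False  # every remaining node lies on or leads into a cycle
        nodes -= sinks
    return True
```

For example, `cut(X, 1, 3)` yields nine dinucleotides, including the periodic words CC and GG, so the full projection fails the acyclicity test. This is consistent with the abstract's wording that the 1-3 dinucleotide code is contained in the projection rather than equal to it.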


Subject(s)
Genetic Code , Genetic Code/genetics , Codon/genetics , Evolution, Molecular , Codon Usage/genetics , Archaea/genetics , Nucleotides/genetics , Bacteria/genetics , Bacteria/classification , Models, Genetic , Eukaryota/genetics
3.
Bioessays ; 46(7): e2400058, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38724251

ABSTRACT

The genetic code is a set of instructions that determine how the information in our genetic material is translated into amino acids. In general, it is universal for all organisms, from viruses and bacteria to humans. However, in the last few decades, exceptions to this rule have been identified both in pro- and eukaryotes. In this review, we discuss the 16 described alternative eukaryotic nuclear genetic codes and observe theories of their appearance in evolution. We consider possible molecular mechanisms that allow codon reassignment. Most reassignments in nuclear genetic codes are observed for stop codons. Moreover, in several organisms, stop codons can simultaneously encode amino acids and serve as termination signals. In this case, the meaning of the codon is determined by the additional factors besides the triplets. A comprehensive review of various non-standard coding events in the nuclear genomes provides a new insight into the translation mechanism in eukaryotes.
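Stop-codon reassignment of the kind this review discusses can be made concrete with a small translation sketch. The two 64-letter strings follow the layout of NCBI translation tables 1 (standard) and 6 (ciliate nuclear, where TAA and TAG encode glutamine); the function and variable names are hypothetical, and the sketch ignores context-dependent codon meaning.

```python
BASES = "TCAG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]

# Amino acids in TCAG order; "*" marks a stop codon.
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CILIATE = "FFLLSSSSYYQQCC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def make_table(amino_acids):
    """Pair each of the 64 codons with its amino acid or stop signal."""
    return dict(zip(CODONS, amino_acids))


def translate(dna, table):
    """Translate codon by codon from position 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = table[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)
```

With these tables, the sequence ATGTAACAGTGA translates to "M" under the standard code, because TAA terminates translation, and to "MQQ" under the ciliate code, where only TGA acts as a stop.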


Subject(s)
Genetic Code , Protein Biosynthesis , RNA, Messenger , Genetic Code/genetics , Humans , RNA, Messenger/genetics , RNA, Messenger/metabolism , Protein Biosynthesis/genetics , Animals , Codon, Terminator/genetics , Cell Nucleus/genetics , Evolution, Molecular , Codon/genetics , Eukaryota/genetics
4.
Biosystems ; 240: 105230, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38740125

ABSTRACT

This is a brief review on modeling genetic codes with the aid of 2-adic dynamical systems. In this model amino acids are encoded by the attractors of such dynamical systems. Each genetic code is coupled to a special class of 2-adic dynamics. We consider discrete dynamical systems, that is, the iterations of a function F: Z_2 → Z_2, where Z_2 is the ring of 2-adic numbers (2-adic tree). A genetic code is characterized by the set of attractors of a function belonging to the code-generating functional class. The main mathematical problem is to reduce degeneration of the dynamic representation and select the optimal generating function. Here optimality can be treated in many ways. One possibility is to consider the Lipschitz functions, which play a crucial role in the general theory of iterations, and then minimize the Lipschitz constant. The main issue is to find the proper biological interpretation of code-functions. One can speculate that the evolution of the genetic codes can be described in an information space of nucleotide strings endowed with ultrametric (treelike) geometry. A code-function is a fitness function; the solutions of the genetic code optimization problem are attractors of the code-function. We illustrate this approach by generation of the standard nuclear and (vertebrate) mitochondrial genetic codes.
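The idea that attractors of an iterated map partition a finite set of states can be sketched on the residues modulo 2^6, a truncation of the 2-adic integers that happens to have 64 elements. This is only a toy under stated assumptions: the map x -> x^2 is a hypothetical example chosen because it is easy to analyse, it is not one of the code-generating functions of the review, and the function names are invented for illustration.

```python
def attractors(F, n_bits=6):
    """Group residues mod 2**n_bits by the cycle that iterating F leads them into."""
    modulus = 2**n_bits
    basins = {}
    for start in range(modulus):
        seen = []
        x = start
        # Follow the orbit until a state repeats; the repeat marks the cycle.
        while x not in seen:
            seen.append(x)
            x = F(x) % modulus
        cycle = seen[seen.index(x):]
        # Rotate the cycle so its smallest element comes first (canonical key).
        k = cycle.index(min(cycle))
        key = tuple(cycle[k:] + cycle[:k])
        basins.setdefault(key, []).append(start)
    return basins


def square(x):
    """Hypothetical example map; compatible with reduction modulo powers of 2."""
    return x * x
```

For this example map, `attractors(square)` finds two fixed points, 0 and 1: every even residue is eventually sent to 0 and every odd residue to 1, so each basin holds 32 of the 64 states. In the review's model the analogous basins would group codons by the amino acid they encode.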


Subject(s)
Codon , Evolution, Molecular , Genetic Code , Models, Genetic , Genetic Code/genetics , Codon/genetics , Humans , Animals , Amino Acids/genetics , Amino Acids/metabolism , Algorithms
5.
Biosystems ; 239: 105217, 2024 May.
Article in English | MEDLINE | ID: mdl-38663520

ABSTRACT

I analyzed all the theories and models of the origin of the genetic code, and over the years, I have considered the main suggestions that could explain this origin. The conclusion of this analysis is that the coevolution theory of the origin of the genetic code is the theory that best captures the majority of observations concerning the organization of the genetic code. In other words, the biosynthetic relationships between amino acids would have heavily influenced the origin of the organization of the genetic code, as supported by the coevolution theory. Instead, the presence in the genetic code of physicochemical properties of amino acids, which have also been linked to the physicochemical properties of anticodons or codons or bases by stereochemical and physicochemical theories, would simply be the result of natural selection. More explicitly, I maintain that these correlations between codons, anticodons or bases and amino acids are in fact the result not of a real correlation between amino acids and codons, for example, but are only the effect of the intervention of natural selection. Specifically, in the genetic code table we expect, for example, that the most similar codons - that is, those that differ by only one base - will have more similar physicochemical properties. Therefore, the 64 codons of the genetic code table ordered in a certain way would also represent an ordering of some of their physicochemical properties. Now, a study aimed at clarifying which physicochemical property of amino acids has influenced the allocation of amino acids in the genetic code has established that the partition energy of amino acids has played a decisive role in this. Indeed, under some conditions, the genetic code was found to be approximately 98% optimized on its columns. In this same work, it was shown that this was most likely the result of the action of natural selection.
If natural selection had truly allocated the amino acids in the genetic code in such a way that similar amino acids also have similar codons - this, not through a mechanism of physicochemical interaction between, for example, codons and amino acids - then it might turn out that even different physicochemical properties of codons (or anticodons or bases) show some correlation with the physicochemical properties of amino acids, simply because the partition energy of amino acids is correlated with other physicochemical properties of amino acids. It is very likely that this would inevitably lead to a correlation between codons (or anticodons or bases) and amino acids. In other words, since the codons (anticodons or bases) are ordered in the genetic code, that is to say, some of their physicochemical properties should also be ordered by a similar order, and given that the amino acids would also appear to have been ordered in the genetic code by natural selection, then it should inevitably turn out that there is a correlation between, for example, the hydrophobicity of anticodons and that of amino acids. Instead, the intervention of natural selection in organizing the genetic code would appear to be highly compatible with the main mechanism of structuring the genetic code as supported by the coevolution theory. This would make the coevolution theory the only plausible explanation for the origin of the genetic code.


Subject(s)
Amino Acids , Codon , Evolution, Molecular , Genetic Code , Selection, Genetic , Genetic Code/genetics , Amino Acids/genetics , Amino Acids/chemistry , Codon/genetics , Models, Genetic , Anticodon/genetics , Humans , Animals
6.
Biosystems ; 239: 105215, 2024 May.
Article in English | MEDLINE | ID: mdl-38641199

ABSTRACT

A massive statistical analysis based on the autocorrelation function of the circular code X observed in genes is performed on the (eukaryotic) introns. Surprisingly, a circular code periodicity 0 modulo 3 is identified in 5 groups of introns: birds, ascomycetes, basidiomycetes, green algae and land plants. This circular code periodicity, which is a property of retrieving the reading frame in (protein coding) genes, may suggest that these introns have a coding property. In a well-known way, a periodicity 1 modulo 2 is observed in 6 groups of introns: amphibians, fishes, mammals, other animals, reptiles and apicomplexans. A mixed periodicity modulo 2 and 3 is found in the introns of insects. Astonishingly, a subperiodicity 3 modulo 6 is a common statistical property in these 3 classes of introns. When the particular trinucleotides N1N2N1 of the circular code X are not considered, the circular code periodicity 0 modulo 3, hidden by the periodicity 1 modulo 2, is now retrieved in 5 groups of introns: amphibians, fishes, other animals, reptiles and insects. Thus, 10 groups of introns, taxonomically different, out of 12 have a coding property related to the reading frame retrieval. The trinucleotides N1N2N1 are analysed in the 216 maximal C3 self-complementary trinucleotide circular codes. A hexanucleotide code (words of 6 letters) is proposed to explain the periodicity 3 modulo 6. It could be a trace of more general circular codes at the origin of the circular code X.


Subject(s)
Genetic Code , Introns , Introns/genetics , Animals , Genetic Code/genetics , Evolution, Molecular
7.
IEEE Trans Nanobioscience ; 23(3): 447-457, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38512749

ABSTRACT

In this paper, we propose a new coding scheme for DNA storage using low-density parity-check (LDPC) codes and interleaving techniques. While conventional coding schemes generally employ error correcting codes in both inter and intra-oligo directions, we show that inter-oligo LDPC codes, optimized by differential evolution, are sufficient in ensuring the reliability of DNA storage due to the powerful soft decoding of LDPC codes. In addition, we apply interleaving techniques for handling non-uniform error characteristics of DNA storage to enhance the decoding performance. Consequently, the proposed coding scheme reduces the required number of oligo reads for perfect recovery by 26.25% to 38.5% compared to existing state-of-the-art coding schemes. Moreover, we develop an analytical DNA channel model in terms of non-uniform binary symmetric channels. This mathematical model allows us to demonstrate the superiority of the proposed coding scheme while isolating the experimental variation, as well as confirm the independent effects of LDPC codes and interleaving techniques.
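Interleaving in general can be illustrated with a simple block interleaver, which writes symbols row by row and reads them column by column so that errors clustered in one region are spread across different rows (codewords). The abstract does not specify which interleaver the authors use, so this is a generic sketch with hypothetical function names rather than their scheme.

```python
def interleave(symbols, rows, cols):
    """Write symbols row-wise into a rows x cols grid and read them column-wise."""
    if len(symbols) != rows * cols:
        raise ValueError("length must equal rows * cols")
    return [symbols[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(symbols, rows, cols):
    """Invert interleave(): restore the original row-wise order."""
    if len(symbols) != rows * cols:
        raise ValueError("length must equal rows * cols")
    restored = [None] * (rows * cols)
    for c in range(cols):
        for r in range(rows):
            restored[r * cols + c] = symbols[c * rows + r]
    return restored
```

With 3 rows and 4 columns, any 3 consecutive positions in the interleaved stream come from 3 different rows, so a burst of that length touches each row at most once.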


Subject(s)
DNA , DNA/genetics , DNA/chemistry , Computers, Molecular , Sequence Analysis, DNA/methods , Algorithms , Genetic Code/genetics
8.
Biosystems ; 237: 105135, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38320621

ABSTRACT

The existing algebraic models of the genetic code contribute to the understanding of the physicochemical characteristics of the amino acids. However, the process of translating a gene into a phenotype is highly complex. Moreover, the intricacy of gene expression is further compounded by biases in codon usage. This paper explores an algebraic structure called module on the set of codons as well as on that of RNA sequences. We study the potential implications of these structures on gene expression and the GC content of an RNA sequence. The base order {C,U,G,A} appears to possess greater biological significance than many of the orders previously studied. We have developed a novel algorithm to generate RNA sequences with high GC content, aiming to enhance the thermostability of biomolecules. The insights gained from this investigation may have applications in biomolecular modeling and docking, protein engineering, drug development, and related fields.
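GC content, the quantity this abstract aims to raise, is straightforward to compute. The sketch below also shows one naive way to favour GC-rich codons among synonymous alternatives. It is not the module-based algorithm of the paper, whose details the abstract does not give, and the function names are hypothetical.

```python
def gc_content(rna):
    """Fraction of bases in the sequence that are G or C."""
    if not rna:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in rna) / len(rna)


def richest_codon(synonymous_codons):
    """Pick the synonymous codon with the highest GC content (first one on ties)."""
    return max(synonymous_codons, key=gc_content)


# The six leucine codons of the standard genetic code.
LEUCINE = ["UUA", "UUG", "CUU", "CUC", "CUA", "CUG"]
```

For leucine, CUC and CUG tie at a GC content of 2/3, and `richest_codon(LEUCINE)` returns CUC because it appears first in the list.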


Subject(s)
Genetic Code , Base Sequence , Base Composition , Genetic Code/genetics , Codon/genetics , Gene Expression
9.
Biosystems ; 237: 105133, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38336225

ABSTRACT

Life codes increase in both number and variety with biological complexity. Although our knowledge of codes is constantly expanding, the evolutionary progression of organic, neural, and cultural codes in response to selection pressure remains poorly understood. Greater clarification of the selective mechanisms is achieved by investigating how major evolutionary transitions reduce spatiotemporal and energetic constraints on transmitting heritable code to offspring. Evolution toward less constrained flows is integral to enduring flow architecture everywhere, in both engineered and natural flow systems. Beginning approximately 4 billion years ago, the most basic level for transmitting genetic material to offspring was initiated by protocell division. Evidence from ribosomes suggests that protocells transmitted comma-free or circular codes, preceding the evolution of standard genetic code. This rudimentary information flow within protocells is likely to have first emerged within the geo-energetic and geospatial constraints of hydrothermal vents. A broad-gauged hypothesis is that major evolutionary transitions overcame such constraints with tri-flow adaptations. The interconnected triple flows incorporated energy-converting, spatiotemporal, and code-based informational dynamics. Such tri-flow adaptations stacked sequence splicing code on top of protein-DNA recognition code in eukaryotes, prefiguring the transition to sexual reproduction. Sex overcame the spatiotemporal-energetic constraints of binary fission with further code stacking. Examples are tubulin code and transcription initiation code in vertebrates. In a later evolutionary transition, language reduced metabolic-spatiotemporal constraints on inheritance by stacking phonetic, phonological, and orthographic codes. In organisms that reproduce sexually, each major evolutionary transition is shown to be a tri-flow adaptation that adds new levels of code-based informational exchange. 
Evolving biological complexity is also shown to increase the nongenetic transmissibility of code.


Subject(s)
Eukaryota , Genetic Code , Animals , Genetic Code/genetics , Eukaryota/genetics , Vertebrates/genetics , Reproduction , Ribosomes , Evolution, Molecular
10.
Biosystems ; 237: 105159, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38373543

ABSTRACT

I support the hypothesis that the origin of the genetic code occurred simultaneously with the evolution of cellularity. That is to say, I favour the hypothesis that the origin of the genetic code is a very, very late event in the history of life on Earth. I corroborate this hypothesis with observations favouring the progenote's stage for the Last Universal Common Ancestor (LUCA), for the ancestor of bacteria and that of archaea. Indeed, these progenotic stages would imply that - at that time - the origin of the genetic code was still ongoing simply because this origin would fall within the very definition of progenote. Therefore, if the evolution of cellularity had truly been coeval with the origin of the genetic code - at least in its terminal part - then this would favour theories such as the coevolution theory of the origin of the genetic code because this theory would postulate that this origin must have occurred in extremely complex protocellular conditions and not concerning stereochemical or physicochemical interactions having to do with other stages of the origin of life. In this sense, the coevolution theory would be corroborated while the stereochemical and physicochemical theories would be damaged. Therefore, the origin of the genetic code would be linked to the origin of the cell and not to the origin of life as sometimes asserted. Therefore, I will discuss the late hypothesis of the origin of the genetic code in the context of the theories proposed to explain this origin and more generally of its implications for the early evolution of life.


Subject(s)
Evolution, Molecular , Genetic Code , Genetic Code/genetics , Bacteria/genetics , Archaea/genetics
11.
Nature ; 625(7995): 603-610, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38200312

ABSTRACT

The genetic code of living cells has been reprogrammed to enable the site-specific incorporation of hundreds of non-canonical amino acids into proteins, and the encoded synthesis of non-canonical polymers and macrocyclic peptides and depsipeptides [1-3]. Current methods for engineering orthogonal aminoacyl-tRNA synthetases to acylate new monomers, as required for the expansion and reprogramming of the genetic code, rely on translational readouts and therefore require the monomers to be ribosomal substrates [4-6]. Orthogonal synthetases cannot be evolved to acylate orthogonal tRNAs with non-canonical monomers (ncMs) that are poor ribosomal substrates, and ribosomes cannot be evolved to polymerize ncMs that cannot be acylated onto orthogonal tRNAs; this co-dependence creates an evolutionary deadlock that has essentially restricted the scope of translation in living cells to α-L-amino acids and closely related hydroxy acids. Here we break this deadlock by developing tRNA display, which enables direct, rapid and scalable selection for orthogonal synthetases that selectively acylate their cognate orthogonal tRNAs with ncMs in Escherichia coli, independent of whether the ncMs are ribosomal substrates. Using tRNA display, we directly select orthogonal synthetases that specifically acylate their cognate orthogonal tRNA with eight non-canonical amino acids and eight ncMs, including several β-amino acids, α,α-disubstituted-amino acids and β-hydroxy acids. We build on these advances to demonstrate the genetically encoded, site-specific cellular incorporation of β-amino acids and α,α-disubstituted amino acids into a protein, and thereby expand the chemical scope of the genetic code to new classes of monomers.


Subject(s)
Amino Acids , Amino Acyl-tRNA Synthetases , Escherichia coli , Genetic Code , RNA, Transfer , Acylation , Amino Acids/chemistry , Amino Acids/metabolism , Amino Acyl-tRNA Synthetases/chemistry , Amino Acyl-tRNA Synthetases/genetics , Amino Acyl-tRNA Synthetases/metabolism , Genetic Code/genetics , Hydroxy Acids/chemistry , Hydroxy Acids/metabolism , RNA, Transfer/chemistry , RNA, Transfer/genetics , RNA, Transfer/metabolism , Substrate Specificity , Ribosomes/metabolism , Escherichia coli/enzymology , Escherichia coli/genetics , Escherichia coli/metabolism
12.
Cell Rep Methods ; 3(11): 100626, 2023 Nov 20.
Article in English | MEDLINE | ID: mdl-37935196

ABSTRACT

Stop codon suppression using dedicated tRNA/aminoacyl-tRNA synthetase (aaRS) pairs allows for genetically encoded, site-specific incorporation of non-canonical amino acids (ncAAs) as chemical handles for protein labeling and modification. Here, we demonstrate that piggyBac-mediated genomic integration of archaeal pyrrolysine tRNA (tRNAPyl)/pyrrolysyl-tRNA synthetase (PylRS) or bacterial tRNA/aaRS pairs, using a modular plasmid design with multi-copy tRNA arrays, allows for homogeneous and efficient genetically encoded ncAA incorporation in diverse mammalian cell lines. We assess opportunities and limitations of using ncAAs for fluorescent labeling applications in stable cell lines. We explore suppression of ochre and opal stop codons and finally incorporate two distinct ncAAs with mutually orthogonal click chemistries for site-specific, dual-fluorophore labeling of a cell surface receptor on live mammalian cells.


Subject(s)
Amino Acyl-tRNA Synthetases , Genetic Code , Codon, Terminator/genetics , Genetic Code/genetics , RNA, Transfer/genetics , Amino Acids/genetics , Amino Acyl-tRNA Synthetases/genetics
14.
Biosystems ; 234: 105043, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37852409

ABSTRACT

The accumulated material in evolutionary biology, greatly enhanced by the achievements of modern synthetic biology, allows us to envision certain key hypothetical stages of prebiotic (chemical) evolution. This is often understood as the further evolution in the RNA World towards the RNA-protein World. It is a path towards the emergence of translation and the genetic code (I), signaling pathways with signaling molecules (II), and the appearance of RNA-based components of future gene regulatory networks (III). We believe that these evolutionary paths can be constructively viewed from the perspective of the concept of biological codes (Barbieri, 2003). Crucial evolutionary events in these directions would involve the emergence of RNA-based adaptors. Such adaptors connect two families of functionally and chemically distinct molecules into one functional entity. The emergence of primitive translation processes is undoubtedly the major milestone in the evolutionary path towards modern life. The key aspect here is the appearance of adaptors between amino acids and their cognate triplet codons. The initial steps are believed to involve the emergence of proto-transfer RNAs capable of self-aminoacylation. The second significant evolutionary breakthrough is the development of biochemical regulatory networks based on signaling molecules of the RNA World (ribonucleotides and their derivatives), as well as receptors and effectors (riboswitches) for these messengers. Some authors refer to this as the "lost language of the RNA World." The third evolutionary step is the emergence of signal sequences for ribozymes on the molecules of their RNA targets. This level of regulation in the RNA World is comparable to the gene regulatory networks of modern organisms. We believe that the signal sequences on target molecules have been rediscovered and developed by evolution into the gene regulatory networks of modern cells. 
In conclusion, the immense diversity of modern biological codes, in some of its key characteristics, can be traced back to the achievements of prebiotic evolution.


Subject(s)
RNA, Transfer , RNA , RNA/chemistry , RNA, Transfer/genetics , Genetic Code/genetics , Codon , Protein Sorting Signals/genetics , Evolution, Molecular
15.
Biosystems ; 233: 105016, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37659678

ABSTRACT

Organismal evolution displays complex dynamics in phase and scale which seem to trend towards increasing biocomplexity and diversity. For over a century, such amazing dynamics have been cleverly explained by the apparently straightforward mechanism of natural selection: all diversification, including speciation, results from the gradual accumulation of small beneficial or near-neutral alterations over long timescales. However, although this has been widely accepted, natural selection makes a crucial assumption that has not yet been validated. Specifically, the informational relationship between small microevolutionary alterations and large macroevolutionary changes in natural selection is unclear. To address the macroevolution-microevolution relationship, it is crucial to incorporate the concept of organic codes and particularly the "karyotype code" which defines macroevolutionary changes. This concept piece examines the karyotype from the perspective of two-phased evolution and four key components of information management. It offers insight into how the karyotype creates and preserves information that defines the scale and phase of macroevolution and, by extension, microevolution. We briefly describe the relationship between the karyotype code, the genetic code, and other organic codes in the context of generating evolutionary novelties in macroevolution and imposing constraints on them as biological routines in microevolution. Our analyses suggest that karyotype coding preserves many organic codes by providing system-level inheritance, and similar analyses are needed to classify and prioritize a large number of different organic codes based on the phases and scales of evolution. Finally, the importance of natural information self-creation is briefly discussed, leading to a call to integrate information and time into the relationship between matter and energy.


Subject(s)
Genetic Code , Inheritance Patterns , Genetic Code/genetics , Karyotype , Biological Evolution , Evolution, Molecular
16.
Biosystems ; 232: 105013, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37657747

ABSTRACT

Autonomy, meaning freedom from exogenous control, requires independence of both constitution and cybernetic regulation. Here, the necessity of biological codes to achieve both is explained, assuming that Aristotelian efficient cause is 'formal cause empowered by physical force'. Constitutive independence requires closure to efficient causation (in the Rosen sense); cybernetic independence requires transformation of cause-effect into signal-response relations at the organism boundary; the combination of both kinds of independence enables adaptation and evolution. Codes and cyphers translate information from one form of physical embodiment (domain) to another. Because information can only contribute as formal cause to efficient cause within the domain of its embodiment, translation can extend or restrict the range over which information is effective. Closure to efficient causation requires internalised information to be isolated from the cycle of efficient causes that it informs: e.g. Von Neumann self-replicator requires a (template) source of information that is causally isolated from the physical replication system. Life operationalises this isolation with the genetic code translating from the (isolated) domain of codons to that of protein interactions. Separately, cybernetic freedom is achieved at the cell boundary because transducers, which embody molecular coding, translate exogenous information into a domain where it no longer has the power of efficient cause. Information, not efficient cause, passes through the boundary to serve as stimulus for an internally generated response. Coding further extends freedom by enabling historically accumulated information to be selectively transformed into efficient cause under internal control, leaving it otherwise stored inactive. Code-based translation thus enables selective causal isolation, controlling the flow from cause to effect. 
Genetic code, cell-signalling codes and, in eukaryotes, the histone code, signal sequence based protein sorting and other code-dependent processes all regulate and separate causal chains. The existence of life can be seen as an expression of the power of molecular codes to selectively isolate and thereby organise causal relations among molecular interactions to form an organism.


Subject(s)
Cybernetics , Eukaryota , Causality , Eukaryota/genetics , Genetic Code/genetics , Histone Code
18.
Biosystems ; 229: 104906, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37196893

ABSTRACT

In this article, we introduce the new mathematical concept of circular mixed sets of words over an arbitrary finite alphabet. These circular mixed sets may not be codes in the classical sense and hence allow a higher amount of information to be encoded. After describing their basic properties, we generalize a recent graph theoretical approach for circularity and apply it to distinguish codes from sets (i.e. non-codes). Moreover, several methods are given to construct circular mixed sets. Finally, this approach allows us to propose a new evolution model of the present genetic code that could have evolved from a dinucleotide world to a trinucleotide world via circular mixed sets of dinucleotides and trinucleotides.


Subject(s)
Genetic Code , Models, Genetic , Genetic Code/genetics
19.
Nature ; 617(7960): 395-402, 2023 May.
Article in English | MEDLINE | ID: mdl-37046090

ABSTRACT

Translation is pervasive outside of canonical coding regions, occurring in long noncoding RNAs, canonical untranslated regions and introns [1-4], especially in ageing [4-6], neurodegeneration [5,7] and cancer [8-10]. Notably, the majority of tumour-specific antigens are results of noncoding translation [11-13]. Although the resulting polypeptides are often nonfunctional, translation of noncoding regions is nonetheless necessary for the birth of new coding sequences [14,15]. The mechanisms underlying the surveillance of translation in diverse noncoding regions and how escaped polypeptides evolve new functions remain unclear [10,16-19]. Functional polypeptides derived from annotated noncoding sequences often localize to membranes [20,21]. Here we integrate massively parallel analyses of more than 10,000 human genomic sequences and millions of random sequences with genome-wide CRISPR screens, accompanied by in-depth genetic and biochemical characterizations. Our results show that the intrinsic nucleotide bias in the noncoding genome and in the genetic code frequently results in polypeptides with a hydrophobic C-terminal tail, which is captured by the ribosome-associated BAG6 membrane protein triage complex for either proteasomal degradation or membrane targeting. By contrast, canonical proteins have evolved to deplete C-terminal hydrophobic residues. Our results reveal a fail-safe mechanism for the surveillance of unwanted translation from diverse noncoding regions and suggest a possible biochemical route for the preferential membrane localization of newly evolved proteins.


Subject(s)
Genetic Code , Protein Biosynthesis , Proteins , RNA, Long Noncoding , Ribosomes , Humans , Molecular Chaperones/metabolism , Peptides/chemistry , Peptides/genetics , Peptides/metabolism , Proteins/chemistry , Proteins/genetics , Proteins/metabolism , Ribosomes/metabolism , RNA, Long Noncoding/genetics , Protein Biosynthesis/genetics , Genome, Human , Genetic Code/genetics , Hydrophobic and Hydrophilic Interactions , Introns/genetics
20.
PLoS Comput Biol ; 19(4): e1011034, 2023 Apr.
Article in English | MEDLINE | ID: mdl-37068098

ABSTRACT

The genetic code refers to a rule that maps 64 codons to 20 amino acids. Nearly all organisms, with few exceptions, share the same genetic code, the standard genetic code (SGC). While it remains unclear why this universal code has arisen and been maintained during evolution, it may have been preserved under selection pressure. Theoretical studies comparing the SGC and numerically created hypothetical random genetic codes have suggested that the SGC has been subject to strong selection pressure for being robust against translation errors. However, these prior studies have searched for random genetic codes in only a small subspace of the possible code space due to limitations in computation time. Thus, how the genetic code has evolved, and the characteristics of the genetic code fitness landscape, remain unclear. By applying multicanonical Monte Carlo, an efficient rare-event sampling method, we efficiently sampled random codes from a much broader random ensemble of genetic codes than in previous studies, estimating that only one out of every 10^20 random codes is more robust than the SGC. This estimate is significantly smaller than the previous estimate, one in a million. We also characterized the fitness landscape of the genetic code that has four major fitness peaks, one of which includes the SGC. Furthermore, genetic algorithm analysis revealed that evolution under such a multi-peaked fitness landscape could be strongly biased toward a narrow peak, in an evolutionary path-dependent manner.
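The kind of comparison this abstract describes, scoring the SGC against random codes for robustness to errors, can be sketched with plain random sampling. This is explicitly not the multicanonical Monte Carlo method of the paper: the sketch samples only codes that permute amino acids among the SGC's synonymous blocks, it uses Kyte-Doolittle hydropathy as a stand-in amino-acid property, and its cost function and all names are assumptions made for illustration.

```python
import random

BASES = "TCAG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
SGC = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Kyte-Doolittle hydropathy index (an assumed choice of amino-acid property).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def cost(code):
    """Mean squared hydropathy change over single-base changes between sense codons."""
    total, count = 0.0, 0
    for codon, amino_acid in code.items():
        if amino_acid == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                other = code[codon[:pos] + base + codon[pos + 1:]]
                if other == "*":
                    continue
                total += (HYDROPATHY[amino_acid] - HYDROPATHY[other]) ** 2
                count += 1
    return total / count


def random_code(rng):
    """Shuffle which amino acid each synonymous block encodes; stops stay fixed."""
    amino_acids = sorted(set(SGC) - {"*"})
    shuffled = amino_acids[:]
    rng.shuffle(shuffled)
    relabel = dict(zip(amino_acids, shuffled))
    relabel["*"] = "*"
    return {codon: relabel[aa] for codon, aa in zip(CODONS, SGC)}


def fraction_better(n_samples=1000, seed=0):
    """Estimate the share of sampled random codes with lower cost than the SGC."""
    rng = random.Random(seed)
    reference = cost(dict(zip(CODONS, SGC)))
    better = sum(cost(random_code(rng)) < reference for _ in range(n_samples))
    return better / n_samples
```

Naive sampling of this kind cannot resolve frequencies anywhere near one in 10^20, which is the motivation the abstract gives for a rare-event method. A call such as `fraction_better(1000)` only gives a coarse estimate within the restricted ensemble sampled here.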


Subject(s)
Evolution, Molecular , Genetic Code , Genetic Code/genetics , Codon/genetics , Amino Acids/chemistry , Algorithms , Models, Genetic