Results 1 - 20 of 24
1.
Bull Math Biol ; 82(4): 48, 2020 04 04.
Article in English | MEDLINE | ID: mdl-32248310

ABSTRACT

The origin of the modern genetic code and the mechanisms that have contributed to its present form raise many questions. The main goal of this work is to test two hypotheses concerning the development of the genetic code for their compatibility and complementarity and to see whether they could benefit from each other. On the one hand, Gonzalez, Giannerini and Rosa developed a theory based on four-base codons, which they called tesserae. This theory can explain the degeneracy of the modern vertebrate mitochondrial code. On the other hand, in the 1990s, so-called circular codes were discovered in nature, which seem to ensure the maintenance of a correct reading frame during the translation process. It turns out that the two concepts not only do not contradict each other but, on the contrary, complement and enrich each other.


Subjects
Molecular Evolution, Genetic Code, Genetic Models, Animals, Codon, Mitochondrial Genes, Humans, Mathematical Concepts, Protein Biosynthesis, Reading Frames, Vertebrates/genetics
2.
Bull Math Biol ; 82(8): 105, 2020 08 04.
Article in English | MEDLINE | ID: mdl-32754878

ABSTRACT

A code X is k-circular if any concatenation of at most k words from X, when read on a circle, admits exactly one partition into words from X. It is circular if it is k-circular for every integer k. While it is not a priori clear from the definition, there exists, for every pair [Formula: see text], an integer k such that every k-circular [Formula: see text]-letter code over an alphabet of cardinality n is circular, and we determine the least such integer k for all values of n and [Formula: see text]. The k-circular codes may represent an important evolutionary step between the circular codes, such as the comma-free codes, and the genetic code.
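
The definition of k-circularity above can be checked by brute force for small codes. The following is a minimal sketch, not the method of the paper: it assumes all words of the code have the same length, and the function name `is_k_circular` is hypothetical. For words of equal length n, a circular word splits into code words in at most n ways (one per cut offset), so the code is k-circular when exactly one offset works for every concatenation of up to k words.

```python
from itertools import product


def is_k_circular(code, k):
    """Brute-force test of k-circularity (sketch; assumes equal-length words)."""
    words = set(code)
    lengths = {len(w) for w in words}
    if len(lengths) != 1:
        raise ValueError("this sketch only handles words of equal length")
    (n,) = lengths
    for j in range(1, k + 1):
        for seq in product(sorted(words), repeat=j):
            circle = "".join(seq)
            # Count the cut offsets (0..n-1) for which the word, read on a
            # circle, splits entirely into words of the code.
            frames = 0
            for shift in range(n):
                rotated = circle[shift:] + circle[:shift]
                blocks = (rotated[i:i + n] for i in range(0, len(rotated), n))
                if all(b in words for b in blocks):
                    frames += 1
            if frames != 1:
                return False
    return True


toy = {"AC", "CG", "GA"}
print(is_k_circular(toy, 2), is_k_circular(toy, 3))
```

In this toy dinucleotide example, the concatenation ACGACG read from the second letter gives CG, AC, GA, which are all code words, so the code is 2-circular but not 3-circular.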


Subjects
Genetic Models, Biological Evolution, Genetic Code, Mathematical Concepts, Nucleotides
3.
Bull Math Biol ; 79(8): 1796-1819, 2017 Aug.
Article in English | MEDLINE | ID: mdl-28643131

ABSTRACT

Comma-free codes constitute a class of circular codes, which has been widely studied, in particular by Golomb et al. (Biologiske Meddelelser, Kongelige Danske Videnskabernes Selskab 23:1-34, 1958a, Can J Math 10:202-209, 1958b), Michel et al. (Comput Math Appl 55:989-996, 2008a, Theor Comput Sci 401:17-26, 2008b, Inf Comput 212:55-63, 2012), Michel and Pirillo (Int J Comb 2011:659567, 2011), and Fimmel and Strüngmann (J Theor Biol 389:206-213, 2016). Based on a recent approach using graph theory to study circular codes Fimmel et al. (Philos Trans R Soc 374:20150058, 2016), a new class of circular codes, called strong comma-free codes, is identified. These codes detect a frameshift during the translation process immediately after a reading window of at most two nucleotides. We describe several combinatorial properties of strong comma-free codes: enumeration, maximality, self-complementarity and [Formula: see text]-property (comma-free property in all the three possible frames). These combinatorial results also highlight some new properties of the genetic code and its evolution. Each amino acid in the standard genetic code is coded by at least one strong comma-free code of size 1. There are 9 amino acids [Formula: see text] among 20 such that for each amino acid from S, its synonymous trinucleotide set (excluding the necessary periodic trinucleotides [Formula: see text]) is a strong comma-free code. The primeval comma-free RNY code of Eigen and Schuster (Naturwissenschaften 65:341-369, 1978) is a self-complementary [Formula: see text]-code of size 16. Furthermore, it is the union of two strong comma-free codes of size 8 which are complementary to each other.
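
The statements about the RNY code can be illustrated with a short check. This is a hedged sketch under stated assumptions: it uses the DNA alphabet (T rather than U), the classical comma-free condition (no code word appears in a shifted frame of any two concatenated code words), and hypothetical helper names. It does not implement the strong comma-free property, whose formal definition is not given in the abstract.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(word):
    """Return the anticodon-style reverse complement of a DNA word."""
    return "".join(COMPLEMENT[b] for b in reversed(word))


def is_self_complementary(code):
    """A code is self-complementary if it contains the reverse complement of each word."""
    return all(reverse_complement(w) in code for w in code)


def is_comma_free(code):
    """Classical comma-free test for a set of equal-length words."""
    n = len(next(iter(code)))
    for x, y in product(code, repeat=2):
        pair = x + y
        # Shifted reading windows inside the concatenation xy must not be code words.
        if any(pair[s:s + n] in code for s in range(1, n)):
            return False
    return True


# RNY: purine (A, G), any base, pyrimidine (C, T).
RNY = {r + n + y for r in "AG" for n in "ACGT" for y in "CT"}
print(len(RNY), is_comma_free(RNY), is_self_complementary(RNY))
```

The check agrees with the abstract on the size (16), the comma-free property and the self-complementarity of the RNY code.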


Subjects
Amino Acids, Genetic Code, Genetic Models, Codon, Nucleotides
4.
J Theor Biol ; 389: 206-13, 2016 Jan 21.
Article in English | MEDLINE | ID: mdl-26562635

ABSTRACT

The problem of retrieval and maintenance of the correct reading frame plays a significant role in RNA transcription. Circular codes, and especially comma-free codes, can help to understand the underlying mechanisms of error-detection in this process. In recent years much attention has been paid to the investigation of trinucleotide circular codes (see, for instance, Fimmel et al., 2014; Fimmel and Strüngmann, 2015a; Michel and Pirillo, 2012; Michel et al., 2012, 2008), while dinucleotide codes had been touched on only marginally, even though dinucleotides are associated with important biological functions. Recently, all maximal dinucleotide circular codes were classified (Fimmel et al., 2015; Michel and Pirillo, 2013). The present paper studies maximal dinucleotide comma-free codes and their close connection to maximal dinucleotide circular codes. We give a construction principle for such codes and provide a graphical representation that allows them to be visualized geometrically. Moreover, we compare the results for dinucleotide codes with the corresponding situation for trinucleotide maximal self-complementary C(3)-codes. Finally, the results obtained are discussed with respect to Crick's hypothesis about frame-shift-detecting codes without commas.


Subjects
Genetic Code, Nucleotides/chemistry, Algorithms, Amino Acids/chemistry, Codon, Computer Graphics, Computer Simulation, Molecular Evolution, Genome, Genetic Models, Nucleotides/genetics, RNA/genetics, Reproducibility of Results, Genetic Transcription
5.
J Theor Biol ; 364: 113-20, 2015 Jan 07.
Article in English | MEDLINE | ID: mdl-25225028

ABSTRACT

Circular codes are putative remnants of primeval comma-free codes and are potentially involved in detecting and maintaining the normal reading frame in protein coding sequences. In Michel and Pirillo (2013a) it was shown by computer algorithm that no maximal trinucleotide circular code can encode more than 18 different amino acids under the standard version of the genetic code. For comma-free codes the maximum is even less, namely 13 (Michel, 2014). The main purpose of this paper is to investigate these facts from a mathematical point of view and to show why the codes with the best-known error detecting properties are limited in the number of amino acids they can encode. We introduce five hierarchically ordered classes of trinucleotide codes including the well-known comma-free and circular codes and prove combinatorially that it is impossible to encode all amino acids using codes from four out of the five classes that have the strongest error detecting properties. However, it is possible to encode all 20 amino acids using codes from the largest class with the weakest properties. Additionally, we develop a handy criterion for circularity. As an application, it is shown that all codes from a special class of trinucleotide codes which includes the RNY-primeval code (Shepherd, 1986) are automatically circular. We also list which amino acids these codes encode.


Subjects
Amino Acids/genetics, Genetic Code, Nucleotides/genetics, Base Sequence, Codon/genetics, Genetic Models, Nucleotides/chemistry
6.
J Theor Biol ; 386: 159-65, 2015 Dec 07.
Article in English | MEDLINE | ID: mdl-26423358

ABSTRACT

The presence of circular codes in mRNA coding sequences is postulated to be involved in informational mechanisms aimed at detecting and maintaining the normal reading frame during protein synthesis. Most of the recent research is focused on trinucleotide circular codes. However, dinucleotide circular codes are also important, since dinucleotides are ubiquitous in genomes and associated with important biological functions. In this work we adopt the group theoretic approach used for trinucleotide codes in Fimmel et al. (2015) to study dinucleotide circular codes and highlight their symmetry properties. Moreover, we characterize such codes in terms of n-circularity and provide a graph representation that allows them to be visualized geometrically. The results establish a theoretical framework for the study of the biological implications of dinucleotide circular codes in genomic sequences.


Subjects
Circular DNA/genetics, Genetic Code, Genetic Models, Nucleotides/genetics, Humans, Reading Frames/genetics, Genetic Transformation
7.
J Math Biol ; 70(7): 1623-44, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25008961

ABSTRACT

Circular codes, putative remnants of primeval comma-free codes, have gained considerable attention in recent years. In fact they represent a second kind of genetic code potentially involved in detecting and maintaining the normal reading frame in protein coding sequences. The discovery of a universal code across species raised many theoretical and experimental questions. However, there is a key aspect, relating circular codes to symmetries and transformations, that remains to a large extent unexplored. In this article we aim at addressing the issue by studying the symmetries and transformations that connect different circular codes. The main result is that the class of 216 C3 maximal self-complementary codes can be partitioned into 27 equivalence classes defined by a particular set of transformations. We show that such transformations can be put in a group theoretic framework with an intuitive geometric interpretation. More general mathematical results about symmetry transformations which are valid for any kind of circular codes are also presented. Our results pave the way to the study of the biological consequences of the mathematical structure behind circular codes and help to shed light on the evolutionary steps that led to the observed symmetries of present codes.


Subjects
Genetic Code, Genetic Models, Codon, Molecular Evolution, Mathematical Concepts, Protein Biosynthesis, Reading Frames
8.
Biosystems ; : 105263, 2024 Jul 04.
Article in English | MEDLINE | ID: mdl-38971553

ABSTRACT

In this work we present an analysis of the dinucleotide occurrences in the three codon sites 1-2, 2-3 and 1-3, based on a computation of the codon usage of three large sets of bacterial, archaeal and eukaryotic genes using the same method that identified a maximal C3 self-complementary trinucleotide circular code X in genes of bacteria and eukaryotes (Arquès and Michel, 1996). Surprisingly, two dinucleotide circular codes are identified in the codon sites 1-2 and 2-3. Furthermore, these two codes are shifted versions of each other. Moreover, the dinucleotide code in the codon site 1-3 is circular, self-complementary and contained in the projection of X onto the 1st and 3rd bases, i.e. the code obtained by cutting the middle base in each codon of X. We prove several results showing that the circularity and the self-complementarity of trinucleotide codes are induced by the circularity and the self-complementarity of their dinucleotide cut codes. Finally, we present several evolutionary approaches for an emergence of trinucleotide codes from dinucleotide codes.
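
The projection of a trinucleotide code onto two codon sites, as described above, can be sketched in a few lines. The function name `project` and the toy code are hypothetical and serve only as illustration; sites are 0-based in the code, so the codon site 1-3 of the abstract corresponds to `(0, 2)`.

```python
def project(code, sites):
    """Project each trinucleotide onto two of its sites (0-based indices)."""
    i, j = sites
    return {w[i] + w[j] for w in code}


toy = {"AAC", "GTT", "GAT"}
print(sorted(project(toy, (0, 1))))  # codon site 1-2
print(sorted(project(toy, (1, 2))))  # codon site 2-3
print(sorted(project(toy, (0, 2))))  # codon site 1-3 (middle base cut)
```

Because the result is a set, codons that share the same two bases collapse to one dinucleotide, so a cut code can be smaller than the code it comes from.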

9.
J Theor Biol ; 336: 221-30, 2013 Nov 07.
Article in English | MEDLINE | ID: mdl-23988795

ABSTRACT

Dichotomic classes arising from a recent mathematical model of the genetic code make it possible to uncover many symmetry properties of the code and, although theoretically derived, they have made it possible to build statistical classifiers able to retrieve the correct translational frame of coding sequences. Herein we formalize the mathematical properties of these classes, first focusing on all the possible decompositions of the 64 codons of the genetic code into two equally sized dichotomic subsets. Then the global framework of bijective transformations of the nucleotide bases is discussed and we clarify when dichotomic partitions can be generated. In addition, we show that the parity dichotomic classes of the mathematical model and the complementarity dichotomic classes obtained in the present article can be formalized in the same algorithmic way as Rumer's dichotomic degeneracy classes. Interestingly, we find that the algorithm underlying dichotomic class definition mirrors biochemical features occurring at discrete base positions in the decoding center of the ribosome.


Subjects
Algorithms, Genetic Code, Genetic Models
10.
Biosystems ; 233: 105009, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37640191

ABSTRACT

Nature possesses inherent mechanisms for error detection and correction during the translation of genetic information, as demonstrated by the discovery of a self-complementary circular C3-code called X0 in various organisms such as bacteria, eukaryotes, plasmids, and viruses (Arquès and Michel, 1996; Michel, 2015, 2017). Since then, extensive research has focused on circular codes, which are believed to be remnants of ancient comma-free codes. These codes can be regarded as an additional genetic code specifically optimized for detecting and preserving the proper reading frame in protein-coding sequences. A study by Fimmel et al. in 2014 identified that a total of 216 maximal self-complementary C3-codes can be grouped into 27 equivalence classes with eight codes in each class. In this work, we study how the 27 equivalence classes are related to each other. While the codes in each equivalence class obtained by Fimmel et al. in 2014 are permutations of each other, i.e. one code can be obtained from the other by applying a permutation of the bases, it has not been clear how the equivalence classes are connected. We show that there is an ordering of the equivalence classes such that one gets from one class to the next one by substituting only one pair of codon/anticodon in the corresponding codes, i.e. the corresponding codes have a maximal intersection of 18 codons. To perform this analysis, we define two graphs, G216 and G27, whose vertices are, respectively, all 216 maximal self-complementary C3-codes and 27 equivalence classes. Several properties of the graphs are obtained. Most surprisingly, it turns out that G27 contains Hamiltonian paths of length 27. This fact ultimately leads to a representation of the set of all 216 maximal self-complementary C3-codes as a kind of spider web. Finally, we define dinucleotide cuts of such codes by projecting each codon to its first two bases and show that the paths of length 27 in G216 can even be chosen so that all the codes contain a special subset of dinucleotides defined by Rumer's roots. These observations raise many new questions about the biological function of such structures.

11.
Biosystems ; 229: 104906, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37196893

ABSTRACT

In this article, we introduce the new mathematical concept of circular mixed sets of words over an arbitrary finite alphabet. These circular mixed sets may not be codes in the classical sense and hence allow a higher amount of information to be encoded. After describing their basic properties, we generalize a recent graph theoretical approach for circularity and apply it to distinguish codes from sets (i.e. non-codes). Moreover, several methods are given to construct circular mixed sets. Finally, this approach allows us to propose a new evolution model of the present genetic code that could have evolved from a dinucleotide world to a trinucleotide world via circular mixed sets of dinucleotides and trinucleotides.


Subjects
Genetic Code, Genetic Models, Genetic Code/genetics
12.
Biosystems ; 219: 104716, 2022 Sep.
Article in English | MEDLINE | ID: mdl-35710042

ABSTRACT

A message such as mRNA, which consists of continuous characters without separators (such as commas or spaces), can easily be decoded incorrectly if it is read in the wrong reading frame. One construct to theoretically avoid these reading frame errors is the class of block codes. However, the first hypothesis of Watson and Crick (1953) that block codes are used as a tool to avoid reading frame errors in coding sequences already failed because the four periodical codons AAA, CCC, GGG and UUU seem to play an important role in protein coding sequences. Even the class of circular codes later discovered by Arquès and Michel (1996) in coding sequences cannot contain a periodic codon. However, by incorporating the interpretation of the message into the robustness of the reading frame, the extension of circular codes to include periodic codons is theoretically possible. In this work, we introduce the new class of I-circular codes. Unlike circular codes, these codes allow frame shifts, but only if the decoded interpretation of the message is identical to the intended interpretation. In the following, the formal definition of I-circular codes is introduced and the maximum and the maximal size of I-circular codes are given based on the standard genetic code table. These numbers are calculated using a new graph-theoretic approach derived from the classical one for the class of circular codes. Furthermore, we show that all 216 maximum self-complementary C3-codes (see Fimmel et al., 2015) can be extended to larger I-circular codes. We present the increased code coverage of the 216 newly constructed I-circular codes based on the human coding sequences in chromosome 1. In the last section of this paper, we use the polarity of amino acids as an interpretation table to construct I-circular codes. In an optimization process, two maximum I-circular codes of length 30 are found.


Subjects
Genetic Code, Genetic Models, Amino Acids, Codon/genetics, Genetic Code/genetics, Humans, Reading Frames
13.
Biosystems ; 204: 104392, 2021 Jun.
Article in English | MEDLINE | ID: mdl-33731280

ABSTRACT

Is it possible to apply infinite combinatorics and (infinite) set theory in theoretical biology? We do not know the answer yet but in this article we try to present some techniques from infinite combinatorics and set theory that have been used over the last decades in order to prove existence results and independence theorems in algebra and that might have the flexibility and generality to be also used in theoretical biology. In particular, we will introduce the theory of forcing and an algebraic construction technique based on trees and forests using infinite binary sequences. We will also present an overview of the theory of circular codes. Such codes had been found in the genetic information and are assumed to play an important role in error detecting and error correcting mechanisms during the process of translation. Finally, examples and constructions of infinite mixed circular codes using binary sequences hopefully show some similarity between these theories - a starting point for future applications.


Subjects
Biology, Mathematics, Theoretical Models
14.
Life (Basel) ; 11(12)2021 Dec 03.
Article in English | MEDLINE | ID: mdl-34947869

ABSTRACT

It is believed that the codon-amino acid assignments of the standard genetic code (SGC) help to minimize the negative effects caused by point mutations. All possible point mutations of the genetic code can be represented as a weighted graph with weights that correspond to the probabilities of these mutations. The robustness of a code against point mutations can then be described by means of the so-called conductance measure. This paper quantifies the wobble effect, which was investigated previously by applying the weighted graph approach, and seeks optimal weights using an evolutionary optimization algorithm to maximize the code's robustness. One result of our study is that the robustness of the genetic code is least influenced by mutations in the third position, as with the wobble effect. Moreover, the results clearly demonstrate that point mutations in the first, and even more importantly, in the second base of a codon have a very large influence on the robustness of the genetic code. These results were compared to single nucleotide variants (SNV) in coding sequences, which support our findings. Additionally, we analyzed which structure of a genetic code evolves from random code tables when the robustness is maximized. Our calculations show that the resulting code tables are very close to the standard genetic code. In conclusion, the results illustrate that the robustness against point mutations seems to be an important factor in the evolution of the standard genetic code.
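
The weighted-graph view of point mutations can be illustrated with a small conductance calculation. This is a sketch under explicit assumptions, not the paper's implementation: conductance is taken in the common graph-theoretic form (weight of edges leaving a codon set divided by the total weighted degree of the set), one weight is assigned per codon position, and the names `neighbours` and `conductance` and the weight values are hypothetical.

```python
from fractions import Fraction

BASES = "ACGT"


def neighbours(codon):
    """Yield (mutant, position) for all 9 single-base substitutions of a codon."""
    for pos, base in enumerate(codon):
        for other in BASES:
            if other != base:
                yield codon[:pos] + other + codon[pos + 1:], pos


def conductance(codon_set, weights=(1, 1, 1)):
    """Weight of mutations leaving the set divided by the weight of all mutations from it."""
    cut = 0
    volume = 0
    for codon in codon_set:
        for mutant, pos in neighbours(codon):
            volume += weights[pos]
            if mutant not in codon_set:
                cut += weights[pos]
    return Fraction(cut, volume)


GLYCINE = {"GGA", "GGC", "GGG", "GGT"}  # the four glycine codons (GGN)
print(conductance(GLYCINE))             # equal weights
print(conductance(GLYCINE, (1, 1, 2)))  # third position weighted more heavily
```

A lower value means that a smaller share of the weighted mutations changes the encoded amino acid. For this four-codon block, shifting weight towards the third position lowers the conductance, which is consistent with the role of the third position described in the abstract.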

15.
Theory Biosci ; 140(1): 107-121, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33523355

ABSTRACT

In the 1950s, Crick proposed the concept of so-called comma-free codes as an answer to the frame-shift problem that biologists have encountered when studying the process of translating a sequence of nucleotide bases into a protein. A little later it turned out that this proposal unfortunately does not correspond to biological reality. However, in the mid-90s, a weaker version of comma-free codes, so-called circular codes, was discovered in nature in J Theor Biol 182:45-58, 1996. Circular codes allow the reading frame to be retrieved during the translational process in the ribosome and, surprisingly, the circular code discovered in nature is even circular in all three possible reading frames ([Formula: see text]-property). Moreover, it is maximal in the sense that it contains 20 codons, and it is self-complementary, which means that it consists of pairs of codons and corresponding anticodons. In further investigations, it was found that there are exactly 216 codes that have the same strong properties as the originally found code from J Theor Biol 182:45-58. Using an algebraic approach, it was shown in J Math Biol, 2004 that the class of 216 maximal self-complementary [Formula: see text]-codes can be partitioned into 27 equally sized equivalence classes by the action of a transformation group [Formula: see text] which is isomorphic to the dihedral group. Here, we extend the above findings to circular codes over a finite alphabet of even cardinality [Formula: see text] for [Formula: see text]. We describe the corresponding group [Formula: see text] using matrices and we investigate what classes of circular codes are split into equally sized equivalence classes under the natural equivalence relation induced by [Formula: see text]. Surprisingly, this is not always the case. All results and constructions are illustrated by examples.


Subjects
Genetic Code, Genetic Models, Codon, Nucleotides, Ribosomes
16.
Biosystems ; 184: 103990, 2019 Oct.
Article in English | MEDLINE | ID: mdl-31326431

ABSTRACT

The origin of the genetic code can certainly be regarded as one of the most challenging problems in the theory of molecular evolution. Thus the known variants of the genetic code and a possible common ancestry of them have been studied extensively in the literature. Gonzalez et al. (2012) developed the theory of a primeval mitochondrial genetic code composed of four-base codons. These were called tesserae and it was shown that the tessera code has some remarkable error detection capabilities. In our paper we will show that, using classical coding theory, we can construct the tessera code as a linear coding of the standard genetic code and that, at the same time, it can be deduced from the code of all dinucleotides by Plotkin's construction. This shows that the tessera model of the mitochondrial code does not just have a biological explanation but also has a clear mathematical structure. This underlines the role that the tessera model might have played in evolution.


Subjects
Codon/genetics, Genetic Code/genetics, Mitochondrial Genome/genetics, Mitochondria/genetics, Algorithms, Amino Acids/genetics, Base Sequence, Molecular Evolution, Genetic Models, Nucleotides/genetics
17.
Math Biosci ; 317: 108231, 2019 11.
Article in English | MEDLINE | ID: mdl-31325443

ABSTRACT

By an extensive statistical analysis in genes of bacteria, archaea, eukaryotes, plasmids and viruses, a maximal C3-self-complementary trinucleotide circular code has been found to have the highest average occurrence in the reading frame of the ribosome during translation. Circular codes may play an important role in maintaining the correct reading frame. On the other hand, as several evolutionary theories propose primeval codes based on dinucleotides, trinucleotides and tetranucleotides, mixed circular codes were investigated. By using a graph-theoretical approach of circular codes recently developed, we study mixed circular codes, which are the union of a dinucleotide circular code, a trinucleotide circular code and a tetranucleotide circular code. Maximal mixed circular codes of (di,tri)-nucleotides, (tri,tetra)-nucleotides and (di,tri,tetra)-nucleotides are constructed, respectively. In particular, we show that any maximal dinucleotide circular code of size 6 can be embedded into a maximal mixed (di,tri)-nucleotide circular code such that its trinucleotide component is a maximal C3-comma-free code. The growth function of self-complementary mixed circular codes of dinucleotides and trinucleotides is given. Self-complementary mixed circular codes could have been involved in primitive genetic processes.


Subjects
Genetic Code/genetics, Genome/genetics, Biological Models, Nucleotides/genetics, Archaea, Bacteria, Eukaryota, Plasmids, Viruses
18.
Biosystems ; 164: 186-198, 2018 Feb.
Article in English | MEDLINE | ID: mdl-28918301

ABSTRACT

Symmetry is one of the essential and most visible patterns that can be seen in nature. Starting from the left-right symmetry of the human body, all types of symmetry can be found in crystals, plants, animals and nature as a whole. Similarly, principles of symmetry are also some of the fundamental and most useful tools in modern mathematical natural science that play a major role in theory and applications. As a consequence, it is not surprising that the desire to understand the origin of life, based on the genetic code, forces us to involve symmetry as a mathematical concept. The genetic code can be seen as a key to biological self-organisation. All living organisms have the same molecular bases - an alphabet consisting of four letters (nitrogenous bases): adenine, cytosine, guanine, and thymine. Linearly ordered sequences of these bases contain the genetic information for synthesis of proteins in all forms of life. Thus, one of the most fascinating riddles of nature is to explain why the genetic code is as it is. Genetic coding possesses noise immunity, which is the fundamental feature that allows the genetic information to be passed on from parents to their descendants. Hence, since the time of the discovery of the genetic code, scientists have tried to explain the noise immunity of the genetic information. In this chapter we will discuss recent results in mathematical modelling of the genetic code with respect to noise immunity, in particular error-detection and error-correction. We will focus on two central properties: degeneracy and frameshift correction.
DEGENERACY: Different amino acids are encoded by different quantities of codons and a connection between this degeneracy and the noise immunity of genetic information is a long-standing hypothesis. Biological implications of the degeneracy have been intensively studied and whether the natural code is a frozen accident or a highly optimised product of evolution is still controversially discussed. Symmetries in the structure of degeneracy of the genetic code are essential and give evidence of substantial advantages of the natural code over other possible ones. In the present chapter we will present a recent approach to explain the degeneracy of the genetic code by algorithmic methods from bioinformatics, and discuss its biological consequences.
FRAMESHIFT CORRECTION: Biologists recognised this problem immediately after the detection of the non-overlapping structure of the genetic code, i.e., coding sequences are to be read in a unique way determined by their reading frame. But how does the reading head of the ribosome recognise an error in the grouping of codons, caused by e.g. insertion or deletion of a base, that can be fatal during the translation process and may result in nonfunctional proteins? In this chapter we will discuss possible solutions to the frameshift problem with a focus on the theory of so-called circular codes that were discovered in large gene populations of prokaryotes and eukaryotes in the early 90s. Circular codes allow the detection of a frameshift of one or two positions, and recently a beautiful theory of such codes has been developed using statistics, group theory and graph theory.


Subjects
Molecular Evolution, Genetic Code/physiology, Theoretical Models, Amino Acids/genetics, Animals, Humans, Nucleic Acids/genetics
19.
Theory Biosci ; 137(1): 51-65, 2018 Apr.
Article in English | MEDLINE | ID: mdl-29532441

ABSTRACT

Self-complementary circular codes are involved in pairing genetic processes. A maximal [Formula: see text] self-complementary circular code X of trinucleotides was identified in genes of bacteria, archaea, eukaryotes, plasmids and viruses (Michel in Life 7(20):1-16 2017, J Theor Biol 380:156-177, 2015; Arquès and Michel in J Theor Biol 182:45-58 1996). In this paper, self-complementary circular codes are investigated using the graph theory approach recently formulated in Fimmel et al. (Philos Trans R Soc A 374:20150058, 2016). A directed graph [Formula: see text] associated with any code X mirrors the properties of the code. In the present paper, we demonstrate a necessary condition for the self-complementarity of an arbitrary code X in terms of the graph theory. The same condition has been proven to be sufficient for codes which are circular and of large size [Formula: see text] trinucleotides, in particular for maximal circular codes ([Formula: see text] trinucleotides). For codes of small-size [Formula: see text] trinucleotides, some very rare counterexamples have been constructed. Furthermore, the length and the structure of the longest paths in the graphs associated with the self-complementary circular codes are investigated. It has been proven that the longest paths in such graphs determine the reading frame for the self-complementary circular codes. By applying this result, the reading frame in any arbitrary sequence of trinucleotides is retrieved after at most 15 nucleotides, i.e., 5 consecutive trinucleotides, from the circular code X identified in genes. Thus, an X motif of a length of at least 15 nucleotides in an arbitrary sequence of trinucleotides (not necessarily all of them belonging to X) uniquely defines the reading (correct) frame, an important criterion for analyzing the X motifs in genes in the future.
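
The graph approach mentioned above can be sketched as follows. The abstract does not spell out the construction, so the details here are assumptions recalled from the cited graph approach: each trinucleotide b1b2b3 contributes the edges b1 -> b2b3 and b1b2 -> b3, and a code is taken to be circular when the resulting directed graph has no cycle. The codon list `X` is the code commonly reported for Arquès and Michel (1996), written over the DNA alphabet; the function names are hypothetical.

```python
def code_graph(code):
    """Directed graph of a trinucleotide code: b1 -> b2b3 and b1b2 -> b3."""
    graph = {}
    for w in code:
        graph.setdefault(w[0], set()).add(w[1:])
        graph.setdefault(w[:2], set()).add(w[2])
    return graph


def is_acyclic(graph):
    """Depth-first search cycle detection with three vertex states."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}

    def visit(v):
        colour[v] = GREY
        for nxt in graph.get(v, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:  # back edge: a cycle exists
                return False
            if state == WHITE and not visit(nxt):
                return False
        colour[v] = BLACK
        return True

    return all(colour.get(v, WHITE) != WHITE or visit(v) for v in list(graph))


# Assumed codon list of the code X (20 trinucleotides, DNA alphabet).
X = {
    "AAC", "AAT", "ACC", "ATC", "ATT", "CAG", "CTC", "CTG", "GAA", "GAC",
    "GAG", "GAT", "GCC", "GGC", "GGT", "GTA", "GTC", "GTT", "TAC", "TTC",
}
print(len(X), is_acyclic(code_graph(X)))
```

Under these assumptions the graph of X is acyclic, while a set such as {ACG, CGA} produces the cycle A -> CG -> A and is therefore rejected.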


Subjects
Genetic Models, Nucleotides/genetics, DNA/analysis, Eukaryotic Cells, Archaeal Genes, Bacterial Genes, Viral Genes, Genetic Code, Theoretical Models, Oligonucleotides, Open Reading Frames, Plasmids, RNA/analysis, Ribosomes, Saccharomyces cerevisiae/genetics
20.
Math Biosci ; 294: 120-129, 2017 12.
Article in English | MEDLINE | ID: mdl-29024747

ABSTRACT

The graph approach of circular codes recently developed (Fimmel et al., 2016) allows here a detailed study of diletter circular codes over finite alphabets. A new class of circular codes is identified, strong comma-free codes. New theorems are proved with the diletter circular codes of maximal length in relation to (i) a characterisation of their graphs as acyclic tournaments; (ii) their explicit description; and (iii) the non-existence of other maximal diletter circular codes. The maximal lengths of paths in the graphs of the comma-free and strong comma-free codes are determined. Furthermore, for the first time, diletter circular codes are enumerated over finite alphabets. Biological consequences of dinucleotide circular codes are analysed with respect to their embedding in the trinucleotide circular code X identified in genes and to the periodicity modulo 2 observed in introns. An evolutionary hypothesis of circular codes is also proposed according to their combinatorial properties.
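
The characterisation of maximal diletter circular codes by acyclic tournaments can be illustrated with a short check. This is a sketch under assumptions: each diletter word b1b2 is read as a directed edge b1 -> b2 (a convention not stated in the abstract), and the function name and example codes are hypothetical. A tournament has exactly one edge between every pair of distinct letters, and it is acyclic exactly when its out-degrees are 0, 1, ..., n-1.

```python
from itertools import combinations


def is_acyclic_tournament(code, alphabet):
    """Check whether the diletter words form an acyclic tournament on the alphabet."""
    edges = {(w[0], w[1]) for w in code}
    if any(a == b for a, b in edges):  # a word such as "AA" is a loop
        return False
    # Tournament: exactly one direction for every unordered pair of letters.
    for a, b in combinations(alphabet, 2):
        if ((a, b) in edges) == ((b, a) in edges):
            return False
    # A tournament is acyclic iff its out-degree sequence is 0, 1, ..., n-1.
    out_degree = {a: 0 for a in alphabet}
    for a, _ in edges:
        out_degree[a] += 1
    return sorted(out_degree.values()) == list(range(len(alphabet)))


ordered = {"AC", "AG", "AT", "CG", "CT", "GT"}  # edges follow the order A, C, G, T
cyclic = {"AC", "CG", "GA", "AT", "CT", "GT"}   # contains the cycle A -> C -> G -> A
print(is_acyclic_tournament(ordered, "ACGT"), is_acyclic_tournament(cyclic, "ACGT"))
```

Over a four-letter alphabet an acyclic tournament has 4 * 3 / 2 = 6 edges, so a code of this shape has 6 dinucleotides.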


Subjects
Genetic Code/genetics, Genetic Models, Nucleotides/genetics