ABSTRACT
Life as we know it requires three basic types of polymers: polypeptide, polynucleotide, and polysaccharide. Here we evaluate both universal and idiosyncratic characteristics of these biopolymers. We incorporate this information into a model that explains much about their origins, selection, and early evolution. We observe that all three biopolymer types are pre-organized, conditionally self-complementary, chemically unstable in aqueous media yet persistent because of kinetic trapping, with chiral monomers and directional chains. All three biopolymers are synthesized by dehydration reactions that are catalyzed by molecular motors driven by hydrolysis of phosphorylated nucleosides. All three biopolymers can access specific states that protect against hydrolysis. These protected states are folded, using self-complementary interactions among recurrent folding elements within a given biopolymer, or assembled, in associations between the same or different biopolymer types. Self-association in a hydrolytic environment achieves self-preservation. Heterogeneous association achieves partner-preservation. These universal properties support a model in which life's polymers emerged simultaneously and co-evolved in a common hydrolytic milieu where molecular persistence depended on folding and assembly. We believe that an understanding of the structure, function, and origins of any given type of biopolymer requires the context of other biopolymers.
Subjects
Biopolymers/biosynthesis, Biopolymers/metabolism, Biopolymers/physiology, Animals, Catalysis, Humans, Peptides/metabolism, Peptides/physiology, Polymers, Polynucleotides/biosynthesis, Polynucleotides/metabolism, Polysaccharides/biosynthesis, Polysaccharides/metabolism, Polysaccharides/physiology, Protein Folding, RNA Folding/physiology
ABSTRACT
Comma-free codes constitute a class of circular codes that has been widely studied, in particular by Golomb et al. (Biologiske Meddelelser, Kongelige Danske Videnskabernes Selskab 23:1-34, 1958a, Can J Math 10:202-209, 1958b), Michel et al. (Comput Math Appl 55:989-996, 2008a, Theor Comput Sci 401:17-26, 2008b, Inf Comput 212:55-63, 2012), Michel and Pirillo (Int J Comb 2011:659567, 2011), and Fimmel and Strüngmann (J Theor Biol 389:206-213, 2016). Based on a recent approach using graph theory to study circular codes (Fimmel et al., Philos Trans R Soc 374:20150058, 2016), a new class of circular codes, called strong comma-free codes, is identified. These codes detect a frameshift during the translation process immediately after a reading window of at most two nucleotides. We describe several combinatorial properties of strong comma-free codes: enumeration, maximality, self-complementarity and [Formula: see text]-property (comma-free property in all three possible frames). These combinatorial results also highlight some new properties of the genetic code and its evolution. Each amino acid in the standard genetic code is coded by at least one strong comma-free code of size 1. There are 9 amino acids [Formula: see text] among 20 such that for each amino acid from S, its synonymous trinucleotide set (excluding the necessary periodic trinucleotides [Formula: see text]) is a strong comma-free code. The primeval comma-free RNY code of Eigen and Schuster (Naturwissenschaften 65:341-369, 1978) is a self-complementary [Formula: see text]-code of size 16. Furthermore, it is the union of two strong comma-free codes of size 8 which are complementary to each other.
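The statements about the RNY code can be checked with a minimal sketch. It assumes a DNA alphabet (T in place of U) and the usual definition of a comma-free trinucleotide code: no word of the code appears in either shifted frame of two concatenated words. The function names are illustrative and do not come from the cited papers, and the strong comma-free property is not tested here.

```python
from itertools import product

# Watson-Crick pairing on the DNA alphabet (assumption: T is used instead of U).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(word):
    """Return the complementary word read in the opposite direction."""
    return "".join(COMPLEMENT[base] for base in reversed(word))


def is_self_complementary(code):
    """A code is self-complementary if it contains the reverse complement of each word."""
    return all(reverse_complement(word) in code for word in code)


def is_comma_free(code):
    """Check that no code word occurs in frame 1 or frame 2 of any two-word concatenation."""
    for x, y in product(code, repeat=2):
        xy = x + y
        if xy[1:4] in code or xy[2:5] in code:
            return False
    return True


# RNY code: R in {A, G}, N any nucleotide, Y in {C, T}.
RNY = {r + n + y for r in "AG" for n in "ACGT" for y in "CT"}

print(len(RNY), is_comma_free(RNY), is_self_complementary(RNY))
```

The shifted frames of two RNY words read NYR and YRN, neither of which matches the RNY pattern, which is why the comma-free check succeeds.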
Subjects
Amino Acids, Genetic Code, Genetic Models, Codon, Nucleotides
ABSTRACT
A code X is (⩾k)-circular if every concatenation of words from X that admits, when read on a circle, more than one partition into words from X, must contain at least k+1 words. In other words, the reading frame retrieval is guaranteed for any concatenation of up to k words from X. A code that is (⩾k)-circular for all integers k is said to be circular. Any code is (⩾0)-circular and it turns out that a code of trinucleotides is circular as soon as it is (⩾4)-circular. A code is k-circular if it is (⩾k)-circular and not (⩾k+1)-circular. The theoretical aspects of trinucleotide k-circular codes have been developed in a companion article (Michel et al., 2022). Trinucleotide circular codes always retrieve the reading frame, leaving no ambiguous sequences. On the contrary, trinucleotide k-circular codes, for k∈{0,1,2,3}, all have ambiguous sequences, for which the reading frame cannot always be retrieved. However, such a trinucleotide k-circular code is still able to retrieve the reading frame for a number of sequences, thereby exhibiting a partial circularity property. We describe this combinatorial property for each class of trinucleotide k-circular codes with k∈{0,1,2,3}. The circularity, i.e. the reading frame retrieval, is an ordinary property in genes. In order to consider the different cases of ambiguous sequences, we derive a new and general formula to measure the reading frame loss, whatever the trinucleotide k-circular code. This formula allows us to study the evolution of any trinucleotide k-circular code of (maximal) cardinality 20 to the genetic code, based on the reading frame retrieval property. We apply this approach to analyse the evolution of the trinucleotide circular code X observed in genes to the genetic code. The (⩾1)-circular codes of maximal size 20 necessarily have the same number of each nucleotide, specifically 15 = 3·20/4. This balanceness property can also be achieved by trinucleotide codes of cardinality 4, 8, 12 and 16.
We call such trinucleotide codes balanced. We develop a general mathematical method to compute the number of balanced trinucleotide codes of each size, which also applies to self-complementary trinucleotide codes. We establish and quantify a relation between this balanceness property and the self-complementarity property. The combinatorial hierarchy of trinucleotide k-circular codes is updated with the growth function results. The numbers of amino acids coded by the trinucleotide k-circular codes are given for the cases maximal, minimal, self-complementary k-, (k,k,k)- and self-complementary (k,k,k)-circular.
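The (⩾k)-circular definition and the balance condition can be illustrated with a brute-force sketch. It assumes a DNA alphabet and words of length 3, so that a second circular partition can only arise in frame 1 or frame 2. The function names and the four-word example code are hypothetical and are not taken from the article; the enumeration is exponential in k and is meant only for small cases.

```python
from collections import Counter
from itertools import product


def shifted_frame_decomposes(seq, code):
    """True if seq, read on a circle in frame 1 or 2, splits entirely into code words."""
    for shift in (1, 2):
        rotated = seq[shift:] + seq[:shift]
        if all(rotated[i:i + 3] in code for i in range(0, len(rotated), 3)):
            return True
    return False


def is_at_least_k_circular(code, k):
    """Brute force: no concatenation of 1..k code words may have a second circular partition."""
    for m in range(1, k + 1):
        for words in product(sorted(code), repeat=m):
            if shifted_frame_decomposes("".join(words), code):
                return False
    return True


def is_balanced(code):
    """A code is balanced if A, C, G and T each occur the same number of times in it."""
    counts = Counter("".join(code))
    return len({counts[base] for base in "ACGT"}) == 1


# Hypothetical example: ACG+TAC read in frame 1 on a circle gives CGT+ACA.
EXAMPLE = {"ACG", "TAC", "CGT", "ACA"}
# RNY code (R in {A, G}, N any nucleotide, Y in {C, T}), used here as a balanced example.
RNY = {r + n + y for r in "AG" for n in "ACGT" for y in "CT"}

print(is_at_least_k_circular(EXAMPLE, 1), is_at_least_k_circular(EXAMPLE, 2))
print(is_balanced(RNY), Counter("".join(RNY))["A"], 3 * 20 // 4)
```

Under these assumptions the example code is 1-circular: every single word is unambiguous, but one two-word concatenation admits a second partition. A code of n trinucleotides contains 3n letters, so a balanced code has 3n/4 occurrences of each nucleotide, which gives 15 for n = 20 and 12 for the RNY code with n = 16.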
Subjects
Genetic Code, Genetic Models, Biology, Genetic Code/genetics, Nucleotides/genetics, Reading Frames
ABSTRACT
Anion-exchange chromatography carried out under non-denaturing conditions is a versatile tool to differentiate DNA conformations. In this work, the utility of this form of HPLC was demonstrated in four examples. The hairpin and duplex forms of d(CG)9 were readily resolved, which allowed the influence of salt on the equilibrium between these two secondary structures to be studied. Similarly, the minimum size of Tn in the loop region required for the sequence 5'-d(CCCAA-(T)n-TTGGG)-3' to form a hairpin was established to be two nucleotides using anion-exchange HPLC and fluorescence resonance energy transfer. Furthermore, the efficiency of hybridization of partially self-complementary sequences d[(CG)6Nx] was readily monitored by non-denaturing anion-exchange HPLC. Finally, different structures adopted by quadruplex-forming sequences were resolved in the same manner.
Subjects
High Pressure Liquid Chromatography/methods, Ion Exchange Chromatography/methods, DNA, DNA/analysis, DNA/chemistry, G-Quadruplexes, Nucleic Acid Conformation, Nucleic Acid Denaturation, Genetic Polymorphism
ABSTRACT
Numerous methods of vector design and delivery have been employed in an attempt to increase transgene expression following AAV-based gene therapy. Here, a gene transfer study was conducted in mice to compare the effects of vector self-complementarity (double- or single-stranded DNA), codon optimization of the transgene, and vector dose on transgene expression levels in the liver. Two different reporter genes were used: human ornithine transcarbamylase (hOTC) detected by immunofluorescence, and enhanced green fluorescent protein (EGFP) detected by direct fluorescence. The AAV8 capsid was chosen for all experiments due to its strong liver tropism. While EGFP is already a codon-optimized version of the original gene, both wild-type (WT) and codon-optimized (co) versions of the hOTC transgene were compared in this study. In addition, the study evaluated which of the two hOTC modifications (codon optimization or self-complementarity) would confer the greater increase in expression levels at a given dose. Interestingly, based on morphometric image analysis, the difference in detectable expression levels between self-complementary (sc) and single-stranded (ss) hOTCco vectors was dose dependent, with a sevenfold increase in OTC-positive area using sc vectors at a dose of 3 × 10⁹ genome copies (GC) per mouse, but no significant difference at a dose of 1 × 10¹⁰ GC/mouse. In contrast, with EGFP as the transgene, increases in expression levels with the sc vector were observed at both the 3 × 10⁹ GC/mouse and 1 × 10¹⁰ GC/mouse doses. Furthermore, codon optimization of the hOTC transgene generated a more significant improvement in expression than the use of self-complementarity did. Overall, the results demonstrate that increases in expression levels gained by using sc vectors instead of ss vectors can vary between different transgenes, and that codon optimization of the transgene can have an even more powerful effect on the resulting expression levels.