Results 1 - 20 of 52
2.
Mol Biochem Parasitol ; 255: 111581, 2023 09.
Article in English | MEDLINE | ID: mdl-37478919

ABSTRACT

Schistosoma mansoni is a parasitic flatworm that causes a human disease called schistosomiasis, or bilharzia. At the genomic level, S. mansoni is AT-rich, but has some compositional heterogeneity. Indeed, some regions of its genome are GC-rich, mainly those located near the extreme ends of the chromosomes. Recently, we showed that, despite the strong bias towards A/T-ending codons, highly expressed genes tend to use GC-rich codons. Here, we address the following question: are highly expressed sequences biased in their amino acid frequencies? Our analyses show that these sequences in S. mansoni, as in species ranging from bacteria to human, are strongly biased in nucleotide composition. Highly expressed genes tend to use GC-rich codons (in the first and second codon positions), which code for the energetically cheapest amino acids. Therefore, we conclude that amino acid usage, at least in highly expressed genes, is strongly shaped by natural selection to avoid energetically expensive residues. Whether this is an adaptation to the parasitic way of life of S. mansoni is unclear, since the same pattern occurs in free-living species.


Subject(s)
Platyhelminths , Animals , Humans , Platyhelminths/genetics , Schistosoma mansoni/genetics , Amino Acids/genetics , Codon , Bacteria
3.
J Mol Evol ; 91(4): 382-390, 2023 08.
Article in English | MEDLINE | ID: mdl-37264211

ABSTRACT

The standard genetic code determines that in most species, including viruses, there are 20 amino acids that are coded by 61 codons, while the other three codons are stop triplets. Considering the whole proteome, each species features its own amino acid frequencies; given the slow rate of change, closely related species display similar GC content and amino acid usage. In contrast, distantly related species display different amino acid frequencies. Furthermore, within certain multicellular species, such as mammals, intragenomic differences in the usage of amino acids are evident. In this communication, we summarize some of the most prominent and well-established factors that determine the differences found in amino acid usage, both across evolution and intragenomically.


Subject(s)
Amino Acids , Genetic Code , Animals , Amino Acids/genetics , Codon/genetics , Base Composition , Proteome/genetics , Evolution, Molecular , Mammals/genetics
4.
J Mol Evol ; 91(1): 6-9, 2023 02.
Article in English | MEDLINE | ID: mdl-36370165
5.
J Mol Evol ; 90(5): 325-327, 2022 10.
Article in English | MEDLINE | ID: mdl-35838772
6.
Arch Virol ; 167(6): 1443-1448, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35467158

ABSTRACT

Viruses are, by far, the most abundant biological entities on earth. They are found in all known ecological niches and are the causative agents of many important diseases in plants and animals. From an evolutionary point of view, since viruses do not share any orthologous genes, there is a general consensus that they are polyphyletic; that is, they do not have a common ancestor. This means that they appeared several times during the course of evolution. For their life cycle, they are always obligate parasites of a free cellular life form, which can be bacteria, archaea, or eukaryotes. More complexity is added to these entities by the fact that their genetic material can be DNA or RNA (double- or single-stranded) or retrotranscribed. Given these features, we wondered if some general rules can be inferred when studying two basic genomic signatures (dinucleotides and codon usage), analyzing all available complete and non-redundant viral sequences. In spite of the obviously biased sample of sequences available, some general features appear to emerge.


Subject(s)
Codon Usage , Viruses , Animals , Archaea/genetics , Bacteria/genetics , Eukaryota/genetics , Evolution, Molecular , Viruses/genetics
7.
Mol Biochem Parasitol ; 247: 111445, 2022 01.
Article in English | MEDLINE | ID: mdl-34942292

ABSTRACT

Schistosoma mansoni is a trematode flatworm that parasitizes humans and produces a disease called bilharzia. At the genomic level, it is characterized by a low genomic GC content and an "isochore-like" structure, where the GC-richest regions, mainly placed at the extremes of the chromosomes, are interspersed with low-GC regions. Furthermore, the GC-richest regions are at the same time the gene-richest, and are where the most heavily expressed genes are placed. Taking these features into account, we decided to reanalyze the codon usage of this flatworm. Our results show that a) when all genes are considered together, the strong mutational bias towards A + T leads to a predominance of A/T-ending codons, b) a multivariate analysis discriminates between highly and lowly expressed genes, c) the sequences expressed at the highest levels display a significant increase in G/C-ending codons, d) when comparing the molecular distances with a closely related species, the synonymous distance in highly expressed genes is significantly lower than in lowly expressed sequences. Therefore, we conclude that, despite previous results, which were obtained with a small sample of genes, codon usage in S. mansoni is the result of two forces that operate in opposite directions: while mutational bias leads to a predominance of A/T-ending codons, translational selection, working at the level of speed, increases the frequency of G/C-ending triplets.


Subject(s)
Codon Usage , Platyhelminths , Animals , Base Composition , Codon , Platyhelminths/genetics , Schistosoma mansoni/genetics
8.
J Mol Evol ; 89(9-10): 589-593, 2021 12.
Article in English | MEDLINE | ID: mdl-34383106

ABSTRACT

Since the genetic code is degenerate, several codons are translated to the same amino acid. Although these triplets were historically considered to be "synonymous" and therefore expected to be used at rather equal frequencies in all genomes, we now know that this is not the case. Indeed, once several coding sequences, coming from either the same or different species, were obtained in the late '70s and early '80s of the last century, it was evident that (a) each genome, taken globally, displayed a different codon usage pattern, which means that each genome displays a particular global codon usage table when all genes are considered together, and (b) there is a strong intragenomic diversity: in other words, within a given species the codon usage pattern can (and usually does) differ greatly among genes in the same genome. These different patterns were attributed to two main factors: first, the mutational bias characteristic of each genome, which determines that GC-poor species display a general bias towards A/T-ending codons while the reverse is true for GC-rich species. Second, the differences in codon usage among genes from the same species are due to natural selection acting at the level of translation, in such a way that highly expressed genes tend to use codons that match the most abundant isoacceptor tRNAs. Thus, these genes are translated at a higher rate, which in turn helps to avoid the limiting factor in translation, namely the number of available ribosomes per cell. Although these explanations are still valid, new factors are almost constantly postulated to affect codon usage. In this mini review, we try to summarize them.


Subject(s)
Codon Usage , Genetic Code , Codon/genetics , RNA, Transfer/genetics , Selection, Genetic
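The unequal use of synonymous codons described in the abstract above can be illustrated with a minimal sketch. It computes relative synonymous codon usage (RSCU), a commonly used index that the abstract itself does not name, so the choice of metric is an assumption made here for illustration. The function name `rscu` and the restriction to the standard genetic code are likewise assumptions.

```python
from collections import Counter
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for one coding sequence (hypothetical helper).

    RSCU = observed count of a codon divided by the count expected if all
    synonymous codons of that amino acid were used equally.
    Values above 1 indicate preferred codons, values below 1 avoided ones.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group sense codons into synonymous families (stop codons excluded).
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    result = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for c in family:
            result[c] = counts[c] * len(family) / total
    return result


if __name__ == "__main__":
    # Four alanine codons: GCT three times, GCC once.
    for codon, value in sorted(rscu("GCTGCTGCTGCC").items()):
        print(codon, round(value, 2))
```

Comparing such tables between highly and lowly expressed genes is one simple way to look for the intragenomic differences the abstract mentions.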
9.
Front Microbiol ; 12: 646300, 2021.
Article in English | MEDLINE | ID: mdl-34262534

ABSTRACT

The genetic material of the three domains of life (Bacteria, Archaea, and Eukaryota) is always double-stranded DNA, and their GC content (molar content of guanine plus cytosine) varies between ≈ 13% and ≈ 75%. Nucleotide composition is the simplest way of characterizing genomes. Despite this simplicity, it has several implications. Indeed, it is the main factor that determines, among other features, dinucleotide frequencies, repeated short DNA sequences, and codon and amino acid usage. Which forces drive this strong variation is still a matter of controversy. For rather obvious reasons, most of the studies concerning this huge variation and its consequences have been done in free-living organisms. However, no recent comprehensive study of all known viruses has been done (that is, concerning all available sequences). Viruses, by far the most abundant biological entities on Earth, are the causative agents of many diseases. An overview of these entities is important also because their genetic material is not always double-stranded DNA: indeed, certain viruses have single-stranded DNA, double-stranded RNA, or single-stranded RNA as genetic material, and some are retro-transcribing. Therefore, one may wonder if what we have learned about the evolution of GC content and its implications in prokaryotes and eukaryotes also applies to viruses. In this contribution, we attempt to describe compositional properties of ∼ 10,000 viral species: base composition (globally and according to Baltimore classification), correlations among non-coding regions and the three codon positions, and the relationship of the nucleotide frequencies and codon usage of viruses with the same features of their hosts. This allowed us to determine that the base composition of phages strongly correlates with that of their respective hosts, while that of eukaryotic viruses does not (with fungi and protists as exceptions). Finally, we discuss some of these results concerning codon usage: reinforcing previous results, we found that phages and hosts exhibit moderate to high correlations, while for eukaryotes and their viruses the correlations are weak or do not exist.
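As a rough illustration of the quantities this abstract refers to, the sketch below computes GC content (guanine plus cytosine as a fraction of all bases) for a whole sequence and separately for the three codon positions of a coding sequence. The function names `gc_fraction` and `gc_by_codon_position` are hypothetical, and ignoring ambiguous bases such as N is an assumption made here, not something stated in the source.

```python
def gc_fraction(seq):
    """Fraction of G + C among unambiguous bases (A, C, G, T or U)."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGTU")
    if total == 0:
        return 0.0  # empty or fully ambiguous sequence
    return (seq.count("G") + seq.count("C")) / total


def gc_by_codon_position(cds):
    """Return (GC1, GC2, GC3) for an in-frame coding sequence."""
    usable = len(cds) - len(cds) % 3  # drop a trailing partial codon
    cds = cds[:usable]
    # Slicing with a step of 3 collects every base at one codon position.
    return tuple(gc_fraction(cds[pos::3]) for pos in range(3))


if __name__ == "__main__":
    example = "ATGGCCAAG"  # codons ATG, GCC, AAG
    print(gc_fraction(example))
    print(gc_by_codon_position(example))
```

Applying the same functions to a virus and to its host would give the paired values whose correlation the abstract discusses.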

10.
BMC Bioinformatics ; 21(1): 293, 2020 Jul 08.
Article in English | MEDLINE | ID: mdl-32640978

ABSTRACT

BACKGROUND: Spliced Leader trans-splicing is an important mechanism for the maturation of mRNAs in several lineages of eukaryotes, including several groups of parasites of great medical and economic importance. Nevertheless, its study across the tree of life is severely hindered by the problem of identifying the SL sequences that are being trans-spliced. RESULTS: In this paper we present SLFinder, a four-step pipeline meant to identify de novo candidate SL sequences while making very few assumptions regarding the SL sequence properties. The pipeline takes transcriptomic de novo assemblies and a reference genome as input and allows user intervention at several points to account for unexpected features of the dataset. The strategy and its implementation were tested on real RNAseq data from species with and without SL trans-splicing. CONCLUSIONS: SLFinder is capable of identifying SL candidates with good precision in a reasonable amount of time. It is especially suitable for species with unknown SL sequences, generating candidate sequences for further refining and experimental validation.


Subject(s)
RNA, Spliced Leader/chemistry , Software , Trans-Splicing , Animals , Genomics , Mice , RNA-Seq
11.
R Soc Open Sci ; 6(11): 190773, 2019 Nov.
Article in English | MEDLINE | ID: mdl-31827830

ABSTRACT

In both prokaryotic and eukaryotic genomes, synonymous codons are unevenly used. Such differential usage of optimal or non-optimal codons has been suggested to play a role in the control of translation initiation and elongation, as well as at the level of transcription and mRNA stability. In the case of membrane proteins, codon usage has been proposed to assist in the establishment of a pause necessary for the correct targeting of the nascent chains to the translocon. By using as a model UreA, the Aspergillus nidulans urea transporter, we revealed that a pair of non-optimal codons encoding amino acids situated at the boundary between the N-terminus and the first transmembrane segment are necessary for proper biogenesis of the protein at 37°C. These codons presumably regulate the translation rate in a previously undescribed fashion, possibly contributing to the correct interaction of ureA-translating ribosome-nascent chain complexes with the signal recognition particle and/or other factors, while the polypeptide has not yet emerged from the ribosomal tunnel. Our results suggest that the presence of the pair of non-optimal codons would not be functionally important in all cellular conditions. Whether this mechanism would affect other proteins remains to be determined.

12.
Sci Rep ; 8(1): 17820, 2018 12 13.
Article in English | MEDLINE | ID: mdl-30546029

ABSTRACT

Recent investigations have shown that isochores are characterized by a 3-D structure which is primarily responsible for the topology of chromatin domains. More precisely, an analysis of human chromosome 21 demonstrated that low-heterogeneity, GC-poor isochores are characterized by the presence of oligo-Adenines that are intrinsically stiff, curved and unfavorable for nucleosome binding. This leads to a structure of the corresponding chromatin domains, the Lamina Associated Domains, or LADs, which is well suited for interaction with the lamina. In contrast, the high-heterogeneity GC-rich isochores are in the form of compositional peaks and valleys characterized by increasing gradients of oligo-Guanines in the peaks and oligo-Adenines in the valleys that lead to increasing nucleosome depletions in the corresponding chromatin domains, the Topological Associating Domains, or TADs. These results encouraged us to investigate in detail the di- and tri-nucleotide profiles of 100 Kb segments of chromosome 21, as well as those of the di- to octa-Adenines and di- to octa-Guanines in some representative regions of the chromosome. The results obtained show that the 3-D structures of isochores and chromatin domains depend not only upon oligo-Adenines and oligo-Guanines but also, to a lower but definite extent, upon the majority of di- and tri-nucleotides. This conclusion has strong implications for the biological role of non-coding sequences.


Subject(s)
Chromosomes, Human, Pair 21/chemistry , Genome, Human , Isochores/chemistry , Isochores/chemical synthesis , Nucleosomes/chemistry , Humans
13.
An. Facultad Med. (Univ. Repúb. Urug., En línea) ; 5(2): 12-28, dic. 2018. tab, graf
Article in Spanish | LILACS, BNUY, UY-BNMED | ID: biblio-1088677

ABSTRACT

The human genome, like that of all mammals and birds, is a mosaic of isochores, which are very long stretches of DNA (>> 100 kb) that are homogeneous in base composition. Isochores can be divided into a small number of families that cover a broad range of GC levels (GC is the molar ratio of guanine + cytosine in DNA). In the human genome, we find five families, which are (going from GC-poor to GC-rich) L1, L2, H1, H2 and H3. This type of organization has important functional consequences, such as differences in gene concentration, gene regulation, transcription levels, recombination rates, replication timing, etc. Furthermore, the existence of isochores leads to the so-called "compositional correlations", which means that, to the extent that different sequences are located in different isochores, all of their regions (exons and their three codon positions, introns, etc.) change their GC content and, as a consequence, both amino acid usage and synonymous codon usage change in each isochore family. Finally, we discuss the origin of isochores within an evolutionary framework.


Subject(s)
Humans , Genome, Human/genetics , Isochores/genetics , Base Composition , Introns/genetics
15.
Virol J ; 14(1): 115, 2017 06 17.
Article in English | MEDLINE | ID: mdl-28623921

ABSTRACT

BACKGROUND: Bovine coronavirus (BCoV) belongs to the genus Betacoronavirus of the family Coronaviridae. BCoV is widespread around the world and causes enteric or respiratory infections among cattle, leading to important economic losses to the beef and dairy industry worldwide. Studying the relation between the codon usage of viruses and that of their hosts is essential for understanding host-pathogen interaction, evasion of the host's immune system and evolution. METHODS: We performed a comprehensive analysis of codon usage and composition of BCoV. RESULTS: The global codon usage among BCoV strains is similar. Significant differences in codon preferences were found between BCoV genes and Bos taurus host genes. Most of the highly frequent codons are U-ending. G + C compositional constraint and dinucleotide composition also play a role in the overall pattern of BCoV codon usage. CONCLUSIONS: The results of these studies revealed that mutational bias is a leading force shaping codon usage in this virus. Additionally, relative dinucleotide frequencies, geographical distribution, and evolutionary processes also influenced the codon usage pattern.


Subject(s)
Codon , Coronavirus, Bovine/genetics , Genome, Viral , Protein Biosynthesis , Adaptation, Biological , Animals , Cattle , Evolution, Molecular
16.
Biochem Biophys Res Commun ; 492(4): 572-578, 2017 10 28.
Article in English | MEDLINE | ID: mdl-28630001

ABSTRACT

Flaviviruses present substantial differences in their host range and transmissibility. We studied the evolution of base composition, dinucleotide biases, codon usage and amino acid frequencies in the genus Flavivirus within a phylogenetic framework by principal components analysis. There is a mutual interplay between the evolutionary history of flaviviruses and their respective vectors and/or hosts. Hosts associated with distinct phylogenetic groups may be driving flaviviruses at different paces and through various sequence landscapes, as can be seen for viruses associated with Aedes or Culex spp., although phylogenetic inertia cannot be ruled out. In some cases, viruses even face opposing forces. For instance, in tick-borne flaviviruses, while vertebrate hosts exert pressure to deplete their CpG, tick vectors drive them to exhibit GC-rich codons. Within a vertebrate environment, natural selection appears to be acting on the viral genome to overcome the immune system. On the other hand, within an arthropod environment, mutational biases seem to be the dominant forces.


Subject(s)
Biological Evolution , Flaviviridae/genetics , Genome, Viral/genetics , Insect Vectors/genetics , Insect Vectors/virology , Viral Proteins/genetics , Animals , Codon/genetics , CpG Islands/genetics , Data Interpretation, Statistical , Evolution, Molecular , Genetic Association Studies , Models, Genetic , Models, Statistical , Multivariate Analysis
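The CpG depletion mentioned in the abstract above is usually quantified as an observed/expected dinucleotide ratio. The sketch below shows one common formulation, the frequency of a dinucleotide divided by the product of the frequencies of its two bases. Whether the authors used exactly this index is not stated in the abstract, so the formula and the function name `dinucleotide_odds_ratio` are assumptions made for illustration.

```python
def dinucleotide_odds_ratio(seq, dinuc="CG"):
    """Observed/expected ratio for one dinucleotide (hypothetical helper).

    ratio = f(XY) / (f(X) * f(Y)), with overlapping dinucleotides counted.
    Values well below 1 suggest depletion, values above 1 enrichment.
    Returns None when the ratio is undefined for the given sequence.
    """
    seq = seq.upper().replace("U", "T")
    dinuc = dinuc.upper().replace("U", "T")
    if len(seq) < 2:
        return None
    first, second = dinuc[0], dinuc[1]
    freq_first = seq.count(first) / len(seq)
    freq_second = seq.count(second) / len(seq)
    if freq_first == 0 or freq_second == 0:
        return None
    # Count overlapping occurrences by sliding a window of width two.
    observed = sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc)
    freq_dinuc = observed / (len(seq) - 1)
    return freq_dinuc / (freq_first * freq_second)


if __name__ == "__main__":
    print(dinucleotide_odds_ratio("CGCG", "CG"))
    print(dinucleotide_odds_ratio("CCGG", "CG"))
```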
17.
J Mol Evol ; 84(2-3): 93-103, 2017 03.
Article in English | MEDLINE | ID: mdl-28243687

ABSTRACT

The recent availability of a number of fully sequenced genomes (including those of marine organisms) has allowed isochores to be mapped very precisely on the basis of DNA sequences, confirming the results obtained before genome sequencing by ultracentrifugation in CsCl. In fact, the analytical profile of human DNA showed that the vertebrate genome is a mosaic of isochores, typically megabase-size DNA segments that belong to a small number of families characterized by different GC levels. In this review, we concentrate on some general genome features regarding the compositional organization of different organisms and their evolution, ranging from vertebrates to invertebrates to unicellular organisms. Since isochores are tightly linked to biological properties such as gene density, replication timing, and recombination, the new level of detail provided by the isochore map has helped the understanding of genome structure, function, and evolution. All the findings reported here confirm the idea that isochores can be considered a "fundamental level of genome structure and organization." We stress that we do not discuss in this review the origin of isochores, which is still a matter of controversy, but focus on well-established structural and physiological aspects.


Subject(s)
Genome/genetics , Isochores/genetics , Sequence Analysis, DNA/methods , Animals , Base Composition , Biological Evolution , Chromosome Mapping , Computational Biology/methods , DNA , Evolution, Molecular , Genome/physiology , Humans , Invertebrates/genetics , Isochores/physiology , Vertebrates/genetics
18.
Genome Biol Evol ; 8(8): 2312-8, 2016 08 16.
Article in English | MEDLINE | ID: mdl-27435793

ABSTRACT

Eukaryotic genomes are compositionally heterogeneous, that is, composed of regions that differ in guanine-cytosine (GC) content (isochores). The best documented case is that of vertebrates (mainly mammals), although it has also been noted among unicellular eukaryotes and invertebrates. In the human genome, regarded as that of a typical mammal, this heterogeneity is associated with several features. Specifically, genes located in the GC-richest regions are the GC3-richest, display CpG islands and have shorter introns. Furthermore, these genes are more heavily expressed and tend to be located at the extremes of the chromosomes. Although compositional heterogeneity seems to be widespread among eukaryotes, the associated properties noted in the human genome and other mammals have not been investigated in depth in other taxa. Here we provide evidence that the genome of the parasitic flatworm Schistosoma mansoni is compositionally heterogeneous and exhibits an isochore-like structure, displaying some features associated, until now, only with the human and other vertebrate genomes, with the exception of gene concentration.


Subject(s)
Evolution, Molecular , Genome, Helminth , Isochores/genetics , Schistosoma mansoni/genetics , Animals , GC Rich Sequence
19.
Virus Res ; 223: 147-52, 2016 09 02.
Article in English | MEDLINE | ID: mdl-27449601

ABSTRACT

Zika virus (ZIKV) is a member of the family Flaviviridae and its genome consists of a single-stranded positive-sense RNA molecule of 10,794 nucleotides. Clinical manifestations of disease caused by ZIKV infection range from asymptomatic cases to an influenza-like syndrome. There is increasing concern about the possible relation between microcephaly and ZIKV infection. Gaining insight into the relation between the codon usage of viruses and that of their hosts is extremely important for understanding virus survival, fitness, evasion of the host's immune system and evolution. In this study, we performed a comprehensive analysis of codon usage and composition of ZIKV. The overall codon usage among ZIKV strains is similar and slightly biased. Different codon preferences were found in ZIKV genes relative to the codon usage of human, Aedes aegypti and Aedes albopictus genes. Most of the highly frequent codons are A-ending, which strongly suggests that mutational bias is the main force shaping codon usage in this virus. G+C compositional constraint as well as dinucleotide composition also influence the codon usage of ZIKV. The results of these studies suggest that the emergence of ZIKV outside Africa, in the Pacific and the Americas, may also be reflected in ZIKV codon usage. No significant differences in codon usage were found between strains isolated from microcephaly cases and the rest of the strains from the Asian cluster included in these studies.


Subject(s)
Codon , Genome, Viral , Zika Virus/genetics , Adaptation, Biological , Base Composition , Evolution, Molecular , Genetic Variation , Humans , Mutation , Open Reading Frames , Phylogeny , Whole Genome Sequencing
20.
Infect Genet Evol ; 43: 267-73, 2016 09.
Article in English | MEDLINE | ID: mdl-27264728

ABSTRACT

Hepatitis E virus (HEV) is an emergent hepatotropic virus endemic mainly in Asia and other developing areas. However, in the last decade it has been increasingly reported in high-income countries. Human-infecting HEV strains are currently classified into four genotypes (1-4). Genotype 3 (HEV-3) is the prevalent virus genotype and the one most often associated with autochthonous and sporadic cases of HEV in developed areas. The evolutionary history of HEV worldwide remains largely unknown. In this study we reconstructed the spatiotemporal and population dynamics of HEV-3 at a global scale, but with particular emphasis on South America, where case reports have increased dramatically in recent years. To achieve this, we applied a Bayesian coalescent-based approach to a comprehensive data set comprising 97 GenBank HEV-3 sequences for which the location and sampling date were documented. Our phylogenetic analyses suggest that the worldwide genetic diversity of HEV-3 can be grouped into two main clades (I and II) with a Tmrca dated approximately 320 years ago (95% HPD: 420-236 years) and that a unique independent introduction of HEV-3 seems to have occurred in Uruguay, where most of the human HEV cases in South America have been described. The phylodynamic inference indicates that the population size of this virus underwent substantial temporal variations after the second half of the 20th century. In this sense, and contrary to what has been postulated to date, we suggest that the worldwide effective population size of HEV-3 is not decreasing and that frequent sources of error in its estimates stem from assumptions that the analyzed sequences are derived from a single panmictic population. Novel insights into the global population dynamics of HEV are given. Additionally, this work constitutes an attempt to further describe, in a Bayesian coalescent framework, the phylodynamics and evolutionary history of HEV-3 in the South American region.


Subject(s)
Genetic Variation , Genotype , Hepatitis E virus/genetics , Hepatitis E/epidemiology , Phylogeny , Bayes Theorem , Biological Evolution , Hepatitis E/virology , Hepatitis E virus/classification , Humans , Monte Carlo Method , Phylogeography , Population Dynamics , South America/epidemiology , Time Factors