1.
Mol Biol Evol ; 37(8): 2332-2340, 2020 08 01.
Article in English | MEDLINE | ID: mdl-32316034

ABSTRACT

Comparative genomics and molecular phylogenetics are foundational for understanding biological evolution. Although many studies have been made with the aim of understanding the genomic contents of early life, uncertainty remains. A study by Weiss et al. (Weiss MC, Sousa FL, Mrnjavac N, Neukirchen S, Roettger M, Nelson-Sathi S, Martin WF. 2016. The physiology and habitat of the last universal common ancestor. Nat Microbiol. 1(9):16116.) identified a number of protein families in the last universal common ancestor of archaea and bacteria (LUCA) which were not found in previous works. Here, we report new research that suggests the clustering approaches used in this previous study undersampled protein families, resulting in incomplete phylogenetic trees which do not reflect protein family evolution. Phylogenetic analysis of protein families which include more sequence homologs rejects a simple LUCA hypothesis based on phylogenetic separation of the bacterial and archaeal domains for a majority of the previously identified LUCA proteins (∼82%). To supplement limitations of phylogenetic inference derived from incompletely populated orthologous groups and to test the hypothesis of a period of rapid evolution preceding the separation of the domains, we compared phylogenetic distances both within and between domains, for thousands of orthologous groups. We find a substantial diversity of interdomain versus intradomain branch lengths, even among protein families which exhibit a single domain separating branch and are thought to be associated with the LUCA. Additionally, phylogenetic trees with long interdomain branches relative to intradomain branches are enriched in information categories of protein families in comparison to those associated with metabolic functions. These results provide a new view of protein family evolution and temper claims about the phenotype and habitat of the LUCA.
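The comparison of interdomain and intradomain distances described in this abstract can be illustrated with a minimal sketch. It assumes that pairwise tree (patristic) distances are already available for one orthologous group and summarizes them as a ratio of means; the function name `inter_intra_ratio`, the toy taxa, and the choice of the mean as summary statistic are illustrative assumptions, not the statistic used in the study.

```python
from itertools import combinations
from statistics import mean


def inter_intra_ratio(dist, domain):
    """Ratio of mean interdomain to mean intradomain distance.

    dist:   dict mapping frozenset({taxon_a, taxon_b}) -> distance
    domain: dict mapping taxon -> domain label
    (Hypothetical data layout chosen for this sketch.)
    """
    inter, intra = [], []
    for a, b in combinations(sorted(domain), 2):
        d = dist[frozenset((a, b))]
        # Same label: within-domain pair; different label: between-domain pair.
        (intra if domain[a] == domain[b] else inter).append(d)
    return mean(inter) / mean(intra)


# Toy orthologous group with two archaeal and two bacterial sequences.
toy_domain = {"arc1": "archaea", "arc2": "archaea",
              "bac1": "bacteria", "bac2": "bacteria"}
toy_dist = {
    frozenset(("arc1", "arc2")): 0.25,
    frozenset(("bac1", "bac2")): 0.75,
}
for a in ("arc1", "arc2"):
    for b in ("bac1", "bac2"):
        toy_dist[frozenset((a, b))] = 1.5

print(inter_intra_ratio(toy_dist, toy_domain))
```

A ratio well above 1 corresponds to a long domain-separating branch relative to the within-domain branches; computing it across many orthologous groups gives the kind of distribution the abstract refers to.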


Subjects
Archaea/genetics , Bacteria/genetics , Phylogeny , Archaeal Proteins/genetics , Bacterial Proteins/genetics
2.
RNA Biol ; 17(5): 663-676, 2020 05.
Article in English | MEDLINE | ID: mdl-32041469

ABSTRACT

Archaeal genomes are densely packed; thus, correct transcription termination is an important factor for orchestrated gene expression. A systematic analysis of RNA 3' termini to identify transcription termination sites (TTS) using RNA-seq data has hitherto been performed in only two archaea, Methanosarcina mazei and Sulfolobus acidocaldarius. In those studies, only regions directly downstream of annotated genes were analysed, and thus only part of the genome was investigated. Here, we developed a novel algorithm (Internal Enrichment-Peak Calling) that allows an unbiased, genome-wide identification of RNA 3' termini independent of annotation. In an RNA fraction enriched for primary transcripts by terminator exonuclease (TEX) treatment, we identified 1,543 RNA 3' termini. Approximately half of these were located in intergenic regions, and the remainder were found in coding regions. A strong sequence signature consistent with known termination events at intergenic loci indicates a clear enrichment for native TTS among them. Using these data, we determined distinct putative termination motifs for intergenic regions (a T stretch) and coding regions (AGATC). In vivo reporter gene tests of selected TTS confirmed termination at these sites, which exemplify the different motifs. For several genes, more than one termination site was detected, resulting in transcripts with different lengths of the 3' untranslated region (3' UTR).


Subjects
3' Untranslated Regions , Archaeal Gene Expression Regulation , Haloferax volcanii/genetics , Archaeal RNA/genetics , Algorithms , Cluster Analysis , Computational Biology/methods , Archaeal Genome , Genomics/methods , Molecular Sequence Annotation , Nucleotide Motifs , Open Reading Frames , Operon , Genetic Transcription Termination
3.
J Math Biol ; 77(2): 313-341, 2018 08.
Article in English | MEDLINE | ID: mdl-29260295

ABSTRACT

Clusters of paralogous genes such as the famous HOX cluster of developmental transcription factors tend to evolve by stepwise duplication of their members, often involving unequal crossing over. Gene conversion and possibly other mechanisms of concerted evolution further obfuscate the phylogenetic relationships. As a consequence, it is very difficult or even impossible to disentangle the detailed history of gene duplications in gene clusters. In this contribution we show that the expansion of gene clusters by unequal crossing over as proposed by Walter Gehring leads to distinctive patterns of genetic distances, namely a subclass of circular split systems. Furthermore, when the gene cluster was left undisturbed by genome rearrangements, the shortest Hamiltonian paths with respect to genetic distances coincide with the genomic order. This observation can be used to detect ancient genomic rearrangements of gene clusters and to distinguish gene clusters whose evolution was dominated by unequal crossing over within genes from those that expanded through other mechanisms.
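The Hamiltonian-path criterion in this abstract can be checked on small clusters by brute force. The sketch below is an illustration under stated assumptions: the function name `shortest_hamiltonian_paths` is hypothetical, the toy distance matrix (distance equal to the separation of two genes along the cluster) is an idealized stand-in for real genetic distances, and exhaustive enumeration is only feasible for roughly ten genes or fewer.

```python
from itertools import permutations


def shortest_hamiltonian_paths(dist):
    """Return (length, paths) for all shortest Hamiltonian paths.

    dist is a symmetric matrix (list of lists) of pairwise distances.
    Each path is reported once; its reverse is skipped.
    """
    n = len(dist)
    best_len, best = float("inf"), []
    for perm in permutations(range(n)):
        if perm[0] > perm[-1]:
            continue  # the reversed path has the same length
        length = sum(dist[a][b] for a, b in zip(perm, perm[1:]))
        if length < best_len:
            best_len, best = length, [perm]
        elif length == best_len:
            best.append(perm)
    return best_len, best


# Idealized cluster of five genes, indexed by genomic position.
toy = [[abs(i - j) for j in range(5)] for i in range(5)]
length, paths = shortest_hamiltonian_paths(toy)
print(length, paths)
```

For the toy matrix, the only shortest path visits the genes in genomic order. In real data, a shortest path that deviates from the observed gene order would be a hint of a rearrangement or of a different expansion mechanism, in the sense described in the abstract.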


Subjects
Genetic Models , Multigene Family , Alcohol Dehydrogenase/genetics , Algorithms , Animals , Computer Simulation , Genetic Crossing Over , Molecular Evolution , Gene Duplication , Homeobox Genes , Genome , Humans , Mathematical Concepts , Phylogeny , Genetic Recombination
4.
BMC Genomics ; 17(1): 617, 2016 08 11.
Article in English | MEDLINE | ID: mdl-27515907

ABSTRACT

BACKGROUND: Transfer RNAs (tRNAs) are ubiquitous in all living organisms. They implement the genetic code so that most genomes contain distinct tRNAs for almost all 61 codons. They behave similarly to mobile elements and proliferate in genomes, spawning both local and non-local copies. Most tRNA families are therefore typically present as multicopy genes. The members of the individual tRNA families evolve under concerted or rapid birth-death evolution, so that paralogous copies maintain almost identical sequences over long evolutionary time-scales. To a good approximation these are functionally equivalent. Individual tRNA copies thus are evolutionarily unstable and easily turn into pseudogenes and disappear. This leads to a rapid turnover of tRNAs and often large differences in the tRNA complements of closely related species. Since tRNA paralogs are not distinguished by sequence, common methods cannot be used to establish orthology between tRNA genes. RESULTS: In this contribution we introduce a general framework to distinguish orthologs and paralogs in gene families that are subject to concerted evolution. It is based on the use of uniquely aligned adjacent sequence elements as anchors to establish syntenic conservation of sequence intervals. In practice, anchors and intervals can be extracted from genome-wide multiple sequence alignments. Syntenic clusters of concertedly evolving genes of different families can then be subdivided by list alignments, leading to usually small clusters of candidate co-orthologs. On the basis of recent advances in phylogenetic combinatorics, these candidate clusters can be further processed by cograph editing to recover their duplication histories. We developed a workflow that can be conceptualized as stepwise refinement of a graph of homologous genes. We apply this analysis strategy with different types of synteny anchors to investigate the evolution of tRNAs in primates and fruit flies. We identified a large number of tRNA remolding events concentrated at the tips of the phylogeny. With one notable exception, phylogenetically old tRNA remoldings do not change the isoacceptor class. CONCLUSIONS: Gene families evolving under concerted evolution are not amenable to classical phylogenetic analyses since paralogs maintain identical, species-specific sequences, precluding the estimation of correct gene trees from sequence differences. This leaves conservation of syntenic arrangements with respect to "anchor elements" that are not subject to concerted evolution as the only viable source of phylogenetic information. We have demonstrated here that a purely synteny-based analysis of tRNA gene histories is indeed feasible. Although the choice of synteny anchors influences the resolution, in particular when tight gene clusters are present, and the quality of sequence alignments, genome assemblies, and genome rearrangements limits the scope of the analysis, largely coherent results can be obtained for tRNAs. In particular, we conclude that a large fraction of the tRNAs are recent copies. This proliferation is compensated by rapid pseudogenization as exemplified by many very recent alloacceptor remoldings.
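The first step of the synteny-based approach, placing genes into intervals delimited by uniquely aligned anchors, can be sketched as follows. This is a simplified illustration: the function names, coordinates, and anchor identifiers are hypothetical, and the sketch ignores strand, anchor-order inversions, and the later list-alignment and cograph-editing steps mentioned in the abstract.

```python
from bisect import bisect_right
from collections import defaultdict


def assign_intervals(anchors, genes):
    """Group genes by the pair of anchors that flank them.

    anchors: list of (position, anchor_id), sorted by position
    genes:   list of (position, gene_id)
    Returns a dict {(left_anchor_id, right_anchor_id): [gene_id, ...]}.
    """
    positions = [pos for pos, _ in anchors]
    intervals = defaultdict(list)
    for pos, gene_id in genes:
        i = bisect_right(positions, pos)
        left = anchors[i - 1][1] if i > 0 else None
        right = anchors[i][1] if i < len(anchors) else None
        intervals[(left, right)].append(gene_id)
    return dict(intervals)


def candidate_clusters(intervals_a, intervals_b):
    """Genes of two species sharing a syntenic interval are candidate co-orthologs."""
    shared = intervals_a.keys() & intervals_b.keys()
    return {key: (intervals_a[key], intervals_b[key]) for key in shared}


# Toy data: the same three anchors (a1, a2, a3) aligned in two genomes.
species_a = assign_intervals(
    [(100, "a1"), (500, "a2"), (900, "a3")],
    [(200, "A_tRNA1"), (300, "A_tRNA2"), (700, "A_tRNA3")],
)
species_b = assign_intervals(
    [(50, "a1"), (400, "a2"), (1000, "a3")],
    [(120, "B_tRNA1"), (600, "B_tRNA2"), (1200, "B_tRNA3")],
)
print(candidate_clusters(species_a, species_b))
```

In the toy example, `B_tRNA3` lies beyond the last anchor in species B and has no counterpart interval in species A, so it is left out of the candidate clusters. Because sequence similarity plays no role here, the grouping remains usable when paralogs are nearly identical.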


Subjects
Drosophila/genetics , Genome , Phylogeny , Primates/genetics , Transfer RNA/genetics , Synteny , Animals , Base Sequence , Codon , Molecular Evolution , Genetic Code , Multigene Family , Pseudogenes , Sequence Alignment , Nucleic Acid Sequence Homology
5.
Algorithms Mol Biol ; 19(1): 13, 2024 Mar 16.
Article in English | MEDLINE | ID: mdl-38493130

ABSTRACT

MOTIVATION: Many bioinformatics problems can be approached as optimization or controlled sampling tasks, and solved exactly and efficiently using Dynamic Programming (DP). However, such exact methods are typically tailored towards specific settings, complex to develop, and hard to implement and adapt to problem variations. METHODS: We introduce the Infrared framework to overcome such hindrances for a large class of problems. Its underlying paradigm is tailored toward problems that can be declaratively formalized as sparse feature networks, a generalization of constraint networks. Classic Boolean constraints specify a search space, consisting of putative solutions whose evaluation is performed through a combination of features. Problems are then solved using generic cluster tree elimination algorithms over a tree decomposition of the feature network. Their overall complexities are linear in the number of variables, and only exponential in the treewidth of the feature network. For sparse feature networks, associated with low to moderate treewidths, these algorithms make it possible to find optimal solutions, or generate controlled samples, with practical empirical efficiency. RESULTS: Implementing these methods, the Infrared software allows Python programmers to rapidly develop exact optimization and sampling applications based on efficient tree decomposition-based processing. Instead of directly coding specialized algorithms, problems are declaratively modeled as sets of variables over finite domains, whose dependencies are captured by constraints and functions. Such models are then automatically solved by generic DP algorithms. To illustrate the applicability of Infrared in bioinformatics and guide new users, we model and discuss variants of bioinformatics applications. We provide reimplementations and extensions of methods for RNA design, RNA sequence-structure alignment, parsimony-driven inference of ancestral traits in phylogenetic trees/networks, and design of coding sequences. Moreover, we demonstrate multidimensional Boltzmann sampling. These applications of the framework, together with our novel results, underline the practical relevance of Infrared. Remarkably, the achieved complexities are typically equivalent to the ones of specialized algorithms and implementations. AVAILABILITY: Infrared is available at https://amibio.gitlabpages.inria.fr/Infrared with extensive documentation, including various usage examples and API reference; it can be installed using Conda or from source.
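One of the applications listed in this abstract, parsimony-driven inference of ancestral traits on a tree, is a compact example of the kind of hand-written DP that a declarative framework aims to replace. The sketch below is plain Python and does not use the Infrared API; it is a Sankoff-style recursion with unit change costs, and the function name, tree encoding, and toy data are assumptions made for illustration.

```python
def small_parsimony(tree, leaf_states, states, root):
    """Minimum number of state changes on a rooted tree (unit costs).

    tree:        dict mapping internal node -> list of child nodes
    leaf_states: dict mapping leaf -> observed state
    states:      iterable of all possible states
    """
    states = list(states)

    def cost(node):
        # cost(node)[s] = minimal changes in the subtree if node has state s
        children = tree.get(node, [])
        if not children:
            return {s: 0 if s == leaf_states[node] else float("inf")
                    for s in states}
        total = {s: 0 for s in states}
        for child in children:
            child_cost = cost(child)
            for s in states:
                # (s != t) adds one change when parent and child differ
                total[s] += min(child_cost[t] + (s != t) for t in states)
        return total

    return min(cost(root).values())


# Toy tree: root -> (x, y), x -> (a, b), y -> (c, d)
toy_tree = {"root": ["x", "y"], "x": ["a", "b"], "y": ["c", "d"]}
print(small_parsimony(toy_tree, {"a": 0, "b": 0, "c": 1, "d": 1}, [0, 1], "root"))
```

The recursion touches each edge once and tries every pair of states per edge, which is the linear-in-variables behaviour expected for a tree-shaped dependency structure (treewidth 1). Networks with cycles are where a tree decomposition, as described in the abstract, becomes necessary.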

6.
Front Big Data ; 7: 1354007, 2024.
Article in English | MEDLINE | ID: mdl-38495847

ABSTRACT

Introduction: Is Paris a 15-min city, where inhabitants can access essential amenities such as schools and shops with a 15-min walk or bike ride? The concept of a 15-min (more generally, X-minute) city was launched in the French capital and was part of the current mayor's plan in her latest re-election campaign. Yet, its fit with the existing urban structure had not been previously assessed. Methods: This article combines open map data from a large participatory project and geo-localized socio-economic data from official statistics to fill this gap. Results: We show that, while the city of Paris is rather homogeneous, it is nonetheless characterized by remarkable inequalities between a highly accessible city center (though with some internal differences in terms of types of amenities) and a less well-equipped periphery, where lower-income neighborhoods are more often found. The heterogeneity increases if we consider Paris together with its immediate surroundings, the "Petite Couronne," where large numbers of daily commuters and other users of city facilities live. Discussion: We thus conclude that successful implementation of the X-minute-city concept requires addressing existing socio-economic inequalities, and that especially in big cities, it should be extended beyond the narrow boundaries of the municipality itself to encompass the larger area around it.

7.
J Pers Med ; 13(6)2023 May 31.
Article in English | MEDLINE | ID: mdl-37373913

ABSTRACT

(1) Background: Cystic fibrosis (CF) is a disease with well-documented clinical differences between female and male patients. However, this gender gap is very poorly studied at the molecular level. (2) Methods: Expression differences in whole blood transcriptomics between female and male CF patients are analyzed in order to determine the pathways related to sex-biased genes and assess their potential influence on sex-specific effects in CF patients. (3) Results: We identify sex-biased genes in female and male CF patients and provide explanations for some sex-specific differences at the molecular level. (4) Conclusion: Genes in key pathways associated with CF are differentially expressed between sexes, and thus may account for the gender gap in morbidity and mortality in CF.

8.
Algorithms Mol Biol ; 18(1): 18, 2023 Dec 01.
Article in English | MEDLINE | ID: mdl-38041153

ABSTRACT

Although RNA secondary structure prediction is a textbook application of dynamic programming (DP) and a routine task in RNA structure analysis, it remains challenging whenever pseudoknots come into play. Since the prediction of pseudoknotted structures by minimizing (realistically modelled) energy is NP-hard, specialized algorithms have been proposed for restricted conformation classes that capture the most frequently observed configurations. To achieve good performance, these methods rely on specific and carefully hand-crafted DP schemes. In contrast, we generalize and fully automate the design of DP pseudoknot prediction algorithms. For this purpose, we formalize the problem of designing DP algorithms for an (infinite) class of conformations, modeled by (a finite number of) fatgraphs, and automatically build DP schemes minimizing their algorithmic complexity. We propose an algorithm for the problem, based on the tree-decomposition of a well-chosen representative structure, which we simplify and reinterpret as a DP scheme. The algorithm is fixed-parameter tractable for the treewidth tw of the fatgraph, and its output represents a [Formula: see text] algorithm (and even possibly [Formula: see text] in simple energy models) for predicting the MFE folding of an RNA of length n. We demonstrate, for the most common pseudoknot classes, that our automatically generated algorithms achieve the same complexities as reported in the literature for hand-crafted schemes. Our framework supports general energy models, partition function computations, recursive substructures and partial folding, and could pave the way for algebraic dynamic programming beyond the context-free case.
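For readers unfamiliar with the "textbook" DP this abstract starts from, the sketch below shows the pseudoknot-free case in its simplest form: Nussinov-style base-pair maximization. It is not the method of the paper. It counts pairs instead of minimizing a realistic energy, and the function name, the allowed pair set (including G-U wobble pairs), and the minimum loop length of 3 are assumptions chosen for illustration.

```python
def nussinov(seq, min_loop=3):
    """Maximum number of nested (pseudoknot-free) base pairs in seq."""
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j]
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j stays unpaired
            # case 2: j pairs with k, leaving at least min_loop bases between
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in pairs:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


print(nussinov("GGGAAAUCC"))
```

The recursion splits the interval only into nested or adjacent parts, which is exactly why it cannot represent crossing pairs. Handling pseudoknots requires DP tables indexed by more than two positions, and choosing those indices economically is the design problem the abstract describes.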

9.
Microbes Environ ; 36(2)2021.
Article in English | MEDLINE | ID: mdl-33952861

ABSTRACT

Cyanobacteria thrive in diverse environments. However, questions remain about possible growth limitations in ancient environmental conditions. As a single genus, the Thermosynechococcus are cosmopolitan and live in chemically diverse habitats. To understand the genetic basis for this, we compared the protein-coding component of Thermosynechococcus genomes. Supplementing the known genetic diversity of Thermosynechococcus, we report draft metagenome-assembled genomes of two Thermosynechococcus recovered from ferrous carbonate hot springs in Japan. We find that as a genus, Thermosynechococcus is genomically conserved, having a small pan-genome with few accessory genes per individual strain as well as few genes that are unique to the genus. Furthermore, by comparing orthologous protein groups, including an analysis of genes encoding proteins with an iron-related function (uptake, storage or utilization), no clear differences in genetic content or adaptive mechanisms could be detected between genus members, despite the range of environments they inhabit. Overall, our results highlight a seemingly innate ability for Thermosynechococcus to inhabit diverse habitats without having undergone substantial genomic adaptation to accommodate this. The finding of Thermosynechococcus in both hot and high-iron environments without adaptation recognizable from the perspective of the proteome has implications for understanding the basis of thermophily within this clade, and also for understanding the possible genetic basis for high iron tolerance in cyanobacteria on early Earth. The conserved core genome may be indicative of an allopatric lifestyle, or of reduced genetic complexity of hot spring habitats relative to other environments.


Subjects
Bacterial Genome , Thermosynechococcus/genetics , Thermosynechococcus/isolation & purification , Physiological Adaptation , Ecosystem , Genomics , Hot Springs/microbiology , Japan , Phylogeny , Thermosynechococcus/classification , Thermosynechococcus/physiology
10.
Front Microbiol ; 11: 594838, 2020.
Article in English | MEDLINE | ID: mdl-33329479

ABSTRACT

In all three domains of life, tRNA genes contain introns that must be removed to yield functional tRNA. In archaea and eukarya, the first step of this process is catalyzed by a splicing endonuclease. The consensus structure recognized by the splicing endonuclease is a bulge-helix-bulge (BHB) motif which is also found in rRNA precursors. So far, a systematic analysis to identify all biological substrates of the splicing endonuclease has not been carried out. In this study, we employed CRISPRi to repress expression of the splicing endonuclease in the archaeon Haloferax volcanii to identify all substrates of this enzyme. Expression of the splicing endonuclease was reduced to 1% of its normal level, resulting in a significant extension of lag phase in H. volcanii growth. In the repression strain, 41 genes were down-regulated and 102 were up-regulated. As an additional approach in identifying new substrates of the splicing endonuclease, we isolated and sequenced circular RNAs, which identified excised introns removed from tRNA and rRNA precursors as well as from the 5' UTR of the gene HVO_1309. In vitro processing assays showed that the BHB sites in the 5' UTR of HVO_1309 and in a 16S rRNA-like precursor are processed by the recombinant splicing endonuclease. The splicing endonuclease is therefore an important player in RNA maturation in archaea.

11.
Life (Basel) ; 7(4)2017 Oct 31.
Article in English | MEDLINE | ID: mdl-29088079

ABSTRACT

Several families of multicopy genes, such as transfer ribonucleic acids (tRNAs) and ribosomal RNAs (rRNAs), are subject to concerted evolution, an effect that keeps sequences of paralogous genes effectively identical. Under these circumstances, it is impossible to distinguish orthologs from paralogs on the basis of sequence similarity alone. However, synteny, the preservation of relative genomic locations, remains informative for the disambiguation of evolutionary relationships in this situation. In this contribution, we describe an automatic pipeline for the evolutionary analysis of such cases that uses genome-wide alignments as a starting point to assign orthology relationships determined by synteny. The evolution of tRNAs in primates as well as the history of the Y RNA family in vertebrates and nematodes are used to showcase the method. The pipeline is freely available.
