Results 1 - 20 of 59
2.
Mol Biol Evol ; 40(11)2023 Nov 03.
Article in English | MEDLINE | ID: mdl-37879113

ABSTRACT

In phylogenomics, incongruences between gene trees, resulting from both artifactual and biological reasons, can decrease the signal-to-noise ratio and complicate species tree inference. The amount of data handled today in classical phylogenomic analyses precludes manual error detection and removal. However, a simple and efficient way to automate the identification of outliers from a collection of gene trees is still missing. Here, we present PhylteR, a method that allows rapid and accurate detection of outlier sequences in phylogenomic datasets, i.e. species from individual gene trees that do not follow the general trend. PhylteR relies on DISTATIS, an extension of multidimensional scaling to 3 dimensions to compare multiple distance matrices at once. In PhylteR, these distance matrices extracted from individual gene phylogenies represent evolutionary distances between species according to each gene. On simulated datasets, we show that PhylteR identifies outliers with more sensitivity and precision than a comparable existing method. We also show that PhylteR is not sensitive to ILS-induced incongruences, which is a desirable feature. On a biological dataset of 14,463 genes for 53 species previously assembled for Carnivora phylogenomics, we show (i) that PhylteR identifies as outliers sequences that can be considered as such by other means, and (ii) that the removal of these sequences improves the concordance between the gene trees and the species tree. Thanks to the generation of numerous graphical outputs, PhylteR also allows for the rapid and easy visual characterization of the dataset at hand, thus aiding in the precise identification of errors. PhylteR is distributed as an R package on CRAN and as containerized versions (docker and singularity).


Subjects
Biological Evolution , Phylogeny
3.
PLoS Comput Biol ; 18(11): e1010621, 2022 11.
Article in English | MEDLINE | ID: mdl-36327227
4.
PLoS Biol ; 20(9): e3001776, 2022 09.
Article in English | MEDLINE | ID: mdl-36103518

ABSTRACT

Introgression, endosymbiosis, and gene transfer, i.e., horizontal gene flow (HGF), are primordial sources of innovation in all domains of life. Our knowledge of HGF relies on detection methods that exploit some of the signatures it leaves on extant genomes. One of them is the effect of HGF on the branch lengths of reconstructed phylogenies. This signature has been formalized in statistical tests for HGF detection and used, for example, to detect massive adaptive gene flows in malaria vectors or to order the evolutionary events involved in eukaryogenesis. However, these studies rely on the assumption that ghost lineages (all unsampled extant and extinct taxa) have little influence. We demonstrate here with simulations and data reanalysis that, under the more realistic condition that unsampled taxa are legion compared to sampled ones, the conclusions of these studies become unfounded or even reversed. This illustrates the necessity of recognizing the existence of ghosts in evolutionary studies.


Subjects
Biological Evolution , Gene Flow , Genome , Phylogeny
5.
Hist Philos Life Sci ; 44(3): 34, 2022 Aug 02.
Article in English | MEDLINE | ID: mdl-35918616

ABSTRACT

This is the story, told in the light of a new analysis of historical data, of a mathematical biology problem that was explored in the 1930s in Thomas Morgan's laboratory at the California Institute of Technology. It is one of the early developments of evolutionary genetics and quantitative phylogeny, and deals with the identification and counting of chromosomal inversions in Drosophila species from comparisons of genetic maps. A re-analysis of the data produced in the 1930s using current mathematics and computational technologies reveals how a team of biologists, with the help of a renowned mathematician and against their first intuition, came to an erroneous conclusion regarding the presence of phylogenetic signals in gene arrangements. This example illustrates two different aspects of the same story: (1) the appearance of a mathematical problem in biology solved with the development of a combinatorial algorithm, which was unusual at the time, and (2) the role of errors in scientific activity. Also underlying is the possible influence of computational complexity in understanding the directions of research in biology.


Subjects
Chromosome Inversion , Drosophila , Animals , Biology , Drosophila/genetics , Mathematics , Phylogeny
6.
Bioinformatics ; 38(8): 2350-2352, 2022 04 12.
Article in English | MEDLINE | ID: mdl-35139153

ABSTRACT

MOTIVATION: Reconciliation between host and symbiont phylogenies, or between species and gene phylogenies, is a prevalent approach in evolution; however, no simple generic tool (i.e. virtually usable by all reconciliation software, from host/symbiont to species/gene comparisons) is available to visualize reconciliation results. Moreover, there is no tool to visualize 3-level reconciliations, i.e. to visualize 2 nested reconciliations as, for example, in a host/symbiont/gene complex. RESULTS: Thirdkind is a lightweight, easy-to-install command-line program that produces SVG files displaying reconciliations, including 3-level reconciliations. It takes the standard recPhyloXML format as input, and is thus usable with most reconciliation software. AVAILABILITY AND IMPLEMENTATION: https://github.com/simonpenel/thirdkind/wiki. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Software , Phylogeny
7.
Syst Biol ; 71(5): 1147-1158, 2022 08 10.
Article in English | MEDLINE | ID: mdl-35169846

ABSTRACT

Most species are extinct; those that are not are often unknown. Sequenced and sampled species are often a minority of known ones. Past evolutionary events involving horizontal gene flow, such as horizontal gene transfer, hybridization, introgression, and admixture, are therefore likely to involve "ghosts," that is, extinct, unknown, or unsampled lineages. The existence of these ghost lineages is widely acknowledged, but their possible impact on the detection of gene flow and on the identification of the species involved is largely overlooked. It is generally considered a possible source of error that, with reasonable approximation, can be ignored. We explore the possible influence of absent species on an evolutionary study by quantifying the effect of ghost lineages on introgression as detected by the popular D-statistic method. We show from simulated data that under certain frequently encountered conditions, the donors and recipients of horizontal gene flow can be wrongly identified if ghost lineages are not taken into account. In particular, having a distant outgroup, which is usually recommended, leads to an increase in the error probability and to false interpretations in most cases. We conclude that introgression from ghost lineages should be systematically considered as an alternative possible, even probable, scenario. [ABBA-BABA; D-statistic; gene flow; ghost lineage; introgression; simulation.]


Subjects
Gene Flow , Genetic Hybridization , Biological Evolution , Gene Flow/genetics , Horizontal Gene Transfer , Phylogeny
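
The D-statistic (ABBA-BABA) test mentioned in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' code: it assumes biallelic sites given as tuples of alleles for the four taxa (P1, P2, P3, outgroup), treats the outgroup allele as ancestral, and computes D = (nABBA - nBABA) / (nABBA + nBABA). The function name `d_statistic` and the input layout are hypothetical choices made for the example.

```python
from typing import Iterable, Tuple


def d_statistic(sites: Iterable[Tuple[str, str, str, str]]) -> float:
    """Compute D from site patterns given as (P1, P2, P3, outgroup) alleles.

    Assumption: the outgroup allele is ancestral ("A"); any other allele
    is derived ("B"). Only ABBA and BABA sites contribute to D.
    """
    abba = 0
    baba = 0
    for p1, p2, p3, out in sites:
        if p3 == out:
            continue  # P3 carries the ancestral allele: not informative
        if p2 == p3 and p1 == out:
            abba += 1  # pattern A B B A: P2 shares the derived allele with P3
        elif p1 == p3 and p2 == out:
            baba += 1  # pattern B A B A: P1 shares the derived allele with P3
    total = abba + baba
    if total == 0:
        raise ValueError("no ABBA or BABA sites in the input")
    return (abba - baba) / total


if __name__ == "__main__":
    example = [("A", "G", "G", "A")] * 3 + [("G", "A", "G", "A")] + [("A", "A", "G", "A")]
    print(d_statistic(example))
```

A D value far from zero is conventionally read as gene flow between P3 and one of P1 or P2. The point of the abstract is that the same excess can also arise from introgression involving an unsampled ghost lineage, so the value of D alone does not identify the donor and the recipient.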
8.
Genome Res ; 32(2): 280-296, 2022 02.
Article in English | MEDLINE | ID: mdl-34930799

ABSTRACT

Gene expression is regulated through complex molecular interactions, involving cis-acting elements that can be situated far away from their target genes. Data on long-range contacts between promoters and regulatory elements are rapidly accumulating. However, it remains unclear how these regulatory relationships evolve and how they contribute to the establishment of robust gene expression profiles. Here, we address these questions by comparing genome-wide maps of promoter-centered chromatin contacts in mouse and human. We show that there is significant evolutionary conservation of cis-regulatory landscapes, indicating that selective pressures act to preserve not only regulatory element sequences but also their chromatin contacts with target genes. The extent of evolutionary conservation is remarkable for long-range promoter-enhancer contacts, illustrating how the structure of regulatory landscapes constrains large-scale genome evolution. We show that the evolution of cis-regulatory landscapes, measured in terms of distal element sequences, synteny, or contacts with target genes, is significantly associated with gene expression evolution.


Subjects
Chromatin , Genetic Enhancer Elements , Animals , Chromatin/genetics , Molecular Evolution , Gene Expression , Mice , Genetic Promoter Regions , Synteny
9.
Methods Mol Biol ; 2231: 241-260, 2021.
Article in English | MEDLINE | ID: mdl-33289897

ABSTRACT

We present Seaview version 5, a multiplatform program to perform multiple alignment and phylogenetic tree building from molecular sequence data. Seaview provides network access to sequence databases, alignment with arbitrary algorithms, parsimony, distance and maximum likelihood tree building with PhyML, and display, printing, and copy-to-clipboard or export to SVG files of rooted or unrooted, binary or multifurcating phylogenetic trees. While Seaview is primarily a program providing a graphical user interface to guide the user in performing the desired analyses, Seaview also has a command-line mode suitable for user-provided scripts. Seaview version 5 introduces the ability to reconcile a gene tree with a reference species tree and use this reconciliation to root and rearrange the gene tree. Seaview is freely available at http://doua.prabi.fr/software/seaview.


Subjects
Sequence Alignment/methods , DNA Sequence Analysis/methods , Software , Algorithms , Codon/genetics , Molecular Evolution , Genetic Code , Molecular Sequence Data , Open Reading Frames/genetics , Phylogeny
10.
Bioinformatics ; 36(18): 4822-4824, 2020 09 15.
Article in English | MEDLINE | ID: mdl-33085745

ABSTRACT

MOTIVATION: Gene and species tree reconciliation methods are used to interpret gene trees, root them and correct uncertainties that are due to scarcity of signal in multiple sequence alignments. So far, reconciliation tools have not been integrated into standard phylogenetic software, and they lack either performance on certain functions or usability for biologists. RESULTS: We present Treerecs, phylogenetic software based on duplication-loss reconciliation. Treerecs is simple to install and to use. It is fast and versatile, produces graphical output, and can be used along with methods for phylogenetic inference on multiple alignments like PLL and Seaview. AVAILABILITY AND IMPLEMENTATION: Treerecs is open-source. Its source code (C++, AGPLv3) and manuals are available from https://project.inria.fr/treerecs/.


Subjects
Algorithms , Molecular Evolution , Phylogeny , Sequence Alignment , Software
11.
BMC Biol ; 18(1): 1, 2020 01 02.
Article in English | MEDLINE | ID: mdl-31898513

ABSTRACT

BACKGROUND: New sequencing technologies have lowered financial barriers to whole genome sequencing, but resulting assemblies are often fragmented and far from 'finished'. Updating multi-scaffold drafts to chromosome-level status can be achieved through experimental mapping or re-sequencing efforts. Avoiding the costs associated with such approaches, comparative genomic analysis of gene order conservation (synteny) to predict scaffold neighbours (adjacencies) offers a potentially useful complementary method for improving draft assemblies. RESULTS: We evaluated and employed 3 gene synteny-based methods applied to 21 Anopheles mosquito assemblies to produce consensus sets of scaffold adjacencies. For subsets of the assemblies, we integrated these with additional supporting data to confirm and complement the synteny-based adjacencies: 6 with physical mapping data that anchor scaffolds to chromosome locations, 13 with paired-end RNA sequencing (RNAseq) data, and 3 with new assemblies based on re-scaffolding or long-read data. Our combined analyses produced 20 new superscaffolded assemblies with improved contiguities: 7 for which assignments of non-anchored scaffolds to chromosome arms span more than 75% of the assemblies, and a further 7 with chromosome anchoring including an 88% anchored Anopheles arabiensis assembly and, respectively, 73% and 84% anchored assemblies with comprehensively updated cytogenetic photomaps for Anopheles funestus and Anopheles stephensi. CONCLUSIONS: Experimental data from probe mapping, RNAseq, or long-read technologies, where available, all contribute to successful upgrading of draft assemblies. Our evaluations show that gene synteny-based computational methods represent a valuable alternative or complementary approach. Our improved Anopheles reference assemblies highlight the utility of applying comparative genomics approaches to improve community genomic resources.


Subjects
Anopheles/genetics , Biological Evolution , Chromosomes , Genetic Techniques/instrumentation , Genomics/methods , Synteny , Animals , Chromosome Mapping
12.
Bioinformatics ; 36(4): 1286-1288, 2020 02 15.
Article in English | MEDLINE | ID: mdl-31566657

ABSTRACT

SUMMARY: Here we present Zombi, a tool to simulate the evolution of species, genomes and sequences in silico, which considers for the first time the evolution of genomes in extinct lineages. It also incorporates various features that have not to date been combined in a single simulator, such as the possibility of generating species trees with a pre-defined variation of speciation and extinction rates through time, explicitly simulating intergenic sequences of variable length, and outputting gene tree-species tree reconciliations. AVAILABILITY AND IMPLEMENTATION: Source code and manual are freely available at https://github.com/AADavin/ZOMBI/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genome , Software , Computer Simulation , Intergenic DNA , Phylogeny
13.
J Math Biol ; 78(6): 1981-2014, 2019 05.
Article in English | MEDLINE | ID: mdl-30767052

ABSTRACT

Gene tree/species tree reconciliation is a recent and decisive advance in phylogenetic methods, accounting for the possible differences between gene histories and species histories. Reconciliation consists of explaining these differences by gene-scale events such as duplication, loss, and transfer, which translates mathematically into a mapping between gene tree nodes and species tree nodes or branches. Gene conversion is a frequent and important evolutionary event, which results in the replacement of a gene by a copy of another from the same species and in the same gene tree. Including this event in reconciliation models has never been attempted because it introduces a dependency between lineages, and standard algorithms based on dynamic programming become ineffective. We propose here a novel mathematical framework including gene conversion as an evolutionary event in gene tree/species tree reconciliation. We describe a randomized algorithm that finds, in polynomial running time, a reconciliation minimizing the number of duplications, losses and conversions in the case when their weights are equal. We show that the space of optimal reconciliations includes an analog of the last common ancestor reconciliation, but is not limited to it. Our algorithm outputs any optimal reconciliation with a non-null probability. We argue that this study opens a research avenue on including gene conversion in reconciliation, and discuss its possible importance in biology.


Subjects
Molecular Evolution , Gene Conversion , Genetic Models , Phylogeny , Algorithms , Computer Simulation , Gene Deletion , Gene Duplication , Horizontal Gene Transfer , Probability
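
The "last common ancestor reconciliation" that this abstract refers to can be illustrated with the classic duplication-loss setting. The sketch below is an assumption-laden illustration, not the paper's randomized algorithm: it ignores conversions and losses, represents binary trees as nested tuples, identifies each species-tree node by its set of leaf species, and assumes a hypothetical leaf naming convention `species_geneid` for gene-tree leaves. A gene-tree node is counted as a duplication when it maps to the same species-tree node as one of its children.

```python
from typing import FrozenSet, List, Tuple, Union

# A binary tree is either a leaf label or a pair of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def clades(tree: Tree) -> List[FrozenSet[str]]:
    """Return the leaf set of every node of a species tree, in post-order."""
    if isinstance(tree, str):
        return [frozenset([tree])]
    left, right = clades(tree[0]), clades(tree[1])
    return left + right + [left[-1] | right[-1]]


def lca_map(species: FrozenSet[str], species_clades: List[FrozenSet[str]]) -> FrozenSet[str]:
    """Smallest species-tree clade containing all given species (their LCA)."""
    return min((c for c in species_clades if species <= c), key=len)


def count_duplications(gene_tree: Tree, species_tree: Tree) -> int:
    """Count duplication nodes under the LCA mapping."""
    species_clades = clades(species_tree)
    duplications = 0

    def visit(node: Tree) -> FrozenSet[str]:
        nonlocal duplications
        if isinstance(node, str):
            # Hypothetical convention: leaf names look like "species_geneid".
            return lca_map(frozenset([node.split("_")[0]]), species_clades)
        left, right = visit(node[0]), visit(node[1])
        here = lca_map(left | right, species_clades)
        if here == left or here == right:
            duplications += 1  # node maps to the same place as a child
        return here

    visit(gene_tree)
    return duplications


if __name__ == "__main__":
    species_tree = (("a", "b"), "c")
    gene_tree = ((("a_1", "b_1"), ("a_2", "b_2")), "c_1")
    print(count_duplications(gene_tree, species_tree))
```

Scanning all clades for each gene-tree node keeps the sketch short at the cost of efficiency; production tools use faster LCA data structures.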
14.
BMC Genomics ; 19(Suppl 2): 96, 2018 May 09.
Article in English | MEDLINE | ID: mdl-29764366

ABSTRACT

BACKGROUND: Genome rearrangements carry valuable information for phylogenetic inference or the elucidation of molecular mechanisms of adaptation. However, the detection of genome rearrangements is often hampered by current deficiencies in data and methods: genomes obtained from short sequence reads generally have very fragmented assemblies, and comparing multiple gene orders generally leads to computationally intractable algorithmic questions. RESULTS: We present a computational method, ADSEQ, which, by combining ancestral gene order reconstruction, comparative scaffolding and de novo scaffolding methods, overcomes these two caveats. ADSEQ simultaneously provides improved assemblies and ancestral genomes, with statistical supports on all local features. Compared to previous comparative methods, it runs in polynomial time, it samples solutions in a probabilistic space, and it can handle a significantly larger gene complement from the considered extant genomes, with complex histories including gene duplications and losses. We use ADSEQ to provide improved assemblies and a genome history made of duplications, losses, gene translocations, and rearrangements for 18 complete Anopheles genomes, including several important malaria vectors. We also provide additional support for a differentiated mode of evolution of the sex chromosome and of the autosomes in these mosquito genomes. CONCLUSIONS: We demonstrate the method's ability to improve extant assemblies accurately through a procedure simulating realistic assembly fragmentation. We study a debated issue regarding the phylogeny of the Gambiae complex group of Anopheles genomes in the light of the evolution of chromosomal rearrangements, suggesting that the phylogenetic signal they carry can differ from the phylogenetic signal carried by gene sequences, which are more prone to introgression.


Subjects
Anopheles/genetics , Computational Biology/methods , Gene Rearrangement , Mosquito Vectors/genetics , Algorithms , Animals , Molecular Evolution , Gene Order , Insect Genome , Phylogeny , Sex Chromosomes/genetics
15.
Bioinformatics ; 34(21): 3646-3652, 2018 11 01.
Article in English | MEDLINE | ID: mdl-29762653

ABSTRACT

Motivation: A reconciliation is an annotation of the nodes of a gene tree with evolutionary events (for example, speciation, gene duplication, transfer, loss, etc.) along with a mapping onto a species tree. Many algorithms and software tools produce or use reconciliations, but often in different formats, which vary in the type of events considered or in whether the species tree is dated. This complicates comparison and communication between different programs. Results: Here, we gather a consortium of software developers in gene tree/species tree reconciliation to propose and endorse a format that aims to promote an integrative, albeit flexible, specification of phylogenetic reconciliations. This format, named recPhyloXML, is accompanied by several tools such as a reconciled tree visualizer and conversion utilities. Availability and implementation: http://phylariane.univ-lyon1.fr/recphyloxml/.


Subjects
Molecular Evolution , Gene Duplication , Algorithms , Phylogeny , Software
16.
Nat Ecol Evol ; 2(5): 904-909, 2018 05.
Article in English | MEDLINE | ID: mdl-29610471

ABSTRACT

Biodiversity has always been predominantly microbial, and the scarcity of fossils from bacteria, archaea and microbial eukaryotes has prevented a comprehensive dating of the tree of life. Here, we show that patterns of lateral gene transfer deduced from an analysis of modern genomes encode a novel and abundant source of information about the temporal coexistence of lineages throughout the history of life. We use state-of-the-art species tree-aware phylogenetic methods to reconstruct the history of thousands of gene families and demonstrate that dates implied by gene transfers are consistent with estimates from relaxed molecular clocks in Bacteria, Archaea and Eukarya. We present the order of speciations according to lateral gene transfer data calibrated to geological time for three datasets comprising 40 genomes for Cyanobacteria, 60 genomes for Archaea and 60 genomes for Fungi. An inspection of discrepancies between transfers and clocks and a comparison with mammalian fossils show that gene transfer in microbes is potentially as informative for dating the tree of life as the geological record in macroorganisms.


Subjects
Molecular Evolution , Horizontal Gene Transfer , Archaeal Genome , Bacterial Genome , Fungal Genome , Phylogeny , Cyanobacteria/genetics
17.
Methods Mol Biol ; 1704: 343-362, 2018.
Article in English | MEDLINE | ID: mdl-29277873

ABSTRACT

Comparative genomics considers the detection of similarities and differences between extant genomes and, based on more or less formalized hypotheses regarding the evolutionary processes involved, the inference of ancestral states explaining the similarities and of an evolutionary history explaining the differences. In this chapter, we focus on the reconstruction of the organization of ancient genomes into chromosomes. We review different methodological approaches and software, applied to a wide range of datasets from different kingdoms of life and at different evolutionary depths. We discuss relations with genome assembly, and potential approaches to validate computational predictions on ancient genomes that are almost always only accessible through these predictions.


Subjects
Biological Evolution , Computational Biology/methods , Ancient DNA/analysis , Genome , Genetic Models , Chromosomes , Gene Order , Genomics/methods , Software
18.
Algorithms Mol Biol ; 12: 16, 2017.
Article in English | MEDLINE | ID: mdl-28592988

ABSTRACT

BACKGROUND: Combinatorial works on genome rearrangements have so far ignored the influence of intergene sizes, i.e. the number of nucleotides between consecutive genes, although it was recently shown to be decisive for the accuracy of inference methods (Biller et al. in Genome Biol Evol 8:1427-39, 2016; Biller et al. in Beckmann A, Bienvenu L, Jonoska N, editors. Proceedings of Pursuit of the Universal-12th conference on computability in Europe, CiE 2016, Lecture notes in computer science, vol 9709, Paris, France, June 27-July 1, 2016. Berlin: Springer, p. 35-44, 2016). In this line, we define a new genome rearrangement model called wDCJ, a generalization of the well-known double cut and join (or DCJ) operation that modifies both the gene order and the intergene size distribution of a genome. RESULTS: We first provide a generic formula for the wDCJ distance between two genomes, and show that computing this distance is strongly NP-complete. We then propose an approximation algorithm of ratio 4/3, and two exact ones: a fixed-parameter tractable (FPT) algorithm and an integer linear programming (ILP) formulation. CONCLUSIONS: We provide theoretical and empirical bounds on the expected growth of the parameter at the center of our FPT and ILP algorithms, assuming a probabilistic model of evolution under wDCJ, which shows that both these algorithms should run reasonably fast in practice.
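
For orientation, the classic unweighted DCJ distance that wDCJ generalizes can be computed in linear time. The sketch below is not the wDCJ algorithm of this abstract (which accounts for intergene sizes and is NP-complete to compute); it assumes the simplest case, two circular single-chromosome genomes over the same genes, given as lists of signed integers, where the distance is the number of genes minus the number of cycles in the graph that joins gene extremities through the adjacencies of both genomes. The function names are hypothetical.

```python
from typing import Dict, List, Tuple

Extremity = Tuple[int, str]  # (gene, "t" for tail or "h" for head)


def adjacencies(genome: List[int]) -> Dict[Extremity, Extremity]:
    """Map each gene extremity to its neighbour on a circular chromosome."""

    def ends(gene: int) -> Tuple[Extremity, Extremity]:
        # (left, right) extremities in reading direction; a negative sign
        # means the gene is read head first.
        g = abs(gene)
        return ((g, "t"), (g, "h")) if gene > 0 else ((g, "h"), (g, "t"))

    adj: Dict[Extremity, Extremity] = {}
    n = len(genome)
    for i in range(n):
        right = ends(genome[i])[1]
        left = ends(genome[(i + 1) % n])[0]
        adj[right] = left
        adj[left] = right
    return adj


def dcj_distance(a: List[int], b: List[int]) -> int:
    """Unweighted DCJ distance between two circular genomes: n - cycles."""
    if sorted(map(abs, a)) != sorted(map(abs, b)):
        raise ValueError("genomes must contain the same genes")
    adj_a, adj_b = adjacencies(a), adjacencies(b)
    seen = set()
    cycles = 0
    for start in adj_a:
        if start in seen:
            continue
        cycles += 1
        x = start
        while x not in seen:
            # Follow one adjacency of genome a, then one of genome b.
            seen.add(x)
            y = adj_a[x]
            seen.add(y)
            x = adj_b[y]
    return len(a) - cycles


if __name__ == "__main__":
    print(dcj_distance([1, 2, 3], [1, -2, 3]))
```

In the weighted model, each adjacency would additionally carry an intergene size that the operations must redistribute, which is what makes the problem hard.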

20.
Bioinformatics ; 33(7): 980-987, 2017 04 01.
Article in English | MEDLINE | ID: mdl-28073758

ABSTRACT

Summary: Gene trees reconstructed from sequence alignments contain poorly supported branches when the phylogenetic signal in the sequences is insufficient to determine them all. When a species tree is available, the signal of gains and losses of genes can be used to correctly resolve the unsupported parts of the gene history. However, finding a most parsimonious binary resolution of a non-binary tree obtained by contracting the unsupported branches is NP-hard if transfer events are considered as possible gene-scale events, in addition to gene origination, duplication and loss. We propose an exact, parameterized algorithm to solve this problem in single-exponential time, where the parameter is the number of connected branches of the gene tree that show low support from the sequence alignment or, equivalently, the maximum number of children of any node of the gene tree once the low-support branches have been collapsed. This improves on the best known algorithm by an exponential factor. We propose a way to choose among optimal solutions based on the available information. We show the usability of this principle on several simulated and biological datasets. The results are comparable in quality to several other tested methods having similar goals, but our approach provides a lower running time and a guarantee that the produced solution is optimal. Availability and Implementation: Our algorithm has been integrated into the ecceTERA phylogeny package, available at http://mbb.univ-montp2.fr/MBB/download_sources/16__ecceTERA and which can be run online at http://mbb.univ-montp2.fr/MBB/subsection/softExec.php?soft=eccetera. Contact: celine.scornavacca@umontpellier.fr. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Gene Duplication , Genes , Phylogeny , Algorithms , Computer Simulation , Cyanobacteria/genetics , Genetic Databases , Molecular Evolution , Biological Extinction , Genetic Variation , Proteobacteria/genetics