Results 1 - 20 of 67
1.
Syst Biol ; 2024 Feb 08.
Article in English | MEDLINE | ID: mdl-38330161

ABSTRACT

The evolution of gene families is complex, involving gene-level evolutionary events such as gene duplication, horizontal gene transfer, and gene loss (DTL), and other processes such as incomplete lineage sorting (ILS). Because of this, topological differences often exist between gene trees and species trees. A number of models have been recently developed to explain these discrepancies, the most realistic of which attempt to consider both gene-level events and ILS. When unified in a single model, the interaction between ILS and gene-level events can cause polymorphism in gene copy number, which we refer to as copy number hemiplasy (CNH). In this paper we extend the Wright-Fisher process to include duplications and losses over several species, and show that the probability of CNH for this process can be significant. We study how well two unified models - MLMSC (MultiLocus MultiSpecies Coalescent), which models CNH, and DLCoal (Duplication, Loss, and Coalescence), which does not - approximate the Wright-Fisher process with duplication and loss. We then study the effect of CNH on gene family evolution by comparing MLMSC and DLCoal. We generate comparable gene trees under both models, showing significant differences in various summary statistics; most importantly, CNH reduces the number of gene copies greatly. If this is not taken into account, the traditional method of estimating duplication rates (by counting the number of gene copies) becomes inaccurate. The simulated gene trees are also used for species tree inference with the summary methods ASTRAL and ASTRAL-Pro, demonstrating that their accuracy, based on CNH-unaware simulations calibrated on real data, may have been overestimated.

2.
Nucleic Acids Res ; 52(D1): D529-D535, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37843103

ABSTRACT

To date, the databases built to gather information on gene orthology do not provide end-users with descriptors of the molecular evolution information and phylogenetic pattern of these orthologues. In this context, we developed OrthoMaM, a database of ORTHOlogous MAmmalian Markers describing the evolutionary dynamics of coding sequences in mammalian genomes. OrthoMaM version 12 includes 15,868 alignments of orthologous coding sequences (CDS) from the 190 complete mammalian genomes currently available. All annotations and 1-to-1 orthology assignments are based on NCBI. Orthologous CDS can be mined for potential informative markers at the different taxonomic levels of the mammalian tree. To this end, several evolutionary descriptors of DNA sequences are provided for querying purposes (e.g. base composition and relative substitution rate). The graphical web interface allows the user to easily browse and sort the results of combined queries. The corresponding multiple sequence alignments and ML trees, inferred using state-of-the-art approaches, are available for download both at the nucleotide and amino acid levels. OrthoMaM v12 can be used by researchers interested either in reconstructing the phylogenetic relationships of mammalian taxa or in understanding the evolutionary dynamics of coding sequences in their genomes. OrthoMaM is available for browsing, querying and complete or filtered download at https://orthomam.mbb.cnrs.fr/.


Subjects
Genetic Databases , Genomics , Animals , Base Sequence , Genome , Genomics/methods , Mammals/classification , Mammals/genetics , Phylogeny , Biological Evolution
3.
Mol Biol Evol ; 40(10), 2023 10 04.
Article in English | MEDLINE | ID: mdl-37794645

ABSTRACT

Pangolins form a group of scaly mammals that are trafficked at record numbers for their meat and purported medicinal properties. Despite their conservation concern, knowledge of their evolution is limited by a paucity of genomic data. We aim to produce exhaustive genomic resources that include 3,238 orthologous genes and whole-genome polymorphisms to assess the evolution of all eight extant pangolin species. Robust orthologous gene-based phylogenies recovered the monophyly of the three genera and highlighted the existence of an undescribed species closely related to Southeast Asian pangolins. Signatures of middle Miocene admixture between an extinct, possibly European, lineage and the ancestor of Southeast Asian pangolins, provide new insights into the early evolutionary history of the group. Demographic trajectories and genome-wide heterozygosity estimates revealed contrasts between continental versus island populations and species lineages, suggesting that conservation planning should consider intraspecific patterns. With the expected loss of genomic diversity from recent, extensive trafficking not yet realized in pangolins, we recommend that populations be genetically surveyed to anticipate any deleterious impact of the illegal trade. Finally, we produce a complete set of genomic resources that will be integral for future conservation management and forensic endeavors for pangolins, including tracing their illegal trade. These comprise the completion of whole-genomes for pangolins through the hybrid assembly of the first reference genome for the giant pangolin (Smutsia gigantea) and new draft genomes (∼43x-77x) for four additional species, as well as a database of orthologous genes with over 3.4 million polymorphic sites.


Subjects
Mammals , Pangolins , Animals , Pangolins/genetics , Mammals/genetics , Genome , Phylogeny , Genomics
4.
J Math Biol ; 85(3): 22, 2022 08 17.
Article in English | MEDLINE | ID: mdl-35976512

ABSTRACT

Summary methods seek to infer a species tree from a set of gene trees. A desirable property of such methods is that of statistical consistency; that is, the probability of inferring the wrong species tree (the error probability) tends to 0 as the number of input gene trees becomes large. A popular paradigm is to infer a species tree that agrees with the maximum number of quartets from the input set of gene trees; this has been proved to be statistically consistent under several models of gene evolution. In this paper, we study the asymptotic behaviour of the error probability of such methods in this limit, and show that it decays exponentially. For a 4-taxon species tree, we derive a closed form for the asymptotic behaviour in terms of the probability that the gene evolution process produces the correct topology. We also derive bounds for the sample complexity (the number of gene trees required to infer the true species tree with a given probability), which outperform existing bounds. We then extend our results to bounds for the asymptotic behaviour of the error probability for any species tree, and compare these to the true error probability for some model species trees using simulations.
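
The 4-taxon case described in this abstract can be illustrated numerically. The sketch below is not the closed form derived in the paper; it is a brute-force calculation under two stated assumptions: each gene tree independently shows the correct quartet topology with probability p and each of the two wrong topologies with probability (1 - p)/2, and a tie for the most frequent topology is counted as an error. The function name is hypothetical.

```python
from math import comb


def quartet_error_probability(n: int, p: float) -> float:
    """Exact probability that the correct quartet topology is NOT the strict
    plurality winner among n independent gene trees.

    Assumptions (illustrative only): the two incorrect topologies are equally
    likely, and ties are counted as errors.
    """
    q = (1.0 - p) / 2.0  # probability of each incorrect topology
    p_correct = 0.0
    for a in range(n + 1):            # gene trees showing the correct topology
        for b in range(n - a + 1):    # gene trees showing the first wrong topology
            c = n - a - b             # gene trees showing the second wrong topology
            if a > b and a > c:
                # multinomial probability of the count vector (a, b, c)
                p_correct += comb(n, a) * comb(n - a, b) * p**a * q**b * q**c
    return 1.0 - p_correct


if __name__ == "__main__":
    # The error probability shrinks quickly as the number of gene trees grows.
    for n in (5, 10, 20, 40, 80):
        print(n, quartet_error_probability(n, 0.5))
```

Printing the values for increasing n gives a quick feel for the decay that the abstract states is exponential; the double loop is quadratic in n, so it is only meant for small illustrations.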


Subjects
Molecular Evolution , Genetic Models , Genetic Speciation , Phylogeny , Probability
5.
Algorithms Mol Biol ; 17(1): 15, 2022 Aug 20.
Article in English | MEDLINE | ID: mdl-35987645

ABSTRACT

BACKGROUND: Phylogenetic reconstruction is one of the paramount challenges of contemporary bioinformatics. A subtask of existing tree reconstruction algorithms is modeled by the SMALL PARSIMONY problem: given a tree T and an assignment of character-states to its leaves, assign states to the internal nodes of T so as to minimize the parsimony score, that is, the number of edges of T connecting nodes with different states. While this problem is polynomial-time solvable on trees, the matter is more complicated if T contains reticulate events such as hybridizations or recombinations, i.e. when T is a network. Indeed, three different versions of the parsimony score on networks have been proposed and each of them is NP-hard to decide. Existing parameterized algorithms focus on combining the number c of possible character-states with the number of reticulate events (per biconnected component). RESULTS: We consider the parameter treewidth t of the underlying undirected graph of the input network, presenting dynamic programming algorithms for (slight generalizations of) all three versions of the parsimony problem on size-n networks running in times [Formula: see text], [Formula: see text], and [Formula: see text], respectively. Our algorithms use a formulation of the treewidth that may facilitate formalizing treewidth-based dynamic programming algorithms on phylogenetic networks for other problems. CONCLUSIONS: Our algorithms allow the computation of the three popular parsimony scores, modeling the evolutionary development of a (multistate) character on a given phylogenetic network of low treewidth. Our results subsume and improve previously known algorithms for all three variants. While our results rely on being given a "good" tree-decomposition of the input, encouraging theoretical results as well as practical implementations producing them are publicly available. We present a reformulation of tree decompositions in terms of "agreeing trees" on the same set of nodes. As this formulation may come more naturally to researchers and engineers developing algorithms for phylogenetic networks, we hope to render exploiting the input network's treewidth as parameter more accessible to this audience.
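
For the tree case that this abstract calls polynomial-time solvable, a minimal sketch of the classic bottom-up dynamic program (Sankoff-style, with unit cost per state change) is given below. It handles trees only and is not the treewidth-based network algorithm of the paper; the function name and the dictionary-based tree encoding are assumptions made for illustration.

```python
from math import inf


def small_parsimony(children, leaf_state, root, states):
    """Minimum number of tree edges whose two endpoints carry different states.

    children:   dict mapping an internal node to the list of its children
    leaf_state: dict mapping each leaf to its observed character-state
    root:       the root node
    states:     iterable of all possible character-states
    (This encoding is a hypothetical choice for the example.)
    """
    states = list(states)

    def cost(v):
        # cost(v)[s] = best score of the subtree rooted at v if v is given state s
        kids = children.get(v, [])
        if not kids:
            return {s: (0 if s == leaf_state[v] else inf) for s in states}
        table = {s: 0 for s in states}
        for c in kids:
            child = cost(c)
            best_any = min(child.values())
            for s in states:
                # either the child keeps state s, or it switches at a cost of 1
                table[s] += min(child[s], best_any + 1)
        return table

    return min(cost(root).values())


if __name__ == "__main__":
    tree = {"r": ["x", "y"], "x": ["A", "B"], "y": ["C", "D"]}
    leaves = {"A": 0, "B": 1, "C": 0, "D": 0}
    print(small_parsimony(tree, leaves, "r", [0, 1]))
```

The recursion visits each node once and does work proportional to the square of the number of states per edge, which is why the tree case is easy; the difficulty discussed in the abstract arises only once reticulate nodes are allowed.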

6.
Bioinformatics ; 38(15): 3725-3733, 2022 08 02.
Article in English | MEDLINE | ID: mdl-35713506

ABSTRACT

MOTIVATION: Phylogenetic networks can represent non-treelike evolutionary scenarios. Current, actively developed approaches for phylogenetic network inference jointly account for non-treelike evolution and incomplete lineage sorting (ILS). Unfortunately, this induces a very high computational complexity and current tools can only analyze small datasets. RESULTS: We present NetRAX, a tool for maximum likelihood (ML) inference of phylogenetic networks in the absence of ILS. Our tool leverages state-of-the-art methods for efficiently computing the phylogenetic likelihood function on trees, and extends them to phylogenetic networks via the notion of 'displayed trees'. NetRAX can infer ML phylogenetic networks from partitioned multiple sequence alignments and returns the inferred networks in Extended Newick format. On simulated data, our results show a very low relative difference in Bayesian Information Criterion (BIC) score and a near-zero unrooted softwired cluster distance to the true, simulated networks. With NetRAX, a network inference on a partitioned alignment with 8000 sites, 30 taxa and 3 reticulations completes within a few minutes on a standard laptop. AVAILABILITY AND IMPLEMENTATION: Our implementation is available under the GNU General Public License v3.0 at https://github.com/lutteropp/NetRAX. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms , Phylogeny , Bayes Theorem , Sequence Alignment , Likelihood Functions
7.
Syst Biol ; 71(3): 526-546, 2022 04 19.
Article in English | MEDLINE | ID: mdl-34324671

ABSTRACT

Introgression is an important biological process affecting at least 10% of the extant species in the animal kingdom. Introgression significantly impacts inference of phylogenetic species relationships where a strictly binary tree model cannot adequately explain reticulate net-like species relationships. Here, we use phylogenomic approaches to understand patterns of introgression along the evolutionary history of a unique, nonmodel insect system: dragonflies and damselflies (Odonata). We demonstrate that introgression is a pervasive evolutionary force across various taxonomic levels within Odonata. In particular, we show that the morphologically "intermediate" species of Anisozygoptera (one of the three primary suborders within Odonata besides Zygoptera and Anisoptera), which retain phenotypic characteristics of the other two suborders, experienced high levels of introgression likely coming from zygopteran genomes. Additionally, we find evidence for multiple cases of deep inter-superfamilial ancestral introgression. [Gene flow; Odonata; phylogenomics; reticulate evolution.].


Subjects
Odonata , Animals , Genome , Insects/anatomy & histology , Odonata/anatomy & histology , Odonata/genetics , Phylogeny
8.
J Math Biol ; 83(5): 52, 2021 10 21.
Article in English | MEDLINE | ID: mdl-34676444

ABSTRACT

Measures of phylogenetic balance, such as the Colless and Sackin indices, play an important role in phylogenetics. Unfortunately, these indices are specifically designed for phylogenetic trees, and do not extend naturally to phylogenetic networks (which are increasingly used to describe reticulate evolution). This led us to consider a lesser-known balance index, whose definition is based on a probabilistic interpretation that is equally applicable to trees and to networks. This index, known as the [Formula: see text] index, was first proposed by Shao and Sokal (Syst Zool 39(3): 266-276, 1990). Surprisingly, it does not seem to have been studied mathematically since. Likewise, it is used only sporadically in the biological literature, where it tends to be viewed as arcane. In this paper, we study mathematical properties of [Formula: see text] such as its expectation and variance under the most common models of random trees and its extremal values over various classes of phylogenetic networks. We also assess its relevance in biological applications, and find it to be comparable to that of the Colless and Sackin indices. Altogether, our results call for a reevaluation of the status of this somewhat forgotten measure of phylogenetic balance.
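
As background for the two tree-only indices named in this abstract, the sketch below computes them for a rooted binary tree, using the standard definitions: the Sackin index is the sum of the depths of all leaves, and the Colless index is the sum, over internal nodes, of the absolute difference between the leaf counts of the two child subtrees. The index studied in the paper itself is not reproduced here. The nested-tuple tree encoding and the function name are assumptions for illustration.

```python
def balance_indices(tree):
    """Return (sackin, colless) for a rooted binary tree.

    The tree is encoded as nested 2-tuples with any non-tuple value as a leaf,
    e.g. (("a", "b"), ("c", "d")). This encoding is a hypothetical choice.
    """
    sackin = 0
    colless = 0

    def visit(node, depth):
        nonlocal sackin, colless
        if not isinstance(node, tuple):
            sackin += depth          # a leaf contributes its depth
            return 1                 # one leaf in this subtree
        left, right = node
        n_left = visit(left, depth + 1)
        n_right = visit(right, depth + 1)
        colless += abs(n_left - n_right)
        return n_left + n_right

    visit(tree, 0)
    return sackin, colless


if __name__ == "__main__":
    print(balance_indices((("a", "b"), ("c", "d"))))    # fully balanced
    print(balance_indices(((("a", "b"), "c"), "d")))    # caterpillar
```

Both functions rely on each internal node having exactly two children, which is the restriction to trees that the abstract points out.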


Subjects
Algorithms , Biological Evolution , Phylogeny
9.
PLoS Comput Biol ; 17(9): e1008380, 2021 09.
Article in English | MEDLINE | ID: mdl-34478440

ABSTRACT

For various species, high-quality sequences and complete genomes are nowadays available for many individuals. This makes data analysis challenging, as methods need not only to be accurate, but also time efficient given the tremendous amount of data to process. In this article, we introduce an efficient method to infer the evolutionary history of individuals under the multispecies coalescent model in networks (MSNC). Phylogenetic networks are an extension of phylogenetic trees that can contain reticulate nodes, which make it possible to model complex biological events such as horizontal gene transfer, hybridization and introgression. We present a novel way to compute the likelihood of biallelic markers sampled along genomes whose evolution involved such events. This likelihood computation is at the heart of a Bayesian network inference method called SnappNet, as it extends the Snapp method inferring evolutionary trees under the multispecies coalescent model, to networks. SnappNet is available as a package of the well-known BEAST 2 software. Recently, the MCMC_BiMarkers method, implemented in PhyloNet, also extended Snapp to networks. Both methods take biallelic markers as input, rely on the same model of evolution and sample networks in a Bayesian framework, though using different methods for computing priors. However, SnappNet relies on algorithms that are exponentially more time-efficient on non-trivial networks. Using simulations, we compare the performance of SnappNet and MCMC_BiMarkers. We show that both methods have similar abilities to recover simple networks, but SnappNet is more accurate than MCMC_BiMarkers on more complex network scenarios. Also, on complex networks, SnappNet is found to be much faster than MCMC_BiMarkers in terms of time required for the likelihood computation. We finally illustrate SnappNet's performance on a rice data set. SnappNet infers a scenario that is consistent with previous results and provides additional understanding of rice evolution.


Subjects
Markov Chains , Monte Carlo Method , Phylogeny , Algorithms , Bayes Theorem , Computational Biology/methods , Molecular Evolution , Plant Genes , Likelihood Functions , Oryza/classification , Oryza/genetics
11.
Microb Genom ; 7(6), 2021 06.
Article in English | MEDLINE | ID: mdl-34165421

ABSTRACT

Prokaryote genome evolution is characterized by the frequent gain of genes through horizontal gene transfer (HGT). For a gene, being horizontally transferred can represent a strong change in its genomic and physiological context. If the codon usage of a transferred gene deviates from that of the receiving organism, the fitness benefits it provides can be reduced due to a mismatch with the expression machinery. Consequently, transferred genes with a deviating codon usage can be selected against or elicit evolutionary responses that enhance their integration, such as gene amelioration and compensatory evolution. Within bacterial species, the extent and relative importance of these different mechanisms has never been considered altogether. In this study, a phylogeny-based method was used to investigate the occurrence of these different evolutionary responses in Pseudomonas aeruginosa. Selection on codon usage of genes acquired through HGT was observed over evolutionary time, with the overall codon usage converging towards that of the core genome. Gene amelioration, through the accumulation of synonymous mutations after HGT, did not seem to systematically affect transferred genes. This pattern therefore seemed to be mainly driven by selective retention of transferred genes with an initial codon usage similar to that of the core genes. Additionally, variation in the copy number of tRNA genes was often associated with the acquisition of genes for which the observed variation could enhance their expression. This provides evidence that compensatory evolution might be an important mechanism for the integration of horizontally transferred genes.


Subjects
Codon Usage , Molecular Evolution , Horizontal Gene Transfer , Pseudomonas aeruginosa/genetics , Codon , Bacterial Genes/genetics , Bacterial Genome , Phylogeny , Transfer RNA/genetics
12.
Elife ; 10, 2021 02 18.
Article in English | MEDLINE | ID: mdl-33599612

ABSTRACT

In a context of ongoing biodiversity erosion, obtaining genomic resources from wildlife is essential for conservation. The thousands of yearly mammalian roadkill provide a useful source material for genomic surveys. To illustrate the potential of this underexploited resource, we used roadkill samples to study the genomic diversity of the bat-eared fox (Otocyon megalotis) and the aardwolf (Proteles cristatus), both having subspecies with similar disjunct distributions in Eastern and Southern Africa. First, we obtained reference genomes with high contiguity and gene completeness by combining Nanopore long reads and Illumina short reads. Then, we showed that the two subspecies of aardwolf might warrant species status (P. cristatus and P. septentrionalis) by comparing their genome-wide genetic differentiation to pairs of well-defined species across Carnivora with a new Genetic Differentiation index (GDI) based on only a few resequenced individuals. Finally, we obtained a genome-scale Carnivora phylogeny including the new aardwolf species.


Subjects
Foxes/classification , Foxes/genetics , Genetic Variation , Genome , Hyaenidae/classification , Hyaenidae/genetics , Animals , High-Throughput Nucleotide Sequencing/veterinary , Nanopore Sequencing/veterinary
13.
Curr Biol ; 31(6): 1303-1310.e4, 2021 03 22.
Article in English | MEDLINE | ID: mdl-33476557

ABSTRACT

Due to their limited ranges and inherent isolation, island species have long been recognized as crucial systems for tackling a range of evolutionary questions, including in the early study of speciation [1,2]. Such species have been less studied in the understanding of the evolutionary forces driving DNA sequence evolution. Island species usually have lower census population sizes (N) than continental species and, supposedly, lower effective population sizes (Ne). Given that both the rates of change caused by genetic drift and by selection are dependent upon Ne, island species are theoretically expected to exhibit (1) lower genetic diversity, (2) less effective natural selection against slightly deleterious mutations [3,4], and (3) a lower rate of adaptive evolution [5-8]. Here, we have used a large set of newly sequenced and published whole-genome sequences of Passerida species (14 insular and 11 continental) to test these predictions. We confirm that island species exhibit lower census size and Ne, supporting the hypothesis that the smaller area available on islands constrains the upper bound of Ne. In the insular species, we find lower nucleotide diversity in coding regions, higher ratios of non-synonymous to synonymous polymorphisms, and lower adaptive substitution rates. Our results provide robust evidence that the lower Ne experienced by island species has affected both the ability of natural selection to efficiently remove weakly deleterious mutations and also the adaptive potential of island species, therefore providing considerable empirical support for the nearly neutral theory. We discuss the implications for both evolutionary and conservation biology.


Subjects
Molecular Evolution , Population Genetics , Songbirds , Animals , Genetic Drift , Genetic Variation , Population Density , Genetic Selection , Songbirds/genetics
14.
Theor Popul Biol ; 137: 22-31, 2021 02.
Article in English | MEDLINE | ID: mdl-33333117

ABSTRACT

The multispecies coalescent process models the genealogical relationships of genes sampled from several species, enabling useful predictions about phenomena such as the discordance between a gene tree and the species phylogeny due to incomplete lineage sorting. Conversely, knowledge of large collections of gene trees can inform us about several aspects of the species phylogeny, such as its topology and ancestral population sizes. A fundamental open problem in this context is how to efficiently compute the probability of a gene tree topology, given the species phylogeny. Although a number of algorithms for this task have been proposed, they either produce approximate results, or, when they are exact, they do not scale to large data sets. In this paper, we present some progress towards exact and efficient computation of the probability of a gene tree topology. We provide a new algorithm that, given a species tree and the number of genes sampled for each species, calculates the probability that the gene tree topology will be concordant with the species tree. Moreover, we provide an algorithm that computes the probability of any specific gene tree topology concordant with the species tree. Both algorithms run in polynomial time and have been implemented in Python. Experiments show that they are able to analyze data sets where thousands of genes are sampled in a matter of minutes to hours.
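
The smallest instance of the concordance question raised in this abstract has a well-known textbook answer, sketched below. For a species tree ((A,B),C) with one gene sampled per species, the gene tree topology matches the species tree with probability 1 - (2/3)e^(-t), where t is the length of the internal branch in coalescent units. This is only the three-species special case, not the polynomial-time algorithm the paper presents; the function name is hypothetical.

```python
from math import exp


def concordance_probability(t: float) -> float:
    """Probability that a gene tree matches the species tree ((A,B),C) when one
    gene copy is sampled per species.

    t is the internal branch length in coalescent units. With probability
    1 - exp(-t) the A and B lineages coalesce on that branch (concordant);
    otherwise all three lineages reach the root population, where each of the
    three topologies is equally likely.
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    return 1.0 - (2.0 / 3.0) * exp(-t)


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 2.0, 5.0):
        print(t, concordance_probability(t))
```

At t = 0 the three topologies are equally likely, and as t grows incomplete lineage sorting becomes rare and the probability approaches 1.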


Subjects
Algorithms , Genetic Models , Genetic Speciation , Phylogeny , Probability
15.
Syst Biol ; 70(4): 822-837, 2021 06 16.
Article in English | MEDLINE | ID: mdl-33169795

ABSTRACT

Incomplete lineage sorting (ILS), the interaction between coalescence and speciation, can generate incongruence between gene trees and species trees, as can gene duplication (D), transfer (T), and loss (L). These processes are usually modeled independently, but in reality, ILS can affect gene copy number polymorphism, that is, interfere with DTL. This has been previously recognized, but not treated in a satisfactory way, mainly because DTL events are naturally modeled forward-in-time, while ILS is naturally modeled backward-in-time with the coalescent. Here, we consider the joint action of ILS and DTL on the gene tree/species tree problem in all its complexity. In particular, we show that the interaction between ILS and duplications/transfers (without losses) can result in patterns usually interpreted as resulting from gene loss, and that the realized rate of D, T, and L becomes nonhomogeneous in time when ILS is taken into account. We introduce algorithmic solutions to these problems. Our new model, the multilocus multispecies coalescent, which also accounts for any level of linkage between loci, generalizes the multispecies coalescent (MSC) model and offers a versatile, powerful framework for proper simulation, and inference of gene family evolution. [Gene duplication; gene loss; horizontal gene transfer; incomplete lineage sorting; multispecies coalescent; hemiplasy; recombination.].


Subjects
Molecular Evolution , Gene Duplication , Genetic Models , Multigene Family , Computer Simulation , Horizontal Gene Transfer , Genetic Speciation , Phylogeny
16.
Bioinformatics ; 36(18): 4822-4824, 2020 09 15.
Article in English | MEDLINE | ID: mdl-33085745

ABSTRACT

MOTIVATION: Gene and species tree reconciliation methods are used to interpret gene trees, root them and correct uncertainties that are due to scarcity of signal in multiple sequence alignments. So far, reconciliation tools have not been integrated in standard phylogenetic software and they either lack performance on certain functions, or usability for biologists. RESULTS: We present Treerecs, a phylogenetic software tool based on duplication-loss reconciliation. Treerecs is simple to install and to use. It is fast and versatile, has a graphic output, and can be used along with methods for phylogenetic inference on multiple alignments like PLL and Seaview. AVAILABILITY AND IMPLEMENTATION: Treerecs is open-source. Its source code (C++, AGPLv3) and manuals are available from https://project.inria.fr/treerecs/.


Subjects
Algorithms , Molecular Evolution , Phylogeny , Sequence Alignment , Software
17.
Mol Biol Evol ; 37(11): 3292-3307, 2020 11 01.
Article in English | MEDLINE | ID: mdl-32886770

ABSTRACT

Phylogenetic inference from genome-wide data (phylogenomics) has revolutionized the study of evolution because it enables accounting for discordance among evolutionary histories across the genome. To this end, summary methods have been developed to allow accurate and scalable inference of species trees from gene trees. However, most of these methods, including the widely used ASTRAL, can only handle single-copy gene trees and do not attempt to model gene duplication and gene loss. As a result, most phylogenomic studies have focused on single-copy genes and have discarded large parts of the data. Here, we first propose a measure of quartet similarity between single-copy and multicopy trees that accounts for orthology and paralogy. We then introduce a method called ASTRAL-Pro (ASTRAL for PaRalogs and Orthologs) to find the species tree that optimizes our quartet similarity measure using dynamic programming. By studying its performance on an extensive collection of simulated data sets and on real data sets, we show that ASTRAL-Pro is more accurate than alternative methods.


Subjects
Genetic Techniques , Phylogeny , Algorithms , Plants/genetics , Yeasts/genetics
18.
Article in English | MEDLINE | ID: mdl-30703035

ABSTRACT

Phylogenetic networks provide a mathematical model to represent the evolution of a set of species where, apart from speciation, reticulate evolutionary events have to be taken into account. Among these events, lateral gene transfers need special consideration due to the asymmetry in the roles of the species involved in such an event. To take into account this asymmetry, LGT networks were introduced. In contrast to phylogenetic trees, the combinatorial structure of phylogenetic networks is much less well understood and is difficult to describe. One of the approaches in the literature is to classify them according to their level and find generators of the given level that can be used to recursively generate all networks. In this paper, we adapt the concept of generators to the case of LGT networks. We show how these generators, classified by their level, give rise to simple LGT networks of the specified level, and how any LGT network can be obtained from these simple networks, which act as building blocks of the generic structure. The stochastic models of evolution of phylogenetic networks are also much less studied than those for phylogenetic trees. In this setting, we introduce a novel two-parameter model that generates LGT networks. Finally, we present some computer simulations using this model in order to investigate the complexity of the generated networks, depending on the parameters of the model.


Subjects
Computational Biology/methods , Molecular Evolution , Horizontal Gene Transfer/genetics , Genetic Models , Phylogeny , Computer Simulation
19.
Syst Biol ; 69(1): 38-60, 2020 01 01.
Article in English | MEDLINE | ID: mdl-31062850

ABSTRACT

Evolutionary relationships have remained unresolved in many well-studied groups, even though advances in next-generation sequencing and analysis, using approaches such as transcriptomics, anchored hybrid enrichment, or ultraconserved elements, have brought systematics to the brink of whole genome phylogenomics. Recently, it has become possible to sequence the entire genomes of numerous nonbiological models in parallel at reasonable cost, particularly with shotgun sequencing. Here, we identify orthologous coding sequences from whole-genome shotgun sequences, which we then use to investigate the relevance and power of phylogenomic relationship inference and time-calibrated tree estimation. We study an iconic group of butterflies, swallowtails of the family Papilionidae, that has remained phylogenetically unresolved, with continued debate about the timing of their diversification. Low-coverage whole genomes were obtained using Illumina shotgun sequencing for all genera. Genome assembly coupled to BLAST-based orthology searches allowed extraction of 6621 orthologous protein-coding genes for 45 Papilionidae species and 16 outgroup species (with 32% missing data after cleaning phases). Supermatrix phylogenomic analyses were performed with both maximum-likelihood (IQ-TREE) and Bayesian mixture models (PhyloBayes) for amino acid sequences, which produced a fully resolved phylogeny providing new insights into controversial relationships. Species tree reconstruction from gene trees was performed with ASTRAL and SuperTriplets and recovered the same phylogeny. We estimated gene site concordant factors to complement traditional node-support measures, which strengthens the robustness of inferred phylogenies. Bayesian estimates of divergence times based on a reduced data set (760 orthologs and 12% missing data) indicate a mid-Cretaceous origin of Papilionoidea around 99.2 Ma (95% credibility interval: 68.6-142.7 Ma) and Papilionidae around 71.4 Ma (49.8-103.6 Ma), with subsequent diversification of modern lineages well after the Cretaceous-Paleogene event. These results show that shotgun sequencing of whole genomes, even when highly fragmented, represents a powerful approach to phylogenomics and molecular dating in a group that has previously been refractory to resolution.


Subjects
Biological Evolution , Butterflies/classification , Butterflies/genetics , Insect Genome/genetics , Phylogeny , Animals , Time
20.
PLoS Comput Biol ; 15(10): e1007440, 2019 Oct.
Article in English | MEDLINE | ID: mdl-31596844

ABSTRACT

[This corrects the article DOI: 10.1371/journal.pcbi.1007347.].
