1.
Proc Natl Acad Sci U S A ; 111(45): E4859-68, 2014 Nov 11.
Article in English | MEDLINE | ID: mdl-25355905

ABSTRACT

Reconstructing the origin and evolution of land plants and their algal relatives is a fundamental problem in plant phylogenetics, and is essential for understanding how critical adaptations arose, including the embryo, vascular tissue, seeds, and flowers. Despite advances in molecular systematics, some hypotheses of relationships remain weakly resolved. Inferring deep phylogenies with bouts of rapid diversification can be problematic; however, genome-scale data should significantly increase the number of informative characters for analyses. Recent phylogenomic reconstructions focused on the major divergences of plants have resulted in promising but inconsistent results. One limitation is sparse taxon sampling, likely resulting from the difficulty and cost of data generation. To address this limitation, transcriptome data for 92 streptophyte taxa were generated and analyzed along with 11 published plant genome sequences. Phylogenetic reconstructions were conducted using up to 852 nuclear genes and 1,701,170 aligned sites. Sixty-nine analyses were performed to test the robustness of phylogenetic inferences to permutations of the data matrix or to phylogenetic method, including supermatrix, supertree, and coalescent-based approaches, maximum-likelihood and Bayesian methods, partitioned and unpartitioned analyses, and amino acid versus DNA alignments. Among other results, we find robust support for a sister-group relationship between land plants and one group of streptophyte green algae, the Zygnematophyceae. Strong and robust support for a clade comprising liverworts and mosses is inconsistent with a widely accepted view of early land plant evolution, and suggests that phylogenetic hypotheses used to understand the evolution of fundamental plant traits should be reevaluated.


Subject(s)
Molecular Evolution, Plant Genome/physiology, Phylogeny, Heritable Quantitative Trait, Streptophyta/physiology, Transcriptome/physiology, Plant DNA/genetics, Plant DNA/metabolism, Gene Expression Profiling, Sequence Alignment, Streptophyta/classification
2.
Mol Biol Evol ; 30(1): 197-214, 2013 Jan.
Article in English | MEDLINE | ID: mdl-22930702

ABSTRACT

Progress in sequencing technology allows researchers to assemble ever-larger supermatrices for phylogenomic inference. However, current phylogenomic studies often rest on patchy data sets, with some having 80% missing (or ambiguous) data or more. Though early simulations had suggested that missing data per se do not harm phylogenetic inference when using sufficiently large data sets, Lemmon et al. (Lemmon AR, Brown JM, Stanger-Hall K, Lemmon EM. 2009. The effect of ambiguous data on phylogenetic estimates obtained by maximum likelihood and Bayesian inference. Syst Biol. 58:130-145.) have recently cast doubt on this consensus in a study based on the introduction of parsimony-uninformative incomplete characters. In this work, we empirically reassess the issue of missing data in phylogenomics while exploring possible interactions with the model of sequence evolution. First, we note that parsimony-uninformative incomplete characters are actually informative in a probabilistic framework. A reanalysis of Lemmon's data set with this in mind gives a very different interpretation of their results and shows that some of their conclusions may be unfounded. Second, we investigate the effect of the progressive introduction of missing data in a complete supermatrix (126 genes × 39 species) capable of resolving animal relationships. These analyses demonstrate that missing data perturb phylogenetic inference slightly beyond the expected decrease in resolving power. In particular, they exacerbate systematic errors by reducing the number of species effectively available for the detection of multiple substitutions. Consequently, large sparse supermatrices are more sensitive to phylogenetic artifacts than smaller but less incomplete data sets, which argues for experimental designs aimed at collecting a modest number (~50) of highly covered genes. Our results further confirm that including incomplete yet short-branch taxa (i.e., slowly evolving species or close outgroups) can help to avoid artifacts, as predicted by simulations. Finally, it appears that selecting an adequate model of sequence evolution (e.g., the site-heterogeneous CAT model instead of the site-homogeneous WAG model) is more beneficial to phylogenetic accuracy than reducing the level of missing data.


Subject(s)
Genetic Databases, Genomics/methods, Genetic Models, Phylogeny, Amphibians/classification, Amphibians/genetics, Animals, Bayes Theorem, Computer Simulation, Molecular Evolution, Sequence Alignment, Sequence Analysis
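
The abstract of record 2 refers to the percentage of missing (or ambiguous) data in a supermatrix. As a rough illustration only, the sketch below shows one way such a percentage could be computed. The symbols treated as missing (`-`, `?`, `X`), the function names, and the toy alignment are all assumptions made for this example; none of them come from the study.

```python
# Hypothetical sketch: fraction of missing data in a toy supermatrix.
# The set of "missing" symbols below is an assumption, not the study's definition.
MISSING = set("-?Xx")


def missing_fraction(sequence):
    """Return the fraction of positions in one aligned sequence that are missing."""
    if not sequence:
        return 0.0
    return sum(ch in MISSING for ch in sequence) / len(sequence)


def supermatrix_missing(matrix):
    """Return (overall_fraction, per_taxon_fractions) for a {taxon: sequence} dict."""
    lengths = {len(seq) for seq in matrix.values()}
    if len(lengths) > 1:
        raise ValueError("all sequences in a supermatrix must have the same length")
    per_taxon = {taxon: missing_fraction(seq) for taxon, seq in matrix.items()}
    total = sum(len(seq) for seq in matrix.values())
    missing = sum(ch in MISSING for seq in matrix.values() for ch in seq)
    overall = missing / total if total else 0.0
    return overall, per_taxon


# Toy alignment invented for illustration (3 taxa x 8 sites).
toy = {"taxon_A": "MKVLAAGT", "taxon_B": "MK--AAGT", "taxon_C": "????AAGT"}
overall, per_taxon = supermatrix_missing(toy)
print(f"overall missing: {overall:.0%}")
for taxon, fraction in per_taxon.items():
    print(f"{taxon}: {fraction:.0%}")
```

Reporting the per-taxon values alongside the overall figure is a design choice for this sketch: it makes it easy to see whether missing data are spread evenly or concentrated in a few taxa.
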
3.
BMC Evol Biol ; 13: 5, 2013 Jan 09.
Article in English | MEDLINE | ID: mdl-23302374

ABSTRACT

BACKGROUND: Cnidaria (corals, sea anemones, hydroids, jellyfish) is a phylum of relatively simple aquatic animals characterized by the presence of the cnidocyst: a cell containing a giant capsular organelle with an eversible tubule (cnida). Species within Cnidaria have life cycles that involve one or both of two distinct body forms: a typically benthic polyp, which may or may not be colonial, and a typically pelagic, mostly solitary medusa. The currently accepted taxonomic scheme subdivides Cnidaria into two main assemblages: Anthozoa (Hexacorallia + Octocorallia) - cnidarians with a reproductive polyp and the absence of a medusa stage - and Medusozoa (Cubozoa, Hydrozoa, Scyphozoa, Staurozoa) - cnidarians that usually possess a reproductive medusa stage. Hypothesized relationships among these taxa greatly impact interpretations of cnidarian character evolution. RESULTS: We expanded the sampling of cnidarian mitochondrial genomes, particularly from Medusozoa, to reevaluate phylogenetic relationships within Cnidaria. Our phylogenetic analyses based on a mitogenomic dataset support many prior hypotheses, including monophyly of Hexacorallia, Octocorallia, Medusozoa, Cubozoa, Staurozoa, Hydrozoa, Carybdeida, Chirodropida, and Hydroidolina, but reject the monophyly of Anthozoa, indicating that the Octocorallia + Medusozoa relationship is not the result of sampling bias, as proposed earlier. Further, our analyses contradict Scyphozoa [Discomedusae + Coronatae], Acraspeda [Cubozoa + Scyphozoa], as well as the hypothesis that Staurozoa is the sister group to all the other medusozoans. CONCLUSIONS: Cnidarian mitochondrial genomic data contain phylogenetic signal informative for understanding the evolutionary history of this phylum. Mitogenome-based phylogenies, which reject the monophyly of Anthozoa, provide further evidence for the polyp-first hypothesis. By rejecting the traditional Acraspeda and Scyphozoa hypotheses, these analyses suggest that the shared morphological characters in these groups are plesiomorphies that originated in the branch leading to Medusozoa. The expansion of mitogenomic data along with improvements in phylogenetic inference methods and use of additional nuclear markers will further enhance our understanding of the phylogenetic relationships and character evolution within Cnidaria.


Subject(s)
Cnidaria/classification, Mitochondrial Genome, Phylogeny, Animals, Bayes Theorem, Cnidaria/genetics, Mitochondrial DNA/genetics, Molecular Evolution, Genomics, Genetic Models, Sequence Alignment, DNA Sequence Analysis
4.
BMC Biol ; 9: 91, 2011 Dec 29.
Article in English | MEDLINE | ID: mdl-22206462

ABSTRACT

Contradicting the prejudice that endosymbiosis is a rare phenomenon, Husník and co-workers show in BMC Biology that bacterial endosymbiosis has occurred several times independently during insect evolution. Rigorous phylogenetic analyses, in particular using complex models of sequence evolution and an original site removal procedure, allow this conclusion to be established after eschewing inference artefacts that usually plague the positioning of highly divergent endosymbiont genomic sequences.


Subject(s)
Enterobacteriaceae/genetics, Phylogeny, Symbiosis
5.
BMC Evol Biol ; 11: 17, 2011 Jan 14.
Article in English | MEDLINE | ID: mdl-21235782

ABSTRACT

BACKGROUND: Model violations constitute the major limitation in inferring accurate phylogenies. Characterizing properties of the data that are not being correctly handled by current models is therefore of prime importance. One of the properties of protein evolution is the variation of the relative rate of substitutions across sites and over time; the latter phenomenon is called heterotachy. Its effect on phylogenetic inference has recently received considerable attention, which led to the development of new models of sequence evolution. However, thus far the focus has been on the quantitative heterogeneity of the evolutionary process, thereby overlooking more qualitative variations. RESULTS: We studied the importance of variation of the site-specific amino-acid substitution process over time and its possible impact on phylogenetic inference. We used the CAT model to define an infinite mixture of substitution processes characterized by equilibrium frequencies over the twenty amino acids, a useful proxy for qualitatively estimating the evolutionary process. Using two large datasets, we show that qualitative changes in site-specific substitution properties over time occurred significantly. To test whether this unaccounted qualitative variation can lead to an erroneous phylogenetic tree, we analyzed a concatenation of mitochondrial proteins in which Cnidaria and Porifera were erroneously grouped. The progressive removal of the sites with the most heterogeneous CAT profiles across clades led to the recovery of the monophyly of Eumetazoa (Cnidaria+Bilateria), suggesting that this heterogeneity can negatively influence phylogenetic inference. CONCLUSION: The time-heterogeneity of the amino-acid replacement process is therefore an important evolutionary aspect that should be incorporated in future models of sequence change.


Subject(s)
Amino Acid Substitution, Eukaryota/classification, Eukaryota/genetics, Phylogeny, Proteins/genetics, Animals, Base Sequence, Nucleic Acid Databases, Molecular Evolution, Genetic Models, Molecular Sequence Data, Missense Mutation, Time Factors
6.
Curr Biol ; 15(14): 1325-30, 2005 Jul 26.
Article in English | MEDLINE | ID: mdl-16051178

ABSTRACT

Between 1 and 1.5 billion years ago, eukaryotic organisms acquired the ability to convert light into chemical energy through endosymbiosis with a cyanobacterium. This event gave rise to "primary" plastids, which are present in green plants, red algae, and glaucophytes ("Plantae" sensu Cavalier-Smith). The widely accepted view that primary plastids arose only once implies two predictions: (1) all plastids form a monophyletic group, as do (2) primary photosynthetic eukaryotes. Nonetheless, unequivocal support for both predictions is lacking. In this report, we present two phylogenomic analyses, with 50 genes from 16 plastid and 15 cyanobacterial genomes and with 143 nuclear genes from 34 eukaryotic species, respectively. The nuclear dataset includes new sequences from glaucophytes, the less-studied group of primary photosynthetic eukaryotes. We find significant support for both predictions. Taken together, our analyses provide the first strong support for a single endosymbiotic event that gave rise to primary photosynthetic eukaryotes, the Plantae. Because our dataset does not cover the entire eukaryotic diversity (but only four of six major groups), further testing of the monophyly of Plantae should include representatives from eukaryotic lineages for which currently insufficient sequence information is available.


Subject(s)
Chlorophyta/genetics, Cyanophora/genetics, Molecular Evolution, Phylogeny, Plants/genetics, Plastids/genetics, Rhodophyta/genetics, Bayes Theorem, Cluster Analysis, Computational Biology, Cyanobacteria/genetics, Complementary DNA/genetics, Likelihood Functions, Genetic Models, DNA Sequence Analysis
7.
BMC Evol Biol ; 7 Suppl 1: S2, 2007 Feb 08.
Article in English | MEDLINE | ID: mdl-17288575

ABSTRACT

BACKGROUND: Phylogenetic analyses based on datasets rich in both genes and species (phylogenomics) are becoming a standard approach to resolve evolutionary questions. However, several difficulties are associated with the assembly of large datasets, such as multiple copies of a gene per species (paralogous or xenologous genes), lack of some genes for a given species, or partial sequences. The use of undetected paralogous or xenologous genes in phylogenetic inference can lead to inaccurate results, and the use of partial sequences to a lack of resolution. A tool that selects sequences, species, and genes, while dealing with these issues, is needed in a phylogenomics context. RESULTS: Here, we present SCaFoS, a tool that quickly assembles phylogenomic datasets containing maximal phylogenetic information while adjusting the amount of missing data in the selection of species, sequences and genes. Starting from individual sequence alignments, and using monophyletic groups defined by the user, SCaFoS creates chimeras with partial sequences, or selects, among multiple sequences, the orthologous and/or slowest evolving sequences. Once sequences representing each predefined monophyletic group have been selected, SCaFoS retains genes according to the user's allowed level of missing data and generates files for super-matrix and super-tree analyses in several formats compatible with standard phylogenetic inference software. Because no clear-cut criteria exist for the sequence selection, a semi-automatic mode is available to accommodate the user's expertise. CONCLUSION: SCaFoS is able to deal with datasets of hundreds of species and genes, at either the amino acid or the nucleotide level. It has a graphical interface and can be integrated in an automatic workflow. Moreover, SCaFoS is the first tool that integrates the user's knowledge to select orthologous sequences, creates chimerical sequences to reduce missing data and selects genes according to their level of missing data. Finally, applying SCaFoS to different datasets, we show that the judicious selection of genes, species and sequences reduces tree reconstruction artefacts, especially if the dataset includes fast evolving species.


Subject(s)
Molecular Evolution, Phylogeny, Genetic Selection, DNA Sequence Analysis/methods, Software, Algorithms, Amino Acid Sequence, Animals, Concatenated DNA/analysis, Genetic Models, Molecular Sequence Data
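
The abstract of record 7 mentions building chimeras from partial sequences of the same monophyletic group to reduce missing data. The abstract does not say how SCaFoS resolves positions covered by more than one fragment, so the sketch below uses a simple assumed rule: at each aligned position, take the first non-gap residue. The function name, the gap symbol, and the example fragments are hypothetical. This illustrates the general idea and is not SCaFoS's actual algorithm.

```python
# Hypothetical sketch of chimera construction from aligned partial sequences.
# Assumed rule: the first non-gap residue in each column wins.
GAP = "-"


def make_chimera(partial_sequences):
    """Merge aligned partial sequences into one chimera, position by position."""
    if not partial_sequences:
        raise ValueError("need at least one sequence")
    length = len(partial_sequences[0])
    if any(len(seq) != length for seq in partial_sequences):
        raise ValueError("sequences must be aligned to the same length")
    chimera = []
    for column in zip(*partial_sequences):
        # Take the first non-gap residue in this column, or a gap if none exists.
        residue = next((ch for ch in column if ch != GAP), GAP)
        chimera.append(residue)
    return "".join(chimera)


# Invented fragments from three species assumed to belong to one group.
fragments = ["MKV-----", "---LAA--", "--VLA-G-"]
chimera = make_chimera(fragments)
print(chimera)
```

In practice the order of the input fragments would matter under this rule, which is one reason a real tool might prefer a criterion such as evolutionary rate when fragments overlap.
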
8.
Curr Biol ; 27(7): 958-967, 2017 Apr 03.
Article in English | MEDLINE | ID: mdl-28318975

ABSTRACT

Resolving the early diversification of animal lineages has proven difficult, even using genome-scale datasets. Several phylogenomic studies have supported the classical scenario in which sponges (Porifera) are the sister group to all other animals ("Porifera-sister" hypothesis), consistent with a single origin of the gut, nerve cells, and muscle cells in the stem lineage of eumetazoans (bilaterians + ctenophores + cnidarians). In contrast, several other studies have recovered an alternative topology in which ctenophores are the sister group to all other animals (including sponges). The "Ctenophora-sister" hypothesis implies that eumetazoan-specific traits, such as neurons and muscle cells, either evolved once along the metazoan stem lineage and were then lost in sponges and placozoans or evolved at least twice independently in Ctenophora and in Cnidaria + Bilateria. Here, we report on our reconstruction of deep metazoan relationships using a 1,719-gene dataset with dense taxonomic sampling of non-bilaterian animals that was assembled using a semi-automated procedure, designed to reduce known error sources. Our dataset outperforms previous metazoan gene superalignments in terms of data quality and quantity. Analyses with a best-fitting site-heterogeneous evolutionary model provide strong statistical support for placing sponges as the sister-group to all other metazoans, with ctenophores emerging as the second-earliest branching animal lineage. Only those methodological settings that exacerbated long-branch attraction artifacts yielded Ctenophora-sister. These results show that methodological issues must be carefully addressed to tackle difficult phylogenetic questions and pave the road to a better understanding of how fundamental features of animal body plans have emerged.


Subject(s)
Biological Evolution, Genome, Invertebrates/classification, Phylogeny, Porifera/genetics, Vertebrates/classification, Animals, Genomics/methods, Invertebrates/genetics, Porifera/classification, Vertebrates/genetics
9.
Gigascience ; 3: 17, 2014.
Article in English | MEDLINE | ID: mdl-25625010

ABSTRACT

The 1,000 plants (1KP) project is an international multi-disciplinary consortium that has generated transcriptome data from over 1,000 plant species, with exemplars for all of the major lineages across the Viridiplantae (green plants) clade. Here, we describe how to access the data used in a phylogenomics analysis of the first 85 species, and how to visualize our gene and species trees. Users can develop computational pipelines to analyse these data, in conjunction with data of their own that they can upload. Computationally estimated protein-protein interactions and biochemical pathways can be visualized at another site. Finally, we comment on our future plans and how they fit within this scalable system for the dissemination, visualization, and analysis of large multi-species data sets.

10.
Syst Biol ; 56(3): 389-99, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17520503

ABSTRACT

Genome-scale data sets result in an enhanced resolution of the phylogenetic inference by reducing stochastic errors. However, there is also an increase of systematic errors due to model violations, which can lead to erroneous phylogenies. Here, we explore the impact of systematic errors on the resolution of the eukaryotic phylogeny using a data set of 143 nuclear-encoded proteins from 37 species. The initial observation was that, despite the impressive amount of data, some branches had no significant statistical support. To demonstrate that this lack of resolution is due to a mutual annihilation of phylogenetic and nonphylogenetic signals, we created a series of data sets with slightly different taxon sampling. As expected, these data sets yielded strongly supported but mutually exclusive trees, thus confirming the presence of conflicting phylogenetic and nonphylogenetic signals in the original data set. To decide on the correct tree, we applied several methods expected to reduce the impact of some kinds of systematic error. Briefly, we show that (i) removing fast-evolving positions, (ii) recoding amino acids into functional categories, and (iii) using a site-heterogeneous mixture model (CAT) are three effective means of increasing the ratio of phylogenetic to nonphylogenetic signal. Finally, our results allow us to formulate guidelines for detecting and overcoming phylogenetic artefacts in genome-scale phylogenetic analyses.


Subject(s)
Classification/methods, Molecular Evolution, Genome, Phylogeny, Amino Acid Sequence, Animals, Computational Biology, Principal Component Analysis, Selection Bias
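
The abstract of record 10 lists recoding amino acids into functional categories as one way to reduce systematic error. The abstract does not specify the categories used, so the sketch below assumes the widely used six-category Dayhoff grouping (AGPST, DENQ, HKR, ILMV, FWY, C). The function name and the handling of unknown symbols are likewise assumptions for illustration.

```python
# Hypothetical sketch of amino-acid recoding into functional categories.
# The six Dayhoff groups are an assumed scheme, not necessarily the study's.
DAYHOFF_GROUPS = {
    "AGPST": "0",  # small
    "DENQ": "1",   # acidic and amide
    "HKR": "2",    # basic
    "ILMV": "3",   # hydrophobic aliphatic
    "FWY": "4",    # aromatic
    "C": "5",      # cysteine
}
RECODE = {aa: code for group, code in DAYHOFF_GROUPS.items() for aa in group}


def recode(sequence, unknown="-"):
    """Replace each amino acid by its category code; other symbols become `unknown`."""
    return "".join(RECODE.get(ch.upper(), unknown) for ch in sequence)


recoded = recode("MKVLAC-W")
print(recoded)
```

Recoding collapses substitutions within a category, so only changes between categories remain visible to the inference method. That reduces saturation at the cost of discarding some signal.
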