1.
Mol Biol Evol ; 38(5): 1761-1776, 2021 05 04.
Article in English | MEDLINE | ID: mdl-33450027

ABSTRACT

Previous reports have shown that environmental temperature impacts proteome evolution in Bacteria and Archaea. However, it is unknown whether thermoadaptation mainly occurs via the sequential accumulation of substitutions, massive horizontal gene transfers, or both. Measuring the real contribution of amino acid substitution to thermoadaptation is challenging, because of confounding environmental and genetic factors (e.g., pH, salinity, genomic G + C content) that also affect proteome evolution. Here, using Methanococcales, a major archaeal lineage, as a study model, we show that optimal growth temperature is the major factor affecting variations in amino acid frequencies of proteomes. By combining phylogenomic and ancestral sequence reconstruction approaches, we disclose a sequential substitutional scheme in which lysine plays a central role by fine tuning the pool of arginine, serine, threonine, glutamine, and asparagine, whose frequencies are strongly correlated with optimal growth temperature. Finally, we show that colonization of new thermal niches is not associated with high amounts of horizontal gene transfers. Altogether, although the acquisition of a few key proteins through horizontal gene transfer may have favored thermoadaptation in Methanococcales, our findings support sequential amino acid substitutions as the main factor driving thermoadaptation.


Subject(s)
Amino Acid Substitution , Methanococcales/genetics , Thermotolerance/genetics , Gene Transfer, Horizontal , Methanococcales/chemistry , Proteome
2.
Mol Biol Evol ; 38(9): 3754-3774, 2021 08 23.
Article in English | MEDLINE | ID: mdl-33974066

ABSTRACT

Extreme halophilic Archaea thrive in high salt, where, through proteomic adaptation, they cope with the strong osmolarity and extreme ionic conditions of their environment. In spite of wide fundamental interest, however, studies providing insights into this adaptation are scarce, because of practical difficulties inherent to the purification and characterization of halophilic enzymes. In this work, we describe the evolutionary history of malate dehydrogenases (MalDH) within Halobacteria (a class of the Euryarchaeota phylum). We resurrected nine ancestors along the inferred halobacterial MalDH phylogeny, including the Last Common Ancestral MalDH of Halobacteria (LCAHa) and compared their biochemical properties with those of five modern halobacterial MalDHs. We monitored the stability of these various MalDHs, their oligomeric states and enzymatic properties, as a function of concentration for different salts in the solvent. We found that a variety of evolutionary processes, such as amino acid replacement, gene duplication, loss of MalDH gene and replacement owing to horizontal transfer resulted in significant differences in solubility, stability and catalytic properties between these enzymes in the three Halobacteriales, Haloferacales, and Natrialbales orders since the LCAHa MalDH. We also showed how a stability trade-off might favor the emergence of new properties during adaptation to diverse environmental conditions. Altogether, our results suggest a new view of halophilic protein adaptation in Archaea.


Subject(s)
Euryarchaeota , Halobacterium , Malates , Phylogeny , Proteomics
3.
Bioinformatics ; 36(18): 4822-4824, 2020 09 15.
Article in English | MEDLINE | ID: mdl-33085745

ABSTRACT

MOTIVATION: Gene and species tree reconciliation methods are used to interpret gene trees, root them and correct uncertainties that are due to scarcity of signal in multiple sequence alignments. So far, reconciliation tools have not been integrated in standard phylogenetic software and they either lack performance on certain functions, or usability for biologists. RESULTS: We present Treerecs, a phylogenetic software based on duplication-loss reconciliation. Treerecs is simple to install and to use. It is fast and versatile, has a graphic output, and can be used along with methods for phylogenetic inference on multiple alignments like PLL and Seaview. AVAILABILITY AND IMPLEMENTATION: Treerecs is open-source. Its source code (C++, AGPLv3) and manuals are available from https://project.inria.fr/treerecs/.


Subject(s)
Algorithms , Evolution, Molecular , Phylogeny , Sequence Alignment , Software
4.
Mol Phylogenet Evol ; 127: 46-54, 2018 10.
Article in English | MEDLINE | ID: mdl-29684598

ABSTRACT

Phylogenetic analyses of conserved core genes have disentangled most of the ancient relationships in Archaea. However, some groups remain debated, like the DPANN, a deep-branching super-phylum composed of nanosized archaea with reduced genomes. Among these, the Nanohaloarchaea require high-salt concentrations for growth. Their discovery in 2012 was significant because they represent, together with Halobacteria (a Class belonging to Euryarchaeota), the only two described lineages of extreme halophilic archaea. The phylogenetic position of Nanohaloarchaea is highly debated, being alternatively proposed as the sister-lineage of Halobacteria or a member of the DPANN super-phylum. Pinpointing the phylogenetic position of extreme halophilic archaea is important to improve our knowledge of the deep evolutionary history of Archaea and the molecular adaptive processes and evolutionary paths that allowed their emergence. Using comparative genomic approaches, we identified 258 markers carrying a reliable phylogenetic signal. By combining strategies limiting the impact of biases on phylogenetic inference, we showed that Nanohaloarchaea and Halobacteria represent two independent lines that derived from two distinct but related methanogen Class II lineages. This implies that adaptation to high salinity emerged twice independently in Archaea and indicates that emergence of Nanohaloarchaea within DPANN in previous studies is likely the consequence of a tree reconstruction artifact, challenging the existence of this super-phylum.


Subject(s)
Euryarchaeota/classification , Phylogeny , Salinity , Bayes Theorem , Conserved Sequence , Genes, Archaeal , Genomics
5.
Mol Biol Evol ; 33(2): 305-10, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26541173

ABSTRACT

In a recent article, Nelson-Sathi et al. (NS) report that the origins of major archaeal lineages (MAL) correspond to massive group-specific gene acquisitions via HGT from bacteria (Nelson-Sathi et al. 2015. Origins of major archaeal clades correspond to gene acquisitions from bacteria. Nature 517(7532):77-80.). If correct, this would have fundamental implications for the process of diversification in microbes. However, a reexamination of these data and results shows that the methodology used by NS systematically inflates the number of genes acquired at the root of each MAL, and incorrectly assumes bacterial origins for these genes. A reanalysis of their data with appropriate phylogenetic models accounting for the dynamics of gene gain and loss between lineages supports the continuous acquisition of genes over long periods in the evolution of Archaea.


Subject(s)
Archaea/genetics , Bacteria/genetics , Evolution, Molecular , Gene Transfer, Horizontal , Genotype , Archaea/classification , Genes, Archaeal , Genes, Bacterial , Genomics , Phylogeny
6.
Mol Biol Evol ; 33(8): 2170-2, 2016 08.
Article in English | MEDLINE | ID: mdl-27189556

ABSTRACT

Ribosomal proteins (r-proteins) are increasingly used as an alternative to rRNA for prokaryotic systematics. However, their routine use is difficult because r-proteins are often missing or wrongly annotated in complete genome sequences, and there is currently no dedicated exhaustive database of r-proteins. RiboDB aims at filling this gap. This weekly updated comprehensive database allows the fast and easy retrieval of r-protein sequences from publicly available complete prokaryotic genome sequences. The current version of RiboDB contains 90 r-proteins from 3,750 prokaryotic complete genomes encompassing 38 phyla/major classes and 1,759 different species. RiboDB is accessible at http://ribodb.univ-lyon1.fr and through ACNUC interfaces.


Subject(s)
Databases, Factual , Ribosomal Proteins/classification , Base Sequence , Databases, Protein , Phylogeny , Prokaryotic Cells/classification , RNA, Ribosomal , Ribosomes/classification , Software
7.
Mol Biol Evol ; 32(1): 13-22, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25371435

ABSTRACT

The resurrection of ancestral proteins provides direct insight into how natural selection has shaped proteins found in nature. By tracing substitutions along a gene phylogeny, ancestral proteins can be reconstructed in silico and subsequently synthesized in vitro. This elegant strategy reveals the complex mechanisms responsible for the evolution of protein functions and structures. However, to date, all protein resurrection studies have used simplistic approaches for ancestral sequence reconstruction (ASR), including the assumption that a single sequence alignment alone is sufficient to accurately reconstruct the history of the gene family. The impact of such shortcuts on conclusions about ancestral functions has not been investigated. Here, we show with simulations that utilizing information on species history using a model that accounts for the duplication, horizontal transfer, and loss (DTL) of genes statistically increases ASR accuracy. This underscores the importance of the tree topology in the inference of putative ancestors. We validate our in silico predictions using in vitro resurrection of the LeuB enzyme for the ancestor of the Firmicutes, a major and ancient bacterial phylum. With this particular protein, our experimental results demonstrate that information on the species phylogeny results in a biochemically more realistic and kinetically more stable ancestral protein. Additional resurrection experiments with different proteins are necessary to statistically quantify the impact of using species tree-aware gene trees on ancestral protein phenotypes. Nonetheless, our results suggest the need for incorporating both sequence and DTL information in future studies of protein resurrections to accurately define the genotype-phenotype space in which proteins diversify.


Subject(s)
Computational Biology/methods , Proteins/genetics , Amino Acid Sequence , Bacterial Proteins/genetics , Computer Simulation , Evolution, Molecular , Genotype , Gram-Positive Bacteria/enzymology , Gram-Positive Bacteria/genetics , Phenotype , Phylogeny
8.
Genome Res ; 23(2): 323-30, 2013 Feb.
Article in English | MEDLINE | ID: mdl-23132911

ABSTRACT

Comparisons of gene trees and species trees are key to understanding major processes of genome evolution such as gene duplication and loss. Because current methods to reconstruct phylogenies fail to model the two-way dependency between gene trees and the species tree, they often misrepresent gene and species histories. We present a new probabilistic model to jointly infer rooted species and gene trees for dozens of genomes and thousands of gene families. We use simulations to show that this method accurately infers the species tree and gene trees, is robust to misspecification of the models of sequence and gene family evolution, and provides a precise historic record of gene duplications and losses throughout genome evolution. We simultaneously reconstruct the history of mammalian species and their genes based on 36 completely sequenced genomes, and use the reconstructed gene trees to infer the gene content and organization of ancestral mammalian genomes. We show that our method yields a more accurate picture of ancestral genomes than the trees available in the authoritative database Ensembl.


Subject(s)
Genes , Genome , Models, Genetic , Phylogeny , Algorithms , Animals , Computational Biology/methods , Computer Simulation , Evolution, Molecular , Gene Deletion , Gene Duplication , Humans , Models, Statistical
9.
BMC Bioinformatics ; 16: 251, 2015 Aug 12.
Article in English | MEDLINE | ID: mdl-26264559

ABSTRACT

BACKGROUND: Estimating the phylogenetic position of bacterial and archaeal organisms by genetic sequence comparisons is considered the gold standard in taxonomy. This is also a way to identify the species of origin of the sequence. The quality of the reference database used in such analyses is crucial: the database must reflect the up-to-date bacterial nomenclature and accurately indicate the species of origin of its sequences. DESCRIPTION: leBIBI(QBPP) is a web tool taking as input a series of nucleotide sequences belonging to one of a set of reference markers (e.g., SSU rRNA, rpoB, groEL2) and automatically retrieving closely related sequences, aligning them, and performing phylogenetic reconstruction using an approximate maximum likelihood approach. The system returns a set of quality parameters and, if possible, a suggested taxonomic assignment for the input sequences. The reference databases are extracted from GenBank and present four degrees of stringency, from the "superstringent" degree (one type strain per species) to the loosely parsed degree ("lax" database). A set of one hundred to more than a thousand sequences may be analyzed at a time. The speed of the process has been optimized through careful hardware selection and database design. CONCLUSION: leBIBI(QBPP) is a powerful tool that helps biologists position bacterial or archaeal sequences of commonly used markers in a phylogeny. It is a diagnostic tool for clinical, industrial and environmental microbiology laboratories, as well as an exploratory tool for more specialized laboratories. Its main advantages, relative to comparable systems, are: i) the use of a broad set of databases covering diverse markers with various degrees of stringency; ii) the use of an approximate Maximum Likelihood approach for phylogenetic reconstruction; iii) a speed compatible with on-line usage; and iv) providing fully documented results to help the user in decision making.


Subject(s)
Archaea/genetics , Bacteria/genetics , Databases, Nucleic Acid , Internet , Phylogeny , Sequence Analysis, RNA/methods , Software , Archaea/classification , Bacteria/classification , Computational Biology , Likelihood Functions , RNA, Archaeal/genetics , RNA, Bacterial/genetics , RNA, Ribosomal/genetics
10.
Mol Biol Evol ; 31(4): 832-45, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24398320

ABSTRACT

The evolutionary origin of eukaryotes is a question of great interest for which many different hypotheses have been proposed. These hypotheses predict distinct patterns of evolutionary relationships for individual genes of the ancestral eukaryotic genome. The availability of numerous completely sequenced genomes covering the three domains of life makes it possible to contrast these predictions with empirical data. We performed a systematic analysis of the phylogenetic relationships of ancestral eukaryotic genes with archaeal and bacterial genes. In contrast with previous studies, we emphasize the critical importance of methods accounting for statistical support, horizontal gene transfer, and gene loss, and we disentangle the processes underlying the phylogenomic pattern we observe. We first recover a clear signal indicating that a fraction of the bacteria-like eukaryotic genes are of alphaproteobacterial origin. Then, we show that the majority of bacteria-related eukaryotic genes actually do not point to a relationship with a specific bacterial taxonomic group. We also provide evidence that eukaryotes branch close to the last archaeal common ancestor. Our results demonstrate that there is no phylogenetic support for hypotheses involving a fusion with a bacterium other than the ancestor of mitochondria. Overall, they leave only two possible interpretations, respectively, based on the early-mitochondria hypotheses, which suppose an early endosymbiosis of an alphaproteobacterium in an archaeal host and on the slow-drip autogenous hypothesis, in which early eukaryotic ancestors were particularly prone to horizontal gene transfers.


Subject(s)
Evolution, Molecular , Models, Genetic , Archaea/genetics , Bacteria/genetics , Biological Evolution , Gene Transfer, Horizontal , Genetic Speciation , Genome, Human , Humans , Phylogeny , Symbiosis/genetics , Yeasts/genetics
11.
Proc Natl Acad Sci U S A ; 109(13): 4962-7, 2012 Mar 27.
Article in English | MEDLINE | ID: mdl-22416123

ABSTRACT

Lateral gene transfer (LGT), the acquisition of genes from other species, is a major evolutionary force. However, its success as an adaptive process makes the reconstruction of the history of life an intricate puzzle: If no gene has remained unaffected during the course of life's evolution, how can one rely on molecular markers to reconstruct the relationships among species? Here, we take a completely different look at LGT and its impact on the reconstruction of the history of life. Rather than trying to remove the effect of LGT in phylogenies, and ignoring as a result most of the information of gene histories, we use an explicit phylogenetic model of gene transfer to reconcile gene histories with the tree of species. We studied 16 bacterial and archaeal phyla, representing a dataset of 12,000 gene families distributed in 336 genomes. Our results show that, in most phyla, LGT provides an abundant phylogenetic signal on the pattern of species diversification and that this signal is robust to the choice of gene families under study. We also find that LGT brings an abundant signal on the location of the root of species trees, which has been previously overlooked. Our results quantify the great variety of gene transfer rates among lineages of the tree of life and provide strong support for the "complexity hypothesis," which states that genes whose products participate in macromolecular protein complexes are relatively resistant to transfer.


Subject(s)
Archaea/genetics , Bacteria/genetics , Gene Transfer, Horizontal/genetics , Phylogeny , Evolution, Molecular , Models, Genetic , Species Specificity
12.
Mol Biol Evol ; 30(8): 1745-50, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23699471

ABSTRACT

Efficient algorithms and programs for the analysis of the ever-growing amount of biological sequence data are strongly needed in the genomics era. The pace at which new data and methodologies are generated calls for the use of pre-existing, optimized (yet extensible) code, typically distributed as libraries or packages. This motivated the Bio++ project, aiming at developing a set of C++ libraries for sequence analysis, phylogenetics, population genetics, and molecular evolution. The main attractiveness of Bio++ is the extensibility and reusability of its components through its object-oriented design, without compromising the computer-efficiency of the underlying methods. We present here the second major release of the libraries, which provides an extended set of classes and methods. These extensions notably provide built-in access to sequence databases and new data structures for handling and manipulating sequences from the omics era, such as multiple genome alignments and sequencing reads libraries. More complex models of sequence evolution, such as mixture models and generic n-tuples alphabets, are also included.


Subject(s)
Computational Biology , Evolution, Molecular , Software , Algorithms , Computational Biology/methods , Genomics/methods , Humans , Internet
13.
Nature ; 456(7224): 942-5, 2008 Dec 18.
Article in English | MEDLINE | ID: mdl-19037246

ABSTRACT

Fossils of organisms dating from the origin and diversification of cellular life are scant and difficult to interpret; for this reason, alternative means to investigate the ecology of the last universal common ancestor (LUCA) and of the ancestors of the three domains of life are of great scientific value. It was recently recognized that the effects of temperature on ancestral organisms left 'genetic footprints' that could be uncovered in extant genomes. Accordingly, analyses of resurrected proteins predicted that the bacterial ancestor was thermophilic and that Bacteria subsequently adapted to lower temperatures. As the archaeal ancestor is also thought to have been thermophilic, the LUCA was parsimoniously inferred as thermophilic too. However, an analysis of ribosomal RNAs supported the hypothesis of a non-hyperthermophilic LUCA. Here we show that both rRNA and protein sequences analysed with advanced, realistic models of molecular evolution provide independent support for two environmental-temperature-related phases during the evolutionary history of the tree of life. In the first period, thermotolerance increased from a mesophilic LUCA to thermophilic ancestors of Bacteria and of Archaea-Eukaryota; in the second period, it decreased. Therefore, the two lineages descending from the LUCA and leading to the ancestors of Bacteria and Archaea-Eukaryota convergently adapted to high temperatures, possibly in response to a climate change of the early Earth, and/or aided by the transition from an RNA genome in the LUCA to organisms with more thermostable DNA genomes. This analysis unifies apparently contradictory results into a coherent depiction of the evolution of an ecological trait over the entire tree of life.


Subject(s)
Adaptation, Physiological/physiology , Archaea/physiology , Hot Temperature , Adaptation, Physiological/genetics , Archaea/genetics , Evolution, Molecular , Genes, rRNA/genetics , Phylogeny
14.
Biol Lett ; 9(5): 20130608, 2013 Oct 23.
Article in English | MEDLINE | ID: mdl-24046876

ABSTRACT

Several lines of evidence such as the basal location of thermophilic lineages in large-scale phylogenetic trees and the ancestral sequence reconstruction of single enzymes or large protein concatenations support the conclusion that the ancestors of the bacterial and archaeal domains were thermophilic organisms which were adapted to hot environments during the early stages of the Earth. A parsimonious reasoning would therefore suggest that the last universal common ancestor (LUCA) was also thermophilic. Various authors have used branch-wise non-homogeneous evolutionary models that better capture the variation of molecular compositions among lineages to accurately reconstruct the ancestral G + C contents of ribosomal RNAs and the ancestral amino acid composition of highly conserved proteins. They confirmed the thermophilic nature of the ancestors of Bacteria and Archaea but concluded that LUCA, their last common ancestor, was a mesophilic organism having a moderate optimal growth temperature. In this letter, we investigate the unknown nature of the phylogenetic signal that informs ancestral sequence reconstruction to support this non-parsimonious scenario. We find that rate variation across sites of molecular sequences provides information at different time scales by recording the oldest adaptation to temperature in slow-evolving regions and subsequent adaptations in fast-evolving ones.


Subject(s)
Adaptation, Physiological , Cold Temperature , Earth, Planet , Life , Models, Theoretical , Phylogeny
15.
Mol Biol Evol ; 28(9): 2661-74, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21498602

ABSTRACT

Methods to infer the ancestral conditions of life are commonly based on geological and paleontological analyses. Recently, several studies used genome sequences to gain information about past ecological conditions taking advantage of the property that the G+C and amino acid contents of bacterial and archaeal ribosomal DNA genes and proteins, respectively, are strongly influenced by the environmental temperature. The adaptation to optimal growth temperature (OGT) since the Last Universal Common Ancestor (LUCA) over the universal tree of life was examined, and it was concluded that LUCA was likely to have been a mesophilic organism and that a parallel adaptation to high temperature occurred independently along the two lineages leading to the ancestors of Bacteria on one side and of Archaea and Eukarya on the other side. Here, we focus on Archaea to gain a precise view of the adaptation to OGT over time in this domain. It has been often proposed on the basis of indirect evidence that the last archaeal common ancestor was a hyperthermophilic organism. Moreover, many results showed the influence of environmental temperature on the evolutionary dynamics of archaeal genomes: Thermophilic organisms generally display lower evolutionary rates than mesophiles. However, to our knowledge, no study tried to explain the differences of evolutionary rates for the entire archaeal domain and to investigate the evolution of substitution rates over time. A comprehensive archaeal phylogeny and a non-homogeneous model of the molecular evolutionary process allowed us to estimate ancestral base and amino acid compositions and OGTs at each internal node of the archaeal phylogenetic tree. The last archaeal common ancestor is predicted to have been hyperthermophilic and adaptations to cooler environments can be observed for extant mesophilic species. Furthermore, mesophilic species present both long branches and high variation of nucleotide and amino acid compositions since the last archaeal common ancestor. The increase of substitution rates observed in mesophilic lineages along all their branches can be interpreted as an ongoing adaptation to colder temperatures and to new metabolisms. We conclude that environmental temperature is a major factor that governs evolutionary rates in Archaea.


Subject(s)
Adaptation, Physiological , Archaea/genetics , Models, Genetic , Mutation Rate , Temperature , Amino Acids/genetics , Archaea/physiology , Archaeal Proteins/genetics , Genome, Archaeal , Phylogeny , Protein Structure, Tertiary/genetics , RNA, Ribosomal/genetics
16.
BMC Ecol Evol ; 22(1): 1, 2022 01 05.
Article in English | MEDLINE | ID: mdl-34986784

ABSTRACT

BACKGROUND: The recent rise in cultivation-independent genome sequencing has provided key material to explore uncharted branches of the Tree of Life. This has been particularly spectacular concerning the Archaea, projecting them at the center stage as prominently relevant to understand early stages in evolution and the emergence of fundamental metabolisms as well as the origin of eukaryotes. Yet, resolving deep divergences remains a challenging task due to well-known tree-reconstruction artefacts and biases in extracting robust ancient phylogenetic signal, notably when analyzing data sets including the three Domains of Life. Among the various strategies aimed at mitigating these problems, divide-and-conquer approaches remain poorly explored, and have been primarily based on reconciliation among single gene trees which however notoriously lack ancient phylogenetic signal. RESULTS: We analyzed sub-sets of full supermatrices covering the whole Tree of Life with specific taxonomic sampling to robustly resolve different parts of the archaeal phylogeny in light of their current diversity. Our results strongly support the existence and early emergence of two main clades, Cluster I and Cluster II, which we name Ouranosarchaea and Gaiarchaea, and we clarify the placement of important novel archaeal lineages within these two clades. However, the monophyly and branching of the fast evolving nanosized DPANN members remains unclear and worthy of further study. CONCLUSIONS: We inferred a well resolved rooted phylogeny of the Archaea that includes all recently described phyla of high taxonomic rank. This phylogeny represents a valuable reference to study the evolutionary events associated with the early steps of the diversification of the archaeal domain. Beyond the specifics of archaeal phylogeny, our results demonstrate the power of divide-and-conquer approaches to resolve deep phylogenetic relationships, which should be applied to progressively resolve the entire Tree of Life.


Subject(s)
Archaea , Eukaryota , Archaea/genetics , Phylogeny
17.
Mol Biol Evol ; 27(2): 221-4, 2010 Feb.
Article in English | MEDLINE | ID: mdl-19854763

ABSTRACT

We present SeaView version 4, a multiplatform program designed to facilitate multiple alignment and phylogenetic tree building from molecular sequence data through the use of a graphical user interface. SeaView version 4 combines all the functions of the widely used programs SeaView (in its previous versions) and Phylo_win, and expands them by adding network access to sequence databases, alignment with arbitrary algorithm, maximum-likelihood tree building with PhyML, and display, printing, and copy-to-clipboard of rooted or unrooted, binary or multifurcating phylogenetic trees. In relation to the wide present offer of tools and algorithms for phylogenetic analyses, SeaView is especially useful for teaching and for occasional users of such software. SeaView is freely available at http://pbil.univ-lyon1.fr/software/seaview.


Subject(s)
Computer Graphics , Phylogeny , Sequence Alignment/methods , Software , User-Computer Interface
18.
Methods Mol Biol ; 2231: 241-260, 2021.
Article in English | MEDLINE | ID: mdl-33289897

ABSTRACT

We present Seaview version 5, a multiplatform program to perform multiple alignment and phylogenetic tree building from molecular sequence data. Seaview provides network access to sequence databases, alignment with arbitrary algorithm, parsimony, distance and maximum likelihood tree building with PhyML, and display, printing, and copy-to-clipboard or to SVG files of rooted or unrooted, binary or multifurcating phylogenetic trees. While Seaview is primarily a program providing a graphical user interface to guide the user into performing desired analyses, Seaview also possesses a command-line mode adequate for user-provided scripts. Seaview version 5 introduces the ability to reconcile a gene tree with a reference species tree and use this reconciliation to root and rearrange the gene tree. Seaview is freely available at http://doua.prabi.fr/software/seaview.


Subject(s)
Sequence Alignment/methods , Sequence Analysis, DNA/methods , Software , Algorithms , Codon/genetics , Evolution, Molecular , Genetic Code , Molecular Sequence Data , Open Reading Frames/genetics , Phylogeny
19.
BMC Bioinformatics ; 11: 324, 2010 Jun 15.
Article in English | MEDLINE | ID: mdl-20550700

ABSTRACT

BACKGROUND: To understand the evolutionary role of Lateral Gene Transfer (LGT), accurate methods are needed to identify transferred genes and infer their timing of acquisition. Phylogenetic methods are particularly promising for this purpose, but the reconciliation of a gene tree with a reference (species) tree is computationally hard. In addition, the application of these methods to real data raises the problem of sorting out real and artifactual phylogenetic conflict. RESULTS: We present Prunier, a new method for phylogenetic detection of LGT based on the search for a maximum statistical agreement forest (MSAF) between a gene tree and a reference tree. The program is flexible as it can use any definition of "agreement" among trees. We evaluate the performance of Prunier and two other programs (EEEP and RIATA-HGT) for their ability to detect transferred genes in realistic simulations where gene trees are reconstructed from sequences. Prunier proposes a single scenario that compares to the other methods in terms of sensitivity, but shows higher specificity. We show that LGT scenarios carry a strong signal about the position of the root of the species tree and could be used to identify the direction of evolutionary time on the species tree. We use Prunier on a biological dataset of 23 universal proteins and discuss their suitability for inferring the tree of life. CONCLUSIONS: The ability of Prunier to take into account branch support in the process of reconciliation allows a gain in complexity, in comparison to EEEP, and in accuracy in comparison to RIATA-HGT. Prunier's greedy algorithm proposes a single scenario of LGT for a gene family, but its quality always compares to the best solutions provided by the other algorithms. When the root position is uncertain in the species tree, Prunier is able to infer a scenario per root at a limited additional computational cost and can easily run on large datasets. Prunier is implemented in C++, using the Bio++ library and the phylogeny program Treefinder. It is available at: http://pbil.univ-lyon1.fr/software/prunier.


Subject(s)
Algorithms , Gene Transfer, Horizontal , Phylogeny , Archaea/genetics , Bacteria/genetics , Software
20.
BMC Bioinformatics ; 10 Suppl 6: S3, 2009 Jun 16.
Article in English | MEDLINE | ID: mdl-19534752

ABSTRACT

BACKGROUND: Comparative genomics is a central step in many sequence analysis studies, from gene annotation and the identification of new functional regions in genomes, to the study of evolutionary processes at the molecular level (speciation, single gene or whole genome duplications, etc.) and phylogenetics. In that context, databases providing users with high-quality homologous families and sequence alignments as well as phylogenetic trees based on state-of-the-art algorithms are becoming indispensable. METHODS: We developed an automated procedure allowing massive all-against-all similarity searches, gene clustering, multiple alignment computation, and phylogenetic tree construction and reconciliation. The application of this procedure to a very large set of sequences is possible through parallel computing on a large computer cluster. RESULTS: Three databases were developed using this procedure: HOVERGEN, HOGENOM and HOMOLENS. These databases share the same architecture but differ in their content. HOVERGEN contains sequences from vertebrates, HOGENOM is mainly devoted to completely sequenced microbial organisms, and HOMOLENS is devoted to metazoan genomes from Ensembl. Access to the databases is provided through Web query forms, a general retrieval system and a client-server graphical interface. The latter can be used to perform tree-pattern based searches allowing, among other uses, the retrieval of sets of orthologous genes. The three databases, as well as the software required to build and query them, can be used or downloaded from the PBIL (Pôle Bioinformatique Lyonnais) site at http://pbil.univ-lyon1.fr/.


Subject(s)
Databases, Genetic , Genomics/methods , Algorithms , Cluster Analysis , Internet , Phylogeny , Sequence Alignment , Software