Search | BVS Ecuador Search Portal

1.

OMA orthology in 2024: improved prokaryote coverage, ancestral and extant GO enrichment, a revamped synteny viewer and more in the OMA Ecosystem.

Altenhoff, Adrian M; Warwick Vesztrocy, Alex; Bernard, Charles; Train, Clement-Marie; Nicheperovich, Alina; Prieto Baños, Silvia; Julca, Irene; Moi, David; Nevers, Yannis; Majidian, Sina; Dessimoz, Christophe; Glover, Natasha M.

Nucleic Acids Res ; 52(D1): D513-D521, 2024 Jan 05.

Article in English | MEDLINE | ID: mdl-37962356

ABSTRACT

In this update paper, we present the latest developments in the OMA browser knowledgebase, which aims to provide high-quality orthology inferences and facilitate the study of gene families, genomes and their evolution. First, we discuss the addition of new species in the database, particularly an expanded representation of prokaryotic species. The OMA browser now offers Ancestral Genome pages and an Ancestral Gene Order viewer, allowing users to explore the evolutionary history and gene content of ancestral genomes. We also introduce a revamped Local Synteny Viewer to compare genomic neighborhoods across both extant and ancestral genomes. Hierarchical Orthologous Groups (HOGs) are now annotated with Gene Ontology annotations, and users can easily perform extant or ancestral GO enrichments. Finally, we recap new tools in the OMA Ecosystem, including OMAmer for proteome mapping, OMArk for proteome quality assessment, OMAMO for model organism selection and Read2Tree for phylogenetic species tree construction from reads. These new features provide exciting opportunities for orthology analysis and comparative genomics. OMA is accessible at https://omabrowser.org.

Subject(s)

Genetic Databases, Ecosystem, Genome, Proteome, Genome/genetics, Phylogeny, Synteny, Internet, Gene Order/genetics

2.

OMAMO: orthology-based alternative model organism selection.

Nicheperovich, Alina; Altenhoff, Adrian M; Dessimoz, Christophe; Majidian, Sina.

Bioinformatics ; 38(10): 2965-2966, 2022 05 13.

Article in English | MEDLINE | ID: mdl-35561194

ABSTRACT

SUMMARY: The conservation of pathways and genes across species has allowed scientists to use non-human model organisms to gain a deeper understanding of human biology. However, the use of traditional model systems such as mice, rats and zebrafish is costly, time-consuming and increasingly raises ethical concerns, which highlights the need to search for less complex model organisms. Existing tools only focus on the few well-studied model systems, most of which are complex animals. To address these issues, we have developed Orthologous Matrix and Alternative Model Organism (OMAMO), a software and a web service that provides the user with the best non-complex organism for research into a biological process of interest based on orthologous relationships between human and the species. The outputs provided by OMAMO were supported by a systematic literature review. AVAILABILITY AND IMPLEMENTATION: https://omabrowser.org/omamo/, https://github.com/DessimozLab/omamo. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Software, Zebrafish, Animals, Mice, Rats, Zebrafish/genetics

3.

OMA orthology in 2021: website overhaul, conserved isoforms, ancestral gene order and more.

Altenhoff, Adrian M; Train, Clément-Marie; Gilbert, Kimberly J; Mediratta, Ishita; Mendes de Farias, Tarcisio; Moi, David; Nevers, Yannis; Radoykova, Hale-Seda; Rossier, Victor; Warwick Vesztrocy, Alex; Glover, Natasha M; Dessimoz, Christophe.

Nucleic Acids Res ; 49(D1): D373-D379, 2021 01 08.

Article in English | MEDLINE | ID: mdl-33174605

ABSTRACT

OMA is an established resource to elucidate evolutionary relationships among genes from currently 2326 genomes covering all domains of life. OMA provides pairwise and groupwise orthologs, functional annotations, local and global gene order conservation (synteny) information, among many other functions. This update paper describes the reorganisation of the database into gene-, group- and genome-centric pages. Other new and improved features are detailed, such as reporting of the evolutionarily best conserved isoforms of alternatively spliced genes, the inferred local order of ancestral genes, phylogenetic profiling, better cross-references, fast genome mapping, semantic data sharing via RDF, as well as a special coronavirus OMA with 119 viruses from the Nidovirales order, including SARS-CoV-2, the agent of the COVID-19 pandemic. We conclude with improvements to the documentation of the resource through primers, tutorials and short videos. OMA is accessible at https://omabrowser.org.

Subject(s)

Algorithms, Genetic Databases, Gene Order/genetics, Genome/genetics, Animals, COVID-19/epidemiology, COVID-19/prevention & control, COVID-19/virology, Chromosome Mapping, Molecular Evolution, Gene Ontology, Humans, Internet, Pandemics, Phylogeny, SARS-CoV-2/genetics, SARS-CoV-2/physiology, Species Specificity, Synteny

4.

OMA standalone: orthology inference among public and custom genomes and transcriptomes.

Altenhoff, Adrian M; Levy, Jeremy; Zarowiecki, Magdalena; Tomiczek, Bartlomiej; Warwick Vesztrocy, Alex; Dalquen, Daniel A; Müller, Steven; Telford, Maximilian J; Glover, Natasha M; Dylus, David; Dessimoz, Christophe.

Genome Res ; 29(7): 1152-1163, 2019 07.

Article in English | MEDLINE | ID: mdl-31235654

ABSTRACT

Genomes and transcriptomes are now typically sequenced by individual laboratories, but analyzing them often remains challenging. One essential step in many analyses lies in identifying orthologs (corresponding genes across multiple species), but this is far from trivial. The Orthologous MAtrix (OMA) database is a leading resource for identifying orthologs among publicly available, complete genomes. Here, we describe the OMA pipeline available as a standalone program for Linux and Mac. When run on a cluster, it has native support for the LSF, SGE, PBS Pro, and Slurm job schedulers and can scale up to thousands of parallel processes. Another key feature of OMA standalone is that users can combine their own data with existing public data by exporting genomes and precomputed alignments from the OMA database, which currently contains over 2100 complete genomes. We compare OMA standalone to other methods in the context of phylogenetic tree inference, by inferring a phylogeny of Lophotrochozoa, a challenging clade within the protostomes. We also discuss other potential applications of OMA standalone, including identifying gene families having undergone duplications/losses in specific clades, and identifying potential drug targets in nonmodel organisms. OMA standalone is available under the permissive open source Mozilla Public License Version 2.0.

Subject(s)

Genetic Databases, Genome, Invertebrates/classification, Software, Transcriptome, Animals, Invertebrates/genetics, Phylogeny

5.

The Quest for Orthologs benchmark service and consensus calls in 2020.

Altenhoff, Adrian M; Garrayo-Ventas, Javier; Cosentino, Salvatore; Emms, David; Glover, Natasha M; Hernández-Plaza, Ana; Nevers, Yannis; Sundesha, Vicky; Szklarczyk, Damian; Fernández, José M; Codó, Laia; For Orthologs Consortium, The Quest; Gelpi, Josep Ll; Huerta-Cepas, Jaime; Iwasaki, Wataru; Kelly, Steven; Lecompte, Odile; Muffato, Matthieu; Martin, Maria J; Capella-Gutierrez, Salvador; Thomas, Paul D; Sonnhammer, Erik; Dessimoz, Christophe.

Nucleic Acids Res ; 48(W1): W538-W545, 2020 07 02.

Article in English | MEDLINE | ID: mdl-32374845

ABSTRACT

The identification of orthologs (genes in different species which descended from the same gene in their last common ancestor) is a prerequisite for many analyses in comparative genomics and molecular evolution. Numerous algorithms and resources have been conceived to address this problem, but benchmarking and interpreting them is fraught with difficulties (need to compare them on a common input dataset, absence of ground truth, computational cost of calling orthologs). To address this, the Quest for Orthologs consortium maintains a reference set of proteomes and provides a web server for continuous orthology benchmarking (http://orthology.benchmarkservice.org). Furthermore, consensus ortholog calls derived from public benchmark submissions are provided on the Alliance of Genome Resources website, the joint portal of NIH-funded model organism databases.

Subject(s)

Multigene Family, Proteome, Software, Animals, Benchmarking, Consensus, Genomics, Humans, Mice, Phylogeny, Rats

6.

The OMA orthology database in 2018: retrieving evolutionary relationships among all domains of life through richer web and programmatic interfaces.

Altenhoff, Adrian M; Glover, Natasha M; Train, Clément-Marie; Kaleb, Klara; Warwick Vesztrocy, Alex; Dylus, David; de Farias, Tarcisio M; Zile, Karina; Stevenson, Charles; Long, Jiao; Redestig, Henning; Gonnet, Gaston H; Dessimoz, Christophe.

Nucleic Acids Res ; 46(D1): D477-D485, 2018 01 04.

Article in English | MEDLINE | ID: mdl-29106550

ABSTRACT

The Orthologous Matrix (OMA) is a leading resource to relate genes across many species from all of life. In this update paper, we review the recent algorithmic improvements in the OMA pipeline, describe increases in species coverage (particularly in plants and early-branching eukaryotes) and introduce several new features in the OMA web browser. Notable improvements include: (i) a scalable, interactive viewer for hierarchical orthologous groups; (ii) protein domain annotations and domain-based links between orthologous groups; (iii) functionality to retrieve phylogenetic marker genes for a subset of species of interest; (iv) a new synteny dot plot viewer; and (v) an overhaul of the programmatic access (REST API and semantic web), which will facilitate incorporation of OMA analyses in computational pipelines and integration with other bioinformatic resources. OMA can be freely accessed at https://omabrowser.org.

Subject(s)

Biological Evolution, Genetic Databases, Genome, Molecular Sequence Annotation, Proteins/genetics, Synteny, Algorithms, Animals, Archaea/classification, Archaea/genetics, Archaea/metabolism, Bacteria/classification, Bacteria/genetics, Bacteria/metabolism, Computational Biology/methods, Fungi/classification, Fungi/genetics, Fungi/metabolism, Gene Ontology, Humans, Internet, Phylogeny, Plants/classification, Plants/genetics, Plants/metabolism, Protein Domains, Proteins/chemistry, Proteins/metabolism, Web Browser

7.

Standardized benchmarking in the quest for orthologs.

Altenhoff, Adrian M; Boeckmann, Brigitte; Capella-Gutierrez, Salvador; Dalquen, Daniel A; DeLuca, Todd; Forslund, Kristoffer; Huerta-Cepas, Jaime; Linard, Benjamin; Pereira, Cécile; Pryszcz, Leszek P; Schreiber, Fabian; da Silva, Alan Sousa; Szklarczyk, Damian; Train, Clément-Marie; Bork, Peer; Lecompte, Odile; von Mering, Christian; Xenarios, Ioannis; Sjölander, Kimmen; Jensen, Lars Juhl; Martin, Maria J; Muffato, Matthieu; Gabaldón, Toni; Lewis, Suzanna E; Thomas, Paul D; Sonnhammer, Erik; Dessimoz, Christophe.

Nat Methods ; 13(5): 425-30, 2016 05.

Article in English | MEDLINE | ID: mdl-27043882

ABSTRACT

Achieving high accuracy in orthology inference is essential for many comparative, evolutionary and functional genomic analyses, yet the true evolutionary history of genes is generally unknown and orthologs are used for very different applications across phyla, requiring different precision-recall trade-offs. As a result, it is difficult to assess the performance of orthology inference methods. Here, we present a community effort to establish standards and an automated web-based service to facilitate orthology benchmarking. Using this service, we characterize 15 well-established inference methods and resources on a battery of 20 different benchmarks. Standardized benchmarking provides a way for users to identify the most effective methods for the problem at hand, sets a minimum requirement for new tools and resources, and guides the development of more accurate orthology inference methods.
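To make the precision-recall trade-off mentioned in this abstract concrete, the following minimal Python sketch scores two sets of predicted ortholog pairs against a reference set. The gene identifiers and the reference set are hypothetical; the abstract itself notes that the true evolutionary history is generally unknown, so any reference used in practice is only a proxy, and this is not the benchmark service's own scoring code.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) of predicted ortholog pairs against a reference set."""
    # Store each pair as a frozenset so that (a, b) and (b, a) count as the same pair.
    pred = {frozenset(p) for p in predicted}
    ref = {frozenset(p) for p in reference}
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    return precision, recall


# Hypothetical gene identifiers, for illustration only.
reference = [("humA", "musA"), ("humB", "musB"), ("humC", "musC"), ("humD", "musD")]
strict = [("humA", "musA"), ("humB", "musB")]
lenient = [("musA", "humA"), ("humB", "musB"), ("humC", "musC"),
           ("humC", "musD"), ("humD", "musA")]

print(precision_recall(strict, reference))   # (1.0, 0.5)
print(precision_recall(lenient, reference))  # (0.6, 0.75)
```

The strict predictor makes no false calls but misses half of the reference pairs, while the lenient one recovers more pairs at the cost of two false positives; which is preferable depends on the application.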

Subject(s)

Computational Biology/standards, Genomics/standards, Phylogeny, Proteomics/standards, Archaea/classification, Archaea/genetics, Bacteria/classification, Bacteria/genetics, Computational Biology/methods, Genetic Databases, Eukaryota/classification, Eukaryota/genetics, Gene Ontology, Genomics/methods, Genetic Models, Proteomics/methods, Protein Sequence Analysis, Sequence Homology, Species Specificity

8.

Orthologous Matrix (OMA) algorithm 2.0: more robust to asymmetric evolutionary rates and more scalable hierarchical orthologous group inference.

Train, Clément-Marie; Glover, Natasha M; Gonnet, Gaston H; Altenhoff, Adrian M; Dessimoz, Christophe.

Bioinformatics ; 33(14): i75-i82, 2017 Jul 15.

Article in English | MEDLINE | ID: mdl-28881964

ABSTRACT

MOTIVATION: Accurate orthology inference is a fundamental step in many phylogenetic and comparative analyses. Many methods have been proposed, including OMA (Orthologous MAtrix). Yet substantial challenges remain, in particular in coping with fragmented genes or genes evolving at different rates after duplication, and in scaling to large datasets. With more and more genomes available, it is necessary to improve the scalability and robustness of orthology inference methods. RESULTS: We present improvements in the OMA algorithm: (i) refining the pairwise orthology inference step to account for same-species paralogs evolving at different rates, and (ii) minimizing errors in the pairwise orthology verification step by testing the consistency of pairwise distance estimates, which can be problematic in the presence of fragmentary sequences. In addition, we introduce a more scalable procedure for hierarchical orthologous group (HOG) clustering, which is several orders of magnitude faster on large datasets. Using the Quest for Orthologs consortium orthology benchmark service, we show that these changes translate into substantial improvement on multiple empirical datasets. AVAILABILITY AND IMPLEMENTATION: This new OMA 2.0 algorithm is used in the OMA database (http://omabrowser.org) from the March 2017 release onwards, and can be run on custom genomes using OMA standalone version 2.0 and above (http://omabrowser.org/standalone). CONTACT: christophe.dessimoz@unil.ch or adrian.altenhoff@inf.ethz.ch.

Subject(s)

Molecular Evolution, Genomics/methods, Mutation Rate, Phylogeny, Software, Algorithms, Animals, Humans, Mammals/genetics

9.

The OMA orthology database in 2015: function predictions, better plant support, synteny view and other improvements.

Altenhoff, Adrian M; Skunca, Nives; Glover, Natasha; Train, Clément-Marie; Sueki, Anna; Pilizota, Ivana; Gori, Kevin; Tomiczek, Bartlomiej; Müller, Steven; Redestig, Henning; Gonnet, Gaston H; Dessimoz, Christophe.

Nucleic Acids Res ; 43(Database issue): D240-9, 2015 Jan.

Article in English | MEDLINE | ID: mdl-25399418

ABSTRACT

The Orthologous Matrix (OMA) project is a method and associated database inferring evolutionary relationships amongst currently 1706 complete proteomes (i.e. the protein sequence associated with every protein-coding gene in all genomes). In this update article, we present six major new developments in OMA: (i) a new web interface; (ii) Gene Ontology function predictions as part of the OMA pipeline; (iii) better support for plant genomes and in particular homeologs in the wheat genome; (iv) a new synteny viewer providing the genomic context of orthologs; (v) statically computed hierarchical orthologous group subsets downloadable in OrthoXML format; and (vi) the possibility to export parts of the all-against-all computations and to combine them with custom data for 'client-side' orthology prediction. OMA can be accessed through the OMA Browser and various programmatic interfaces at http://omabrowser.org.

Subject(s)

Protein Databases, Plant Proteins/genetics, Proteome/chemistry, Amino Acid Sequence Homology, Algorithms, Gene Ontology, Plant Genome, Humans, Internet, Plant Proteins/chemistry, Proteome/genetics, Synteny, Triticum/genetics

10.

Fifteen years SIB Swiss Institute of Bioinformatics: life science databases, tools and support.

Stockinger, Heinz; Altenhoff, Adrian M; Arnold, Konstantin; Bairoch, Amos; Bastian, Frederic; Bergmann, Sven; Bougueleret, Lydie; Bucher, Philipp; Delorenzi, Mauro; Lane, Lydie; Le Mercier, Philippe; Lisacek, Frédérique; Michielin, Olivier; Palagi, Patricia M; Rougemont, Jacques; Schwede, Torsten; von Mering, Christian; van Nimwegen, Erik; Walther, Daniel; Xenarios, Ioannis; Zavolan, Mihaela; Zdobnov, Evgeny M; Zoete, Vincent; Appel, Ron D.

Nucleic Acids Res ; 42(Web Server issue): W436-41, 2014 Jul.

Article in English | MEDLINE | ID: mdl-24792157

ABSTRACT

The SIB Swiss Institute of Bioinformatics (www.isb-sib.ch) was created in 1998 as an institution to foster excellence in bioinformatics. It is renowned worldwide for its databases and software tools, such as UniProtKB/Swiss-Prot, PROSITE, SWISS-MODEL, STRING, etc, that are all accessible on ExPASy.org, SIB's Bioinformatics Resource Portal. This article provides an overview of the scientific and training resources SIB has consistently been offering to the life science community for more than 15 years.

Subject(s)

Computational Biology, Chemical Databases, Software, Biological Evolution, Biostatistics, Drug Design, Genomics, Humans, Internet, Protein Conformation, Proteomics, Systems Biology

11.

Resolving the ortholog conjecture: orthologs tend to be weakly, but significantly, more similar in function than paralogs.

Altenhoff, Adrian M; Studer, Romain A; Robinson-Rechavi, Marc; Dessimoz, Christophe.

PLoS Comput Biol ; 8(5): e1002514, 2012.

Article in English | MEDLINE | ID: mdl-22615551

ABSTRACT

The function of most proteins is not determined experimentally, but is extrapolated from homologs. According to the "ortholog conjecture", or standard model of phylogenomics, protein function changes rapidly after duplication, leading to paralogs with different functions, while orthologs retain the ancestral function. We report here that a comparison of experimentally supported functional annotations among homologs from 13 genomes mostly supports this model. We show that to analyze GO annotation effectively, several confounding factors need to be controlled: authorship bias, variation of GO term frequency among species, variation of background similarity among species pairs, and propagated annotation bias. After controlling for these biases, we observe that orthologs have generally more similar functional annotations than paralogs. This is especially strong for sub-cellular localization. We observe only a weak decrease in functional similarity with increasing sequence divergence. These findings hold over a large diversity of species; notably orthologs from model organisms such as E. coli, yeast or mouse have conserved function with human proteins.

Subject(s)

Molecular Evolution, Chemical Models, Genetic Models, Proteins/chemistry, Proteins/genetics, Amino Acid Sequence, Computer Simulation, Molecular Sequence Data, Proteins/metabolism, Protein Sequence Analysis, Amino Acid Sequence Homology, Structure-Activity Relationship

12.

OMA 2011: orthology inference among 1000 complete genomes.

Altenhoff, Adrian M; Schneider, Adrian; Gonnet, Gaston H; Dessimoz, Christophe.

Nucleic Acids Res ; 39(Database issue): D289-94, 2011 Jan.

Article in English | MEDLINE | ID: mdl-21113020

ABSTRACT

OMA (Orthologous MAtrix) is a database that identifies orthologs among publicly available, complete genomes. Initiated in 2004, the project is at its 11th release. It now includes 1000 genomes, making it one of the largest resources of its kind. Here, we describe recent developments in terms of species covered; the algorithmic pipeline, in particular regarding the treatment of alternative splicing; and new features of the web (OMA Browser) and programming interface (SOAP API). In the second part, we review the various representations provided by OMA and their typical applications. The database is publicly accessible at http://omabrowser.org.

Subject(s)

Genetic Databases, Genome, Algorithms, Alternative Splicing, Molecular Evolution, Genes, Phylogeny, User-Computer Interface

13.

DrosOMA: the Drosophila Orthologous Matrix browser.

Thiébaut, Antonin; Altenhoff, Adrian M; Campli, Giulia; Glover, Natasha; Dessimoz, Christophe; Waterhouse, Robert M.

F1000Res ; 12: 936, 2023.

Article in English | MEDLINE | ID: mdl-38434623

ABSTRACT

Background: Comparative genomic analyses to delineate gene evolutionary histories inform the understanding of organismal biology by characterising gene and gene family origins, trajectories, and dynamics, as well as enabling the tracing of speciation, duplication, and loss events, and facilitating the transfer of gene functional information across species. Genomic data are available for an increasing number of species from the genus Drosophila; however, a dedicated resource exploiting these data to provide the research community with browsable results from genus-wide orthology delineation has been lacking. Methods: Using the OMA Orthologous Matrix orthology inference approach and browser deployment framework, we catalogued orthologues across a selected set of Drosophila species with high-quality annotated genomes. We developed and deployed a dedicated instance of the OMA browser to facilitate intuitive exploration, visualisation, and downloading of the genus-wide orthology delineation results. Results: DrosOMA, the Drosophila Orthologous Matrix browser, accessible from https://drosoma.dcsr.unil.ch/, presents the results of orthology delineation for 36 drosophilids from across the genus and four outgroup dipterans. It enables querying and browsing of the orthology data through a feature-rich web interface, with gene-view, orthologous group-view, and genome-view pages, including comprehensive gene name and identifier cross-references together with available functional annotations and protein domain architectures, as well as tools to visualise local and global synteny conservation. Conclusions: The DrosOMA browser demonstrates the deployability of the OMA browser framework for building user-friendly orthology databases with dense sampling of a selected taxonomic group. It provides the Drosophila research community with a tailored resource of browsable results from genus-wide orthology delineation.

Subject(s)

Drosophila, Molecular Evolution, Animals, Drosophila/genetics, Comparative Genomic Hybridization, Factual Databases, Genomics

14.

Phylogenetic and functional assessment of orthologs inference projects and methods.

Altenhoff, Adrian M; Dessimoz, Christophe.

PLoS Comput Biol ; 5(1): e1000262, 2009 Jan.

Article in English | MEDLINE | ID: mdl-19148271

ABSTRACT

Accurate genome-wide identification of orthologs is a central problem in comparative genomics, a fact reflected by the numerous orthology identification projects developed in recent years. However, only a few reports have compared their accuracy, and indeed, several recent efforts have not yet been systematically evaluated. Furthermore, orthology is typically only assessed in terms of function conservation, despite the phylogeny-based original definition of Fitch. We collected and mapped the results of nine leading orthology projects and methods (COG, KOG, Inparanoid, OrthoMCL, Ensembl Compara, Homologene, RoundUp, EggNOG, and OMA) and two standard methods (bidirectional best-hit and reciprocal smallest distance). We systematically compared their predictions with respect to both phylogeny and function, using six different tests. This required the mapping of millions of sequences, the handling of hundreds of millions of predicted pairs of orthologs, and the computation of tens of thousands of trees. In phylogenetic analysis or in functional analysis where high specificity is required, we find that OMA and Homologene perform best. At lower functional specificity but higher coverage level, OrthoMCL outperforms Ensembl Compara, and to a lesser extent Inparanoid. Lastly, the large coverage of the recent EggNOG can be of interest to build broad functional grouping, but the method is not specific enough for phylogenetic or detailed function analyses. In terms of general methodology, we observe that the more sophisticated tree reconstruction/reconciliation approach of Ensembl Compara was at times outperformed by pairwise comparison approaches, even in phylogenetic tests. Furthermore, we show that standard bidirectional best-hit often outperforms projects with more complex algorithms. First, the present study provides guidance for the broad community of orthology data users as to which database best suits their needs. Second, it introduces new methodology to verify orthology. And third, it sets performance standards for current and future approaches.
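The bidirectional best-hit approach named in this abstract as a standard method can be sketched in a few lines of Python. This is an illustrative sketch only: the score dictionaries and gene names are hypothetical, ties are resolved by keeping the first hit encountered, and it does not reproduce the exact procedure or scoring used in the study.

```python
def best_hits(scores):
    """Map each query gene to its highest-scoring hit in the other genome."""
    best = {}
    for (query, target), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return sorted pairs (a, b) where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical alignment scores, keyed by (query, target).
a_vs_b = {("a1", "b1"): 250, ("a1", "b2"): 90, ("a2", "b2"): 180, ("a3", "b2"): 200}
b_vs_a = {("b1", "a1"): 245, ("b2", "a3"): 205, ("b2", "a2"): 175}

print(bidirectional_best_hits(a_vs_b, b_vs_a))  # [('a1', 'b1'), ('a3', 'b2')]
```

Here `a2` is left unpaired because its best hit `b2` prefers `a3`, which shows how the reciprocity requirement trades coverage for specificity.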

Subject(s)

Genetic Speciation, Genomics/methods, Phylogeny, Animals, Genetic Databases, Genomics/standards, Humans, Genetic Models, Comparative Physiology, Sensitivity and Specificity, Species Specificity

15.

How to build phylogenetic species trees with OMA.

Dylus, David; Nevers, Yannis; Altenhoff, Adrian M; Gürtler, Antoine; Dessimoz, Christophe; Glover, Natasha M.

F1000Res ; 9: 511, 2020.

Article in English | MEDLINE | ID: mdl-35722083

ABSTRACT

Knowledge of species phylogeny is critical to many fields of biology. In an era of genome data availability, the most common way to make a phylogenetic species tree is by using multiple protein-coding genes, conserved in multiple species. This methodology is composed of several steps: orthology inference, multiple sequence alignment and inference of the phylogeny with dedicated tools. This can be a difficult task, and orthology inference, in particular, is usually computationally intensive and error prone if done ad hoc. This tutorial provides protocols to make use of OMA Orthologous Groups, a set of genes all orthologous to each other, to infer a phylogenetic species tree. It is designed to be user-friendly and computationally inexpensive, by providing two options: (1) Using only precomputed groups with species available on the OMA Browser, or (2) Computing orthologs using OMA Standalone for additional species, with the option of using precomputed orthology relations for those present in OMA. A protocol for downstream analyses is provided as well, including creating a supermatrix, tree inference, and visualization. All protocols use publicly available software, and we provide scripts and code snippets to facilitate data handling. The protocols are accompanied with practical examples.
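The supermatrix step mentioned in this abstract amounts to concatenating the per-group alignments species by species. The sketch below is a minimal Python illustration under stated assumptions: the input layout (one dictionary per aligned orthologous group), the species codes and the toy sequences are all hypothetical, and it is not one of the scripts distributed with the tutorial.

```python
def build_supermatrix(alignments, gap="-"):
    """Concatenate per-group alignments into one sequence per species.

    `alignments` is a list of dicts mapping species -> aligned sequence; all
    sequences in one dict must have the same length. Species missing from a
    group are padded with gap characters.
    """
    species = sorted({sp for aln in alignments for sp in aln})
    parts = {sp: [] for sp in species}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within an alignment must have equal length")
        width = lengths.pop()
        for sp in species:
            parts[sp].append(aln.get(sp, gap * width))
    return {sp: "".join(chunks) for sp, chunks in parts.items()}


# Hypothetical aligned orthologous groups for three species.
group1 = {"HUMAN": "MKT-A", "MOUSE": "MKTLA", "DROME": "MRT-A"}
group2 = {"HUMAN": "GGS", "MOUSE": "GAS"}  # no DROME member in this group

for sp, seq in build_supermatrix([group1, group2]).items():
    print(sp, seq)
# DROME MRT-A---
# HUMAN MKT-AGGS
# MOUSE MKTLAGAS
```

Padding absent species with gaps keeps every row the same length, which tree inference tools generally require of a concatenated alignment.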

16.

Inferring Orthology and Paralogy.

Altenhoff, Adrian M; Glover, Natasha M; Dessimoz, Christophe.

Methods Mol Biol ; 1910: 149-175, 2019.

Article in English | MEDLINE | ID: mdl-31278664

ABSTRACT

The distinction between orthologs and paralogs, genes that started diverging by speciation versus duplication, is relevant in a wide range of contexts, most notably phylogenetic tree inference and protein function annotation. In this chapter, we provide an overview of the methods used to infer orthology and paralogy. We survey both graph-based approaches (and their various grouping strategies) and tree-based approaches, which solve the more general problem of gene/species tree reconciliation. We discuss conceptual differences among the various orthology inference methods and databases and examine the difficult issue of verifying and benchmarking orthology predictions. Finally, we review typical applications of orthologous genes, groups, and reconciled trees and conclude with thoughts on future methodological developments.

Subject(s)

Computational Biology/methods, Molecular Evolution, Genomics, Phylogeny, Algorithms, Animals, Genome, Genomics/methods, Humans, Multigene Family

17.

Speeding up all-against-all protein comparisons while maintaining sensitivity by considering subsequence-level homology.

Wittwer, Lucas D; Pilizota, Ivana; Altenhoff, Adrian M; Dessimoz, Christophe.

PeerJ ; 2: e607, 2014.

Article in English | MEDLINE | ID: mdl-25320677

ABSTRACT

Orthology inference and other sequence analyses across multiple genomes typically start by performing exhaustive pairwise sequence comparisons, a process referred to as "all-against-all". As this process scales quadratically in terms of the number of sequences analysed, this step can become a bottleneck, thus limiting the number of genomes that can be simultaneously analysed. Here, we explored ways of speeding up the all-against-all step while maintaining its sensitivity. By exploiting the transitivity of homology and, crucially, ensuring that homology is defined in terms of consistent protein subsequences, our proof-of-concept resulted in a 4× speedup while recovering >99.6% of all homologs identified by the full all-against-all procedure on empirical sequence sets. In comparison, state-of-the-art k-mer approaches are orders of magnitude faster but only recover 3-14% of all homologous pairs. We also outline ideas to further improve the speed and recall of the new approach. An open source implementation is provided as part of the OMA standalone software at http://omabrowser.org/standalone.
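The quadratic scaling described in this abstract follows from the number of unordered pairs among n sequences, n(n-1)/2. The short Python sketch below simply evaluates that count; the sequence numbers are arbitrary illustrative values, and it assumes every sequence is compared once with every other one, which may differ from how a given pipeline partitions the work.

```python
def all_against_all_pairs(n_sequences):
    """Number of unordered sequence pairs compared in an exhaustive all-against-all."""
    return n_sequences * (n_sequences - 1) // 2


for n in (1_000, 10_000, 100_000):
    print(n, all_against_all_pairs(n))
# 1000 499500
# 10000 49995000
# 100000 4999950000
```

A tenfold increase in the number of sequences therefore gives roughly a hundredfold increase in comparisons, which is why this step becomes the bottleneck as more genomes are added.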

18.

Inferring hierarchical orthologous groups from orthologous gene pairs.

Altenhoff, Adrian M; Gil, Manuel; Gonnet, Gaston H; Dessimoz, Christophe.

PLoS One ; 8(1): e53786, 2013.

Article in English | MEDLINE | ID: mdl-23342000

ABSTRACT

Hierarchical orthologous groups are defined as sets of genes that have descended from a single common ancestor within a taxonomic range of interest. Identifying such groups is useful in a wide range of contexts, including inference of gene function, study of gene evolution dynamics and comparative genomics. Hierarchical orthologous groups can be derived from reconciled gene/species trees but, this being a computationally costly procedure, many phylogenomic databases work on the basis of pairwise gene comparisons instead ("graph-based" approach). To our knowledge, there is only one published algorithm for graph-based hierarchical group inference, but both its theoretical justification and performance in practice are as of yet largely uncharacterised. We establish a formal correspondence between the orthology graph and hierarchical orthologous groups. Based on that, we devise GETHOGs ("Graph-based Efficient Technique for Hierarchical Orthologous Groups"), a novel algorithm to infer hierarchical groups directly from the orthology graph, thus without needing gene tree inference nor gene/species tree reconciliation. GETHOGs is shown to correctly reconstruct hierarchical orthologous groups when applied to perfect input, and several extensions with stringency parameters are provided to deal with imperfect input data. We demonstrate its competitiveness using both simulated and empirical data. GETHOGs is implemented as a part of the freely-available OMA standalone package (http://omabrowser.org/standalone). Furthermore, hierarchical groups inferred by GETHOGs ("OMA HOGs") on >1,000 genomes can be interactively queried via the OMA browser (http://omabrowser.org).
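To illustrate how groups can depend on the taxonomic range when read off an orthology graph, the following Python sketch takes connected components of the graph restricted to the genes of a clade. This is a deliberately naive simplification on idealised input, not the GETHOGs algorithm described in the abstract, and the gene names and orthology pairs are hypothetical.

```python
from collections import defaultdict


def connected_components(genes, orthologous_pairs):
    """Group genes into connected components of an undirected orthology graph."""
    neighbours = defaultdict(set)
    for a, b in orthologous_pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen, groups = set(), []
    for gene in sorted(genes):
        if gene in seen:
            continue
        # Depth-first traversal collecting everything reachable from this gene.
        stack, component = [gene], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbours[current] - component)
        seen |= component
        groups.append(sorted(component))
    return groups


# Hypothetical genes from human, mouse and fly, with pairwise orthology calls.
genes = ["fly1", "hum1", "hum2", "mus1", "mus2"]
pairs = [("hum1", "mus1"), ("hum2", "mus2"), ("hum1", "fly1"), ("hum2", "fly1"),
         ("mus1", "fly1"), ("mus2", "fly1")]

# Restricting to the mammal genes gives two groups; adding fly merges them into one.
mammals = [g for g in genes if not g.startswith("fly")]
mammal_pairs = [(a, b) for a, b in pairs if a in mammals and b in mammals]
print(connected_components(mammals, mammal_pairs))  # [['hum1', 'mus1'], ['hum2', 'mus2']]
print(connected_components(genes, pairs))           # [['fly1', 'hum1', 'hum2', 'mus1', 'mus2']]
```

The toy data mimic a duplication in the mammalian ancestor: within mammals the two copies form separate groups, while at the deeper level that includes fly they fall into a single group descended from one ancestral gene.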

Subject(s)

Algorithms, Genomics/methods, Nucleic Acid Sequence Homology, Genetic Databases, Phylogeny

19.

The impact of gene duplication, insertion, deletion, lateral gene transfer and sequencing error on orthology inference: a simulation study.

Dalquen, Daniel A; Altenhoff, Adrian M; Gonnet, Gaston H; Dessimoz, Christophe.

PLoS One ; 8(2): e56925, 2013.

Article in English | MEDLINE | ID: mdl-23451112

ABSTRACT

The identification of orthologous genes, a prerequisite for numerous analyses in comparative and functional genomics, is commonly performed computationally from protein sequences. Several previous studies have compared the accuracy of orthology inference methods, but simulated data has not typically been considered in cross-method assessment studies. Yet, while dependent on model assumptions, simulation-based benchmarking offers unique advantages: contrary to empirical data, all aspects of simulated data are known with certainty. Furthermore, the flexibility of simulation makes it possible to investigate performance factors in isolation of one another. Here, we use simulated data to dissect the performance of six methods for orthology inference available as standalone software packages (Inparanoid, OMA, OrthoInspector, OrthoMCL, QuartetS, SPIMAP) as well as two generic approaches (bidirectional best hit and reciprocal smallest distance). We investigate the impact of various evolutionary forces (gene duplication, insertion, deletion, and lateral gene transfer) and technological artefacts (ambiguous sequences) on orthology inference. We show that while gene duplication/loss and insertion/deletion are well handled by most methods (albeit for different trade-offs of precision and recall), lateral gene transfer disrupts all methods. As for ambiguous sequences, which might result from poor sequencing, assembly, or genome annotation, we show that they affect alignment score-based orthology methods more strongly than their distance-based counterparts.

Subject(s)

Gene Duplication/genetics, Horizontal Gene Transfer/genetics, Insertional Mutagenesis/genetics, Genomics/methods

20.

Inferring orthology and paralogy.

Altenhoff, Adrian M; Dessimoz, Christophe.

Methods Mol Biol ; 855: 259-79, 2012.

Article in English | MEDLINE | ID: mdl-22407712

ABSTRACT

The distinction between orthologs and paralogs, genes that started diverging by speciation versus duplication, is relevant in a wide range of contexts, most notably phylogenetic tree inference and protein function annotation. In this chapter, we provide an overview of the methods used to infer orthology and paralogy. We survey both graph-based approaches (and their various grouping strategies) and tree-based approaches, which solve the more general problem of gene/species tree reconciliation. We discuss conceptual differences among the various orthology inference methods and databases, and examine the difficult issue of verifying and benchmarking orthology predictions. Finally, we review typical applications of orthologous genes, groups, and reconciled trees and conclude with thoughts on future methodological developments.

Subject(s)

Computational Biology/methods, Molecular Evolution, Animals, Humans, Phylogeny
