Results 1 - 20 of 67
1.
Cell ; 187(14): 3761-3778.e16, 2024 Jul 11.
Article in English | MEDLINE | ID: mdl-38843834

ABSTRACT

Novel antibiotics are urgently needed to combat the antibiotic-resistance crisis. We present a machine-learning-based approach to predict antimicrobial peptides (AMPs) within the global microbiome and leverage a vast dataset of 63,410 metagenomes and 87,920 prokaryotic genomes from environmental and host-associated habitats to create the AMPSphere, a comprehensive catalog comprising 863,498 non-redundant peptides, few of which match existing databases. AMPSphere provides insights into the evolutionary origins of peptides, including by duplication or gene truncation of longer sequences, and we observed that AMP production varies by habitat. To validate our predictions, we synthesized and tested 100 AMPs against clinically relevant drug-resistant pathogens and human gut commensals both in vitro and in vivo. A total of 79 peptides were active, with 63 targeting pathogens. These active AMPs exhibited antibacterial activity by disrupting bacterial membranes. In conclusion, our approach identified nearly one million prokaryotic AMP sequences, an open-access resource for antibiotic discovery.


Subject(s)
Antimicrobial Peptides , Machine Learning , Microbiota , Antimicrobial Peptides/pharmacology , Antimicrobial Peptides/chemistry , Antimicrobial Peptides/genetics , Humans , Animals , Anti-Bacterial Agents/pharmacology , Mice , Metagenome , Bacteria/drug effects , Bacteria/genetics , Gastrointestinal Microbiome/drug effects
2.
Cell ; 179(5): 1068-1083.e21, 2019 Nov 14.
Article in English | MEDLINE | ID: mdl-31730850

ABSTRACT

Ocean microbial communities strongly influence the biogeochemistry, food webs, and climate of our planet. Despite recent advances in understanding their taxonomic and genomic compositions, little is known about how their transcriptomes vary globally. Here, we present a dataset of 187 metatranscriptomes and 370 metagenomes from 126 globally distributed sampling stations and establish a resource of 47 million genes to study community-level transcriptomes across depth layers from pole-to-pole. We examine gene expression changes and community turnover as the underlying mechanisms shaping community transcriptomes along these axes of environmental variation and show how their individual contributions differ for multiple biogeochemically relevant processes. Furthermore, we find the relative contribution of gene expression changes to be significantly lower in polar than in non-polar waters and hypothesize that in polar regions, alterations in community activity in response to ocean warming will be driven more strongly by changes in organismal composition than by gene regulatory mechanisms.


Subject(s)
Gene Expression Regulation , Metagenome , Oceans and Seas , Transcriptome/genetics , Geography , Microbiota/genetics , Molecular Sequence Annotation , RNA, Messenger/genetics , RNA, Messenger/metabolism , Seawater/microbiology , Temperature
3.
Nature ; 626(7998): 377-384, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38109938

ABSTRACT

Many of the Earth's microbes remain uncultured and understudied, limiting our understanding of the functional and evolutionary aspects of their genetic material, which remain largely overlooked in most metagenomic studies [1]. Here we analysed 149,842 environmental genomes from multiple habitats [2-6] and compiled a curated catalogue of 404,085 functionally and evolutionarily significant novel (FESNov) gene families exclusive to uncultivated prokaryotic taxa. All FESNov families span multiple species, exhibit strong signals of purifying selection and qualify as new orthologous groups, thus nearly tripling the number of bacterial and archaeal gene families described to date. The FESNov catalogue is enriched in clade-specific traits, including 1,034 novel families that can distinguish entire uncultivated phyla, classes and orders, probably representing synapomorphies that facilitated their evolutionary divergence. Using genomic context analysis and structural alignments we predicted functional associations for 32.4% of FESNov families, including 4,349 high-confidence associations with important biological processes. These predictions provide a valuable hypothesis-driven framework that we used for experimental validation of a new gene family involved in cell motility and a novel set of antimicrobial peptides. We also demonstrate that the relative abundance profiles of novel families can discriminate between environments and clinical conditions, leading to the discovery of potentially new biomarkers associated with colorectal cancer. We expect this work to enhance future metagenomics studies and expand our knowledge of the genetic repertory of uncultivated organisms.


Subject(s)
Archaea , Bacteria , Ecosystem , Evolution, Molecular , Genes, Archaeal , Genes, Bacterial , Genomics , Knowledge , Antimicrobial Peptides/genetics , Archaea/classification , Archaea/genetics , Bacteria/classification , Bacteria/genetics , Biomarkers , Cell Movement/genetics , Colorectal Neoplasms/genetics , Genomics/methods , Genomics/trends , Metagenomics/trends , Multigene Family , Phylogeny , Reproducibility of Results
4.
Nature ; 601(7892): 252-256, 2022 01.
Article in English | MEDLINE | ID: mdl-34912116

ABSTRACT

Microbial genes encode the majority of the functional repertoire of life on earth. However, despite increasing efforts in metagenomic sequencing of various habitats [1-3], little is known about the distribution of genes across the global biosphere, with implications for human and planetary health. Here we constructed a non-redundant gene catalogue of 303 million species-level genes (clustered at 95% nucleotide identity) from 13,174 publicly available metagenomes across 14 major habitats and used it to show that most genes are specific to a single habitat. The small fraction of genes found in multiple habitats is enriched in antibiotic-resistance genes and markers for mobile genetic elements. By further clustering these species-level genes into 32 million protein families, we observed that a small fraction of these families contain the majority of the genes (0.6% of families account for 50% of the genes). The majority of species-level genes and protein families are rare. Furthermore, species-level genes, and in particular the rare ones, show low rates of positive (adaptive) selection, supporting a model in which most genetic variability observed within each protein family is neutral or nearly neutral.


Subject(s)
Metagenome , Metagenomics , Anti-Bacterial Agents/pharmacology , Drug Resistance, Microbial , Ecosystem , Humans , Metagenome/genetics
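
Note on entry 4: the abstract reports that 0.6% of protein families account for 50% of the genes. A statistic of this kind can be derived from a list of family sizes, as in the minimal Python sketch below. This is an illustration only, not the authors' pipeline; the function name `smallest_fraction_covering` and the toy family sizes are hypothetical.

```python
def smallest_fraction_covering(family_sizes, share=0.5):
    """Return the smallest fraction of families that together hold
    at least `share` of all genes, taking the largest families first."""
    sizes = sorted(family_sizes, reverse=True)
    target = share * sum(sizes)
    covered = 0
    for count, size in enumerate(sizes, start=1):
        covered += size
        if covered >= target:
            return count / len(sizes)
    return 1.0  # only reached for an empty input


# Toy data (hypothetical): one very large family plus 500 singletons.
toy_sizes = [500] + [1] * 500
fraction = smallest_fraction_covering(toy_sizes)
print(f"{fraction:.2%} of families hold half of the genes")
```

In the toy data a single family out of 501 already holds half of the genes, which mirrors in miniature the strongly skewed distribution the abstract describes.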
5.
Nucleic Acids Res ; 51(D1): D389-D394, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36399505

ABSTRACT

The eggNOG (evolutionary gene genealogy Non-supervised Orthologous Groups) database is a bioinformatics resource providing orthology data and comprehensive functional information for organisms from all domains of life. Here, we present a major update of the database and website (version 6.0), which increases the number of covered organisms to 12 535 reference species, expands functional annotations, and implements new functionality. In total, eggNOG 6.0 provides a hierarchy of over 17M orthologous groups (OGs) computed at 1601 taxonomic levels, spanning 10 756 bacterial, 457 archaeal and 1322 eukaryotic organisms. OGs have been thoroughly annotated using recent knowledge from functional databases, including KEGG, Gene Ontology, UniProtKB, BiGG, CAZy, CARD, PFAM and SMART. eggNOG also offers phylogenetic trees for all OGs, maximising utility and versatility for end users while allowing researchers to investigate the evolutionary history of speciation and duplication events as well as the phylogenetic distribution of functional terms within each OG. Furthermore, the eggNOG 6.0 website contains new functionality to mine orthology and functional data with ease, including the possibility of generating phylogenetic profiles for multiple OGs across species or identifying single-copy OGs at custom taxonomic levels. eggNOG 6.0 is available at http://eggnog6.embl.de.


Subject(s)
Databases, Genetic , Genomics , Phylogeny , Computational Biology , Eukaryota/genetics
6.
Nucleic Acids Res ; 51(D1): D760-D766, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36408900

ABSTRACT

The interpretation of genomic, transcriptomic and other microbial 'omics data is highly dependent on the availability of well-annotated genomes. As the number of publicly available microbial genomes continues to increase exponentially, the need for quality control and consistent annotation is becoming critical. We present proGenomes3, a database of 907 388 high-quality genomes containing 4 billion genes that passed stringent criteria and have been consistently annotated using multiple functional and taxonomic databases including mobile genetic elements and biosynthetic gene clusters. proGenomes3 encompasses 41 171 species-level clusters, defined based on universal single copy marker genes, for which pan-genomes and contextual habitat annotations are provided. The database is available at http://progenomes.embl.de/.


Subject(s)
Genome , Prokaryotic Cells , Databases, Genetic , Genomics , Molecular Sequence Annotation , Bacteria/classification , Bacteria/genetics
7.
Nature ; 560(7717): 233-237, 2018 08.
Article in English | MEDLINE | ID: mdl-30069051

ABSTRACT

Soils harbour some of the most diverse microbiomes on Earth and are essential for both nutrient cycling and carbon storage. To understand soil functioning, it is necessary to model the global distribution patterns and functional gene repertoires of soil microorganisms, as well as the biotic and environmental associations between the diversity and structure of both bacterial and fungal soil communities [1-4]. Here we show, by leveraging metagenomics and metabarcoding of global topsoil samples (189 sites, 7,560 subsamples), that bacterial, but not fungal, genetic diversity is highest in temperate habitats and that microbial gene composition varies more strongly with environmental variables than with geographic distance. We demonstrate that fungi and bacteria show global niche differentiation that is associated with contrasting diversity responses to precipitation and soil pH. Furthermore, we provide evidence for strong bacterial-fungal antagonism, inferred from antibiotic-resistance genes, in topsoil and ocean habitats, indicating the substantial role of biotic interactions in shaping microbial communities. Our results suggest that both competition and environmental filtering affect the abundance, composition and encoded gene functions of bacterial and fungal communities, indicating that the relative contributions of these microorganisms to global nutrient cycling vary spatially.


Subject(s)
Bacteria/isolation & purification , Biodiversity , Earth, Planet , Fungi/isolation & purification , Microbiota/physiology , Soil Microbiology , Bacteria/genetics , DNA Barcoding, Taxonomic , Drug Resistance, Microbial/genetics , Fungi/genetics , Hydrogen-Ion Concentration , Metagenomics , Microbiota/genetics , Oceans and Seas , Rain , Seawater/microbiology
8.
Nucleic Acids Res ; 50(W1): W352-W357, 2022 07 05.
Article in English | MEDLINE | ID: mdl-35639770

ABSTRACT

Synteny conservation analysis is a well-established methodology to investigate the potential functional role of unknown prokaryotic genes. However, bioinformatic tools to reconstruct and visualise genomic contexts usually depend on slow computations, are restricted to narrow taxonomic ranges, and/or do not allow for the functional and interactive exploration of neighbouring genes across different species. Here, we present GeCoViz, an online resource built upon 12 221 reference prokaryotic genomes that provides fast and interactive visualisation of custom genomic regions anchored by any target gene, which can be sought by either name, orthologous group (KEGGs, eggNOGs), protein domain (PFAM) or sequence. To facilitate functional and evolutionary interpretation, GeCoViz allows users to customise the taxonomic scope of each analysis and provides comprehensive annotations of the neighbouring genes. Interactive visualisation options include, among others, scaled representations of gene lengths and genomic distances, and on-the-fly calculation of synteny conservation of neighbouring genes, which can be highlighted based on custom thresholds. The resulting plots can be downloaded as high-quality images for publishing purposes. Overall, GeCoViz offers an easy-to-use, comprehensive, fast and interactive web-based tool for investigating the genomic context of prokaryotic genes, and is freely available at https://gecoviz.cgmlab.org.


Subject(s)
Data Visualization , Evolution, Molecular , Genomics , Prokaryotic Cells , Software , Genomics/methods , Prokaryotic Cells/metabolism , Genes, Bacterial/genetics , Genome, Bacterial/genetics , Internet
9.
Nucleic Acids Res ; 50(W1): W577-W582, 2022 07 05.
Article in English | MEDLINE | ID: mdl-35544233

ABSTRACT

Phylogenomics data have grown exponentially over the last decades. It is currently common for genome-wide projects to generate hundreds or even thousands of phylogenetic trees and multiple sequence alignments, which may also be very large in size. However, the analysis and interpretation of such data still depends on custom bioinformatic and visualisation workflows that are largely unattainable for non-expert users. Here, we present PhyloCloud, an online platform aimed at hosting, indexing and exploring large phylogenetic tree collections, providing also seamless access to common analyses and operations, such as node annotation, searching, topology editing, automatic tree rooting, orthology detection and more. In addition, PhyloCloud provides quick access to tools that allow users to build their own phylogenies using fast predefined workflows, graphically compare tree topologies, or query taxonomic databases such as NCBI or GTDB. Finally, PhyloCloud offers a novel tree visualisation system based on ETE Toolkit v4.0, which can be used to explore very large trees and enhance them with custom annotations and multiple sequence alignments. The platform allows for sharing tree collections and specific tree views via private links, or making them fully public, serving also as a repository of phylogenomic data. PhyloCloud is available at https://phylocloud.cgmlab.org.


Subject(s)
Computational Biology , Genome , Phylogeny , Sequence Alignment , Databases, Genetic
10.
Mol Biol Evol ; 38(12): 5825-5829, 2021 12 09.
Article in English | MEDLINE | ID: mdl-34597405

ABSTRACT

Even though automated functional annotation of genes represents a fundamental step in most genomic and metagenomic workflows, it remains challenging at large scales. Here, we describe a major upgrade to eggNOG-mapper, a tool for functional annotation based on precomputed orthology assignments, now optimized for vast (meta)genomic data sets. Improvements in version 2 include a full update of both the genomes and functional databases to those from eggNOG v5, as well as several efficiency enhancements and new features. Most notably, eggNOG-mapper v2 now allows for: 1) de novo gene prediction from raw contigs, 2) built-in pairwise orthology prediction, 3) fast protein domain discovery, and 4) automated GFF decoration. eggNOG-mapper v2 is available as a standalone tool or as an online service at http://eggnog-mapper.embl.de.


Subject(s)
Databases, Genetic , Metagenomics , Genomics , Metagenome , Molecular Sequence Annotation , Phylogeny , Software
11.
Nucleic Acids Res ; 48(D1): D621-D625, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31647096

ABSTRACT

Microbiology depends on the availability of annotated microbial genomes for many applications. Comparative genomics approaches have been a major advance, but consistent and accurate annotations of genomes can be hard to obtain. In addition, newer concepts such as the pan-genome concept are still being implemented to help answer biological questions. Hence, we present proGenomes2, which provides 87 920 high-quality genomes in a user-friendly and interactive manner. Genome sequences and annotations can be retrieved individually or by taxonomic clade. Every genome in the database has been assigned to a species cluster and most genomes could be accurately assigned to one or multiple habitats. In addition, general functional annotations and specific annotations of antibiotic resistance genes and single nucleotide variants are provided. In short, proGenomes2 provides threefold more genomes, enhanced habitat annotations, updated taxonomic and functional annotation and improved linkage to the NCBI BioSample database. The database is available at http://progenomes.embl.de/.


Subject(s)
Databases, Genetic , Genome, Archaeal , Genome, Bacterial , Genomics , Computational Biology/methods , Ecosystem , Internet , Molecular Sequence Annotation , Polymorphism, Single Nucleotide , Prokaryotic Cells , Reproducibility of Results , Software
12.
Nucleic Acids Res ; 48(W1): W538-W545, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32374845

ABSTRACT

The identification of orthologs (genes in different species which descended from the same gene in their last common ancestor) is a prerequisite for many analyses in comparative genomics and molecular evolution. Numerous algorithms and resources have been conceived to address this problem, but benchmarking and interpreting them is fraught with difficulties (need to compare them on a common input dataset, absence of ground truth, computational cost of calling orthologs). To address this, the Quest for Orthologs consortium maintains a reference set of proteomes and provides a web server for continuous orthology benchmarking (http://orthology.benchmarkservice.org). Furthermore, consensus ortholog calls derived from public benchmark submissions are provided on the Alliance of Genome Resources website, the joint portal of NIH-funded model organism databases.


Subject(s)
Multigene Family , Proteome , Software , Animals , Benchmarking , Consensus , Genomics , Humans , Mice , Phylogeny , Rats
13.
Nucleic Acids Res ; 47(D1): D607-D613, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30476243

ABSTRACT

Proteins and their functional interactions form the backbone of the cellular machinery. Their connectivity network needs to be considered for the full understanding of biological phenomena, but the available information on protein-protein associations is incomplete and exhibits varying levels of annotation granularity and reliability. The STRING database aims to collect, score and integrate all publicly available sources of protein-protein interaction information, and to complement these with computational predictions. Its goal is to achieve a comprehensive and objective global network, including direct (physical) as well as indirect (functional) interactions. The latest version of STRING (11.0) more than doubles the number of organisms it covers, to 5090. The most important new feature is an option to upload entire, genome-wide datasets as input, allowing users to visualize subsets as interaction networks and to perform gene-set enrichment analysis on the entire input. For the enrichment analysis, STRING implements well-known classification systems such as Gene Ontology and KEGG, but also offers additional, new classification systems based on high-throughput text-mining as well as on a hierarchical clustering of the association network itself. The STRING resource is available online at https://string-db.org/.


Subject(s)
Genomics/methods , Protein Interaction Mapping/methods , Software , Animals , Databases, Genetic , Gene Ontology , Humans
14.
Nucleic Acids Res ; 47(D1): D309-D314, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30418610

ABSTRACT

eggNOG is a public database of orthology relationships, gene evolutionary histories and functional annotations. Here, we present version 5.0, featuring a major update of the underlying genome sets, which have been expanded to 4445 representative bacteria and 168 archaea derived from 25 038 genomes, as well as 477 eukaryotic organisms and 2502 viral proteomes that were selected for diversity and filtered by genome quality. In total, 4.4M orthologous groups (OGs) distributed across 379 taxonomic levels were computed together with their associated sequence alignments, phylogenies, HMM models and functional descriptors. Precomputed evolutionary analysis provides fine-grained resolution of duplication/speciation events within each OG. Our benchmarks show that, despite doubling the amount of genomes, the quality of orthology assignments and functional annotations (80% coverage) has persisted without significant changes across this update. Finally, we improved eggNOG online services for fast functional annotation and orthology prediction of custom genomics or metagenomics datasets. All precomputed data are publicly available for downloading or via API queries at http://eggnog.embl.de.


Subject(s)
Conserved Sequence , Databases, Genetic , Evolution, Molecular , Phylogeny , Sequence Homology , Animals , Classification , Eukaryota/genetics , Gene Duplication , Gene Ontology , Genes, Viral , Genome , Humans , Molecular Sequence Annotation , Proteome , Sequence Alignment , Structure-Activity Relationship
15.
Mol Biol Evol ; 36(10): 2157-2164, 2019 10 01.
Article in English | MEDLINE | ID: mdl-31241141

ABSTRACT

Gene families evolve by the processes of speciation (creating orthologs), gene duplication (paralogs), and horizontal gene transfer (xenologs), in addition to sequence divergence and gene loss. Orthologs in particular play an essential role in comparative genomics and phylogenomic analyses. With the continued sequencing of organisms across the tree of life, the data are available to reconstruct the unique evolutionary histories of tens of thousands of gene families. Accurate reconstruction of these histories, however, is a challenging computational problem, and the focus of the Quest for Orthologs Consortium. We review the recent advances and outstanding challenges in this field, as revealed at a symposium and meeting held at the University of Southern California in 2017. Key advances have been made both at the level of orthology algorithm development and with respect to coordination across the community of algorithm developers and orthology end-users. Applications spanned a broad range, including gene function prediction, phylostratigraphy, genome evolution, and phylogenomics. The meetings highlighted the increasing use of meta-analyses integrating results from multiple different algorithms, and discussed ongoing challenges in orthology inference as well as the next steps toward improvement and integration of orthology resources.


Subject(s)
Evolution, Molecular , Genomics/trends , Multigene Family , Algorithms , Animals , Genomics/methods , Humans
16.
Gut ; 68(10): 1781-1790, 2019 10.
Article in English | MEDLINE | ID: mdl-30658995

ABSTRACT

OBJECTIVE: The composition of the healthy human adult gut microbiome is relatively stable over prolonged periods, and representatives of the most highly abundant and prevalent species have been cultured and described. However, microbial abundances can change on perturbations, such as antibiotics intake, enabling the identification and characterisation of otherwise low abundant species. DESIGN: Analysing gut microbial time-series data, we used shotgun metagenomics to create strain level taxonomic and functional profiles. Community dynamics were modelled postintervention with a focus on conditionally rare taxa and previously unknown bacteria. RESULTS: In response to a commonly prescribed cephalosporin (ceftriaxone), we observe a strong compositional shift in one subject, in which a previously unknown species, UBorkfalki ceftriaxensis, was identified, blooming to 92% relative abundance. The genome assembly reveals that this species (1) belongs to a so far undescribed order of Firmicutes, (2) is ubiquitously present at low abundances in at least one third of adults, (3) is opportunistically growing, being ecologically similar to typical probiotic species and (4) is stably associated to healthy hosts as determined by single nucleotide variation analysis. It was the first coloniser after the antibiotic intervention that led to a long-lasting microbial community shift and likely permanent loss of nine commensals. CONCLUSION: The bloom of UB. ceftriaxensis and a subsequent one of Parabacteroides distasonis demonstrate the existence of monodominance community states in the gut. Our study points to an undiscovered wealth of low abundant but common taxa in the human gut and calls for more highly resolved longitudinal studies, in particular on ecosystem perturbations.


Subject(s)
Anti-Bacterial Agents/pharmacology , Bacteria/genetics , Gastrointestinal Microbiome/drug effects , Metagenomics/methods , Microbiota/genetics , Bacteria/drug effects , Humans , Microbiota/drug effects
17.
Nat Methods ; 13(5): 425-30, 2016 05.
Article in English | MEDLINE | ID: mdl-27043882

ABSTRACT

Achieving high accuracy in orthology inference is essential for many comparative, evolutionary and functional genomic analyses, yet the true evolutionary history of genes is generally unknown and orthologs are used for very different applications across phyla, requiring different precision-recall trade-offs. As a result, it is difficult to assess the performance of orthology inference methods. Here, we present a community effort to establish standards and an automated web-based service to facilitate orthology benchmarking. Using this service, we characterize 15 well-established inference methods and resources on a battery of 20 different benchmarks. Standardized benchmarking provides a way for users to identify the most effective methods for the problem at hand, sets a minimum requirement for new tools and resources, and guides the development of more accurate orthology inference methods.


Subject(s)
Computational Biology/standards , Genomics/standards , Phylogeny , Proteomics/standards , Archaea/classification , Archaea/genetics , Bacteria/classification , Bacteria/genetics , Computational Biology/methods , Databases, Genetic , Eukaryota/classification , Eukaryota/genetics , Gene Ontology , Genomics/methods , Models, Genetic , Proteomics/methods , Sequence Analysis, Protein , Sequence Homology , Species Specificity
18.
Bioinformatics ; 34(2): 323-329, 2018 Jan 15.
Article in English | MEDLINE | ID: mdl-28968857

ABSTRACT

The Quest for Orthologs (QfO) is an open collaboration framework for experts in comparative phylogenomics and related research areas who have an interest in highly accurate orthology predictions and their applications. We here report highlights and discussion points from the QfO meeting 2015 held in Barcelona. Achievements in recent years have established a basis to support developments for improved orthology prediction and to explore new approaches. Central to the QfO effort is proper benchmarking of methods and services, as well as design of standardized datasets and standardized formats to allow sharing and comparison of results. Simultaneously, analysis pipelines have been improved, evaluated and adapted to handle large datasets. All this would not have occurred without the long-term collaboration of Consortium members. Meeting regularly to review and coordinate complementary activities from a broad spectrum of innovative researchers clearly benefits the community. Highlights of the meeting include addressing sources of and legitimacy of disagreements between orthology calls, the context dependency of orthology definitions, special challenges encountered when analyzing very anciently rooted orthologies, orthology in the light of whole-genome duplications, and the concept of orthologous versus paralogous relationships at different levels, including domain-level orthology. Furthermore, particular needs for different applications (e.g. plant genomics, ancient gene families and others) and the infrastructure for making orthology inferences available (e.g. interfaces with model organism databases) were discussed, with several ongoing efforts that are expected to be reported on during the upcoming 2017 QfO meeting.

19.
Nucleic Acids Res ; 45(D1): D529-D534, 2017 01 04.
Article in English | MEDLINE | ID: mdl-28053165

ABSTRACT

The availability of microbial genomes has opened many new avenues of research within microbiology. This has been driven primarily by comparative genomics approaches, which rely on accurate and consistent characterization of genomic sequences. It is nevertheless difficult to obtain consistent taxonomic and integrated functional annotations for defined prokaryotic clades. Thus, we developed proGenomes, a resource that provides user-friendly access to currently 25 038 high-quality genomes whose sequences and consistent annotations can be retrieved individually or by taxonomic clade. These genomes are assigned to 5306 consistent and accurate taxonomic species clusters based on previously established methodology. proGenomes also contains functional information for almost 80 million protein-coding genes, including a comprehensive set of general annotations and more focused annotations for carbohydrate-active enzymes and antibiotic resistance genes. Additionally, broad habitat information is provided for many genomes. All genomes and associated information can be downloaded by user-selected clade or multiple habitat-specific sets of representative genomes. We expect that the availability of high-quality genomes with comprehensive functional annotations will promote advances in clinical microbial genomics, functional evolution and other subfields of microbiology. proGenomes is available at http://progenomes.embl.de.


Subject(s)
Computational Biology/methods , DNA Barcoding, Taxonomic/methods , Genome , Genomics/methods , Prokaryotic Cells , Databases, Genetic , Molecular Sequence Annotation , Web Browser
20.
Mol Biol Evol ; 34(6): 1535-1542, 2017 06 01.
Article in English | MEDLINE | ID: mdl-28369572

ABSTRACT

Phylogenetic trees are routinely visualized to present and interpret the evolutionary relationships of species. Most empirical evolutionary data studies contain a visualization of the inferred tree with branch support values. Ambiguous semantics in tree file formats can lead to erroneous tree visualizations and therefore to incorrect interpretations of phylogenetic analyses. Here, we discuss problems that arise when displaying branch values on trees after rerooting. Branch values are typically stored as node labels in the widely-used Newick tree format. However, such values are attributes of branches. Storing them as node labels can therefore yield errors when rerooting trees. This depends on the mostly implicit semantics that tools deploy to interpret node labels. We reviewed ten tree viewers and ten bioinformatics toolkits that can display and reroot trees. We found that 14 out of 20 of these tools do not permit users to select the semantics of node labels. Thus, unaware users might obtain incorrect results when rooting trees. We illustrate such incorrect mappings for several test cases and real examples taken from the literature. This review has already led to improvements in eight tools. We suggest tools should provide options that explicitly force users to define the semantics of node labels.


Subject(s)
Algorithms , Computational Biology/methods , Biological Evolution , Databases, Genetic , Phylogeny , Software
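
Note on entry 20: the abstract argues that support values written as Newick node labels are really attributes of branches, so rerooting can attach them to the wrong split. The Python sketch below illustrates the effect on a toy tree, `(((A,B)90,C)80,D,E);`. It is an assumption-laden illustration, not code from the reviewed tools: the internal node names (`R`, `N1`, `N2`), the helper `split_supports`, and the tree itself are hypothetical.

```python
# Unrooted view of the toy Newick tree (((A,B)90,C)80,D,E);
# R is the original root, N2 = ((A,B),C), N1 = (A,B).
edges = [("R", "N2"), ("N2", "N1"), ("N2", "C"),
         ("N1", "A"), ("N1", "B"), ("R", "D"), ("R", "E")]

# "Node" semantics: the value stays glued to the node it was written on.
node_label = {"N1": 90, "N2": 80}
# "Branch" semantics: the value belongs to the branch above that node
# in the ORIGINAL rooting.
edge_support = {frozenset(("N2", "N1")): 90, frozenset(("R", "N2")): 80}

adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)
leaves = {n for n in adj if len(adj[n]) == 1}


def split_supports(root, semantics):
    """Map each internal split (named by its side without leaf E)
    to the support a viewer would display when rooted at `root`."""
    result = {}

    def visit(node, parent):
        if node in leaves:
            return frozenset([node])
        below = frozenset().union(
            *(visit(child, node) for child in adj[node] if child != parent))
        if parent is not None:
            if semantics == "node":
                value = node_label.get(node)
            else:
                value = edge_support.get(frozenset((parent, node)))
            side = below if "E" not in below else frozenset(leaves - below)
            result[side] = value
        return below

    visit(root, None)
    return result


for root in ("R", "N1"):
    for semantics in ("node", "branch"):
        shown = {"".join(sorted(k)): v
                 for k, v in split_supports(root, semantics).items()}
        print(root, semantics, shown)
```

With the original root `R`, both interpretations agree. After rerooting at `N1`, the branch interpretation still reports 90 for the split separating A and B from the rest, whereas the node interpretation shows 80 on that split and leaves the split separating A, B and C without a value.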