Results 1 - 20 of 67
1.
Cell ; 187(14): 3761-3778.e16, 2024 Jul 11.
Article in English | MEDLINE | ID: mdl-38843834

ABSTRACT

Novel antibiotics are urgently needed to combat the antibiotic-resistance crisis. We present a machine-learning-based approach to predict antimicrobial peptides (AMPs) within the global microbiome and leverage a vast dataset of 63,410 metagenomes and 87,920 prokaryotic genomes from environmental and host-associated habitats to create the AMPSphere, a comprehensive catalog comprising 863,498 non-redundant peptides, few of which match existing databases. AMPSphere provides insights into the evolutionary origins of peptides, including by duplication or gene truncation of longer sequences, and we observed that AMP production varies by habitat. To validate our predictions, we synthesized and tested 100 AMPs against clinically relevant drug-resistant pathogens and human gut commensals both in vitro and in vivo. A total of 79 peptides were active, with 63 targeting pathogens. These active AMPs exhibited antibacterial activity by disrupting bacterial membranes. In conclusion, our approach identified nearly one million prokaryotic AMP sequences, an open-access resource for antibiotic discovery.


Subjects
Antimicrobial Peptides , Machine Learning , Microbiota , Antimicrobial Peptides/pharmacology , Antimicrobial Peptides/chemistry , Antimicrobial Peptides/genetics , Humans , Animals , Anti-Bacterial Agents/pharmacology , Mice , Metagenome , Bacteria/drug effects , Bacteria/genetics , Gastrointestinal Microbiome/drug effects
2.
Cell ; 179(5): 1068-1083.e21, 2019 Nov 14.
Article in English | MEDLINE | ID: mdl-31730850

ABSTRACT

Ocean microbial communities strongly influence the biogeochemistry, food webs, and climate of our planet. Despite recent advances in understanding their taxonomic and genomic compositions, little is known about how their transcriptomes vary globally. Here, we present a dataset of 187 metatranscriptomes and 370 metagenomes from 126 globally distributed sampling stations and establish a resource of 47 million genes to study community-level transcriptomes across depth layers from pole-to-pole. We examine gene expression changes and community turnover as the underlying mechanisms shaping community transcriptomes along these axes of environmental variation and show how their individual contributions differ for multiple biogeochemically relevant processes. Furthermore, we find the relative contribution of gene expression changes to be significantly lower in polar than in non-polar waters and hypothesize that in polar regions, alterations in community activity in response to ocean warming will be driven more strongly by changes in organismal composition than by gene regulatory mechanisms.


Subjects
Gene Expression Regulation , Metagenome , Oceans and Seas , Transcriptome/genetics , Geography , Microbiota/genetics , Molecular Sequence Annotation , RNA, Messenger/genetics , RNA, Messenger/metabolism , Seawater/microbiology , Temperature
3.
Nature ; 626(7998): 377-384, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38109938

ABSTRACT

Many of the Earth's microbes remain uncultured and understudied, limiting our understanding of the functional and evolutionary aspects of their genetic material, which remain largely overlooked in most metagenomic studies [1]. Here we analysed 149,842 environmental genomes from multiple habitats [2-6] and compiled a curated catalogue of 404,085 functionally and evolutionarily significant novel (FESNov) gene families exclusive to uncultivated prokaryotic taxa. All FESNov families span multiple species, exhibit strong signals of purifying selection and qualify as new orthologous groups, thus nearly tripling the number of bacterial and archaeal gene families described to date. The FESNov catalogue is enriched in clade-specific traits, including 1,034 novel families that can distinguish entire uncultivated phyla, classes and orders, probably representing synapomorphies that facilitated their evolutionary divergence. Using genomic context analysis and structural alignments we predicted functional associations for 32.4% of FESNov families, including 4,349 high-confidence associations with important biological processes. These predictions provide a valuable hypothesis-driven framework that we used for experimental validation of a new gene family involved in cell motility and a novel set of antimicrobial peptides. We also demonstrate that the relative abundance profiles of novel families can discriminate between environments and clinical conditions, leading to the discovery of potentially new biomarkers associated with colorectal cancer. We expect this work to enhance future metagenomics studies and expand our knowledge of the genetic repertory of uncultivated organisms.


Subjects
Archaea , Bacteria , Ecosystem , Evolution, Molecular , Genes, Archaeal , Genes, Bacterial , Genomics , Knowledge , Antimicrobial Peptides/genetics , Archaea/classification , Archaea/genetics , Bacteria/classification , Bacteria/genetics , Biomarkers , Cell Movement/genetics , Colorectal Neoplasms/genetics , Genomics/methods , Genomics/trends , Metagenomics/trends , Multigene Family , Phylogeny , Reproducibility of Results
4.
Nature ; 601(7892): 252-256, 2022 Jan.
Article in English | MEDLINE | ID: mdl-34912116

ABSTRACT

Microbial genes encode the majority of the functional repertoire of life on earth. However, despite increasing efforts in metagenomic sequencing of various habitats [1-3], little is known about the distribution of genes across the global biosphere, with implications for human and planetary health. Here we constructed a non-redundant gene catalogue of 303 million species-level genes (clustered at 95% nucleotide identity) from 13,174 publicly available metagenomes across 14 major habitats and use it to show that most genes are specific to a single habitat. The small fraction of genes found in multiple habitats is enriched in antibiotic-resistance genes and markers for mobile genetic elements. By further clustering these species-level genes into 32 million protein families, we observed that a small fraction of these families contain the majority of the genes (0.6% of families account for 50% of the genes). The majority of species-level genes and protein families are rare. Furthermore, species-level genes, and in particular the rare ones, show low rates of positive (adaptive) selection, supporting a model in which most genetic variability observed within each protein family is neutral or nearly neutral.


Subjects
Metagenome , Metagenomics , Anti-Bacterial Agents/pharmacology , Drug Resistance, Microbial , Ecosystem , Humans , Metagenome/genetics
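
The skew reported in record 4 (0.6% of protein families accounting for 50% of the genes) is a statement about a cumulative size distribution. The sketch below shows one way such a figure could be computed from a list of family sizes. It is an illustration only: the function name and the toy sizes are assumptions, not the authors' code or data.

```python
def fraction_of_families_for_half(family_sizes):
    """Return the fraction of families (largest first) needed to cover
    at least half of all genes.

    family_sizes: iterable of gene counts, one per family (hypothetical input).
    """
    sizes = sorted(family_sizes, reverse=True)  # largest families first
    total = sum(sizes)
    covered = 0
    for n_families, size in enumerate(sizes, start=1):
        covered += size
        if 2 * covered >= total:  # integer test for "at least 50%"
            return n_families / len(sizes)
    raise ValueError("family_sizes must contain at least one family")


# Toy data (invented): one large family and many small ones.
toy_sizes = [50, 20, 10, 5, 5, 4, 3, 1, 1, 1]
toy_fraction = fraction_of_families_for_half(toy_sizes)
print(f"{toy_fraction:.1%} of families hold half of the genes")
```

In the toy data a single family out of ten already holds half of the genes, the same kind of concentration the abstract describes at a much larger scale.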
5.
Nucleic Acids Res ; 51(D1): D389-D394, 2023 Jan 06.
Article in English | MEDLINE | ID: mdl-36399505

ABSTRACT

The eggNOG (evolutionary gene genealogy Non-supervised Orthologous Groups) database is a bioinformatics resource providing orthology data and comprehensive functional information for organisms from all domains of life. Here, we present a major update of the database and website (version 6.0), which increases the number of covered organisms to 12 535 reference species, expands functional annotations, and implements new functionality. In total, eggNOG 6.0 provides a hierarchy of over 17M orthologous groups (OGs) computed at 1601 taxonomic levels, spanning 10 756 bacterial, 457 archaeal and 1322 eukaryotic organisms. OGs have been thoroughly annotated using recent knowledge from functional databases, including KEGG, Gene Ontology, UniProtKB, BiGG, CAZy, CARD, PFAM and SMART. eggNOG also offers phylogenetic trees for all OGs, maximising utility and versatility for end users while allowing researchers to investigate the evolutionary history of speciation and duplication events as well as the phylogenetic distribution of functional terms within each OG. Furthermore, the eggNOG 6.0 website contains new functionality to mine orthology and functional data with ease, including the possibility of generating phylogenetic profiles for multiple OGs across species or identifying single-copy OGs at custom taxonomic levels. eggNOG 6.0 is available at http://eggnog6.embl.de.


Subjects
Databases, Genetic , Genomics , Phylogeny , Computational Biology , Eukaryota/genetics
6.
Nucleic Acids Res ; 51(D1): D760-D766, 2023 Jan 06.
Article in English | MEDLINE | ID: mdl-36408900

ABSTRACT

The interpretation of genomic, transcriptomic and other microbial 'omics data is highly dependent on the availability of well-annotated genomes. As the number of publicly available microbial genomes continues to increase exponentially, the need for quality control and consistent annotation is becoming critical. We present proGenomes3, a database of 907 388 high-quality genomes containing 4 billion genes that passed stringent criteria and have been consistently annotated using multiple functional and taxonomic databases including mobile genetic elements and biosynthetic gene clusters. proGenomes3 encompasses 41 171 species-level clusters, defined based on universal single copy marker genes, for which pan-genomes and contextual habitat annotations are provided. The database is available at http://progenomes.embl.de/.


Subjects
Genome , Prokaryotic Cells , Databases, Genetic , Genomics , Molecular Sequence Annotation , Bacteria/classification , Bacteria/genetics
7.
Nature ; 560(7717): 233-237, 2018 Aug.
Article in English | MEDLINE | ID: mdl-30069051

ABSTRACT

Soils harbour some of the most diverse microbiomes on Earth and are essential for both nutrient cycling and carbon storage. To understand soil functioning, it is necessary to model the global distribution patterns and functional gene repertoires of soil microorganisms, as well as the biotic and environmental associations between the diversity and structure of both bacterial and fungal soil communities [1-4]. Here we show, by leveraging metagenomics and metabarcoding of global topsoil samples (189 sites, 7,560 subsamples), that bacterial, but not fungal, genetic diversity is highest in temperate habitats and that microbial gene composition varies more strongly with environmental variables than with geographic distance. We demonstrate that fungi and bacteria show global niche differentiation that is associated with contrasting diversity responses to precipitation and soil pH. Furthermore, we provide evidence for strong bacterial-fungal antagonism, inferred from antibiotic-resistance genes, in topsoil and ocean habitats, indicating the substantial role of biotic interactions in shaping microbial communities. Our results suggest that both competition and environmental filtering affect the abundance, composition and encoded gene functions of bacterial and fungal communities, indicating that the relative contributions of these microorganisms to global nutrient cycling varies spatially.


Subjects
Bacteria/isolation & purification , Biodiversity , Earth, Planet , Fungi/isolation & purification , Microbiota/physiology , Soil Microbiology , Bacteria/genetics , DNA Barcoding, Taxonomic , Drug Resistance, Microbial/genetics , Fungi/genetics , Hydrogen-Ion Concentration , Metagenomics , Microbiota/genetics , Oceans and Seas , Rain , Seawater/microbiology
8.
Nucleic Acids Res ; 50(W1): W352-W357, 2022 Jul 05.
Article in English | MEDLINE | ID: mdl-35639770

ABSTRACT

Synteny conservation analysis is a well-established methodology to investigate the potential functional role of unknown prokaryotic genes. However, bioinformatic tools to reconstruct and visualise genomic contexts usually depend on slow computations, are restricted to narrow taxonomic ranges, and/or do not allow for the functional and interactive exploration of neighbouring genes across different species. Here, we present GeCoViz, an online resource built upon 12 221 reference prokaryotic genomes that provides fast and interactive visualisation of custom genomic regions anchored by any target gene, which can be sought by either name, orthologous group (KEGGs, eggNOGs), protein domain (PFAM) or sequence. To facilitate functional and evolutionary interpretation, GeCoViz allows users to customise the taxonomic scope of each analysis and provides comprehensive annotations of the neighbouring genes. Interactive visualisation options include, among others, the scaled representations of gene lengths and genomic distances, and on-the-fly calculation of synteny conservation of neighbouring genes, which can be highlighted based on custom thresholds. The resulting plots can be downloaded as high-quality images for publishing purposes. Overall, GeCoViz offers an easy-to-use, comprehensive, fast and interactive web-based tool for investigating the genomic context of prokaryotic genes, and is freely available at https://gecoviz.cgmlab.org.


Subjects
Data Visualization , Evolution, Molecular , Genomics , Prokaryotic Cells , Software , Genomics/methods , Prokaryotic Cells/metabolism , Genes, Bacterial/genetics , Genome, Bacterial/genetics , Internet
9.
Nucleic Acids Res ; 50(W1): W577-W582, 2022 Jul 05.
Article in English | MEDLINE | ID: mdl-35544233

ABSTRACT

Phylogenomics data have grown exponentially over the last decades. It is currently common for genome-wide projects to generate hundreds or even thousands of phylogenetic trees and multiple sequence alignments, which may also be very large in size. However, the analysis and interpretation of such data still depends on custom bioinformatic and visualisation workflows that are largely unattainable for non-expert users. Here, we present PhyloCloud, an online platform aimed at hosting, indexing and exploring large phylogenetic tree collections, providing also seamless access to common analyses and operations, such as node annotation, searching, topology editing, automatic tree rooting, orthology detection and more. In addition, PhyloCloud provides quick access to tools that allow users to build their own phylogenies using fast predefined workflows, graphically compare tree topologies, or query taxonomic databases such as NCBI or GTDB. Finally, PhyloCloud offers a novel tree visualisation system based on ETE Toolkit v4.0, which can be used to explore very large trees and enhance them with custom annotations and multiple sequence alignments. The platform allows for sharing tree collections and specific tree views via private links, or making them fully public, serving also as a repository of phylogenomic data. PhyloCloud is available at https://phylocloud.cgmlab.org.


Subjects
Computational Biology , Genome , Phylogeny , Sequence Alignment , Databases, Genetic
10.
Mol Biol Evol ; 38(12): 5825-5829, 2021 Dec 09.
Article in English | MEDLINE | ID: mdl-34597405

ABSTRACT

Even though automated functional annotation of genes represents a fundamental step in most genomic and metagenomic workflows, it remains challenging at large scales. Here, we describe a major upgrade to eggNOG-mapper, a tool for functional annotation based on precomputed orthology assignments, now optimized for vast (meta)genomic data sets. Improvements in version 2 include a full update of both the genomes and functional databases to those from eggNOG v5, as well as several efficiency enhancements and new features. Most notably, eggNOG-mapper v2 now allows for: 1) de novo gene prediction from raw contigs, 2) built-in pairwise orthology prediction, 3) fast protein domain discovery, and 4) automated GFF decoration. eggNOG-mapper v2 is available as a standalone tool or as an online service at http://eggnog-mapper.embl.de.


Subjects
Databases, Genetic , Metagenomics , Genomics , Metagenome , Molecular Sequence Annotation , Phylogeny , Software
11.
Nucleic Acids Res ; 48(D1): D621-D625, 2020 Jan 08.
Article in English | MEDLINE | ID: mdl-31647096

ABSTRACT

Microbiology depends on the availability of annotated microbial genomes for many applications. Comparative genomics approaches have been a major advance, but consistent and accurate annotations of genomes can be hard to obtain. In addition, newer concepts such as the pan-genome concept are still being implemented to help answer biological questions. Hence, we present proGenomes2, which provides 87 920 high-quality genomes in a user-friendly and interactive manner. Genome sequences and annotations can be retrieved individually or by taxonomic clade. Every genome in the database has been assigned to a species cluster and most genomes could be accurately assigned to one or multiple habitats. In addition, general functional annotations and specific annotations of antibiotic resistance genes and single nucleotide variants are provided. In short, proGenomes2 provides threefold more genomes, enhanced habitat annotations, updated taxonomic and functional annotation and improved linkage to the NCBI BioSample database. The database is available at http://progenomes.embl.de/.


Subjects
Databases, Genetic , Genome, Archaeal , Genome, Bacterial , Genomics , Computational Biology/methods , Ecosystem , Internet , Molecular Sequence Annotation , Polymorphism, Single Nucleotide , Prokaryotic Cells , Reproducibility of Results , Software
12.
Nucleic Acids Res ; 48(W1): W538-W545, 2020 Jul 02.
Article in English | MEDLINE | ID: mdl-32374845

ABSTRACT

The identification of orthologs (genes in different species which descended from the same gene in their last common ancestor) is a prerequisite for many analyses in comparative genomics and molecular evolution. Numerous algorithms and resources have been conceived to address this problem, but benchmarking and interpreting them is fraught with difficulties (need to compare them on a common input dataset, absence of ground truth, computational cost of calling orthologs). To address this, the Quest for Orthologs consortium maintains a reference set of proteomes and provides a web server for continuous orthology benchmarking (http://orthology.benchmarkservice.org). Furthermore, consensus ortholog calls derived from public benchmark submissions are provided on the Alliance of Genome Resources website, the joint portal of NIH-funded model organism databases.


Subjects
Multigene Family , Proteome , Software , Animals , Benchmarking , Consensus , Genomics , Humans , Mice , Phylogeny , Rats
13.
Nucleic Acids Res ; 47(D1): D607-D613, 2019 Jan 08.
Article in English | MEDLINE | ID: mdl-30476243

ABSTRACT

Proteins and their functional interactions form the backbone of the cellular machinery. Their connectivity network needs to be considered for the full understanding of biological phenomena, but the available information on protein-protein associations is incomplete and exhibits varying levels of annotation granularity and reliability. The STRING database aims to collect, score and integrate all publicly available sources of protein-protein interaction information, and to complement these with computational predictions. Its goal is to achieve a comprehensive and objective global network, including direct (physical) as well as indirect (functional) interactions. The latest version of STRING (11.0) more than doubles the number of organisms it covers, to 5090. The most important new feature is an option to upload entire, genome-wide datasets as input, allowing users to visualize subsets as interaction networks and to perform gene-set enrichment analysis on the entire input. For the enrichment analysis, STRING implements well-known classification systems such as Gene Ontology and KEGG, but also offers additional, new classification systems based on high-throughput text-mining as well as on a hierarchical clustering of the association network itself. The STRING resource is available online at https://string-db.org/.


Subjects
Genomics/methods , Protein Interaction Mapping/methods , Software , Animals , Databases, Genetic , Gene Ontology , Humans
14.
Nucleic Acids Res ; 47(D1): D309-D314, 2019 Jan 08.
Article in English | MEDLINE | ID: mdl-30418610

ABSTRACT

eggNOG is a public database of orthology relationships, gene evolutionary histories and functional annotations. Here, we present version 5.0, featuring a major update of the underlying genome sets, which have been expanded to 4445 representative bacteria and 168 archaea derived from 25 038 genomes, as well as 477 eukaryotic organisms and 2502 viral proteomes that were selected for diversity and filtered by genome quality. In total, 4.4M orthologous groups (OGs) distributed across 379 taxonomic levels were computed together with their associated sequence alignments, phylogenies, HMM models and functional descriptors. Precomputed evolutionary analysis provides fine-grained resolution of duplication/speciation events within each OG. Our benchmarks show that, despite doubling the number of genomes, the quality of orthology assignments and functional annotations (80% coverage) has persisted without significant changes across this update. Finally, we improved eggNOG online services for fast functional annotation and orthology prediction of custom genomics or metagenomics datasets. All precomputed data are publicly available for downloading or via API queries at http://eggnog.embl.de.


Subjects
Conserved Sequence , Databases, Genetic , Evolution, Molecular , Phylogeny , Sequence Homology , Animals , Classification , Eukaryota/genetics , Gene Duplication , Gene Ontology , Genes, Viral , Genome , Humans , Molecular Sequence Annotation , Proteome , Sequence Alignment , Structure-Activity Relationship
15.
Mol Biol Evol ; 36(10): 2157-2164, 2019 Oct 01.
Article in English | MEDLINE | ID: mdl-31241141

ABSTRACT

Gene families evolve by the processes of speciation (creating orthologs), gene duplication (paralogs), and horizontal gene transfer (xenologs), in addition to sequence divergence and gene loss. Orthologs in particular play an essential role in comparative genomics and phylogenomic analyses. With the continued sequencing of organisms across the tree of life, the data are available to reconstruct the unique evolutionary histories of tens of thousands of gene families. Accurate reconstruction of these histories, however, is a challenging computational problem, and the focus of the Quest for Orthologs Consortium. We review the recent advances and outstanding challenges in this field, as revealed at a symposium and meeting held at the University of Southern California in 2017. Key advances have been made both at the level of orthology algorithm development and with respect to coordination across the community of algorithm developers and orthology end-users. Applications spanned a broad range, including gene function prediction, phylostratigraphy, genome evolution, and phylogenomics. The meetings highlighted the increasing use of meta-analyses integrating results from multiple different algorithms, and discussed ongoing challenges in orthology inference as well as the next steps toward improvement and integration of orthology resources.


Subjects
Evolution, Molecular , Genomics/trends , Multigene Family , Algorithms , Animals , Genomics/methods , Humans
16.
Gut ; 68(10): 1781-1790, 2019 Oct.
Article in English | MEDLINE | ID: mdl-30658995

ABSTRACT

OBJECTIVE: The composition of the healthy human adult gut microbiome is relatively stable over prolonged periods, and representatives of the most highly abundant and prevalent species have been cultured and described. However, microbial abundances can change on perturbations, such as antibiotics intake, enabling the identification and characterisation of otherwise low abundant species. DESIGN: Analysing gut microbial time-series data, we used shotgun metagenomics to create strain level taxonomic and functional profiles. Community dynamics were modelled postintervention with a focus on conditionally rare taxa and previously unknown bacteria. RESULTS: In response to a commonly prescribed cephalosporin (ceftriaxone), we observe a strong compositional shift in one subject, in which a previously unknown species, UBorkfalki ceftriaxensis, was identified, blooming to 92% relative abundance. The genome assembly reveals that this species (1) belongs to a so far undescribed order of Firmicutes, (2) is ubiquitously present at low abundances in at least one third of adults, (3) is opportunistically growing, being ecologically similar to typical probiotic species and (4) is stably associated to healthy hosts as determined by single nucleotide variation analysis. It was the first coloniser after the antibiotic intervention that led to a long-lasting microbial community shift and likely permanent loss of nine commensals. CONCLUSION: The bloom of UB. ceftriaxensis and a subsequent one of Parabacteroides distasonis demonstrate the existence of monodominance community states in the gut. Our study points to an undiscovered wealth of low abundant but common taxa in the human gut and calls for more highly resolved longitudinal studies, in particular on ecosystem perturbations.


Subjects
Anti-Bacterial Agents/pharmacology , Bacteria/genetics , Gastrointestinal Microbiome/drug effects , Metagenomics/methods , Microbiota/genetics , Bacteria/drug effects , Humans , Microbiota/drug effects
17.
Nat Methods ; 13(5): 425-30, 2016 May.
Article in English | MEDLINE | ID: mdl-27043882

ABSTRACT

Achieving high accuracy in orthology inference is essential for many comparative, evolutionary and functional genomic analyses, yet the true evolutionary history of genes is generally unknown and orthologs are used for very different applications across phyla, requiring different precision-recall trade-offs. As a result, it is difficult to assess the performance of orthology inference methods. Here, we present a community effort to establish standards and an automated web-based service to facilitate orthology benchmarking. Using this service, we characterize 15 well-established inference methods and resources on a battery of 20 different benchmarks. Standardized benchmarking provides a way for users to identify the most effective methods for the problem at hand, sets a minimum requirement for new tools and resources, and guides the development of more accurate orthology inference methods.


Subjects
Computational Biology/standards , Genomics/standards , Phylogeny , Proteomics/standards , Archaea/classification , Archaea/genetics , Bacteria/classification , Bacteria/genetics , Computational Biology/methods , Databases, Genetic , Eukaryota/classification , Eukaryota/genetics , Gene Ontology , Genomics/methods , Models, Genetic , Proteomics/methods , Sequence Analysis, Protein , Sequence Homology , Species Specificity
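
Record 17 notes that different applications need different precision-recall trade-offs. As a minimal sketch of what that means for pairwise ortholog calls, the code below scores two hypothetical methods against a hypothetical reference set. The gene identifiers, the method names and the reference pairs are all invented for illustration, and this is not how the benchmarking service itself is implemented.

```python
def precision_recall(predicted_pairs, reference_pairs):
    """Score predicted ortholog pairs against a reference set.

    Pairs are treated as unordered, so ("a", "b") equals ("b", "a").
    """
    predicted = {frozenset(p) for p in predicted_pairs}
    reference = {frozenset(p) for p in reference_pairs}
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical reference ortholog pairs between two species (h = human, m = mouse).
reference = [("h1", "m1"), ("h2", "m2"), ("h3", "m3"), ("h4", "m4")]

# A conservative method: few calls, all of them correct.
strict_calls = [("h1", "m1"), ("m2", "h2")]
# A permissive method: recovers every reference pair but adds a wrong one.
permissive_calls = reference + [("h1", "m4")]

strict_score = precision_recall(strict_calls, reference)
permissive_score = precision_recall(permissive_calls, reference)
print("strict     (precision, recall):", strict_score)
print("permissive (precision, recall):", permissive_score)
```

Neither toy method is better in absolute terms: the strict one suits applications that cannot tolerate false orthologs, while the permissive one suits those that need completeness.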
18.
Bioinformatics ; 34(2): 323-329, 2018 Jan 15.
Article in English | MEDLINE | ID: mdl-28968857

ABSTRACT

The Quest for Orthologs (QfO) is an open collaboration framework for experts in comparative phylogenomics and related research areas who have an interest in highly accurate orthology predictions and their applications. We here report highlights and discussion points from the QfO meeting 2015 held in Barcelona. Achievements in recent years have established a basis to support developments for improved orthology prediction and to explore new approaches. Central to the QfO effort is proper benchmarking of methods and services, as well as design of standardized datasets and standardized formats to allow sharing and comparison of results. Simultaneously, analysis pipelines have been improved, evaluated and adapted to handle large datasets. All this would not have occurred without the long-term collaboration of Consortium members. Meeting regularly to review and coordinate complementary activities from a broad spectrum of innovative researchers clearly benefits the community. Highlights of the meeting include addressing sources of and legitimacy of disagreements between orthology calls, the context dependency of orthology definitions, special challenges encountered when analyzing very anciently rooted orthologies, orthology in the light of whole-genome duplications, and the concept of orthologous versus paralogous relationships at different levels, including domain-level orthology. Furthermore, particular needs for different applications (e.g. plant genomics, ancient gene families and others) and the infrastructure for making orthology inferences available (e.g. interfaces with model organism databases) were discussed, with several ongoing efforts that are expected to be reported on during the upcoming 2017 QfO meeting.

19.
Nucleic Acids Res ; 45(D1): D529-D534, 2017 Jan 04.
Article in English | MEDLINE | ID: mdl-28053165

ABSTRACT

The availability of microbial genomes has opened many new avenues of research within microbiology. This has been driven primarily by comparative genomics approaches, which rely on accurate and consistent characterization of genomic sequences. It is nevertheless difficult to obtain consistent taxonomic and integrated functional annotations for defined prokaryotic clades. Thus, we developed proGenomes, a resource that provides user-friendly access to currently 25 038 high-quality genomes whose sequences and consistent annotations can be retrieved individually or by taxonomic clade. These genomes are assigned to 5306 consistent and accurate taxonomic species clusters based on previously established methodology. proGenomes also contains functional information for almost 80 million protein-coding genes, including a comprehensive set of general annotations and more focused annotations for carbohydrate-active enzymes and antibiotic resistance genes. Additionally, broad habitat information is provided for many genomes. All genomes and associated information can be downloaded by user-selected clade or multiple habitat-specific sets of representative genomes. We expect that the availability of high-quality genomes with comprehensive functional annotations will promote advances in clinical microbial genomics, functional evolution and other subfields of microbiology. proGenomes is available at http://progenomes.embl.de.


Subjects
Computational Biology/methods , DNA Barcoding, Taxonomic/methods , Genome , Genomics/methods , Prokaryotic Cells , Databases, Genetic , Molecular Sequence Annotation , Web Browser
20.
Mol Biol Evol ; 34(6): 1535-1542, 2017 Jun 01.
Article in English | MEDLINE | ID: mdl-28369572

ABSTRACT

Phylogenetic trees are routinely visualized to present and interpret the evolutionary relationships of species. Most empirical evolutionary data studies contain a visualization of the inferred tree with branch support values. Ambiguous semantics in tree file formats can lead to erroneous tree visualizations and therefore to incorrect interpretations of phylogenetic analyses. Here, we discuss problems that arise when displaying branch values on trees after rerooting. Branch values are typically stored as node labels in the widely-used Newick tree format. However, such values are attributes of branches. Storing them as node labels can therefore yield errors when rerooting trees. This depends on the mostly implicit semantics that tools deploy to interpret node labels. We reviewed ten tree viewers and ten bioinformatics toolkits that can display and reroot trees. We found that 14 out of 20 of these tools do not permit users to select the semantics of node labels. Thus, unaware users might obtain incorrect results when rooting trees. We illustrate such incorrect mappings for several test cases and real examples taken from the literature. This review has already led to improvements in eight tools. We suggest tools should provide options that explicitly force users to define the semantics of node labels.


Subjects
Algorithms , Computational Biology/methods , Biological Evolution , Databases, Genetic , Phylogeny , Software
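
The rerooting problem described in record 20 can be reproduced on a five-leaf tree. The sketch below uses a small hand-written `Node` class (an assumption made for illustration, not any tool reviewed in the paper) and the Newick tree `(((A,B)90,C)70,D,E);`. It reroots on the `(A,B)` node twice: once leaving support values attached to nodes, and once moving each value along with its branch.

```python
class Node:
    """Minimal tree node; `support` is the value written as the node label."""

    def __init__(self, name=None, support=None, children=()):
        self.name = name
        self.support = support
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def leaves(node):
    """Set of leaf names below (and including) a node."""
    if not node.children:
        return frozenset([node.name])
    return frozenset().union(*(leaves(c) for c in node.children))


def splits(root):
    """Read each node label as the support of the branch to its parent and
    return {bipartition: support}; a bipartition is a pair of leaf sets."""
    everything = leaves(root)
    found = {}
    stack = list(root.children)
    while stack:
        node = stack.pop()
        stack.extend(node.children)
        if node.children and node.support is not None:
            below = leaves(node)
            found[frozenset([below, everything - below])] = node.support
    return found


def reroot(new_root, move_branch_values):
    """Make `new_root` the root by reversing parent links on the path to the
    old root. If `move_branch_values` is true, each support value follows its
    branch; otherwise it stays on the node where Newick stored it."""
    path = [new_root]
    while path[-1].parent is not None:
        path.append(path[-1].parent)
    edges = list(zip(path, path[1:]))  # (child, parent) pairs, new root first
    if move_branch_values:
        for child, parent in reversed(edges):
            parent.support = child.support  # value moves to the new child end
        new_root.support = None
    for child, parent in edges:
        parent.children.remove(child)
        child.children.append(parent)
        parent.parent = child
    new_root.parent = None
    return new_root


def build():
    """Build (((A,B)90,C)70,D,E); and return (root, the (A,B) node)."""
    ab = Node(support=90, children=[Node("A"), Node("B")])
    abc = Node(support=70, children=[ab, Node("C")])
    root = Node(children=[abc, Node("D"), Node("E")])
    return root, ab


original = splits(build()[0])
naive = splits(reroot(build()[1], move_branch_values=False))
correct = splits(reroot(build()[1], move_branch_values=True))

ab_split = frozenset([frozenset("AB"), frozenset("CDE")])
print("original support for AB|CDE:", original[ab_split])
print("after naive rerooting:      ", naive[ab_split])
print("after branch-aware rerooting:", correct[ab_split])
```

With node-attached labels the AB|CDE branch ends up carrying the value that belonged to ABC|DE, and the ABC|DE branch loses its value entirely, which is the kind of silent mislabelling the abstract warns about.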