Results 1 - 8 of 8
1.
Nature ; 626(7998): 377-384, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38109938

ABSTRACT

Many of the Earth's microbes remain uncultured and understudied, limiting our understanding of the functional and evolutionary aspects of their genetic material, which remain largely overlooked in most metagenomic studies [1]. Here we analysed 149,842 environmental genomes from multiple habitats [2-6] and compiled a curated catalogue of 404,085 functionally and evolutionarily significant novel (FESNov) gene families exclusive to uncultivated prokaryotic taxa. All FESNov families span multiple species, exhibit strong signals of purifying selection and qualify as new orthologous groups, thus nearly tripling the number of bacterial and archaeal gene families described to date. The FESNov catalogue is enriched in clade-specific traits, including 1,034 novel families that can distinguish entire uncultivated phyla, classes and orders, probably representing synapomorphies that facilitated their evolutionary divergence. Using genomic context analysis and structural alignments we predicted functional associations for 32.4% of FESNov families, including 4,349 high-confidence associations with important biological processes. These predictions provide a valuable hypothesis-driven framework that we used for experimental validation of a new gene family involved in cell motility and a novel set of antimicrobial peptides. We also demonstrate that the relative abundance profiles of novel families can discriminate between environments and clinical conditions, leading to the discovery of potentially new biomarkers associated with colorectal cancer. We expect this work to enhance future metagenomics studies and expand our knowledge of the genetic repertory of uncultivated organisms.


Subject(s)
Archaea , Bacteria , Ecosystem , Evolution, Molecular , Genes, Archaeal , Genes, Bacterial , Genomics , Knowledge , Antimicrobial Peptides/genetics , Archaea/classification , Archaea/genetics , Bacteria/classification , Bacteria/genetics , Biomarkers , Cell Movement/genetics , Colorectal Neoplasms/genetics , Genomics/methods , Genomics/trends , Metagenomics/trends , Multigene Family , Phylogeny , Reproducibility of Results
2.
Nucleic Acids Res ; 51(D1): D389-D394, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36399505

ABSTRACT

The eggNOG (evolutionary gene genealogy Non-supervised Orthologous Groups) database is a bioinformatics resource providing orthology data and comprehensive functional information for organisms from all domains of life. Here, we present a major update of the database and website (version 6.0), which increases the number of covered organisms to 12 535 reference species, expands functional annotations, and implements new functionality. In total, eggNOG 6.0 provides a hierarchy of over 17M orthologous groups (OGs) computed at 1601 taxonomic levels, spanning 10 756 bacterial, 457 archaeal and 1322 eukaryotic organisms. OGs have been thoroughly annotated using recent knowledge from functional databases, including KEGG, Gene Ontology, UniProtKB, BiGG, CAZy, CARD, PFAM and SMART. eggNOG also offers phylogenetic trees for all OGs, maximising utility and versatility for end users while allowing researchers to investigate the evolutionary history of speciation and duplication events as well as the phylogenetic distribution of functional terms within each OG. Furthermore, the eggNOG 6.0 website contains new functionality to mine orthology and functional data with ease, including the possibility of generating phylogenetic profiles for multiple OGs across species or identifying single-copy OGs at custom taxonomic levels. eggNOG 6.0 is available at http://eggnog6.embl.de.


Subject(s)
Databases, Genetic , Genomics , Phylogeny , Computational Biology , Eukaryota/genetics
3.
Nucleic Acids Res ; 50(W1): W577-W582, 2022 07 05.
Article in English | MEDLINE | ID: mdl-35544233

ABSTRACT

Phylogenomics data have grown exponentially over the last decades. It is currently common for genome-wide projects to generate hundreds or even thousands of phylogenetic trees and multiple sequence alignments, which may also be very large in size. However, the analysis and interpretation of such data still depends on custom bioinformatic and visualisation workflows that are largely unattainable for non-expert users. Here, we present PhyloCloud, an online platform aimed at hosting, indexing and exploring large phylogenetic tree collections, providing also seamless access to common analyses and operations, such as node annotation, searching, topology editing, automatic tree rooting, orthology detection and more. In addition, PhyloCloud provides quick access to tools that allow users to build their own phylogenies using fast predefined workflows, graphically compare tree topologies, or query taxonomic databases such as NCBI or GTDB. Finally, PhyloCloud offers a novel tree visualisation system based on ETE Toolkit v4.0, which can be used to explore very large trees and enhance them with custom annotations and multiple sequence alignments. The platform allows for sharing tree collections and specific tree views via private links, or making them fully public, serving also as a repository of phylogenomic data. PhyloCloud is available at https://phylocloud.cgmlab.org.


Subject(s)
Computational Biology , Genome , Phylogeny , Sequence Alignment , Databases, Genetic
4.
Mol Biol Evol ; 38(12): 5825-5829, 2021 12 09.
Article in English | MEDLINE | ID: mdl-34597405

ABSTRACT

Even though automated functional annotation of genes represents a fundamental step in most genomic and metagenomic workflows, it remains challenging at large scales. Here, we describe a major upgrade to eggNOG-mapper, a tool for functional annotation based on precomputed orthology assignments, now optimized for vast (meta)genomic data sets. Improvements in version 2 include a full update of both the genomes and functional databases to those from eggNOG v5, as well as several efficiency enhancements and new features. Most notably, eggNOG-mapper v2 now allows for: 1) de novo gene prediction from raw contigs, 2) built-in pairwise orthology prediction, 3) fast protein domain discovery, and 4) automated GFF decoration. eggNOG-mapper v2 is available as a standalone tool or as an online service at http://eggnog-mapper.embl.de.


Subject(s)
Databases, Genetic , Metagenomics , Genomics , Metagenome , Molecular Sequence Annotation , Phylogeny , Software
5.
Nucleic Acids Res ; 48(D1): D621-D625, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31647096

ABSTRACT

Microbiology depends on the availability of annotated microbial genomes for many applications. Comparative genomics approaches have been a major advance, but consistent and accurate annotations of genomes can be hard to obtain. In addition, newer concepts such as the pan-genome concept are still being implemented to help answer biological questions. Hence, we present proGenomes2, which provides 87 920 high-quality genomes in a user-friendly and interactive manner. Genome sequences and annotations can be retrieved individually or by taxonomic clade. Every genome in the database has been assigned to a species cluster and most genomes could be accurately assigned to one or multiple habitats. In addition, general functional annotations and specific annotations of antibiotic resistance genes and single nucleotide variants are provided. In short, proGenomes2 provides threefold more genomes, enhanced habitat annotations, updated taxonomic and functional annotation and improved linkage to the NCBI BioSample database. The database is available at http://progenomes.embl.de/.


Subject(s)
Databases, Genetic , Genome, Archaeal , Genome, Bacterial , Genomics , Computational Biology/methods , Ecosystem , Internet , Molecular Sequence Annotation , Polymorphism, Single Nucleotide , Prokaryotic Cells , Reproducibility of Results , Software
6.
Nucleic Acids Res ; 48(W1): W538-W545, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32374845

ABSTRACT

The identification of orthologs (genes in different species which descended from the same gene in their last common ancestor) is a prerequisite for many analyses in comparative genomics and molecular evolution. Numerous algorithms and resources have been conceived to address this problem, but benchmarking and interpreting them is fraught with difficulties (need to compare them on a common input dataset, absence of ground truth, computational cost of calling orthologs). To address this, the Quest for Orthologs consortium maintains a reference set of proteomes and provides a web server for continuous orthology benchmarking (http://orthology.benchmarkservice.org). Furthermore, consensus ortholog calls derived from public benchmark submissions are provided on the Alliance of Genome Resources website, the joint portal of NIH-funded model organism databases.


Subject(s)
Multigene Family , Proteome , Software , Animals , Benchmarking , Consensus , Genomics , Humans , Mice , Phylogeny , Rats
7.
Nucleic Acids Res ; 47(D1): D309-D314, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30418610

ABSTRACT

eggNOG is a public database of orthology relationships, gene evolutionary histories and functional annotations. Here, we present version 5.0, featuring a major update of the underlying genome sets, which have been expanded to 4445 representative bacteria and 168 archaea derived from 25 038 genomes, as well as 477 eukaryotic organisms and 2502 viral proteomes that were selected for diversity and filtered by genome quality. In total, 4.4M orthologous groups (OGs) distributed across 379 taxonomic levels were computed together with their associated sequence alignments, phylogenies, HMM models and functional descriptors. Precomputed evolutionary analysis provides fine-grained resolution of duplication/speciation events within each OG. Our benchmarks show that, despite doubling the number of genomes, the quality of orthology assignments and functional annotations (80% coverage) has persisted without significant changes across this update. Finally, we improved eggNOG online services for fast functional annotation and orthology prediction of custom genomics or metagenomics datasets. All precomputed data are publicly available for downloading or via API queries at http://eggnog.embl.de.


Subject(s)
Conserved Sequence , Databases, Genetic , Evolution, Molecular , Phylogeny , Sequence Homology , Animals , Classification , Eukaryota/genetics , Gene Duplication , Gene Ontology , Genes, Viral , Genome , Humans , Molecular Sequence Annotation , Proteome , Sequence Alignment , Structure-Activity Relationship
8.
Science ; 374(6568): 717-723, 2021 Nov 05.
Article in English | MEDLINE | ID: mdl-34735222

ABSTRACT

The evolutionary origin of metazoan cell types such as neurons and muscles is not known. Using whole-body single-cell RNA sequencing in a sponge, an animal without nervous system and musculature, we identified 18 distinct cell types. These include nitric oxide-sensitive contractile pinacocytes, amoeboid phagocytes, and secretory neuroid cells that reside in close contact with digestive choanocytes that express scaffolding and receptor proteins. Visualizing neuroid cells by correlative x-ray and electron microscopy revealed secretory vesicles and cellular projections enwrapping choanocyte microvilli and cilia. Our data show a communication system that is organized around sponge digestive chambers, using conserved modules that became incorporated into the pre- and postsynapse in the nervous systems of other animals.


Subject(s)
Biological Evolution , Porifera/cytology , Animals , Cell Communication , Cell Surface Extensions/ultrastructure , Cilia/physiology , Cilia/ultrastructure , Digestive System/cytology , Mesoderm/cytology , Nervous System/cytology , Nervous System Physiological Phenomena , Nitric Oxide/metabolism , Porifera/genetics , Porifera/metabolism , RNA-Seq , Secretory Vesicles/ultrastructure , Signal Transduction , Single-Cell Analysis , Transcriptome