Search | VHL Regional Portal (BVS)

Crowdsourced benchmarking of taxonomic metagenome profilers: lessons learned from the sbv IMPROVER Microbiomics challenge.

Poussin, Carine; Khachatryan, Lusine; Sierro, Nicolas; Narsapuram, Vijay Kumar; Meyer, Fernando; Kaikala, Vinay; Chawla, Vandna; Muppirala, Usha; Kumar, Sunil; Belcastro, Vincenzo; Battey, James N D; Scotti, Elena; Boué, Stéphanie; McHardy, Alice C; Peitsch, Manuel C; Ivanov, Nikolai V; Hoeng, Julia.

BMC Genomics ; 23(1): 624, 2022 Aug 30.

Article in English | MEDLINE | ID: mdl-36042406

ABSTRACT

BACKGROUND: Selection of optimal computational strategies for analyzing metagenomics data is a decisive step in determining the microbial composition of a sample, and this procedure is complex because of the numerous tools currently available. The aim of this research was to summarize the results of the crowdsourced sbv IMPROVER Microbiomics Challenge, designed to evaluate the performance of off-the-shelf metagenomics software, and to investigate the robustness of these results through an extended post-challenge analysis. In total, 21 off-the-shelf taxonomic metagenome profiling pipelines were benchmarked for their capacity to identify the microbiome composition at various taxon levels across 104 shotgun metagenomics datasets of bacterial genomes (representative of various microbiome samples) from public databases. Performance was determined by comparing predicted taxonomy profiles with the gold standard.

RESULTS: Most taxonomic profilers performed homogeneously well at the phylum level but generated intermediate and heterogeneous scores at the genus and species levels, respectively. kmer-based pipelines using Kraken with and without Bracken or using CLARK-S performed best overall, but they exhibited lower precision than the two marker-gene-based methods MetaPhlAn and mOTU. Filtering out the 1% least abundant species, which were not reliably predicted, helped increase the performance of most profilers by increasing precision but at the cost of recall. However, the use of adaptive filtering thresholds determined from the sample's Shannon index increased the performance of most kmer-based profilers while mitigating the tradeoff between precision and recall.

CONCLUSIONS: kmer-based metagenomic pipelines using Kraken/Bracken or CLARK-S performed most robustly across a large variety of microbiome datasets. Removing non-reliably predicted low-abundance species by using diversity-dependent adaptive filtering thresholds further enhanced the performance of these tools. This work demonstrates the applicability of computational pipelines for accurately determining taxonomic profiles in clinical and environmental contexts and exemplifies the power of crowdsourcing for unbiased evaluation.
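The filtering idea described in this abstract can be illustrated with a minimal sketch. The abstract does not give the formula that maps a Shannon index to a threshold, so the `adaptive_threshold` function below is a purely hypothetical choice (it lowers the cutoff as diversity rises), and the 1% cutoff is read here as a relative-abundance cutoff, which is only one possible interpretation. All function names are illustrative and are not taken from the paper.

```python
import math


def shannon_index(profile):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with non-zero abundance."""
    total = sum(profile.values())
    return -sum(
        (a / total) * math.log(a / total) for a in profile.values() if a > 0
    )


def adaptive_threshold(shannon, max_threshold=0.01):
    """Hypothetical mapping (an assumption, not the paper's formula):
    the cutoff shrinks as diversity grows, starting from max_threshold."""
    return max_threshold / (1.0 + shannon)


def filter_profile(profile, threshold):
    """Drop taxa whose relative abundance is below threshold, then renormalize."""
    total = sum(profile.values())
    kept = {t: a / total for t, a in profile.items() if a / total >= threshold}
    kept_total = sum(kept.values())
    return {t: a / kept_total for t, a in kept.items()} if kept else {}


def precision_recall(predicted_taxa, gold_taxa):
    """Presence/absence precision and recall against a gold-standard taxon set."""
    predicted, gold = set(predicted_taxa), set(gold_taxa)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy predicted profile: taxon "D" is a low-abundance false positive.
    predicted = {"A": 0.5, "B": 0.3, "C": 0.195, "D": 0.005}
    gold = {"A", "B", "C"}

    print("unfiltered:", precision_recall(predicted, gold))
    fixed = filter_profile(predicted, 0.01)
    print("fixed 1% cutoff:", precision_recall(fixed, gold))
    cutoff = adaptive_threshold(shannon_index(predicted))
    adaptive = filter_profile(predicted, cutoff)
    print("adaptive cutoff:", cutoff, precision_recall(adaptive, gold))
```

In this toy profile the fixed cutoff removes the false positive and raises precision without hurting recall; on real data, a cutoff can also remove true low-abundance species, which is the precision/recall tradeoff the abstract refers to.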

Subjects

Crowdsourcing , Metagenome , Benchmarking , Metagenomics/methods , Software

Ensembl 2022.

Cunningham, Fiona; Allen, James E; Allen, Jamie; Alvarez-Jarreta, Jorge; Amode, M Ridwan; Armean, Irina M; Austine-Orimoloye, Olanrewaju; Azov, Andrey G; Barnes, If; Bennett, Ruth; Berry, Andrew; Bhai, Jyothish; Bignell, Alexandra; Billis, Konstantinos; Boddu, Sanjay; Brooks, Lucy; Charkhchi, Mehrnaz; Cummins, Carla; Da Rin Fioretto, Luca; Davidson, Claire; Dodiya, Kamalkumar; Donaldson, Sarah; El Houdaigui, Bilal; El Naboulsi, Tamara; Fatima, Reham; Giron, Carlos Garcia; Genez, Thiago; Martinez, Jose Gonzalez; Guijarro-Clarke, Cristina; Gymer, Arthur; Hardy, Matthew; Hollis, Zoe; Hourlier, Thibaut; Hunt, Toby; Juettemann, Thomas; Kaikala, Vinay; Kay, Mike; Lavidas, Ilias; Le, Tuan; Lemos, Diana; Marugán, José Carlos; Mohanan, Shamika; Mushtaq, Aleena; Naven, Marc; Ogeh, Denye N; Parker, Anne; Parton, Andrew; Perry, Malcolm; Pilizota, Ivana; Prosovetskaia, Irina.

Nucleic Acids Res ; 50(D1): D988-D995, 2022 Jan 07.

Article in English | MEDLINE | ID: mdl-34791404

ABSTRACT

Ensembl (https://www.ensembl.org) is unique in its flexible infrastructure for access to genomic data and annotation. It has been designed to efficiently deliver annotation at scale for all eukaryotic life, and it also provides deep comprehensive annotation for key species. Genomes representing a greater diversity of species are increasingly being sequenced. In response, we have focussed our recent efforts on expediting the annotation of new assemblies. Here, we report the release of the greatest annual number of newly annotated genomes in the history of Ensembl via our dedicated Ensembl Rapid Release platform (http://rapid.ensembl.org). We have also developed a new method to generate comparative analyses at scale for these assemblies and, for the first time, we have annotated non-vertebrate eukaryotes. Meanwhile, we continually improve, extend and update the annotation for our high-value reference vertebrate genomes and report the details here. We have a range of specific software tools for specific tasks, such as the Ensembl Variant Effect Predictor (VEP) and the newly developed interface for the Variant Recoder. All Ensembl data, software and tools are freely available for download and are accessible programmatically.
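As an illustration of programmatic access, the sketch below queries the public Ensembl REST service for a gene by symbol. The abstract does not name a specific interface, so the server address (`https://rest.ensembl.org`), the `lookup/symbol` endpoint and the helper names are assumptions chosen for this example rather than details taken from the paper.

```python
import json
import urllib.parse
import urllib.request

# Assumed public REST server; not stated in the abstract itself.
REST_SERVER = "https://rest.ensembl.org"


def build_lookup_url(species, symbol):
    """Build the URL for a gene lookup by species name and gene symbol."""
    path = "/lookup/symbol/{}/{}".format(
        urllib.parse.quote(species, safe=""),
        urllib.parse.quote(symbol, safe=""),
    )
    return REST_SERVER + path + "?content-type=application/json"


def lookup_symbol(species, symbol, timeout=30):
    """Fetch and decode the JSON record for a gene (requires network access)."""
    url = build_lookup_url(species, symbol)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only prints the URL; call lookup_symbol(...) to perform the request.
    print(build_lookup_url("homo_sapiens", "BRCA2"))
```

Keeping URL construction separate from the network call makes the request easy to inspect or reuse with another HTTP client.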

Subjects

Genetic Databases , Genome/genetics , Molecular Sequence Annotation , Software , Animals , Computational Biology/classification , Humans

Ensembl Genomes 2022: an expanding genome resource for non-vertebrates.

Yates, Andrew D; Allen, James; Amode, Ridwan M; Azov, Andrey G; Barba, Matthieu; Becerra, Andrés; Bhai, Jyothish; Campbell, Lahcen I; Carbajo Martinez, Manuel; Chakiachvili, Marc; Chougule, Kapeel; Christensen, Mikkel; Contreras-Moreira, Bruno; Cuzick, Alayne; Da Rin Fioretto, Luca; Davis, Paul; De Silva, Nishadi H; Diamantakis, Stavros; Dyer, Sarah; Elser, Justin; Filippi, Carla V; Gall, Astrid; Grigoriadis, Dionysios; Guijarro-Clarke, Cristina; Gupta, Parul; Hammond-Kosack, Kim E; Howe, Kevin L; Jaiswal, Pankaj; Kaikala, Vinay; Kumar, Vivek; Kumari, Sunita; Langridge, Nick; Le, Tuan; Luypaert, Manuel; Maslen, Gareth L; Maurel, Thomas; Moore, Benjamin; Muffato, Matthieu; Mushtaq, Aleena; Naamati, Guy; Naithani, Sushma; Olson, Andrew; Parker, Anne; Paulini, Michael; Pedro, Helder; Perry, Emily; Preece, Justin; Quinton-Tulloch, Mark; Rodgers, Faye; Rosello, Marc.

Nucleic Acids Res ; 50(D1): D996-D1003, 2022 Jan 07.

Article in English | MEDLINE | ID: mdl-34791415

ABSTRACT

Ensembl Genomes (https://www.ensemblgenomes.org) provides access to non-vertebrate genomes and analysis complementing vertebrate resources developed by the Ensembl project (https://www.ensembl.org). The two resources collectively present genome annotation through a consistent set of interfaces spanning the tree of life presenting genome sequence, annotation, variation, transcriptomic data and comparative analysis. Here, we present our largest increase in plant, metazoan and fungal genomes since the project's inception creating one of the world's most comprehensive genomic resources and describe our efforts to reduce genome redundancy in our Bacteria portal. We detail our new efforts in gene annotation, our emerging support for pangenome analysis, our efforts to accelerate data dissemination through the Ensembl Rapid Release resource and our new AlphaFold visualization. Finally, we present details of our future plans including updates on our integration with Ensembl, and how we plan to improve our support for the microbial research community. Software and data are made available without restriction via our website, online tools platform and programmatic interfaces (available under an Apache 2.0 license). Data updates are synchronised with Ensembl's release cycle.

Subjects

Genetic Databases , Genomics , Internet , Software , Animals , Computational Biology , Bacterial Genome/genetics , Fungal Genome/genetics , Plant Genome/genetics , Plants/classification , Plants/genetics , Vertebrates/classification , Vertebrates/genetics

Ensembl 2021.

Howe, Kevin L; Achuthan, Premanand; Allen, James; Allen, Jamie; Alvarez-Jarreta, Jorge; Amode, M Ridwan; Armean, Irina M; Azov, Andrey G; Bennett, Ruth; Bhai, Jyothish; Billis, Konstantinos; Boddu, Sanjay; Charkhchi, Mehrnaz; Cummins, Carla; Da Rin Fioretto, Luca; Davidson, Claire; Dodiya, Kamalkumar; El Houdaigui, Bilal; Fatima, Reham; Gall, Astrid; Garcia Giron, Carlos; Grego, Tiago; Guijarro-Clarke, Cristina; Haggerty, Leanne; Hemrom, Anmol; Hourlier, Thibaut; Izuogu, Osagie G; Juettemann, Thomas; Kaikala, Vinay; Kay, Mike; Lavidas, Ilias; Le, Tuan; Lemos, Diana; Gonzalez Martinez, Jose; Marugán, José Carlos; Maurel, Thomas; McMahon, Aoife C; Mohanan, Shamika; Moore, Benjamin; Muffato, Matthieu; Oheh, Denye N; Paraschas, Dimitrios; Parker, Anne; Parton, Andrew; Prosovetskaia, Irina; Sakthivel, Manoj P; Salam, Ahamed I Abdul; Schmitt, Bianca M; Schuilenburg, Helen; Sheppard, Dan.

Nucleic Acids Res ; 49(D1): D884-D891, 2021 Jan 08.

Article in English | MEDLINE | ID: mdl-33137190

ABSTRACT

The Ensembl project (https://www.ensembl.org) annotates genomes and disseminates genomic data for vertebrate species. We create detailed and comprehensive annotation of gene structures, regulatory elements and variants, and enable comparative genomics by inferring the evolutionary history of genes and genomes. Our integrated genomic data are made available in a variety of ways, including genome browsers, search interfaces, specialist tools such as the Ensembl Variant Effect Predictor, download files and programmatic interfaces. Here, we present recent Ensembl developments including two new website portals. Ensembl Rapid Release (http://rapid.ensembl.org) is designed to provide core tools and services for genomes as soon as possible and has been deployed to support large biodiversity sequencing projects. Our SARS-CoV-2 genome browser (https://covid-19.ensembl.org) integrates our own annotation with publicly available genomic data from numerous sources to facilitate the use of genomics in the international scientific response to the COVID-19 pandemic. We also report on other updates to our annotation resources, tools and services. All Ensembl data and software are freely available without restriction.

Subjects

Computational Biology/methods , Nucleic Acid Databases , Genomics/methods , SARS-CoV-2/genetics , Vertebrates/genetics , Animals , COVID-19/epidemiology , COVID-19/virology , Humans , Internet , Molecular Sequence Annotation/methods , Pandemics , Vertebrates/classification
