Benchmarking bioinformatic virus identification tools using real-world metagenomic data across biomes.
Wu, Ling-Yi; Wijesekara, Yasas; Piedade, Gonçalo J; Pappas, Nikolaos; Brussaard, Corina P D; Dutilh, Bas E.
Affiliations
  • Wu LY; Theoretical Biology and Bioinformatics, Science4Life, Utrecht University, Padualaan 8, Utrecht, 3584 CH, The Netherlands.
  • Wijesekara Y; Institute of Bioinformatics, University Medicine Greifswald, Felix Hausdorff Str. 8, 17475, Greifswald, Germany.
  • Piedade GJ; Department Marine Microbiology and Biogeochemistry, NIOZ Royal Netherlands Institute for Sea Research, Den Burg, PO Box 59, Texel, 1790 AB, The Netherlands.
  • Pappas N; Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Amsterdam, The Netherlands.
  • Brussaard CPD; Theoretical Biology and Bioinformatics, Science4Life, Utrecht University, Padualaan 8, Utrecht, 3584 CH, The Netherlands.
  • Dutilh BE; Department Marine Microbiology and Biogeochemistry, NIOZ Royal Netherlands Institute for Sea Research, Den Burg, PO Box 59, Texel, 1790 AB, The Netherlands.
Genome Biol; 25(1): 97, 2024-04-15.
Article in English | MEDLINE | ID: mdl-38622738
ABSTRACT

BACKGROUND:

As most viruses remain uncultivated, metagenomics is currently the main method for virus discovery. Detecting viruses in metagenomic data is not trivial. In the past few years, many bioinformatic virus identification tools have been developed for this task, making it challenging to choose the right tools, parameters, and cutoffs. Because these tools measure different biological signals and use different algorithms, training data, and reference databases, independent benchmarking is needed to give users objective guidance.

RESULTS:

We compare the performance of nine state-of-the-art virus identification tools in thirteen modes on eight paired viral and microbial datasets from three distinct biomes, including a new complex dataset from Antarctic coastal waters. The tools have highly variable true positive rates (0-97%) and false positive rates (0-30%). PPR-Meta best distinguishes viral from microbial contigs, followed by DeepVirFinder, VirSorter2, and VIBRANT. Different tools identify different subsets of the benchmarking data, and all tools except Sourmash find unique viral contigs. Tool performance improved with adjusted parameter cutoffs, indicating that users should consider adjusting cutoffs before use.
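
To illustrate how a score cutoff trades the true positive rate against the false positive rate, here is a minimal Python sketch. It assumes a paired design in which every contig from a viral dataset counts as a true virus and every contig from a microbial dataset as a true non-virus, and that a tool reports one score per contig with higher meaning more virus-like. The function name `rates_at_cutoff` and all scores are hypothetical and are not taken from the study or from any of the benchmarked tools.

```python
from typing import Iterable, Tuple


def rates_at_cutoff(
    viral_scores: Iterable[float],
    microbial_scores: Iterable[float],
    cutoff: float,
) -> Tuple[float, float]:
    """Return (TPR, FPR) when contigs scoring >= cutoff are called viral.

    Assumption: contigs from the viral dataset are all true viruses and
    contigs from the microbial dataset are all true non-viruses.
    """
    viral = list(viral_scores)
    microbial = list(microbial_scores)
    if not viral or not microbial:
        raise ValueError("both datasets must contain at least one contig")
    true_positives = sum(score >= cutoff for score in viral)
    false_positives = sum(score >= cutoff for score in microbial)
    return true_positives / len(viral), false_positives / len(microbial)


# Made-up scores for illustration only.
viral_scores = [0.95, 0.80, 0.60, 0.40]
microbial_scores = [0.70, 0.30, 0.20, 0.10]

for cutoff in (0.5, 0.75, 0.9):
    tpr, fpr = rates_at_cutoff(viral_scores, microbial_scores, cutoff)
    print(f"cutoff={cutoff:.2f}  TPR={tpr:.2f}  FPR={fpr:.2f}")
```

In this toy example, raising the cutoff from 0.5 to 0.75 removes the single false positive but also loses one true viral contig, which is the kind of trade-off a user would weigh when adjusting a tool's cutoff.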

CONCLUSIONS:

Together, our independent benchmarking facilitates the selection of bioinformatic virus identification tools and suggests parameter adjustments for viromics researchers.
Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Viruses / Benchmarking Language: En Journal: Genome Biol Publication year: 2024 Document type: Article