Assessing genome assembly quality prior to downstream analysis: N50 versus BUSCO.
Jauhal, April A; Newcomb, Richard D.
Affiliation
  • Jauhal AA; School of Biological Sciences, University of Auckland, Auckland, New Zealand.
  • Newcomb RD; The New Zealand Institute for Plant & Food Research, Auckland, New Zealand.
Mol Ecol Resour; 21(5): 1416-1421, 2021 Jul.
Article in En | MEDLINE | ID: mdl-33629477
ABSTRACT
With the ever-increasing number of publicly available eukaryotic genome assemblies and user-friendly bioinformatics tools, there are increasing opportunities for researchers to use genomic resources in their research. While there are multiple dimensions to genome quality, it is often reduced to a single score that may not be correlated with other metrics, or appropriate for all applications of an assembly. To assess whether the commonly reported N50 value could reliably predict a separate dimension of genome quality, gene space completeness, we performed a meta-analysis of 611 published articles on eukaryotic genomes that reported BUSCO scores in addition to the typical N50 score. We found that although assemblies with relatively high contig and scaffold N50 values consistently had high BUSCO scores, a high BUSCO score could also be obtained from assemblies with a low N50. This reinforces that, despite its ubiquity, N50 is not a perfect proxy for all measures of genome quality. Our data also suggest that variations in BUSCO scores among assemblies with poor N50 scores may be related to the number of introns in conserved eukaryotic genes. We stress the importance of screening and evaluating assembly quality with the appropriate tools and urge increased reporting of genome assessment metrics beyond N50. We also discuss the potential limitations of BUSCO and suggest improvements for assessing gene space within genome assemblies.
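For readers unfamiliar with the metric, N50 is conventionally defined as the length of the shortest contig (or scaffold) in the smallest set of longest sequences that together cover at least half of the total assembly length. The following is a minimal sketch of that standard definition, not code from the article; the function name `n50` and the example lengths are illustrative assumptions.

```python
def n50(lengths):
    """Return the N50 of a list of contig or scaffold lengths.

    Assumes the conventional definition: walk the sequences from
    longest to shortest and return the length at which the running
    total first reaches half of the summed assembly length.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")

    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running to total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (in bp), chosen only for illustration.
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
example_n50 = n50(example_lengths)
```

Because the value depends only on sequence lengths, it describes contiguity and says nothing directly about whether conserved genes are present, which is why the abstract treats gene space completeness (BUSCO) as a separate dimension.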

Full text: 1 Collections: 01-internacional Database: MEDLINE Main subject: Genome / Computational Biology / Genomics Type of study: Prognostic_studies / Systematic_reviews Language: En Journal: Mol Ecol Resour Year of publication: 2021 Document type: Article