Quality assessment of gene repertoire annotations with OMArk.
Nevers, Yannis; Warwick Vesztrocy, Alex; Rossier, Victor; Train, Clément-Marie; Altenhoff, Adrian; Dessimoz, Christophe; Glover, Natasha M.
Affiliation
  • Nevers Y; Department of Computational Biology, University of Lausanne, Lausanne, Switzerland. yannis.nevers@unil.ch.
  • Warwick Vesztrocy A; Swiss Institute of Bioinformatics, Lausanne, Switzerland. yannis.nevers@unil.ch.
  • Rossier V; Department of Computational Biology, University of Lausanne, Lausanne, Switzerland.
  • Train CM; Swiss Institute of Bioinformatics, Lausanne, Switzerland.
  • Altenhoff A; Department of Computational Biology, University of Lausanne, Lausanne, Switzerland.
  • Dessimoz C; Swiss Institute of Bioinformatics, Lausanne, Switzerland.
  • Glover NM; Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland.
Nat Biotechnol; 2024 Feb 21.
Article in English | MEDLINE | ID: mdl-38383603
ABSTRACT
In the era of biodiversity genomics, it is crucial to ensure that annotations of protein-coding gene repertoires are accurate. State-of-the-art tools to assess genome annotations measure the completeness of a gene repertoire but are blind to other errors, such as gene overprediction or contamination. We introduce OMArk, a software package that relies on fast, alignment-free sequence comparisons between a query proteome and precomputed gene families across the tree of life. OMArk assesses not only the completeness but also the consistency of the gene repertoire as a whole relative to closely related species and reports likely contamination events. Analysis of 1,805 UniProt Eukaryotic Reference Proteomes with OMArk demonstrated strong evidence of contamination in 73 proteomes and identified error propagation in avian gene annotation resulting from the use of a fragmented zebra finch proteome as a reference. This study illustrates the importance of comparing and prioritizing proteomes based on their quality measures.

Full text: 1 | Collection: 01-international | Database: MEDLINE | Language: English | Journal: Nat Biotechnol / Nat. biotechnol / Nature biotechnology | Journal subject: BIOTECHNOLOGY | Year: 2024 | Document type: Article | Country of affiliation: Switzerland