Evaluation and integration of functional annotation pipelines for newly sequenced organisms: the potato genome as a test case.
Amar, David; Frades, Itziar; Danek, Agnieszka; Goldberg, Tatyana; Sharma, Sanjeev K; Hedley, Pete E; Proux-Wera, Estelle; Andreasson, Erik; Shamir, Ron; Tzfadia, Oren; Alexandersson, Erik.
Affiliation
  • Amar D; Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel. ddam.am@gmail.com.
  • Frades I; Department of Plant Protection Biology, Swedish University of Agricultural Sciences, Alnarp, Sweden. maria.iciar.frades@slu.se.
  • Danek A; Institute of Informatics, Silesian University of Technology, Akademicka 2A, 44-100, Gliwice, Poland. agnieszka.danek@polsl.pl.
  • Goldberg T; Department for Bioinformatics and Computational Biology, Technical University of Munich, Arcisstraße 21, 80333, Munich, Germany. goldberg@rostlab.org.
  • Sharma SK; Cell and Molecular Sciences, The James Hutton Institute, Aberdeen, Scotland, UK. Sanjeev.Sharma@hutton.ac.uk.
  • Hedley PE; Cell and Molecular Sciences, The James Hutton Institute, Aberdeen, Scotland, UK. Pete.Hedley@hutton.ac.uk.
  • Proux-Wera E; Department of Plant Protection Biology, Swedish University of Agricultural Sciences, Alnarp, Sweden. Current affiliation: SciLifeLab, Stockholm University, Universitetsvägen 10, 114 18, Stockholm, Sweden. estelle.proux@scilifelab.se.
  • Andreasson E; Department of Plant Protection Biology, Swedish University of Agricultural Sciences, Alnarp, Sweden. erik.andreasson@slu.se.
  • Shamir R; Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel. rshamir@tau.ac.il.
  • Tzfadia O; Department of Plant Science, The Weizmann Institute of Science, Rehovot, Israel. oren.tzfadia@weizmann.ac.il.
  • Alexandersson E.
BMC Plant Biol; 14: 329, 2014 Dec 05.
Article in English | MEDLINE | ID: mdl-25476999
ABSTRACT

BACKGROUND:

For most organisms, little functional information about individual genes or proteins exists, even when the genome sequence is available. Several annotation pipelines have been developed for functional analysis based on sequence, 'omics', and literature data. However, researchers have little guidance on how well these pipelines perform. Here, we used the recently sequenced potato genome as a case study. Potato was selected because its genome is newly sequenced and it is a non-model plant, yet relatively ample information on individual potato genes and multiple gene expression profiles are available.

RESULTS:

We show that the automatic gene annotations of potato have low accuracy when compared to a "gold standard" based on experimentally validated potato genes. Furthermore, we evaluate six state-of-the-art annotation pipelines and show that their predictions are markedly dissimilar (average Jaccard similarity coefficient of 0.27 between pipelines). To overcome this discrepancy, we introduce a simple GO structure-based algorithm that reconciles the predictions of the different pipelines. We show that the integrated annotation covers more genes, increases the number of highly co-expressed GO processes by over 50%, and obtains much higher agreement with the gold standard.
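
The Jaccard similarity coefficient quoted above is the size of the intersection of two sets divided by the size of their union. The abstract does not say how it was computed for the pipelines, so the sketch below makes assumptions: each pipeline's output is represented as a set of (gene, GO term) pairs, and the average is taken over all unordered pairs of pipelines. The function names, pipeline names, and toy data are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def jaccard(a, b):
    """Return |A ∩ B| / |A ∪ B|; two empty sets are treated as identical (1.0)."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def mean_pairwise_jaccard(annotations):
    """Average Jaccard similarity over all unordered pairs of pipelines.

    `annotations` maps a pipeline name to a set of (gene, GO term) pairs.
    This representation is an assumption made for illustration.
    """
    pairs = list(combinations(annotations.values(), 2))
    if not pairs:
        raise ValueError("need at least two pipelines to compare")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy data: gene names and GO identifiers are placeholders.
toy = {
    "pipeline_A": {("gene1", "GO:0006950"), ("gene2", "GO:0009058")},
    "pipeline_B": {("gene1", "GO:0006950"), ("gene3", "GO:0009058")},
    "pipeline_C": {("gene1", "GO:0006950")},
}

if __name__ == "__main__":
    print(round(mean_pairwise_jaccard(toy), 3))
```

A value near 1 would mean the pipelines assign nearly the same GO terms to the same genes; a low average such as the reported 0.27 means most annotations are not shared between any two pipelines.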

CONCLUSIONS:

We find that different annotation pipelines produce different results, and show how to integrate them into a unified annotation that is of higher quality than each single pipeline. We offer an improved functional annotation of both PGSC and ITAG potato gene models, as well as tools that can be applied to additional pipelines and improve annotation in other organisms. This will greatly aid future functional analysis of '-omics' datasets from potato and other organisms with newly sequenced genomes. The new potato annotations are available with this paper.
Subject(s)

Full text: 1 | Database: MEDLINE | Main subject: Solanum tuberosum / Genome, Plant / Molecular Sequence Annotation | Study type: Prognostic_studies | Language: En | Journal: BMC Plant Biol | Journal subject: BOTANY | Year: 2014 | Document type: Article | Country of affiliation: Israel
