RNAontheBENCH: computational and empirical resources for benchmarking RNAseq quantification and differential expression methods.

Germain, Pierre-Luc; Vitriolo, Alessandro; Adamo, Antonio; Laise, Pasquale; Das, Vivek; Testa, Giuseppe

Affiliation

Germain PL; European Institute of Oncology, Department of Experimental Oncology, Via Adamello 16, 20139 Milano, Italy.
Vitriolo A; European Institute of Oncology, Department of Experimental Oncology, Via Adamello 16, 20139 Milano, Italy; University of Milan, Department of Oncology and Hemato-Oncology, Via Festa del Perdono 7, 20122 Milano, Italy.
Adamo A; European Institute of Oncology, Department of Experimental Oncology, Via Adamello 16, 20139 Milano, Italy.
Laise P; European Institute of Oncology, Department of Experimental Oncology, Via Adamello 16, 20139 Milano, Italy.
Das V; European Institute of Oncology, Department of Experimental Oncology, Via Adamello 16, 20139 Milano, Italy; University of Milan, Department of Oncology and Hemato-Oncology, Via Festa del Perdono 7, 20122 Milano, Italy.
Testa G; European Institute of Oncology, Department of Experimental Oncology, Via Adamello 16, 20139 Milano, Italy; University of Milan, Department of Oncology and Hemato-Oncology, Via Festa del Perdono 7, 20122 Milano, Italy. giuseppe.testa@ieo.eu

Nucleic Acids Res; 44(11): 5054-67, 2016 06 20.

Article in English | MEDLINE | ID: mdl-27190234

ABSTRACT

RNA sequencing (RNAseq) has become the method of choice for transcriptome analysis, yet no consensus exists as to the most appropriate pipeline for its analysis, with current benchmarks suffering from important limitations. Here, we address these challenges through a rich benchmarking resource harnessing (i) two RNAseq datasets including ERCC ExFold spike-ins; (ii) Nanostring measurements of a panel of 150 genes on the same samples; (iii) a set of internal, genetically-determined controls; (iv) a reanalysis of the SEQC dataset; and (v) a focus on relative quantification (i.e. across samples). We use this resource to compare different approaches to each step of RNAseq analysis, from alignment to differential expression testing. We show that methods providing the best absolute quantification do not necessarily provide good relative quantification across samples, that count-based methods are superior for gene-level relative quantification, and that the new generation of pseudo-alignment-based software performs as well as established methods, at a fraction of the computing time. We also assess the impact of library type and size on quantification and differential expression analysis. Finally, we have created an R package and a web platform to enable the simple and streamlined application of this resource to the benchmarking of future methods.

Subject(s)

Computational Biology/methods; Gene Expression Profiling/methods; Sequence Analysis, RNA/methods; Software; Computer Simulation; Gene Dosage; Gene Expression Regulation; Gene Library; Reproducibility of Results; Transcriptome; Web Browser

Full text: 1 | Collection: 01-international | Database: MEDLINE | Main subject: Software / Sequence Analysis, RNA / Computational Biology / Gene Expression Profiling | Type of study: Prognostic_studies | Language: En | Journal: Nucleic Acids Res | Year: 2016 | Document type: Article | Affiliation country: Italy
