Comparative evaluation of full-length isoform quantification from RNA-Seq.
BMC Bioinformatics; 22(1): 266, 2021 May 25.
Article in English | MEDLINE | ID: mdl-34034652
ABSTRACT
BACKGROUND:
Full-length isoform quantification from RNA-Seq is a key goal in transcriptomics analyses and has been an area of active development since the beginning. The fundamental difficulty stems from the fact that RNA transcripts are long, while RNA-Seq reads are short.
RESULTS:
Here we use simulated benchmarking data that reflect many properties of real data, including polymorphisms, intron signal and non-uniform coverage, allowing for systematic comparative analyses of isoform quantification accuracy and its impact on differential expression analysis. Genome alignment-based, transcriptome alignment-based and pseudo-alignment-based methods are included, along with a simple approach that serves as a baseline control.
CONCLUSIONS:
Salmon, kallisto, RSEM, and Cufflinks exhibit the highest accuracy on idealized data, while on more realistic data they do not perform dramatically better than the simple approach. We determine the structural parameters with the greatest impact on quantification accuracy to be length and sequence compression complexity and not so much the number of isoforms. The effect of incomplete annotation on performance is also investigated. Overall, the tested methods show sufficient divergence from the truth to suggest that full-length isoform quantification and isoform level DE should still be employed selectively.
Full text: 1
Collections: 01-international
Database: MEDLINE
Main subject: Gene Expression Profiling / Transcriptome
Language: English
Journal: BMC Bioinformatics
Journal subject: Medical Informatics
Year of publication: 2021
Document type: Article
Country of affiliation: United States