Enhancing transcriptome expression quantification through accurate assignment of long RNA sequencing reads with TranSigner.
bioRxiv; 2024 Aug 17.
Article in English | MEDLINE | ID: mdl-39185147
ABSTRACT
Recently developed long-read RNA sequencing technologies promise a more accurate and comprehensive view of transcriptomes than short-read sequencers, primarily because they can sequence transcripts at full length. Realizing this potential, however, requires computational tools tailored to long reads, which have a higher error rate than short reads. Existing methods for assembling and quantifying long-read data often disagree on which transcripts are expressed and at what abundance, which undermines researchers' confidence in the transcriptomes produced from such data. One way to address these uncertainties in transcriptome assembly and quantification is to assign the long reads to transcripts, which enables a more detailed characterization of transcript support at the read level. Here, we introduce TranSigner, a versatile tool that assigns long reads to any input transcriptome. TranSigner consists of three consecutive modules: (1) alignment of reads to the given transcripts, (2) computation of read-to-transcript compatibility from alignment scores and positions, and (3) an expectation-maximization algorithm that probabilistically assigns reads to transcripts and estimates transcript abundances. Using simulated data and experimental datasets from three well-studied organisms (Homo sapiens, Arabidopsis thaliana, and Mus musculus), we demonstrate that TranSigner achieves accurate read assignments and higher accuracy in transcript abundance estimation than existing tools.
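The expectation-maximization step described in the abstract can be illustrated with a minimal, generic sketch. This is not TranSigner's implementation: the function names `e_step` and `em_assign`, the `compat` matrix layout, and the toy scores are assumptions made for illustration only. The sketch takes read-to-transcript compatibility scores as given (in TranSigner these would come from the alignment scores and positions) and alternates between assigning each read fractionally to transcripts and re-estimating relative abundances.

```python
def e_step(abundance, compat):
    """Fractionally assign each read to transcripts given current abundances."""
    posteriors = []
    for row in compat:
        # Weight each transcript by abundance times the read's compatibility with it.
        weights = [a * c for a, c in zip(abundance, row)]
        total = sum(weights)
        if total > 0:
            posteriors.append([w / total for w in weights])
        else:
            # Read is compatible with no transcript: leave it unassigned.
            posteriors.append([0.0] * len(row))
    return posteriors


def em_assign(compat, n_iter=100, tol=1e-9):
    """Generic EM over a compatibility matrix (hypothetical helper, not TranSigner's API).

    compat[i][j] is a non-negative compatibility score of read i with transcript j.
    Returns (relative abundances, per-read assignment probabilities).
    """
    n_tx = len(compat[0])
    abundance = [1.0 / n_tx] * n_tx  # start from uniform abundances
    for _ in range(n_iter):
        posteriors = e_step(abundance, compat)
        assigned = sum(sum(p) for p in posteriors)
        if assigned == 0:
            break
        # M-step: abundance is the share of fractional read assignments.
        updated = [sum(p[j] for p in posteriors) / assigned for j in range(n_tx)]
        delta = max(abs(u - a) for u, a in zip(updated, abundance))
        abundance = updated
        if delta < tol:
            break
    # Final assignments are computed from the converged abundances.
    return abundance, e_step(abundance, compat)


# Toy input (assumed scores): reads 0 and 1 match only transcript A,
# read 3 matches only transcript B, and read 2 matches both equally.
compat = [[1.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
abundance, assignments = em_assign(compat)
```

In this toy case the ambiguous read is split in proportion to the abundances supported by the uniquely assigned reads, which is the basic behavior that makes EM useful when many reads are compatible with more than one transcript.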
Full text: 1
Database: MEDLINE
Language: En
Publication year: 2024
Document type: Article