Recovery of missing single-cell RNA-sequencing data with optimized transcriptomic references.
Nat Methods; 20(10): 1506-1515, 2023 Oct.
Article in En | MEDLINE | ID: mdl-37697162
Single-cell RNA-sequencing (scRNA-seq) is an indispensable tool for characterizing cellular diversity and generating hypotheses throughout biology. Droplet-based scRNA-seq datasets often lack expression data for genes that can be detected with other methods. Here we show that the observed sensitivity deficits stem from three sources: (1) poor annotation of 3' gene ends; (2) issues with intronic read incorporation; and (3) gene overlap-derived read loss. We show that missing gene expression data can be recovered by optimizing the reference transcriptome for scRNA-seq through recovering false intergenic reads, implementing a hybrid pre-mRNA mapping strategy and resolving gene overlaps. We demonstrate, with a diverse collection of mouse and human tissue data, that reference optimization can substantially improve cellular profiling resolution and reveal missing cell types and marker genes. Our findings argue that transcriptomic references need to be optimized for scRNA-seq analysis and warrant a reanalysis of previously published datasets and cell atlases.
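As a rough illustration of the third source of read loss named above (gene overlaps), the sketch below flags pairs of annotated genes whose intervals overlap. It is not the authors' pipeline. The `Gene` record, the `same_strand_overlaps` function, and the choice to compare only genes on the same chromosome and strand are assumptions made for this example. Coordinates are taken to be 1-based and inclusive, as in GTF files.

```python
from collections import defaultdict
from typing import NamedTuple


class Gene(NamedTuple):
    """Hypothetical minimal gene record; fields mirror GTF columns."""
    name: str
    chrom: str
    strand: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def same_strand_overlaps(genes):
    """Return sorted (earlier, later) name pairs of genes that overlap
    on the same chromosome and strand (assumption of this sketch)."""
    by_key = defaultdict(list)
    for g in genes:
        by_key[(g.chrom, g.strand)].append(g)

    pairs = []
    for group in by_key.values():
        # Sweep left to right, keeping genes that have not yet ended.
        group.sort(key=lambda g: g.start)
        active = []
        for g in group:
            active = [a for a in active if a.end >= g.start]
            pairs.extend((a.name, g.name) for a in active)
            active.append(g)
    return sorted(pairs)


# Made-up example coordinates, for illustration only.
genes = [
    Gene("A", "chr1", "+", 100, 500),
    Gene("B", "chr1", "+", 400, 900),
    Gene("C", "chr1", "-", 450, 600),
    Gene("D", "chr1", "+", 1000, 1200),
]
overlaps = same_strand_overlaps(genes)
print(overlaps)
```

Pairs reported this way would only be candidates for manual or rule-based resolution; how overlaps are actually resolved in the optimized reference is described in the full article, not in this abstract.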
Full text: 1
Collection: 01-internacional
Database: MEDLINE
Language: En
Journal: Nat Methods
Journal subject: Laboratory Techniques and Procedures
Year: 2023
Document type: Article