scLENS: data-driven signal detection for unbiased scRNA-seq data analysis.
Nat Commun; 15(1): 3575, 2024 Apr 27.
Article in English | MEDLINE | ID: mdl-38678050
ABSTRACT
High dimensionality and noise have limited the new biological insights that can be discovered in scRNA-seq data. While dimensionality reduction tools have been developed to extract biological signals from the data, they often require manual determination of signal dimension, introducing user bias. Furthermore, a common data preprocessing method, log normalization, can unintentionally distort signals in the data. Here, we develop scLENS, a dimensionality reduction tool that circumvents the long-standing issues of signal distortion and manual input. Specifically, we identify the primary cause of signal distortion during log normalization and effectively address it by uniformizing cell vector lengths with L2 normalization. Furthermore, we utilize random matrix theory-based noise filtering and a signal robustness test to enable data-driven determination of the threshold for signal dimensions. Our method outperforms 11 widely used dimensionality reduction tools and performs particularly well for challenging scRNA-seq datasets with high sparsity and variability. To facilitate the use of scLENS, we provide a user-friendly package that automates accurate signal detection of scRNA-seq data without time-consuming manual tuning.
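The two ideas named in the abstract, L2 normalization of cell vectors after log normalization and random matrix theory-based noise filtering, can be illustrated with a minimal NumPy sketch. This is not the scLENS package or its API. The function names `log_l2_normalize` and `count_signals_above_mp_edge`, the scale factor of 10,000, the per-gene z-scoring, and the use of the Marchenko-Pastur upper edge as the noise threshold are all assumptions made for illustration. The signal robustness test mentioned in the abstract is not covered.

```python
import numpy as np


def log_l2_normalize(counts, scale=1e4):
    """Log-normalize a cells x genes count matrix, then give every cell
    vector unit L2 length. The scale factor is an assumed convention."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0          # avoid dividing empty cells by zero
    logged = np.log1p(counts / totals * scale)
    norms = np.linalg.norm(logged, axis=1, keepdims=True)
    norms[norms == 0] = 1.0            # leave all-zero cells as zero vectors
    return logged / norms


def count_signals_above_mp_edge(matrix):
    """Count eigenvalues of the gene-gene covariance of a z-scored
    cells x genes matrix that exceed the Marchenko-Pastur upper edge
    (1 + sqrt(genes / cells))**2 expected for unit-variance noise."""
    x = np.asarray(matrix, dtype=float)
    n_cells, n_genes = x.shape
    std = x.std(axis=0)
    std[std == 0] = 1.0                # constant genes stay at zero
    z = (x - x.mean(axis=0)) / std
    # Nonzero eigenvalues of z.T @ z / n_cells are the squared singular
    # values of z divided by n_cells.
    eigvals = np.linalg.svd(z, compute_uv=False) ** 2 / n_cells
    edge = (1.0 + np.sqrt(n_genes / n_cells)) ** 2
    return int(np.sum(eigvals > edge))
```

On a matrix of pure Gaussian noise the count should be at or near zero, while a matrix with a planted group of cells sharing shifted expression should yield at least one eigenvalue above the edge. A data-driven threshold of this kind replaces a manually chosen number of components.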
Collection: 01-internacional
Database: MEDLINE
Main subject: Algorithms / RNA-Seq / Single-Cell Gene Expression Analysis
Limits: Animals / Humans
Language: English
Journal: Nat Commun
Journal subject: Biology / Science
Year: 2024
Document type: Article
Country of publication: United Kingdom