Batch-Corrected Distance Mitigates Temporal and Spatial Variability for Clustering and Visualization of Single-Cell Gene Expression Data.
Liang, Shaoheng; Dou, Jinzhuang; Iqbal, Ramiz; Chen, Ken.
Affiliation
  • Liang S; Department of Bioinformatics and Computational Biology, MD Anderson Cancer Center.
  • Dou J; Department of Computer Science, Rice University.
  • Iqbal R; Current address: Computational Biology Department, Carnegie Mellon University.
  • Chen K; Department of Bioinformatics and Computational Biology, MD Anderson Cancer Center.
Res Sq ; 2023 Jul 26.
Article in En | MEDLINE | ID: mdl-37547002
ABSTRACT
Clustering and visualization are essential parts of single-cell gene expression data analysis. The Euclidean distance used in most distance-based methods is not optimal. The batch effect, i.e., the variability among samples collected at different times and from different tissues and patients, introduces large between-group distances and obscures the true identities of cells. To solve this problem, we introduce Batch-Corrected Distance (BCD), a metric that uses the temporal/spatial locality of the batch effect to control for such factors. We validate BCD on simulated data and apply it to a mouse retina development dataset and a lung dataset. We also demonstrate the utility of our approach in understanding the progression of Coronavirus Disease 2019 (COVID-19). BCD achieves more accurate clusters and better visualizations than state-of-the-art batch correction methods on longitudinal datasets. BCD can be directly integrated with most clustering and visualization methods to enable more scientific findings.

Full text: 1 Collection: 01-internacional Database: MEDLINE Language: En Journal: Res Sq Year: 2023 Document type: Article