SinCWIm: An imputation method for single-cell RNA sequence dropouts using weighted alternating least squares.
Gong, Lejun; Cui, Xiong; Liu, Yang; Lin, Cai; Gao, Zhihong.
Affiliation
  • Gong L; Jiangsu Key Lab of Big Data Security & Intelligent Processing, School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, China. Electronic address: glj98226@njupt.edu.cn.
  • Cui X; Jiangsu Key Lab of Big Data Security & Intelligent Processing, School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, China.
  • Liu Y; Jiangsu Key Lab of Big Data Security & Intelligent Processing, School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, China.
  • Lin C; Department of Burn, Wound Repair and Regenerative Medicine Center, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, 325000, China. Electronic address: 13025092850@163.com.
  • Gao Z; Zhejiang Engineering Research Center of Intelligent Medicine, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325000, China. Electronic address: gzh@wzhospital.cn.
Comput Biol Med ; 171: 108225, 2024 Mar.
Article in En | MEDLINE | ID: mdl-38442556
ABSTRACT
BACKGROUND AND OBJECTIVES:

Single-cell RNA sequencing (scRNA-seq) is a powerful tool for exploring cellular heterogeneity, discovering novel or rare cell types, distinguishing tissue-specific cellular compositions, and understanding cell differentiation during development. However, due to technological limitations, dropout events in scRNA-seq can mistakenly set some truly nonzero entries of the expression data to zero. This introduces noise into the cell-by-gene expression matrix and degrades the performance of downstream analyses, including clustering, cell annotation, and differential gene expression analysis. It is therefore crucial to determine accurately which zeros are caused by dropout events and to impute them.

METHODS:

Because different zeros in the gene expression matrix carry different levels of confidence, this paper proposes SinCWIm, an imputation method for dropout events in scRNA-seq based on weighted alternating least squares (WALS). The method uses the Pearson correlation coefficient and hierarchical clustering to quantify the confidence of each zero entry. These confidence values are then combined with WALS to perform matrix decomposition. Finally, outlier removal and data correction bring the imputed values closer to the actual expression values.
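The weighted factorization step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `wals` and `weighted_loss`, the rank, the regularization strength, and the fixed weight of 0.1 assigned to zero entries are all assumptions made for illustration. In SinCWIm the weights would instead come from the Pearson correlation and hierarchical clustering step, which is not reproduced here.

```python
import numpy as np

def wals(X, W, rank=2, reg=0.1, n_iters=20, seed=0):
    """Weighted ALS: find U, V minimizing
    sum_ij W_ij * (X_ij - U_i . V_j)^2 + reg * (||U||^2 + ||V||^2)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    U = rng.normal(scale=0.1, size=(n, rank))
    V = rng.normal(scale=0.1, size=(m, rank))
    ridge = reg * np.eye(rank)
    for _ in range(n_iters):
        # Fix V, solve one weighted ridge regression per row of X.
        for i in range(n):
            VW = V * W[i][:, None]
            U[i] = np.linalg.solve(VW.T @ V + ridge, VW.T @ X[i])
        # Fix U, solve one weighted ridge regression per column of X.
        for j in range(m):
            UW = U * W[:, j][:, None]
            V[j] = np.linalg.solve(UW.T @ U + ridge, UW.T @ X[:, j])
    return U, V

def weighted_loss(X, W, U, V, reg):
    """Objective value that wals() decreases at every half-step."""
    residual = X - U @ V.T
    return float(np.sum(W * residual**2) + reg * (np.sum(U**2) + np.sum(V**2)))

# Toy data (hypothetical): a low-rank "expression" matrix with simulated dropouts.
rng = np.random.default_rng(1)
truth = rng.gamma(2.0, 1.0, size=(6, 2)) @ rng.gamma(2.0, 1.0, size=(2, 8))
observed = truth.copy()
observed[rng.random(truth.shape) < 0.3] = 0.0

# Assumption: nonzero entries are fully trusted, zeros get a low confidence weight.
weights = np.where(observed > 0, 1.0, 0.1)

U, V = wals(observed, weights, rank=2)
# Keep observed values; fill zeros with the non-negative low-rank estimate.
imputed = np.where(observed > 0, observed, np.clip(U @ V.T, 0.0, None))
```

Down-weighting the zeros lets the factorization be driven mainly by the trusted nonzero entries, so the low-rank estimate at a suspected dropout is not pulled strongly toward zero.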

RESULTS:

Eight single-cell sequencing datasets were used in comparative experiments to demonstrate the overall superiority of SinCWIm over state-of-the-art models. Clustering the SinCWIm-imputed data yielded adjusted Rand index (ARI) scores of 94.46%, 96.48% and 76.74% on the Usoskin, Pollen and Bladder datasets, respectively. In addition, SinCWIm achieved significant improvements in the retention of differentially expressed genes and in visualization.
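For readers unfamiliar with the clustering metric reported above, the following is a minimal sketch of how an adjusted Rand index can be computed from two labelings using only the Python standard library. The function name `adjusted_rand_index` and the toy labels are illustrative assumptions and are unrelated to the datasets in the paper.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    n = len(labels_true)
    # Pair counts from the contingency table and its row/column marginals.
    pairs_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    pairs_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    pairs_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # zero or one item: labelings trivially agree
    expected = pairs_true * pairs_pred / total_pairs
    maximum = 0.5 * (pairs_true + pairs_pred)
    if maximum == expected:
        return 1.0  # degenerate case, e.g. every item in a single cluster
    return (pairs_both - expected) / (maximum - expected)

# Hypothetical cell-type annotations versus cluster assignments.
reference = [0, 0, 1, 1]
clusters = [0, 0, 1, 2]
score = adjusted_rand_index(reference, clusters)
```

The score is 1 when the two partitions are identical up to relabeling and close to 0 when agreement is at the level expected by chance, which is why it is commonly reported as a percentage for clustering results.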

CONCLUSIONS:

SinCWIm provides a valuable imputation method for handling dropout events in single-cell sequencing data. In comparison to advanced methods, SinCWIm demonstrates excellent performance in clustering, visualization and other aspects. It is applicable to various single-cell sequencing datasets.

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Gene Expression Profiling / Single-Cell Analysis Language: En Journal: Comput Biol Med / Comput. biol. med / Computers in biology and medicine Year: 2024 Document type: Article Country of publication: United States