Systematic evaluation with practical guidelines for single-cell and spatially resolved transcriptomics data simulation under multiple scenarios.
Duo, Hongrui; Li, Yinghong; Lan, Yang; Tao, Jingxin; Yang, Qingxia; Xiao, Yingxue; Sun, Jing; Li, Lei; Nie, Xiner; Zhang, Xiaoxi; Liang, Guizhao; Liu, Mingwei; Hao, Youjin; Li, Bo.
Affiliation
  • Duo H; College of Life Sciences, Chongqing Normal University, Chongqing, 401331, People's Republic of China.
  • Li Y; Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, People's Republic of China.
  • Lan Y; Institute of Pathology and Southwest Cancer Center, Southwest Hospital, Army Medical University, Chongqing, 400038, People's Republic of China.
  • Tao J; College of Life Sciences, Chongqing Normal University, Chongqing, 401331, People's Republic of China.
  • Yang Q; Zhejiang Provincial Key Laboratory of Precision Diagnosis and Therapy for Major Gynecological Diseases, Women's Hospital, Zhejiang University School of Medicine, Hangzhou, 310058, People's Republic of China.
  • Xiao Y; College of Life Sciences, Chongqing Normal University, Chongqing, 401331, People's Republic of China.
  • Sun J; College of Life Sciences, Chongqing Normal University, Chongqing, 401331, People's Republic of China.
  • Li L; College of Life Sciences, Chongqing Normal University, Chongqing, 401331, People's Republic of China.
  • Nie X; Key Laboratory of Biorheological Science and Technology, Ministry of Education, Bioengineering College, Chongqing University, Chongqing, 400044, People's Republic of China.
  • Zhang X; College of Life Sciences, Chongqing Normal University, Chongqing, 401331, People's Republic of China.
  • Liang G; Key Laboratory of Biorheological Science and Technology, Ministry of Education, Bioengineering College, Chongqing University, Chongqing, 400044, People's Republic of China.
  • Liu M; Key Laboratory of Clinical Laboratory Diagnostics, College of Laboratory Medicine, Chongqing Medical University, Chongqing, 400016, People's Republic of China.
  • Hao Y; College of Life Sciences, Chongqing Normal University, Chongqing, 401331, People's Republic of China. haoyoujin@hotmail.com.
  • Li B; College of Life Sciences, Chongqing Normal University, Chongqing, 401331, People's Republic of China. libcell@cqnu.edu.cn.
Genome Biol; 25(1): 145, 2024 06 03.
Article in En | MEDLINE | ID: mdl-38831386
ABSTRACT

BACKGROUND:

Single-cell RNA sequencing (scRNA-seq) and spatially resolved transcriptomics (SRT) have led to groundbreaking advancements in life sciences. Data simulation is widely used to develop bioinformatics tools for scRNA-seq and SRT data and to perform unbiased benchmarks, because it provides explicit ground truth and generates customized datasets. However, the performance of simulation methods under multiple scenarios has not been comprehensively assessed, and without practical guidelines it is challenging to choose suitable methods.

RESULTS:

We systematically evaluated 49 simulation methods developed for scRNA-seq and/or SRT data in terms of accuracy, functionality, scalability, and usability using 152 reference datasets derived from 24 platforms. SRTsim, scDesign3, ZINB-WaVE, and scDesign2 achieve the best accuracy across various platforms. Unexpectedly, some methods tailored to scRNA-seq data have potential compatibility for simulating SRT data. Lun, SPARSim, and scDesign3-tree outperform other methods under corresponding simulation scenarios. Phenopath, Lun, Simple, and MFA yield high scalability scores but cannot generate realistic simulated data. Users should consider the trade-offs between method accuracy and scalability (or functionality) when making decisions. Additionally, execution errors are mainly caused by failed parameter estimations and the appearance of missing or infinite values in calculations. We provide practical guidelines for method selection, a standard pipeline Simpipe ( https://github.com/duohongrui/simpipe ; https://doi.org/10.5281/zenodo.11178409 ), and an online tool Simsite ( https://www.ciblab.net/software/simshiny/ ) for data simulation.

CONCLUSIONS:

No method performs best on all criteria; a good-yet-not-the-best method is therefore recommended if it solves problems effectively and reasonably. Our comprehensive work provides crucial insights for developers on modeling gene expression data and fosters the simulation process for users.

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Gene Expression Profiling / Single-Cell Analysis Limits: Humans Language: En Journal: Genome Biol Journal subject: MOLECULAR BIOLOGY / GENETICS Year: 2024 Document type: Article