Occlusion enhanced pan-cancer classification via deep learning.
Zhao, Xing; Chen, Zigui; Wang, Huating; Sun, Hao.
Affiliation
  • Zhao X; Department of Orthopaedics and Traumatology, The Chinese University of Hong Kong, Hong Kong, People's Republic of China.
  • Chen Z; Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, Guangdong, People's Republic of China.
  • Wang H; Department of Microbiology, The Chinese University of Hong Kong, Hong Kong, People's Republic of China.
  • Sun H; Department of Orthopaedics and Traumatology, Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Hong Kong, People's Republic of China.
BMC Bioinformatics ; 25(1): 260, 2024 Aug 08.
Article in En | MEDLINE | ID: mdl-39118043
ABSTRACT
Quantitative measurement of RNA expression levels through RNA-Seq is an ideal replacement for conventional cancer diagnosis via microscope examination. Currently, cancer-related RNA-Seq studies focus on two aspects: classifying the status and tissue of origin of a sample, and discovering marker genes. Existing studies typically identify marker genes by statistically comparing healthy and cancer samples. However, this approach overlooks marker genes with low expression level differences and may be influenced by experimental results. This paper introduces "GENESO," a novel framework for pan-cancer classification and marker gene discovery using the occlusion method in conjunction with deep learning. We first train a baseline deep LSTM neural network capable of distinguishing the origins and statuses of samples from RNA-Seq data. We then propose a novel marker gene discovery method called "Symmetrical Occlusion (SO)". It works with the baseline LSTM network, mimicking the "gain of function" and "loss of function" of genes to quantitatively evaluate their importance in pan-cancer classification. We then isolate the most important genes and use them to train new neural networks, resulting in higher-performance LSTM models that use only a reduced set of highly relevant genes. The baseline neural network achieves a validation accuracy of 96.59% in pan-cancer classification. With the help of SO, the accuracy of the second network reaches 98.30% while using 67% fewer genes. Notably, our method excels at identifying marker genes that are not differentially expressed. Moreover, we assessed the feasibility of our method on single-cell RNA-Seq data, using known marker genes as a validation test.
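The abstract does not spell out how Symmetrical Occlusion is computed, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that "loss of function" can be imitated by clamping a gene to its lowest observed expression and "gain of function" by clamping it to its highest, and that importance is the average change in the model's probability for the true class. The function name `symmetric_occlusion_scores`, the clamping values, and the toy softmax classifier standing in for the LSTM are all hypothetical.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def symmetric_occlusion_scores(predict_proba, X, y):
    """Score each gene by how much occluding it changes the true-class probability.

    Assumption (not from the paper): "loss" sets the gene to its column minimum,
    "gain" sets it to its column maximum, and the two absolute changes are averaged.
    """
    n_samples, n_genes = X.shape
    low, high = X.min(axis=0), X.max(axis=0)
    rows = np.arange(n_samples)
    base = predict_proba(X)[rows, y]  # unoccluded true-class probability
    scores = np.zeros(n_genes)
    for g in range(n_genes):
        X_loss = X.copy()
        X_loss[:, g] = low[g]   # mimic loss of function
        X_gain = X.copy()
        X_gain[:, g] = high[g]  # mimic gain of function
        d_loss = np.abs(base - predict_proba(X_loss)[rows, y])
        d_gain = np.abs(base - predict_proba(X_gain)[rows, y])
        scores[g] = np.mean(d_loss + d_gain) / 2.0
    return scores


# Toy stand-in for a trained classifier: only genes 0 and 1 influence the output.
rng = np.random.default_rng(0)
X = rng.random((50, 5))           # 50 samples, 5 genes (synthetic data)
W = np.zeros((5, 3))              # 3 hypothetical classes
W[0] = [4.0, -4.0, 0.0]
W[1] = [0.0, 3.0, -3.0]


def predict(X_in):
    return softmax(X_in @ W)


y = predict(X).argmax(axis=1)     # use the model's own labels for the demo
scores = symmetric_occlusion_scores(predict, X, y)
top_genes = np.argsort(scores)[::-1]  # most important genes first
```

In this toy setup the genes with zero weight receive a score of exactly zero, so the ranking in `top_genes` puts genes 0 and 1 first. A reduced gene set for retraining, as described above, would be taken from the head of such a ranking.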

Full text: 1 | Collection: 01-internacional | Database: MEDLINE | Main subject: Deep Learning / Neoplasms | Limits: Humans | Language: En | Journal: BMC Bioinformatics | Journal subject: Medical Informatics | Year: 2024 | Document type: Article
