Application of Transcriptome-Based Gene Set Featurization for Machine Learning Model to Predict the Origin of Metastatic Cancer.
Jeong, Yeonuk; Chu, Jinah; Kang, Juwon; Baek, Seungjun; Lee, Jae-Hak; Jung, Dong-Sub; Kim, Won-Woo; Kim, Yi-Rang; Kang, Jihoon; Do, In-Gu.
Affiliations
  • Jeong Y; Oncocross Ltd., Seoul 04168, Republic of Korea.
  • Chu J; Department of Pathology, Kangbuk Samsung Hospital, Sungkyunkwan University School of Medicine, Seoul 03181, Republic of Korea.
  • Kang J; Oncocross Ltd., Seoul 04168, Republic of Korea.
  • Baek S; Yonsei Institute of Pharmaceutical Sciences, College of Pharmacy, Yonsei University, Incheon 21983, Republic of Korea.
  • Lee JH; Oncocross Ltd., Seoul 04168, Republic of Korea.
  • Jung DS; Oncocross Ltd., Seoul 04168, Republic of Korea.
  • Kim WW; Oncocross Ltd., Seoul 04168, Republic of Korea.
  • Kim YR; Oncocross Ltd., Seoul 04168, Republic of Korea.
  • Kang J; Oncocross Ltd., Seoul 04168, Republic of Korea.
  • Do IG; Oncocross Ltd., Seoul 04168, Republic of Korea.
Curr Issues Mol Biol ; 46(7): 7291-7302, 2024 Jul 09.
Article in En | MEDLINE | ID: mdl-39057073
ABSTRACT
Identifying the primary site of origin of metastatic cancer is vital for guiding treatment decisions, especially for patients with cancer of unknown primary (CUP). Despite advanced diagnostic techniques, CUP remains difficult to pinpoint and is responsible for a considerable number of cancer-related fatalities. Understanding its origin is crucial for effective management and potentially improving patient outcomes. This study introduces a machine learning framework, ONCOfind-AI, that leverages transcriptome-based gene set features to enhance the accuracy of predicting the origin of metastatic cancers. We demonstrate its potential to facilitate the integration of RNA sequencing and microarray data by using gene set scores for characterization of transcriptome profiles generated from different platforms. Integrating data from different platforms resulted in improved accuracy of machine learning models for predicting cancer origins. We validated our method using external data from clinical samples collected through the Kangbuk Samsung Medical Center and Gene Expression Omnibus. The external validation results demonstrate a top-1 accuracy ranging from 0.80 to 0.86, with a top-2 accuracy of 0.90. This study highlights that incorporating biological knowledge through curated gene sets can help to merge gene expression data from different platforms, thereby enhancing the compatibility needed to develop more effective machine learning prediction models.
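The abstract does not say how the gene set scores are computed or how top-1 and top-2 accuracy are evaluated. The following is a minimal sketch under stated assumptions, not the ONCOfind-AI implementation: it scores each gene set as the mean within-sample rank of its member genes (one common way to obtain features that do not depend on a platform's expression scale) and counts a prediction as a top-k hit when the true origin is among the k highest-scoring classes. The function names, gene names, and toy values are hypothetical.

```python
import numpy as np

def rank_normalize(expr):
    """Replace each sample's (row's) values with ranks scaled to (0, 1].

    Ranks depend only on the ordering of genes within a sample, so any
    monotone rescaling of the platform's values leaves them unchanged.
    Ties are broken arbitrarily in this sketch.
    """
    ranks = expr.argsort(axis=1).argsort(axis=1) + 1  # 1 = lowest value
    return ranks / expr.shape[1]

def gene_set_scores(expr, genes, gene_sets):
    """Assumed scoring rule: mean scaled rank of each gene set's members.

    expr: (n_samples, n_genes) array; genes: column names;
    gene_sets: dict mapping set name -> list of gene names.
    Returns an (n_samples, n_gene_sets) feature matrix.
    """
    index = {g: i for i, g in enumerate(genes)}
    ranked = rank_normalize(expr)
    columns = []
    for name, members in gene_sets.items():
        idx = [index[g] for g in members if g in index]
        if not idx:
            raise ValueError(f"no genes of set {name!r} found in data")
        columns.append(ranked[:, idx].mean(axis=1))
    return np.column_stack(columns)

def top_k_accuracy(probs, labels, k):
    """Fraction of samples whose true class is among the k highest scores."""
    top_k = np.argsort(probs, axis=1)[:, -k:]
    return (top_k == labels[:, None]).any(axis=1).mean()

# Toy data (hypothetical): 2 samples, 4 genes, 2 gene sets.
genes = ["A", "B", "C", "D"]
gene_sets = {"set1": ["A", "D"], "set2": ["B", "C"]}
expr = np.array([[5.0, 1.0, 3.0, 9.0],
                 [2.0, 8.0, 4.0, 6.0]])
scores = gene_set_scores(expr, genes, gene_sets)

# Toy classifier output (hypothetical): 3 samples, 3 candidate origins.
probs = np.array([[0.6, 0.3, 0.1],
                  [0.2, 0.5, 0.3],
                  [0.1, 0.2, 0.7]])
labels = np.array([0, 2, 2])
top1 = top_k_accuracy(probs, labels, k=1)
top2 = top_k_accuracy(probs, labels, k=2)
```

Because the features are built from ranks, applying a log transform to the toy matrix yields the same scores, which illustrates why such features could be shared between RNA sequencing and microarray profiles. In the toy classifier output, the second sample is missed at top-1 but recovered at top-2.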
Full text: 1 Database: MEDLINE Language: En Journal: Curr Issues Mol Biol Journal subject: MOLECULAR BIOLOGY Publication year: 2024 Document type: Article
