Enabling personalised disease diagnosis by combining a patient's time-specific gene expression profile with a biomedical knowledge base.
Verma, Ghanshyam; Rebholz-Schuhmann, Dietrich; Madden, Michael G.
Affiliation
  • Verma G; Insight Centre for Data Analytics, School of Computer Science, University of Galway, Galway, Ireland. ghanshyam.verma@insight-centre.org.
  • Rebholz-Schuhmann D; School of Computer Science, University of Galway, Galway, Ireland. ghanshyam.verma@insight-centre.org.
  • Madden MG; ZB MED - Information Centre for Life Sciences, University of Cologne, Cologne, Germany.
BMC Bioinformatics ; 25(1): 62, 2024 Feb 07.
Article in English | MEDLINE | ID: mdl-38326757
ABSTRACT

BACKGROUND:

Recent developments in biomedical knowledge bases (KBs) open up new ways to exploit the biomedical knowledge they contain. Significant work has been done on creating and completing biomedical KBs, specifically those containing gene-disease associations and other related entities. However, the use of such KBs in combination with patients' temporal clinical data remains largely unexplored, despite its potential to greatly benefit medical diagnostic decision support systems.

RESULTS:

We propose two new algorithms, LOADDx and SCADDx, that combine a patient's gene expression data with gene-disease associations and other related information available in a KB to assist personalized disease diagnosis. We tested both algorithms on two KBs and on four real-world gene expression datasets of respiratory viral infection caused by influenza-like viruses of 19 subtypes. We also compared their performance with that of five existing state-of-the-art machine learning algorithms (k-NN, Random Forest, XGBoost, Linear SVM, and SVM with RBF kernel) using two validation approaches: leave-one-out cross-validation (LOOCV) and a single internal validation set. Both SCADDx and LOADDx outperform the existing algorithms under both validation approaches. SCADDx detects infections with up to 100% accuracy on Datasets 2 and 3. Overall, across all four datasets, SCADDx and LOADDx detect an infection within 72 h of infection with average accuracies of 91.38% and 92.66%, respectively, whereas XGBoost, the best-performing of the existing machine learning algorithms, achieves only 86.43% accuracy on average.
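
As an illustration of the LOOCV protocol mentioned above, the following minimal sketch holds out each sample once and scores a classifier on it. It is not the authors' evaluation code: the function names, the 1-nearest-neighbour stand-in classifier, and the toy two-feature samples are all assumptions made for this example and do not come from the study's datasets.

```python
def loocv_accuracy(samples, labels, predict):
    """Leave-one-out cross-validation: hold out each sample exactly once."""
    correct = 0
    for i in range(len(samples)):
        # Train on every sample except the i-th one.
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


def nearest_neighbour(train_x, train_y, query):
    """1-NN with squared Euclidean distance (a stand-in classifier)."""
    distances = [sum((a - b) ** 2 for a, b in zip(x, query)) for x in train_x]
    return train_y[distances.index(min(distances))]


# Toy, made-up expression values (hypothetical; two features per sample).
samples = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3],
           [2.9, 3.1], [3.2, 2.8], [3.0, 3.0]]
labels = ["healthy", "healthy", "healthy",
          "infected", "infected", "infected"]

accuracy = loocv_accuracy(samples, labels, nearest_neighbour)
print(f"LOOCV accuracy: {accuracy:.2%}")
```

Because every sample serves as the test case once, LOOCV uses the data efficiently when datasets are small, at the cost of training the model once per sample.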

CONCLUSIONS:

We demonstrate how our novel idea of using the most and least differentially expressed genes in combination with a KB can enable identification of the diseases that a patient is most likely to have at a particular time, from a KB with thousands of diseases. Moreover, the proposed algorithms can provide a short ranked list of the most likely diseases for each patient along with their most affected genes, and other entities linked with them in the KB, which can support health care professionals in their decision-making.
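
The general idea of matching a patient's extreme genes against a KB to obtain a short ranked list of diseases can be sketched as follows. This is not LOADDx or SCADDx, whose details the abstract does not give. The sketch assumes, purely for illustration, that the selected genes are those at the two ends of a signed expression-change score and that diseases are ranked by a simple count of linked genes. The function name, gene names, scores, and the toy KB are all hypothetical.

```python
from collections import Counter


def rank_diseases(expression_change, gene_disease_kb, n_genes=2, top_k=3):
    """Rank diseases by how many of the patient's extreme genes they link to.

    expression_change: dict mapping gene -> signed change score at one time
        point (assumed input format, not taken from the paper).
    gene_disease_kb: dict mapping gene -> set of associated diseases (toy KB).
    """
    ordered = sorted(expression_change, key=expression_change.get)
    # Assumption: take the n_genes lowest-scoring and n_genes highest-scoring genes.
    selected = set(ordered[:n_genes]) | set(ordered[-n_genes:])
    votes = Counter()
    for gene in selected:
        for disease in gene_disease_kb.get(gene, ()):
            votes[disease] += 1
    # Sort by vote count (descending), then by name for a deterministic order.
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


# Hypothetical patient profile and toy KB (made-up values).
patient = {"GENE_A": 3.1, "GENE_B": 2.4, "GENE_C": 0.1,
           "GENE_D": -0.2, "GENE_E": -2.8, "GENE_F": -1.9}
kb = {
    "GENE_A": {"influenza", "asthma"},
    "GENE_B": {"influenza"},
    "GENE_C": {"diabetes"},
    "GENE_E": {"influenza", "diabetes"},
    "GENE_F": {"asthma"},
}

ranking = rank_diseases(patient, kb)
print(ranking)
```

A real KB-based approach would likely weight associations and use the other linked entities as well; the plain vote count here only shows how a ranked shortlist can fall out of a gene-to-disease lookup.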

Full text: 1 | Collection: 01-international | Database: MEDLINE | Main subject: Knowledge Bases / Transcriptome | Type of study: Diagnostic_studies / Prognostic_studies | Limits: Humans | Language: English | Journal: BMC Bioinformatics | Journal subject: MEDICAL INFORMATICS | Year: 2024 | Document type: Article | Affiliation country: Ireland | Country of publication: United Kingdom