1.
Comput Biol Med ; 174: 108398, 2024 May.
Article in English | MEDLINE | ID: mdl-38608322

ABSTRACT

The recurrence of low-stage lung cancer poses a challenge due to its unpredictable nature and diverse patient responses to treatments. Personalized care and patient outcomes rely heavily on early relapse identification, yet current predictive models, despite their potential, lack comprehensive genetic data. This gap motivates our research focus: integrating specific genetic information, such as pathway scores, into clinical data. Our aim is to refine machine learning models for more precise relapse prediction in early-stage non-small cell lung cancer. To address the scarcity of genetic data, we employ imputation techniques, leveraging publicly available datasets such as The Cancer Genome Atlas (TCGA) to integrate pathway scores into our patient cohort from the Cancer Long Survivor Artificial Intelligence Follow-up (CLARIFY) project. By integrating imputed pathway scores from the TCGA dataset with clinical data, our approach improves relapse prediction on a held-out test set of 200 patients. By training machine learning models on enriched knowledge graph data, including triples derived from pathway score imputation, we achieve a precision of 82% and a specificity of 91%. These outcomes highlight the potential of our models as supplementary tools within tumour, node, and metastasis (TNM) classification systems, offering improved prognostic capabilities for lung cancer patients. In summary, our research underscores the significance of refining machine learning models for relapse prediction in early-stage non-small cell lung cancer. Our approach, centered on imputing pathway scores and integrating them with clinical data, not only enhances predictive performance but also demonstrates the promising role of machine learning in anticipating relapse and ultimately improving patient outcomes.


Subjects
Carcinoma, Non-Small-Cell Lung; Genomics; Lung Neoplasms; Machine Learning; Humans; Lung Neoplasms/genetics; Carcinoma, Non-Small-Cell Lung/genetics; Genomics/methods; Neoplasm Recurrence, Local/genetics; Female; Male; Databases, Genetic
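As a reading aid for the two metrics reported in the abstract above, the following minimal sketch shows how precision and specificity are computed from a binary confusion matrix. The counts are hypothetical: they were chosen only so that they sum to a 200-patient test set and are not taken from the paper.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted relapses that were true relapses."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def specificity(tn: int, fp: int) -> float:
    """Fraction of non-relapsing patients correctly predicted as non-relapsing."""
    return tn / (tn + fp) if (tn + fp) else 0.0


# Hypothetical confusion-matrix counts (illustrative only, not from the paper).
tp, fp, tn, fn = 40, 10, 130, 20

print(f"precision   = {precision(tp, fp):.2f}")
print(f"specificity = {specificity(tn, fp):.2f}")
```

Both metrics penalize false positives, which is presumably why they matter for a tool meant to supplement TNM staging: a high value in each means few patients are wrongly flagged as likely to relapse.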
2.
JCO Clin Cancer Inform ; 7: e2200062, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37428988

ABSTRACT

PURPOSE: Stratifying patients with cancer according to risk of relapse can personalize their care. In this work, we provide an answer to the following research question: How to use machine learning to estimate probability of relapse in patients with early-stage non-small-cell lung cancer (NSCLC)? MATERIALS AND METHODS: For predicting relapse in 1,387 patients with early-stage (I-II) NSCLC from the Spanish Lung Cancer Group data (average age 65.7 years, female 24.8%, male 75.2%), we train tabular and graph machine learning models. We generate automatic explanations for the predictions of such models. For models trained on tabular data, we adopt SHapley Additive exPlanations local explanations to gauge how each patient feature contributes to the predicted outcome. We explain graph machine learning predictions with an example-based method that highlights influential past patients. RESULTS: Machine learning models trained on tabular data exhibit a 76% accuracy for the random forest model at predicting relapse evaluated with a 10-fold cross-validation (the model was trained 10 times with different independent sets of patients in test, train, and validation sets, and the reported metrics are averaged over these 10 test sets). Graph machine learning reaches 68% accuracy over a held-out test set of 200 patients, calibrated on a held-out set of 100 patients. CONCLUSION: Our results show that machine learning models trained on tabular and graph data can enable objective, personalized, and reproducible prediction of relapse and, therefore, disease outcome in patients with early-stage NSCLC. With further prospective and multisite validation, and additional radiological and molecular data, this prognostic model could potentially serve as a predictive decision support tool for deciding the use of adjuvant treatments in early-stage lung cancer.


Subjects
Carcinoma, Non-Small-Cell Lung; Lung Neoplasms; Humans; Male; Female; Aged; Carcinoma, Non-Small-Cell Lung/diagnosis; Carcinoma, Non-Small-Cell Lung/therapy; Lung Neoplasms/diagnosis; Lung Neoplasms/therapy; Neoplasm Recurrence, Local/diagnosis; Machine Learning; Prognosis
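The 10-fold cross-validation described in the abstract above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline: the labels are synthetic, a majority-class predictor stands in for the random forest, and only train and test splits are shown (the abstract also mentions a validation set). The function names are hypothetical.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices once and split them into k near-equal, disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def majority_class(labels):
    """Placeholder 'model': always predict the most frequent training label."""
    return max(set(labels), key=labels.count)


def cross_validate(labels, k=10, seed=0):
    """Return the per-fold test accuracies of the placeholder model."""
    folds = k_fold_indices(len(labels), k, seed)
    accuracies = []
    for test_fold in folds:
        test_set = set(test_fold)
        # Train on every sample that is not in the current test fold.
        train_labels = [y for i, y in enumerate(labels) if i not in test_set]
        prediction = majority_class(train_labels)
        correct = sum(1 for i in test_fold if labels[i] == prediction)
        accuracies.append(correct / len(test_fold))
    return accuracies


# Synthetic labels (1 = relapse, 0 = no relapse); not patient data.
labels = [1] * 30 + [0] * 70
fold_accuracies = cross_validate(labels, k=10)
print(f"mean accuracy over {len(fold_accuracies)} folds: {mean(fold_accuracies):.2f}")
```

Each patient appears in exactly one test fold, so the averaged figure reflects performance on data the model did not see during training in that round.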
3.
J Biomed Inform ; 144: 104424, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37352900

ABSTRACT

OBJECTIVE: Lung cancer exhibits unpredictable recurrence in low-stage tumors and variable responses to different therapeutic interventions. Predicting relapse in early-stage lung cancer can facilitate precision medicine and improve patient survivability. While existing machine learning models rely on clinical data, incorporating genomic information could enhance their efficiency. This study aims to impute and integrate specific types of genomic data with clinical data to improve the accuracy of machine learning models for predicting relapse in early-stage, non-small cell lung cancer patients. METHODS: The study utilized a publicly available TCGA lung cancer cohort and imputed genetic pathway scores into the Spanish Lung Cancer Group (SLCG) data, specifically in 1348 early-stage patients. Initially, tumor recurrence was predicted without imputed pathway scores. Subsequently, the SLCG data were augmented with pathway scores imputed from TCGA. The integrative approach aimed to enhance relapse risk prediction performance. RESULTS: The integrative approach achieved improved relapse risk prediction with the following evaluation metrics: an area under the precision-recall curve (PR-AUC) score of 0.75, an area under the ROC (ROC-AUC) score of 0.80, an F1 score of 0.61, and a Precision of 0.80. The prediction explanation model SHAP (SHapley Additive exPlanations) was employed to explain the machine learning model's predictions. CONCLUSION: We conclude that our explainable predictive model is a promising tool for oncologists that addresses an unmet clinical need of post-treatment patient stratification based on the relapse risk while also improving the predictive power by incorporating proxy genomic data not available for specific patients.


Subjects
Carcinoma, Non-Small-Cell Lung; Lung Neoplasms; Small Cell Lung Carcinoma; Humans; Carcinoma, Non-Small-Cell Lung/diagnosis; Carcinoma, Non-Small-Cell Lung/genetics; Lung Neoplasms/diagnosis; Lung Neoplasms/genetics; Neoplasm Recurrence, Local/genetics; Lung
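The abstract above reports an F1 score and a precision but no recall. Since F1 is the harmonic mean of precision and recall, the recall can be estimated by inverting the formula. The sketch below does this as a back-of-the-envelope check; it assumes that both reported values come from the same test set and decision threshold, which the abstract does not state.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def recall_from_f1(precision: float, f1: float) -> float:
    """Invert F1 = 2PR / (P + R) for R, given P and F1."""
    denominator = 2 * precision - f1
    if denominator <= 0:
        raise ValueError("F1 must be smaller than 2 * precision")
    return f1 * precision / denominator


# Values reported in the abstract: precision 0.80, F1 0.61.
implied_recall = recall_from_f1(precision=0.80, f1=0.61)
print(f"implied recall = {implied_recall:.2f}")
```

Under that assumption the implied recall is roughly 0.49, which would mean the model is conservative: most of its relapse predictions are correct, but about half of the actual relapses are missed.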
4.
IEEE Trans Nanobioscience ; 22(4): 781-788, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37167037

ABSTRACT

This work is motivated by the scarcity of tools for accurate, unsupervised information extraction from unstructured clinical notes in computationally underrepresented languages, such as Czech. We introduce a stepping stone to a broad array of downstream tasks such as summarisation or integration of individual patient records, extraction of structured information for national cancer registry reporting or building of semi-structured semantic patient representations that can be used for computing patient embeddings. More specifically, we present a method for unsupervised extraction of semantically-labeled textual segments from clinical notes and test it out on a dataset of Czech breast cancer patients, provided by Masaryk Memorial Cancer Institute (the largest Czech hospital specialising exclusively in oncology). Our goal was to extract, classify (i.e. label) and cluster segments of the free-text notes that correspond to specific clinical features (e.g., family background, comorbidities or toxicities). Finally, we propose a tool for computer-assisted semantic mapping of segment types to pre-defined ontologies and validate it on a downstream task of category-specific patient similarity. The presented results demonstrate the practical relevance of the proposed approach for building more sophisticated extraction and analytical pipelines deployed on Czech clinical notes.


Subjects
Breast Neoplasms; Semantics; Humans; Female; Information Storage and Retrieval; Cluster Analysis
5.
Cancers (Basel) ; 14(16), 2022 Aug 22.
Article in English | MEDLINE | ID: mdl-36011034

ABSTRACT

BACKGROUND: Artificial intelligence (AI) has contributed substantially in recent years to the resolution of different biomedical problems, including cancer. However, AI tools with significant and widespread impact in oncology remain scarce. The goal of this study is to present an AI-based tool for analyzing cancer patient data that assists clinicians in identifying the clinical factors associated with poor prognosis, relapse and survival, and to develop a prognostic model that stratifies patients by risk. MATERIALS AND METHODS: We used clinical data from 5275 patients diagnosed with non-small cell lung cancer, breast cancer, and non-Hodgkin lymphoma at Hospital Universitario Puerta de Hierro-Majadahonda. Accessible clinical parameters measured with a wearable device and quality-of-life questionnaire data were also collected. RESULTS: Using an AI tool, data from 5275 cancer patients were analyzed, integrating clinical data, questionnaire data, and data collected from wearable devices. Descriptive analyses were performed to explore the patients' characteristics, survival probabilities were calculated, and a prognostic model identified patients with low- and high-risk profiles. CONCLUSION: Overall, artificial intelligence made it possible to reconstruct the population's risk profile for the cancer-specific predictive model, which proved useful in clinical practice. It has potential application in clinical settings to improve risk stratification, early detection, and surveillance management of cancer patients.

6.
Brief Bioinform ; 22(2): 1679-1693, 2021 Mar 22.
Article in English | MEDLINE | ID: mdl-32065227

ABSTRACT

Complex biological systems are traditionally modelled as graphs of interconnected biological entities. These graphs, i.e. biological knowledge graphs, are then processed using graph exploratory approaches to perform different types of analytical and predictive tasks. Despite the high predictive accuracy of these approaches, they have limited scalability due to their dependency on time-consuming path exploratory procedures. In recent years, owing to the rapid advances of computational technologies, new approaches for modelling graphs and mining them with high accuracy and scalability have emerged. These approaches, i.e. knowledge graph embedding (KGE) models, operate by learning low-rank vector representations of graph nodes and edges that preserve the graph's inherent structure. These approaches were used to analyse knowledge graphs from different domains where they showed superior performance and accuracy compared to previous graph exploratory approaches. In this work, we study this class of models in the context of biological knowledge graphs and their different applications. We then show how KGE models can be a natural fit for representing complex biological knowledge modelled as graphs. We also discuss their predictive and analytical capabilities in different biology applications. In this regard, we present two example case studies that demonstrate the capabilities of KGE models: prediction of drug-target interactions and polypharmacy side effects. Finally, we analyse different practical considerations for KGEs, and we discuss possible opportunities and challenges related to adopting them for modelling biological systems.


Subjects
Computational Biology/methods; Neural Networks, Computer; Algorithms; Drug Interactions; Humans; Machine Learning
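To make the idea of scoring graph triples with learned vectors concrete, here is a minimal sketch using the DistMult scoring function, one well-known KGE model. DistMult is swapped in purely for illustration, since the abstract above does not single out a model. The embeddings are random and untrained, and the entity and relation names are hypothetical.

```python
import random

DIM = 8  # embedding size; an arbitrary choice for this sketch


def random_vector(rng, dim=DIM):
    """Draw an untrained embedding with components in [-1, 1]."""
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def distmult_score(subject, relation, obj):
    """DistMult: sum over dimensions of subject * relation * object."""
    return sum(s * r * o for s, r, o in zip(subject, relation, obj))


rng = random.Random(42)
# Hypothetical entities and relation; embeddings are random, not trained.
entities = {name: random_vector(rng) for name in ("drug_A", "protein_X", "protein_Y")}
relations = {"targets": random_vector(rng)}


def rank_objects(subject_name, relation_name):
    """Score every other entity as a candidate object and sort best-first."""
    s, r = entities[subject_name], relations[relation_name]
    scored = [(distmult_score(s, r, vec), name)
              for name, vec in entities.items() if name != subject_name]
    return sorted(scored, reverse=True)


for score, name in rank_objects("drug_A", "targets"):
    print(f"(drug_A, targets, {name}): {score:+.3f}")
```

In a trained model the vectors would be fitted so that known triples score higher than corrupted ones, and ranking candidates then reduces to cheap vector arithmetic rather than path exploration. One known limitation of DistMult is that its score is symmetric in subject and object, so it cannot distinguish the direction of a relation.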
7.
AMIA Jt Summits Transl Sci Proc ; 2020: 449-458, 2020.
Article in English | MEDLINE | ID: mdl-32477666

ABSTRACT

Polypharmacy, the use of drug combinations, is common in the treatment of complex and terminal diseases. Despite its effectiveness in many cases, it poses high risks of adverse side effects. Polypharmacy side effects occur due to unwanted interactions between combined drugs, and they can cause severe complications that increase the risks of morbidity and mortality. The use of polypharmacy is currently in its early stages; thus, knowledge of its probable side effects is limited. This has encouraged multiple works to investigate machine learning techniques to efficiently and reliably predict adverse effects of drug combinations. In this context, the Decagon model is known to provide state-of-the-art results. It models polypharmacy side-effect data as a knowledge graph and formulates finding possible adverse effects as a link prediction task over the knowledge graph. The link prediction is solved using an embedding model based on graph convolutions. Despite its effectiveness, the Decagon approach still suffers from a high rate of false positives. In this work, we propose a new knowledge graph embedding technique that uses multi-part embedding vectors to predict polypharmacy side effects. As in the Decagon model, we model polypharmacy side effects as a knowledge graph. However, we perform the link prediction task using an approach based on tensor decomposition. Our experimental evaluation shows that our approach outperforms the Decagon model by margins of 12% and 16% in terms of the area under the ROC and precision-recall curves, respectively.
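The comparison above is stated in terms of the area under the ROC curve. For link prediction, that area can be read as the probability that a randomly chosen true triple receives a higher score than a randomly chosen false one. The sketch below computes it directly from that definition; the scores are hypothetical and do not come from either model discussed in the abstract.

```python
def roc_auc(positive_scores, negative_scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical link-prediction scores for known (positive) and
# corrupted (negative) drug-drug side-effect triples.
positives = [0.9, 0.8, 0.4]
negatives = [0.7, 0.3, 0.2]
print(f"ROC-AUC = {roc_auc(positives, negatives):.3f}")
```

The pairwise form is quadratic in the number of scores, so it suits small illustrations; larger evaluations typically use a sort-based computation that gives the same value.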
