ABSTRACT
Objective: This study uses electronic health record (EHR) data to predict 12 common cancer symptoms and assesses the efficacy of machine learning (ML) models in identifying the factors that influence symptom development. Materials and Methods: We analyzed EHR data from 8156 adults diagnosed with cancer who underwent cancer treatment from 2017 to 2020. Structured and unstructured EHR data were sourced from the Enterprise Data Warehouse for Research at the University of Iowa Hospital and Clinics. Several predictive models, including logistic regression, random forest (RF), and XGBoost, were employed to forecast symptom development. Model performance was evaluated by F1-score and area under the curve (AUC) on the test set. The SHapley Additive exPlanations (SHAP) framework was used to interpret the models and identify predictive risk factors, with fatigue as an exemplar. Results: The RF model exhibited superior performance, with a macro-average AUC of 0.755 and an F1-score of 0.729 across the range of cancer-related symptoms; for pain prediction, it achieved an AUC of 0.954 and an F1-score of 0.914. Key predictive factors included clinical history, cancer characteristics, treatment modalities, and patient demographics, varying by symptom. For example, the odds of fatigue were significantly increased by allergy (odds ratio [OR] = 2.3, 95% CI: 1.8-2.9) and colitis (OR = 1.9, 95% CI: 1.5-2.4). Discussion: Our findings underscore the importance of integrating multimorbidity and patient characteristics when modeling cancer symptoms, revealing the considerable influence of chronic conditions beyond the cancer itself. Conclusion: We highlight the potential of ML for predicting cancer symptoms and suggest a pathway for integrating such models into clinical systems to enhance personalized care and symptom management.
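A minimal sketch, not the authors' code, of the workflow this abstract describes: fit a tree-ensemble classifier on structured EHR features, score it with F1 and AUC, and use SHAP plus a logistic model to surface predictors of one symptom (fatigue). The file name and column names are hypothetical placeholders; XGBoost, one of the models evaluated, is used here because its binary SHAP output is a single matrix, which keeps the example short.

    # Hypothetical sketch: symptom prediction from structured EHR features plus
    # SHAP-based interpretation, loosely following the abstract above.
    import numpy as np
    import pandas as pd
    import shap
    import statsmodels.api as sm
    from sklearn.metrics import f1_score, roc_auc_score
    from sklearn.model_selection import train_test_split
    from xgboost import XGBClassifier

    df = pd.read_csv("ehr_features.csv")          # hypothetical table: one row per patient
    X = df.drop(columns=["fatigue"])              # demographics, comorbidities, treatments
    y = df["fatigue"]                             # 1 = fatigue documented, 0 = not

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42
    )

    model = XGBClassifier(n_estimators=300, max_depth=4, eval_metric="logloss")
    model.fit(X_train, y_train)

    prob = model.predict_proba(X_test)[:, 1]
    pred = (prob >= 0.5).astype(int)
    print("F1:", f1_score(y_test, pred), "AUC:", roc_auc_score(y_test, prob))

    # SHAP attributions rank which features push the fatigue prediction up or down.
    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X_test)   # (n_samples, n_features) for binary XGBoost
    shap.summary_plot(shap_values, X_test)

    # Odds ratios such as those reported for allergy and colitis can come from a
    # logistic regression fit to the same features.
    logit = sm.Logit(y_train, sm.add_constant(X_train)).fit(disp=0)
    odds_ratios = np.exp(logit.params)
    ci = np.exp(logit.conf_int())
    print(pd.concat([odds_ratios.rename("OR"), ci], axis=1))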
ABSTRACT
PURPOSE: Identifying cancer symptoms in electronic health record (EHR) narratives is feasible with natural language processing (NLP). However, more efficient NLP systems are needed to detect a variety of symptoms and to distinguish observed symptoms from negated symptoms and medication-related side effects. We evaluated the accuracy of NLP in (1) detecting 14 symptom groups (ie, pain, fatigue, swelling, depressed mood, anxiety, nausea/vomiting, pruritus, headache, shortness of breath, constipation, numbness/tingling, decreased appetite, impaired memory, and disturbed sleep) and (2) distinguishing observed symptoms in EHR narratives among patients with cancer. METHODS: We extracted 902,508 notes for 11,784 unique patients diagnosed with cancer and developed a gold standard corpus of 1,112 notes labeled for the presence or absence of the 14 symptom groups. We trained an embeddings-augmented NLP system that integrates human and machine intelligence, along with conventional machine learning algorithms for comparison. NLP metrics were calculated on a held-out subset of the gold standard corpus reserved for testing. RESULTS: Interannotator agreement for labeling the gold standard corpus was excellent at 92%. The embeddings-augmented NLP model achieved the best performance (F1 score = 0.877). The highest accuracy was observed for pruritus (F1 score = 0.937) and the lowest for swelling (F1 score = 0.787). After classifying the entire data set with the embeddings-augmented NLP system, we found that 41% of the notes included symptom documentation. Pain was the most frequently documented symptom (29% of all notes), and impaired memory was the least documented (0.7% of all notes). CONCLUSION: We demonstrated the feasibility of detecting 14 symptom groups in EHR narratives and showed that an embeddings-augmented NLP system outperforms conventional machine learning algorithms in detecting symptom information and differentiating observed symptoms from negated symptoms and medication-related side effects.
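A toy sketch, not the authors' system, of the kind of conventional machine-learning baseline (TF-IDF bag of words plus logistic regression) against which an embeddings-augmented model is typically compared when flagging whether a note documents an observed symptom. The notes and labels below are invented examples.

    # Conventional bag-of-words baseline for per-note symptom detection.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    notes = [
        "Patient reports severe lower back pain.",
        "Denies nausea, vomiting, or pruritus today.",
        "Ongoing fatigue attributed to chemotherapy.",
        "No complaints of pain; sleeping well.",
    ]
    pain_observed = [1, 0, 0, 0]   # 1 = observed pain documented, 0 = absent or negated

    baseline = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2)),
        LogisticRegression(max_iter=1000),
    )
    baseline.fit(notes, pain_observed)
    print(baseline.predict_proba(["Reports persistent pain despite analgesics."])[:, 1])

    # A bag-of-words model cannot easily separate "pain" from "no ... pain" or from a
    # listed medication side effect; closing that gap is what the embeddings-augmented
    # system described above is designed to do.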
Subjects
Electronic Health Records, Natural Language Processing, Neoplasms, Humans, Neoplasms/diagnosis, Neoplasms/psychology, Female, Male, Algorithms, Machine Learning, Narration, Middle Aged
ABSTRACT
CONTEXT: Extracting cancer symptom documentation allows clinicians to develop highly individualized symptom prediction algorithms to deliver symptom management care. Leveraging advanced language models to detect symptom data in clinical narratives can significantly enhance this process. OBJECTIVE: This study uses a pretrained large language model to detect and extract cancer symptoms in clinical notes. METHODS: We developed a pretrained language model to identify cancer symptoms in clinical notes, based on a clinical corpus from the Enterprise Data Warehouse for Research at a healthcare system in the Midwestern United States. The study was conducted in four phases: (1) pretraining a Bio-Clinical BERT model on one million unlabeled clinical documents, (2) fine-tuning Symptom-BERT to detect 13 cancer symptom groups in 1112 annotated clinical notes, (3) generating 180 synthetic clinical notes with ChatGPT-4 for external validation, and (4) comparing the internal and external performance of Symptom-BERT against a non-pretrained version and six other BERT implementations. RESULTS: The Symptom-BERT model effectively detected cancer symptoms in clinical notes, achieving a micro-averaged F1-score of 0.933 and an AUC of 0.929 on internal validation, and 0.831 and 0.834, respectively, on external validation. Our analysis shows that physical symptoms, such as pruritus, are typically identified with higher performance than psychological symptoms, such as anxiety. CONCLUSION: This study underscores the potential of specialized pretraining on domain-specific data to boost the performance of language models for medical applications. The Symptom-BERT model's strong performance in detecting cancer symptoms marks a substantial step for patient-centered AI technologies, offering a promising path to improve symptom management and support patient self-care.
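A minimal sketch under stated assumptions, not the authors' Symptom-BERT code: loading a publicly available Bio-Clinical BERT checkpoint with a 13-way multi-label head of the kind the study fine-tunes on its annotated notes. The checkpoint name is an assumption (one public Bio-Clinical BERT release); the study's domain-pretrained weights and training setup differ.

    # Multi-label symptom-group scoring with a Bio-Clinical BERT backbone.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    MODEL_NAME = "emilyalsentzer/Bio_ClinicalBERT"   # public Bio-Clinical BERT checkpoint
    NUM_SYMPTOM_GROUPS = 13

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME,
        num_labels=NUM_SYMPTOM_GROUPS,
        problem_type="multi_label_classification",   # independent sigmoid per symptom group
    )

    note = "Patient reports worsening fatigue and intermittent nausea after chemotherapy."
    inputs = tokenizer(note, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits              # shape (1, 13)
    probs = torch.sigmoid(logits)[0]                 # one score per symptom group
    print([round(p.item(), 3) for p in probs])

    # The classification head here is randomly initialized; in the study it is
    # fine-tuned on the annotated notes (binary cross-entropy over the 13 groups)
    # before probabilities like these are meaningful.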
Subjects
Electronic Health Records, Neoplasms, Humans, Neoplasms/diagnosis, Neoplasms/therapy, Natural Language Processing, Algorithms
ABSTRACT
BACKGROUND: People with cancer frequently experience severe and distressing symptoms associated with cancer and its treatments. Predicting symptoms in patients with cancer continues to be a significant challenge for both clinicians and researchers. The rapid evolution of machine learning (ML) highlights the need for a current systematic review to improve cancer symptom prediction. OBJECTIVE: This systematic review aims to synthesize the literature that has used ML algorithms to predict the development of cancer symptoms and to identify the predictors of these symptoms. This is essential for integrating new developments and identifying gaps in the existing literature. METHODS: We conducted this systematic review in accordance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) checklist and systematically searched CINAHL, Embase, and PubMed for English records published from 1984 to August 11, 2023, using the following search terms: cancer, neoplasm, specific symptoms, neural networks, machine learning, specific algorithm names, and deep learning. All records that met the eligibility criteria were individually reviewed by 2 coauthors, and key findings were extracted and synthesized. We focused on studies using ML algorithms to predict cancer symptoms, excluding nonhuman research, technical reports, reviews, book chapters, conference proceedings, and inaccessible full texts. RESULTS: A total of 42 studies were included, the majority of which were published after 2017. Most studies were conducted in North America (18/42, 43%) and Asia (16/42, 38%). Sample sizes in most studies (27/42, 64%) ranged from 100 to 1000 participants. The most prevalent category of algorithms was supervised ML, accounting for 39 (93%) of the 42 studies. Deep learning, ensemble classifiers, and unsupervised ML each accounted for 3 (7%) of the 42 studies. The ML algorithms with the best performance were logistic regression (9/42, 17%), random forest (7/42, 13%), artificial neural networks (5/42, 9%), and decision trees (5/42, 9%). The most commonly included primary cancer sites were the head and neck (9/42, 22%) and breast (8/42, 19%), with 17 (41%) of the 42 studies not specifying the site. The most frequently studied symptoms were xerostomia (9/42, 14%), depression (8/42, 13%), pain (8/42, 13%), and fatigue (6/42, 10%). The significant predictors were age, gender, treatment type, treatment number, cancer site, cancer stage, chemotherapy, radiotherapy, chronic diseases, comorbidities, physical factors, and psychological factors. CONCLUSIONS: This review outlines the algorithms used for predicting symptoms in individuals with cancer. Given the diversity of symptoms people with cancer experience, analytic approaches that can handle complex and nonlinear relationships are critical. This knowledge can pave the way for crafting algorithms tailored to specific symptoms. In addition, to improve prediction precision, future research should compare cutting-edge ML strategies, such as deep learning and ensemble methods, with traditional statistical models. A minimal comparison sketch of this kind follows below.
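A minimal sketch, assuming a generic synthetic tabular dataset, of the comparison the review recommends: cross-validated AUC for a traditional statistical model (logistic regression) versus an ensemble method (random forest) on the same symptom-prediction task. Dataset, features, and hyperparameters are placeholders, not drawn from any reviewed study.

    # Compare a traditional model with an ensemble method using cross-validated AUC.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    X, y = make_classification(n_samples=500, n_features=20, n_informative=8, random_state=0)

    for name, clf in [
        ("logistic regression", LogisticRegression(max_iter=1000)),
        ("random forest", RandomForestClassifier(n_estimators=300, random_state=0)),
    ]:
        auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
        print(f"{name}: mean AUC = {auc.mean():.3f}")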