A Study of Social and Behavioral Determinants of Health in Lung Cancer Patients Using Transformers-based Natural Language Processing Models.
Yu, Zehao; Yang, Xi; Dang, Chong; Wu, Songzi; Adekkanattu, Prakash; Pathak, Jyotishman; George, Thomas J; Hogan, William R; Guo, Yi; Bian, Jiang; Wu, Yonghui.
Affiliation
  • Yu Z; Department of Health Outcomes and Biomedical Informatics.
  • Yang X; Department of Health Outcomes and Biomedical Informatics.
  • Dang C; Cancer Informatics Shared Resources, University of Florida Health Cancer Center, University of Florida, Gainesville, Florida, USA.
  • Wu S; Department of Health Outcomes and Biomedical Informatics.
  • Adekkanattu P; Department of Health Outcomes and Biomedical Informatics.
  • Pathak J; Information Technologies and Services.
  • George TJ; Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA.
  • Hogan WR; Division of Hematology & Oncology, Department of Medicine, College of Medicine.
  • Guo Y; Department of Health Outcomes and Biomedical Informatics.
  • Bian J; Department of Health Outcomes and Biomedical Informatics.
  • Wu Y; Cancer Informatics Shared Resources, University of Florida Health Cancer Center, University of Florida, Gainesville, Florida, USA.
AMIA Annu Symp Proc ; 2021: 1225-1233, 2021.
Article in English | MEDLINE | ID: mdl-35309014
ABSTRACT
Social and behavioral determinants of health (SBDoH) play an important role in shaping people's health. In clinical research studies, especially comparative effectiveness studies, failure to adjust for SBDoH factors can cause confounding and misclassification errors in both statistical analyses and machine learning-based models. However, few studies have examined SBDoH factors in clinical outcomes, because current electronic health record (EHR) systems lack structured SBDoH information, while much of that information is documented in clinical narratives. Natural language processing (NLP) is thus the key technology for extracting such information from unstructured clinical text, yet there is no mature clinical NLP system focusing on SBDoH. In this study, we examined two state-of-the-art transformer-based NLP models, BERT and RoBERTa, to extract SBDoH concepts from clinical narratives; applied the best-performing model to extract SBDoH concepts in a lung cancer screening patient cohort; and examined the differences in SBDoH information between NLP-extracted results and structured EHRs (SBDoH information captured in standard vocabularies such as International Classification of Diseases codes). The experimental results show that the BERT-based NLP model achieved the best strict and lenient F1-scores of 0.8791 and 0.8999, respectively. A comparison of NLP-extracted SBDoH information with structured EHRs in the lung cancer patient cohort of 864 patients with 161,933 clinical notes of various types showed that much more detailed information about smoking, education, and employment was captured only in clinical narratives, and that both clinical narratives and structured EHRs are needed to construct a more complete picture of patients' SBDoH factors.
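The abstract reports strict and lenient F1-scores without defining them. As an illustration only, the following Python sketch assumes the convention common in clinical concept extraction: a strict match requires identical span boundaries and concept type, while a lenient match requires the same concept type and any character overlap. The span format, function names, and toy data are hypothetical and are not taken from the study.

```python
from typing import List, Tuple

# A concept mention: (start_offset, end_offset, concept_type); end is exclusive.
# This representation is an assumption made for illustration.
Span = Tuple[int, int, str]


def _strict_match(pred: Span, gold: Span) -> bool:
    # Strict: identical boundaries and identical concept type.
    return pred == gold


def _lenient_match(pred: Span, gold: Span) -> bool:
    # Lenient: same concept type and at least one overlapping character.
    return pred[2] == gold[2] and pred[0] < gold[1] and gold[0] < pred[1]


def f1_score(pred: List[Span], gold: List[Span], lenient: bool = False) -> float:
    match = _lenient_match if lenient else _strict_match
    # Precision counts predictions that match some gold span;
    # recall counts gold spans matched by some prediction.
    tp_pred = sum(any(match(p, g) for g in gold) for p in pred)
    tp_gold = sum(any(match(p, g) for p in pred) for g in gold)
    precision = tp_pred / len(pred) if pred else 0.0
    recall = tp_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): the second prediction has a shifted left boundary,
# and the third has no gold counterpart.
gold = [(0, 14, "smoking"), (20, 30, "education")]
pred = [(0, 14, "smoking"), (22, 30, "education"), (40, 45, "employment")]

print(round(f1_score(pred, gold), 4))                # 0.4
print(round(f1_score(pred, gold, lenient=True), 4))  # 0.8
```

Under these assumptions the lenient score is never lower than the strict score, which is consistent with the ordering of the two figures reported above.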
Subject(s)

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Natural Language Processing / Lung Neoplasms Study type: Diagnostic_studies / Prognostic_studies / Screening_studies Limits: Humans Language: En Journal: AMIA Annu Symp Proc Journal subject: MEDICAL INFORMATICS Year: 2021 Document type: Article
