Search | BVS Economia da Saúde

Application of natural language processing to identify social needs from patient medical notes: development and assessment of a scalable, performant, and rule-based model in an integrated healthcare delivery system.

Gray, Geoffrey M; Zirikly, Ayah; Ahumada, Luis M; Rouhizadeh, Masoud; Richards, Thomas; Kitchen, Christopher; Foroughmand, Iman; Hatef, Elham.

JAMIA Open ; 6(4): ooad085, 2023 Dec.

Article in English | MEDLINE | ID: mdl-37799347

ABSTRACT

Objectives: To develop and test a scalable, performant, and rule-based model for identifying 3 major domains of social needs (residential instability, food insecurity, and transportation issues) from the unstructured data in electronic health records (EHRs). Materials and Methods: We included patients aged 18 years or older who received care at the Johns Hopkins Health System (JHHS) between July 2016 and June 2021 and had at least 1 unstructured (free-text) note in their EHR during the study period. We used a combination of manual lexicon curation and semiautomated lexicon creation for feature development. We developed an initial rules-based pipeline (Match Pipeline) using 2 keyword sets for each social needs domain. We performed rule-based keyword matching for distinct lexicons and tested the algorithm using an annotated dataset comprising 192 patients. Starting with a set of expert-identified keywords, we tested the adjustments by evaluating false positives and negatives identified in the labeled dataset. We assessed the performance of the algorithm using measures of precision, recall, and F1 score. Results: The algorithm for identifying residential instability had the best overall performance, with a weighted average for precision, recall, and F1 score of 0.92, 0.84, and 0.92 for identifying patients with homelessness and 0.84, 0.82, and 0.79 for identifying patients with housing insecurity. Metrics for the food insecurity algorithm were high but the transportation issues algorithm was the lowest overall performing metric. Discussion: The NLP algorithm in identifying social needs at JHHS performed relatively well and would provide the opportunity for implementation in a healthcare system. Conclusion: The NLP approach developed in this project could be adapted and potentially operationalized in the routine data processes of a healthcare system.
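The rule-based keyword matching described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the study's Match Pipeline: the lexicon terms, the `LEXICONS` dictionary, and the `match_domains` function are hypothetical, and the sketch does not handle negation (for example, "not homeless" would still match).

```python
import re
from typing import Dict, List, Set

# Hypothetical lexicons for illustration only; these are NOT the study's
# curated keyword sets.
LEXICONS: Dict[str, List[str]] = {
    "residential_instability": ["homeless", "living in a shelter", "eviction"],
    "food_insecurity": ["food insecurity", "food pantry", "skipping meals"],
    "transportation_issues": ["no transportation", "missed the bus", "lacks a ride"],
}


def compile_lexicon(terms: List[str]) -> "re.Pattern[str]":
    """Build one case-insensitive pattern per domain.

    Longer terms come first so that multi-word phrases win over shorter ones.
    Only a leading word boundary is used, so "homeless" also matches
    "homelessness".
    """
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in ordered)
    return re.compile(r"\b(?:" + alternation + r")", re.IGNORECASE)


PATTERNS = {domain: compile_lexicon(terms) for domain, terms in LEXICONS.items()}


def match_domains(note: str) -> Set[str]:
    """Return the social-needs domains whose lexicon matches the note text."""
    return {domain for domain, pattern in PATTERNS.items() if pattern.search(note)}
```

In a setup like this, false positives and false negatives found in an annotated dataset would be used to add or remove lexicon terms, which is the adjustment loop the abstract describes.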

Developing and validating a natural language processing algorithm to extract preoperative cannabis use status documentation from unstructured narrative clinical notes.

Sajdeya, Ruba; Mardini, Mamoun T; Tighe, Patrick J; Ison, Ronald L; Bai, Chen; Jugl, Sebastian; Hanzhi, Gao; Zandbiglari, Kimia; Adiba, Farzana I; Winterstein, Almut G; Pearson, Thomas A; Cook, Robert L; Rouhizadeh, Masoud.

J Am Med Inform Assoc ; 30(8): 1418-1428, 2023 Jul 19.

Article in English | MEDLINE | ID: mdl-37178155

ABSTRACT

OBJECTIVE: This study aimed to develop a natural language processing algorithm (NLP) using machine learning (ML) techniques to identify and classify documentation of preoperative cannabis use status. MATERIALS AND METHODS: We developed and applied a keyword search strategy to identify documentation of preoperative cannabis use status in clinical documentation within 60 days of surgery. We manually reviewed matching notes to classify each documentation into 8 different categories based on context, time, and certainty of cannabis use documentation. We applied 2 conventional ML and 3 deep learning models against manual annotation. We externally validated our model using the MIMIC-III dataset. RESULTS: The tested classifiers achieved classification results close to human performance with up to 93% and 94% precision and 95% recall of preoperative cannabis use status documentation. External validation showed consistent results with up to 94% precision and recall. DISCUSSION: Our NLP model successfully replicated human annotation of preoperative cannabis use documentation, providing a baseline framework for identifying and classifying documentation of cannabis use. We add to NLP methods applied in healthcare for clinical concept extraction and classification, mainly concerning social determinants of health and substance use. Our systematically developed lexicon provides a comprehensive knowledge-based resource covering a wide range of cannabis-related concepts for future NLP applications. CONCLUSION: We demonstrated that documentation of preoperative cannabis use status could be accurately identified using an NLP algorithm. This approach can be employed to identify comparison groups based on cannabis exposure for growing research efforts aiming to guide cannabis-related clinical practices and policies.

Subjects

Cannabis, Electronic Health Records, Humans, Natural Language Processing, Algorithms, Documentation

Development and assessment of a natural language processing model to identify residential instability in electronic health records' unstructured data: a comparison of 3 integrated healthcare delivery systems.

Hatef, Elham; Rouhizadeh, Masoud; Nau, Claudia; Xie, Fagen; Rouillard, Christopher; Abu-Nasser, Mahmoud; Padilla, Ariadna; Lyons, Lindsay Joe; Kharrazi, Hadi; Weiner, Jonathan P; Roblin, Douglas.

JAMIA Open ; 5(1): ooac006, 2022 Apr.

Article in English | MEDLINE | ID: mdl-35224458

ABSTRACT

OBJECTIVE: To evaluate whether a natural language processing (NLP) algorithm could be adapted to extract, with acceptable validity, markers of residential instability (ie, homelessness and housing insecurity) from electronic health records (EHRs) of 3 healthcare systems. MATERIALS AND METHODS: We included patients 18 years and older who received care at 1 of 3 healthcare systems from 2016 through 2020 and had at least 1 free-text note in the EHR during this period. We conducted the study independently; the NLP algorithm logic and method of validity assessment were identical across sites. The approach to the development of the gold standard for assessment of validity differed across sites. Using the EntityRuler module of spaCy 2.3 Python toolkit, we created a rule-based NLP system made up of expert-developed patterns indicating residential instability at the lead site and enriched the NLP system using insight gained from its application at the other 2 sites. We adapted the algorithm at each site then validated the algorithm using a split-sample approach. We assessed the performance of the algorithm by measures of positive predictive value (precision), sensitivity (recall), and specificity. RESULTS: The NLP algorithm performed with moderate precision (0.45, 0.73, and 1.0) at 3 sites. The sensitivity and specificity of the NLP algorithm varied across 3 sites (sensitivity: 0.68, 0.85, and 0.96; specificity: 0.69, 0.89, and 1.0). DISCUSSION: The performance of this NLP algorithm to identify residential instability in 3 different healthcare systems suggests the algorithm is generally valid and applicable in other healthcare systems with similar EHRs. CONCLUSION: The NLP approach developed in this project is adaptable and can be modified to extract types of social needs other than residential instability from EHRs across different healthcare systems.
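The validity measures named in this abstract (positive predictive value, sensitivity, and specificity) can be computed from patient-level predictions and gold-standard labels along these lines. This is a generic sketch under the standard confusion-matrix definitions; the function name `validity_metrics` is hypothetical and the code is not taken from the study.

```python
from typing import Dict, Sequence


def validity_metrics(predicted: Sequence[bool], gold: Sequence[bool]) -> Dict[str, float]:
    """Compute PPV (precision), sensitivity (recall), and specificity.

    `predicted` holds the algorithm's flags and `gold` the manually reviewed
    labels, aligned by position. A metric with a zero denominator is NaN.
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")

    tp = fp = tn = fn = 0
    for p, g in zip(predicted, gold):
        if p and g:
            tp += 1  # flagged and truly positive
        elif p and not g:
            fp += 1  # flagged but truly negative
        elif not p and g:
            fn += 1  # missed positive
        else:
            tn += 1  # correctly left unflagged

    def ratio(numerator: int, denominator: int) -> float:
        return numerator / denominator if denominator else float("nan")

    return {
        "ppv": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }
```

Because PPV depends on how common the condition is in the validation sample, differences in gold-standard construction across sites, as noted in the abstract, can shift it even when the algorithm logic is identical.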

Measuring the Value of a Practical Text Mining Approach to Identify Patients With Housing Issues in the Free-Text Notes in Electronic Health Record: Findings of a Retrospective Cohort Study.

Hatef, Elham; Singh Deol, Gurmehar; Rouhizadeh, Masoud; Li, Ashley; Eibensteiner, Katyusha; Monsen, Craig B; Bratslaver, Roman; Senese, Margaret; Kharrazi, Hadi.

Front Public Health ; 9: 697501, 2021.

Article in English | MEDLINE | ID: mdl-34513783

ABSTRACT

Introduction: Despite the growing efforts to standardize coding for social determinants of health (SDOH), they are infrequently captured in electronic health records (EHRs). Most SDOH variables are still captured in the unstructured fields (i.e., free-text) of EHRs. In this study we attempt to evaluate a practical text mining approach (i.e., advanced pattern matching techniques) in identifying phrases referring to housing issues, an important SDOH domain affecting value-based healthcare providers, using EHR of a large multispecialty medical group in the New England region, United States. To present how this approach would help the health systems to address the SDOH challenges of their patients we assess the demographic and clinical characteristics of patients with and without housing issues and briefly look into the patterns of healthcare utilization among the study population and for those with and without housing challenges. Methods: We identified five categories of housing issues [i.e., homelessness current (HC), homelessness history (HH), homelessness addressed (HA), housing instability (HI), and building quality (BQ)] and developed several phrases addressing each one through collaboration with SDOH experts, consulting the literature, and reviewing existing coding standards. We developed pattern-matching algorithms (i.e., advanced regular expressions), and then applied them in the selected EHR. We assessed the text mining approach for recall (sensitivity) and precision (positive predictive value) after comparing the identified phrases with manually annotated free-text for different housing issues. Results: The study dataset included EHR structured data for a total of 20,342 patients and 2,564,344 free-text clinical notes. The mean (SD) age in the study population was 75.96 (7.51). Additionally, 58.78% of the cohort were female. BQ and HI were the most frequent housing issues documented in EHR free-text notes and HH was the least frequent one. The regular expression methodology, when compared to manual annotation, had a high level of precision (positive predictive value) at phrase, note, and patient levels (96.36, 95.00, and 94.44%, respectively) across different categories of housing issues, but the recall (sensitivity) rate was relatively low (30.11, 32.20, and 41.46%, respectively). Conclusion: Results of this study can be used to advance the research in this domain, to assess the potential value of EHR's free-text in identifying patients with a high risk of housing issues, to improve patient care and outcomes, and to eventually mitigate socioeconomic disparities across individuals and communities.
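As a rough illustration of the category-based regular expression approach described in this abstract, the sketch below tags notes with the five housing categories and rolls note-level hits up to the patient level. The patterns, `CATEGORY_PATTERNS`, `tag_note`, and `patients_by_category` are all hypothetical; the study's actual expressions are not given in the abstract.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical patterns for the five categories named in the abstract.
# They only illustrate the technique and are NOT the study's expressions.
CATEGORY_PATTERNS: Dict[str, "re.Pattern[str]"] = {
    "HC": re.compile(r"\b(?:currently|is|remains)\s+homeless\b", re.IGNORECASE),
    "HH": re.compile(r"\b(?:history\s+of|previously|formerly)\s+homeless(?:ness)?\b", re.IGNORECASE),
    "HA": re.compile(r"\b(?:no\s+longer\s+homeless|placed\s+in\s+(?:stable\s+)?housing)\b", re.IGNORECASE),
    "HI": re.compile(r"\b(?:evict(?:ed|ion)|behind\s+on\s+rent|couch\s+surfing)\b", re.IGNORECASE),
    "BQ": re.compile(r"\b(?:mold|no\s+heat|pest\s+infestation)\b", re.IGNORECASE),
}


def tag_note(text: str) -> Set[str]:
    """Return the housing-issue categories whose pattern matches one note."""
    return {cat for cat, pattern in CATEGORY_PATTERNS.items() if pattern.search(text)}


def patients_by_category(notes: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Roll note-level matches up to patients.

    `notes` yields (patient_id, note_text) pairs; a patient counts toward a
    category if at least one of their notes matches it.
    """
    result: Dict[str, Set[str]] = defaultdict(set)
    for patient_id, text in notes:
        for category in tag_note(text):
            result[category].add(patient_id)
    return dict(result)
```

Narrow patterns like these tend to match only explicit phrasings, which is consistent with the high-precision, low-recall profile reported in the abstract.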

Subjects

Electronic Health Records, Housing, Data Mining, Female, Humans, Retrospective Studies, Social Determinants of Health, United States

Assessing the Impact of Social Needs and Social Determinants of Health on Health Care Utilization: Using Patient- and Community-Level Data.

Hatef, Elham; Ma, Xiaomeng; Rouhizadeh, Masoud; Singh, Gurmehar; Weiner, Jonathan P; Kharrazi, Hadi.

Popul Health Manag ; 24(2): 222-230, 2021 Apr.

Article in English | MEDLINE | ID: mdl-32598228

ABSTRACT

As the US health care system moves to expand access to and quality of medical care, the importance of addressing patient-level social needs and community-level social determinants of health (SDOH) is increasingly being recognized. This study evaluates individual- and community-level needs of housing (one of the SDOH domains) across the patient population of an academic medical center and explores how the level of housing needs impacts health care utilization. The authors performed a descriptive analysis of housing issues identified in both structured and unstructured (eg, clinical notes) data extracted from the electronic health record (EHR) and compared this to community-level characteristics of patients' neighborhood as measured by the Area Deprivation Index. Multivariate analyses were performed to assess the association between these and other factors on the frequency of service encounters. Among the 1,034,683 study participants, 59,703 (5.8%) had at least 1 housing issue identified in their EHR from structured or unstructured data combined. After adjusting for other factors, patients with housing instability and homelessness had 49% and 34% more encounters with the health care system compared to patients without housing issues (P < 0.00001). Patients living in the most disadvantaged neighborhoods had 55% more encounters with the health care system compared to those living in the most advantaged neighborhoods (P < 0.00001). This data collection approach and findings can inform health care systems aiming to make use of their EHRs and community-level SDOH information to provide a full assessment of patients' social needs and challenges.

Subjects

Medicare, Social Determinants of Health, Aged, Electronic Health Records, Female, Humans, Male, Patient Acceptance of Health Care, Residence Characteristics, United States
