ABSTRACT
OBJECTIVES: Social support (SS) and social isolation (SI) are social determinants of health (SDOH) associated with psychiatric outcomes. In electronic health records (EHRs), individual-level SS/SI is typically documented in narrative clinical notes rather than as structured coded data. Natural language processing (NLP) algorithms can automate the otherwise labor-intensive process of extracting such information. MATERIALS AND METHODS: Psychiatric encounter notes from Mount Sinai Health System (MSHS, n = 300) and Weill Cornell Medicine (WCM, n = 225) were annotated to create a gold-standard corpus. A rule-based system (RBS) built on lexicons and a large language model (LLM) based on FLAN-T5-XL were developed to identify mentions of SS and SI and their subcategories (eg, social network, instrumental support, and loneliness). RESULTS: For extracting SS/SI, the RBS obtained higher macroaveraged F1-scores than the LLM at both MSHS (0.89 versus 0.65) and WCM (0.85 versus 0.82). For extracting the subcategories, the RBS also outperformed the LLM at both MSHS (0.90 versus 0.62) and WCM (0.82 versus 0.81). DISCUSSION AND CONCLUSION: Unexpectedly, the RBS outperformed the LLM across all metrics. An intensive review demonstrates that this finding is due to the divergent approaches taken by the RBS and the LLM. The RBS was designed and refined to follow the same specific rules as the gold-standard annotations. Conversely, the LLM was more inclusive with categorization and conformed to common English-language understanding. Both approaches offer advantages, although additional replication studies are warranted.
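For readers unfamiliar with the macroaveraged F1-score reported above, the following is a minimal sketch of how such a score can be computed. It assumes a single-label setting in which each annotated unit receives exactly one category; the abstract does not state the evaluation granularity, so this is an illustration and not the authors' evaluation code. The function name `macro_f1` and the toy labels are hypothetical.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over every label seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    if not labels:
        return 0.0
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1   # correct prediction for class g
        else:
            fp[p] += 1   # predicted class p, but it was wrong
            fn[g] += 1   # true class g was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels, not data from the study.
gold = ["SS", "SI", "SS", "none", "SI", "SS"]
pred = ["SS", "SS", "SS", "none", "SI", "none"]
score = macro_f1(gold, pred)
```

Because every class contributes equally to the average, a rare category such as a sparsely annotated subcategory affects the score as much as a frequent one, which is one reason macroaveraging is commonly chosen for imbalanced label sets.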
ABSTRACT
OBJECTIVE: Social determinants of health (SDoH) are nonclinical dispositions that impact patient health risks and clinical outcomes. Leveraging SDoH in clinical decision-making can potentially improve diagnosis, treatment planning, and patient outcomes. Despite increased interest in capturing SDoH in electronic health records (EHRs), such information is typically locked in unstructured clinical notes. Natural language processing (NLP) is the key technology to extract SDoH information from clinical text and expand its utility in patient care and research. This article presents a systematic review of the state-of-the-art NLP approaches and tools that focus on identifying and extracting SDoH data from unstructured clinical text in EHRs. MATERIALS AND METHODS: A broad literature search was conducted in February 2021 using 3 scholarly databases (ACL Anthology, PubMed, and Scopus) following Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. A total of 6402 publications were initially identified, and after applying the study inclusion criteria, 82 publications were selected for the final review. RESULTS: Smoking status (n = 27), substance use (n = 21), homelessness (n = 20), and alcohol use (n = 15) are the most frequently studied SDoH categories. Homelessness (n = 7) and other less-studied SDoH (eg, education, financial problems, social isolation and support, family problems) are mostly identified using rule-based approaches. In contrast, machine learning approaches are popular for identifying smoking status (n = 13), substance use (n = 9), and alcohol use (n = 9). CONCLUSION: NLP offers significant potential to extract SDoH data from narrative clinical notes, which in turn can aid in the development of screening tools, risk prediction models, and clinical decision support systems.
Subject(s)
Electronic Health Records, Natural Language Processing, Data Management, Humans, Machine Learning, Social Determinants of Health
ABSTRACT
The behavioral variant of frontotemporal dementia is usually a sporadic and progressive neurodegenerative disorder. Here, we report the subacute onset of a frontotemporal dementia phenotype with a treatable etiology. The patient has a history of rheumatoid arthritis, episcleritis, and thyroid eye disease on immunosuppressive therapy. He experienced a rapid personality change, including inappropriate behavior, which suggested frontotemporal dementia. Results of imaging and neuropsychological testing also suggested frontotemporal dementia. Because of his autoimmune diseases and unusually short onset of symptoms, serum paraneoplastic panel and cerebrospinal fluid were analyzed and revealed elevated P/Q- and N-type calcium channel antibodies. Treatment with therapeutic plasma exchange resulted in a rapid improvement of his behavior and cognition. This case suggests that there may be some treatable causes of frontotemporal dementia symptomatology, that is, paraneoplastic antibodies. In the context of atypical features of frontotemporal dementia, practitioners should maintain a high index of suspicion.