Results 1 - 18 of 18
1.
J Med Internet Res; 26: e47739, 2024 Feb 13.
Article in English | MEDLINE | ID: mdl-38349732

ABSTRACT

BACKGROUND: Assessment of activities of daily living (ADLs) and instrumental ADLs (iADLs) is key to determining the severity of dementia and care needs among older adults. However, such information is often documented only in free-text clinical notes within the electronic health record and can be challenging to find. OBJECTIVE: This study aims to develop and validate machine learning models that determine the status of ADL and iADL impairments from clinical notes. METHODS: This cross-sectional study leveraged electronic health record clinical notes from Mass General Brigham's Research Patient Data Repository, linked with Medicare fee-for-service claims data from 2007 to 2017, to identify individuals aged 65 years or older with at least 1 diagnosis of dementia. Notes for encounters both 180 days before and after the first date of dementia diagnosis were randomly sampled. Models were trained and validated using note sentences filtered by expert-curated keywords (filtered cohort) and further evaluated using unfiltered sentences (unfiltered cohort). Model performance was compared using the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC). RESULTS: The study included 10,000 keyword-filtered sentences representing 441 people (n=283, 64.2% women; mean age 82.7, SD 7.9 years) and 1000 unfiltered sentences representing 80 people (n=56, 70% women; mean age 82.8, SD 7.5 years). AUROC was high (>0.97) for the best-performing ADL and iADL models on both cohorts. For ADL impairment identification, the random forest model achieved the best AUPRC (0.89, 95% CI 0.86-0.91) on the filtered cohort; the support vector machine model achieved the highest AUPRC (0.82, 95% CI 0.75-0.89) on the unfiltered cohort.
For iADL impairment, the Bio+Clinical bidirectional encoder representations from transformers (BERT) model had the highest AUPRC (filtered: 0.76, 95% CI 0.68-0.82; unfiltered: 0.58, 95% CI 0.001-1.0). Compared with a keyword-search approach on the unfiltered cohort, machine learning reduced false-positive rates from 4.5% to 0.2% for ADLs and from 1.8% to 0.1% for iADLs. CONCLUSIONS: This study demonstrates that machine learning models can accurately identify ADL and iADL impairment from free-text clinical notes, which could be useful in determining the severity of dementia.


Subject(s)
Dementia; Natural Language Processing; United States; Humans; Aged; Female; Aged, 80 and over; Male; Cross-Sectional Studies; Activities of Daily Living; Functional Status; Medicare
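The AUPRC reported above can be computed directly from ranked classifier scores. A minimal stdlib sketch with toy sentence labels and model scores (illustrative values, not the study's data):

```python
def average_precision(y_true, y_score):
    """Area under the precision-recall curve (average precision):
    mean of the precision values at each rank where a positive is retrieved."""
    # Sort examples by descending classifier score.
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])
    tp = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            tp += 1
            precisions.append(tp / rank)  # precision at this recall step
    return sum(precisions) / max(tp, 1)

# Toy sentence-level labels (1 = ADL impairment mentioned) and model scores.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.75, 0.6, 0.55, 0.4, 0.3, 0.1]
print(round(average_precision(labels, scores), 3))  # → 0.747
```

Unlike AUROC, this metric ignores true negatives, which is why it is the more informative of the two when impairment mentions are rare.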
2.
J Med Internet Res; 26: e53367, 2024 Apr 04.
Article in English | MEDLINE | ID: mdl-38573752

ABSTRACT

BACKGROUND: Real-time surveillance of emerging infectious diseases necessitates a dynamically evolving, computable case definition, which frequently incorporates symptom-related criteria. For symptom detection, both population health monitoring platforms and research initiatives primarily depend on structured data extracted from electronic health records. OBJECTIVE: This study sought to validate and test an artificial intelligence (AI)-based natural language processing (NLP) pipeline for detecting COVID-19 symptoms from physician notes in pediatric patients. We specifically studied patients presenting to the emergency department (ED), who can be sentinel cases in an outbreak. METHODS: Subjects in this retrospective cohort study were patients 21 years of age or younger who presented to a pediatric ED at a large academic children's hospital between March 1, 2020, and May 31, 2022. The ED notes for all patients were processed with an NLP pipeline tuned to detect mentions of 11 COVID-19 symptoms based on Centers for Disease Control and Prevention (CDC) criteria. For a gold standard, 3 subject matter experts labeled 226 ED notes with strong agreement (F1-score=0.986; positive predictive value [PPV]=0.972; sensitivity=1.0). F1-score, PPV, and sensitivity were used to compare the performance of both NLP and International Classification of Diseases, 10th Revision (ICD-10) coding against the gold standard chart review. As a formative use case, variations in symptom patterns were measured across SARS-CoV-2 variant eras. RESULTS: There were 85,678 ED encounters during the study period, including 4% (n=3420) involving patients with COVID-19. NLP was more accurate at identifying encounters with patients who had any of the COVID-19 symptoms (F1-score=0.796) than ICD-10 codes (F1-score=0.451). NLP accuracy was higher for positive symptoms (sensitivity=0.930) than ICD-10 (sensitivity=0.300).
However, ICD-10 accuracy was higher for negative symptoms (specificity=0.994) than NLP (specificity=0.917). Congestion or runny nose showed the highest accuracy difference (NLP: F1-score=0.828 and ICD-10: F1-score=0.042). For encounters with patients with COVID-19, prevalence estimates of each NLP symptom differed across variant eras. Patients with COVID-19 were more likely to have each NLP symptom detected than patients without this disease. Effect sizes (odds ratios) varied across pandemic eras. CONCLUSIONS: This study establishes the value of AI-based NLP as a highly effective tool for real-time COVID-19 symptom detection in pediatric patients, outperforming traditional ICD-10 methods. It also reveals the evolving nature of symptom prevalence across different virus variants, underscoring the need for dynamic, technology-driven approaches in infectious disease surveillance.


Subject(s)
Biosurveillance; COVID-19; Physicians; SARS-CoV-2; United States; Humans; Child; Artificial Intelligence; Retrospective Studies; COVID-19/diagnosis; COVID-19/epidemiology
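The F1-score, PPV (precision), and sensitivity (recall) comparisons above reduce to simple confusion-matrix arithmetic. An illustrative sketch (the counts are invented, not the study's):

```python
def clf_metrics(tp, fp, fn):
    """PPV (precision), sensitivity (recall), and F1 from confusion counts
    of a symptom detector judged against gold-standard chart review."""
    ppv = tp / (tp + fp)          # of flagged encounters, how many truly had the symptom
    sens = tp / (tp + fn)         # of true symptom encounters, how many were flagged
    f1 = 2 * ppv * sens / (ppv + sens)  # harmonic mean of the two
    return ppv, sens, f1

# Toy counts for an NLP symptom detector (illustrative only).
ppv, sens, f1 = clf_metrics(tp=930, fp=70, fn=70)
print(round(ppv, 2), round(sens, 2), round(f1, 2))  # → 0.93 0.93 0.93
```

Note that specificity, where ICD-10 did better, needs the true-negative count as well (tn / (tn + fp)), which is why the two coding approaches can rank differently on positive versus negative symptoms.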
3.
Stud Health Technol Inform; 310: 1460-1461, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38269696

ABSTRACT

Clinical text contains rich patient information and has attracted much research interest in applying Natural Language Processing (NLP) tools to model it. In this study, we quantified and analyzed the textual characteristics of five common clinical note types using multiple measurements, including lexical-level features, semantic content, and grammaticality. We found significant linguistic variation across clinical note types, though some types tend to be more similar to each other than others.


Subject(s)
Linguistics; Natural Language Processing; Humans; Semantics
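Lexical-level measurements of the kind used in such note-type comparisons can be computed with a few lines of stdlib Python; a sketch over a made-up note (the tokenization and feature set are illustrative, not the study's):

```python
import re

def lexical_profile(text):
    """Simple lexical-level features of a note: token count, type-token
    ratio (vocabulary diversity), and mean sentence length in tokens."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s]
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    ttr = len(set(tokens)) / len(tokens)
    mean_len = len(tokens) / len(sentences)
    return {"tokens": len(tokens), "ttr": round(ttr, 2),
            "mean_sentence_len": round(mean_len, 2)}

note = "Patient denies chest pain. Patient reports mild dyspnea on exertion."
print(lexical_profile(note))  # → {'tokens': 10, 'ttr': 0.9, 'mean_sentence_len': 5.0}
```

Comparing such profiles across note types (eg, discharge summaries versus progress notes) is one way to quantify the variation the abstract describes.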
4.
JMIR AI; 3: e51240, 2024 Jan 15.
Article in English | MEDLINE | ID: mdl-38875566

ABSTRACT

BACKGROUND: Pancreatic cancer is the third leading cause of cancer deaths in the United States. Pancreatic ductal adenocarcinoma (PDAC) is the most common form of pancreatic cancer, accounting for up to 90% of all cases. Patient-reported symptoms often trigger a cancer diagnosis; therefore, understanding PDAC-associated symptoms and the timing of symptom onset could facilitate early detection of PDAC. OBJECTIVE: This paper aims to develop a natural language processing (NLP) algorithm to capture symptoms associated with PDAC from clinical notes within a large integrated health care system. METHODS: We used unstructured data recorded within 2 years prior to PDAC diagnosis between 2010 and 2019, and from matched patients without PDAC, to identify 17 PDAC-related symptoms. Related terms and phrases were first compiled from publicly available resources and then recursively reviewed and enriched with input from clinicians and chart review. A computerized NLP algorithm was iteratively developed and refined via multiple rounds of chart review followed by adjudication. Finally, the developed algorithm was applied to the validation data set to assess performance and then to the study implementation notes. RESULTS: A total of 408,147 and 709,789 notes were retrieved from 2611 patients with PDAC and 10,085 matched patients without PDAC, respectively. In descending order, the symptom distribution of the study implementation notes ranged from 4.98% for abdominal or epigastric pain to 0.05% for upper extremity deep vein thrombosis in the PDAC group, and from 1.75% for back pain to 0.01% for pale stool in the non-PDAC group. Validation of the NLP algorithm against adjudicated chart review results of 1000 notes showed that precision ranged from 98.9% (jaundice) to 84% (upper extremity deep vein thrombosis), recall ranged from 98.1% (weight loss) to 82.8% (epigastric bloating), and F1-scores ranged from 0.97 (jaundice) to 0.86 (depression).
CONCLUSIONS: The developed and validated NLP algorithm could be used for the early detection of PDAC.
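A term-matching symptom-capture algorithm of the kind described can be sketched as keyword patterns plus a crude negation check. The two-symptom lexicon and negation cues below are hypothetical placeholders, not the study's expert-curated term list:

```python
import re

# Hypothetical keyword lexicon for two PDAC-related symptoms (illustrative only).
LEXICON = {
    "jaundice": [r"\bjaundice\b", r"\bicterus\b"],
    "weight_loss": [r"\bweight loss\b", r"\blosing weight\b"],
}
# Crude negation cue: a negating word within ~40 characters before the match,
# with no intervening sentence boundary.
NEGATION = re.compile(r"\b(no|denies|without|negative for)\b[^.]{0,40}$")

def detect_symptoms(sentence):
    """Return the symptoms asserted (and not negated) in one sentence."""
    found = set()
    low = sentence.lower()
    for symptom, patterns in LEXICON.items():
        for pat in patterns:
            m = re.search(pat, low)
            if m and not NEGATION.search(low[: m.start()]):
                found.add(symptom)
    return found

print(sorted(detect_symptoms("Patient reports progressive jaundice and weight loss.")))
print(sorted(detect_symptoms("Denies jaundice; no weight loss.")))  # → []
```

Real systems add term normalization, context windows, and chart-review-driven refinement, which is where the iterative adjudication rounds described above come in.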

5.
J Am Coll Emerg Physicians Open; 5(2): e13133, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38481520

ABSTRACT

Objectives: This study presents a design framework to enhance the accuracy with which large language models (LLMs), such as ChatGPT, can extract insights from clinical notes. We highlight this framework via prompt refinement for the automated determination of HEART (History, ECG, Age, Risk factors, Troponin) risk scores in chest pain evaluation. Methods: We developed a pipeline for LLM prompt testing, employing stochastic repeat testing and quantifying response errors relative to physician assessment. We evaluated the pipeline for automated HEART score determination across a limited set of 24 synthetic clinical notes representing four simulated patients. To assess whether iterative prompt design could improve the LLMs' ability to extract complex clinical concepts and apply rule-based logic to translate them into HEART subscores, we monitored diagnostic performance during prompt iteration. Results: Validation included three iterative rounds of prompt improvement for three HEART subscores, with 25 repeat trials totaling 1200 queries each for GPT-3.5 and GPT-4. For both LLM models, the rate of responses with erroneous, non-numerical subscore answers decreased from the initial to the final prompt design. Accuracy of numerical responses for HEART subscores (discrete 0-2 point scale) improved for GPT-4 from the initial to the final prompt iteration, with mean error decreasing from 0.16 to 0.10 (95% confidence interval: 0.07-0.14) points. Conclusion: We established a framework for iterative prompt design in the clinical space. Although the results indicate potential for integrating LLMs into structured clinical note analysis, translation to real, large-scale clinical data with appropriate data privacy safeguards is needed.
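The stochastic repeat-testing idea, querying the same prompt many times and quantifying both non-numerical responses and mean error against physician assessment, can be sketched as follows. `llm_stub` is a hypothetical stand-in for a real API call, returning canned noisy answers:

```python
import random
from statistics import mean

def llm_stub(prompt, rng):
    """Hypothetical stand-in for an LLM query (a real pipeline would call an API).
    Returns a noisy 0-2 subscore string, occasionally a non-numerical answer."""
    if rng.random() < 0.1:
        return "unable to determine"
    return str(rng.choice([0, 1, 1, 1, 2]))

def evaluate_prompt(prompt, truth, trials=25, seed=0):
    """Repeat-trial testing: rate of non-numerical answers, plus mean
    absolute error of numerical answers vs. the physician-assigned subscore."""
    rng = random.Random(seed)
    errors, non_numeric = [], 0
    for _ in range(trials):
        answer = llm_stub(prompt, rng)
        if answer.isdigit():
            errors.append(abs(int(answer) - truth))
        else:
            non_numeric += 1
    return non_numeric / trials, mean(errors)

bad_rate, mae = evaluate_prompt("Score the History subscore (0-2): ...", truth=1)
print(f"non-numeric rate={bad_rate:.2f}, mean abs error={mae:.2f}")
```

Running this harness after each prompt revision gives exactly the two quantities the study tracked: the erroneous-response rate and the mean subscore error.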

6.
JAMIA Open; 6(3): ooad045, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37416449

ABSTRACT

Objectives: Clinical notes are a veritable treasure trove of information on a patient's disease progression, medical history, and treatment plans, yet they are locked in secured databases accessible for research only after extensive ethics review. Removing personally identifying and protected health information (PII/PHI) from the records can reduce the need for additional Institutional Review Board (IRB) reviews. In this project, our goals were to: (1) develop a robust and scalable clinical text de-identification pipeline that is compliant with the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule de-identification standards and (2) share routinely updated de-identified clinical notes with researchers. Materials and Methods: Building on our open-source de-identification software called Philter, we added features to: (1) make the algorithm and the de-identified data HIPAA compliant, which also implies type 2 error-free redaction, as certified via external audit; (2) reduce over-redaction errors; and (3) normalize and shift date PHI. We also established a streamlined de-identification pipeline using MongoDB to automatically extract clinical notes and provide truly de-identified notes to researchers, with periodic monthly refreshes at our institution. Results: To the best of our knowledge, the Philter V1.0 pipeline is currently the first and only certified de-identification redaction pipeline that makes clinical notes available to researchers for nonhuman subjects' research without further IRB approval. To date, we have made over 130 million certified de-identified clinical notes available to over 600 UCSF researchers. These notes were collected over the past 40 years and represent data from 2,757,016 UCSF patients.
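As a toy illustration of the two operations named above, redacting identifiers and shifting date PHI, consider the sketch below. It is nothing like a certified, HIPAA-compliant pipeline such as Philter; the patterns and note are invented:

```python
import re
from datetime import datetime, timedelta

def redact_and_shift(note, days_offset):
    """Toy redactor: masks phone numbers and MRNs, then shifts ISO dates
    by a per-patient offset (preserving intervals between events)."""
    note = re.sub(r"\b\d{3}-\d{3}-\d{4}\b", "[PHONE]", note)
    note = re.sub(r"\bMRN[:\s]*\d+\b", "[MRN]", note)

    def shift(match):
        d = datetime.strptime(match.group(0), "%Y-%m-%d") + timedelta(days=days_offset)
        return d.strftime("%Y-%m-%d")

    return re.sub(r"\b\d{4}-\d{2}-\d{2}\b", shift, note)

note = "Seen 2021-03-05. Call 415-555-1234. MRN: 8675309."
print(redact_and_shift(note, days_offset=30))  # → Seen 2021-04-04. Call [PHONE]. [MRN].
```

Date shifting rather than deletion is what keeps the notes useful for research: relative timing survives while the true calendar dates do not.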

7.
Diagnostics (Basel); 13(13), 2023 Jul 06.
Article in English | MEDLINE | ID: mdl-37443689

ABSTRACT

The International Classification of Diseases (ICD) code is a diagnostic classification standard frequently used as a referencing system in healthcare and insurance. However, it takes time and effort to find and assign the right diagnosis code based on a patient's medical records. In response, deep learning (DL) methods have been developed to assist physicians in the ICD coding process. We propose a deep learning model that uses clinical notes from medical records to predict ICD-10 codes. Our research used text-based medical data from the outpatient department (OPD) of a university hospital from January to December 2016. The dataset comprised clinical notes from five departments, with a total of 21,953 medical records collected. Clinical notes consisted of subjective, objective, assessment, and plan (SOAP) notes, a diagnosis code, and a drug list. The dataset was divided into two groups: 90% for training and 10% for test cases. We applied a natural language processing (NLP) technique, Word2Vec word embeddings, to process the data. A deep learning-based convolutional neural network (CNN) model was created based on the information presented above. Three metrics (precision, recall, and F-score) were used to evaluate the performance of the deep learning CNN model. Clinically acceptable results were achieved by the deep learning model for all five departments (precision: 0.53-0.96; recall: 0.85-0.99; F-score: 0.65-0.98). With a precision of 0.95, a recall of 0.99, and an F-score of 0.98, the deep learning model performed best in the department of cardiology. Our proposed CNN model significantly improved prediction performance for an automated ICD-10 code prediction system based on prior clinical information. This CNN model could reduce the laborious task of manual coding and assist physicians in making a better diagnosis.
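The core operation of a text CNN over word embeddings, sliding one filter across k-gram windows and max-pooling the activations into a single feature, can be shown in a few lines. The 2-dimensional "embeddings" and single 2-gram filter below are toy values (a real model learns many filters and feeds the pooled features to a classifier):

```python
def conv_maxpool(embs, kernel):
    """One CNN filter over a token-embedding sequence: dot the kernel with
    each k-gram window of embeddings, then max-pool to a single feature."""
    k = len(kernel)
    dim = len(embs[0])
    activations = [
        sum(kernel[j][d] * embs[i + j][d] for j in range(k) for d in range(dim))
        for i in range(len(embs) - k + 1)  # every k-token window
    ]
    return max(activations)  # max-pooling keeps the strongest match

# Toy 2-dimensional embeddings for a 4-token note and one 2-gram filter.
embs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
kernel = [[1.0, 0.0], [0.0, 1.0]]
print(conv_maxpool(embs, kernel))  # → 2.0
```

Each filter thus acts as a learned phrase detector; stacking many of them over Word2Vec embeddings yields the feature vector an ICD-10 classifier is trained on.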

8.
Workplace Health Saf; 71(10): 484-490, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37387505

ABSTRACT

BACKGROUND: Type II workplace violence in health care, perpetrated by patients/clients toward home healthcare nurses, is a serious health and safety issue. A significant portion of violent incidents are not officially reported. Natural language processing can detect these "hidden cases" from clinical notes. In this study, we computed the 12-month prevalence of Type II workplace violence from home healthcare nurses' clinical notes by developing and utilizing a natural language processing system. METHODS: Nearly 600,000 clinical visit notes from two large U.S.-based home healthcare agencies were analyzed. The notes were recorded from January 1, 2019 to December 31, 2019. Rule- and machine-learning-based natural language processing algorithms were applied to identify clinical notes containing workplace violence descriptions. RESULTS: The natural language processing algorithms identified 236 clinical notes that included Type II workplace violence toward home healthcare nurses. The prevalence of physical violence was 0.067 incidents per 10,000 home visits. The prevalence of nonphysical violence was 3.76 incidents per 10,000 home visits. The prevalence of any violence was 4 incidents per 10,000 home visits. In comparison, no Type II workplace violence incidents were recorded in the official incident report systems of the two agencies in this same time period. CONCLUSIONS AND APPLICATION TO PRACTICE: Natural language processing can be an effective tool to augment formal reporting by capturing violence incidents from daily, ongoing, large volumes of clinical notes. It can enable managers and clinicians to stay informed of potential violence risks and keep their practice environment safe.


Subject(s)
Workplace Violence; Humans; Natural Language Processing; Workplace; Aggression; Risk Management
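The prevalence figures above are simple rates per 10,000 home visits; with illustrative incident counts over a denominator of roughly 600,000 visits (consistent in scale with the abstract, not its exact data):

```python
def prevalence_per_10k(incidents, visits):
    """Incidents per 10,000 home visits."""
    return incidents / visits * 10_000

visits = 600_000  # illustrative denominator
print(round(prevalence_per_10k(4, visits), 3))    # → 0.067
print(round(prevalence_per_10k(226, visits), 2))  # → 3.77
```

Framing counts as rates per 10,000 visits is what lets tiny absolute numbers (4 physical incidents) be compared meaningfully across agencies of different sizes.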
9.
JMIR Med Inform; 11: e44977, 2023 Apr 20.
Article in English | MEDLINE | ID: mdl-37079367

ABSTRACT

BACKGROUND: The clinical narrative in electronic health records (EHRs) carries valuable information for predictive analytics; however, its free-text form is difficult to mine and analyze for clinical decision support (CDS). Large-scale clinical natural language processing (NLP) pipelines have focused on data warehouse applications for retrospective research efforts. There remains a paucity of evidence for implementing NLP pipelines at the bedside for health care delivery. OBJECTIVE: We aimed to detail a hospital-wide, operational pipeline to implement a real-time NLP-driven CDS tool and describe a protocol for an implementation framework with a user-centered design of the CDS tool. METHODS: The pipeline integrated a previously trained open-source convolutional neural network model for screening opioid misuse that leveraged EHR notes mapped to standardized medical vocabularies in the Unified Medical Language System. A sample of 100 adult encounters was reviewed by a physician informaticist for silent testing of the deep learning algorithm before deployment. An end user interview survey was developed to examine the user acceptability of a best practice alert (BPA) to provide the screening results with recommendations. The planned implementation also included a human-centered design with user feedback on the BPA, an implementation framework with cost-effectiveness, and a noninferiority patient outcome analysis plan. RESULTS: The pipeline was a reproducible workflow with a shared pseudocode for a cloud service to ingest, process, and store clinical notes as Health Level 7 messages from a major EHR vendor in an elastic cloud computing environment. Feature engineering of the notes used an open-source NLP engine, and the features were fed into the deep learning algorithm, with the results returned as a BPA in the EHR.
On-site silent testing of the deep learning algorithm demonstrated a sensitivity of 93% (95% CI 66%-99%) and specificity of 92% (95% CI 84%-96%), similar to published validation studies. Before deployment, approvals were received across hospital committees for inpatient operations. Five interviews were conducted; they informed the development of an educational flyer and led to further modifications of the BPA to exclude certain patients and to allow refusal of the recommendations. The longest delay in pipeline development stemmed from cybersecurity approvals, especially for the exchange of protected health information between the Microsoft (Microsoft Corp) and Epic (Epic Systems Corp) cloud vendors. In silent testing, the resultant pipeline delivered a BPA to the bedside within minutes of a provider entering a note in the EHR. CONCLUSIONS: The components of the real-time NLP pipeline were detailed, with open-source tools and pseudocode for other health systems to benchmark. The deployment of medical artificial intelligence systems in routine clinical care presents an important yet unfulfilled opportunity, and our protocol aimed to close the gap in the implementation of artificial intelligence-driven CDS. TRIAL REGISTRATION: ClinicalTrials.gov NCT05745480; https://www.clinicaltrials.gov/ct2/show/NCT05745480.
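Ingesting notes as Health Level 7 (HL7) v2 messages, as the pipeline above does, amounts to parsing carriage-return-separated, pipe-delimited segments. A minimal sketch; a production pipeline would use a full HL7 library, and the message content below is hypothetical:

```python
def parse_hl7_segments(message):
    """Minimal HL7 v2 parser sketch: split a message into segments keyed by
    segment ID (MSH, OBX, ...), each a list of pipe-delimited field lists."""
    segments = {}
    for line in message.strip().split("\r"):  # HL7 v2 separates segments with CR
        fields = line.split("|")
        segments.setdefault(fields[0], []).append(fields[1:])
    return segments

# Hypothetical MDM-style message carrying a note fragment in OBX-5.
msg = ("MSH|^~\\&|EHR|HOSP|NLP|CDS|202301011200||MDM^T02|123|P|2.5\r"
       "OBX|1|TX|NOTE||Patient endorses opioid cravings.")
seg = parse_hl7_segments(msg)
print(seg["OBX"][0][4])  # → Patient endorses opioid cravings.
```

In the described architecture, the text pulled from such segments is what gets fed to feature engineering and the screening model, with the BPA returned to the EHR.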

10.
JAMIA Open; 5(2): ooac034, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35663115

ABSTRACT

Objective: To assess the overlap of information between the electronic health record (EHR) and patient-nurse verbal communication in home healthcare (HHC). Methods: Patient-nurse verbal communications during home visits were recorded between February 16, 2021 and September 2, 2021 with patients served by an organization located in the Northeast United States. Twenty-two audio recordings for 15 patients were transcribed. To compare information overlap, problems and interventions were manually annotated in the transcriptions as well as in the EHR data corresponding to the HHC visits, including structured data and clinical notes. Results: About 30% (1534/5118) of utterances (ie, spans of spoken language bounded by silence or a change of speaker) were identified as including problems or interventions. A total of 216 problems and 492 interventions were identified through verbal communication among all the patients in the study. Approximately 50.5% of the problems and 20.8% of the interventions discussed during verbal communication were not documented in the EHR. Preliminary results showed statistical differences between racial groups in a comparison of problems and interventions. Discussion: This study was the first to investigate the extent to which problems and interventions were mentioned in patient-nurse verbal communication during HHC visits and whether this information was documented in the EHR. Our analysis identified gaps in information overlap and possible racial disparities. Conclusion: Our results highlight the value of analyzing communications between HHC patients and nurses. Future studies should explore ways to capture information from verbal communication using automated speech recognition.
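The headline overlap statistics, the share of verbally discussed items absent from the EHR, reduce to a set difference over the annotations; a sketch with toy items (illustrative, not the study's data):

```python
def undocumented_rate(verbal_items, ehr_items):
    """Share of items mentioned in verbal communication but absent from the EHR."""
    verbal = set(verbal_items)
    missing = verbal - set(ehr_items)  # discussed aloud, never documented
    return len(missing) / len(verbal)

# Toy annotated problems/interventions from one visit.
verbal = {"pain management", "wound care", "fall risk", "medication teaching"}
ehr = {"wound care", "medication teaching"}
print(round(undocumented_rate(verbal, ehr), 2))  # → 0.5
```

Computing this rate separately for problems and interventions, and then by patient subgroup, is what surfaces both the documentation gaps and the potential disparities described above.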
