Results 1 - 14 of 14
1.
Crit Care Explor ; 5(5): e0897, 2023 May.
Article in English | MEDLINE | ID: mdl-37151895

ABSTRACT

Hospital early warning systems that use machine learning (ML) to predict clinical deterioration are increasingly being used to aid clinical decision-making. However, it is not known how ML predictions complement physician and nurse judgment. Our objective was to train and validate an ML model to predict patient deterioration and compare model predictions with real-world physician and nurse predictions. DESIGN: Retrospective and prospective cohort study. SETTING: Academic tertiary care hospital. PATIENTS: Adult general internal medicine hospitalizations. MEASUREMENTS AND MAIN RESULTS: We developed and validated a neural network model to predict in-hospital death and ICU admission in 23,528 hospitalizations between April 2011 and April 2019. We then compared model predictions with 3,374 prospectively collected predictions from nurses, residents, and attending physicians about their own patients in 960 hospitalizations between April 30 and August 28, 2019. ML model predictions achieved clinician-level accuracy for predicting ICU admission or death (ML median F1 score 0.32 [interquartile range (IQR) 0.30-0.34], AUC 0.77 [IQR 0.76-0.78]; clinicians' median F1 score 0.33 [IQR 0.30-0.35], AUC 0.64 [IQR 0.63-0.66]). ML predictions were more accurate than clinicians for ICU admission. Of all ICU admissions and deaths, 36% occurred in hospitalizations where the model and clinicians disagreed. Combining human and model predictions detected 49% of clinical deterioration events, improving sensitivity by 16% compared with clinicians alone and by 24% compared with the model alone, while maintaining a positive predictive value of 33%, thus keeping false alarms at a clinically acceptable level. CONCLUSIONS: ML models can complement clinician judgment to predict clinical deterioration in hospital. These findings demonstrate important opportunities for human-computer collaboration to improve prognostication and personalized medicine in hospital.
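The pooling result reported above can be reproduced in miniature with a union rule: flag deterioration whenever either the clinicians or the model do, then check sensitivity and positive predictive value. A minimal Python sketch (function names and toy data are illustrative, not from the study):

```python
def sensitivity_ppv(y_true, y_pred):
    """Sensitivity and positive predictive value for binary predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    return sens, ppv

def combine_or(clinician_pred, model_pred):
    # flag a hospitalization if either source predicts deterioration
    return [int(c or m) for c, m in zip(clinician_pred, model_pred)]
```

Because clinicians and the model err on different patients (36% of events occurred where they disagreed), the union catches events either source alone would miss, at the cost of more positive calls.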

3.
Front Digit Health ; 4: 932123, 2022.
Article in English | MEDLINE | ID: mdl-36133802

ABSTRACT

Background: Deploying safe and effective machine learning (ML) models is essential to realize the promise of artificial intelligence for improved healthcare. Yet, there remains a large gap between the number of high-performing ML models trained on healthcare data and the actual deployment of these models. Here, we describe the deployment of CHARTwatch, an artificial intelligence-based early warning system designed to predict patient risk of clinical deterioration. Methods: We describe the end-to-end infrastructure that was developed to deploy CHARTwatch and outline the process from data extraction to communicating patient risk scores in real time to physicians and nurses. We then describe the various challenges that were faced in deployment, including technical issues (e.g., unstable database connections), process-related challenges (e.g., changes in how a critical lab value is measured), and challenges related to deploying a clinical system in the middle of a pandemic. We report various measures to quantify the success of the deployment: model performance, adherence to workflows, and infrastructure uptime/downtime. Ultimately, success is driven by end-user adoption and impact on relevant clinical outcomes. We assess our deployment process by evaluating how closely we followed existing guidance for good machine learning practice (GMLP) and identify gaps that are not addressed in this guidance. Results: The model demonstrated strong and consistent performance in real time in the first 19 months after deployment (AUC 0.76), comparable to its performance on the held-out silent-deployment test data (AUC 0.79). The infrastructure remained online for >99% of the time in the first year of deployment. Our deployment adhered to all 10 GMLP guiding principles. Several steps were crucial for deployment but are not mentioned, or lack detail, in the GMLP principles, including the need for a silent testing period, the creation of robust downtime protocols, and the importance of end-user engagement. Evaluation of impacts on clinical outcomes and adherence to clinical protocols is underway. Conclusion: We deployed an artificial intelligence-based early warning system to predict clinical deterioration in hospital. Careful attention to data infrastructure, identifying problems in a silent testing period, close monitoring during deployment, and strong engagement with end-users were critical for successful deployment.
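A monitoring check of the kind implied by "close monitoring during deployment" might compare live discrimination against the silent-period reference. A hedged sketch in Python (the `drift_alarm` reference value and tolerance are our assumptions, not CHARTwatch's actual monitoring logic):

```python
def auc(y_true, scores):
    """AUC via the Mann-Whitney U formulation (no external libraries)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def drift_alarm(live_auc, reference_auc=0.79, tolerance=0.05):
    # raise a flag when live discrimination falls well below the
    # silent-testing-period reference; the tolerance is illustrative
    return live_auc < reference_auc - tolerance
```

On the reported figures, a live AUC of 0.76 against a held-out reference of 0.79 would sit within such a tolerance band rather than trigger an alarm.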

4.
Neuroradiology ; 64(12): 2357-2362, 2022 Dec.
Article in English | MEDLINE | ID: mdl-35913525

ABSTRACT

PURPOSE: Data extraction from radiology free-text reports is time-consuming when performed manually. Recently, more automated extraction methods using natural language processing (NLP) have been proposed. A previously developed rule-based NLP algorithm showed promise in its ability to extract stroke-related data from radiology reports. We aimed to externally validate the accuracy of CHARTextract, a rule-based NLP algorithm, in extracting stroke-related data from free-text radiology reports. METHODS: Free-text reports of CT angiography (CTA) and CT perfusion (CTP) studies of consecutive patients with acute ischemic stroke admitted to a regional stroke center for endovascular thrombectomy were analyzed from January 2015 to 2021. Stroke-related variables were manually extracted from the clinical reports as the reference standard, including proximal and distal anterior circulation occlusion, posterior circulation occlusion, presence of ischemia or hemorrhage, Alberta stroke program early CT score (ASPECTS), and collateral status. The same variables were extracted using the rule-based NLP algorithm. The NLP algorithm's accuracy, specificity, sensitivity, positive predictive value (PPV), and negative predictive value (NPV) were assessed. RESULTS: The NLP algorithm's accuracy was >90% for identifying distal anterior occlusion, posterior circulation occlusion, hemorrhage, and ASPECTS. Accuracy was 85%, 74%, and 79% for proximal anterior circulation occlusion, presence of ischemia, and collateral status, respectively. The algorithm confirmed the absence of variables from radiology reports with 87-100% accuracy. CONCLUSIONS: Rule-based NLP has moderate to good performance for stroke-related data extraction from free-text imaging reports. The algorithm's accuracy was affected by inconsistent report styles and lexicon among reporting radiologists.


Subjects
Ischemic Stroke, Stroke, Humans, Natural Language Processing, Stroke/diagnostic imaging, Algorithms, Automation
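Rule-based extraction of the kind validated above typically pairs keyword patterns with simple negation handling. A toy Python sketch (these regexes are illustrative and are not the published CHARTextract rulesets):

```python
import re

def extract_stroke_findings(report):
    """Toy rule-based extraction from a free-text CTA/CTP report."""
    text = report.lower()
    findings = {
        # occlusion mentioned near a proximal vessel name, same sentence
        "proximal_occlusion": bool(
            re.search(r"occlusion[^.]*\b(m1|ica)\b|\b(m1|ica)\b[^.]*occlusion", text)
        ),
        # positive mention of hemorrhage without a preceding negation
        "hemorrhage": bool(re.search(r"\bhemorrhage\b", text))
        and not re.search(r"\bno (acute )?hemorrhage\b", text),
    }
    m = re.search(r"aspects?\s*(?:score)?\s*(?:of|=|:)?\s*(\d{1,2})", text)
    findings["aspects"] = int(m.group(1)) if m else None
    return findings
```

Restricting each pattern to within-sentence spans (`[^.]*`) avoids matches that jump across report sentences, one of the pitfalls behind the style- and lexicon-related errors the study reports.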
5.
JMIR Med Inform ; 10(1): e25157, 2022 Jan 12.
Article in English | MEDLINE | ID: mdl-35019849

ABSTRACT

BACKGROUND: The Expanded Disability Status Scale (EDSS) score is a widely used measure to monitor disability progression in people with multiple sclerosis (MS). However, extracting and deriving the EDSS score from unstructured electronic health records can be time-consuming. OBJECTIVE: We aimed to compare rule-based and deep learning natural language processing algorithms for detecting and predicting the total EDSS score and EDSS functional system subscores from the electronic health records of patients with MS. METHODS: We studied 17,452 electronic health records of 4906 patients with MS followed at one of Canada's largest MS clinics between June 2015 and July 2019. We randomly divided the records into training (80%) and test (20%) data sets and compared the performance characteristics of 3 natural language processing models. First, we applied a rule-based approach, extracting the EDSS score from sentences containing the keyword "EDSS." Next, we trained a convolutional neural network (CNN) model to predict the 19 half-step increments of the EDSS score. Finally, we used a combined rule-based-CNN model. For each approach, we determined the accuracy, precision, recall, and F-score against the reference standard, which was the manually labeled EDSS scores in the clinic database. RESULTS: Overall, the combined keyword-CNN model demonstrated the best performance, with accuracy, precision, recall, and F-score of 0.90, 0.83, 0.83, and 0.83, respectively. Respective figures for the rule-based and CNN models individually were 0.57, 0.91, 0.65, and 0.70, and 0.86, 0.70, 0.70, and 0.70. Because of missing data, model performance for the EDSS subscores was lower than that for the total EDSS score. Performance improved when considering only notes with known values of the EDSS subscores. CONCLUSIONS: A combined keyword-CNN natural language processing model can extract and accurately predict EDSS scores from patient records. This approach can be automated for efficient information extraction in clinical and research settings.
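The rule-based arm described above, finding candidate scores in sentences containing the keyword "EDSS", can be sketched as follows (the regexes are illustrative; the study's actual rules and the CNN step are not reproduced):

```python
import re

def extract_edss(note):
    """Find candidate EDSS values in sentences that mention 'EDSS'."""
    scores = []
    for sentence in re.split(r"(?<=[.!?])\s+", note):
        if "edss" not in sentence.lower():
            continue
        # EDSS is reported in half-step increments from 0 to 10
        for m in re.finditer(r"\b(10|\d(?:\.[05])?)\b", sentence):
            scores.append(float(m.group(1)))
    return scores
```

A keyword rule like this has high precision when a score is stated plainly but misses notes where the score must be inferred, which is where the study's CNN component adds recall.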

6.
Thromb Res ; 209: 51-58, 2022 Jan.
Article in English | MEDLINE | ID: mdl-34871982

ABSTRACT

BACKGROUND: Identifying venous thromboembolism (VTE) from large clinical and administrative databases is important for research and quality improvement. OBJECTIVE: To develop and validate natural language processing (NLP) algorithms to identify VTE from radiology reports among general internal medicine (GIM) inpatients. METHODS: This cross-sectional study included GIM hospitalizations between April 1, 2010 and March 31, 2017 at 5 hospitals in Toronto, Ontario, Canada. We developed NLP algorithms to identify pulmonary embolism (PE) and deep venous thrombosis (DVT) from radiologist reports of thoracic computed tomography (CT), extremity compression ultrasound (US), and nuclear ventilation-perfusion (VQ) scans in a training dataset of 1551 hospitalizations. We compared the accuracy of our NLP algorithms, the previously published "simpleNLP" tool, and administrative discharge diagnosis codes (ICD-10-CA) for PE and DVT against the gold standard of manual review in a separate random sample of 4000 GIM hospitalizations. RESULTS: Our NLP algorithms were highly accurate for identifying DVT from US, with sensitivity 0.94, positive predictive value (PPV) 0.90, and area under the receiver-operating-characteristic curve (AUC) 0.96, and for identifying PE from CT, with sensitivity 0.91, PPV 0.89, and AUC 0.96. Administrative diagnosis codes and the simpleNLP tool were less accurate for DVT (ICD-10-CA sensitivity 0.63, PPV 0.43, AUC 0.81; simpleNLP sensitivity 0.41, PPV 0.36, AUC 0.66) and PE (ICD-10-CA sensitivity 0.83, PPV 0.70, AUC 0.91; simpleNLP sensitivity 0.89, PPV 0.62, AUC 0.92). CONCLUSIONS: Administrative diagnosis codes are unreliable for identifying VTE in hospitalized patients. We developed highly accurate NLP algorithms to identify VTE from radiology reports in a multicentre sample and have made the algorithms freely available to the academic community with a user-friendly tool (https://lks-chart.github.io/CHARTextract-docs/08-downloads/rulesets.html#venous-thromboembolism-vte-rulesets).


Subjects
Pulmonary Embolism, Radiology, Venous Thromboembolism, Algorithms, Cross-Sectional Studies, Hospitalization, Humans, International Classification of Diseases, Natural Language Processing, Ontario, Pulmonary Embolism/diagnostic imaging, Venous Thromboembolism/diagnostic imaging
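Identifying PE from a CT report hinges on negation handling, since reports frequently state what is absent. A NegEx-style Python sketch (the cue list and sentence-window logic are simplified assumptions, not the published algorithm):

```python
NEGATION_CUES = ("no evidence of", "negative for", "without", "no ")

def detect_pe(ct_report):
    """A finding is positive unless a negation cue precedes the
    mention within the same sentence (NegEx-style sketch)."""
    for sentence in ct_report.lower().split("."):
        idx = sentence.find("pulmonary embol")  # embolism / embolus / emboli
        if idx == -1:
            continue
        if not any(cue in sentence[:idx] for cue in NEGATION_CUES):
            return True
    return False
```

This distinction is exactly where diagnosis codes fall short: an ICD code records that PE was considered or billed, while the report text records whether it was actually seen.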
8.
Diabetes Obes Metab ; 23(10): 2311-2319, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34142418

ABSTRACT

AIM: To predict the risk of hypoglycaemia using machine-learning techniques in hospitalized patients. METHODS: We conducted a retrospective cohort study of patients hospitalized under general internal medicine (GIM) and cardiovascular surgery (CV) at a tertiary care teaching hospital in Toronto, Ontario. Three models were generated using supervised machine learning: least absolute shrinkage and selection operator (LASSO) logistic regression, gradient-boosted trees, and a recurrent neural network. Each model included baseline patient data and time-varying data. Natural language processing was used to incorporate text data from physician and nursing notes. RESULTS: We included 8492 GIM admissions and 8044 CV admissions. Hypoglycaemia occurred in 16% of GIM admissions and 13% of CV admissions. The area under the curve for the models in the held-out validation set was approximately 0.80 on the GIM ward and 0.82 on the CV ward. When the threshold for hypoglycaemia was lowered to 2.9 mmol/L (52 mg/dL), similar results were observed. Among patients in the highest decile of risk, the positive predictive value was approximately 50% and the sensitivity was 99%. CONCLUSION: Machine-learning approaches can accurately identify patients at high risk of hypoglycaemia in hospital. Future work will involve evaluating whether implementing this model with targeted clinical interventions can improve clinical outcomes.


Subjects
Hypoglycemia, Machine Learning, Hospitals, Humans, Hypoglycemia/diagnosis, Hypoglycemia/epidemiology, Logistic Models, Retrospective Studies
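The highest-decile analysis above corresponds to flagging the top 10% of admissions by predicted risk. A minimal sketch of that ranking rule (ours, for illustration; the study's models and thresholds are not reproduced):

```python
def flag_top_decile(risk_scores):
    """Flag the 10% of admissions with the highest predicted risk."""
    n_flag = max(1, len(risk_scores) // 10)
    ranked = sorted(range(len(risk_scores)),
                    key=lambda i: risk_scores[i], reverse=True)
    flagged = set(ranked[:n_flag])
    return [i in flagged for i in range(len(risk_scores))]
```

Decile-based flagging fixes the alert volume in advance, which is often easier to operationalize on a ward than a probability cutoff.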
9.
JMIR Med Inform ; 9(5): e24381, 2021 May 04.
Article in English | MEDLINE | ID: mdl-33944791

ABSTRACT

BACKGROUND: Diagnostic neurovascular imaging data are important in stroke research, but obtaining these data typically requires laborious manual chart reviews. OBJECTIVE: We aimed to determine the accuracy of a natural language processing (NLP) approach to extract information on the presence and location of vascular occlusions as well as other stroke-related attributes based on free-text reports. METHODS: From the full reports of 1320 consecutive computed tomography (CT), CT angiography, and CT perfusion scans of the head and neck performed at a tertiary stroke center between October 2017 and January 2019, we manually extracted data on the presence of proximal large vessel occlusion (primary outcome), as well as distal vessel occlusion, ischemia, hemorrhage, Alberta stroke program early CT score (ASPECTS), and collateral status (secondary outcomes). Reports were randomly split into training (n=921) and validation (n=399) sets, and attributes were extracted using rule-based NLP. We reported the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and the overall accuracy of the NLP approach relative to the manually extracted data. RESULTS: The overall prevalence of large vessel occlusion was 12.2%. In the training sample, the NLP approach identified this attribute with an overall accuracy of 97.3% (95.5% sensitivity, 98.1% specificity, 84.1% PPV, and 99.4% NPV). In the validation set, the overall accuracy was 95.2% (90.0% sensitivity, 97.4% specificity, 76.3% PPV, and 98.5% NPV). The accuracy of identifying distal or basilar occlusion as well as hemorrhage was also high, but there were limitations in identifying cerebral ischemia, ASPECTS, and collateral status. CONCLUSIONS: NLP may improve the efficiency of large-scale imaging data collection for stroke surveillance and research.
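The five metrics reported in this study all follow from a single 2x2 confusion table; a small helper makes the relationships explicit (a generic sketch, not code from the study):

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard test-performance metrics from a 2x2 confusion table."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }
```

With a 12.2% prevalence of large vessel occlusion, the pattern above (high NPV, lower PPV) is expected: even a specific test accumulates false positives relative to true positives when the target condition is uncommon.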

10.
PLoS One ; 16(3): e0247872, 2021.
Article in English | MEDLINE | ID: mdl-33657184

ABSTRACT

BACKGROUND: Tuberculosis (TB) is a major cause of death worldwide. TB research draws heavily on clinical cohorts, which can be generated using electronic health records (EHR), but granular information extracted from unstructured EHR data is limited. The St. Michael's Hospital TB database (SMH-TB) was established to address gaps in EHR-derived TB clinical cohorts and provide researchers and clinicians with detailed, granular data related to TB management and treatment. METHODS: We collected and validated multiple layers of EHR data from the TB outpatient clinic at St. Michael's Hospital, Toronto, Ontario, Canada to generate the SMH-TB database. SMH-TB contains structured data taken directly from the EHR, as well as variables generated using natural language processing (NLP) to extract relevant information from free text within clinic, radiology, and other notes. NLP performance was assessed using recall, precision, and F1 score averaged across variable labels. We present characteristics of the cohort population using binomial proportions and 95% confidence intervals (CI), with and without adjusting for NLP misclassification errors. RESULTS: SMH-TB currently contains retrospective patient data spanning 2011 to 2018, for a total of 3298 patients (N = 3237 with at least 1 associated dictation). Performance of the TB diagnosis and medication NLP rulesets surpassed 93% in recall, precision, and F1, indicating good generalizability. We estimated that 20% (95% CI: 18.4-21.2%) of patients were diagnosed with active TB and 46% (95% CI: 43.8-47.2%) with latent TB. After adjusting for potential misclassification, the proportions diagnosed with active and latent TB were 18% (95% CI: 16.8-19.7%) and 40% (95% CI: 37.8-41.6%), respectively. CONCLUSION: SMH-TB is a unique database that includes a breadth of data derived from both structured and unstructured EHR data using NLP rulesets. The data are available for a variety of research applications, such as clinical epidemiology, quality improvement, and mathematical modeling studies.


Subjects
Electronic Health Records, Natural Language Processing, Tuberculosis/epidemiology, Databases, Factual, Female, Hospitals, Humans, Information Storage and Retrieval, Male, Ontario/epidemiology, Retrospective Studies, Tuberculosis/diagnosis
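The abstract does not specify which misclassification adjustment was used; one standard option is the Rogan-Gladen correction, which rescales an apparent prevalence by the classifier's sensitivity and specificity. Shown here for illustration with hypothetical inputs:

```python
def rogan_gladen(apparent_prevalence, sensitivity, specificity):
    """Correct an apparent prevalence for classifier misclassification.

    Valid when sensitivity + specificity > 1 (an informative classifier).
    """
    return (apparent_prevalence + specificity - 1) / (sensitivity + specificity - 1)
```

The direction of the adjustment matches the paper's result: with imperfect precision, apparent prevalence (20% active, 46% latent) overstates the corrected figures (18% and 40%).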
12.
PLoS One ; 14(3): e0212342, 2019.
Article in English | MEDLINE | ID: mdl-30917120

ABSTRACT

Language is one of the earliest capacities affected by cognitive change. To monitor that change longitudinally, we have developed Talk2Me, a web portal for remote linguistic data acquisition consisting of a variety of tasks. To facilitate research into different aspects of language, we provide baselines, including the relations between different scoring functions within and across tasks. These data can be used to augment studies that require a normative model; for example, we provide baseline classification results for identifying dementia. The data are released publicly along with a comprehensive open-source package for extracting approximately two thousand lexico-syntactic, acoustic, and semantic features. This package can be applied to any study that includes linguistic data. To our knowledge, this is the most comprehensive publicly available software for extracting linguistic features. The software includes scoring functions for the different tasks.


Subjects
Data Collection/methods, Linguistics/methods, Adult, Aged, Aged, 80 and over, Female, Humans, Language, Linguistics/instrumentation, Male, Middle Aged, Patient Portals, Semantics, Software, Young Adult
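Feature extraction of the kind the package performs can be illustrated with two classic lexical measures (a toy sketch; not the package's actual API):

```python
import re

def lexical_features(transcript):
    """Two classic lexical measures from a task transcript."""
    tokens = re.findall(r"[a-z']+", transcript.lower())
    if not tokens:
        return {"n_tokens": 0, "type_token_ratio": 0.0, "mean_word_length": 0.0}
    return {
        "n_tokens": len(tokens),
        # vocabulary diversity: unique words over total words
        "type_token_ratio": len(set(tokens)) / len(tokens),
        "mean_word_length": sum(map(len, tokens)) / len(tokens),
    }
```

Measures like type-token ratio are among the lexico-syntactic features commonly examined in dementia-detection studies, since vocabulary diversity tends to narrow with cognitive decline.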
14.
J Biomed Inform ; 100S: 100057, 2019.
Article in English | MEDLINE | ID: mdl-34384583

ABSTRACT

Representing words as numerical vectors based on the contexts in which they appear has become the de facto method of analyzing text with machine learning. In this paper, we provide a guide for training these representations on clinical text data, using a survey of relevant research. Specifically, we discuss different types of word representations, clinical text corpora, available pre-trained clinical word vector embeddings, intrinsic and extrinsic evaluation, applications, and limitations of these approaches. This work can be used as a blueprint for clinicians and healthcare workers who may want to incorporate clinical text features in their own models and applications.
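The idea of representing words by the contexts in which they appear predates learned embeddings; the count-based version can be sketched in a few lines (illustrative only, not a method from the survey):

```python
from collections import Counter, defaultdict

def cooccurrence_vectors(corpus, window=2):
    """Count-based context vectors: each word is represented by the
    counts of words appearing within +/- `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors
```

Dense embeddings such as word2vec can be viewed as compressed, learned versions of these sparse co-occurrence counts, which is why context choice (e.g., clinical notes versus general text) matters so much for clinical applications.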
