Search | Biblioteca Virtual em Saúde

1.

BERT-Based Neural Network for Inpatient Fall Detection From Electronic Medical Records: Retrospective Cohort Study.

Cheligeer, Cheligeer; Wu, Guosong; Lee, Seungwon; Pan, Jie; Southern, Danielle A; Martin, Elliot A; Sapiro, Natalie; Eastwood, Cathy A; Quan, Hude; Xu, Yuan.

JMIR Med Inform ; 12: e48995, 2024 Jan 30.

Article in English | MEDLINE | ID: mdl-38289643

ABSTRACT

BACKGROUND: Inpatient falls are a substantial concern for health care providers and are associated with negative outcomes for patients. Automated detection of falls using machine learning (ML) algorithms may aid in improving patient safety and reducing the occurrence of falls. OBJECTIVE: This study aims to develop and evaluate an ML algorithm for inpatient fall detection using multidisciplinary progress record notes and a pretrained Bidirectional Encoder Representations from Transformers (BERT) language model. METHODS: A cohort of 4323 adult patients admitted to 3 acute care hospitals in Calgary, Alberta, Canada from 2016 to 2021 was randomly sampled. Trained reviewers determined falls from patient charts, which were linked to electronic medical records and administrative data. The BERT-based language model was pretrained on clinical notes, and a fall detection algorithm was developed based on a neural network binary classification architecture. RESULTS: To address various use scenarios, we developed 3 different Alberta hospital notes-specific BERT models: a high sensitivity model (sensitivity 97.7, IQR 87.7-99.9), a high positive predictive value model (positive predictive value 85.7, IQR 57.2-98.2), and a high F1-score model (F1=64.4). Our proposed method outperformed 3 classical ML algorithms and an International Classification of Diseases code-based algorithm for fall detection, showing its potential for improved performance in diverse clinical settings. CONCLUSIONS: The developed algorithm provides an automated and accurate method for inpatient fall detection using multidisciplinary progress record notes and a pretrained BERT language model. This method could be implemented in clinical practice to improve patient safety and reduce the occurrence of falls in hospitals.
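
The abstract does not say how the three models differ. One common way to trade sensitivity against positive predictive value is to move the decision threshold on a classifier's output score. The sketch below illustrates that idea only; it is an assumption, not the authors' procedure, and the scores, labels, and function names are hypothetical.

```python
def confusion(scores, labels, threshold):
    """Count TP, FP, FN, TN when notes scoring >= threshold are flagged as falls."""
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if flagged and label:
            tp += 1
        elif flagged and not label:
            fp += 1
        elif not flagged and label:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def metrics(tp, fp, fn, tn):
    """Sensitivity, positive predictive value, and F1 from confusion counts."""
    sens = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * ppv * sens / (ppv + sens) if ppv + sens else 0.0
    return {"sensitivity": sens, "ppv": ppv, "f1": f1}


def best_threshold(scores, labels, key):
    """Return the lowest candidate threshold that maximizes the chosen metric."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: metrics(*confusion(scores, labels, t))[key])


# Hypothetical classifier scores and chart-review labels (1 = fall).
scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 0]

for key in ("sensitivity", "ppv", "f1"):
    print(key, best_threshold(scores, labels, key))
```

On this toy data each metric is maximized at a different threshold, which is why a single scoring model can serve several use scenarios.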

2.

Automated extraction of weight, height, and obesity in electronic medical records are highly valid.

Sandhu, Namneet; Krusina, Alexander; Quan, Hude; Walker, Robin; Martin, Elliot A; Eastwood, Cathy A; Southern, Danielle A.

Obes Sci Pract ; 10(1): e705, 2024 Feb.

Article in English | MEDLINE | ID: mdl-38263997

ABSTRACT

Objective: Coding of obesity using the International Classification of Diseases (ICD) in healthcare administrative databases is under-reported and thus unreliable for measuring prevalence or incidence. This study aimed to develop and test a rule-based algorithm for automating the detection and severity of obesity using height and weight collected in several sections of the Electronic Medical Records (EMRs). Methods: In this cross-sectional study, 1904 inpatient charts randomly selected in three hospitals in Calgary, Canada between January and June 2015 were reviewed and linked with AllScripts Sunrise Clinical Manager EMRs. A rule-based algorithm was created which looks for patients' height and weight values recorded in EMRs. Clinical notes were split into sentences and searched for height and weight, and BMI was computed. Results: The study cohort consisted of 1904 patients with 50.8% females and 43.3% > 64 years of age. The final model to identify obesity within EMRs resulted in a sensitivity of 92.9%, specificity of 98.4%, positive predictive value of 96.7%, negative predictive value of 96.6%, and F1 score of 94.8%. Conclusions: This study developed a highly valid rule-based EMR algorithm that detects height and weight. This could allow large-scale analyses using obesity that were previously not possible.
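
The rule-based steps described above (split notes into sentences, search for height and weight, compute BMI) can be sketched as follows. This is a minimal illustration, not the published algorithm: the regular expressions, the metric-only units, and the BMI cut-offs of 30, 35, and 40 kg/m² (the usual WHO obesity classes) are assumptions, since the abstract does not give them.

```python
import re

# Hypothetical patterns: a keyword, a few non-digit characters, a number, a unit.
HEIGHT_RE = re.compile(r"\bheight\D{0,15}?(\d+(?:\.\d+)?)\s*cm\b", re.IGNORECASE)
WEIGHT_RE = re.compile(r"\bweight\D{0,15}?(\d+(?:\.\d+)?)\s*kg\b", re.IGNORECASE)


def split_sentences(note):
    """Split on sentence-ending punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", note) if s.strip()]


def extract_bmi(note):
    """Return BMI in kg/m^2 from the first height and weight found, else None."""
    height_cm = weight_kg = None
    for sentence in split_sentences(note):
        h = HEIGHT_RE.search(sentence)
        w = WEIGHT_RE.search(sentence)
        if h and height_cm is None:
            height_cm = float(h.group(1))
        if w and weight_kg is None:
            weight_kg = float(w.group(1))
    if height_cm is None or weight_kg is None or height_cm <= 0:
        return None
    return weight_kg / (height_cm / 100) ** 2


def obesity_class(bmi):
    """Map BMI to a severity label using the assumed WHO cut-offs."""
    if bmi is None:
        return "unknown"
    if bmi >= 40:
        return "obese class III"
    if bmi >= 35:
        return "obese class II"
    if bmi >= 30:
        return "obese class I"
    return "not obese"


note = "Patient alert and oriented. Height: 170 cm. Weight: 95 kg on admission."
bmi = extract_bmi(note)
print(round(bmi, 1), obesity_class(bmi))
```

A production rule set would also need unit conversion, plausibility ranges, and handling of conflicting values across note sections, none of which this sketch attempts.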

3.

Exploring the reliability of inpatient EMR algorithms for diabetes identification.

Lee, Seungwon; Martin, Elliot A; Pan, Jie; Eastwood, Cathy A; Southern, Danielle A; Campbell, David J T; Shaheen, Abdel Aziz; Quan, Hude; Butalia, Sonia.

BMJ Health Care Inform ; 30(1), 2023 Dec 20.

Article in English | MEDLINE | ID: mdl-38123357

ABSTRACT

INTRODUCTION: Accurate identification of medical conditions within a real-time inpatient setting is crucial for health systems. Current inpatient comorbidity algorithms rely on integrating various sources of administrative data, but at times, there is a considerable lag in obtaining and linking these data. Our study objective was to develop electronic medical records (EMR) data-based inpatient diabetes phenotyping algorithms. MATERIALS AND METHODS: A chart review on 3040 individuals was completed, and 583 had diabetes. We linked EMR data on these individuals to the International Classification of Diseases (ICD) administrative databases. The following EMR-data-based diabetes algorithms were developed: (1) laboratory data, (2) medication data, (3) laboratory and medications data, (4) diabetes concept keywords and (5) diabetes free-text algorithm. Combined algorithms joined the above algorithms with logical OR statements. Algorithm performance was measured using chart review as the gold standard. We defined the best-performing algorithm as the one that showed high sensitivity (SN) and high positive predictive value (PPV). RESULTS: The algorithms tested generally performed well: ICD-coded data, SN 0.84, specificity (SP) 0.98, PPV 0.93 and negative predictive value (NPV) 0.96; medication and laboratory algorithm, SN 0.90, SP 0.95, PPV 0.80 and NPV 0.97; all document types algorithm, SN 0.95, SP 0.98, PPV 0.94 and NPV 0.99. DISCUSSION: A free-text data-based diabetes algorithm can yield comparable or superior performance to a commonly used ICD-coded algorithm and could supplement existing methods. These types of inpatient EMR-based algorithms for case identification may become a key method for timely resource planning and care delivery.

Subjects

Diabetes Mellitus , Electronic Health Records , Humans , Inpatients , Reproducibility of Results , Algorithms
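
As a rough illustration of the evaluation described in this record, the sketch below combines two per-patient algorithm outputs with a logical OR and scores the result against a chart-review reference. The flag lists and function names are hypothetical; only the four validity measures come from the abstract.

```python
def combine_or(*flags_per_algorithm):
    """A patient is flagged if any of the component algorithms flags them."""
    return [any(flags) for flags in zip(*flags_per_algorithm)]


def _ratio(numerator, denominator):
    """Return NaN instead of raising when a denominator is empty."""
    return numerator / denominator if denominator else float("nan")


def validity(predicted, reference):
    """Sensitivity, specificity, PPV, and NPV against a reference standard."""
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)
    fp = sum(1 for p, r in pairs if p and not r)
    fn = sum(1 for p, r in pairs if not p and r)
    tn = sum(1 for p, r in pairs if not p and not r)
    return {
        "SN": _ratio(tp, tp + fn),
        "SP": _ratio(tn, tn + fp),
        "PPV": _ratio(tp, tp + fp),
        "NPV": _ratio(tn, tn + fn),
    }


# Hypothetical per-patient flags from two algorithms and from chart review.
lab = [True, False, False, False, True, False, False, False]
med = [False, True, False, False, True, False, False, False]
chart = [True, True, True, False, False, False, False, False]

combined = combine_or(lab, med)
print(validity(combined, chart))
```

An OR combination can only add flagged patients, so it tends to raise sensitivity at the possible cost of positive predictive value.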

4.

Cerebrovascular disease case identification in inpatient electronic medical record data using natural language processing.

Pan, Jie; Zhang, Zilong; Peters, Steven Ray; Vatanpour, Shabnam; Walker, Robin L; Lee, Seungwon; Martin, Elliot A; Quan, Hude.

Brain Inform ; 10(1): 22, 2023 Sep 02.

Article in English | MEDLINE | ID: mdl-37658963

ABSTRACT

BACKGROUND: Abstracting cerebrovascular disease (CeVD) from inpatient electronic medical records (EMRs) through natural language processing (NLP) is pivotal for automated disease surveillance and improving patient outcomes. Existing methods rely on coders' abstraction, which has time delays and under-coding issues. This study sought to develop an NLP-based method to detect CeVD using EMR clinical notes. METHODS: CeVD status was confirmed through a chart review on randomly selected hospitalized patients who were 18 years or older and discharged from 3 hospitals in Calgary, Alberta, Canada, between January 1 and June 30, 2015. These patients' chart data were linked to administrative discharge abstract database (DAD) and Sunrise™ Clinical Manager (SCM) EMR database records by Personal Health Number (a unique lifetime identifier) and admission date. We trained multiple NLP predictive models by combining two clinical concept extraction methods and two supervised machine learning (ML) methods: random forest and XGBoost. Using chart review as the reference standard, we compared the model performances with those of the commonly applied International Classification of Diseases (ICD-10-CA) codes, on the metrics of sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). RESULT: Of the study sample (n = 3036), the prevalence of CeVD was 11.8% (n = 360); the median patient age was 63; and females accounted for 50.3% (n = 1528) based on chart data. Among 49 extracted clinical documents from the EMR, four document types were identified as the most influential text sources for identifying CeVD ("nursing transfer report," "discharge summary," "nursing notes," and "inpatient consultation"). The best performing NLP model was XGBoost, combining the Unified Medical Language System concepts extracted by cTAKES (e.g., top-ranked concepts, "Cerebrovascular accident" and "Transient ischemic attack"), and the term frequency-inverse document frequency vectorizer. Compared with ICD codes, the model achieved higher validity overall (ICD codes vs model): sensitivity 25.0% vs 70.0%, specificity 99.3% vs 99.1%, PPV 82.6% vs 87.8%, and NPV 90.8% vs 97.1%. CONCLUSION: The NLP algorithm developed in this study performed better than the ICD code algorithm in detecting CeVD. The NLP models could result in an automated EMR tool for identifying CeVD cases and be applied in future studies such as surveillance and longitudinal studies.
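
To show what a term frequency-inverse document frequency weighting does, here is a minimal textbook version in plain Python. It is a sketch under stated assumptions: whitespace tokens stand in for the cTAKES concepts, the notes are invented, the classifier step is omitted, and library vectorizers typically use smoothed variants of this formula.

```python
import math
from collections import Counter


def tfidf(documents):
    """Weight each term by tf * log(N / df) within each document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical note fragments, not taken from the study data.
notes = [
    "patient had transient ischemic attack",
    "patient had hip fracture",
    "patient denies chest pain",
]
vectors = tfidf(notes)
print(vectors[0])
```

Terms that appear in every note, such as "patient" here, receive zero weight, while terms specific to one note receive the largest weights. That is the property that makes the weighting useful as classifier input.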

5.

Development of machine learning models for the detection of surgical site infections following total hip and knee arthroplasty: a multicenter cohort study.

Wu, Guosong; Cheligeer, Cheligeer; Southern, Danielle A; Martin, Elliot A; Xu, Yuan; Leal, Jenine; Ellison, Jennifer; Bush, Kathryn; Williamson, Tyler; Quan, Hude; Eastwood, Cathy A.

Antimicrob Resist Infect Control ; 12(1): 88, 2023 Sep 02.

Article in English | MEDLINE | ID: mdl-37658409

ABSTRACT

BACKGROUND: Population-based surveillance of surgical site infections (SSIs) requires precise case-finding strategies. We sought to develop and validate machine learning models to automate the detection of complex (deep incisional/organ space) SSI cases. METHODS: This retrospective cohort study included adult patients (age ≥ 18 years) admitted to Calgary, Canada acute care hospitals who underwent primary total elective hip (THA) or knee (TKA) arthroplasty between Jan 1st, 2013 and Aug 31st, 2020. True SSI conditions were judged by the Alberta Health Services Infection Prevention and Control (IPC) program staff. Using the IPC cases as labels, we developed and validated nine XGBoost models to identify deep incisional SSIs, organ space SSIs and complex SSIs using administrative data, electronic medical records (EMR) free text data, and both. The performance of machine learning models was assessed by sensitivity, specificity, positive predictive value, negative predictive value, F1 score, the area under the receiver operating characteristic curve (ROC AUC) and the area under the precision-recall curve (PR AUC). In addition, a bootstrap 95% confidence interval (95% CI) was calculated. RESULTS: There were 22,059 unique patients with 27,360 hospital admissions resulting in 88,351 days of hospital stay. This included 16,561 (60.5%) TKA and 10,799 (39.5%) THA procedures. There were 235 ascertained SSIs. Of them, 77 (32.8%) were superficial incisional SSIs, 57 (24.3%) were deep incisional SSIs, and 101 (42.9%) were organ space SSIs. The incidence rates were 0.37 for superficial incisional SSIs, 0.21 for deep incisional SSIs, 0.37 for organ space SSIs and 0.58 for complex SSIs per 100 surgical procedures. The optimal XGBoost models using administrative data and text data combined achieved a ROC AUC of 0.906 (95% CI 0.835-0.978), PR AUC of 0.637 (95% CI 0.528-0.746), and F1 score of 0.79 (95% CI 0.67-0.90). CONCLUSIONS: Our findings suggest machine learning models derived from administrative data and EMR text data achieved high performance and can be used to automate the detection of complex SSIs.

The incidence rates of surgical site infections following total hip and knee arthroplasty were 0.5 and 0.52 per 100 surgical procedures. The incidence of SSIs varied significantly between care facilities (ranging from 0.53 to 1.71 per 100 procedures). The optimal machine learning model achieved a ROC AUC of 0.906 (95% CI 0.835-0.978), PR AUC of 0.637 (95% CI 0.528-0.746), and F1 score of 0.79 (95% CI 0.67-0.90).

Subjects

Knee Arthroplasty , Adult , Humans , Adolescent , Knee Arthroplasty/adverse effects , Surgical Wound Infection/diagnosis , Surgical Wound Infection/epidemiology , Retrospective Studies , Alberta , Machine Learning
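
This record reports bootstrap 95% confidence intervals without stating which bootstrap variant was used. The sketch below assumes the simple percentile method applied to the F1 score; the predictions, labels, resample count, and seed are all hypothetical.

```python
import random


def f1_score(predicted, reference):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when there are no positives."""
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)
    fp = sum(1 for p, r in pairs if p and not r)
    fn = sum(1 for p, r in pairs if not p and r)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def bootstrap_ci(predicted, reference, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval: resample cases with replacement."""
    rng = random.Random(seed)
    n = len(reference)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(f1_score([predicted[i] for i in idx],
                              [reference[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical labels: 10 true cases among 40 procedures.
reference = [1] * 10 + [0] * 30
predicted = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 28

point = f1_score(predicted, reference)
lower, upper = bootstrap_ci(predicted, reference)
print(point, lower, upper)
```

With rare outcomes such as SSIs, some resamples contain very few positive cases, which is one reason the reported intervals around F1 and PR AUC are wide.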

6.

Hypertension identification using inpatient clinical notes from electronic medical records: an explainable, data-driven algorithm study.

Martin, Elliot A; D'Souza, Adam G; Lee, Seungwon; Doktorchik, Chelsea; Eastwood, Cathy A; Quan, Hude.

CMAJ Open ; 11(1): E131-E139, 2023.

Article in English | MEDLINE | ID: mdl-36787990

ABSTRACT

BACKGROUND: Case identification is important for health services research, measuring health system performance and risk adjustment, but existing methods based on manual chart review or diagnosis codes can be expensive, time consuming or of limited validity. We aimed to develop a hypertension case definition in electronic medical records (EMRs) for inpatient clinical notes using machine learning. METHODS: A cohort of patients 18 years of age or older who were discharged from 1 of 3 Calgary acute care facilities (1 academic hospital and 2 community hospitals) between Jan. 1 and June 30, 2015, was randomly selected. We compared the performance of EMR phenotype algorithms developed using machine learning with an algorithm based on the Canadian version of the International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD), in identifying patients with hypertension. Hypertension status was determined by chart review; the machine-learning algorithms used EMR notes, and the ICD algorithm used the Discharge Abstract Database (Canadian Institute for Health Information). RESULTS: Of our study sample (n = 3040), 1475 (48.5%) patients had hypertension. The group with hypertension was older (median age of 71.0 yr v. 52.5 yr for those patients without hypertension) and had fewer females (710 [48.2%] v. 764 [52.3%]). Our final EMR-based models had higher sensitivity than the ICD algorithm (> 90% v. 47%), while maintaining high positive predictive values (> 90% v. 97%). INTERPRETATION: We found that hypertension tends to have clear documentation in EMRs and is well classified by concept search on free text. Machine learning can provide insights into how and where conditions are documented in EMRs and suggest nonmachine-learning phenotypes to implement.

Subjects

Electronic Health Records , Hypertension , Female , Humans , Inpatients , Canada/epidemiology , Algorithms , Hypertension/diagnosis , Hypertension/epidemiology

7.

CREATE: A New Data Resource to Support Cardiac Precision Health.

Lee, Seungwon; Li, Bing; Martin, Elliot A; D'Souza, Adam G; Jiang, Jason; Doktorchik, Chelsea; Southern, Danielle A; Lee, Joon; Wiebe, Natalie; Quan, Hude; Eastwood, Cathy A.

CJC Open ; 3(5): 639-645, 2021 May.

Article in English | MEDLINE | ID: mdl-34036259

ABSTRACT

BACKGROUND: The initiatives of precision medicine and learning health systems require databases with rich and accurately captured data on patient characteristics. We introduce the Clinical Registry, AdminisTrative Data and Electronic Medical Records (CREATE) database, which includes linked data from 4 population databases: Alberta Provincial Project for Outcome Assessment in Coronary Heart Disease (APPROACH; a national clinical registry), Sunrise Clinical Manager (SCM) electronic medical record (city-wide), the Discharge Abstract Database (DAD), and the National Ambulatory Care Reporting System (NACRS). The intent of this work is to introduce a cardiovascular-specific database for pursuing precision health activities using big data analytics. METHODS: We used deterministic data linkage to link SCM electronic medical record data to APPROACH clinical registry data using patient identifier variables. The APPROACH-SCM data set was subsequently linked to DAD and NACRS to obtain inpatient and outpatient cohort data. We further validated the quality of the linkage, where applicable, in these databases by comparing against the Alberta Health Insurance Care Plan registry database. RESULTS: We achieved 99.96% linkage across these 4 databases. Currently, there are 30,984 patients with 35,753 catheterizations in the CREATE database. The inpatient cohort contained 65.75% (20,373/30,984) of the patient sample, whereas the outpatient cohort contained 29.78% (9226/30,984). The infrastructure and the process to update and expand the database have been established. CONCLUSIONS: CREATE is intended to serve as a database for supporting big data analytics activities surrounding cardiac precision health. The CREATE database will be managed by the Centre for Health Informatics at the University of Calgary, and housed in a secure high-performance computing environment.
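
Deterministic linkage, as used in the methods above, joins records only when their identifier variables match exactly. The following is a minimal sketch of that idea; the field names (`patient_id`, `birth_date`), the sample rows, and the choice to let EMR values override registry values on shared fields are assumptions for illustration, not details from the CREATE database.

```python
def link_records(emr_rows, registry_rows, keys=("patient_id", "birth_date")):
    """Join EMR rows to registry rows on exact equality of all key fields."""
    index = {}
    for row in registry_rows:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)

    linked, unlinked = [], []
    for row in emr_rows:
        matches = index.get(tuple(row[k] for k in keys), [])
        if matches:
            # Merge fields; EMR values win when both sources share a field name.
            linked.extend({**match, **row} for match in matches)
        else:
            unlinked.append(row)
    return linked, unlinked


# Hypothetical rows; real identifiers would never appear in example code.
emr = [
    {"patient_id": "A1", "birth_date": "1950-02-01", "note_count": 12},
    {"patient_id": "B2", "birth_date": "1961-07-15", "note_count": 4},
    {"patient_id": "C3", "birth_date": "1948-11-30", "note_count": 9},
]
registry = [
    {"patient_id": "A1", "birth_date": "1950-02-01", "catheterizations": 2},
    {"patient_id": "B2", "birth_date": "1961-07-15", "catheterizations": 1},
]

linked, unlinked = link_records(emr, registry)
linkage_rate = 1 - len(unlinked) / len(emr)
print(len(linked), len(unlinked), round(linkage_rate, 4))
```

A linkage rate like the one reported in the abstract would be computed the same way, as the share of source records that found a match.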

8.

Network Inference and Maximum Entropy Estimation on Information Diagrams.

Martin, Elliot A; Hlinka, Jaroslav; Meinke, Alexander; Dechterenko, Filip; Tintera, Jaroslav; Oliver, Isaura; Davidsen, Jörn.

Sci Rep ; 7(1): 7062, 2017 Aug 01.

Article in English | MEDLINE | ID: mdl-28765522

ABSTRACT

Maximum entropy estimation is of broad interest for inferring properties of systems across many disciplines. Using a recently introduced technique for estimating the maximum entropy of a set of random discrete variables when conditioning on bivariate mutual informations and univariate entropies, we show how this can be used to estimate the direct network connectivity between interacting units from observed activity. As a generic example, we consider phase oscillators and show that our approach is typically superior to simply using the mutual information. In addition, we propose a nonparametric formulation of connected informations, used to test the explanatory power of a network description in general. We give an illustrative example showing how this agrees with the existing parametric formulation, and demonstrate its applicability and advantages for resting-state human brain networks, for which we also discuss its direct effective connectivity. Finally, we generalize to continuous random variables and vastly expand the types of information-theoretic quantities one can condition on. This allows us to establish significant advantages of this approach over existing ones. Not only does our method perform favorably in the undersampled regime, where existing methods fail, but it also can be dramatically less computationally expensive as the cardinality of the variables increases.

9.

Pairwise network information and nonlinear correlations.

Martin, Elliot A; Hlinka, Jaroslav; Davidsen, Jörn.

Phys Rev E ; 94(4-1): 040301, 2016 Oct.

Article in English | MEDLINE | ID: mdl-27841521

ABSTRACT

Reconstructing the structural connectivity between interacting units from observed activity is a challenge across many different disciplines. The fundamental first step is to establish whether or to what extent the interactions between the units can be considered pairwise and, thus, can be modeled as an interaction network with simple links corresponding to pairwise interactions. In principle, this can be determined by comparing the maximum entropy given the bivariate probability distributions to the true joint entropy. In many practical cases, this is not an option since the bivariate distributions needed may not be reliably estimated or the optimization is too computationally expensive. Here we present an approach that allows one to use mutual informations as a proxy for the bivariate probability distributions. This has the advantage of being less computationally expensive and easier to estimate. We achieve this by introducing a novel entropy maximization scheme that is based on conditioning on entropies and mutual informations. This renders our approach typically superior to other methods based on linear approximations. The advantages of the proposed method are documented using oscillator networks and a resting-state human brain network as generic relevant examples.
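
The quantities this approach conditions on, univariate entropies and pairwise mutual informations, can be estimated from discrete samples with a simple plug-in estimator, sketched below. This shows only those inputs, not the authors' entropy maximization scheme, and the sample sequences are hypothetical. The identity used is I(X;Y) = H(X) + H(Y) - H(X,Y).

```python
import math
from collections import Counter


def entropy(samples):
    """Plug-in Shannon entropy in bits from observed frequencies."""
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in Counter(samples).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), with the joint taken over paired samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


# Hypothetical binary activity of two units.
x = [0, 0, 1, 1]
y_copy = [0, 0, 1, 1]          # fully dependent on x
y_independent = [0, 1, 0, 1]   # each (x, y) pair occurs once

print(mutual_information(x, y_copy))
print(mutual_information(x, y_independent))
```

Plug-in estimates are biased when the number of samples is small relative to the number of possible states, which is the undersampled regime these abstracts refer to.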
