Search | BVS Integralidade em Saúde

1.

Pre-operative lung ablation prediction using deep learning.

Keshavamurthy, Krishna Nand; Eickhoff, Carsten; Ziv, Etay.

Eur Radiol ; 2024 May 22.

Article in English | MEDLINE | ID: mdl-38775950

ABSTRACT

OBJECTIVE: Microwave lung ablation (MWA) is a minimally invasive and inexpensive alternative cancer treatment for patients who are not candidates for surgery/radiotherapy. However, a major challenge for MWA is its relatively high tumor recurrence rates, due to incomplete treatment as a result of inaccurate planning. We introduce a patient-specific, deep-learning model to accurately predict post-treatment ablation zones to aid planning and enable effective treatments. MATERIALS AND METHODS: Our IRB-approved retrospective study consisted of ablations with a single applicator/burn/vendor between 01/2015 and 01/2019. The input data included pre-procedure computerized tomography (CT), ablation power/time, and applicator position. The ground truth ablation zone was segmented from follow-up CT post-treatment. Novel deformable image registration optimized for ablation scans and an applicator-centric co-ordinate system for data analysis were applied. Our prediction model was based on the U-net architecture. The registrations were evaluated using target registration error (TRE) and predictions using Bland-Altman plots, Dice co-efficient, precision, and recall, compared against the applicator vendor's estimates. RESULTS: The data included 113 unique ablations from 72 patients (median age 57, interquartile range (IQR) (49-67); 41 women). We obtained a TRE ≤ 2 mm on 52 ablations. Our prediction had no bias from ground truth ablation volumes (p = 0.169) unlike the vendor's estimate (p < 0.001) and had smaller limits of agreement (p < 0.001). An 11% improvement was achieved in the Dice score. The ability to account for patient-specific in-vivo anatomical effects due to vessels, chest wall, heart, lung boundaries, and fissures was shown. CONCLUSIONS: We demonstrated a patient-specific deep-learning model to predict the ablation treatment effect prior to the procedure, with the potential for improved planning, achieving complete treatments, and reducing tumor recurrence.
CLINICAL RELEVANCE STATEMENT: Our method addresses the current lack of reliable tools to estimate ablation extents, required for ensuring successful ablation treatments. The potential clinical implications include improved treatment planning, ensuring complete treatments, and reducing tumor recurrence.
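
The abstract above evaluates predicted ablation zones against segmented ground truth with the Dice coefficient, precision, and recall. As a reading aid, here is a minimal sketch of how those three overlap metrics are commonly computed on binary masks. It is not the authors' code; the function name `overlap_metrics` and the toy masks are hypothetical, and real masks would be 3-D voxel volumes rather than short lists.

```python
def overlap_metrics(pred, truth):
    """Return (dice, precision, recall) for two equal-length binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Count voxels by agreement type.
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    # Dice = 2|A ∩ B| / (|A| + |B|); two empty masks count as a perfect match.
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return dice, precision, recall


# Hypothetical flattened masks (1 = voxel inside the ablation zone).
predicted = [0, 1, 1, 1, 0, 0]
segmented = [0, 0, 1, 1, 1, 0]
dice, precision, recall = overlap_metrics(predicted, segmented)
```

Dice penalizes over- and under-prediction symmetrically, while precision and recall separate the two error types, which is presumably why the study reports all three.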

2.

Drug-drug interaction prediction with Wasserstein Adversarial Autoencoder-based knowledge graph embeddings.

Dai, Yuanfei; Guo, Chenhao; Guo, Wenzhong; Eickhoff, Carsten.

Brief Bioinform ; 22(4), 2021 Jul 20.

Article in English | MEDLINE | ID: mdl-33126246

ABSTRACT

An interaction between pharmacological agents can trigger unexpected adverse events. Capturing richer and more comprehensive information about drug-drug interactions (DDIs) is one of the key tasks in public health and drug development. Recently, several knowledge graph (KG) embedding approaches have received increasing attention in the DDI domain due to their capability of projecting drugs and interactions into a low-dimensional feature space for predicting links and classifying triplets. However, existing methods only apply a uniformly random mode to construct negative samples. As a consequence, these samples are often too simplistic to train an effective model. In this paper, we propose a new KG embedding framework by introducing adversarial autoencoders (AAEs) based on Wasserstein distances and Gumbel-Softmax relaxation for DDI tasks. In our framework, the autoencoder is employed to generate high-quality negative samples and the hidden vector of the autoencoder is regarded as a plausible drug candidate. Afterwards, the discriminator learns the embeddings of drugs and interactions based on both positive and negative triplets. Meanwhile, in order to solve vanishing gradient problems on the discrete representation (an inherent flaw in traditional generative models), we utilize the Gumbel-Softmax relaxation and the Wasserstein distance to train the embedding model steadily. We empirically evaluate our method on two tasks: link prediction and DDI classification. The experimental results show that our framework can attain significant improvements and noticeably outperform competitive baselines. Supplementary information: Supplementary data and code are available at https://github.com/dyf0631/AAE_FOR_KG.

Subjects

Drug Development ; Drug Interactions ; Neural Networks, Computer ; Pattern Recognition, Automated
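
The abstract above relies on the Gumbel-Softmax relaxation to keep training stable when the generator must pick a discrete item. The sketch below shows only the generic relaxation (add Gumbel noise to the logits, then apply a temperature-scaled softmax). It is an illustration under stated assumptions and is not taken from the authors' repository; the function name, the logits, and the temperature are hypothetical.

```python
import math
import random


def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Draw one relaxed (soft) one-hot sample from categorical logits."""
    noisy = []
    for logit in logits:
        # Keep u strictly inside (0, 1) so both logarithms are defined.
        u = min(max(rng.random(), 1e-12), 1 - 1e-12)
        # Gumbel(0, 1) noise is g = -log(-log(u)); add it to the logit.
        noisy.append((logit - math.log(-math.log(u))) / temperature)
    # Numerically stable softmax over the perturbed, scaled logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


random.seed(0)
# Hypothetical logits over three candidate drugs.
sample = gumbel_softmax([2.0, 0.5, -1.0], temperature=0.5)
```

Lower temperatures push the sample toward a hard one-hot vector, while higher temperatures give smoother outputs. In an autodiff framework the same expression is differentiable with respect to the logits, which is the property the relaxation is used for.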

3.

Risk Factors for Pediatric Sepsis in the Emergency Department: A Machine Learning Pilot Study.

Mercurio, Laura; Pou, Sovijja; Duffy, Susan; Eickhoff, Carsten.

Pediatr Emerg Care ; 39(2): e48-e56, 2023 Feb 01.

Article in English | MEDLINE | ID: mdl-36648121

ABSTRACT

OBJECTIVE: To identify underappreciated sepsis risk factors among children presenting to a pediatric emergency department (ED). METHODS: A retrospective observational study (2017-2019) of children aged 18 years and younger presenting to a pediatric ED at a tertiary care children's hospital with fever, hypotension, or an infectious disease International Classification of Diseases (ICD)-10 diagnosis. Structured patient data including demographics, problem list, and vital signs were extracted for 35,074 qualifying ED encounters. According to the Improving Pediatric Sepsis Outcomes Classification, confirmed by expert review, 191 patients met clinical sepsis criteria. Five machine learning models were trained to predict sepsis/nonsepsis outcomes. Top features enabling model performance (N = 20) were then extracted to identify patient risk factors. RESULTS: Machine learning methods reached a performance of up to 93% sensitivity and 84% specificity in identifying patients who received a hospital diagnosis of sepsis. A random forest classifier performed the best, followed by a classification and regression tree. Maximum documented heart rate was the top feature in these models, with importance coefficients (ICs) of 0.09 and 0.21, which represent how much an individual feature contributes to the model. Maximum mean arterial pressure was the second most important feature (IC 0.05, 0.13). Immunization status (IC 0.02), age (IC 0.03), and patient zip code (IC 0.02) were also among the top features enabling models to predict sepsis from ED visit data. Stratified analysis revealed changes in the predictive importance of risk factors by race, ethnicity, oncologic history, and insurance status. 
CONCLUSIONS: Machine learning models trained to identify pediatric sepsis using ED clinical and sociodemographic variables confirmed well-established predictors, including heart rate and mean arterial pressure, and identified underappreciated relationships between sepsis and patient age, immunization status, and demographics.

Subjects

Emergency Service, Hospital ; Sepsis ; Humans ; Child ; Pilot Projects ; Machine Learning ; Retrospective Studies ; Sepsis/diagnosis ; Sepsis/epidemiology ; Risk Factors

4.

COVID-19 mortality prediction in the intensive care unit with deep learning based on longitudinal chest X-rays and clinical data.

Cheng, Jianhong; Sollee, John; Hsieh, Celina; Yue, Hailin; Vandal, Nicholas; Shanahan, Justin; Choi, Ji Whae; Tran, Thi My Linh; Halsey, Kasey; Iheanacho, Franklin; Warren, James; Ahmed, Abdullah; Eickhoff, Carsten; Feldman, Michael; Mortani Barbosa, Eduardo; Kamel, Ihab; Lin, Cheng Ting; Yi, Thomas; Healey, Terrance; Zhang, Paul; Wu, Jing; Atalay, Michael; Bai, Harrison X; Jiao, Zhicheng; Wang, Jianxin.

Eur Radiol ; 32(7): 4446-4456, 2022 Jul.

Article in English | MEDLINE | ID: mdl-35184218

ABSTRACT

OBJECTIVES: We aimed to develop deep learning models using longitudinal chest X-rays (CXRs) and clinical data to predict in-hospital mortality of COVID-19 patients in the intensive care unit (ICU). METHODS: Six hundred fifty-four patients (212 deceased, 442 alive, 5645 total CXRs) were identified across two institutions. Imaging and clinical data from one institution were used to train five longitudinal transformer-based networks applying five-fold cross-validation. The models were tested on data from the other institution, and pairwise comparisons were used to determine the best-performing models. RESULTS: A higher proportion of deceased patients had elevated white blood cell count, decreased absolute lymphocyte count, elevated creatine concentration, and incidence of cardiovascular and chronic kidney disease. A model based on pre-ICU CXRs achieved an AUC of 0.632 and an accuracy of 0.593, and a model based on ICU CXRs achieved an AUC of 0.697 and an accuracy of 0.657. A model based on all longitudinal CXRs (both pre-ICU and ICU) achieved an AUC of 0.702 and an accuracy of 0.694. A model based on clinical data alone achieved an AUC of 0.653 and an accuracy of 0.657. The addition of longitudinal imaging to clinical data in a combined model significantly improved performance, reaching an AUC of 0.727 (p = 0.039) and an accuracy of 0.732. CONCLUSIONS: The addition of longitudinal CXRs to clinical data significantly improves mortality prediction with deep learning for COVID-19 patients in the ICU. KEY POINTS: • Deep learning was used to predict mortality in COVID-19 ICU patients. • Serial radiographs and clinical data were used. • The models could inform clinical decision-making and resource allocation.

Subjects

COVID-19 ; Deep Learning ; Humans ; Intensive Care Units ; Radiography ; X-Rays

5.

Development of a Deep Learning Network to Classify Inferior Vena Cava Collapse to Predict Fluid Responsiveness.

Blaivas, Michael; Blaivas, Laura; Philips, Gary; Merchant, Roland; Levy, Mitchell; Abbasi, Adeel; Eickhoff, Carsten; Shapiro, Nathan; Corl, Keith.

J Ultrasound Med ; 40(8): 1495-1504, 2021 Aug.

Article in English | MEDLINE | ID: mdl-33038035

ABSTRACT

OBJECTIVES: To create a deep learning algorithm capable of video classification, using a long short-term memory (LSTM) network, to analyze collapsibility of the inferior vena cava (IVC) to predict fluid responsiveness in critically ill patients. METHODS: We used a data set of IVC ultrasound (US) videos to train the LSTM network. The data set was created from IVC US videos of spontaneously breathing critically ill patients undergoing intravenous fluid resuscitation as part of 2 prior prospective studies. We randomly selected 90% of the IVC videos to train the LSTM network and 10% of the videos to test the LSTM network's ability to predict fluid responsiveness. Fluid responsiveness was defined as a greater than 10% increase in the cardiac index after a 500-mL fluid bolus, as measured by bioreactance. RESULTS: We analyzed 211 videos from 175 critically ill patients: 191 to train the LSTM network and 20 to test it. Using standard data augmentation techniques, we increased our sample size from 191 to 3820 videos. Of the 175 patients, 91 (52%) were fluid responders. The LSTM network was able to predict fluid responsiveness moderately well, with an area under the receiver operating characteristic curve of 0.70 (95% confidence interval [CI], 0.43-1.00), a positive likelihood ratio of infinity, and a negative likelihood ratio of 0.3 (95% CI, 0.12-0.77). In comparison, point-of-care US experts using video review offline and manual diameter measurement via software caliper tools achieved an area under the receiver operating characteristic curve of 0.94 (95% CI, 0.83-0.99). CONCLUSIONS: We demonstrated that an LSTM network can be trained by using videos of IVC US to classify IVC collapse to predict fluid responsiveness. Our LSTM network performed moderately well given the small training cohort but worse than point-of-care US experts. Further training and testing of the LSTM network with larger data sets is warranted.

Subjects

Deep Learning ; Shock ; Fluid Therapy ; Humans ; Prospective Studies ; Vena Cava, Inferior/diagnostic imaging
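
The abstract above reports "a positive likelihood ratio of infinity", which can look odd at first. With the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity, LR+ becomes infinite whenever a test set contains no false positives. The sketch below illustrates this; the confusion-matrix counts are hypothetical and are not the study's data.

```python
import math


def likelihood_ratios(tp, fp, fn, tn):
    """Return (LR+, LR-) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # No false positives means specificity = 1, so LR+ is unbounded.
    lr_pos = math.inf if fp == 0 else sensitivity / (1 - specificity)
    # No true negatives means specificity = 0, so LR- is unbounded.
    lr_neg = math.inf if tn == 0 else (1 - sensitivity) / specificity
    return lr_pos, lr_neg


# Hypothetical small test set: 8 responders detected, 2 missed,
# and all 10 non-responders classified correctly.
lr_pos, lr_neg = likelihood_ratios(tp=8, fp=0, fn=2, tn=10)
```

On a test set of only a few dozen cases, an infinite LR+ mostly reflects the absence of observed false positives, which is consistent with the wide confidence interval the abstract gives for the AUC.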

6.

Detecting Large Vessel Occlusion at Multiphase CT Angiography by Using a Deep Convolutional Neural Network.

Stib, Matthew T; Vasquez, Justin; Dong, Mary P; Kim, Yun Ho; Subzwari, Sumera S; Triedman, Harold J; Wang, Amy; Wang, Hsin-Lei Charlene; Yao, Anthony D; Jayaraman, Mahesh; Boxerman, Jerrold L; Eickhoff, Carsten; Cetintemel, Ugur; Baird, Grayson L; McTaggart, Ryan A.

Radiology ; 297(3): 640-649, 2020 Dec.

Article in English | MEDLINE | ID: mdl-32990513

ABSTRACT

Background Large vessel occlusion (LVO) stroke is one of the most time-sensitive diagnoses in medicine and requires emergent endovascular therapy to reduce morbidity and mortality. Leveraging recent advances in deep learning may facilitate rapid detection and reduce time to treatment. Purpose To develop a convolutional neural network to detect LVOs at multiphase CT angiography. Materials and Methods This multicenter retrospective study evaluated 540 adults with CT angiography examinations for suspected acute ischemic stroke from February 2017 to June 2018. Examinations positive for LVO (n = 270) were confirmed by catheter angiography and LVO-negative examinations (n = 270) were confirmed through review of clinical and radiology reports. Preprocessing of the CT angiography examinations included vasculature segmentation and the creation of maximum intensity projection images to emphasize the contrast agent-enhanced vasculature. Seven experiments were performed by using combinations of the three phases (arterial, phase 1; peak venous, phase 2; and late venous, phase 3) of the CT angiography. Model performance was evaluated on the held-out test set. Metrics included area under the receiver operating characteristic curve (AUC), sensitivity, and specificity. Results The test set included 62 patients (mean age, 69.5 years; 48% women). Single-phase CT angiography achieved an AUC of 0.74 (95% confidence interval [CI]: 0.63, 0.85) with sensitivity of 77% (24 of 31; 95% CI: 59%, 89%) and specificity of 71% (22 of 31; 95% CI: 53%, 84%). Phases 1, 2, and 3 together achieved an AUC of 0.89 (95% CI: 0.81, 0.96), sensitivity of 100% (31 of 31; 95% CI: 99%, 100%), and specificity of 77% (24 of 31; 95% CI: 59%, 89%), a statistically significant improvement relative to single-phase CT angiography (P = .01). Likewise, phases 1 and 3 and phases 2 and 3 also demonstrated improved fit relative to single phase (P = .03). 
Conclusion This deep learning model was able to detect the presence of large vessel occlusion and its diagnostic performance was enhanced by using delayed phases at multiphase CT angiography examinations. © RSNA, 2020 Online supplemental material is available for this article. See also the editorial by Ospel and Goyal in this issue.

Subjects

Brain Ischemia/diagnostic imaging ; Computed Tomography Angiography ; Neural Networks, Computer ; Stroke/diagnostic imaging ; Aged ; Cerebral Angiography ; Contrast Media ; Female ; Humans ; Male ; Middle Aged ; Retrospective Studies ; Sensitivity and Specificity

7.

Dynamic compression schemes for graph coloring.

Mustafa, Harun; Schilken, Ingo; Karasikov, Mikhail; Eickhoff, Carsten; Rätsch, Gunnar; Kahles, André.

Bioinformatics ; 35(3): 407-414, 2019 Feb 01.

Article in English | MEDLINE | ID: mdl-30020403

ABSTRACT

Motivation: Technological advancements in high-throughput DNA sequencing have led to an exponential growth of sequencing data being produced and stored as a byproduct of biomedical research. Despite its public availability, a majority of this data remains hard to query for the research community due to a lack of efficient data representation and indexing solutions. One of the available techniques to represent read data is a condensed form as an assembly graph. Such a representation contains all sequence information but does not store contextual information and metadata. Results: We present two new approaches for a compressed representation of a graph coloring: a lossless compression scheme based on a novel application of wavelet tries as well as a highly accurate lossy compression based on a set of Bloom filters. Both strategies retain a coloring even when adding to the underlying graph topology. We present construction and merge procedures for both methods and evaluate their performance on a wide range of different datasets. By dropping the requirement of a fully lossless compression and using the topological information of the underlying graph, we can reduce memory requirements by up to three orders of magnitude. Representing individual colors as independently stored modules, our approaches can be efficiently parallelized and provide strategies for dynamic use. These properties allow for an easy upscaling to the problem sizes common to the biomedical domain. Availability and implementation: We provide prototype implementations in C++, summaries of our experiments as well as links to all datasets publicly at https://github.com/ratschlab/graph_annotation. Supplementary information: Supplementary data are available at Bioinformatics online.

Subjects

Computational Biology ; Data Compression ; Software ; Algorithms ; Color ; Genomics ; High-Throughput Nucleotide Sequencing
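
The abstract above describes a lossy coloring scheme built from a set of Bloom filters, with each color stored as an independent module. The sketch below illustrates that general idea only: one small Bloom filter per color, queried by k-mer. It is not the authors' C++ implementation and omits their use of graph topology to improve accuracy; the class, the parameters, and the sample names are hypothetical.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive several bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


# One independent filter per color (here: per hypothetical sample).
colors = {"sample_A": BloomFilter(), "sample_B": BloomFilter()}
colors["sample_A"].add("ACGTA")
colors["sample_A"].add("CGTAC")
colors["sample_B"].add("CGTAC")


def colors_of(kmer):
    """Return every color whose filter reports the k-mer as present."""
    return sorted(name for name, bf in colors.items() if kmer in bf)
```

Because each color has its own filter, colors can be built, merged, or queried in parallel, and new k-mers can be added without rebuilding the others. The cost is a small false-positive rate, which is the lossy part of the trade-off.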

8.

Correction to: COVID-19 mortality prediction in the intensive care unit with deep learning based on longitudinal chest X-rays and clinical data.

Cheng, Jianhong; Sollee, John; Hsieh, Celina; Yue, Hailin; Vandal, Nicholas; Shanahan, Justin; Choi, Ji Whae; Tran, Thi My Linh; Halsey, Kasey; Iheanacho, Franklin; Warren, James; Ahmed, Abdullah; Eickhoff, Carsten; Feldman, Michael; Barbosa, Eduardo Mortani; Kamel, Ihab; Lin, Cheng Ting; Yi, Thomas; Healey, Terrance; Zhang, Paul; Wu, Jing; Atalay, Michael; Bai, Harrison X; Jiao, Zhicheng; Wang, Jianxin.

Eur Radiol ; 32(7): 5034, 2022 Jul.

Article in English | MEDLINE | ID: mdl-35320415

9.

Machine learning to predict hemorrhage and thrombosis during extracorporeal membrane oxygenation.

Abbasi, Adeel; Karasu, Yasmin; Li, Cindy; Sodha, Neel R; Eickhoff, Carsten; Ventetuolo, Corey E.

Crit Care ; 24(1): 689, 2020 Dec 10.

Article in English | MEDLINE | ID: mdl-33302954

Subjects

Extracorporeal Membrane Oxygenation/adverse effects ; Forecasting/methods ; Hemorrhage/diagnosis ; Machine Learning/trends ; Thrombosis/diagnosis ; Chi-Square Distribution ; Cohort Studies ; Extracorporeal Membrane Oxygenation/methods ; Extracorporeal Membrane Oxygenation/statistics & numerical data ; Hemorrhage/physiopathology ; Humans ; Retrospective Studies ; Thrombosis/physiopathology

10.

Artificial intelligence-assisted care in medicine: a revolution or yet another blunt weapon?

Meyer, Alexander; Cypko, Mario A; Eickhoff, Carsten; Falk, Volkmar; Emmert, Maximilian Y.

Eur Heart J ; 40(40): 3286-3289, 2019 Oct 21.

Article in English | MEDLINE | ID: mdl-31633172

Subjects

Artificial Intelligence ; Decision Making, Computer-Assisted ; Cardiac Surgical Procedures ; Cardiovascular Diseases/diagnosis ; Cardiovascular Diseases/therapy ; Deep Learning ; Humans

11.

Learning to Make Rare and Complex Diagnoses With Generative AI Assistance: Qualitative Study of Popular Large Language Models.

Abdullahi, Tassallah; Singh, Ritambhara; Eickhoff, Carsten.

JMIR Med Educ ; 10: e51391, 2024 Feb 13.

Article in English | MEDLINE | ID: mdl-38349725

ABSTRACT

BACKGROUND: Patients with rare and complex diseases often experience delayed diagnoses and misdiagnoses because comprehensive knowledge about these diseases is limited to only a few medical experts. In this context, large language models (LLMs) have emerged as powerful knowledge aggregation tools with applications in clinical decision support and education domains. OBJECTIVE: This study aims to explore the potential of 3 popular LLMs, namely Bard (Google LLC), ChatGPT-3.5 (OpenAI), and GPT-4 (OpenAI), in medical education to enhance the diagnosis of rare and complex diseases while investigating the impact of prompt engineering on their performance. METHODS: We conducted experiments on publicly available complex and rare cases to achieve these objectives. We implemented various prompt strategies to evaluate the performance of these models using both open-ended and multiple-choice prompts. In addition, we used a majority voting strategy to leverage diverse reasoning paths within language models, aiming to enhance their reliability. Furthermore, we compared their performance with the performance of human respondents and MedAlpaca, a generative LLM specifically designed for medical tasks. RESULTS: Notably, all LLMs outperformed the average human consensus and MedAlpaca, with a minimum margin of 5% and 13%, respectively, across all 30 cases from the diagnostic case challenge collection. On the frequently misdiagnosed cases category, Bard tied with MedAlpaca but surpassed the human average consensus by 14%, whereas GPT-4 and ChatGPT-3.5 outperformed MedAlpaca and the human respondents on the moderately often misdiagnosed cases category with minimum accuracy scores of 28% and 11%, respectively. The majority voting strategy, particularly with GPT-4, demonstrated the highest overall score across all cases from the diagnostic complex case collection, surpassing that of other LLMs. 
On the Medical Information Mart for Intensive Care-III data sets, Bard and GPT-4 achieved the highest diagnostic accuracy scores, with multiple-choice prompts scoring 93%, whereas ChatGPT-3.5 and MedAlpaca scored 73% and 47%, respectively. Furthermore, our results demonstrate that there is no one-size-fits-all prompting approach for improving the performance of LLMs and that a single strategy does not universally apply to all LLMs. CONCLUSIONS: Our findings shed light on the diagnostic capabilities of LLMs and the challenges associated with identifying an optimal prompting strategy that aligns with each language model's characteristics and specific task requirements. The significance of prompt engineering is highlighted, providing valuable insights for researchers and practitioners who use these language models for medical training. Furthermore, this study represents a crucial step toward understanding how LLMs can enhance diagnostic reasoning in rare and complex medical cases, paving the way for developing effective educational tools and accurate diagnostic aids to improve patient care and outcomes.

Subjects

Learning ; Problem Solving ; Humans ; Reproducibility of Results ; Educational Status ; Language
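
The abstract above mentions a majority voting strategy over diverse reasoning paths. A minimal sketch of the voting step follows, assuming the model has already been sampled several times and each response reduced to a short diagnosis string. The sampling itself is not shown, and the function name, the normalization, and the example answers are hypothetical rather than taken from the study.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer after light normalization, or None."""
    # Normalize case and whitespace so trivially different strings match.
    normalized = [a.strip().lower() for a in answers if a.strip()]
    if not normalized:
        return None
    # most_common(1) returns the top answer; ties go to the earliest seen.
    return Counter(normalized).most_common(1)[0][0]


# Hypothetical diagnoses from five independent samples of one model.
sampled = ["Sarcoidosis", "sarcoidosis ", "Tuberculosis", "Lymphoma", "sarcoidosis"]
final_answer = majority_vote(sampled)
```

Free-text diagnoses usually need stronger normalization than lowercasing (synonyms, abbreviations), which is one reason multiple-choice prompts are easier to aggregate.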

12.

Retrieval-Based Diagnostic Decision Support: Mixed Methods Study.

Abdullahi, Tassallah; Mercurio, Laura; Singh, Ritambhara; Eickhoff, Carsten.

JMIR Med Inform ; 12: e50209, 2024 Jun 19.

Article in English | MEDLINE | ID: mdl-38896468

ABSTRACT

BACKGROUND: Diagnostic errors pose significant health risks and contribute to patient mortality. With the growing accessibility of electronic health records, machine learning models offer a promising avenue for enhancing diagnosis quality. Current research has primarily focused on a limited set of diseases with ample training data, neglecting diagnostic scenarios with limited data availability. OBJECTIVE: This study aims to develop an information retrieval (IR)-based framework that accommodates data sparsity to facilitate broader diagnostic decision support. METHODS: We introduced an IR-based diagnostic decision support framework called CliniqIR. It uses clinical text records, the Unified Medical Language System Metathesaurus, and 33 million PubMed abstracts to classify a broad spectrum of diagnoses independent of training data availability. CliniqIR is designed to be compatible with any IR framework. Therefore, we implemented it using both dense and sparse retrieval approaches. We compared CliniqIR's performance to that of pretrained clinical transformer models such as Clinical Bidirectional Encoder Representations from Transformers (ClinicalBERT) in supervised and zero-shot settings. Subsequently, we combined the strength of supervised fine-tuned ClinicalBERT and CliniqIR to build an ensemble framework that delivers state-of-the-art diagnostic predictions. RESULTS: On a complex diagnosis data set (DC3) without any training data, CliniqIR models returned the correct diagnosis within their top 3 predictions. On the Medical Information Mart for Intensive Care III data set, CliniqIR models surpassed ClinicalBERT in predicting diagnoses with <5 training samples by an average difference in mean reciprocal rank of 0.10. In a zero-shot setting where models received no disease-specific training, CliniqIR still outperformed the pretrained transformer models with a greater mean reciprocal rank of at least 0.10. 
Furthermore, in most conditions, our ensemble framework surpassed the performance of its individual components, demonstrating its enhanced ability to make precise diagnostic predictions. CONCLUSIONS: Our experiments highlight the importance of IR in leveraging unstructured knowledge resources to identify infrequently encountered diagnoses. In addition, our ensemble framework benefits from combining the complementary strengths of the supervised and retrieval-based models to diagnose a broad spectrum of diseases.
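
The results above are reported as differences in mean reciprocal rank (MRR). For readers unfamiliar with the metric, here is a minimal sketch of the usual definition: for each case, take 1 / rank of the correct diagnosis in the model's ranked list (0 if it is absent), then average over cases. The function name and the toy rankings are hypothetical and not drawn from the study.

```python
def mean_reciprocal_rank(ranked_lists, gold):
    """Average of 1/rank of the correct item; missing items contribute 0."""
    if len(ranked_lists) != len(gold):
        raise ValueError("need exactly one gold answer per ranked list")
    total = 0.0
    for ranking, answer in zip(ranked_lists, gold):
        for rank, candidate in enumerate(ranking, start=1):
            if candidate == answer:
                total += 1.0 / rank
                break  # only the first (best) match counts
    return total / len(gold)


# Hypothetical ranked diagnoses for three cases.
rankings = [
    ["asthma", "copd", "pneumonia"],   # correct answer at rank 1
    ["copd", "asthma", "pneumonia"],   # correct answer at rank 2
    ["pneumonia", "copd", "asthma"],   # correct answer not retrieved
]
gold = ["asthma", "asthma", "sepsis"]
mrr = mean_reciprocal_rank(rankings, gold)
```

Under this definition, an MRR difference of 0.10 is an average gain across cases; it does not translate directly into a fixed improvement in rank position.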

13.

A Language Model-Powered Simulated Patient With Automated Feedback for History Taking: Prospective Study.

Holderried, Friederike; Stegemann-Philipps, Christian; Herrmann-Werner, Anne; Festl-Wietek, Teresa; Holderried, Martin; Eickhoff, Carsten; Mahling, Moritz.

JMIR Med Educ ; 10: e59213, 2024 Aug 16.

Article in English | MEDLINE | ID: mdl-39150749

ABSTRACT

BACKGROUND: Although history taking is fundamental for diagnosing medical conditions, teaching and providing feedback on the skill can be challenging due to resource constraints. Virtual simulated patients and web-based chatbots have thus emerged as educational tools, with recent advancements in artificial intelligence (AI) such as large language models (LLMs) enhancing their realism and potential to provide feedback. OBJECTIVE: In our study, we aimed to evaluate the effectiveness of a Generative Pretrained Transformer (GPT) 4 model to provide structured feedback on medical students' performance in history taking with a simulated patient. METHODS: We conducted a prospective study involving medical students performing history taking with a GPT-powered chatbot. To that end, we designed a chatbot to simulate patients' responses and provide immediate feedback on the comprehensiveness of the students' history taking. Students' interactions with the chatbot were analyzed, and feedback from the chatbot was compared with feedback from a human rater. We measured interrater reliability and performed a descriptive analysis to assess the quality of feedback. RESULTS: Most of the study's participants were in their third year of medical school. A total of 1894 question-answer pairs from 106 conversations were included in our analysis. GPT-4's role-play and responses were medically plausible in more than 99% of cases. Interrater reliability between GPT-4 and the human rater showed "almost perfect" agreement (Cohen κ=0.832). Less agreement (κ<0.6) detected for 8 out of 45 feedback categories highlighted topics about which the model's assessments were overly specific or diverged from human judgement. CONCLUSIONS: The GPT model was effective in providing structured feedback on history-taking dialogs provided by medical students. 
Although we unraveled some limitations regarding the specificity of feedback for certain feedback categories, the overall high agreement with human raters suggests that LLMs can be a valuable tool for medical education. Our findings, thus, advocate the careful integration of AI-driven feedback mechanisms in medical training and highlight important aspects when LLMs are used in that context.

Subjects

Medical History Taking ; Patient Simulation ; Students, Medical ; Humans ; Prospective Studies ; Medical History Taking/methods ; Medical History Taking/standards ; Students, Medical/psychology ; Female ; Male ; Clinical Competence/standards ; Artificial Intelligence ; Feedback ; Reproducibility of Results ; Education, Medical, Undergraduate/methods
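
The abstract above measures agreement between GPT-4 and a human rater with Cohen's κ. A minimal sketch of the standard two-rater formula follows: κ = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from each rater's label frequencies. The function name and the ratings are hypothetical and are not the study's data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item set")
    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from the product of marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical binary judgements (1 = item covered, 0 = not covered).
model_ratings = [1, 1, 1, 0, 0, 0, 1, 0, 1, 1]
human_ratings = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]
kappa = cohen_kappa(model_ratings, human_ratings)
```

In this toy case the raters agree on 9 of 10 items, yet κ is 0.8 rather than 0.9 because half of that agreement would be expected by chance.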

14.

Short-term vital parameter forecasting in the intensive care unit: A benchmark study leveraging data from patients after cardiothoracic surgery.

Hinrichs, Nils; Roeschl, Tobias; Lanmueller, Pia; Balzer, Felix; Eickhoff, Carsten; O'Brien, Benjamin; Falk, Volkmar; Meyer, Alexander.

PLOS Digit Health ; 3(9): e0000598, 2024 Sep.

Article in English | MEDLINE | ID: mdl-39264979

ABSTRACT

Patients in an Intensive Care Unit (ICU) are closely and continuously monitored, and many machine learning (ML) solutions have been proposed to predict specific outcomes like death, bleeding, or organ failure. Forecasting of vital parameters is a more general approach to ML-based patient monitoring, but the literature on its feasibility and robust benchmarks of achievable accuracy are scarce. We implemented five univariate statistical models (the naïve model, the Theta method, exponential smoothing, the autoregressive integrated moving average model, and an autoregressive single-layer neural network), two univariate neural networks (N-BEATS and N-HiTS), and two multivariate neural networks designed for sequential data (a recurrent neural network with gated recurrent unit, GRU, and a Transformer network) to produce forecasts for six vital parameters recorded at five-minute intervals during intensive care monitoring. Vital parameters were the diastolic, systolic, and mean arterial blood pressure, central venous pressure, peripheral oxygen saturation (measured by non-invasive pulse oximetry) and heart rate, and forecasts were made for 5 through 120 minutes into the future. Patients used in this study recovered from cardiothoracic surgery in an ICU. The patient cohort used for model development (n = 22,348) and internal testing (n = 2,483) originated from a heart center in Germany, while a patient sub-set from the eICU collaborative research database, an American multicenter ICU cohort, was used for external testing (n = 7,477). The GRU was the predominant method in this study. Uni- and multivariate neural network models proved to be superior to univariate statistical models across vital parameters and forecast horizons, and their advantage steadily became more pronounced for increasing forecast horizons. With this study, we established an extensive set of benchmarks for forecast performance in the ICU. 
Our findings suggest that supplying physicians with short-term forecasts of vital parameters in the ICU is feasible, and that multivariate neural networks are most suited for the task due to their ability to learn patterns across thousands of patients.
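
The benchmark above includes a "naïve model" as its simplest baseline. The abstract does not define it, so the sketch below assumes the common persistence forecast, in which the last observed value is repeated for every future step, scored with mean absolute error. The function names and the heart-rate readings are hypothetical. At five-minute sampling, the 120-minute horizon mentioned in the abstract corresponds to 24 steps.

```python
def naive_forecast(history, horizon_steps):
    """Persistence baseline: repeat the last observed value."""
    if not history:
        raise ValueError("history must contain at least one observation")
    return [history[-1]] * horizon_steps


def mean_absolute_error(forecast, actual):
    """Average absolute difference between forecast and observed values."""
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / len(actual)


# Hypothetical heart-rate readings (beats/min) at five-minute intervals.
heart_rate = [82, 84, 83, 85, 88]
future = [87, 89, 90, 88]           # the next 20 minutes, also hypothetical
forecast = naive_forecast(heart_rate, len(future))
mae = mean_absolute_error(forecast, future)
```

A baseline of this kind is hard to beat at very short horizons, which fits the abstract's observation that the advantage of neural networks grows as the forecast horizon increases.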

15.

Predicting Acute Brain Injury in Venoarterial Extracorporeal Membrane Oxygenation Patients with Tree-Based Machine Learning: Analysis of the Extracorporeal Life Support Organization Registry.

Kalra, Andrew; Bachina, Preetham; Shou, Benjamin L; Hwang, Jaeho; Barshay, Meylakh; Kulkarni, Shreyas; Sears, Isaac; Eickhoff, Carsten; Bermudez, Christian A; Brodie, Daniel; Ventetuolo, Corey E; Kim, Bo Soo; Whitman, Glenn J R; Abbasi, Adeel; Cho, Sung-Min.

Res Sq ; 2024 Jan 11.

Article in English | MEDLINE | ID: mdl-38260374

ABSTRACT

Objective: To determine if machine learning (ML) can predict acute brain injury (ABI) and identify modifiable risk factors for ABI in venoarterial extracorporeal membrane oxygenation (VA-ECMO) patients. Design: Retrospective cohort study of the Extracorporeal Life Support Organization (ELSO) Registry (2009-2021). Setting: International, multicenter registry study of 676 ECMO centers. Patients: Adults (≥18 years) supported with VA-ECMO or extracorporeal cardiopulmonary resuscitation (ECPR). Interventions: None. Measurements and Main Results: Our primary outcome was ABI: central nervous system (CNS) ischemia, intracranial hemorrhage (ICH), brain death, and seizures. We utilized Random Forest, CatBoost, LightGBM and XGBoost ML algorithms (10-fold leave-one-out cross-validation) to predict and identify features most important for ABI. We extracted 65 total features: demographics, pre-ECMO/on-ECMO laboratory values, and pre-ECMO/on-ECMO settings. Of 35,855 VA-ECMO (non-ECPR) patients (median age=57.8 years, 66% male), 7.7% (n=2,769) experienced ABI. In VA-ECMO (non-ECPR), the area under the receiver-operator characteristics curves (AUC-ROC) to predict ABI, CNS ischemia, and ICH was 0.67, 0.67, and 0.62, respectively. The true positive, true negative, false positive, false negative, positive, and negative predictive values were 33%, 88%, 12%, 67%, 18%, and 94%, respectively for ABI. Longer ECMO duration, higher 24h ECMO pump flow, and higher on-ECMO PaO2 were associated with ABI. Of 10,775 ECPR patients (median age=57.1 years, 68% male), 16.5% (n=1,787) experienced ABI. The AUC-ROC for ABI, CNS ischemia, and ICH was 0.72, 0.73, and 0.69, respectively. The true positive, true negative, false positive, false negative, positive, and negative predictive values were 61%, 70%, 30%, 39%, 29% and 90%, respectively, for ABI. Longer ECMO duration, younger age, and higher 24h ECMO pump flow were associated with ABI.
Conclusions: This is the largest study predicting neurological complications on sufficiently powered international ECMO cohorts. Longer ECMO duration and higher 24h pump flow were associated with ABI in both non-ECPR and ECPR VA-ECMO.
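The reported predictive values can be checked against the other figures in the abstract. The sketch below assumes that the "true positive" and "true negative" percentages are rates (sensitivity and specificity) and that the ABI incidence in each cohort serves as the prevalence. It then derives the positive and negative predictive values with Bayes' rule. The function name is illustrative and not taken from the study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence."""
    # Expected fraction of the whole cohort in each confusion-matrix cell.
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    ppv = tp / (tp + fp)  # share of positive predictions that are correct
    npv = tn / (tn + fn)  # share of negative predictions that are correct
    return ppv, npv


# Inputs are the rounded percentages quoted in the abstract (assumed reading).
non_ecpr = predictive_values(sensitivity=0.33, specificity=0.88, prevalence=0.077)
ecpr = predictive_values(sensitivity=0.61, specificity=0.70, prevalence=0.165)

print(f"non-ECPR: PPV={non_ecpr[0]:.3f}, NPV={non_ecpr[1]:.3f}")
print(f"ECPR:     PPV={ecpr[0]:.3f}, NPV={ecpr[1]:.3f}")
```

Under these assumptions the derived values land within about one percentage point of the reported 18%/94% (non-ECPR) and 29%/90% (ECPR). The low positive predictive values follow largely from the low incidence of ABI in both cohorts.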

16.

Acute brain injury risk prediction models in venoarterial extracorporeal membrane oxygenation patients with tree-based machine learning: An Extracorporeal Life Support Organization Registry analysis.

Kalra, Andrew; Bachina, Preetham; Shou, Benjamin L; Hwang, Jaeho; Barshay, Meylakh; Kulkarni, Shreyas; Sears, Isaac; Eickhoff, Carsten; Bermudez, Christian A; Brodie, Daniel; Ventetuolo, Corey E; Kim, Bo Soo; Whitman, Glenn J R; Abbasi, Adeel; Cho, Sung-Min.

JTCVS Open ; 20: 64-88, 2024 Aug.

Article in English | MEDLINE | ID: mdl-39296456

ABSTRACT

Objective: We aimed to determine if machine learning can predict acute brain injury and to identify modifiable risk factors for acute brain injury in patients receiving venoarterial extracorporeal membrane oxygenation. Methods: We included adults (age ≥18 years) receiving venoarterial extracorporeal membrane oxygenation or extracorporeal cardiopulmonary resuscitation in the Extracorporeal Life Support Organization Registry (2009-2021). Our primary outcome was acute brain injury: central nervous system ischemia, intracranial hemorrhage, brain death, and seizures. We used Random Forest, CatBoost, LightGBM, and XGBoost machine learning algorithms (10-fold leave-1-out cross-validation) to predict and identify features most important for acute brain injury. We extracted 65 total features: demographics, pre-extracorporeal membrane oxygenation/on-extracorporeal membrane oxygenation laboratory values, and pre-extracorporeal membrane oxygenation/on-extracorporeal membrane oxygenation settings. Results: Of 35,855 patients receiving venoarterial extracorporeal membrane oxygenation (nonextracorporeal cardiopulmonary resuscitation) (median age of 57.8 years, 66% were male), 7.7% (n = 2769) experienced acute brain injury. In venoarterial extracorporeal membrane oxygenation (nonextracorporeal cardiopulmonary resuscitation), the area under the receiver operator characteristic curves to predict acute brain injury, central nervous system ischemia, and intracranial hemorrhage were 0.67, 0.67, and 0.62, respectively. The true-positive, true-negative, false-positive, false-negative, positive, and negative predictive values were 33%, 88%, 12%, 67%, 18%, and 94%, respectively, for acute brain injury. Longer extracorporeal membrane oxygenation duration, higher 24-hour extracorporeal membrane oxygenation pump flow, and higher on-extracorporeal membrane oxygenation partial pressure of oxygen were associated with acute brain injury. 
Of 10,775 patients receiving extracorporeal cardiopulmonary resuscitation (median age of 57.1 years, 68% were male), 16.5% (n = 1787) experienced acute brain injury. The area under the receiver operator characteristic curves for acute brain injury, central nervous system ischemia, and intracranial hemorrhage were 0.72, 0.73, and 0.69, respectively. Longer extracorporeal membrane oxygenation duration, older age, and higher 24-hour extracorporeal membrane oxygenation pump flow were associated with acute brain injury. Conclusions: In the largest study predicting neurological complications with machine learning in extracorporeal membrane oxygenation, longer extracorporeal membrane oxygenation duration and higher 24-hour pump flow were associated with acute brain injury in nonextracorporeal cardiopulmonary resuscitation and extracorporeal cardiopulmonary resuscitation venoarterial extracorporeal membrane oxygenation.

17.

Neural text generation in regulatory medical writing.

Meyer, Claudia; Adkins, Daniel; Pal, Koyena; Galici, Ruggero; Garcia-Agundez, Augusto; Eickhoff, Carsten.

Front Pharmacol ; 14: 1086913, 2023.

Article in English | MEDLINE | ID: mdl-36843925

ABSTRACT

Background: A steep increase in new drug applications has increased the overhead of writing technical documents such as medication guides. Natural language processing can contribute to reducing this burden. Objective: To generate medication guides from texts that relate to prescription drug labeling information. Materials and Methods: We collected official drug label information from the DailyMed website. We focused on drug labels containing medication guide sections to train and test our model. To construct our training dataset, we aligned "source" text from the document with similar "target" text from the medication guide using three families of alignment techniques: global, manual, and heuristic alignment. The resulting source-target pairs were provided as input to a Pointer Generator Network, an abstractive text summarization model. Results: Global alignment produced the lowest ROUGE scores and relatively poor qualitative results, as running the model frequently resulted in mode collapse. Manual alignment also resulted in mode collapse, albeit higher ROUGE scores than global alignment. Within the family of heuristic alignment approaches, we compared different methods and found BM25-based alignments to produce significantly better summaries (at least 6.8 ROUGE points above the other techniques). This alignment surpassed both the global and manual alignments in terms of ROUGE and qualitative scoring. Conclusion: The results of this study indicate that a heuristic approach to generating inputs for an abstractive summarization model increased ROUGE scores, compared to a global or manual approach when automatically generating biomedical text. Such methods hold the potential to significantly reduce the manual labor burden in medical writing and related disciplines.
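As an illustration of what a BM25-based heuristic alignment could look like, the sketch below scores every source sentence against each target sentence with a plain Okapi BM25 implementation and keeps the best match as a source-target pair. It is a minimal sketch under stated assumptions, not the authors' pipeline: the alignment direction (one best source per target), the whitespace tokenizer, the parameters `k1` and `b`, and the example sentences are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplifying assumption)."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with Okapi BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter()  # document frequency of each term
    for d in docs:
        df.update(set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in set(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def align(targets, sources):
    """Pair each target sentence with its highest-scoring source sentence."""
    src_tokens = [tokenize(s) for s in sources]
    pairs = []
    for t in targets:
        scores = bm25_scores(tokenize(t), src_tokens)
        best = max(range(len(sources)), key=scores.__getitem__)
        pairs.append((sources[best], t))
    return pairs


# Hypothetical label text (sources) and medication-guide text (targets).
sources = [
    "Hepatotoxicity has been reported in patients receiving the drug",
    "Store tablets at room temperature away from moisture",
    "The most common adverse reactions were nausea and headache",
]
targets = [
    "This medicine may cause nausea and headache",
    "Keep the tablets at room temperature",
]
pairs = align(targets, sources)
for source, target in pairs:
    print(f"{target!r} <- {source!r}")
```

Pairs produced this way would then be fed to the summarization model as training examples. A lexical scorer such as BM25 favors sources that share rare terms with the target, which is one plausible reason it could yield cleaner pairs than aligning whole documents.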

18.

A feasibility study on AI-controlled closed-loop electrical stimulation implants.

Eickhoff, Steffen; Garcia-Agundez, Augusto; Haidar, Daniela; Zaidat, Bashar; Adjei-Mosi, Michael; Li, Peter; Eickhoff, Carsten.

Sci Rep ; 13(1): 10163, 2023 06 22.

Article in English | MEDLINE | ID: mdl-37349359

ABSTRACT

Miniaturized electrical stimulation (ES) implants show great promise in practice, but their real-time control by means of biophysical mechanistic algorithms is not feasible due to computational complexity. Here, we study the feasibility of more computationally efficient machine learning methods to control ES implants. For this, we estimate the normalized twitch force of the stimulated extensor digitorum longus muscle on n = 11 Wistar rats with intra- and cross-subject calibration. After 2000 training stimulations, we reach a mean absolute error of 0.03 in an intra-subject setting and 0.2 in a cross-subject setting with a random forest regressor. To the best of our knowledge, this work is the first experiment showing the feasibility of AI to simulate complex ES mechanistic models. However, the results of cross-subject training motivate more research on error reduction methods for this setting.

Subjects

Artificial Intelligence; Skeletal Muscle; Rats; Animals; Wistar Rats; Feasibility Studies; Skeletal Muscle/physiology; Electric Stimulation/methods; Muscle Contraction

19.

Interpretable machine learning-based predictive modeling of patient outcomes following cardiac surgery.

Abbasi, Adeel; Li, Cindy; Dekle, Max; Bermudez, Christian A; Brodie, Daniel; Sellke, Frank W; Sodha, Neel R; Ventetuolo, Corey E; Eickhoff, Carsten.

J Thorac Cardiovasc Surg ; 2023 Nov 29.

Article in English | MEDLINE | ID: mdl-38040328

ABSTRACT

BACKGROUND: The clinical applicability of machine learning predictions of patient outcomes following cardiac surgery remains unclear. We applied machine learning to predict patient outcomes associated with high morbidity and mortality after cardiac surgery and identified the importance of variables to the derived model's performance. METHODS: We applied machine learning to the Society of Thoracic Surgeons Adult Cardiac Surgery Database to predict postoperative hemorrhage requiring reoperation, venous thromboembolism (VTE), and stroke. We used permutation feature importance to identify variables important to model performance and a misclassification analysis to study the limitations of the model. RESULTS: The study dataset included 662,772 subjects who underwent cardiac surgery between 2015 and 2017 and 240 variables. Hemorrhage requiring reoperation, VTE, and stroke occurred in 2.9%, 1.2%, and 2.0% of subjects, respectively. The model performed remarkably well at predicting all 3 complications (area under the receiver operating characteristic curve, 0.92-0.97). Preoperative and intraoperative variables were not important to model performance; instead, performance for the prediction of all 3 outcomes was driven primarily by several postoperative variables, including known risk factors for the complications, such as mechanical ventilation and new onset of postoperative arrhythmias. Many of the postoperative variables important to model performance also increased the risk of subject misclassification, indicating internal validity. CONCLUSIONS: A machine learning model accurately and reliably predicts patient outcomes following cardiac surgery. Postoperative, as opposed to preoperative or intraoperative variables, are important to model performance. Interventions targeting this period, including minimizing the duration of mechanical ventilation and early treatment of new-onset postoperative arrhythmias, may help lower the risk of these complications.
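Permutation feature importance, the technique named in the Methods, can be sketched in a few lines: shuffle one variable at a time and record how much a performance metric drops. The example below is a minimal sketch with a hypothetical toy model and toy data. The abstract does not state which metric or how many repeats the authors used, so accuracy and `n_repeats=20` are assumptions.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model predicts the correct label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the label
            permuted = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical toy data: feature 0 determines the label, feature 1 is noise.
rows = [(i / 10, ((i * 7) % 10) / 10) for i in range(10)]
labels = [int(r[0] > 0.5) for r in rows]


def toy_model(row):
    """Stand-in for a trained classifier; it only looks at feature 0."""
    return int(row[0] > 0.5)


importances = permutation_importance(toy_model, rows, labels)
print(importances)
```

A feature the model never uses has an importance of exactly zero, while shuffling a feature the model relies on lowers accuracy. The same logic applies when, as in the study, a few postoperative variables dominate performance.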

20.

Delirium detection using wearable sensors and machine learning in patients with intracerebral hemorrhage.

Ahmed, Abdullah; Garcia-Agundez, Augusto; Petrovic, Ivana; Radaei, Fatemeh; Fife, James; Zhou, John; Karas, Hunter; Moody, Scott; Drake, Jonathan; Jones, Richard N; Eickhoff, Carsten; Reznik, Michael E.

Front Neurol ; 14: 1135472, 2023.

Article in English | MEDLINE | ID: mdl-37360342

ABSTRACT

Objective: Delirium is associated with worse outcomes in patients with stroke and neurocritical illness, but delirium detection in these patients can be challenging with existing screening tools. To address this gap, we aimed to develop and evaluate machine learning models that detect episodes of post-stroke delirium based on data from wearable activity monitors in conjunction with stroke-related clinical features. Design: Prospective observational cohort study. Setting: Neurocritical Care and Stroke Units at an academic medical center. Patients: We recruited 39 patients with moderate-to-severe acute intracerebral hemorrhage (ICH) and hemiparesis over a 1-year period [mean (SD) age 71.3 (12.20), 54% male, median (IQR) initial NIH Stroke Scale 14.5 (6), median (IQR) ICH score 2 (1)]. Measurements and main results: Each patient received daily assessments for delirium by an attending neurologist, while activity data were recorded throughout each patient's hospitalization using wrist-worn actigraph devices (on both paretic and non-paretic arms). We compared the predictive accuracy of Random Forest, SVM and XGBoost machine learning methods in classifying daily delirium status using clinical information alone and combined with actigraph data. Among our study cohort, 85% of patients (n = 33) had at least one delirium episode, while 71% of monitoring days (n = 209) were rated as days with delirium. Clinical information alone had a low accuracy in detecting delirium on a day-to-day basis [accuracy mean (SD) 62% (18%), F1 score mean (SD) 50% (17%)]. Prediction performance improved significantly (p < 0.001) with the addition of actigraph data [accuracy mean (SD) 74% (10%), F1 score 65% (10%)]. Among actigraphy features, night-time actigraph data were especially relevant for classification accuracy. 
Conclusions: We found that actigraphy in conjunction with machine learning models improves clinical detection of delirium in patients with stroke, thus paving the way to make actigraph-assisted predictions clinically actionable.
