Search | Biblioteca Virtual em Saúde

Pediatric ECG-Based Deep Learning to Predict Left Ventricular Dysfunction and Remodeling.

Mayourian, Joshua; La Cava, William G; Vaid, Akhil; Nadkarni, Girish N; Ghelani, Sunil J; Mannix, Rebekah; Geva, Tal; Dionne, Audrey; Alexander, Mark E; Duong, Son Q; Triedman, John K.

Circulation ; 149(12): 917-931, 2024 03 19.

Article in English | MEDLINE | ID: mdl-38314583

ABSTRACT

BACKGROUND: Artificial intelligence-enhanced ECG analysis shows promise to detect ventricular dysfunction and remodeling in adult populations. However, its application to pediatric populations remains underexplored. METHODS: A convolutional neural network was trained on paired ECG-echocardiograms (≤2 days apart) from patients ≤18 years of age without major congenital heart disease to detect human expert-classified greater than mild left ventricular (LV) dysfunction, hypertrophy, and dilation (individually and as a composite outcome). Model performance was evaluated on single ECG-echocardiogram pairs per patient at Boston Children's Hospital and externally at Mount Sinai Hospital using area under the receiver operating characteristic curve (AUROC) and area under the precision-recall curve (AUPRC). RESULTS: The training cohort comprised 92 377 ECG-echocardiogram pairs (46 261 patients; median age, 8.2 years). Test groups included internal testing (12 631 patients; median age, 8.8 years; 4.6% composite outcomes), emergency department (2830 patients; median age, 7.7 years; 10.0% composite outcomes), and external validation (5088 patients; median age, 4.3 years; 6.1% composite outcomes) cohorts. Model performance was similar on internal test and emergency department cohorts, with model predictions of LV hypertrophy outperforming the pediatric cardiologist expert benchmark. Adding age and sex to the model added no benefit to model performance. 
When using quantitative outcome cutoffs, model performance was similar between internal testing (composite outcome: AUROC, 0.88, AUPRC, 0.43; LV dysfunction: AUROC, 0.92, AUPRC, 0.23; LV hypertrophy: AUROC, 0.88, AUPRC, 0.28; LV dilation: AUROC, 0.91, AUPRC, 0.47) and external validation (composite outcome: AUROC, 0.86, AUPRC, 0.39; LV dysfunction: AUROC, 0.94, AUPRC, 0.32; LV hypertrophy: AUROC, 0.84, AUPRC, 0.25; LV dilation: AUROC, 0.87, AUPRC, 0.33), with composite outcome negative predictive values of 99.0% and 99.2%, respectively. Saliency mapping highlighted ECG components that influenced model predictions (precordial QRS complexes for all outcomes; T waves for LV dysfunction). High-risk ECG features include lateral T-wave inversion (LV dysfunction), deep S waves in V1 and V2 and tall R waves in V6 (LV hypertrophy), and tall R waves in V4 through V6 (LV dilation). CONCLUSIONS: This externally validated algorithm shows promise to inexpensively screen for LV dysfunction and remodeling in children, which may facilitate improved access to care by democratizing the expertise of pediatric cardiologists.
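The three metrics reported above (AUROC, AUPRC, and negative predictive value) can be sketched in plain Python. This is a minimal illustration and not the authors' evaluation code: the function names, the toy labels and scores, and the threshold are all assumptions, and AUPRC is approximated here by step-wise average precision.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (assumes no tied scores)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive example")
    true_pos = 0
    ap = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            true_pos += 1
            ap += true_pos / rank  # precision at each recalled positive
    return ap / total_pos


def negative_predictive_value(labels, scores, threshold):
    """Fraction of below-threshold (screen-negative) cases that are truly negative."""
    tn = sum(1 for y, s in zip(labels, scores) if s < threshold and y == 0)
    fn = sum(1 for y, s in zip(labels, scores) if s < threshold and y == 1)
    if tn + fn == 0:
        raise ValueError("no screen-negative cases at this threshold")
    return tn / (tn + fn)


# Toy data (hypothetical): 1 = outcome present, 0 = outcome absent.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
```

With a low outcome prevalence, as in these cohorts, AUPRC is expected to sit well below AUROC, while the negative predictive value can remain high; that is why the abstract reports all three.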

Subjects

Deep Learning, Left Ventricular Dysfunction, Adult, Humans, Child, Preschool Child, Electrocardiography, Artificial Intelligence, Left Ventricular Dysfunction/diagnostic imaging, Left Ventricular Hypertrophy/diagnostic imaging

A foundation model for clinician-centered drug repurposing.

Huang, Kexin; Chandak, Payal; Wang, Qianwen; Havaldar, Shreyas; Vaid, Akhil; Leskovec, Jure; Nadkarni, Girish; Glicksberg, Benjamin S; Gehlenborg, Nils; Zitnik, Marinka.

medRxiv ; 2024 Aug 07.

Article in English | MEDLINE | ID: mdl-39148855

ABSTRACT

Drug repurposing - identifying new therapeutic uses for approved drugs - is often serendipitous and opportunistic, expanding the use of drugs for new diseases. The clinical utility of drug repurposing AI models remains limited because the models focus narrowly on diseases for which some drugs already exist. Here, we introduce TxGNN, a graph foundation model for zero-shot drug repurposing, identifying therapeutic candidates even for diseases with limited treatment options or no existing drugs. Trained on a medical knowledge graph, TxGNN utilizes a graph neural network and metric-learning module to rank drugs as potential indications and contraindications across 17,080 diseases. When benchmarked against eight methods, TxGNN improves prediction accuracy for indications by 49.2% and contraindications by 35.1% under stringent zero-shot evaluation. To facilitate model interpretation, TxGNN's Explainer module offers transparent insights into multi-hop medical knowledge paths that form TxGNN's predictive rationales. Human evaluation of TxGNN's Explainer showed that TxGNN's predictions and explanations perform encouragingly on multiple axes of performance beyond accuracy. Many of TxGNN's novel predictions align with off-label prescriptions clinicians make in a large healthcare system. TxGNN's drug repurposing predictions are accurate, consistent with off-label drug use, and can be investigated by human experts through multi-hop interpretable rationales.

Local large language models for privacy-preserving accelerated review of historic echocardiogram reports.

Vaid, Akhil; Duong, Son Q; Lampert, Joshua; Kovatch, Patricia; Freeman, Robert; Argulian, Edgar; Croft, Lori; Lerakis, Stamatios; Goldman, Martin; Khera, Rohan; Nadkarni, Girish N.

J Am Med Inform Assoc ; 31(9): 2097-2102, 2024 Sep 01.

Article in English | MEDLINE | ID: mdl-38687616

ABSTRACT

OBJECTIVES: The study developed a framework that leverages an open-source Large Language Model (LLM) to enable clinicians to ask plain-language questions about a patient's entire echocardiogram report history. This approach is intended to streamline the extraction of clinical insights from multiple echocardiogram reports, particularly in patients with complex cardiac diseases, thereby enhancing both patient care and research efficiency. MATERIALS AND METHODS: Data from over 10 years were collected, comprising echocardiogram reports from patients with more than 10 echocardiograms on file at the Mount Sinai Health System. These reports were converted into a single document per patient for analysis and broken down into snippets, and relevant snippets were retrieved using text similarity measures. The LLaMA-2 70B model was employed for analyzing the text using a specially crafted prompt. The model's performance was evaluated against ground-truth answers created by faculty cardiologists. RESULTS: The study analyzed 432 reports from 37 patients for a total of 100 question-answer pairs. The LLM correctly answered 90% of questions, with accuracies of 83% for temporality, 93% for severity assessment, 84% for intervention identification, and 100% for diagnosis retrieval. Errors mainly stemmed from the LLM's inherent limitations, such as misinterpreting numbers or hallucinations. CONCLUSION: The study demonstrates the feasibility and effectiveness of using a local, open-source LLM for querying and interpreting echocardiogram report data. This approach offers a significant improvement over traditional keyword-based searches, enabling more contextually relevant and semantically accurate responses, and in turn shows promise in enhancing clinical decision-making and research by facilitating more efficient access to complex patient data.
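The snippet-retrieval step described in the methods can be illustrated with a small sketch. The abstract does not say which text similarity measure was used, so the bag-of-words cosine similarity below is an assumption, as are the function names, the snippet length, and the made-up report text. The LLM call itself is omitted.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_into_snippets(document, max_words=40):
    """Break one per-patient document into consecutive fixed-size word windows."""
    words = document.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors (Counter objects)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, snippets, k=2):
    """Return the k snippets most similar to the question, best first."""
    q_vec = Counter(tokenize(question))
    scored = [(cosine_similarity(q_vec, Counter(tokenize(s))), s) for s in snippets]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [snippet for _, snippet in scored[:k]]


# Hypothetical toy snippets, not real patient data.
toy_snippets = [
    "2015 report: mild mitral regurgitation, normal LV function.",
    "2019 report: severe aortic stenosis, valve replacement recommended.",
    "2021 report: prosthetic aortic valve well seated.",
]
toy_question = "When was aortic stenosis first reported as severe?"
top_snippets = retrieve(toy_question, toy_snippets, k=2)
```

In a pipeline like the one described, the retrieved snippets would be placed into the prompt ahead of the clinician's question, so the model only sees the parts of the report history most likely to contain the answer.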

Subjects

Echocardiography, Electronic Health Records, Natural Language Processing, Humans, Heart Diseases/diagnostic imaging, Confidentiality, Information Storage and Retrieval/methods

Automated Diagnostic Reports from Images of Electrocardiograms at the Point-of-Care.

Khunte, Akshay; Sangha, Veer; Oikonomou, Evangelos K; Dhingra, Lovedeep S; Aminorroaya, Arya; Coppi, Andreas; Shankar, Sumukh Vasisht; Mortazavi, Bobak J; Bhatt, Deepak L; Krumholz, Harlan M; Nadkarni, Girish N; Vaid, Akhil; Khera, Rohan.

medRxiv ; 2024 Feb 18.

Article in English | MEDLINE | ID: mdl-38405776

ABSTRACT

Timely and accurate assessment of electrocardiograms (ECGs) is crucial for diagnosing, triaging, and clinically managing patients. Current workflows rely on a computerized ECG interpretation using rule-based tools built into the ECG signal acquisition systems with limited accuracy and flexibility. In low-resource settings, specialists must review every single ECG for such decisions, as these computerized interpretations are not available. Additionally, high-quality interpretations are even more essential in such low-resource settings as there is a higher burden of accuracy for automated reads when access to experts is limited. Artificial Intelligence (AI)-based systems have the prospect of greater accuracy yet are frequently limited to a narrow range of conditions and do not replicate the full diagnostic range. Moreover, these models often require raw signal data, which are unavailable to physicians and necessitate costly technical integrations that are currently limited. To overcome these challenges, we developed and validated a format-independent vision encoder-decoder model - ECG-GPT - that can generate free-text, expert-level diagnosis statements directly from ECG images. The model shows robust performance, validated on 2.6 million ECGs across 6 geographically distinct health settings: (1) 2 large and diverse US health systems- Yale-New Haven and Mount Sinai Health Systems, (2) a consecutive ECG dataset from a central ECG repository from Minas Gerais, Brazil, (3) the prospective cohort study, UK Biobank, (4) a Germany-based, publicly available repository, PTB-XL, and (5) a community hospital in Missouri. The model demonstrated consistently high performance (AUROC≥0.81) across a wide range of rhythm and conduction disorders. 
The model can be accessed via a web-based application capable of receiving ECG images and represents a scalable and accessible strategy for generating accurate, expert-level reports from images of ECGs, enabling accurate triage of patients globally, especially in low-resource settings.

Quantitative Prediction of Right Ventricular Size and Function From the ECG.

Duong, Son Q; Vaid, Akhil; My, Vy Thi Ha; Butler, Liam R; Lampert, Joshua; Pass, Robert H; Charney, Alexander W; Narula, Jagat; Khera, Rohan; Sakhuja, Ankit; Greenspan, Hayit; Gelb, Bruce D; Do, Ron; Nadkarni, Girish N.

J Am Heart Assoc ; 13(1): e031671, 2024 Jan 02.

Article in English | MEDLINE | ID: mdl-38156471

ABSTRACT

BACKGROUND: Right ventricular ejection fraction (RVEF) and end-diastolic volume (RVEDV) are not readily assessed through traditional modalities. Deep learning-enabled ECG analysis for estimation of right ventricular (RV) size or function is unexplored. METHODS AND RESULTS: We trained a deep learning-ECG model to predict RV dilation (RVEDV >120 mL/m2), RV dysfunction (RVEF ≤40%), and numerical RVEDV and RVEF from a 12-lead ECG paired with reference-standard cardiac magnetic resonance imaging volumetric measurements in UK Biobank (UKBB; n=42 938). We fine-tuned in a multicenter health system (MSHoriginal [Mount Sinai Hospital]; n=3019) with prospective validation over 4 months (MSHvalidation; n=115). We evaluated performance with area under the receiver operating characteristic curve for categorical and mean absolute error for continuous measures overall and in key subgroups. We assessed the association of RVEF prediction with transplant-free survival with Cox proportional hazards models. The prevalence of RV dysfunction for UKBB/MSHoriginal/MSHvalidation cohorts was 1.0%/18.0%/15.7%, respectively. RV dysfunction model area under the receiver operating characteristic curve for UKBB/MSHoriginal/MSHvalidation cohorts was 0.86/0.81/0.77, respectively. The prevalence of RV dilation for UKBB/MSHoriginal/MSHvalidation cohorts was 1.6%/10.6%/4.3%. RV dilation model area under the receiver operating characteristic curve for UKBB/MSHoriginal/MSHvalidation cohorts was 0.91/0.81/0.92, respectively. MSHoriginal mean absolute error was RVEF=7.8% and RVEDV=17.6 mL/m2. The performance of the RVEF model was similar in key subgroups including with and without left ventricular dysfunction. Over a median follow-up of 2.3 years, predicted RVEF was associated with adjusted transplant-free survival (hazard ratio, 1.40 for each 10% decrease; P=0.031). 
CONCLUSIONS: Deep learning-ECG analysis can identify significant cardiac magnetic resonance imaging RV dysfunction and dilation with good performance. Predicted RVEF is associated with clinical outcome.
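Two of the quantities reported above lend themselves to a short worked example: the mean absolute error used for the continuous RVEF and RVEDV predictions, and the hazard ratio quoted per 10% decrease in predicted RVEF. The sketch below is illustrative only. The function names and toy values are hypothetical, and the rescaling relies on the standard log-linear assumption of a Cox model, under which a hazard ratio for one step size can be converted to another by exponentiation.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between reference and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rescale_hazard_ratio(hazard_ratio, from_step, to_step):
    """Convert a hazard ratio per `from_step` units into one per `to_step` units.

    Assumes the log hazard is linear in the covariate, as in a Cox model.
    """
    return hazard_ratio ** (to_step / from_step)


# Hypothetical RVEF values in percent (reference vs. predicted).
toy_reference = [50.0, 40.0, 60.0]
toy_predicted = [45.0, 48.0, 58.0]
toy_mae = mean_absolute_error(toy_reference, toy_predicted)

# Reported hazard ratio of 1.40 per 10% decrease, re-expressed per 20% decrease.
hr_per_20 = rescale_hazard_ratio(1.40, from_step=10, to_step=20)
```

Under that assumption, a 20% lower predicted RVEF corresponds to a hazard ratio of 1.40 squared, or about 1.96.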

Subjects

Right Ventricular Dysfunction, Right Ventricular Function, Humans, Stroke Volume, Magnetic Resonance Imaging/methods, Heart, Electrocardiography

Evaluating the accuracy of a state-of-the-art large language model for prediction of admissions from the emergency room.

Glicksberg, Benjamin S; Timsina, Prem; Patel, Dhaval; Sawant, Ashwin; Vaid, Akhil; Raut, Ganesh; Charney, Alexander W; Apakama, Donald; Carr, Brendan G; Freeman, Robert; Nadkarni, Girish N; Klang, Eyal.

J Am Med Inform Assoc ; 31(9): 1921-1928, 2024 Sep 01.

Article in English | MEDLINE | ID: mdl-38771093

ABSTRACT

BACKGROUND: Artificial intelligence (AI) and large language models (LLMs) can play a critical role in emergency room operations by augmenting decision-making about patient admission. However, there are no studies for LLMs using real-world data and scenarios, in comparison to and being informed by traditional supervised machine learning (ML) models. We evaluated the performance of GPT-4 for predicting patient admissions from emergency department (ED) visits. We compared performance to traditional ML models both naively and when informed by few-shot examples and/or numerical probabilities. METHODS: We conducted a retrospective study using electronic health records across 7 NYC hospitals. We trained Bio-Clinical-BERT and XGBoost (XGB) models on unstructured and structured data, respectively, and created an ensemble model reflecting ML performance. We then assessed GPT-4 capabilities in many scenarios: through Zero-shot, Few-shot with and without retrieval-augmented generation (RAG), and with and without ML numerical probabilities. RESULTS: The Ensemble ML model achieved an area under the receiver operating characteristic curve (AUC) of 0.88, an area under the precision-recall curve (AUPRC) of 0.72 and an accuracy of 82.9%. The naïve GPT-4's performance (0.79 AUC, 0.48 AUPRC, and 77.5% accuracy) showed substantial improvement when given limited, relevant data to learn from (ie, RAG) and underlying ML probabilities (0.87 AUC, 0.71 AUPRC, and 83.1% accuracy). Interestingly, RAG alone boosted performance to near peak levels (0.82 AUC, 0.56 AUPRC, and 81.3% accuracy). CONCLUSIONS: The naïve LLM had limited performance but showed significant improvement in predicting ED admissions when supplemented with real-world examples to learn from, particularly through RAG, and/or numerical probabilities from traditional ML models. Its peak performance, although slightly lower than the pure ML model, is noteworthy given its potential for providing reasoning behind predictions. 
Further refinement of LLMs with real-world data is necessary for successful integration as decision-support tools in care settings.

Subjects

Electronic Health Records, Hospital Emergency Service, Patient Admission, Humans, Retrospective Studies, Artificial Intelligence, Natural Language Processing, Machine Learning, Supervised Machine Learning

Artificial intelligence-guided detection of under-recognized cardiomyopathies on point-of-care cardiac ultrasound: a multi-center study.

Oikonomou, Evangelos K; Vaid, Akhil; Holste, Gregory; Coppi, Andreas; McNamara, Robert L; Baloescu, Cristiana; Krumholz, Harlan M; Wang, Zhangyang; Apakama, Donald J; Nadkarni, Girish N; Khera, Rohan.

medRxiv ; 2024 Jun 29.

Article in English | MEDLINE | ID: mdl-38559021

ABSTRACT

Background: Point-of-care ultrasonography (POCUS) enables cardiac imaging at the bedside and in communities but is limited by abbreviated protocols and variation in quality. We developed and tested artificial intelligence (AI) models to automate the detection of underdiagnosed cardiomyopathies from cardiac POCUS. Methods: In a development set of 290,245 transthoracic echocardiographic videos across the Yale-New Haven Health System (YNHHS), we used augmentation approaches and a customized loss function weighted for view quality to derive a POCUS-adapted, multi-label, video-based convolutional neural network (CNN) that discriminates HCM (hypertrophic cardiomyopathy) and ATTR-CM (transthyretin amyloid cardiomyopathy) from controls without known disease. We evaluated the final model across independent, internal and external, retrospective cohorts of individuals who underwent cardiac POCUS across YNHHS and Mount Sinai Health System (MSHS) emergency departments (EDs) (2011-2024) to prioritize key views and validate the diagnostic and prognostic performance of single-view screening protocols. Findings: We identified 33,127 patients (median age 61 [IQR: 45-75] years, n=17,276 [52·2%] female) at YNHHS and 5,624 (57 [IQR: 39-71] years, n=1,953 [34·7%] female) at MSHS with 78,054 and 13,796 eligible cardiac POCUS videos, respectively. An AI-enabled single-view screening approach successfully discriminated HCM (AUROC of 0·90 [YNHHS] & 0·89 [MSHS]) and ATTR-CM (AUROC of 0·92 [YNHHS] & 0·99 [MSHS]). In YNHHS, 40 (58·0%) HCM and 23 (47·9%) ATTR-CM cases had a positive screen at a median of 2·1 [IQR: 0·9-4·5] and 1·9 [IQR: 1·0-3·4] years before clinical diagnosis.
Moreover, among 24,448 participants without known cardiomyopathy followed over 2·2 [IQR: 1·1-5·8] years, AI-POCUS probabilities in the highest (vs lowest) quintile for HCM and ATTR-CM conferred a 15% (adj.HR 1·15 [95%CI: 1·02-1·29]) and 39% (adj.HR 1·39 [95%CI: 1·22-1·59]) higher age- and sex-adjusted mortality risk, respectively. Interpretation: We developed and validated an AI framework that enables scalable, opportunistic screening of treatable cardiomyopathies wherever POCUS is used. Funding: National Heart, Lung and Blood Institute, Doris Duke Charitable Foundation, BridgeBio.

Derivation, External Validation and Clinical Implications of a deep learning approach for intracranial pressure estimation using non-cranial waveform measurements.

Gulamali, Faris; Jayaraman, Pushkala; Sawant, Ashwin S; Desman, Jacob; Fox, Benjamin; Chang, Annie; Soong, Brian Y; Arivazaghan, Naveen; Reynolds, Alexandra S; Duong, Son Q; Vaid, Akhil; Kovatch, Patricia; Freeman, Robert; Hofer, Ira S; Sakhuja, Ankit; Dangayach, Neha S; Reich, David S; Charney, Alexander W; Nadkarni, Girish N.

medRxiv ; 2024 Jan 30.

Article in English | MEDLINE | ID: mdl-38352556

ABSTRACT

Importance: Increased intracranial pressure (ICP) is associated with adverse neurological outcomes, but needs invasive monitoring. Objective: Development and validation of an AI approach for detecting increased ICP (aICP) using only non-invasive extracranial physiological waveform data. Design: Retrospective diagnostic study of AI-assisted detection of increased ICP. We developed an AI model using exclusively extracranial waveforms, externally validated it and assessed associations with clinical outcomes. Setting: MIMIC-III Waveform Database (2000-2013), a database derived from patients admitted to an ICU in an academic Boston hospital, was used for development of the aICP model, and to report association with neurologic outcomes. Data from Mount Sinai Hospital (2020-2022) in New York City was used for external validation. Participants: Patients were included if they were older than 18 years, and were monitored with electrocardiograms, arterial blood pressure, respiratory impedance plethysmography and pulse oximetry. Patients who additionally had intracranial pressure monitoring were used for development (N=157) and external validation (N=56). Patients without intracranial monitors were used for association with outcomes (N=1694). Exposures: Extracranial waveforms including electrocardiogram, arterial blood pressure, plethysmography and SpO2. Main Outcomes and Measures: Intracranial pressure > 15 mmHg. Measures were Area under receiver operating characteristic curves (AUROCs), sensitivity, specificity, and accuracy at threshold of 0.5. We calculated odds ratios and p-values for phenotype association. Results: The AUROC was 0.91 (95% CI, 0.90-0.91) on testing and 0.80 (95% CI, 0.80-0.80) on external validation. aICP had accuracy, sensitivity, and specificity of 73.8% (95% CI, 72.0%-75.6%), 99.5% (95% CI 99.3%-99.6%), and 76.9% (95% CI, 74.0-79.8%) on external validation. 
A ten-percentile increment was associated with stroke (OR=2.12; 95% CI, 1.27-3.13), brain malignancy (OR=1.68; 95% CI, 1.09-2.60), subdural hemorrhage (OR=1.66; 95% CI, 1.07-2.57), intracerebral hemorrhage (OR=1.18; 95% CI, 1.07-1.32), and procedures like percutaneous brain biopsy (OR=1.58; 95% CI, 1.15-2.18) and craniotomy (OR = 1.43; 95% CI, 1.12-1.84; P < 0.05 for all). Conclusions and Relevance: aICP provides accurate, non-invasive estimation of increased ICP, and is associated with neurological outcomes and neurosurgical procedures in patients without intracranial monitoring.
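The threshold-based measures named in the abstract (sensitivity, specificity, and accuracy at a probability threshold of 0.5) can be sketched as follows. This is a minimal illustration with hypothetical function names and toy data, not the study's code. It assumes a prediction counts as positive when the model probability is at or above the threshold.

```python
def confusion_counts(labels, probabilities, threshold=0.5):
    """Count true/false positives and negatives at a probability threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(labels, probabilities):
        predicted_positive = p >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def threshold_metrics(labels, probabilities, threshold=0.5):
    """Return sensitivity, specificity, and accuracy at the given threshold."""
    tp, fp, tn, fn = confusion_counts(labels, probabilities, threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    return {
        "sensitivity": tp / (tp + fn),  # share of true positives detected
        "specificity": tn / (tn + fp),  # share of true negatives cleared
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Toy data (hypothetical): 1 = ICP above 15 mmHg, 0 = not elevated.
toy_labels = [1, 1, 0, 0, 0]
toy_probabilities = [0.9, 0.4, 0.6, 0.2, 0.1]
toy_metrics = threshold_metrics(toy_labels, toy_probabilities)
```

Unlike AUROC, these three values change with the chosen threshold, so they describe the classifier at one operating point only.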
