Results 1 - 20 of 336
1.
Article in English | MEDLINE | ID: mdl-38771093

ABSTRACT

BACKGROUND: Artificial intelligence (AI) and large language models (LLMs) can play a critical role in emergency room operations by augmenting decision-making about patient admission. However, no studies have evaluated LLMs using real-world data and scenarios in comparison with, and informed by, traditional supervised machine learning (ML) models. We evaluated the performance of GPT-4 for predicting patient admissions from emergency department (ED) visits. We compared performance to traditional ML models both naively and when informed by few-shot examples and/or numerical probabilities. METHODS: We conducted a retrospective study using electronic health records across 7 NYC hospitals. We trained Bio-Clinical-BERT and XGBoost (XGB) models on unstructured and structured data, respectively, and created an ensemble model reflecting ML performance. We then assessed GPT-4 capabilities in many scenarios: through Zero-shot, Few-shot with and without retrieval-augmented generation (RAG), and with and without ML numerical probabilities. RESULTS: The Ensemble ML model achieved an area under the receiver operating characteristic curve (AUC) of 0.88, an area under the precision-recall curve (AUPRC) of 0.72 and an accuracy of 82.9%. The naïve GPT-4's performance (0.79 AUC, 0.48 AUPRC, and 77.5% accuracy) showed substantial improvement when given limited, relevant data to learn from (i.e., RAG) and underlying ML probabilities (0.87 AUC, 0.71 AUPRC, and 83.1% accuracy). Interestingly, RAG alone boosted performance to near peak levels (0.82 AUC, 0.56 AUPRC, and 81.3% accuracy). CONCLUSIONS: The naïve LLM had limited performance but showed significant improvement in predicting ED admissions when supplemented with real-world examples to learn from, particularly through RAG, and/or numerical probabilities from traditional ML models. Its peak performance, although slightly lower than the pure ML model, is noteworthy given its potential for providing reasoning behind predictions. Further refinement of LLMs with real-world data is necessary for successful integration as decision-support tools in care settings.
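
As an illustration of the evaluation described above, the following minimal Python sketch combines two models' predicted admission probabilities and scores them with a rank-based AUC. The abstract does not say how the ensemble was formed, so the simple averaging in `ensemble_mean` is an assumption, and all names and numbers here are hypothetical toy values rather than study data.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def ensemble_mean(probs_a, probs_b):
    """Assumed ensembling rule: unweighted mean of two probability lists."""
    if len(probs_a) != len(probs_b):
        raise ValueError("probability lists must have the same length")
    return [(a + b) / 2 for a, b in zip(probs_a, probs_b)]


# Toy data (made up): 1 = admitted, 0 = discharged.
labels = [1, 0, 1, 0, 1, 0]
text_probs = [0.9, 0.4, 0.3, 0.2, 0.7, 0.6]      # e.g. a text-based model
tabular_probs = [0.6, 0.3, 0.8, 0.7, 0.9, 0.1]   # e.g. a tabular model
ensemble_probs = ensemble_mean(text_probs, tabular_probs)

for name, probs in [("text", text_probs),
                    ("tabular", tabular_probs),
                    ("ensemble", ensemble_probs)]:
    print(name, round(roc_auc(labels, probs), 3))
```

The pairwise definition is quadratic in the number of cases; for real datasets a library routine would normally be used instead.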

2.
Pediatr Cardiol ; 2024 May 10.
Article in English | MEDLINE | ID: mdl-38730015

ABSTRACT

Assessment of pulmonary regurgitation (PR) guides treatment for patients with congenital heart disease. Quantitative assessment of PR fraction (PRF) by echocardiography is limited. Cardiac MRI (cMRI) is the reference standard for PRF quantification. We created an algorithm to predict cMRI-quantified PRF from echocardiography using machine learning (ML). We retrospectively performed echocardiographic measurements paired to cMRI within 3 months in patients with ≥ mild PR from 2009 to 2022. Model inputs were vena contracta ratio, PR index, PR pressure half-time, main and branch pulmonary artery diastolic flow reversal (BPAFR), and transannular patch repair. A gradient-boosted trees ML algorithm was trained using k-fold cross-validation to predict cMRI PRF by phase contrast imaging as a continuous number and at > mild (PRF ≥ 20%) and severe (PRF ≥ 40%) thresholds. Regression performance was evaluated with mean absolute error (MAE), and at clinical thresholds with area-under-the-receiver-operating-characteristic curve (AUROC). Prediction accuracy was compared to historical clinician accuracy. We externally validated prior reported studies for comparison. We included 243 subjects (median age 21 years, 58% repaired tetralogy of Fallot). The regression MAE = 7.0%. For prediction of > mild PR, AUROC = 0.96, but BPAFR alone outperformed the ML model (sensitivity 94%, specificity 97%). The ML model detection of severe PR had AUROC = 0.86, but in the subgroup with BPAFR, performance dropped (AUROC = 0.73). Accuracy between clinicians and the ML model was similar (70% vs. 69%). There was a decrement in performance of prior reported algorithms on external validation in our dataset. A novel ML model for echocardiographic quantification of PRF outperforms prior studies and has comparable overall accuracy to clinicians. BPAFR is an excellent marker for > mild PRF, and has moderate capacity to detect severe PR, but more work is required to distinguish moderate from severe PR. Poor external validation of prior works highlights reproducibility challenges.
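
The two evaluation steps named in this abstract, mean absolute error on the continuous PRF estimate and classification at the 20% and 40% thresholds, can be sketched as follows. This is a minimal illustration only: the function names and the grade labels are hypothetical, and the example values are made up rather than taken from the study.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between reference and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pr_grade(prf_percent):
    """Map a PR fraction (in percent) to the thresholds quoted in the
    abstract: PRF >= 20% is 'greater than mild', PRF >= 40% is 'severe'."""
    if prf_percent >= 40:
        return "severe"
    if prf_percent >= 20:
        return "greater than mild, not severe"
    return "mild or less"


# Made-up reference (cMRI) and predicted PRF values, in percent.
reference = [10, 30, 50]
predicted = [15, 25, 45]
print(mean_absolute_error(reference, predicted))
print([pr_grade(p) for p in predicted])
```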

3.
Crit Care ; 28(1): 156, 2024 05 10.
Article in English | MEDLINE | ID: mdl-38730421

ABSTRACT

BACKGROUND: Current classification of acute kidney injury (AKI) in critically ill patients with sepsis relies only on its severity, measured by maximum creatinine, which overlooks the inherent complexities and longitudinal evaluation of this heterogeneous syndrome. The role of classification of AKI based on early creatinine trajectories is unclear. METHODS: This retrospective study identified patients with Sepsis-3 who developed AKI within 48 h of intensive care unit admission using the Medical Information Mart for Intensive Care-IV database. We used latent class mixed modelling to identify early creatinine trajectory-based classes of AKI in critically ill patients with sepsis. Our primary outcome was development of acute kidney disease (AKD). Secondary outcomes were the composite of AKD or all-cause in-hospital mortality by day 7, and AKD or all-cause in-hospital mortality by hospital discharge. We used multivariable regression to assess the impact of creatinine trajectory-based classification on outcomes, and the eICU database for external validation. RESULTS: Among 4197 critically ill patients with sepsis and AKI, we identified eight creatinine trajectory-based classes with distinct characteristics. Compared to the class with transient AKI, the class that showed severe AKI with mild improvement but persistence had the highest adjusted risks for developing AKD (OR 5.16; 95% CI 2.87-9.24) and the composite 7-day outcome (HR 4.51; 95% CI 2.69-7.56). The class that demonstrated late mild AKI with persistence and worsening had the highest risk of developing the composite hospital discharge outcome (HR 2.04; 95% CI 1.41-2.94). These associations were similar on external validation. CONCLUSIONS: These 8 classes of AKI in critically ill patients with sepsis, stratified by early creatinine trajectories, were good predictors of key outcomes independent of AKI staging.


Subjects
Acute Kidney Injury , Creatinine , Critical Illness , Machine Learning , Sepsis , Humans , Acute Kidney Injury/blood , Acute Kidney Injury/diagnosis , Acute Kidney Injury/etiology , Acute Kidney Injury/classification , Male , Sepsis/blood , Sepsis/complications , Sepsis/classification , Female , Retrospective Studies , Creatinine/blood , Creatinine/analysis , Middle Aged , Aged , Machine Learning/trends , Intensive Care Units/statistics & numerical data , Intensive Care Units/organization & administration , Biomarkers/blood , Biomarkers/analysis , Hospital Mortality
4.
medRxiv ; 2024 May 09.
Article in English | MEDLINE | ID: mdl-38765961

ABSTRACT

Adenosine-to-inosine (A-to-I) editing is a prevalent post-transcriptional RNA modification within the brain. Yet, most research has relied on postmortem samples, assuming it is an accurate representation of RNA biology in the living brain. We challenge this assumption by comparing A-to-I editing between postmortem and living prefrontal cortical tissues. Major differences were found, with over 70,000 A-to-I sites showing higher editing levels in postmortem tissues. Increased A-to-I editing in postmortem tissues is linked to higher ADAR1 and ADARB1 expression, is more pronounced in non-neuronal cells, and indicative of postmortem activation of inflammation and hypoxia. Higher A-to-I editing in living tissues marks sites that are evolutionarily preserved, synaptic, developmentally timed, and disrupted in neurological conditions. Common genetic variants were also found to differentially affect A-to-I editing levels in living versus postmortem tissues. Collectively, these discoveries illuminate the nuanced functions and intricate regulatory mechanisms of RNA editing within the human brain.

5.
medRxiv ; 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38699362

ABSTRACT

Importance: Infant alertness and neurologic changes are assessed by exam, which can be intermittent and subjective. Reliable, continuous methods are needed. Objective: We hypothesized that our computer vision method to track movement, pose AI, could predict neurologic changes. Design: Retrospective observational study from 2021-2022. Setting: A level four urban neonatal intensive care unit (NICU). Participants: Infants with corrected age ≤1 year, comprising 115 patients with 4,705 hours of video data linked to electroencephalograms (EEG), including 46% female and 25.2% white non-Hispanic. Exposures: Pose AI prediction of anatomic landmark position and an XGBoost classifier trained on one-minute variance in pose. Main outcomes and measures: Outcomes were cerebral dysfunction, diagnosed from EEG readings by an epileptologist, and sedation, defined by the administration of sedative medications. Measures of algorithm performance were areas under the receiver operating characteristic curve (ROC-AUCs) on cross-validation and on two test datasets comprising held-out infants and held-out video frames from infants used in training. Results: Infant pose was accurately predicted in cross-validation, held-out frames, and held-out infants (respective ROC-AUCs 0.94, 0.83, 0.89). Median movement increased with age and, after accounting for age, was lower with sedative medications and in infants with cerebral dysfunction (all P<5×10⁻³, 10,000 permutations). Sedation prediction had high performance on cross-validation, held-out frames, and held-out infants (ROC-AUCs 0.90, 0.91, 0.87), as did prediction of cerebral dysfunction (ROC-AUCs 0.91, 0.90, 0.76). Conclusions and Relevance: We used pose AI to predict sedation and cerebral dysfunction in 4,705 hours of video from a large, diverse cohort of infants. Pose AI may offer a scalable, minimally invasive method for neuro-telemetry in the NICU.

6.
medRxiv ; 2024 May 01.
Article in English | MEDLINE | ID: mdl-38746297

ABSTRACT

Single-nucleus RNA sequencing (snRNA-seq) is often used to define gene expression patterns characteristic of brain cell types as well as to identify cell type specific gene expression signatures of neurological and mental illnesses in postmortem human brains. As methods to obtain brain tissue from living individuals emerge, it is essential to characterize gene expression differences associated with tissue originating from either living or postmortem subjects using snRNA-seq, and to assess whether and how such differences may impact snRNA-seq studies of brain tissue. To address this, human prefrontal cortex single nuclei gene expression was generated and compared between 31 samples from living individuals and 21 postmortem samples. The same cell types were consistently identified in living and postmortem nuclei, though for each cell type, a large proportion of genes were differentially expressed between samples from postmortem and living individuals. Notably, estimation of cell type proportions by cell type deconvolution of pseudo-bulk data was found to be more accurate in samples from living individuals. To allow for future integration of living and postmortem brain gene expression, a model was developed that quantifies from gene expression data the probability a human brain tissue sample was obtained postmortem. These probabilities are established as a means to statistically account for the gene expression differences between samples from living and postmortem individuals. Together, the results presented here provide a deep characterization of both differences between snRNA-seq derived from samples from living and postmortem individuals, as well as qualify and account for their effect on common analyses performed on this type of data.

7.
medRxiv ; 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38562892

ABSTRACT

COVID-19 has been a significant public health concern for the last four years; however, little is known about the mechanisms that lead to severe COVID-associated kidney injury. In this multicenter study, we combined quantitative deep urinary proteomics and machine learning to predict severe acute outcomes in hospitalized COVID-19 patients. Using a 10-fold cross-validated random forest algorithm, we identified a set of urinary proteins that demonstrated predictive power for both discovery and validation set with 87% and 79% accuracy, respectively. These predictive urinary biomarkers were recapitulated in non-COVID acute kidney injury revealing overlapping injury mechanisms. We further combined orthogonal multiomics datasets to understand the mechanisms that drive severe COVID-associated kidney injury. Functional overlap and network analysis of urinary proteomics, plasma proteomics and urine sediment single-cell RNA sequencing showed that extracellular matrix and autophagy-associated pathways were uniquely impacted in severe COVID-19. Differentially abundant proteins associated with these pathways exhibited high expression in cells in the juxtamedullary nephron, endothelial cells, and podocytes, indicating that these kidney cell types could be potential targets. Further, single-cell transcriptomic analysis of kidney organoids infected with SARS-CoV-2 revealed dysregulation of extracellular matrix organization in multiple nephron segments, recapitulating the clinically observed fibrotic response across multiomics datasets. Ligand-receptor interaction analysis of the podocyte and tubule organoid clusters showed significant reduction and loss of interaction between integrins and basement membrane receptors in the infected kidney organoids. Collectively, these data suggest that extracellular matrix degradation and adhesion-associated mechanisms could be a main driver of COVID-associated kidney injury and severe outcomes.

8.
JAMA Cardiol ; 2024 Apr 06.
Article in English | MEDLINE | ID: mdl-38581644

ABSTRACT

Importance: Aortic stenosis (AS) is a major public health challenge with a growing therapeutic landscape, but current biomarkers do not inform personalized screening and follow-up. A video-based artificial intelligence (AI) biomarker (Digital AS Severity index [DASSi]) can detect severe AS using single-view long-axis echocardiography without Doppler characterization. Objective: To deploy DASSi to patients with no AS or with mild or moderate AS at baseline to identify AS development and progression. Design, Setting, and Participants: This is a cohort study that examined 2 cohorts of patients without severe AS undergoing echocardiography in the Yale New Haven Health System (YNHHS; 2015-2021) and Cedars-Sinai Medical Center (CSMC; 2018-2019). A novel computational pipeline for the cross-modal translation of DASSi into cardiac magnetic resonance (CMR) imaging was further developed in the UK Biobank. Analyses were performed between August 2023 and February 2024. Exposure: DASSi (range, 0-1) derived from AI applied to echocardiography and CMR videos. Main Outcomes and Measures: Annualized change in peak aortic valve velocity (AV-Vmax) and late (>6 months) aortic valve replacement (AVR). Results: A total of 12 599 participants were included in the echocardiographic study (YNHHS: n = 8798; median [IQR] age, 71 [60-80] years; 4250 [48.3%] women; median [IQR] follow-up, 4.1 [2.4-5.4] years; and CSMC: n = 3801; median [IQR] age, 67 [54-78] years; 1685 [44.3%] women; median [IQR] follow-up, 3.4 [2.8-3.9] years). Higher baseline DASSi was associated with faster progression in AV-Vmax (per 0.1 DASSi increment: YNHHS, 0.033 m/s per year [95% CI, 0.028-0.038] among 5483 participants; CSMC, 0.082 m/s per year [95% CI, 0.053-0.111] among 1292 participants), with values of 0.2 or greater associated with a 4- to 5-fold higher AVR risk than values less than 0.2 (YNHHS: 715 events; adjusted hazard ratio [HR], 4.97 [95% CI, 2.71-5.82]; CSMC: 56 events; adjusted HR, 4.04 [95% CI, 0.92-17.70]), independent of age, sex, race, ethnicity, ejection fraction, and AV-Vmax. This was reproduced across 45 474 participants (median [IQR] age, 65 [59-71] years; 23 559 [51.8%] women; median [IQR] follow-up, 2.5 [1.6-3.9] years) undergoing CMR imaging in the UK Biobank (for participants with DASSi ≥0.2 vs those with DASSi <0.2, adjusted HR, 11.38 [95% CI, 2.56-50.57]). Saliency maps and phenome-wide association studies supported associations with cardiac structure and function and traditional cardiovascular risk factors. Conclusions and Relevance: In this cohort study of patients without severe AS undergoing echocardiography or CMR imaging, a new AI-based video biomarker was independently associated with AS development and progression, enabling opportunistic risk stratification across cardiovascular imaging modalities as well as potential application on handheld devices.

9.
medRxiv ; 2024 Mar 26.
Article in English | MEDLINE | ID: mdl-38585929

ABSTRACT

Randomized clinical trials (RCTs) are essential to guide medical practice; however, their generalizability to a given population is often uncertain. We developed a statistically informed Generative Adversarial Network (GAN) model, RCT-Twin-GAN, that leverages relationships between covariates and outcomes and generates a digital twin of an RCT (RCT-Twin) conditioned on covariate distributions from a second patient population. We used RCT-Twin-GAN to reproduce treatment effect outcomes of the Systolic Blood Pressure Intervention Trial (SPRINT) and the Action to Control Cardiovascular Risk in Diabetes (ACCORD) Blood Pressure Trial, which tested the same intervention but had different treatment effect results. To demonstrate treatment effect estimates of each RCT conditioned on the other RCT patient population, we evaluated the cardiovascular event-free survival of SPRINT digital twins conditioned on the ACCORD cohort and vice versa (SPRINT-conditioned ACCORD twins). The conditioned digital twins were balanced by the intervention arm (mean absolute standardized mean difference [MASMD] of covariates between treatment arms 0.019 [SD 0.018]), and the conditioned covariates of the SPRINT-Twin on ACCORD were more similar to ACCORD than to SPRINT (MASMD 0.0082 [SD 0.016] vs. 0.46 [SD 0.20]). Most importantly, across iterations, SPRINT-conditioned ACCORD-Twin datasets reproduced the overall non-significant effect size seen in ACCORD (5-year cardiovascular outcome hazard ratio [95% confidence interval] of 0.88 [0.73-1.06] in ACCORD vs median 0.87 [0.68-1.13] in the SPRINT-conditioned ACCORD-Twin), while the ACCORD-conditioned SPRINT-Twins reproduced the significant effect size seen in SPRINT (0.75 [0.64-0.89] vs median 0.79 [0.72-0.86] in the ACCORD-conditioned SPRINT-Twin). Finally, we describe the translation of this approach to real-world populations by conditioning the trials on an electronic health record population. Therefore, RCT-Twin-GAN simulates the direct translation of RCT-derived treatment effects across various patient populations with varying covariate distributions.
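
The balance metric quoted above, the mean absolute standardized mean difference (MASMD), can be illustrated with a short Python sketch. The abstract does not give the exact formula used, so the common definition with a pooled standard deviation (square root of the average of the two group variances) is assumed here; the function names and the toy covariates are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(a, b):
    """SMD between two samples, using sqrt((var_a + var_b) / 2) as the
    pooled standard deviation (an assumed, commonly used definition)."""
    pooled_sd = sqrt((variance(a) + variance(b)) / 2)
    if pooled_sd == 0:
        raise ValueError("both samples are constant; SMD is undefined")
    return (mean(a) - mean(b)) / pooled_sd


def masmd(group_a, group_b):
    """Mean of |SMD| over all covariates.

    group_a and group_b map covariate name -> list of values."""
    smds = [
        abs(standardized_mean_difference(group_a[name], group_b[name]))
        for name in group_a
    ]
    return mean(smds)


# Made-up covariates for two small cohorts.
cohort_a = {"age": [60, 65, 70, 75], "sbp": [130, 140, 150, 160]}
cohort_b = {"age": [62, 66, 71, 74], "sbp": [128, 141, 152, 158]}
print(round(masmd(cohort_a, cohort_b), 4))
```

Values near zero indicate that the two groups have similar covariate distributions, which is how the abstract uses the metric.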

10.
J Am Coll Cardiol ; 2024 Mar 26.
Article in English | MEDLINE | ID: mdl-38593945

ABSTRACT

Recent Artificial Intelligence (AI) advancements in cardiovascular care offer potential enhancements in effective diagnosis, treatment, and outcomes. Over 600 Food and Drug Administration (FDA)-approved clinical AI algorithms now exist, with 10% focusing on cardiovascular applications, highlighting the growing opportunities for AI to augment care. This review discusses the latest advancements in the field of AI, with a particular focus on the utilization of multimodal inputs and the field of generative AI. Further discussions in this review involve an approach to understanding the larger context in which AI-augmented care may exist, and include a discussion of the need for rigorous evaluation, appropriate infrastructure for deployment, ethics and equity assessments, regulatory oversight, and viable business cases for deployment. Embracing this rapidly evolving technology while setting an appropriately high evaluation benchmark with careful and patient-centered implementation will be crucial for cardiology to leverage AI to enhance patient care and the provider experience.

11.
J Am Coll Cardiol ; 2024 Mar 26.
Article in English | MEDLINE | ID: mdl-38593946

ABSTRACT

Recent AI advancements in cardiovascular care offer potential enhancements in diagnosis, treatment, and outcomes. Innovations to date focus on automating measurements, enhancing image quality, and detecting diseases using novel methods. Applications span wearables, electrocardiograms, echocardiography, angiography, genetics, and more. AI models detect diseases from electrocardiograms at accuracy not previously achieved by technology or human experts, including reduced ejection fraction, valvular heart disease, and other cardiomyopathies. However, AI's unique characteristics necessitate rigorous validation by addressing training methods, real-world efficacy, equity concerns, and long-term reliability. Despite an exponentially growing number of studies in cardiovascular AI, trials showing improvement in outcomes remain lacking. A number are currently underway. Embracing this rapidly evolving technology while setting a high evaluation benchmark will be crucial for cardiology to leverage AI to enhance patient care and the provider experience.

12.
medRxiv ; 2024 Mar 15.
Article in English | MEDLINE | ID: mdl-38559021

ABSTRACT

Background: Point-of-care ultrasonography (POCUS) enables access to cardiac imaging directly at the bedside but is limited by brief acquisition, variation in acquisition quality, and lack of advanced protocols. Objective: To develop and validate deep learning models for detecting underdiagnosed cardiomyopathies on cardiac POCUS, leveraging a novel acquisition quality-adapted modeling strategy. Methods: To develop the models, we identified transthoracic echocardiograms (TTEs) of patients across five hospitals in a large U.S. health system with transthyretin amyloid cardiomyopathy (ATTR-CM, confirmed by Tc99m-pyrophosphate imaging), hypertrophic cardiomyopathy (HCM, confirmed by cardiac magnetic resonance), and controls enriched for the presence of severe AS. In a sample of 290,245 TTE videos, we used novel augmentation approaches and a customized loss function to weigh image and view quality to train a multi-label, view-agnostic video-based convolutional neural network (CNN) to discriminate the presence of ATTR-CM, HCM, and/or AS. Models were tested across 3,758 real-world POCUS videos from 1,879 studies in 1,330 independent emergency department (ED) patients from 2011 through 2023. Results: Our multi-label, view-agnostic classifier demonstrated state-of-the-art performance in discriminating ATTR-CM (AUROC 0.98 [95% CI: 0.96-0.99]) and HCM (AUROC 0.95 [95% CI: 0.94-0.96]) on standard TTE studies. Automated metrics of anatomical view correctness confirmed significantly lower quality in POCUS vs TTE videos (median view classifier confidence of 0.63 [IQR: 0.44-0.88] vs 0.93 [IQR: 0.69-1.00], p<0.001). When deployed to POCUS videos, our algorithm effectively discriminated ATTR-CM and HCM with AUROC of up to 0.94 (parasternal long-axis (PLAX)), and 0.85 (apical 4 chamber), corresponding to positive diagnostic odds ratios of 46.7 and 25.5, respectively. In total, 18/35 (51.4%) of ATTR-CM and 32/57 (41.1%) of HCM patients in the POCUS cohort had an AI-positive screen in the year before their eventual confirmatory imaging. Conclusions: We define and validate an AI framework that enables scalable, opportunistic screening of under-diagnosed cardiomyopathies using POCUS.
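
For readers unfamiliar with the diagnostic odds ratio (DOR) reported above, the sketch below shows the standard definition, computed either from a 2x2 confusion table or from sensitivity and specificity. The counts in the example are made up for illustration and are not the study's data; the function names are hypothetical.

```python
def diagnostic_odds_ratio(tp, fp, fn, tn):
    """DOR = (TP / FN) / (FP / TN) = (TP * TN) / (FP * FN)."""
    if fp == 0 or fn == 0:
        raise ValueError("DOR is undefined when FP or FN is zero")
    return (tp * tn) / (fp * fn)


def dor_from_rates(sensitivity, specificity):
    """Equivalent form: odds of a positive test in the diseased
    divided by odds of a positive test in the non-diseased."""
    if not (0 < sensitivity < 1 and 0 < specificity < 1):
        raise ValueError("rates must lie strictly between 0 and 1")
    odds_pos_diseased = sensitivity / (1 - sensitivity)
    odds_pos_healthy = (1 - specificity) / specificity
    return odds_pos_diseased / odds_pos_healthy


# Made-up confusion table: 18 true positives, 10 false positives,
# 2 false negatives, 90 true negatives.
print(diagnostic_odds_ratio(tp=18, fp=10, fn=2, tn=90))
print(round(dor_from_rates(0.9, 0.9), 2))
```

A DOR of 1 means the test does not discriminate; larger values indicate better discrimination.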

13.
Cell Rep Med ; : 101518, 2024 Apr 14.
Article in English | MEDLINE | ID: mdl-38642551

ABSTRACT

Population-based genomic screening may help diagnose individuals with disease-risk variants. Here, we perform a genome-first evaluation for nine disorders in 29,039 participants with linked exome sequences and electronic health records (EHRs). We identify 614 individuals with 303 pathogenic/likely pathogenic or predicted loss-of-function (P/LP/LoF) variants, yielding 644 observations; 487 observations (76%) lack a corresponding clinical diagnosis in the EHR. Upon further investigation, 75 clinically undiagnosed observations (15%) have evidence of symptomatic untreated disease, including familial hypercholesterolemia (3 of 6 [50%] undiagnosed observations with disease evidence) and breast cancer (23 of 106 [22%]). These genetic findings enable targeted phenotyping that reveals new diagnoses in previously undiagnosed individuals. Disease yield is greater with variants in penetrant genes for which disease is observed in carriers in an independent cohort. The prevalence of P/LP/LoF variants exceeds that of clinical diagnoses, and some clinically undiagnosed carriers are discovered to have disease. These results highlight the potential of population-based genomic screening.

14.
J Neurol ; 2024 Apr 24.
Article in English | MEDLINE | ID: mdl-38656620

ABSTRACT

OBJECTIVE: To describe the frequency of neuropsychiatric complications among hospitalized patients with coronavirus disease 2019 (COVID-19) and their association with pre-existing comorbidities and clinical outcomes. METHODS: We retrospectively identified all patients hospitalized with COVID-19 within a large multicenter New York City health system between March 15, 2020 and May 17, 2021 and randomly selected a representative cohort for detailed chart review. Clinical data, including the occurrence of neuropsychiatric complications (categorized as either altered mental status [AMS] or other neuropsychiatric complications) and in-hospital mortality, were extracted using an electronic medical record database and individual chart review. Associations between neuropsychiatric complications, comorbidities, laboratory findings, and in-hospital mortality were assessed using multivariate logistic regression. RESULTS: Our study cohort consisted of 974 patients, the majority of whom were admitted during the first wave of the pandemic. Patients were treated with anticoagulation (88.4%), glucocorticoids (24.8%), and remdesivir (10.5%); 18.6% experienced severe COVID-19 pneumonia (evidenced by ventilator requirement). Neuropsychiatric complications occurred in 58.8% of patients; 39.8% experienced AMS; and 19.0% experienced at least one other complication (seizures in 1.4%, ischemic stroke in 1.6%, hemorrhagic stroke in 1.0%) or symptom (headache in 11.4%, anxiety in 6.8%, ataxia in 6.3%). Higher odds of mortality, which occurred in 22.0%, were associated with AMS, ventilator support, increasing age, and higher serum inflammatory marker levels. Anticoagulant therapy was associated with lower odds of mortality and AMS. CONCLUSION: Neuropsychiatric complications of COVID-19, especially AMS, were common, varied, and associated with in-hospital mortality in a diverse multicenter cohort at an epicenter of the COVID-19 pandemic.

15.
Article in English | MEDLINE | ID: mdl-38687616

ABSTRACT

OBJECTIVES: The study developed a framework that leverages an open-source Large Language Model (LLM) to enable clinicians to ask plain-language questions about a patient's entire echocardiogram report history. This approach is intended to streamline the extraction of clinical insights from multiple echocardiogram reports, particularly in patients with complex cardiac diseases, thereby enhancing both patient care and research efficiency. MATERIALS AND METHODS: Data from over 10 years were collected, comprising echocardiogram reports from patients with more than 10 echocardiograms on file at the Mount Sinai Health System. These reports were converted into a single document per patient for analysis and broken down into snippets, and relevant snippets were retrieved using text similarity measures. The LLaMA-2 70B model was employed for analyzing the text using a specially crafted prompt. The model's performance was evaluated against ground-truth answers created by faculty cardiologists. RESULTS: The study analyzed 432 reports from 37 patients for a total of 100 question-answer pairs. The LLM correctly answered 90% of questions, with accuracies of 83% for temporality, 93% for severity assessment, 84% for intervention identification, and 100% for diagnosis retrieval. Errors mainly stemmed from the LLM's inherent limitations, such as misinterpreting numbers or hallucinations. CONCLUSION: The study demonstrates the feasibility and effectiveness of using a local, open-source LLM for querying and interpreting echocardiogram report data. This approach offers a significant improvement over traditional keyword-based searches, enabling more contextually relevant and semantically accurate responses, in turn showing promise in enhancing clinical decision-making and research by facilitating more efficient access to complex patient data.
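
The retrieval step described in this abstract, selecting the report snippets most similar to a clinician's question, might look roughly like the sketch below. The abstract does not specify which text similarity measure was used, so bag-of-words cosine similarity is an assumption here, and the function names and example snippets are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between bag-of-words count vectors."""
    counts_a, counts_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(counts_a[tok] * counts_b[tok] for tok in counts_a)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_snippets(question, snippets, k=2):
    """Return the k snippets most similar to the question."""
    ranked = sorted(
        snippets,
        key=lambda s: cosine_similarity(question, s),
        reverse=True,
    )
    return ranked[:k]


# Made-up report snippets, for illustration only.
snippets = [
    "Mild mitral regurgitation.",
    "Severe aortic stenosis, severity unchanged.",
    "Normal left ventricular size.",
]
print(top_snippets("aortic stenosis severity", snippets, k=1))
```

In a pipeline of the kind the abstract describes, the retrieved snippets would then be placed into the prompt given to the language model; that step is not shown here.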

16.
JMIR Ment Health ; 11: e55552, 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38663011

ABSTRACT

BACKGROUND: Heart rate variability (HRV) biofeedback is often performed with structured education, laboratory-based assessments, and practice sessions. It has been shown to improve psychological and physiological function across populations. However, a means to remotely use and monitor this approach would allow for wider use of this technique. Advancements in wearable and digital technology present an opportunity for the widespread application of this approach. OBJECTIVE: The primary aim of the study was to determine the feasibility of fully remote, self-administered short sessions of HRV-directed biofeedback in a diverse population of health care workers (HCWs). The secondary aim was to determine whether a fully remote, HRV-directed biofeedback intervention significantly alters longitudinal HRV over the intervention period, as monitored by wearable devices. The tertiary aim was to estimate the impact of this intervention on metrics of psychological well-being. METHODS: To determine whether remotely implemented short sessions of HRV biofeedback can improve autonomic metrics and psychological well-being, we enrolled HCWs across 7 hospitals in New York City in the United States. They downloaded our study app, watched brief educational videos about HRV biofeedback, and used a well-studied HRV biofeedback program remotely through their smartphone. HRV biofeedback sessions were used for 5 minutes per day for 5 weeks. HCWs were then followed for 12 weeks after the intervention period. Psychological measures were obtained over the study period, and they wore an Apple Watch for at least 7 weeks to monitor the circadian features of HRV. RESULTS: In total, 127 HCWs were enrolled in the study. Overall, only 21 (16.5%) were at least 50% compliant with the HRV biofeedback intervention, representing a small portion of the total sample. This demonstrates that this study design does not feasibly result in adequate rates of compliance with the intervention. 
Numerical improvement in psychological metrics was observed over the 17-week study period, although it did not reach statistical significance (all P>.05). Using a mixed effect cosinor model, the mean midline-estimating statistic of rhythm (MESOR) of the circadian pattern of the SD of the interbeat interval of normal sinus beats (SDNN), an HRV metric, was observed to increase over the first 4 weeks of the biofeedback intervention in HCWs who were at least 50% compliant. CONCLUSIONS: In conclusion, we found that using brief remote HRV biofeedback sessions and monitoring its physiological effect using wearable devices, in the manner that the study was conducted, was not feasible. This is considering the low compliance rates with the study intervention. We found that remote short sessions of HRV biofeedback demonstrate potential promise in improving autonomic nervous function and warrant further study. Wearable devices can monitor the physiological effects of psychological interventions.
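
The MESOR mentioned above comes from a cosinor model, which describes a circadian signal as y(t) = MESOR + A·cos(2πt/24 + φ). The sketch below fits that curve to a single series by ordinary least squares. It is a simplified single-subject illustration, not the mixed-effects cosinor model used in the study, and the function names and synthetic data are hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_cosinor(times_h, values, period_h=24.0):
    """Least-squares fit of y = M + beta*cos(wt) + gamma*sin(wt).

    Returns (MESOR, amplitude, acrophase in radians), where
    beta = A*cos(phi) and gamma = -A*sin(phi)."""
    w = 2 * math.pi / period_h
    rows = [(1.0, math.cos(w * t), math.sin(w * t)) for t in times_h]
    # Normal equations: (X^T X) coef = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    mesor, beta, gamma = solve_linear(xtx, xty)
    return mesor, math.hypot(beta, gamma), math.atan2(-gamma, beta)


# Synthetic hourly SDNN-like series over two days (made-up parameters).
times = list(range(48))
series = [50 + 10 * math.cos(2 * math.pi * t / 24 + 0.5) for t in times]
print(fit_cosinor(times, series))
```

Because the model is linear in `beta` and `gamma`, no iterative optimizer is needed; the amplitude and acrophase are recovered from the two fitted coefficients afterwards.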


Subjects
Biofeedback, Psychology , Heart Rate , Wearable Electronic Devices , Adult , Female , Humans , Male , Middle Aged , Biofeedback, Psychology/methods , Biofeedback, Psychology/instrumentation , Health Personnel , Heart Rate/physiology , New York City , Prospective Studies , Telemedicine/methods , Telemedicine/instrumentation
17.
J Cancer Res Clin Oncol ; 150(3): 140, 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38504034

ABSTRACT

PURPOSE: Despite advanced technologies in breast cancer management, challenges remain in efficiently interpreting vast clinical data for patient-specific insights. We reviewed the literature on how large language models (LLMs) such as ChatGPT might offer solutions in this field. METHODS: We searched MEDLINE for relevant studies published before December 22, 2023. Keywords included: "large language models", "LLM", "GPT", "ChatGPT", "OpenAI", and "breast". The risk of bias was evaluated using the QUADAS-2 tool. RESULTS: Six studies evaluating either ChatGPT-3.5 or GPT-4 met our inclusion criteria. They explored clinical notes analysis, guideline-based question-answering, and patient management recommendations. Accuracy varied between studies, ranging from 50% to 98%. Higher accuracy was seen in structured tasks like information retrieval. Half of the studies used real patient data, adding practical clinical value. Challenges included inconsistent accuracy, dependency on the way questions are posed (prompt-dependency), and in some cases, missing critical clinical information. CONCLUSION: LLMs hold potential in breast cancer care, especially in textual information extraction and guideline-driven clinical question-answering. Yet, their inconsistent accuracy underscores the need for careful validation of these models, and the importance of ongoing supervision.


Subjects
Breast Neoplasms , Humans , Female , Breast Neoplasms/therapy , Breast , Information Storage and Retrieval , Language
18.
BMC Med Educ ; 24(1): 354, 2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38553693

ABSTRACT

BACKGROUND: Writing multiple choice questions (MCQs) for the purpose of medical exams is challenging. It requires extensive medical knowledge, time and effort from medical educators. This systematic review focuses on the application of large language models (LLMs) in generating medical MCQs. METHODS: The authors searched for studies published up to November 2023. Search terms focused on LLM-generated MCQs for medical examinations. Non-English studies, studies outside the year range, and studies not focusing on AI-generated multiple-choice questions were excluded. MEDLINE was used as a search database. Risk of bias was evaluated using a tailored QUADAS-2 tool. RESULTS: Overall, eight studies published between April 2023 and October 2023 were included. Six studies used ChatGPT 3.5, while two employed GPT-4. Five studies showed that LLMs can produce competent questions valid for medical exams. Three studies used LLMs to write medical questions but did not evaluate the validity of the questions. One study conducted a comparative analysis of different models. One other study compared LLM-generated questions with those written by humans. All studies presented faulty questions that were deemed inappropriate for medical exams. Some questions required additional modifications in order to qualify. CONCLUSIONS: LLMs can be used to write MCQs for medical examinations. However, their limitations cannot be ignored. Further study in this field is essential and more conclusive evidence is needed. Until then, LLMs may serve as a supplementary tool for writing medical examinations. Two studies were at high risk of bias. The study followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines.


Subjects
Knowledge , Language , Humans , Databases, Factual , Writing
19.
Therap Adv Gastroenterol ; 17: 17562848241227031, 2024.
Article in English | MEDLINE | ID: mdl-38390029

ABSTRACT

Over the past year, the emergence of state-of-the-art large language models (LLMs) in tools like ChatGPT has ushered in a rapid acceleration in artificial intelligence (AI) innovation. These powerful AI models can generate tailored and high-quality text responses to instructions and questions without the need for labor-intensive task-specific training data or complex software engineering. As the technology continues to mature, LLMs hold immense potential for transforming clinical workflows, enhancing patient outcomes, improving medical education, and optimizing medical research. In this review, we provide a practical discussion of LLMs, tailored to gastroenterologists. We highlight the technical foundations of LLMs, emphasizing their key strengths and limitations as well as how to interact with them safely and effectively. We discuss some potential LLM use cases for clinical gastroenterology practice, education, and research. Finally, we review critical barriers to implementation and ongoing work to address these issues. This review aims to equip gastroenterologists with a foundational understanding of LLMs to facilitate a more active clinician role in the development and implementation of this rapidly emerging technology.


Large language models in gastroenterology: a simplified overview for clinicians
This text discusses the recent advancements in large language models (LLMs), like ChatGPT, which have significantly advanced artificial intelligence. These models can create specific, high-quality text responses without needing extensive training data or complex programming. They show great promise in transforming various aspects of clinical healthcare, particularly in improving patient care, medical education, and research. This article focuses on how LLMs can be applied in the field of gastroenterology. It explains the technical aspects of LLMs, their strengths and weaknesses, and how to use them effectively and safely. The text also explores how LLMs could be used in clinical practice, education, and research in gastroenterology. Finally, it discusses the challenges in implementing these models and the ongoing efforts to overcome them, aiming to provide gastroenterologists with the basic knowledge needed to engage more actively in the development and use of this emerging technology.

20.
Artif Intell Med ; 148: 102750, 2024 02.
Article in English | MEDLINE | ID: mdl-38325922

ABSTRACT

Computational subphenotyping, a data-driven approach to understanding disease subtypes, is a prominent topic in medical research. Numerous ongoing studies are dedicated to developing advanced computational subphenotyping methods for cross-sectional data. However, the potential of time-series data has been underexplored until now. Here, we propose a Multivariate Levenshtein Distance (MLD) that can account for correlation in multiple discrete features over time-series data. Our algorithm has two distinct components: an optimal threshold score that enhances the sensitivity in discriminating between pairs of instances, and the MLD itself. We applied the proposed distance metric with the k-means clustering algorithm to derive temporal subphenotypes from time-series data of biomarkers and treatment administrations from 1039 critically ill patients with COVID-19 and compared its effectiveness to standard methods. In conclusion, the Multivariate Levenshtein Distance metric is a novel method to quantify the distance from multiple discrete features over time-series data and demonstrates superior clustering performance among competing time-series distance metrics.
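The abstract does not specify how the MLD is computed, so the following is only an illustrative sketch of the general idea of an edit distance over multivariate discrete time series, not the authors' algorithm. It assumes each time step is a non-empty tuple of discrete feature values, that insertions and deletions cost 1, and that substituting one time step for another costs the fraction of features that differ. The names `step_cost` and `multivariate_edit_distance` are hypothetical, and the threshold component described in the abstract is not implemented.

```python
def step_cost(a, b):
    """Substitution cost between two time steps (assumed definition):
    the fraction of discrete features whose values differ."""
    differing = sum(1 for x, y in zip(a, b) if x != y)
    return differing / len(a)


def multivariate_edit_distance(s, t):
    """Levenshtein-style dynamic programme over sequences of feature tuples.

    Insertions and deletions of a whole time step cost 1.0;
    substitutions cost step_cost(a, b), a value between 0.0 and 1.0.
    """
    # prev[j] holds the distance between the processed prefix of s and t[:j].
    prev = [float(j) for j in range(len(t) + 1)]
    for i, a in enumerate(s, start=1):
        curr = [float(i)]
        for j, b in enumerate(t, start=1):
            curr.append(
                min(
                    prev[j] + 1.0,                   # delete a from s
                    curr[j - 1] + 1.0,               # insert b into s
                    prev[j - 1] + step_cost(a, b),   # substitute a with b
                )
            )
        prev = curr
    return prev[-1]


# Hypothetical patients: (biomarker level, treatment status) per time step.
patient_a = [("high", "on"), ("low", "on")]
patient_b = [("high", "on"), ("low", "off")]
distance = multivariate_edit_distance(patient_a, patient_b)
```

A pairwise distance of this kind can be fed into a clustering routine that accepts precomputed dissimilarities; how the paper couples its metric with k-means is not described in the abstract.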


Subjects
COVID-19 , Critical Illness , Humans , Time Factors , Cross-Sectional Studies , Algorithms
...