Results 1 - 20 of 84
2.
Sci Rep ; 14(1): 4516, 2024 Feb 24.
Article in English | MEDLINE | ID: mdl-38402362

ABSTRACT

While novel oral anticoagulants are increasingly used to reduce risk of stroke in patients with atrial fibrillation, vitamin K antagonists such as warfarin continue to be used extensively for stroke prevention across the world. Although effective in reducing the risk of stroke, warfarin's complex pharmacodynamics make it difficult to use clinically, with many patients experiencing under- and/or over-anticoagulation. In this study, we employed a novel implementation of deep reinforcement learning to provide clinical decision support to optimize time in therapeutic International Normalized Ratio (INR) range. We used a semi-Markov decision process formulation of the Batch-Constrained deep Q-learning algorithm to develop a reinforcement learning model to dynamically recommend optimal warfarin dosing to achieve INR of 2.0-3.0 for patients with atrial fibrillation. The model was developed using data from 22,502 patients in the warfarin treated groups of the pivotal randomized clinical trials of edoxaban (ENGAGE AF-TIMI 48), apixaban (ARISTOTLE) and rivaroxaban (ROCKET AF). The model was externally validated on data from 5730 warfarin-treated patients in a fourth trial of dabigatran (RE-LY) using multilevel regression models to estimate the relationship between center-level algorithm consistent dosing, time in therapeutic INR range (TTR), and a composite clinical outcome of stroke, systemic embolism or major hemorrhage. External validation showed a positive association between center-level algorithm-consistent dosing and TTR (R2 = 0.56). Each 10% increase in algorithm-consistent dosing at the center level independently predicted a 6.78% improvement in TTR (95% CI 6.29, 7.28; p < 0.001) and an 11% decrease in the composite clinical outcome (HR 0.89; 95% CI 0.81, 1.00; p = 0.015). These results were comparable to those of a rules-based clinical algorithm used for benchmarking, for which each 10% increase in algorithm-consistent dosing independently predicted a 6.10% increase in TTR (95% CI 5.67, 6.54, p < 0.001) and a 10% decrease in the composite outcome (HR 0.90; 95% CI 0.83, 0.98, p = 0.018). Our findings suggest that a deep reinforcement learning algorithm can optimize time in therapeutic range for patients taking warfarin. A digital clinical decision support system to promote algorithm-consistent warfarin dosing could optimize time in therapeutic range and improve clinical outcomes in atrial fibrillation globally.
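The batch-constrained idea behind this abstract can be illustrated with a toy tabular sketch: Q-learning whose bootstrapped maximum is restricted to actions actually observed in the batch for each state. Everything below (the INR bands, dose-change actions, rewards, and transitions) is invented for illustration; the study's actual model is a deep, semi-Markov variant that also handles variable time between clinic visits, which this sketch omits.

```python
# Illustrative sketch only: a tabular, discrete-action analogue of
# Batch-Constrained Q-learning (BCQ). States, actions, and rewards are
# hypothetical toy values, not the trial data or model from the abstract.
from collections import defaultdict

def batch_constrained_q(batch, n_iters=200, gamma=0.9, alpha=0.1):
    """Fit Q-values, restricting the bootstrap to actions seen in the batch."""
    q = defaultdict(float)
    seen = defaultdict(set)          # state -> actions observed in the batch
    for s, a, r, s2 in batch:
        seen[s].add(a)
    for _ in range(n_iters):
        for s, a, r, s2 in batch:
            # Core batch constraint: only bootstrap over actions the batch
            # actually contains for the next state s2.
            allowed = seen[s2]
            target = r + gamma * max((q[(s2, b)] for b in allowed), default=0.0)
            q[(s, a)] += alpha * (target - q[(s, a)])
    return q, seen

def greedy_action(q, seen, s):
    """Recommend the highest-value action among those observed for state s."""
    return max(seen[s], key=lambda a: q[(s, a)])

# Toy batch: states are coarse INR bands, actions are dose adjustments,
# reward 1 when the next visit's INR lands in range.
batch = [
    ("low", "increase", 1, "in_range"),
    ("low", "hold", 0, "low"),
    ("in_range", "hold", 1, "in_range"),
    ("high", "decrease", 1, "in_range"),
    ("high", "hold", 0, "high"),
]
q, seen = batch_constrained_q(batch)
print(greedy_action(q, seen, "low"))    # "increase"
```

Restricting the argmax to batch-observed actions is what makes the approach suitable for offline clinical data, where untried dose changes cannot be evaluated.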


Subject(s)
Atrial Fibrillation , Stroke , Humans , Administration, Oral , Anticoagulants , Atrial Fibrillation/complications , Atrial Fibrillation/drug therapy , Atrial Fibrillation/chemically induced , Machine Learning , Rivaroxaban/therapeutic use , Stroke/prevention & control , Stroke/chemically induced , Treatment Outcome , Warfarin , Randomized Controlled Trials as Topic
3.
J Clin Oncol ; 42(14): 1625-1634, 2024 May 10.
Article in English | MEDLINE | ID: mdl-38359380

ABSTRACT

PURPOSE: For patients with advanced cancer, early consultations with palliative care (PC) specialists reduce costs, improve quality of life, and prolong survival. However, capacity limitations prevent all patients from receiving PC shortly after diagnosis. We evaluated whether a prognostic machine learning system could promote early PC, given existing capacity. METHODS: Using population-level administrative data in Ontario, Canada, we assembled a cohort of patients with incurable cancer who received palliative-intent systemic therapy between July 1, 2014, and December 30, 2019. We developed a machine learning system that predicted death within 1 year of each treatment using demographics, cancer characteristics, treatments, symptoms, laboratory values, and history of acute care admissions. We trained the system in patients who started treatment before July 1, 2017, and evaluated the potential impact of the system on PC in subsequent patients. RESULTS: Among 560,210 treatments received by 54,628 patients, death occurred within 1 year of 45.2% of treatments. The machine learning system recommended the same number of PC consultations observed with usual care at the 60.0% 1-year risk of death, with a first-alarm positive predictive value of 69.7% and an outcome-level sensitivity of 74.9%. Compared with usual care, system-guided care could increase early PC by 8.5% overall (95% CI, 7.5 to 9.5; P < .001) and by 15.3% (95% CI, 13.9 to 16.6; P < .001) among patients who live 6 months beyond their first treatment, without requiring more PC consultations in total or substantially increasing PC among patients with a prognosis exceeding 2 years. CONCLUSION: Prognostic machine learning systems could increase early PC despite existing resource constraints. These results demonstrate an urgent need to deploy and evaluate prognostic systems in real-time clinical practice to increase access to early PC.
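The abstract's alarm metrics (first-alarm positive predictive value and outcome-level sensitivity at a risk threshold) can be sketched with a few lines of code. The patients, risk scores, and outcomes below are made-up toy data, not the Ontario cohort, and the evaluation logic is a simplified reading of the metrics named in the abstract.

```python
# Hedged sketch: first-alarm PPV and outcome-level sensitivity for a
# threshold-triggered prognostic alarm. Toy data, not the study cohort.

def evaluate_alarms(patients, threshold):
    """patients: list of (risk_scores_per_treatment, died_within_year)."""
    first_alarms = tp_alarms = 0
    deaths = detected_deaths = 0
    for risks, died in patients:
        deaths += died
        alarmed = any(r >= threshold for r in risks)  # did an alarm ever fire?
        if alarmed:
            first_alarms += 1
            if died:
                tp_alarms += 1
                detected_deaths += 1
    ppv = tp_alarms / first_alarms if first_alarms else 0.0
    sensitivity = detected_deaths / deaths if deaths else 0.0
    return ppv, sensitivity

patients = [
    ([0.2, 0.7], 1),   # alarm fires, death occurs: true positive
    ([0.8], 0),        # alarm fires, no death: false positive
    ([0.3, 0.4], 1),   # no alarm, death occurs: missed
    ([0.1], 0),        # no alarm, no death
]
ppv, sens = evaluate_alarms(patients, threshold=0.6)
print(ppv, sens)   # 0.5 0.5
```

Raising the threshold trades sensitivity for PPV; the study's choice of the 60% risk cutoff was pinned to the consultation volume already delivered under usual care.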


Subject(s)
Machine Learning , Neoplasms , Palliative Care , Referral and Consultation , Humans , Palliative Care/methods , Neoplasms/therapy , Male , Female , Referral and Consultation/statistics & numerical data , Aged , Middle Aged , Ontario , Aged, 80 and over , Prognosis
4.
PLOS Digit Health ; 3(1): e0000417, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38236824

ABSTRACT

The study provides a comprehensive review of OpenAI's Generative Pre-trained Transformer 4 (GPT-4) technical report, with an emphasis on applications in high-risk settings like healthcare. A diverse team, including experts in artificial intelligence (AI), natural language processing, public health, law, policy, social science, healthcare research, and bioethics, analyzed the report against established peer review guidelines. The GPT-4 report shows a significant commitment to transparent AI research, particularly in creating a systems card for risk assessment and mitigation. However, it reveals limitations such as restricted access to training data, inadequate confidence and uncertainty estimations, and concerns over privacy and intellectual property rights. Key strengths identified include the considerable time and economic investment in transparent AI research and the creation of a comprehensive systems card. On the other hand, the lack of clarity in training processes and data raises concerns about encoded biases and interests in GPT-4. The report also lacks confidence and uncertainty estimations, crucial in high-risk areas like healthcare, and fails to address potential privacy and intellectual property issues. Furthermore, this study emphasizes the need for diverse, global involvement in developing and evaluating large language models (LLMs) to ensure broad societal benefits and mitigate risks. The paper presents recommendations such as improving data transparency, developing accountability frameworks, establishing confidence standards for LLM outputs in high-risk settings, and enhancing industry research review processes. It concludes that while GPT-4's report is a step towards open discussions on LLMs, more extensive interdisciplinary reviews are essential for addressing bias, harm, and risk concerns, especially in high-risk domains. The review aims to expand the understanding of LLMs in general and highlights the need for new forms of reflection on how LLMs are reviewed, the data required for effective evaluation, and addressing critical issues like bias and risk.

5.
Sci Transl Med ; 16(731): eadg4517, 2024 Jan 24.
Article in English | MEDLINE | ID: mdl-38266105

ABSTRACT

The human retina is a multilayered tissue that offers a unique window into systemic health. Optical coherence tomography (OCT) is widely used in eye care and allows the noninvasive, rapid capture of retinal anatomy in exquisite detail. We conducted genotypic and phenotypic analyses of retinal layer thicknesses using macular OCT images from 44,823 UK Biobank participants. We performed OCT layer cross-phenotype association analyses (OCT-XWAS), associating retinal thicknesses with 1866 incident conditions (median 10-year follow-up) and 88 quantitative traits and blood biomarkers. We performed genome-wide association studies (GWASs), identifying inherited genetic markers that influence retinal layer thicknesses and replicated our associations among the LIFE-Adult Study (N = 6313). Last, we performed a comparative analysis of phenome- and genome-wide associations to identify putative causal links between retinal layer thicknesses and both ocular and systemic conditions. Independent associations with incident mortality were detected for thinner photoreceptor segments (PSs) and, separately, ganglion cell complex layers. Phenotypic associations were detected between thinner retinal layers and ocular, neuropsychiatric, cardiometabolic, and pulmonary conditions. A GWAS of retinal layer thicknesses yielded 259 unique loci. Consistency between epidemiologic and genetic associations suggested links between a thinner retinal nerve fiber layer with glaucoma, thinner PS with age-related macular degeneration, and poor cardiometabolic and pulmonary function with a thinner PS. In conclusion, we identified multiple inherited genetic loci and acquired systemic cardio-metabolic-pulmonary conditions associated with thinner retinal layers and identify retinal layers wherein thinning is predictive of future ocular and systemic conditions.


Subject(s)
Cardiovascular Diseases , Genome-Wide Association Study , Adult , Humans , Tomography, Optical Coherence , Face , Retina/diagnostic imaging
6.
N Engl J Med ; 389(22): 2114-5, 2023 Nov 30.
Article in English | MEDLINE | ID: mdl-38048207
7.
NPJ Digit Med ; 6(1): 237, 2023 Dec 20.
Article in English | MEDLINE | ID: mdl-38123810

ABSTRACT

Stress is associated with numerous chronic health conditions, both mental and physical. However, the heterogeneity of these associations at the individual level is poorly understood. While data generated from individuals in their day-to-day lives "in the wild" may best represent the heterogeneity of stress, gathering these data and separating signals from noise is challenging. In this work, we report findings from a major data collection effort using Digital Health Technologies (DHTs) and frontline healthcare workers. We provide insights into stress "in the wild", by using robust methods for its identification from multimodal data and quantifying its heterogeneity. Here we analyze data from the Stress and Recovery in Frontline COVID-19 Workers study following 365 frontline healthcare workers for 4-6 months using wearable devices and smartphone app-based measures. Causal discovery is used to learn how the causal structure governing an individual's self-reported symptoms and physiological features from DHTs differs between non-stress and potential stress states. Our methods uncover robust representations of potential stress states across a population of frontline healthcare workers. These representations reveal high levels of inter- and intra-individual heterogeneity in stress. We leverage multiple stress definitions that span different modalities (from subjective to physiological) to obtain a comprehensive view of stress, as these differing definitions rarely align in time. We show that these different stress definitions can be robustly represented as changes in the underlying causal structure on and off stress for individuals. This study is an important step toward better understanding potential underlying processes generating stress in individuals.

9.
Nat Hum Behav ; 7(11): 1833-1835, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37985904
10.
J Natl Compr Canc Netw ; 21(10): 1029-1037.e21, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37856226

ABSTRACT

BACKGROUND: Emergency department visits and hospitalizations frequently occur during systemic therapy for cancer. We developed and evaluated a longitudinal warning system for acute care use. METHODS: Using a retrospective population-based cohort of patients who started intravenous systemic therapy for nonhematologic cancers between July 1, 2014, and June 30, 2020, we randomly separated patients into cohorts for model training, hyperparameter tuning and model selection, and system testing. Predictive features included static features, such as demographics, cancer type, and treatment regimens, and dynamic features, such as patient-reported symptoms and laboratory values. The longitudinal warning system predicted the probability of acute care utilization within 30 days after each treatment session. Machine learning systems were developed in the training and tuning cohorts and evaluated in the testing cohort. Sensitivity analyses considered feature importance, other acute care endpoints, and performance within subgroups. RESULTS: The cohort included 105,129 patients who received 1,216,385 treatment sessions. Acute care followed 182,444 (15.0%) treatments within 30 days. The ensemble model achieved an area under the receiver operating characteristic curve of 0.742 (95% CI, 0.739-0.745) and was well calibrated in the test cohort. Important predictive features included prior acute care use, treatment regimen, and laboratory tests. If the system were set to alarm approximately once every 15 treatments, 25.5% of acute care events would be preceded by an alarm, and 47.4% of patients would experience acute care after an alarm. The system underestimated risk for some treatment regimens and potentially underserved populations such as females and non-English speakers. CONCLUSIONS: Machine learning warning systems can detect patients at risk for acute care utilization, which can aid in preventive intervention and facilitate tailored treatment. Future research should address potential biases and prospectively evaluate impact after system deployment.
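The abstract's operating point ("alarm approximately once every 15 treatments") corresponds to choosing a score cutoff from the alarm budget rather than from the score scale. A minimal sketch of that calibration, using synthetic scores and events rather than the study data:

```python
# Sketch under assumptions: calibrate an alarm threshold to a target firing
# rate, then measure what fraction of acute-care events follow an alarm.
# Scores and events are synthetic toy values, not the study cohort.

def threshold_for_rate(scores, alarms_per_n):
    """Pick the cutoff yielding roughly one alarm per alarms_per_n treatments."""
    k = max(1, len(scores) // alarms_per_n)      # number of alarms to allow
    return sorted(scores, reverse=True)[k - 1]   # k-th highest score

def event_coverage(scores, events, cutoff):
    """Fraction of events whose treatment's score cleared the cutoff."""
    covered = sum(1 for s, e in zip(scores, events) if s >= cutoff and e)
    total_events = sum(events)
    return covered / total_events if total_events else 0.0

scores = [0.9, 0.1, 0.2, 0.8, 0.3, 0.15, 0.05, 0.4, 0.7, 0.25,
          0.12, 0.6, 0.33, 0.18, 0.22]           # 15 treatments
events = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
cutoff = threshold_for_rate(scores, alarms_per_n=15)   # budget: 1 alarm
print(cutoff, event_coverage(scores, events, cutoff))
```

Fixing the alarm budget first keeps the workload implication explicit, mirroring how the abstract reports coverage at a stated firing rate.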


Subject(s)
Neoplasms , Female , Humans , Retrospective Studies , Neoplasms/diagnosis , Neoplasms/drug therapy , Machine Learning , Hospitalization , Emergency Service, Hospital
11.
Nat Med ; 29(11): 2929-2938, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37884627

ABSTRACT

Artificial intelligence as a medical device is increasingly being applied to healthcare for diagnosis, risk stratification and resource allocation. However, a growing body of evidence has highlighted the risk of algorithmic bias, which may perpetuate existing health inequity. This problem arises in part because of systemic inequalities in dataset curation, unequal opportunity to participate in research and inequalities of access. This study aims to explore existing standards, frameworks and best practices for ensuring adequate data diversity in health datasets. Exploring the body of existing literature and expert views is an important step towards the development of consensus-based guidelines. The study comprises two parts: a systematic review of existing standards, frameworks and best practices for healthcare datasets; and a survey and thematic analysis of stakeholder views of bias, health equity and best practices for artificial intelligence as a medical device. We found that the need for dataset diversity was well described in literature, and experts generally favored the development of a robust set of guidelines, but there were mixed views about how these could be implemented practically. The outputs of this study will be used to inform the development of standards for transparency of data diversity in health datasets (the STANDING Together initiative).


Subject(s)
Artificial Intelligence , Delivery of Health Care , Humans , Consensus , Systematic Reviews as Topic
13.
medRxiv ; 2023 May 17.
Article in English | MEDLINE | ID: mdl-37292770

ABSTRACT

The human retina is a complex multi-layered tissue which offers a unique window into systemic health and disease. Optical coherence tomography (OCT) is widely used in eye care and allows the non-invasive, rapid capture of retinal measurements in exquisite detail. We conducted genome- and phenome-wide analyses of retinal layer thicknesses using macular OCT images from 44,823 UK Biobank participants. We performed phenome-wide association analyses, associating retinal thicknesses with 1,866 incident ICD-based conditions (median 10-year follow-up) and 88 quantitative traits and blood biomarkers. We performed genome-wide association analyses, identifying inherited genetic markers which influence the retina, and replicated our associations among 6,313 individuals from the LIFE-Adult Study. Lastly, we performed comparative analysis of phenome- and genome-wide associations to identify putative causal links between systemic conditions, retinal layer thicknesses, and ocular disease. Independent associations with incident mortality were detected for photoreceptor thinning and ganglion cell complex thinning. Significant phenotypic associations were detected between retinal layer thinning and ocular, neuropsychiatric, cardiometabolic and pulmonary conditions. Genome-wide association of retinal layer thicknesses yielded 259 loci. Consistency between epidemiologic and genetic associations suggested putative causal links between thinning of the retinal nerve fiber layer with glaucoma, the photoreceptor segment (PS) with age-related macular degeneration (AMD), as well as poor cardiometabolic and pulmonary function with PS thinning, among other findings. In conclusion, retinal layer thinning predicts risk of future ocular and systemic disease. Furthermore, systemic cardio-metabolic-pulmonary conditions promote retinal thinning. Retinal imaging biomarkers, integrated into electronic health records, may inform risk prediction and potential therapeutic strategies.

14.
Transl Psychiatry ; 13(1): 210, 2023 Jun 16.
Article in English | MEDLINE | ID: mdl-37328465

ABSTRACT

Advancements in artificial intelligence (AI) are enabling the development of clinical support tools (CSTs) in psychiatry to facilitate the review of patient data and inform clinical care. To promote their successful integration and prevent over-reliance, it is important to understand how psychiatrists will respond to information provided by AI-based CSTs, particularly if it is incorrect. We conducted an experiment to examine psychiatrists' perceptions of AI-based CSTs for treating major depressive disorder (MDD) and to determine whether perceptions interacted with the quality of CST information. Eighty-three psychiatrists read clinical notes about a hypothetical patient with MDD and reviewed two CSTs embedded within a single dashboard: the note's summary and a treatment recommendation. Psychiatrists were randomised to believe the source of CSTs was either AI or another psychiatrist, and across four notes, CSTs provided either correct or incorrect information. Psychiatrists rated the CSTs on various attributes. Ratings for note summaries were less favourable when psychiatrists believed the notes were generated with AI as compared to another psychiatrist, regardless of whether the notes provided correct or incorrect information. A smaller preference for psychiatrist-generated information emerged in ratings of attributes that reflected the summary's accuracy or its inclusion of important information from the full clinical note. Ratings for treatment recommendations were also less favourable when their perceived source was AI, but only when recommendations were correct. There was little evidence that clinical expertise or familiarity with AI impacted results. These findings suggest that psychiatrists prefer human-derived CSTs. This preference was less pronounced for ratings that may have prompted a deeper review of CST information (i.e. a comparison with the full clinical note to evaluate the summary's accuracy or completeness, assessing an incorrect treatment recommendation), suggesting a role of heuristics. Future work should explore other contributing factors and downstream implications for integrating AI into psychiatric care.


Subject(s)
Decision Support Systems, Clinical , Depressive Disorder, Major , Psychiatry , Humans , Artificial Intelligence , Depression , Depressive Disorder, Major/drug therapy
15.
Crit Care Explor ; 5(5): e0897, 2023 May.
Article in English | MEDLINE | ID: mdl-37151895

ABSTRACT

Hospital early warning systems that use machine learning (ML) to predict clinical deterioration are increasingly being used to aid clinical decision-making. However, it is not known how ML predictions complement physician and nurse judgment. Our objective was to train and validate a ML model to predict patient deterioration and compare model predictions with real-world physician and nurse predictions. DESIGN: Retrospective and prospective cohort study. SETTING: Academic tertiary care hospital. PATIENTS: Adult general internal medicine hospitalizations. MEASUREMENTS AND MAIN RESULTS: We developed and validated a neural network model to predict in-hospital death and ICU admission in 23,528 hospitalizations between April 2011 and April 2019. We then compared model predictions with 3,374 prospectively collected predictions from nurses, residents, and attending physicians about their own patients in 960 hospitalizations between April 30 and August 28, 2019. ML model predictions achieved clinician-level accuracy for predicting ICU admission or death (ML median F1 score 0.32 [interquartile range (IQR) 0.30-0.34], AUC 0.77 [IQR 0.76-0.78]; clinicians' median F1 score 0.33 [IQR 0.30-0.35], AUC 0.64 [IQR 0.63-0.66]). ML predictions were more accurate than clinicians for ICU admission. Of all ICU admissions and deaths, 36% occurred in hospitalizations where the model and clinicians disagreed. Combining human and model predictions detected 49% of clinical deterioration events, improving sensitivity by 16% compared with clinicians alone and 24% compared with the model alone while maintaining a positive predictive value of 33%, thus keeping false alarms at a clinically acceptable level. CONCLUSIONS: ML models can complement clinician judgment to predict clinical deterioration in hospital. These findings demonstrate important opportunities for human-computer collaboration to improve prognostication and personalized medicine in hospital.
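The human-model combination the abstract describes amounts to flagging deterioration when either the clinician or the model predicts it, trading some precision for sensitivity. A toy sketch, with invented predictions rather than the study data:

```python
# Illustrative sketch: union ("OR") of clinician and model alerts, and its
# effect on sensitivity and PPV. All values are toy examples.

def sens_ppv(pred, truth):
    """Sensitivity and positive predictive value of binary predictions."""
    tp = sum(p and t for p, t in zip(pred, truth))
    sens = tp / sum(truth) if sum(truth) else 0.0
    ppv = tp / sum(pred) if sum(pred) else 0.0
    return sens, ppv

clinician = [1, 0, 0, 1, 0, 1]   # clinician catches events 1 and 4
model     = [0, 1, 0, 1, 0, 0]   # model catches events 2 and 4
outcome   = [1, 1, 0, 1, 0, 0]   # true deterioration events
combined  = [c or m for c, m in zip(clinician, model)]

print(sens_ppv(clinician, outcome))  # (0.667, 0.667)
print(sens_ppv(model, outcome))      # (0.667, 1.0)
print(sens_ppv(combined, outcome))   # (1.0, 0.75)
```

Because clinician and model errors are partly uncorrelated (36% of events fell in disagreement cases in the study), the union recovers events each source misses alone, at the cost of accumulating both sources' false alarms.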

16.
Sci Adv ; 9(19): eabq0701, 2023 May 10.
Article in English | MEDLINE | ID: mdl-37163590

ABSTRACT

As governments and industry turn to increased use of automated decision systems, it becomes essential to consider how closely such systems can reproduce human judgment. We identify a core potential failure, finding that annotators label objects differently depending on whether they are being asked a factual question or a normative question. This challenges a natural assumption maintained in many standard machine-learning (ML) data acquisition procedures: that there is no difference between predicting the factual classification of an object and an exercise of judgment about whether an object violates a rule premised on those facts. We find that using factual labels to train models intended for normative judgments introduces a notable measurement error. We show that models trained using factual labels yield significantly different judgments than those trained using normative labels and that the impact of this effect on model performance can exceed that of other factors (e.g., dataset size) that routinely attract attention from ML researchers and practitioners.
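The core measurement issue here is that the same items receive different labels depending on whether the annotation question is factual or normative. A minimal illustration with invented labels (not the paper's data), where the disagreement rate quantifies the error introduced by training a normative classifier on factual labels:

```python
# Toy illustration of the factual-vs-normative labeling gap. Labels are
# invented examples, not the study's annotation data.

factual   = [1, 1, 0, 1, 0, 1, 0, 0]   # "does the object have property X?"
normative = [1, 0, 0, 1, 0, 0, 0, 0]   # "does it violate the rule about X?"

# Items judged to have the property but not to violate the rule are exactly
# the cases a factually-trained model would over-flag.
disagreement = sum(f != n for f, n in zip(factual, normative)) / len(factual)
print(disagreement)   # 0.25
```

In this toy setup a quarter of items would be mislabeled from the normative standpoint, an error baked into the training data before any model choices are made.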


Subject(s)
Judgment , Machine Learning , Humans , Government
18.
Nat Cardiovasc Res ; 2: 144-158, 2023 Jan 16.
Article in English | MEDLINE | ID: mdl-36949957

ABSTRACT

Somatic mutations in blood indicative of clonal hematopoiesis of indeterminate potential (CHIP) are associated with an increased risk of hematologic malignancy, coronary artery disease, and all-cause mortality. Here we analyze the relation between CHIP status and incident peripheral artery disease (PAD) and atherosclerosis, using whole-exome sequencing and clinical data from the UK Biobank and Mass General Brigham Biobank. CHIP was associated with incident PAD and atherosclerotic disease across multiple vascular beds, with increased risk among individuals with CHIP driven by mutation in DNA Damage Repair (DDR) genes such as TP53 and PPM1D. To model the effects of DDR-induced CHIP on atherosclerosis, we used a competitive bone marrow transplantation strategy, and generated atherosclerosis-prone Ldlr-/- chimeric mice carrying 20% p53-deficient hematopoietic cells. The chimeric mice were analyzed 13 weeks post-grafting and showed increased aortic plaque size and accumulation of macrophages within the plaque, driven by increased proliferation of p53-deficient plaque macrophages. In summary, our findings highlight the role of CHIP as a broad driver of atherosclerosis across the entire arterial system beyond the coronary arteries, and provide genetic and experimental support for a direct causal contribution of TP53-mutant CHIP to atherosclerosis.

19.
J Diabetes ; 15(2): 145-151, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36641812

ABSTRACT

OBJECTIVE: To determine whether nailfold capillary images, acquired using video capillaroscopy, can provide diagnostic information about diabetes and its complications. RESEARCH DESIGN AND METHODS: Nailfold video capillaroscopy was performed in 120 adult patients with and without type 1 or type 2 diabetes, and with and without cardiovascular disease. Nailfold images were analyzed using convolutional neural networks, a deep learning technique. Cross-validation was used to develop and test the ability of models to predict six prespecified states (diabetes, high glycosylated hemoglobin, cardiovascular event, retinopathy, albuminuria, and hypertension). The performance of each model for a particular state was assessed by estimating areas under the receiver operating characteristic curves (AUROC) and precision-recall curves (AUPR). RESULTS: A total of 5236 nailfold images were acquired from 120 participants (mean 44 images per participant) and were all available for analysis. Models were able to accurately identify the presence of diabetes, with AUROC 0.84 (95% confidence interval [CI] 0.76, 0.91) and AUPR 0.84 (95% CI 0.78, 0.93). Models were also able to predict a history of cardiovascular events in patients with diabetes, with AUROC 0.65 (95% CI 0.51, 0.78) and AUPR 0.72 (95% CI 0.62, 0.88). CONCLUSIONS: This proof-of-concept study demonstrates the potential of machine learning for identifying people with microvascular capillary changes from diabetes based on nailfold images, and for possibly identifying those most likely to have diabetes-related complications.
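The AUROC metric reported throughout this abstract has a simple probabilistic reading: the chance that a randomly chosen positive case scores above a randomly chosen negative (the Mann-Whitney formulation). A self-contained sketch, with toy scores standing in for the model outputs:

```python
# Minimal AUROC sketch via pairwise comparisons (Mann-Whitney formulation).
# Scores and labels are toy values, not the nailfold-model outputs.

def auroc(scores, labels):
    """P(score of a random positive > score of a random negative), ties = 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1,   1,   0,   1,   0,   0]
print(auroc(scores, labels))   # 8/9 ≈ 0.889
```

This O(P×N) form is fine for small samples; production code would use a rank-based O(n log n) computation such as scikit-learn's `roc_auc_score`.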


Subject(s)
Deep Learning , Diabetes Mellitus, Type 2 , Adult , Humans , Diabetes Mellitus, Type 2/complications , Diabetes Mellitus, Type 2/diagnosis , Microscopic Angioscopy/methods , Nails/diagnostic imaging , Nails/blood supply , ROC Curve , Capillaries/diagnostic imaging
20.
Sci Rep ; 13(1): 1383, 2023 Jan 25.
Article in English | MEDLINE | ID: mdl-36697450

ABSTRACT

Artificial intelligence (AI)-generated clinical advice is becoming more prevalent in healthcare. However, the impact of AI-generated advice on physicians' decision-making is underexplored. In this study, physicians received X-rays with correct diagnostic advice and were asked to make a diagnosis, rate the advice's quality, and judge their own confidence. We manipulated whether the advice came with or without a visual annotation on the X-rays, and whether it was labeled as coming from an AI or a human radiologist. Overall, receiving annotated advice from an AI resulted in the highest diagnostic accuracy. Physicians rated the quality of AI advice higher than human advice. We did not find a strong effect of either manipulation on participants' confidence. The magnitude of the effects varied between task experts and non-task experts, with the latter benefiting considerably from correct explainable AI advice. These findings raise important considerations for the deployment of diagnostic advice in healthcare.


Subject(s)
Artificial Intelligence , Physicians , Humans , X-Rays , Radiography , Radiologists