1.
Article in English | MEDLINE | ID: mdl-39284102

ABSTRACT

In the high-stakes arena of drug discovery, the journey from bench to bedside is hindered by a daunting 92% failure rate, primarily due to unpredicted toxicities and inadequate therapeutic efficacy in clinical trials. The FDA Modernization Act 2.0 heralds a transformative approach, advocating for the integration of alternative methods to conventional animal testing, including cell-based assays that employ human induced pluripotent stem cell (iPSC)-derived organoids, and organ-on-a-chip technologies, in conjunction with sophisticated artificial intelligence (AI) methodologies. Our review explores the innovative capacity of iPSC-derived clinical trial in a dish models designed for cardiovascular disease research. We also highlight how integrating iPSC technology with AI can accelerate the identification of viable therapeutic candidates, streamline drug screening, and pave the way toward more personalized medicine. Through this, we provide a comprehensive overview of the current landscape and future implications of iPSC and AI applications being navigated by the research community and pharmaceutical industry.

2.
Nat Cancer ; 2024 Sep 20.
Article in English | MEDLINE | ID: mdl-39304772

ABSTRACT

Hepatocellular carcinoma (HCC) frequently recurs from minimal residual disease (MRD), which persists after therapy. Here, we identified mechanisms of persistence of residual tumor cells using post-chemoembolization human HCC (n = 108 patients, 1.07 million cells) and a transgenic mouse model of MRD. Through single-cell high-plex cytometric imaging, we identified a spatial neighborhood within which PD-L1+ M2-like macrophages interact with stem-like tumor cells, correlating with CD8+ T cell exhaustion and poor survival. Further, through spatial transcriptomics of residual HCC, we showed that macrophage-derived TGFβ1 mediates the persistence of stem-like tumor cells. Last, we demonstrate that combined blockade of Pdl1 and Tgfβ excluded immunosuppressive macrophages, recruited activated CD8+ T cells and eliminated residual stem-like tumor cells in two mouse models: a transgenic model of MRD and a syngeneic orthotopic model of doxorubicin-resistant HCC. Thus, our spatial analyses reveal that PD-L1+ macrophages sustain MRD by activating the TGFβ pathway in stem-like cancer cells, and targeting this interaction may prevent HCC recurrence from MRD.

3.
Cell Rep Methods ; 4(8): 100838, 2024 Aug 19.
Article in English | MEDLINE | ID: mdl-39127044

ABSTRACT

Tissues are organized into anatomical and functional units at different scales. New technologies for high-dimensional molecular profiling in situ have enabled the characterization of structure-function relationships in increasing molecular detail. However, it remains a challenge to consistently identify key functional units across experiments, tissues, and disease contexts, a task that demands extensive manual annotation. Here, we present spatial cellular graph partitioning (SCGP), a flexible method for the unsupervised annotation of tissue structures. We further present a reference-query extension pipeline, SCGP-Extension, that generalizes reference tissue structure labels to previously unseen samples, performing data integration and tissue structure discovery. Our experiments demonstrate reliable, robust partitioning of spatial data in a wide variety of contexts and best-in-class accuracy in identifying expertly annotated structures. Downstream analysis on SCGP-identified tissue structures reveals disease-relevant insights regarding diabetic kidney disease, skin disorder, and neoplastic diseases, underscoring its potential to drive biological insight and discovery from spatial datasets.


Subject(s)
Computational Biology , Humans , Animals , Computational Biology/methods , Diabetic Nephropathies/metabolism , Diabetic Nephropathies/pathology , Mice , Skin Diseases/genetics , Skin Diseases/pathology
4.
Nat Med ; 2024 Aug 07.
Article in English | MEDLINE | ID: mdl-39112796

ABSTRACT

Traditional biomedical artificial intelligence (AI) models, designed for specific tasks or modalities, often exhibit limited flexibility in real-world deployment and struggle to utilize holistic information. Generalist AI holds the potential to address these limitations due to its versatility in interpreting different data types and generating tailored outputs for diverse needs. However, existing biomedical generalist AI solutions are typically heavyweight and closed source to researchers, practitioners and patients. Here, we describe BiomedGPT, the first open-source and lightweight vision-language foundation model, designed as a generalist capable of performing various biomedical tasks. BiomedGPT achieved state-of-the-art results in 16 out of 25 experiments while maintaining a computing-friendly model scale. We also conducted human evaluations to assess the capabilities of BiomedGPT in radiology visual question answering, report generation and summarization. BiomedGPT exhibits robust prediction ability with a low error rate of 3.8% in question answering, satisfactory performance with an error rate of 8.3% in writing complex radiology reports, and competitive summarization ability with a nearly equivalent preference score to human experts. Our method demonstrates that effective training with diverse data can lead to more practical biomedical AI for improving diagnosis and workflow efficiency.

5.
Nat Methods ; 21(8): 1422-1429, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39122951

ABSTRACT

Language models are playing an increasingly important role in many areas of artificial intelligence (AI) and computational biology. In this primer, we discuss the ways in which language models, both those based on natural language and those based on biological sequences, can be applied to biological research. This primer is primarily intended for biologists interested in using these cutting-edge AI technologies in their applications. We provide guidance on best practices and key resources for adapting language models for biology.


Subject(s)
Artificial Intelligence , Computational Biology , Computational Biology/methods , Humans , Natural Language Processing , Programming Languages
6.
bioRxiv ; 2024 Jul 19.
Article in English | MEDLINE | ID: mdl-39071282

ABSTRACT

Old age is associated with a decline in cognitive function and an increase in neurodegenerative disease risk [1]. Brain aging is complex and accompanied by many cellular changes [2-20]. However, the influence that aged cells have on neighboring cells and how this contributes to tissue decline is unknown. More generally, the tools to systematically address this question in aging tissues have not yet been developed. Here, we generate spatiotemporal data at single-cell resolution for the mouse brain across lifespan, and we develop the first machine learning models based on spatial transcriptomics ('spatial aging clocks') to reveal cell proximity effects during brain aging and rejuvenation. We collect a single-cell spatial transcriptomics brain atlas of 4.2 million cells from 20 distinct ages and across two rejuvenating interventions: exercise and partial reprogramming. We identify spatial and cell type-specific transcriptomic fingerprints of aging, rejuvenation, and disease, including for rare cell types. Using spatial aging clocks and deep learning models, we find that T cells, which infiltrate the brain with age, have a striking pro-aging proximity effect on neighboring cells. Surprisingly, neural stem cells have a strong pro-rejuvenating effect on neighboring cells. By developing computational tools to identify mediators of these proximity effects, we find that pro-aging T cells trigger a local inflammatory response likely via interferon-γ whereas pro-rejuvenating neural stem cells impact the metabolism of neighboring cells possibly via growth factors (e.g. vascular endothelial growth factor) and extracellular vesicles, and we experimentally validate some of these predictions. These results suggest that rare cells can have a drastic influence on their neighbors and could be targeted to counter tissue aging. We anticipate that these spatial aging clocks will not only allow scalable assessment of the efficacy of interventions for aging and disease but also represent a new tool for studying cell-cell interactions in many spatial contexts.

7.
ArXiv ; 2024 Jun 20.
Article in English | MEDLINE | ID: mdl-38947933

ABSTRACT

Feature attribution, the ability to localize regions of the input data that are relevant for classification, is an important capability for ML models in scientific and biomedical domains. Current methods for feature attribution, which rely on "explaining" the predictions of end-to-end classifiers, suffer from imprecise feature localization and are inadequate for use with small sample sizes and high-dimensional datasets due to computational challenges. We introduce prospector heads, an efficient and interpretable alternative to explanation-based attribution methods that can be applied to any encoder and any data modality. Prospector heads generalize across modalities through experiments on sequences (text), images (pathology), and graphs (protein structures), outperforming baseline attribution methods by up to 26.3 points in mean localization AUPRC. We also demonstrate how prospector heads enable improved interpretation and discovery of class-specific patterns in input data. Through their high performance, flexibility, and generalizability, prospectors provide a framework for improving trust and transparency for ML models in complex domains.

8.
Eur Heart J Digit Health ; 5(4): 427-434, 2024 Jul.
Article in English | MEDLINE | ID: mdl-39081946

ABSTRACT

Aims: Deep learning methods have recently gained success in detecting left ventricular systolic dysfunction (LVSD) from electrocardiogram (ECG) waveforms. Despite their high level of accuracy, they are difficult to interpret and deploy broadly in the clinical setting. In this study, we set out to determine whether simpler models based on standard ECG measurements could detect LVSD with similar accuracy to that of deep learning models. Methods and results: Using an observational data set of 40 994 matched 12-lead ECGs and transthoracic echocardiograms, we trained a range of models with increasing complexity to detect LVSD based on ECG waveforms and derived measurements. The training data were acquired from the Stanford University Medical Center. External validation data were acquired from the Columbia Medical Center and the UK Biobank. The Stanford data set consisted of 40 994 matched ECGs and echocardiograms, of which 9.72% had LVSD. A random forest model using 555 discrete, automated measurements achieved an area under the receiver operator characteristic curve (AUC) of 0.92 (0.91-0.93), similar to a deep learning waveform model with an AUC of 0.94 (0.93-0.94). A logistic regression model based on five measurements achieved high performance [AUC of 0.86 (0.85-0.87)], close to a deep learning model and better than N-terminal prohormone brain natriuretic peptide (NT-proBNP). Finally, we found that simpler models were more portable across sites, with experiments at two independent, external sites. Conclusion: Our study demonstrates the value of simple electrocardiographic models that perform nearly as well as deep learning models, while being much easier to implement and interpret.
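
The AUC figures quoted in this abstract can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The following is a minimal sketch of that computation, not the authors' code; the function name `roc_auc` and the toy labels and scores are made up for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = LVSD present, 0 = absent.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of cases; on a data set of the size described here a rank-based formula would likely be preferable, but the pairwise form makes the definition explicit.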

9.
ArXiv ; 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-39010876

ABSTRACT

Radiotherapy treatment planning is a time-consuming and potentially subjective process that requires the iterative adjustment of model parameters to balance multiple conflicting objectives. Recent advancements in large foundation models offer promising avenues for addressing the challenges in planning and clinical decision-making. This study introduces GPT-RadPlan, a fully automated treatment planning framework that harnesses prior radiation oncology knowledge encoded in multi-modal large language models, such as GPT-4Vision (GPT-4V) from OpenAI. GPT-RadPlan is made aware of planning protocols as context and acts as an expert human planner, capable of guiding a treatment planning process. Via in-context learning, we incorporate clinical protocols for various disease sites as prompts to enable GPT-4V to acquire treatment planning domain knowledge. The resulting GPT-RadPlan agent is integrated into our in-house inverse treatment planning system through an API. The efficacy of the automated planning system is showcased using multiple prostate and head & neck cancer cases, where we compared GPT-RadPlan results to clinical plans. In all cases, GPT-RadPlan either outperformed or matched the clinical plans, demonstrating superior target coverage and organ-at-risk sparing. Consistently satisfying the dosimetric objectives in the clinical protocol, GPT-RadPlan represents the first multimodal large language model agent that mimics the behaviors of human planners in radiation oncology clinics, achieving remarkable results in automating the treatment planning process without the need for additional training.

10.
Diagnostics (Basel) ; 14(14), 2024 Jul 17.
Article in English | MEDLINE | ID: mdl-39061675

ABSTRACT

Background: Segmenting computed tomography (CT) is crucial in various clinical applications, such as tailoring personalized cardiac ablation for managing cardiac arrhythmias. Automating segmentation through machine learning (ML) is hindered by the necessity for large, labeled training data, which can be challenging to obtain. This article proposes a novel approach for automated, robust labeling using domain knowledge to achieve high-performance segmentation by ML from a small training set. The approach, the domain knowledge-encoding (DOKEN) algorithm, reduces the reliance on large training datasets by encoding cardiac geometry while automatically labeling the training set. The method was validated in a hold-out dataset of CT results from an atrial fibrillation (AF) ablation study. Methods: The DOKEN algorithm parses left atrial (LA) structures, extracts "anatomical knowledge" by leveraging digital LA models (available publicly), and then applies this knowledge to achieve high ML segmentation performance with a small number of training samples. The DOKEN-labeled training set was used to train an nnU-Net deep neural network (DNN) model for segmenting cardiac CT in N = 20 patients. Subsequently, the method was tested in a hold-out set with N = 100 patients (five times larger than training set) who underwent AF ablation. Results: The DOKEN algorithm integrated with the nnU-Net model achieved high segmentation performance with few training samples, with a training to test ratio of 1:5. The Dice score of the DOKEN-enhanced model was 96.7% (IQR: 95.3% to 97.7%), with a median error in surface distance of boundaries of 1.51 mm (IQR: 0.72 to 3.12) and a mean centroid-boundary distance of 1.16 mm (95% CI: -4.57 to 6.89), similar to expert results (r = 0.99; p < 0.001). In digital hearts, the novel DOKEN approach segmented the LA structures with a mean difference for the centroid-boundary distances of -0.27 mm (95% CI: -3.87 to 3.33; r = 0.99; p < 0.0001). Conclusions: The proposed novel domain knowledge-encoding algorithm was able to perform the segmentation of six substructures of the LA, reducing the need for large training data sets. The combination of domain knowledge encoding and a machine learning approach could reduce the dependence of ML on large training datasets and could potentially be applied to AF ablation procedures and extended in the future to other imaging, 3D printing, and data science applications.
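
The Dice score reported in this abstract measures the overlap between a predicted segmentation mask and a reference mask. The sketch below shows the standard definition on flattened binary masks. It is an illustration only and is not taken from the DOKEN or nnU-Net code; the function name `dice_score` and the toy masks are assumptions.

```python
def dice_score(pred, truth):
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary masks.

    Masks are flat sequences of 0/1 values with one entry per voxel.
    Two empty masks are treated as a perfect match (score 1.0).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical 8-voxel masks for illustration.
pred = [0, 1, 1, 1, 0, 0, 1, 0]
truth = [0, 1, 1, 0, 0, 0, 1, 1]
dice = dice_score(pred, truth)
print(f"Dice = {dice:.2f}")
```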

11.
Nat Biomed Eng ; 2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38898173

ABSTRACT

In pathology, the deployment of artificial intelligence (AI) in clinical settings is constrained by limitations in data collection and in model transparency and interpretability. Here we describe a digital pathology framework, nuclei.io, that incorporates active learning and human-in-the-loop real-time feedback for the rapid creation of diverse datasets and models. We validate the effectiveness of the framework via two crossover user studies that leveraged collaboration between the AI and the pathologist, including the identification of plasma cells in endometrial biopsies and the detection of colorectal cancer metastasis in lymph nodes. In both studies, nuclei.io yielded considerable diagnostic performance improvements. Collaboration between clinicians and AI will aid digital pathology by enhancing accuracies and efficiencies.

12.
Bioinformatics ; 40(7), 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38913862

ABSTRACT

MOTIVATION: The emergence of large chemical repositories and combinatorial chemical spaces, coupled with high-throughput docking and generative AI, has greatly expanded the chemical diversity of small molecules for drug discovery. Selecting compounds for experimental validation requires filtering these molecules based on favourable druglike properties, such as Absorption, Distribution, Metabolism, Excretion, and Toxicity (ADMET). RESULTS: We developed ADMET-AI, a machine learning platform that provides fast and accurate ADMET predictions both as a website and as a Python package. ADMET-AI has the highest average rank on the TDC ADMET Leaderboard, and it is currently the fastest web-based ADMET predictor, with a 45% reduction in time compared to the next fastest public ADMET web server. ADMET-AI can also be run locally with predictions for one million molecules taking just 3.1 h. AVAILABILITY AND IMPLEMENTATION: The ADMET-AI platform is freely available both as a web server at admet.ai.greenstonebio.com and as an open-source Python package for local batch prediction at github.com/swansonk14/admet_ai (also archived on Zenodo at doi.org/10.5281/zenodo.10372930). All data and models are archived on Zenodo at doi.org/10.5281/zenodo.10372418.


Subject(s)
Drug Discovery , Machine Learning , Software , Drug Discovery/methods , Small Molecule Libraries/chemistry
13.
Bioinformatics ; 40(Suppl 1): i521-i528, 2024 Jun 28.
Article in English | MEDLINE | ID: mdl-38940132

ABSTRACT

MOTIVATION: Spatially resolved single-cell transcriptomics have provided unprecedented insights into gene expression in situ, particularly in the context of cell interactions or organization of tissues. However, current technologies for profiling spatial gene expression at single-cell resolution are generally limited to the measurement of a small number of genes. To address this limitation, several algorithms have been developed to impute or predict the expression of additional genes that were not present in the measured gene panel. Current algorithms do not leverage the rich spatial and gene relational information in spatial transcriptomics. To improve spatial gene expression predictions, we introduce Spatial Propagation and Reinforcement of Imputed Transcript Expression (SPRITE) as a meta-algorithm that processes predictions obtained from existing methods by propagating information across gene correlation networks and spatial neighborhood graphs. RESULTS: SPRITE improves spatial gene expression predictions across multiple spatial transcriptomics datasets. Furthermore, SPRITE predicted spatial gene expression leads to improved clustering, visualization, and classification of cells. SPRITE can be used in spatial transcriptomics data analysis to improve inferences based on predicted gene expression. AVAILABILITY AND IMPLEMENTATION: The SPRITE software package is available at https://github.com/sunericd/SPRITE. Code for generating experiments and analyses in the manuscript is available at https://github.com/sunericd/sprite-figures-and-analyses.


Subject(s)
Algorithms , Gene Expression Profiling , Gene Regulatory Networks , Software , Gene Expression Profiling/methods , Single-Cell Analysis/methods , Humans , Transcriptome
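
The SPRITE abstract above describes propagating predicted expression across spatial neighborhood graphs. As a rough illustration of that general idea only, the sketch below repeatedly blends each cell's original prediction with the mean of its neighbors' current values. This is not the SPRITE implementation (see the linked repository for that); the function name `propagate`, the mixing weight `alpha`, the iteration count, and the toy graph are all assumptions.

```python
def propagate(values, neighbors, alpha=0.5, n_iter=10):
    """Smooth per-cell predictions over a neighbor graph.

    values    : initial predicted expression of one gene, one entry per cell
    neighbors : dict mapping a cell index to a list of neighboring cell indices
    alpha     : hypothetical weight given to the neighbor mean (0 = no smoothing)
    """
    current = list(values)
    for _ in range(n_iter):
        updated = []
        for i, initial in enumerate(values):
            nbrs = neighbors.get(i, [])
            if nbrs:
                nbr_mean = sum(current[j] for j in nbrs) / len(nbrs)
                # Keep part of the original prediction, mix in the neighborhood.
                updated.append((1 - alpha) * initial + alpha * nbr_mean)
            else:
                # Isolated cells keep their original prediction.
                updated.append(initial)
        current = updated
    return current


# Hypothetical toy tissue: cells 0-1-2 form a chain, cell 3 has no neighbors.
values = [1.0, 0.0, 1.0, 5.0]
neighbors = {0: [1], 1: [0, 2], 2: [1], 3: []}
smoothed = propagate(values, neighbors)
print(smoothed)
```

Anchoring each update to the original prediction, rather than only to the current values, keeps the result from collapsing to a single graph-wide average as the number of iterations grows.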
14.
Cancer Cell ; 42(6): 915-918, 2024 Jun 10.
Article in English | MEDLINE | ID: mdl-38861926

ABSTRACT

Experts discuss the challenges and opportunities of using artificial intelligence (AI) to study the evolution of cancer cells and their microenvironment, improve diagnosis, predict treatment response, and ensure responsible implementation in the clinic.


Subject(s)
Artificial Intelligence , Neoplasms , Tumor Microenvironment , Humans , Neoplasms/therapy , Neoplasms/genetics , Neoplasms/pathology
15.
BJA Open ; 10: 100280, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38764485

ABSTRACT

Background: Patients are increasingly using artificial intelligence (AI) chatbots to seek answers to medical queries. Methods: Ten frequently asked questions in anaesthesia were posed to three AI chatbots: ChatGPT4 (OpenAI), Bard (Google), and Bing Chat (Microsoft). Each chatbot's answers were evaluated in a randomised, blinded order by five residency programme directors from 15 medical institutions in the USA. Three medical content quality categories (accuracy, comprehensiveness, safety) and three communication quality categories (understandability, empathy/respect, and ethics) were scored between 1 and 5 (1 representing worst, 5 representing best). Results: ChatGPT4 and Bard outperformed Bing Chat (median [inter-quartile range] scores: 4 [3-4], 4 [3-4], and 3 [2-4], respectively; P<0.001 with all metrics combined). All AI chatbots performed poorly in accuracy (score of ≥4 by 58%, 48%, and 36% of experts for ChatGPT4, Bard, and Bing Chat, respectively), comprehensiveness (score ≥4 by 42%, 30%, and 12% of experts for ChatGPT4, Bard, and Bing Chat, respectively), and safety (score ≥4 by 50%, 40%, and 28% of experts for ChatGPT4, Bard, and Bing Chat, respectively). Notably, answers from ChatGPT4, Bard, and Bing Chat differed statistically in comprehensiveness (ChatGPT4, 3 [2-4] vs Bing Chat, 2 [2-3], P<0.001; and Bard 3 [2-4] vs Bing Chat, 2 [2-3], P=0.002). All large language model chatbots performed well with no statistical difference for understandability (P=0.24), empathy (P=0.032), and ethics (P=0.465). Conclusions: In answering anaesthesia patient frequently asked questions, the chatbots perform well on communication metrics but are suboptimal for medical content metrics. Overall, ChatGPT4 and Bard were comparable to each other, both outperforming Bing Chat.

16.
bioRxiv ; 2024 Mar 21.
Article in English | MEDLINE | ID: mdl-38562882

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) has transformed our understanding of cell fate in developmental systems. However, identifying the molecular hallmarks of potency - the capacity of a cell to differentiate into other cell types - has remained challenging. Here, we introduce CytoTRACE 2, an interpretable deep learning framework for characterizing potency and differentiation states on an absolute scale from scRNA-seq data. Across 31 human and mouse scRNA-seq datasets encompassing 28 tissue types, CytoTRACE 2 outperformed existing methods for recovering experimentally determined potency levels and differentiation states covering the entire range of cellular ontogeny. Moreover, it reconstructed the temporal hierarchy of mouse embryogenesis across 62 timepoints; identified pan-tissue expression programs that discriminate major potency levels; and facilitated discovery of cellular phenotypes in cancer linked to survival and immunotherapy resistance. Our results illuminate a fundamental feature of cell biology and provide a broadly applicable platform for delineating single-cell differentiation landscapes in health and disease.

18.
Cell Rep Med ; 5(3): 101444, 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38428426

ABSTRACT

Patients with cancer may be given treatments that are not officially approved (off-label) or recommended by guidelines (off-guideline). Here we present a data science framework to systematically characterize off-label and off-guideline usages using real-world data from de-identified electronic health records (EHR). We analyze treatment patterns in 165,912 US patients with 14 common cancer types. We find that 18.6% and 4.4% of patients have received at least one line of off-label and off-guideline cancer drugs, respectively. Patients with worse performance status, in later lines, or treated at academic hospitals are significantly more likely to receive off-label and off-guideline drugs. To quantify how predictable off-guideline usage is, we developed machine learning models to predict which drug a patient is likely to receive based on their clinical characteristics and previous treatments. Finally, we demonstrate that our systematic analyses generate hypotheses about patients' response to treatments.


Subject(s)
Antineoplastic Agents , Neoplasms , Humans , Off-Label Use , Neoplasms/drug therapy , Neoplasms/epidemiology , Antineoplastic Agents/therapeutic use
19.
NPJ Digit Med ; 7(1): 63, 2024 Mar 08.
Article in English | MEDLINE | ID: mdl-38459205

ABSTRACT

Despite the importance of informed consent in healthcare, the readability and specificity of consent forms often impede patients' comprehension. This study investigates the use of GPT-4 to simplify surgical consent forms and introduces an AI-human expert collaborative approach to validate content appropriateness. Consent forms from multiple institutions were assessed for readability and simplified using GPT-4, with pre- and post-simplification readability metrics compared using nonparametric tests. Independent reviews by medical authors and a malpractice defense attorney were conducted. Finally, GPT-4's potential for generating de novo procedure-specific consent forms was assessed, with forms evaluated using a validated 8-item rubric and expert subspecialty surgeon review. Analysis of 15 academic medical centers' consent forms revealed significant reductions in average reading time, word rarity, and passive sentence frequency (all P < 0.05) following GPT-4-facilitated simplification. Readability improved from an average college freshman to an 8th-grade level (P = 0.004), matching the average American's reading level. Medical and legal sufficiency consistency was confirmed. GPT-4 generated procedure-specific consent forms for five varied surgical procedures at an average 6th-grade reading level. These forms received perfect scores on a standardized consent form rubric and withstood scrutiny upon expert subspecialty surgeon review. This study demonstrates the first AI-human expert collaboration to enhance surgical consent forms, significantly improving readability without sacrificing clinical detail. Our framework could be extended to other patient communication materials, emphasizing clear communication and mitigating disparities related to health literacy barriers.

20.
Nat Commun ; 15(1): 1059, 2024 Feb 05.
Article in English | MEDLINE | ID: mdl-38316764

ABSTRACT

The ability to computationally generate novel yet physically foldable protein structures could lead to new biological discoveries and new treatments targeting yet incurable diseases. Despite recent advances in protein structure prediction, directly generating diverse, novel protein structures from neural networks remains difficult. In this work, we present a diffusion-based generative model that generates protein backbone structures via a procedure inspired by the natural folding process. We describe a protein backbone structure as a sequence of angles capturing the relative orientation of the constituent backbone atoms, and generate structures by denoising from a random, unfolded state towards a stable folded structure. Not only does this mirror how proteins natively twist into energetically favorable conformations, the inherent shift and rotational invariance of this representation crucially alleviates the need for more complex equivariant networks. We train a denoising diffusion probabilistic model with a simple transformer backbone and demonstrate that our resulting model unconditionally generates highly realistic protein structures with complexity and structural patterns akin to those of naturally-occurring proteins. As a useful resource, we release an open-source codebase and trained models for protein structure diffusion.


Subject(s)
Protein Folding , Proteins , Proteins/metabolism , Neural Networks, Computer , Protein Conformation
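
The protein diffusion abstract above represents a backbone as a sequence of angles and generates structures by denoising from a random state. The sketch below illustrates only the forward (noising) direction on such an angle sequence, using the standard denoising diffusion form x_t = sqrt(a) * x_0 + sqrt(1 - a) * eps and then wrapping the result back into [-pi, pi). The wrapping step, the single `alpha_bar` value, and the example angles are assumptions for illustration; the abstract does not specify the published model's noise schedule or angle handling.

```python
import math
import random


def wrap_angle(theta):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return (theta + math.pi) % (2 * math.pi) - math.pi


def add_noise(angles, alpha_bar, rng):
    """Draw a noised copy of an angle sequence at one diffusion step.

    alpha_bar is the cumulative signal fraction at that step:
    1.0 leaves the angles unchanged, values near 0 give almost pure noise.
    """
    scale = math.sqrt(alpha_bar)
    sigma = math.sqrt(1.0 - alpha_bar)
    return [wrap_angle(scale * a + sigma * rng.gauss(0.0, 1.0)) for a in angles]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
backbone_angles = [-1.05, -0.78, 3.10, -1.10, 2.40, 3.12]  # made-up values, radians
noised = add_noise(backbone_angles, alpha_bar=0.5, rng=rng)
print([round(a, 3) for a in noised])
```

Because only relative angles are stored, shifting or rotating the whole structure leaves this representation unchanged, which is the invariance the abstract refers to.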