Results 1 - 20 of 141
1.
ArXiv ; 2024 Jun 20.
Article in English | MEDLINE | ID: mdl-38947933

ABSTRACT

Feature attribution, the ability to localize regions of the input data that are relevant for classification, is an important capability for ML models in scientific and biomedical domains. Current methods for feature attribution, which rely on "explaining" the predictions of end-to-end classifiers, suffer from imprecise feature localization and are inadequate for use with small sample sizes and high-dimensional datasets due to computational challenges. We introduce prospector heads, an efficient and interpretable alternative to explanation-based attribution methods that can be applied to any encoder and any data modality. Through experiments on sequences (text), images (pathology), and graphs (protein structures), we show that prospector heads generalize across modalities and outperform baseline attribution methods by up to 26.3 points in mean localization AUPRC. We also demonstrate how prospector heads enable improved interpretation and discovery of class-specific patterns in input data. Through their high performance, flexibility, and generalizability, prospectors provide a framework for improving trust and transparency for ML models in complex domains.
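To make the reported metric concrete, the following is a minimal sketch (not the authors' code) of how mean localization AUPRC can be computed from per-element attribution scores and binary relevance masks; the function and variable names are illustrative.

```python
import numpy as np
from sklearn.metrics import average_precision_score

def mean_localization_auprc(masks, scores):
    """Mean area under the precision-recall curve for feature localization.

    masks  : list of 1-D binary arrays, 1 where the region is truly relevant
    scores : list of 1-D float arrays, attribution score per input element
    """
    auprcs = [
        average_precision_score(m, s)
        for m, s in zip(masks, scores)
        if m.any()  # AUPRC is undefined when no element is relevant
    ]
    return float(np.mean(auprcs))

# Toy example with invented data.
rng = np.random.default_rng(0)
masks = [rng.integers(0, 2, size=100) for _ in range(5)]
scores = [m + rng.normal(0, 0.5, size=100) for m in masks]
print(mean_localization_auprc(masks, scores))
```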

2.
ArXiv ; 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-39010876

ABSTRACT

Radiotherapy treatment planning is a time-consuming and potentially subjective process that requires the iterative adjustment of model parameters to balance multiple conflicting objectives. Recent advancements in large foundation models offer promising avenues for addressing the challenges in planning and clinical decision-making. This study introduces GPT-RadPlan, a fully automated treatment planning framework that harnesses prior radiation oncology knowledge encoded in multimodal large language models, such as GPT-4Vision (GPT-4V) from OpenAI. GPT-RadPlan is made aware of planning protocols as context and acts as an expert human planner capable of guiding the treatment planning process. Via in-context learning, we incorporate clinical protocols for various disease sites as prompts to enable GPT-4V to acquire treatment planning domain knowledge. The resulting GPT-RadPlan agent is integrated into our in-house inverse treatment planning system through an API. The efficacy of the automated planning system is showcased using multiple prostate and head & neck cancer cases, where we compared GPT-RadPlan results to clinical plans. In all cases, GPT-RadPlan either outperformed or matched the clinical plans, demonstrating superior target coverage and organ-at-risk sparing. Consistently satisfying the dosimetric objectives in the clinical protocol, GPT-RadPlan represents the first multimodal large language model agent that mimics the behaviors of human planners in radiation oncology clinics, achieving remarkable results in automating the treatment planning process without the need for additional training.
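The in-context-learning setup described above can be illustrated with a short sketch. This is not the GPT-RadPlan implementation: the protocol excerpt, dose constraints, model name, and message structure are invented for illustration, and only the standard OpenAI chat-completions call is assumed.

```python
from openai import OpenAI  # assumes openai>=1.0 is installed and OPENAI_API_KEY is set

# Hypothetical excerpt of a planning protocol used as in-context knowledge.
PROSTATE_PROTOCOL = """
Target: PTV D95 >= 98% of prescription (70 Gy in 28 fractions).
Rectum: V65Gy < 17%, V40Gy < 35%.
Bladder: V65Gy < 25%, V40Gy < 50%.
"""

def build_messages(protocol: str, plan_summary: str) -> list[dict]:
    """Assemble an in-context-learning prompt: protocol as context, current
    plan metrics as the query, asking for objective adjustments."""
    return [
        {"role": "system",
         "content": "You act as an expert radiotherapy treatment planner."},
        {"role": "user",
         "content": f"Clinical protocol:\n{protocol}\n"
                    f"Current plan dose metrics:\n{plan_summary}\n"
                    "Suggest how to adjust the optimization objectives."},
    ]

client = OpenAI()
messages = build_messages(PROSTATE_PROTOCOL, "PTV D95 = 96.2%, rectum V65Gy = 19%")
response = client.chat.completions.create(model="gpt-4o", messages=messages)
print(response.choices[0].message.content)
```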

3.
Nat Biomed Eng ; 2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38898173

ABSTRACT

In pathology, the deployment of artificial intelligence (AI) in clinical settings is constrained by limitations in data collection and in model transparency and interpretability. Here we describe a digital pathology framework, nuclei.io, that incorporates active learning and human-in-the-loop real-time feedback for the rapid creation of diverse datasets and models. We validate the effectiveness of the framework via two crossover user studies that leveraged collaboration between the AI and the pathologist, including the identification of plasma cells in endometrial biopsies and the detection of colorectal cancer metastasis in lymph nodes. In both studies, nuclei.io yielded considerable diagnostic performance improvements. Collaboration between clinicians and AI will aid digital pathology by enhancing accuracies and efficiencies.
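As a rough illustration of the active-learning component mentioned above (and not the nuclei.io implementation), the sketch below runs uncertainty sampling with a generic classifier on synthetic data, with the pathologist's labeling step replaced by revealing ground-truth labels.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy stand-in for cell-level feature vectors and pathologist labels.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
seed = np.concatenate([np.where(y == c)[0][:10] for c in np.unique(y)])
labeled = seed.tolist()                              # small seed set "annotated" up front
unlabeled = [i for i in range(len(X)) if i not in set(labeled)]

model = LogisticRegression(max_iter=1000)
for round_ in range(5):
    model.fit(X[labeled], y[labeled])
    # Uncertainty sampling: query the cells the current model is least sure about.
    proba = model.predict_proba(X[unlabeled])
    uncertainty = 1.0 - proba.max(axis=1)
    query = [unlabeled[i] for i in np.argsort(-uncertainty)[:20]]
    # A human-in-the-loop system would ask the pathologist to label `query` here;
    # this toy sketch simply reveals the ground-truth labels instead.
    labeled.extend(query)
    unlabeled = [i for i in unlabeled if i not in set(query)]
    print(f"round {round_}: {len(labeled)} labels, accuracy {model.score(X, y):.3f}")
```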

4.
Bioinformatics ; 40(Supplement_1): i521-i528, 2024 Jun 28.
Article in English | MEDLINE | ID: mdl-38940132

ABSTRACT

MOTIVATION: Spatially resolved single-cell transcriptomics has provided unprecedented insights into gene expression in situ, particularly in the context of cell interactions and tissue organization. However, current technologies for profiling spatial gene expression at single-cell resolution are generally limited to the measurement of a small number of genes. To address this limitation, several algorithms have been developed to impute or predict the expression of additional genes that were not present in the measured gene panel. These algorithms, however, do not leverage the rich spatial and gene relational information in spatial transcriptomics. To improve spatial gene expression predictions, we introduce Spatial Propagation and Reinforcement of Imputed Transcript Expression (SPRITE), a meta-algorithm that processes predictions obtained from existing methods by propagating information across gene correlation networks and spatial neighborhood graphs. RESULTS: SPRITE improves spatial gene expression predictions across multiple spatial transcriptomics datasets. Furthermore, the spatial gene expression predicted by SPRITE leads to improved clustering, visualization, and classification of cells. SPRITE can be used in spatial transcriptomics data analysis to improve inferences based on predicted gene expression. AVAILABILITY AND IMPLEMENTATION: The SPRITE software package is available at https://github.com/sunericd/SPRITE. Code for generating the experiments and analyses in the manuscript is available at https://github.com/sunericd/sprite-figures-and-analyses.
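For intuition about the propagation idea, here is a simplified sketch that smooths imputed expression over a spatial k-nearest-neighbour graph only; SPRITE itself also uses gene correlation networks and a reinforcement step, so this is an illustration rather than the published algorithm.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

def propagate_over_spatial_graph(pred, coords, k=10, alpha=0.5, n_iter=5):
    """Smooth imputed expression over a spatial k-nearest-neighbour graph.

    pred   : (cells x genes) matrix of imputed expression from any base method
    coords : (cells x 2) spatial coordinates
    alpha  : weight on the neighbourhood average at each iteration
    """
    A = kneighbors_graph(coords, n_neighbors=k, mode="connectivity").toarray()
    W = A / A.sum(axis=1, keepdims=True)          # row-normalised adjacency
    smoothed = pred.copy()
    for _ in range(n_iter):
        smoothed = (1 - alpha) * pred + alpha * (W @ smoothed)
    return smoothed

# Toy data: 200 cells, 50 imputed genes.
rng = np.random.default_rng(0)
coords = rng.uniform(size=(200, 2))
pred = rng.normal(size=(200, 50))
print(propagate_over_spatial_graph(pred, coords).shape)
```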


Subject(s)
Algorithms , Gene Expression Profiling , Gene Regulatory Networks , Software , Gene Expression Profiling/methods , Single-Cell Analysis/methods , Humans , Transcriptome
5.
Bioinformatics ; 40(7)2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38913862

ABSTRACT

MOTIVATION: The emergence of large chemical repositories and combinatorial chemical spaces, coupled with high-throughput docking and generative AI, has greatly expanded the chemical diversity of small molecules for drug discovery. Selecting compounds for experimental validation requires filtering these molecules based on favourable drug-like properties, such as Absorption, Distribution, Metabolism, Excretion, and Toxicity (ADMET). RESULTS: We developed ADMET-AI, a machine learning platform that provides fast and accurate ADMET predictions both as a website and as a Python package. ADMET-AI has the highest average rank on the TDC ADMET Leaderboard, and it is currently the fastest web-based ADMET predictor, with a 45% reduction in time compared to the next fastest public ADMET web server. ADMET-AI can also be run locally, with predictions for one million molecules taking just 3.1 h. AVAILABILITY AND IMPLEMENTATION: The ADMET-AI platform is freely available both as a web server at admet.ai.greenstonebio.com and as an open-source Python package for local batch prediction at github.com/swansonk14/admet_ai (also archived on Zenodo at doi.org/10.5281/zenodo.10372930). All data and models are archived on Zenodo at doi.org/10.5281/zenodo.10372418.
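A hedged sketch of local batch prediction with the Python package is shown below; it assumes the package exposes an ADMETModel class with a predict method over SMILES strings, so the actual interface documented in the repository README should be treated as authoritative.

```python
# Hedged usage sketch: assumes `pip install admet-ai` provides an ADMETModel
# class with a predict() method over SMILES strings. If the interface differs,
# follow the README at github.com/swansonk14/admet_ai.
from admet_ai import ADMETModel

smiles = [
    "CC(=O)OC1=CC=CC=C1C(=O)O",      # aspirin
    "CN1C=NC2=C1C(=O)N(C)C(=O)N2C",  # caffeine
]

model = ADMETModel()
predictions = model.predict(smiles=smiles)  # ADMET property predictions per molecule
print(predictions)
```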


Subject(s)
Drug Discovery , Machine Learning , Software , Drug Discovery/methods , Small Molecule Libraries/chemistry
6.
Cancer Cell ; 42(6): 915-918, 2024 Jun 10.
Article in English | MEDLINE | ID: mdl-38861926

ABSTRACT

Experts discuss the challenges and opportunities of using artificial intelligence (AI) to study the evolution of cancer cells and their microenvironment, improve diagnosis, predict treatment response, and ensure responsible implementation in the clinic.


Subject(s)
Artificial Intelligence , Neoplasms , Tumor Microenvironment , Humans , Neoplasms/therapy , Neoplasms/genetics , Neoplasms/pathology
7.
BJA Open ; 10: 100280, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38764485

ABSTRACT

Background: Patients are increasingly using artificial intelligence (AI) chatbots to seek answers to medical queries. Methods: Ten frequently asked questions in anaesthesia were posed to three AI chatbots: ChatGPT4 (OpenAI), Bard (Google), and Bing Chat (Microsoft). Each chatbot's answers were evaluated in a randomised, blinded order by five residency programme directors from 15 medical institutions in the USA. Three medical content quality categories (accuracy, comprehensiveness, safety) and three communication quality categories (understandability, empathy/respect, and ethics) were scored between 1 and 5 (1 representing worst, 5 representing best). Results: ChatGPT4 and Bard outperformed Bing Chat (median [inter-quartile range] scores: 4 [3-4], 4 [3-4], and 3 [2-4], respectively; P<0.001 with all metrics combined). All AI chatbots performed poorly in accuracy (score of ≥4 by 58%, 48%, and 36% of experts for ChatGPT4, Bard, and Bing Chat, respectively), comprehensiveness (score ≥4 by 42%, 30%, and 12% of experts for ChatGPT4, Bard, and Bing Chat, respectively), and safety (score ≥4 by 50%, 40%, and 28% of experts for ChatGPT4, Bard, and Bing Chat, respectively). Notably, answers from ChatGPT4, Bard, and Bing Chat differed statistically in comprehensiveness (ChatGPT4, 3 [2-4] vs Bing Chat, 2 [2-3], P<0.001; and Bard 3 [2-4] vs Bing Chat, 2 [2-3], P=0.002). All large language model chatbots performed well with no statistical difference for understandability (P=0.24), empathy (P=0.032), and ethics (P=0.465). Conclusions: In answering anaesthesia patient frequently asked questions, the chatbots perform well on communication metrics but are suboptimal for medical content metrics. Overall, ChatGPT4 and Bard were comparable to each other, both outperforming Bing Chat.
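The kind of nonparametric comparison reported above can be sketched as follows, using invented Likert ratings; this is illustrative and not the study's analysis code.

```python
import numpy as np
from scipy.stats import kruskal, mannwhitneyu

rng = np.random.default_rng(0)
# Hypothetical 1-5 Likert ratings from reviewers for one category (e.g. accuracy).
chatgpt4 = rng.integers(3, 6, size=50)
bard = rng.integers(3, 6, size=50)
bing = rng.integers(2, 5, size=50)

# Omnibus test across the three chatbots.
stat, p = kruskal(chatgpt4, bard, bing)
print(f"Kruskal-Wallis: H={stat:.2f}, p={p:.4f}")

# Pairwise follow-up comparisons (two-sided Mann-Whitney U).
for name, pair in [("ChatGPT4 vs Bing", (chatgpt4, bing)),
                   ("Bard vs Bing", (bard, bing)),
                   ("ChatGPT4 vs Bard", (chatgpt4, bard))]:
    u, p = mannwhitneyu(*pair, alternative="two-sided")
    print(f"{name}: U={u:.1f}, p={p:.4f}")
```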

8.
bioRxiv ; 2024 Mar 21.
Article in English | MEDLINE | ID: mdl-38562882

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) has transformed our understanding of cell fate in developmental systems. However, identifying the molecular hallmarks of potency - the capacity of a cell to differentiate into other cell types - has remained challenging. Here, we introduce CytoTRACE 2, an interpretable deep learning framework for characterizing potency and differentiation states on an absolute scale from scRNA-seq data. Across 31 human and mouse scRNA-seq datasets encompassing 28 tissue types, CytoTRACE 2 outperformed existing methods for recovering experimentally determined potency levels and differentiation states covering the entire range of cellular ontogeny. Moreover, it reconstructed the temporal hierarchy of mouse embryogenesis across 62 timepoints; identified pan-tissue expression programs that discriminate major potency levels; and facilitated discovery of cellular phenotypes in cancer linked to survival and immunotherapy resistance. Our results illuminate a fundamental feature of cell biology and provide a broadly applicable platform for delineating single-cell differentiation landscapes in health and disease.

10.
Cell Rep Med ; 5(3): 101444, 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38428426

ABSTRACT

Patients with cancer may be given treatments that are not officially approved (off-label) or recommended by guidelines (off-guideline). Here we present a data science framework to systematically characterize off-label and off-guideline usages using real-world data from de-identified electronic health records (EHR). We analyze treatment patterns in 165,912 US patients with 14 common cancer types. We find that 18.6% and 4.4% of patients have received at least one line of off-label and off-guideline cancer drugs, respectively. Patients with worse performance status, in later lines, or treated at academic hospitals are significantly more likely to receive off-label and off-guideline drugs. To quantify how predictable off-guideline usage is, we developed machine learning models to predict which drug a patient is likely to receive based on their clinical characteristics and previous treatments. Finally, we demonstrate that our systematic analyses generate hypotheses about patients' response to treatments.
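A generic sketch of the prediction task described above (which drug a patient receives next, given clinical characteristics) is shown below with invented features; the real framework uses much richer de-identified EHR data and is not reproduced here.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical EHR-derived features and drug labels.
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "cancer_type": rng.choice(["lung", "breast", "colorectal"], n),
    "line_of_therapy": rng.integers(1, 5, n),
    "ecog_performance_status": rng.integers(0, 4, n),
    "academic_center": rng.integers(0, 2, n),
    "next_drug": rng.choice(["drug_A", "drug_B", "drug_C"], n),
})

X, y = df.drop(columns="next_drug"), df["next_drug"]
pre = ColumnTransformer(
    [("cat", OneHotEncoder(handle_unknown="ignore"), ["cancer_type"])],
    remainder="passthrough",
)
clf = Pipeline([("pre", pre), ("gb", GradientBoostingClassifier())])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf.fit(X_tr, y_tr)
print("held-out accuracy:", round(clf.score(X_te, y_te), 3))
```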


Subject(s)
Antineoplastic Agents , Neoplasms , Humans , Off-Label Use , Neoplasms/drug therapy , Neoplasms/epidemiology , Antineoplastic Agents/therapeutic use
11.
NPJ Digit Med ; 7(1): 63, 2024 Mar 08.
Article in English | MEDLINE | ID: mdl-38459205

ABSTRACT

Despite the importance of informed consent in healthcare, the readability and specificity of consent forms often impede patients' comprehension. This study investigates the use of GPT-4 to simplify surgical consent forms and introduces an AI-human expert collaborative approach to validate content appropriateness. Consent forms from multiple institutions were assessed for readability and simplified using GPT-4, with pre- and post-simplification readability metrics compared using nonparametric tests. Independent reviews by medical authors and a malpractice defense attorney were conducted. Finally, GPT-4's potential for generating de novo procedure-specific consent forms was assessed, with forms evaluated using a validated 8-item rubric and expert subspecialty surgeon review. Analysis of 15 academic medical centers' consent forms revealed significant reductions in average reading time, word rarity, and passive sentence frequency (all P < 0.05) following GPT-4-facilitated simplification. Readability improved from an average college freshman to an 8th-grade level (P = 0.004), matching the average American's reading level. The independent medical and legal reviews confirmed that the simplified forms remained sufficient. GPT-4 generated procedure-specific consent forms for five varied surgical procedures at an average 6th-grade reading level. These forms received perfect scores on a standardized consent form rubric and withstood scrutiny upon expert subspecialty surgeon review. This study demonstrates the first AI-human expert collaboration to enhance surgical consent forms, significantly improving readability without sacrificing clinical detail. Our framework could be extended to other patient communication materials, emphasizing clear communication and mitigating disparities related to health literacy barriers.
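As a minimal illustration of a readability check (the study used its own metric suite, including reading time, word rarity, and passive-sentence frequency), the sketch below computes a generic grade-level score on an invented before/after pair with the textstat package.

```python
# Minimal readability check with the `textstat` package (pip install textstat).
# The before/after texts are invented; the study used its own metric suite.
import textstat

original = ("The patient hereby authorizes the attending surgeon and such "
            "assistants as may be selected to perform the aforementioned "
            "procedure, acknowledging the risks enumerated herein.")
simplified = ("You give your surgeon and their team permission to do this "
              "surgery. You understand the risks listed on this form.")

for label, text in [("original", original), ("simplified", simplified)]:
    grade = textstat.flesch_kincaid_grade(text)
    print(f"{label}: Flesch-Kincaid grade level = {grade:.1f}")
```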

12.
Nat Commun ; 15(1): 1059, 2024 Feb 05.
Article in English | MEDLINE | ID: mdl-38316764

ABSTRACT

The ability to computationally generate novel yet physically foldable protein structures could lead to new biological discoveries and new treatments targeting yet incurable diseases. Despite recent advances in protein structure prediction, directly generating diverse, novel protein structures from neural networks remains difficult. In this work, we present a diffusion-based generative model that generates protein backbone structures via a procedure inspired by the natural folding process. We describe a protein backbone structure as a sequence of angles capturing the relative orientation of the constituent backbone atoms, and generate structures by denoising from a random, unfolded state towards a stable folded structure. Not only does this mirror how proteins natively twist into energetically favorable conformations, but the inherent shift and rotational invariance of this representation also crucially alleviates the need for more complex equivariant networks. We train a denoising diffusion probabilistic model with a simple transformer backbone and demonstrate that our resulting model unconditionally generates highly realistic protein structures with complexity and structural patterns akin to those of naturally occurring proteins. As a useful resource, we release an open-source codebase and trained models for protein structure diffusion.
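To illustrate the angle-based representation, here is a toy forward-noising step of a diffusion process over backbone angles with circular wrapping; it is a schematic stand-in, not the authors' model or noise schedule.

```python
import numpy as np

def wrap(theta):
    """Wrap angles to [-pi, pi), since backbone angles live on a circle."""
    return (theta + np.pi) % (2 * np.pi) - np.pi

def forward_noise(angles0, t, betas):
    """One draw from q(x_t | x_0) of a DDPM applied to angles, with wrapping.

    angles0 : (residues x n_angles) clean backbone angles in radians
    t       : integer timestep
    betas   : (T,) noise schedule
    """
    alphas_bar = np.cumprod(1.0 - betas)
    mean = np.sqrt(alphas_bar[t]) * angles0
    noise = np.random.default_rng(0).normal(size=angles0.shape)
    return wrap(mean + np.sqrt(1.0 - alphas_bar[t]) * noise)

# Toy example: 128 residues, 6 angles per residue, linear schedule of 1000 steps.
angles0 = wrap(np.random.default_rng(1).uniform(-np.pi, np.pi, size=(128, 6)))
betas = np.linspace(1e-4, 0.02, 1000)
noisy = forward_noise(angles0, t=500, betas=betas)
print(noisy.shape)
```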


Subject(s)
Protein Folding , Proteins , Proteins/metabolism , Neural Networks, Computer , Protein Conformation
13.
Proc Natl Acad Sci U S A ; 121(10): e2313719121, 2024 Mar 05.
Article in English | MEDLINE | ID: mdl-38416677

ABSTRACT

Single-cell data integration can provide a comprehensive molecular view of cells, and many algorithms have been developed to remove unwanted technical or biological variations and integrate heterogeneous single-cell datasets. Despite their wide usage, existing methods suffer from several fundamental limitations. In particular, we lack a rigorous statistical test for whether two high-dimensional single-cell datasets are alignable (and therefore should even be aligned). Moreover, popular methods can substantially distort the data during alignment, making the aligned data and downstream analysis difficult to interpret. To overcome these limitations, we present a spectral manifold alignment and inference (SMAI) framework, which enables principled and interpretable alignability testing and structure-preserving integration of single-cell data with the same type of features. SMAI provides a statistical test to robustly assess the alignability between datasets to avoid misleading inference and is justified by high-dimensional statistical theory. On a diverse range of real and simulated benchmark datasets, it outperforms commonly used alignment methods. Moreover, we show that SMAI improves various downstream analyses such as identification of differentially expressed genes and imputation of single-cell spatial transcriptomics, providing further biological insights. SMAI's interpretability also enables quantification and a deeper understanding of the sources of technical confounders in single-cell data.
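As a much simpler stand-in for structure-preserving alignment (not SMAI, and without its alignability test), the sketch below rigidly rotates one batch onto another with orthogonal Procrustes; note that it assumes matched cells across batches, an assumption real integration problems usually violate.

```python
import numpy as np
from scipy.linalg import orthogonal_procrustes

def procrustes_align(X, Y):
    """Rigidly rotate Y onto X (assumes the same cells in the same order)."""
    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    R, _ = orthogonal_procrustes(Yc, Xc)   # rotation minimising ||Yc R - Xc||_F
    return Yc @ R + X.mean(0)

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))                        # reference batch
theta = np.pi / 6
rot = np.eye(20)
rot[:2, :2] = [[np.cos(theta), -np.sin(theta)],
               [np.sin(theta),  np.cos(theta)]]
Y = X @ rot + rng.normal(scale=0.05, size=X.shape)    # rotated, noisy batch
Y_aligned = procrustes_align(X, Y)
print("error before:", np.linalg.norm(X - Y), "after:", np.linalg.norm(X - Y_aligned))
```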


Subject(s)
Algorithms , Gene Expression Profiling , Gene Expression , Single-Cell Analysis
14.
J Emerg Med ; 66(2): 184-191, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38369413

ABSTRACT

BACKGROUND: The adoption of point-of-care ultrasound (POCUS) has greatly improved the ability to rapidly evaluate unstable emergency department (ED) patients at the bedside. One major use of POCUS is to obtain echocardiograms to assess cardiac function. OBJECTIVES: We developed EchoNet-POCUS, a novel deep learning system, to aid emergency physicians (EPs) in interpreting POCUS echocardiograms and to reduce operator-to-operator variability. METHODS: We collected a new dataset of POCUS echocardiogram videos obtained in the ED by EPs and annotated the cardiac function and quality of each video. Using this dataset, we trained EchoNet-POCUS to evaluate both cardiac function and video quality in POCUS echocardiograms. RESULTS: EchoNet-POCUS achieves an area under the receiver operating characteristic curve (AUROC) of 0.92 (0.89-0.94) for predicting whether cardiac function is abnormal and an AUROC of 0.81 (0.78-0.85) for predicting video quality. CONCLUSIONS: EchoNet-POCUS can be applied to bedside echocardiogram videos in real time using commodity hardware, as we demonstrate in a prospective pilot study.


Subject(s)
Echocardiography , Point-of-Care Systems , Humans , Prospective Studies , Pilot Projects , Ultrasonography , Emergency Service, Hospital
15.
Nat Methods ; 21(3): 444-454, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38347138

ABSTRACT

Whole-transcriptome spatial profiling of genes at single-cell resolution remains a challenge. To address this limitation, spatial gene expression prediction methods have been developed to infer the spatial expression of unmeasured transcripts, but the quality of these predictions can vary greatly. Here we present Transcript Imputation with Spatial Single-cell Uncertainty Estimation (TISSUE) as a general framework for estimating uncertainty for spatial gene expression predictions and providing uncertainty-aware methods for downstream inference. Leveraging conformal inference, TISSUE provides well-calibrated prediction intervals for predicted expression values across 11 benchmark datasets. Moreover, it consistently reduces the false discovery rate for differential gene expression analysis, improves clustering and visualization of predicted spatial transcriptomics and improves the performance of supervised learning models trained on predicted gene expression profiles. Applying TISSUE to a MERFISH spatial transcriptomics dataset of the adult mouse subventricular zone, we identified subtypes within the neural stem cell lineage and developed subtype-specific regional classifiers.
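The conformal idea behind such prediction intervals can be sketched in a few lines of split-conformal code on synthetic data; TISSUE's actual cell-centric calibration is more involved.

```python
import numpy as np

def split_conformal_interval(cal_pred, cal_true, test_pred, alpha=0.1):
    """Split-conformal prediction intervals from a held-out calibration set.

    cal_pred, cal_true : predictions and measured values for calibration points
    test_pred          : predictions needing intervals
    alpha              : 1 - target coverage (0.1 -> ~90% coverage)
    """
    residuals = np.abs(cal_true - cal_pred)
    n = len(residuals)
    # Finite-sample-corrected quantile of the calibration residuals.
    q = np.quantile(residuals, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n))
    return test_pred - q, test_pred + q

rng = np.random.default_rng(0)
truth = rng.normal(size=2000)
pred = truth + rng.normal(scale=0.3, size=2000)       # imperfect imputation
lo, hi = split_conformal_interval(pred[:1000], truth[:1000], pred[1000:])
coverage = np.mean((truth[1000:] >= lo) & (truth[1000:] <= hi))
print(f"empirical coverage: {coverage:.2f}")
```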


Subject(s)
Gene Expression Profiling , Neural Stem Cells , Animals , Mice , Uncertainty , Benchmarking , Cluster Analysis , Transcriptome , Single-Cell Analysis
16.
Ann Intern Med ; 177(2): 210-220, 2024 02.
Article in English | MEDLINE | ID: mdl-38285984

ABSTRACT

Large language models (LLMs) are artificial intelligence models trained on vast text data to generate humanlike outputs. They have been applied to various tasks in health care, ranging from answering medical examination questions to generating clinical reports. With increasing institutional partnerships between companies producing LLMs and health systems, the real-world clinical application of these models is nearing realization. As these models gain traction, health care practitioners must understand what LLMs are, their development, their current and potential applications, and the associated pitfalls in a medical setting. This review, coupled with a tutorial, provides a comprehensive yet accessible overview of these areas with the aim of familiarizing health care professionals with the rapidly changing landscape of LLMs in medicine. Furthermore, the authors highlight active research areas in the field that promise to improve LLMs' usability in health care contexts.


Subject(s)
Artificial Intelligence , Medicine , Humans , Health Personnel , Language
17.
Sci Rep ; 14(1): 11, 2024 01 02.
Article in English | MEDLINE | ID: mdl-38167849

ABSTRACT

Transesophageal echocardiography (TEE) imaging is a vital tool used in the evaluation of complex cardiac pathology and the management of cardiac surgery patients. A key limitation to the application of deep learning strategies to intraoperative and intraprocedural TEE data is the complexity and unstructured nature of these images. In the present study, we developed a deep learning-based, multi-category TEE view classification model that can be used to add structure to intraoperative and intraprocedural TEE imaging data. More specifically, we trained a convolutional neural network (CNN) to predict standardized TEE views using labeled intraoperative and intraprocedural TEE videos from Cedars-Sinai Medical Center (CSMC). We externally validated our model on intraoperative TEE videos from Stanford University Medical Center (SUMC). Accuracy of our model was high across all labeled views. The highest performance was achieved for the Trans-Gastric Left Ventricular Short Axis View (area under the receiver operating curve [AUC] = 0.971 at CSMC, 0.957 at SUMC), the Mid-Esophageal Long Axis View (AUC = 0.954 at CSMC, 0.905 at SUMC), the Mid-Esophageal Aortic Valve Short Axis View (AUC = 0.946 at CSMC, 0.898 at SUMC), and the Mid-Esophageal 4-Chamber View (AUC = 0.939 at CSMC, 0.902 at SUMC). Ultimately, we demonstrate that our deep learning model can accurately classify standardized TEE views, which will facilitate further downstream deep learning analyses for intraoperative and intraprocedural TEE imaging.


Subject(s)
Cardiac Surgical Procedures , Deep Learning , Humans , Echocardiography, Transesophageal/methods , Echocardiography/methods , Aortic Valve
18.
Lancet Digit Health ; 6(1): e70-e78, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38065778

ABSTRACT

BACKGROUND: Preoperative risk assessments used in clinical practice are insufficient in their ability to identify risk for postoperative mortality. Deep-learning analysis of electrocardiography can identify hidden risk markers that can help to prognosticate postoperative mortality. We aimed to develop a prognostic model that accurately predicts postoperative mortality in patients undergoing medical procedures and who had received preoperative electrocardiographic diagnostic testing. METHODS: In a derivation cohort of preoperative patients with available electrocardiograms (ECGs) from Cedars-Sinai Medical Center (Los Angeles, CA, USA) between Jan 1, 2015 and Dec 31, 2019, a deep-learning algorithm was developed to leverage waveform signals to discriminate postoperative mortality. We randomly split patients (8:1:1) into subsets for training, internal validation, and final algorithm test analyses. Model performance was assessed using area under the receiver operating characteristic curve (AUC) values in the hold-out test dataset and in two external hospital cohorts and compared with the established Revised Cardiac Risk Index (RCRI) score. The primary outcome was post-procedural mortality across three health-care systems. FINDINGS: 45 969 patients had a complete ECG waveform image available for at least one 12-lead ECG performed within the 30 days before the procedure date (59 975 inpatient procedures and 112 794 ECGs): 36 839 patients in the training dataset, 4549 in the internal validation dataset, and 4581 in the internal test dataset. In the held-out internal test cohort, the algorithm discriminates mortality with an AUC value of 0·83 (95% CI 0·79-0·87), surpassing the discrimination of the RCRI score with an AUC of 0·67 (0·61-0·72). The algorithm similarly discriminated risk for mortality in two independent US health-care systems, with AUCs of 0·79 (0·75-0·83) and 0·75 (0·74-0·76), respectively. Patients determined to be high risk by the deep-learning model had an unadjusted odds ratio (OR) of 8·83 (5·57-13·20) for postoperative mortality compared with an unadjusted OR of 2·08 (0·77-3·50) for postoperative mortality for RCRI scores of more than 2. The deep-learning algorithm performed similarly for patients undergoing cardiac surgery (AUC 0·85 [0·77-0·92]), non-cardiac surgery (AUC 0·83 [0·79-0·88]), and catheterisation or endoscopy suite procedures (AUC 0·76 [0·72-0·81]). INTERPRETATION: A deep-learning algorithm interpreting preoperative ECGs can improve discrimination of postoperative mortality. The deep-learning algorithm worked equally well for risk stratification of cardiac surgeries, non-cardiac surgeries, and catheterisation laboratory procedures, and was validated in three independent health-care systems. This algorithm can provide additional information to clinicians making the decision to perform medical procedures and stratify the risk of future complications. FUNDING: National Heart, Lung, and Blood Institute.


Subject(s)
Deep Learning , Humans , Risk Assessment/methods , Algorithms , Prognosis , Electrocardiography
19.
JAMA Surg ; 159(1): 87-95, 2024 Jan 01.
Article in English | MEDLINE | ID: mdl-37966807

ABSTRACT

Importance: The progression of artificial intelligence (AI) text-to-image generators raises concerns about perpetuating societal biases, including profession-based stereotypes. Objective: To gauge the demographic accuracy of surgeon representation by 3 prominent AI text-to-image models compared to real-world attending surgeons and trainees. Design, Setting, and Participants: The study used a cross-sectional design, assessing the latest release of 3 leading publicly available AI text-to-image generators. Seven independent reviewers categorized AI-produced images. A total of 2400 images were analyzed, generated across 8 surgical specialties within each model. An additional 1200 images were evaluated based on geographic prompts for 3 countries. The study was conducted in May 2023. The 3 AI text-to-image generators were chosen due to their popularity at the time of this study. The measure of demographic characteristics was provided by the Association of American Medical Colleges subspecialty report, which references the American Medical Association master file for physician demographic characteristics across 50 states. Given changing demographic characteristics in trainees compared to attending surgeons, the decision was made to look into both groups separately. Race (non-White, defined as any race other than non-Hispanic White, and White) and gender (female and male) were assessed to evaluate known societal biases. Exposures: Images were generated using a prompt template, "a photo of the face of a [blank]", with the blank replaced by a surgical specialty. Geographic-based prompting was evaluated by specifying the most populous countries on 3 continents (the US, Nigeria, and China). Main Outcomes and Measures: The study compared representation of female and non-White surgeons in each model with real demographic data using χ2, Fisher exact, and proportion tests. Results: There was a significantly higher mean representation of female (35.8% vs 14.7%; P < .001) and non-White (37.4% vs 22.8%; P < .001) surgeons among trainees than attending surgeons. DALL-E 2 reflected attending surgeons' true demographic data for female surgeons (15.9% vs 14.7%; P = .39) and non-White surgeons (22.6% vs 22.8%; P = .92) but underestimated trainees' representation for both female (15.9% vs 35.8%; P < .001) and non-White (22.6% vs 37.4%; P < .001) surgeons. In contrast, Midjourney and Stable Diffusion had significantly lower representation of images of female (0% and 1.8%, respectively; P < .001) and non-White (0.5% and 0.6%, respectively; P < .001) surgeons than DALL-E 2 or true demographic data. Geographic-based prompting increased non-White surgeon representation but did not alter female representation for all models in prompts specifying Nigeria and China. Conclusions and Relevance: In this study, 2 leading publicly available text-to-image generators amplified societal biases, depicting over 98% of surgeons as White and male. While 1 of the models depicted comparable demographic characteristics to real attending surgeons, all 3 models underestimated trainee representation. The study suggests the need for guardrails and robust feedback systems to minimize the risk of AI text-to-image generators magnifying stereotypes in professions such as surgery.
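The 2 x 2 comparison named above (Fisher exact test of generated versus real-world proportions) can be sketched with invented counts as follows.

```python
from scipy.stats import fisher_exact

# Hypothetical counts: female vs male faces among 600 generated images,
# compared with a benchmark of 1000 real attending surgeons at ~14.7% female.
generated = [95, 505]   # [female, male] in AI-generated images
benchmark = [147, 853]  # [female, male] in the real-world reference sample

odds_ratio, p_value = fisher_exact([generated, benchmark])
print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.4f}")
```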


Subject(s)
Specialties, Surgical , Surgeons , United States , Humans , Male , Female , Cross-Sectional Studies , Artificial Intelligence , Demography
20.
bioRxiv ; 2024 Mar 05.
Article in English | MEDLINE | ID: mdl-37905130

ABSTRACT

There has been significant recent progress in leveraging large-scale gene expression data to develop foundation models for single-cell biology. Models such as Geneformer and scGPT implicitly learn gene and cellular functions from the gene expression profiles of millions of cells, which requires extensive data curation and resource-intensive training. Here we explore a much simpler alternative by leveraging ChatGPT embeddings of genes based on literature. Our proposal, GenePT, uses NCBI text descriptions of individual genes with GPT-3.5 to generate gene embeddings. From there, GenePT generates single-cell embeddings in two ways: (i) by averaging the gene embeddings, weighted by each gene's expression level; or (ii) by creating a sentence embedding for each cell, using gene names ordered by the expression level. Without the need for dataset curation and additional pretraining, GenePT is efficient and easy to use. On many downstream tasks used to evaluate recent single-cell foundation models - e.g., classifying gene properties and cell types - GenePT achieves comparable, and often better, performance than Geneformer and other models. GenePT demonstrates that large language model embedding of literature is a simple and effective path for biological foundation models.
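Strategy (i) above, the expression-weighted average of gene embeddings, can be sketched in a few lines; random vectors stand in for the GPT-derived gene embeddings, and the embedding dimension is an assumption.

```python
import numpy as np

def cell_embedding(expression, gene_embeddings):
    """Expression-weighted average of gene embeddings (GenePT's first strategy).

    expression      : (n_genes,) expression values for one cell
    gene_embeddings : (n_genes x d) one embedding per gene (GPT text embeddings
                      in GenePT; random vectors here, as a stand-in)
    """
    w = expression / (expression.sum() + 1e-12)      # normalise to weights
    emb = w @ gene_embeddings
    return emb / (np.linalg.norm(emb) + 1e-12)       # unit-normalise the cell vector

rng = np.random.default_rng(0)
n_genes, d = 2000, 1536          # d = 1536 is an assumed text-embedding dimension
gene_embeddings = rng.normal(size=(n_genes, d))
expression = rng.poisson(1.0, size=n_genes).astype(float)
print(cell_embedding(expression, gene_embeddings).shape)
```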
