Results 1 - 20 of 195
1.
J Med Internet Res ; 26: e52508, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38696776

ABSTRACT

The number of papers presenting machine learning (ML) models that are being submitted to and published in the Journal of Medical Internet Research and other JMIR Publications journals has steadily increased. Editors and peer reviewers involved in the review process for such manuscripts often go through multiple review cycles to enhance the quality and completeness of reporting. The use of reporting guidelines or checklists can help ensure consistency in the quality of submitted (and published) scientific manuscripts and, for example, avoid instances of missing information. In this Editorial, the editors of JMIR Publications journals discuss the general JMIR Publications policy regarding authors' application of reporting guidelines and specifically focus on the reporting of ML studies in JMIR Publications journals, using the Consolidated Reporting of Machine Learning Studies (CREMLS) guidelines, with an example of how authors and other journals could use the CREMLS checklist to ensure transparency and rigor in reporting.


Subjects
Machine Learning, Humans, Guidelines as Topic, Prognosis, Checklist
2.
J Am Med Inform Assoc ; 31(6): 1303-1312, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38713006

ABSTRACT

OBJECTIVES: Racial disparities in kidney transplant access and posttransplant outcomes exist between non-Hispanic Black (NHB) and non-Hispanic White (NHW) patients in the United States, with the site of care being a key contributor. When using multi-site data to examine the effect of site of care on racial disparities, the key challenge is that regulations protecting patients' privacy restrict the sharing of patient-level data. MATERIALS AND METHODS: We developed a federated learning framework, named dGEM-disparity (decentralized algorithm for Generalized linear mixed Effect Model for disparity quantification). Consisting of 2 modules, dGEM-disparity first provides accurately estimated common effects and calibrated hospital-specific effects by requiring only aggregated data from each center and then adopts a counterfactual modeling approach to assess whether the graft failure rates would differ if NHB patients had been admitted at transplant centers in the same distribution as NHW patients were admitted. RESULTS: Utilizing United States Renal Data System data from 39,043 adult patients across 73 transplant centers over 10 years, we found that if NHB patients had followed the distribution of NHW patients in admissions, there would be 38 fewer deaths or graft failures per 10,000 NHB patients (95% CI, 35-40) within 1 year of receiving a kidney transplant on average. DISCUSSION: The proposed framework facilitates efficient collaborations in clinical research networks. Additionally, the framework, by using counterfactual modeling to calculate the event rate, allows us to investigate contributions to racial disparities that may occur at the level of site of care. CONCLUSIONS: Our framework is broadly applicable to other decentralized datasets and disparities research related to differential access to care. Ultimately, our proposed framework will advance equity in human health by identifying and addressing hospital-level racial disparities.


Subjects
Algorithms, Black or African American, Healthcare Disparities, Kidney Transplantation, White People, Humans, United States, Healthcare Disparities/ethnology, Adult, Male, Female, Graft Rejection/ethnology, Middle Aged
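To make the counterfactual step in entry 2 concrete, the sketch below re-weights hypothetical center-specific NHB event rates by the NHW admission distribution, which is the standardization idea behind the event-rate comparison described above. The center counts and rates are invented, and the sketch does not reproduce dGEM-disparity's federated estimation of common and center-specific effects.

```python
import numpy as np

# Hypothetical per-center data (3 centers for illustration).
nhw_admissions = np.array([500, 300, 200])     # NHW patients admitted per center
nhb_admissions = np.array([100, 250, 400])     # NHB patients admitted per center
nhb_event_rate = np.array([0.06, 0.09, 0.12])  # observed NHB death/graft-failure rate per center

# Observed NHB rate under the actual NHB admission distribution.
p_nhb = nhb_admissions / nhb_admissions.sum()
observed_rate = np.sum(p_nhb * nhb_event_rate)

# Counterfactual NHB rate if NHB patients followed the NHW admission distribution.
p_nhw = nhw_admissions / nhw_admissions.sum()
counterfactual_rate = np.sum(p_nhw * nhb_event_rate)

# Difference expressed per 10,000 NHB patients, the summary statistic used in the abstract.
diff_per_10k = (observed_rate - counterfactual_rate) * 10_000
print(f"Observed: {observed_rate:.4f}, counterfactual: {counterfactual_rate:.4f}, "
      f"difference per 10,000 patients: {diff_per_10k:.0f}")
```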
3.
medRxiv ; 2024 Apr 27.
Article in English | MEDLINE | ID: mdl-38712148

ABSTRACT

Background: The launch of the Chat Generative Pre-trained Transformer (ChatGPT) in November 2022 has attracted public attention and academic interest to large language models (LLMs), facilitating the emergence of many other innovative LLMs. These LLMs have been applied in various fields, including healthcare. Numerous studies have since been conducted regarding how to employ state-of-the-art LLMs in health-related scenarios to assist patients, doctors, and public health administrators. Objective: This review aims to summarize the applications and concerns of applying conversational LLMs in healthcare and provide an agenda for future research on LLMs in healthcare. Methods: We utilized PubMed, ACM, and IEEE digital libraries as primary sources for this review. We followed the guidance of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) to screen and select peer-reviewed research articles that (1) were related to both healthcare applications and conversational LLMs and (2) were published before September 1st, 2023, the date when we started paper collection and screening. We investigated these papers and classified them according to their applications and concerns. Results: Our search initially identified 820 papers according to targeted keywords, out of which 65 papers met our criteria and were included in the review. The most popular conversational LLM was ChatGPT from OpenAI (60), followed by Bard from Google (1), Large Language Model Meta AI (LLaMA) from Meta (1), and other LLMs (5). These papers were classified into four categories in terms of their applications: (1) summarization, (2) medical knowledge inquiry, (3) prediction, and (4) administration, and four categories of concerns: (1) reliability, (2) bias, (3) privacy, and (4) public acceptability. There were 49 (75%) research papers using LLMs for summarization and/or medical knowledge inquiry, and 58 (89%) research papers expressing concerns about reliability and/or bias. We found that conversational LLMs exhibit promising results in summarization and in providing medical knowledge to patients with relatively high accuracy. However, conversational LLMs like ChatGPT are not able to provide reliable answers to complex health-related tasks that require specialized domain expertise. Additionally, none of the reviewed papers reported experiments that rigorously examined how conversational LLMs lead to bias or privacy issues in healthcare research. Conclusions: Future studies should focus on improving the reliability of LLM applications in complex health-related tasks, as well as investigating the mechanisms by which LLM applications introduce bias and privacy issues. Considering the vast accessibility of LLMs, legal, social, and technical efforts are all needed to address concerns about LLMs and to promote, improve, and regulate their application in healthcare.

4.
Article in English | MEDLINE | ID: mdl-38613820

ABSTRACT

OBJECTIVES: Phenotyping is a core task in observational health research utilizing electronic health records (EHRs). Developing an accurate algorithm demands substantial input from domain experts, involving extensive literature review and evidence synthesis. This burdensome process limits scalability and delays knowledge discovery. We investigate the potential for leveraging large language models (LLMs) to enhance the efficiency of EHR phenotyping by generating high-quality algorithm drafts. MATERIALS AND METHODS: We prompted four LLMs (GPT-4 and GPT-3.5 of ChatGPT, Claude 2, and Bard) in October 2023, asking them to generate executable phenotyping algorithms in the form of SQL queries adhering to a common data model (CDM) for three phenotypes (ie, type 2 diabetes mellitus, dementia, and hypothyroidism). Three phenotyping experts evaluated the returned algorithms across several critical metrics. We further implemented the top-rated algorithms and compared them against clinician-validated phenotyping algorithms from the Electronic Medical Records and Genomics (eMERGE) network. RESULTS: GPT-4 and GPT-3.5 exhibited significantly higher overall expert evaluation scores in instruction following, algorithmic logic, and SQL executability, when compared to Claude 2 and Bard. Although GPT-4 and GPT-3.5 effectively identified relevant clinical concepts, they exhibited immature capability in organizing phenotyping criteria with the proper logic, leading to phenotyping algorithms that were either excessively restrictive (with low recall) or overly broad (with low positive predictive values). CONCLUSION: GPT versions 3.5 and 4 are capable of drafting phenotyping algorithms by identifying relevant clinical criteria aligned with a CDM. However, expertise in informatics and clinical experience is still required to assess and further refine generated algorithms.
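As an illustration of the comparison against clinician-validated algorithms in entry 4, here is a minimal sketch (not from the paper) of computing recall and positive predictive value for the cohort returned by an LLM-drafted phenotyping algorithm against a gold-standard cohort. The patient ID sets are hypothetical; in practice they would come from running both algorithms against the same CDM database.

```python
# Hypothetical patient ID sets; in practice these would be produced by executing the
# LLM-drafted SQL query and the clinician-validated algorithm on the same CDM database.
llm_cohort = {101, 102, 103, 107, 110, 115}        # patients flagged by the LLM-drafted algorithm
gold_cohort = {101, 102, 104, 107, 110, 118, 120}  # patients flagged by the validated algorithm

true_positives = llm_cohort & gold_cohort
recall = len(true_positives) / len(gold_cohort)  # sensitivity against the gold standard
ppv = len(true_positives) / len(llm_cohort)      # positive predictive value

print(f"Recall: {recall:.2f}, PPV: {ppv:.2f}")
# An overly restrictive draft lowers recall; an overly broad draft lowers PPV,
# mirroring the failure modes reported in the abstract.
```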

5.
J Med Internet Res ; 26: e49445, 2024 04 24.
Article in English | MEDLINE | ID: mdl-38657232

ABSTRACT

BACKGROUND: Sharing data from clinical studies can accelerate scientific progress, improve transparency, and increase the potential for innovation and collaboration. However, privacy concerns remain a barrier to data sharing. Certain concerns, such as reidentification risk, can be addressed through the application of anonymization algorithms, whereby data are altered so that they can no longer reasonably be linked to a person. Yet, such alterations have the potential to influence the data set's statistical properties, such that the privacy-utility trade-off must be considered. This has been studied in theory, but evidence based on real-world individual-level clinical data is rare, and anonymization has not been broadly adopted in clinical practice. OBJECTIVE: The goal of this study is to contribute to a better understanding of anonymization in the real world by comprehensively evaluating the privacy-utility trade-off of differently anonymized data using data and scientific results from the German Chronic Kidney Disease (GCKD) study. METHODS: The GCKD data set extracted for this study consists of 5217 records and 70 variables. A 2-step procedure was followed to determine which variables constituted reidentification risks. To capture a large portion of the risk-utility space, we decided on risk thresholds ranging from 0.02 to 1. The data were then transformed via generalization and suppression, and the anonymization process was varied using a generic and a use case-specific configuration. To assess the utility of the anonymized GCKD data, general-purpose metrics (ie, data granularity and entropy), as well as use case-specific metrics (ie, reproducibility), were applied. Reproducibility was assessed by measuring the overlap of the 95% CI lengths between anonymized and original results. RESULTS: Reproducibility measured by 95% CI overlap was higher than utility obtained from general-purpose metrics. For example, granularity varied between 68.2% and 87.6%, and entropy varied between 25.5% and 46.2%, whereas the average 95% CI overlap was above 90% for all risk thresholds applied. A nonoverlapping 95% CI was detected in 6 estimates across all analyses, but the overwhelming majority of estimates exhibited an overlap above 50%. The use case-specific configuration outperformed the generic one in terms of actual utility (ie, reproducibility) at the same level of privacy. CONCLUSIONS: Our results illustrate the challenges that anonymization faces when aiming to support multiple likely and possibly competing uses, while use case-specific anonymization can provide greater utility. This aspect should be taken into account when evaluating the associated costs of anonymized data and attempting to maintain sufficiently high levels of privacy for anonymized data. TRIAL REGISTRATION: German Clinical Trials Register DRKS00003971; https://drks.de/search/en/trial/DRKS00003971. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): RR2-10.1093/ndt/gfr456.


Subjects
Data Anonymization, Humans, Chronic Renal Insufficiency/therapy, Information Dissemination/methods, Algorithms, Germany, Confidentiality, Privacy
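A minimal sketch of a 95% CI overlap measure of the kind used as the use case-specific utility metric in entry 5. The exact definition used in the study may differ; here the overlap is taken as the length of the intersection of the two intervals divided by the length of the original interval, and the interval endpoints are hypothetical.

```python
def ci_overlap(original, anonymized):
    """Fraction of the original 95% CI covered by the CI estimated from anonymized data."""
    lo = max(original[0], anonymized[0])
    hi = min(original[1], anonymized[1])
    intersection = max(0.0, hi - lo)
    return intersection / (original[1] - original[0])

# Hypothetical 95% CIs for one regression coefficient, before and after anonymization.
original_ci = (0.40, 0.80)
anonymized_ci = (0.45, 0.90)
print(f"95% CI overlap: {ci_overlap(original_ci, anonymized_ci):.1%}")
```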
6.
J Biomed Inform ; 153: 104640, 2024 May.
Article in English | MEDLINE | ID: mdl-38608915

ABSTRACT

Evidence-based medicine promises to improve the quality of healthcare by empowering medical decisions and practices with the best available evidence. The rapid growth of medical evidence, which can be obtained from various sources, poses a challenge in collecting, appraising, and synthesizing the evidential information. Recent advancements in generative AI, exemplified by large language models, hold promise in facilitating the arduous task. However, developing accountable, fair, and inclusive models remains a complicated undertaking. In this perspective, we discuss the trustworthiness of generative AI in the context of automated summarization of medical evidence.


Subjects
Artificial Intelligence, Evidence-Based Medicine, Humans, Trust, Natural Language Processing
7.
NPJ Digit Med ; 7(1): 46, 2024 Feb 26.
Article in English | MEDLINE | ID: mdl-38409350

ABSTRACT

Drug repurposing represents an attractive alternative to the costly and time-consuming process of new drug development, particularly for serious, widespread conditions with limited effective treatments, such as Alzheimer's disease (AD). Emerging generative artificial intelligence (GAI) technologies like ChatGPT offer the promise of expediting the review and summary of scientific knowledge. To examine the feasibility of using GAI for identifying drug repurposing candidates, we iteratively tasked ChatGPT with proposing the twenty most promising drugs for repurposing in AD, and tested the top ten for risk of incident AD in exposed and unexposed individuals over age 65 in two large clinical datasets: (1) Vanderbilt University Medical Center and (2) the All of Us Research Program. Among the candidates suggested by ChatGPT, metformin, simvastatin, and losartan were associated with lower AD risk in meta-analysis. These findings suggest GAI technologies can assimilate scientific insights from an extensive Internet-based search space, helping to prioritize drug repurposing candidates and facilitate the treatment of diseases.
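A brief sketch of the pooling step implied by "associated with lower AD risk in meta-analysis" in entry 7: a fixed-effect inverse-variance meta-analysis combining site-specific log hazard ratios. The two estimates below are hypothetical and are not the study's results.

```python
import numpy as np

# Hypothetical log hazard ratios and standard errors from two sites
# (e.g., VUMC and All of Us) for one candidate drug.
log_hr = np.array([-0.22, -0.15])
se = np.array([0.08, 0.10])

weights = 1.0 / se**2  # inverse-variance weights
pooled = np.sum(weights * log_hr) / np.sum(weights)
pooled_se = np.sqrt(1.0 / np.sum(weights))
ci_lo, ci_hi = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se

print(f"Pooled HR: {np.exp(pooled):.2f} "
      f"(95% CI {np.exp(ci_lo):.2f}-{np.exp(ci_hi):.2f})")
```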

8.
medRxiv ; 2024 Feb 26.
Article in English | MEDLINE | ID: mdl-38196578

ABSTRACT

Objectives: Phenotyping is a core task in observational health research utilizing electronic health records (EHRs). Developing an accurate algorithm demands substantial input from domain experts, involving extensive literature review and evidence synthesis. This burdensome process limits scalability and delays knowledge discovery. We investigate the potential for leveraging large language models (LLMs) to enhance the efficiency of EHR phenotyping by generating high-quality algorithm drafts. Materials and Methods: We prompted four LLMs (GPT-4 and GPT-3.5 of ChatGPT, Claude 2, and Bard) in October 2023, asking them to generate executable phenotyping algorithms in the form of SQL queries adhering to a common data model (CDM) for three phenotypes (i.e., type 2 diabetes mellitus, dementia, and hypothyroidism). Three phenotyping experts evaluated the returned algorithms across several critical metrics. We further implemented the top-rated algorithms and compared them against clinician-validated phenotyping algorithms from the Electronic Medical Records and Genomics (eMERGE) network. Results: GPT-4 and GPT-3.5 exhibited significantly higher overall expert evaluation scores in instruction following, algorithmic logic, and SQL executability, when compared to Claude 2 and Bard. Although GPT-4 and GPT-3.5 effectively identified relevant clinical concepts, they exhibited immature capability in organizing phenotyping criteria with the proper logic, leading to phenotyping algorithms that were either excessively restrictive (with low recall) or overly broad (with low positive predictive values). Conclusion: GPT versions 3.5 and 4 are capable of drafting phenotyping algorithms by identifying relevant clinical criteria aligned with a CDM. However, expertise in informatics and clinical experience is still required to assess and further refine generated algorithms.

9.
J Med Internet Res ; 25: e48193, 2023 11 17.
Article in English | MEDLINE | ID: mdl-37976095

ABSTRACT

BACKGROUND: Alzheimer disease or related dementias (ADRD) are severe neurological disorders that impair the thinking and memory skills of older adults. Most persons living with dementia receive care at home from their family members or other unpaid informal caregivers; this results in significant mental, physical, and financial challenges for these caregivers. To combat these challenges, many informal ADRD caregivers seek social support in online environments. Although research examining online caregiving discussions is growing, few investigations have distinguished caregivers according to their kin relationships with persons living with dementias. Various studies have suggested that caregivers in different relationships experience distinct caregiving challenges and support needs. OBJECTIVE: This study aims to examine and compare the online behaviors of adult-child and spousal caregivers, the 2 largest groups of informal ADRD caregivers, in an open online community. METHODS: We collected posts from ALZConnected, an online community managed by the Alzheimer's Association. To gain insights into online behaviors, we first applied structural topic modeling to identify topics and topic prevalence between adult-child and spousal caregivers. Next, we applied VADER (Valence Aware Dictionary for Sentiment Reasoning) and LIWC (Linguistic Inquiry and Word Count) to evaluate sentiment changes in the online posts over time for both types of caregivers. We further built machine learning models to distinguish the posts of each caregiver type and evaluated them in terms of precision, recall, F1-score, and area under the precision-recall curve. Finally, we applied the best prediction model to compare the temporal trend of relationship-predicting capacities in posts between the 2 types of caregivers. RESULTS: Our analysis showed that the number of posts from both types of caregivers followed a long-tailed distribution, indicating that most caregivers in this online community were infrequent users. In comparison with adult-child caregivers, spousal caregivers tended to be more active in the community, publishing more posts and engaging in discussions on a wider range of caregiving topics. Spousal caregivers also exhibited slower growth in positive emotional communication over time. The best machine learning model for predicting adult-child, spousal, or other caregivers achieved an area under the precision-recall curve of 81.3%. The subsequent trend analysis showed that it became more difficult to predict adult-child caregiver posts than spousal caregiver posts over time. This suggests that adult-child and spousal caregivers might gradually shift their discussions from questions that are more directly related to their own experiences and needs to questions that are more general and applicable to other types of caregivers. CONCLUSIONS: Our findings suggest that it is important for researchers and community organizers to consider the heterogeneity of caregiving experiences and subsequent online behaviors among different types of caregivers when tailoring online peer support to meet the specific needs of each caregiver group.


Subjects
Adult Children, Alzheimer Disease, Caregivers, Aged, Humans, Caregivers/psychology, Communication, Family, Social Support, Adult Children/psychology
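For the classification component in entry 9, a short sketch of computing the area under the precision-recall curve with scikit-learn for a caregiver-type classifier. The labels and scores are hypothetical, and the study's actual features, models, and multi-class setup are not reproduced here.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve, auc

# Hypothetical binary labels (1 = spousal caregiver post, 0 = other) and predicted scores.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 0])
y_score = np.array([0.9, 0.2, 0.7, 0.65, 0.4, 0.3, 0.8, 0.55, 0.6, 0.1])

precision, recall, _ = precision_recall_curve(y_true, y_score)
pr_auc = auc(recall, precision)
print(f"Area under the precision-recall curve: {pr_auc:.3f}")
```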
10.
Front Pharmacol ; 14: 1211491, 2023.
Article in English | MEDLINE | ID: mdl-37860114

ABSTRACT

Introduction: The landscape of drug-drug interactions (DDIs) has evolved significantly over the past 60 years, necessitating a retrospective analysis to identify research trends and under-explored areas. While methodologies like bibliometric analysis provide valuable quantitative perspectives on DDI research, they have not successfully delineated the complex interrelations between drugs. Understanding these intricate relationships is essential for deciphering the evolving architecture and progressive transformation of DDI research structures over time. We utilize network analysis to unearth the multifaceted relationships between drugs, offering a richer, more nuanced comprehension of shifts in research focus within the DDI landscape. Methods: This investigation employs natural language processing techniques, specifically Named Entity Recognition (NER) via ScispaCy and the information extraction model SciFive, to extract pharmacokinetic (PK) and pharmacodynamic (PD) DDI evidence from PubMed articles spanning January 1962 to July 2023. It reveals key trends and patterns through an innovative network analysis approach. Static network analysis is deployed to discern structural patterns in DDI research, while evolving network analysis is employed to monitor changes in the DDI research trend structures over time. Results: Our results shed light on the scale-free characteristics of the pharmacokinetic, pharmacodynamic, and combined networks, exhibiting power law exponent values of 2.5, 2.82, and 2.46, respectively. In these networks, a select few drugs serve as central hubs, engaging in extensive interactions with a multitude of other drugs. Interestingly, the networks conform to a densification power law, illustrating that the number of DDIs grows exponentially as new drugs are added to the DDI network. Notably, we discovered that drugs connected in PK and PD networks predominantly belong to the same categories defined by the Anatomical Therapeutic Chemical (ATC) classification system, with fewer interactions observed between drugs from different categories. Discussion: This finding suggests that PK and PD DDIs between drugs from different ATC categories have not been studied as extensively as those between drugs within the same categories. By unearthing these hidden patterns, our study paves the way for a deeper understanding of the DDI landscape, providing valuable information for future DDI research, clinical practice, and drug development focus areas.
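As a concrete illustration of the power-law characterization reported in entry 10, the sketch below estimates a degree-distribution exponent with the standard continuous maximum-likelihood estimator, alpha ≈ 1 + n / Σ ln(k_i / k_min). This is an approximation for discrete degree data and is not the paper's exact fitting procedure; the degree sequence is hypothetical.

```python
import numpy as np

def power_law_alpha(degrees, k_min):
    """Continuous MLE approximation of a power-law exponent for degrees >= k_min."""
    k = np.asarray([d for d in degrees if d >= k_min], dtype=float)
    return 1.0 + len(k) / np.sum(np.log(k / k_min))

# Hypothetical degree sequence of a drug-drug interaction network
# (a few hub drugs with many interactions, many drugs with few).
degrees = [1, 1, 1, 2, 2, 2, 3, 3, 4, 5, 6, 8, 12, 20, 45]
print(f"Estimated power-law exponent: {power_law_alpha(degrees, k_min=2):.2f}")
```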

11.
JAMA Netw Open ; 6(10): e2336383, 2023 10 02.
Article in English | MEDLINE | ID: mdl-37812421

ABSTRACT

Importance: US health professionals devote a large amount of effort to engaging with patients' electronic health records (EHRs) to deliver care. It is unknown whether patients with different racial and ethnic backgrounds receive equal EHR engagement. Objective: To investigate whether there are differences in the level of health professionals' EHR engagement for hospitalized patients according to race or ethnicity during inpatient care. Design, Setting, and Participants: This cross-sectional study analyzed EHR access log data from 2 major medical institutions, Vanderbilt University Medical Center (VUMC) and Northwestern Medicine (NW Medicine), over a 3-year period from January 1, 2018, to December 31, 2020. The study included all adult patients (aged ≥18 years) who were discharged alive after hospitalization for at least 24 hours. The data were analyzed between August 15, 2022, and March 15, 2023. Exposures: The actions of health professionals in each patient's EHR were based on EHR access log data. Covariates included patients' demographic information, socioeconomic characteristics, and comorbidities. Main Outcomes and Measures: The primary outcome was the quantity of EHR engagement, as defined by the average number of EHR actions performed by health professionals within a patient's EHR per hour during the patient's hospital stay. Proportional odds logistic regression was applied based on outcome quartiles. Results: A total of 243 416 adult patients were included from VUMC (mean [SD] age, 51.7 [19.2] years; 54.9% female and 45.1% male; 14.8% Black, 4.9% Hispanic, 77.7% White, and 2.6% other races and ethnicities) and NW Medicine (mean [SD] age, 52.8 [20.6] years; 65.2% female and 34.8% male; 11.7% Black, 12.1% Hispanic, 69.2% White, and 7.0% other races and ethnicities). When combining Black, Hispanic, or other race and ethnicity patients into 1 group, these patients were significantly less likely to receive a higher amount of EHR engagement compared with White patients (adjusted odds ratios, 0.86 [95% CI, 0.83-0.88; P < .001] for VUMC and 0.90 [95% CI, 0.88-0.92; P < .001] for NW Medicine). However, a reduction in this difference was observed from 2018 to 2020. Conclusions and Relevance: In this cross-sectional study of inpatient EHR engagement, the findings highlight differences in how health professionals distribute their efforts to patients' EHRs, as well as a method to measure these differences. Further investigations are needed to determine whether and how EHR engagement differences are correlated with health care outcomes.


Subjects
Electronic Health Records, Ethnicity, Healthcare Disparities, Adult, Female, Humans, Male, Middle Aged, Black or African American, Cross-Sectional Studies, Electronic Health Records/statistics & numerical data, White People, Hospitalization/statistics & numerical data, Attitude of Health Personnel, Aged, Healthcare Disparities/ethnology, Healthcare Disparities/statistics & numerical data, Time Factors
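A small sketch of the outcome construction described in entry 11: EHR engagement as health professional actions per hour of hospital stay, binned into quartiles for a proportional-odds model. The access-log totals below are hypothetical.

```python
import numpy as np

# Hypothetical per-patient totals derived from EHR access logs.
actions = np.array([1200, 300, 560, 2400, 150, 800])          # EHR actions by health professionals
stay_hours = np.array([72.0, 30.0, 48.0, 120.0, 26.0, 55.0])  # hospital stay length in hours

engagement = actions / stay_hours  # actions per hour: the primary outcome in the abstract

# Assign each patient to an engagement quartile (1 = lowest, 4 = highest),
# which would then serve as the ordinal outcome in a proportional-odds model.
cutpoints = np.quantile(engagement, [0.25, 0.5, 0.75])
quartile = np.digitize(engagement, cutpoints) + 1

for e, q in zip(engagement, quartile):
    print(f"{e:5.1f} actions/hour -> quartile {q}")
```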
12.
medRxiv ; 2023 Sep 07.
Article in English | MEDLINE | ID: mdl-37745352

ABSTRACT

Background: Many myths regarding Alzheimer's disease (AD) circulate on the Internet, exhibiting varying degrees of accuracy, inaccuracy, and misinformation. Large language models such as ChatGPT may be a useful tool to help assess these myths for veracity, but they can also induce misinformation themselves. The objective of this study is to assess ChatGPT's ability to identify and address AD myths with reliable information. Methods: We conducted a cross-sectional study of clinicians' evaluations of ChatGPT (GPT 4.0) responses to 20 selected AD myths. We prompted ChatGPT to express its opinion on each myth and then requested that it rephrase its explanation in simplified language that could be more readily understood by individuals with a middle school education. We implemented a survey using REDCap to determine the degree to which clinicians agreed with the accuracy of each ChatGPT explanation and the degree to which the simplified rewriting was readable and retained the message of the original. We also collected their explanations for any disagreement with ChatGPT's responses. We used five-point Likert-type scales with scores ranging from -2 to 2 to quantify clinicians' agreement in each aspect of the evaluation. Results: The clinicians (n=11) were generally satisfied with ChatGPT's explanations, with a mean (SD) score of 1.0 (±0.3) across the 20 myths. While ChatGPT correctly identified all 20 myths as inaccurate, some clinicians disagreed with its explanations for 7 of the myths. Overall, 9 of the 11 professionals either agreed or strongly agreed that ChatGPT has the potential to provide meaningful explanations of certain myths. Conclusions: The majority of surveyed healthcare professionals acknowledged the potential value of ChatGPT in mitigating AD misinformation. However, the need for more refined and detailed explanations of the disease's mechanisms and treatments was highlighted.

13.
bioRxiv ; 2023 Aug 26.
Article in English | MEDLINE | ID: mdl-37609241

ABSTRACT

Predictive models in biomedicine need to ensure equitable and reliable outcomes for the populations they are applied to. Unfortunately, biases in medical predictions can lead to unfair treatment and widening disparities, underscoring the need for effective techniques to address these issues. To enhance fairness, we introduce a framework based on a Multiple Domain Adversarial Neural Network (MDANN), which incorporates multiple adversarial components. In an MDANN, an adversarial module is applied to learn a fair pattern by back-propagating negative gradients across multiple sensitive features (i.e., characteristics of individuals that should not be used to discriminate unfairly between individuals when making predictions or decisions). We leverage loss functions based on the Area Under the Receiver Operating Characteristic Curve (AUC) to address class imbalance, promoting equitable classification performance for minority groups (e.g., subsets of the population that are underrepresented or disadvantaged). Moreover, we utilize pre-trained convolutional autoencoders (CAEs) to extract deep representations of data, aiming to enhance prediction accuracy and fairness. Combining these mechanisms, we alleviate biases and disparities to provide reliable and equitable disease prediction. We empirically demonstrate that the MDANN approach leads to better accuracy and fairness in predicting disease progression using brain imaging data for Alzheimer's Disease and Autism populations than state-of-the-art techniques.
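The adversarial component described in entry 13 depends on reversing gradients from a sensitive-attribute discriminator during back-propagation. Below is a generic gradient-reversal layer sketch in PyTorch; it illustrates only that mechanism and is not the authors' MDANN architecture, AUC-based losses, or training setup.

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies gradients by -lambda in the backward pass."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse (and scale) the gradient flowing back to the shared encoder, so the
        # encoder learns features the adversary cannot use to recover the sensitive attribute.
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

# Usage sketch: encoder features pass through grad_reverse before an adversarial head
# that predicts a sensitive attribute.
features = torch.randn(8, 16, requires_grad=True)
adversary = torch.nn.Linear(16, 2)
logits = adversary(grad_reverse(features, lambd=0.5))
```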

14.
Annu Rev Biomed Data Sci ; 6: 443-464, 2023 08 10.
Article in English | MEDLINE | ID: mdl-37561600

ABSTRACT

The All of Us Research Program's Data and Research Center (DRC) was established to help acquire, curate, and provide access to one of the world's largest and most diverse datasets for precision medicine research. Already, over 500,000 participants are enrolled in All of Us, 80% of whom are underrepresented in biomedical research, and data are being analyzed by a community of over 2,300 researchers. The DRC created this thriving data ecosystem by collaborating with engaged participants, innovative program partners, and empowered researchers. In this review, we first describe how the DRC is organized to meet the needs of this broad group of stakeholders. We then outline guiding principles, common challenges, and innovative approaches used to build the All of Us data ecosystem. Finally, we share lessons learned to help others navigate important decisions and trade-offs in building a modern biomedical data platform.


Subjects
Biomedical Research, Population Health, Humans, Ecosystem, Precision Medicine
15.
Res Sq ; 2023 Jul 14.
Article in English | MEDLINE | ID: mdl-37503019

ABSTRACT

Drug repurposing represents an attractive alternative to the costly and time-consuming process of new drug development, particularly for serious, widespread conditions with limited effective treatments, such as Alzheimer's disease (AD). Emerging generative artificial intelligence (GAI) technologies like ChatGPT offer the promise of expediting the review and summary of scientific knowledge. To examine the feasibility of using GAI for identifying drug repurposing candidates, we iteratively tasked ChatGPT with proposing the twenty most promising drugs for repurposing in AD, and tested the top ten for risk of incident AD in exposed and unexposed individuals over age 65 in two large clinical datasets: 1) Vanderbilt University Medical Center and 2) the All of Us Research Program. Among the candidates suggested by ChatGPT, metformin, simvastatin, and losartan were associated with lower AD risk in meta-analysis. These findings suggest GAI technologies can assimilate scientific insights from an extensive Internet-based search space, helping to prioritize drug repurposing candidates and facilitate the treatment of diseases.

16.
medRxiv ; 2023 Jul 08.
Article in English | MEDLINE | ID: mdl-37461512

ABSTRACT

Drug repurposing represents an attractive alternative to the costly and time-consuming process of new drug development, particularly for serious, widespread conditions with limited effective treatments, such as Alzheimer's disease (AD). Emerging generative artificial intelligence (GAI) technologies like ChatGPT offer the promise of expediting the review and summary of scientific knowledge. To examine the feasibility of using GAI for identifying drug repurposing candidates, we iteratively tasked ChatGPT with proposing the twenty most promising drugs for repurposing in AD, and tested the top ten for risk of incident AD in exposed and unexposed individuals over age 65 in two large clinical datasets: 1) Vanderbilt University Medical Center and 2) the All of Us Research Program. Among the candidates suggested by ChatGPT, metformin, simvastatin, and losartan were associated with lower AD risk in meta-analysis. These findings suggest GAI technologies can assimilate scientific insights from an extensive Internet-based search space, helping to prioritize drug repurposing candidates and facilitate the treatment of diseases.

17.
IEEE Trans Nanobioscience ; 22(4): 808-817, 2023 10.
Article in English | MEDLINE | ID: mdl-37289605

ABSTRACT

Sharing individual-level pandemic data is essential for accelerating the understanding of a disease. For example, COVID-19 data have been widely collected to support public health surveillance and research. In the United States, these data are typically de-identified before publication to protect the privacy of the corresponding individuals. However, current data publishing approaches for this type of data, such as those adopted by the U.S. Centers for Disease Control and Prevention (CDC), have not flexed over time to account for the dynamic nature of infection rates. Thus, the policies generated by these strategies have the potential either to raise privacy risks or to overprotect the data and impair its utility (or usability). To optimize the tradeoff between privacy risk and data utility, we introduce a game theoretic model that adaptively generates policies for the publication of individual-level COVID-19 data according to infection dynamics. We model the data publishing process as a two-player Stackelberg game between a data publisher and a data recipient and then search for the best strategy for the publisher. In this game, we consider (1) the average performance of predicting future case counts; and (2) the mutual information between the original data and the released data. We use COVID-19 case data from Vanderbilt University Medical Center from March 2020 to December 2021 to demonstrate the effectiveness of the new model. The results indicate that the game theoretic model outperforms all state-of-the-art baseline approaches, including those adopted by the CDC, while maintaining low privacy risk. We further perform extensive sensitivity analyses to show that our findings are robust to order-of-magnitude parameter fluctuations.


Subjects
COVID-19, Privacy, Humans, United States/epidemiology, Pandemics, COVID-19/epidemiology, Publishing
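A toy sketch of the Stackelberg structure described in entry 17: the publisher (leader) enumerates candidate release policies, anticipates the recipient's (follower's) best response to each, and picks the policy that maximizes the publisher's payoff given that response. The policy names and payoff numbers are invented purely to show the search pattern; the paper's actual payoffs involve forecasting performance and mutual information.

```python
# Hypothetical payoff tables: payoffs[policy][recipient_action] -> (publisher_utility, recipient_utility).
payoffs = {
    "coarse_weekly":  {"attack": (0.60, 0.10), "use_benignly": (0.70, 0.50)},
    "fine_daily":     {"attack": (0.20, 0.60), "use_benignly": (0.90, 0.80)},
    "suppress_small": {"attack": (0.55, 0.05), "use_benignly": (0.65, 0.40)},
}

def best_publisher_policy(payoffs):
    best = None
    for policy, table in payoffs.items():
        # Follower best response: the recipient maximizes its own utility under this policy.
        response = max(table, key=lambda action: table[action][1])
        publisher_utility = table[response][0]
        if best is None or publisher_utility > best[2]:
            best = (policy, response, publisher_utility)
    return best

policy, response, utility = best_publisher_policy(payoffs)
print(f"Leader plays {policy!r}; follower responds {response!r}; publisher utility {utility}")
```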
18.
Genome Res ; 33(7): 1113-1123, 2023 07.
Article in English | MEDLINE | ID: mdl-37217251

ABSTRACT

The collection and sharing of genomic data are becoming increasingly commonplace in research, clinical, and direct-to-consumer settings. The computational protocols typically adopted to protect individual privacy include sharing summary statistics, such as allele frequencies, or limiting query responses to the presence/absence of alleles of interest using web services called Beacons. However, even such limited releases are susceptible to likelihood ratio-based membership-inference attacks. Several approaches have been proposed to preserve privacy, which either suppress a subset of genomic variants or modify query responses for specific variants (e.g., adding noise, as in differential privacy). However, many of these approaches result in a significant utility loss, either suppressing many variants or adding a substantial amount of noise. In this paper, we introduce optimization-based approaches to explicitly trade off the utility of summary data or Beacon responses and privacy with respect to membership-inference attacks based on likelihood ratios, combining variant suppression and modification. We consider two attack models. In the first, an attacker applies a likelihood ratio test to make membership-inference claims. In the second model, an attacker uses a threshold that accounts for the effect of the data release on the separation in scores between individuals in the data set and those who are not. We further introduce highly scalable approaches for approximately solving the privacy-utility tradeoff problem when information is in the form of either summary statistics or presence/absence queries. Finally, we show that the proposed approaches outperform the state of the art in both utility and privacy through an extensive evaluation with public data sets.


Subjects
Information Dissemination, Privacy, Humans, Information Dissemination/methods, Genomics, Gene Frequency, Alleles
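To make the attack model in entry 18 concrete, here is a simplified likelihood-ratio membership test against released allele frequencies. It is not the paper's exact formulation (which also covers Beacon presence/absence queries and an adaptive-threshold attacker): under Hardy-Weinberg assumptions, a target's genotypes are scored under the released pool's allele frequencies versus a reference population's frequencies. All arrays below are hypothetical.

```python
import numpy as np
from scipy.stats import binom

# Hypothetical data: target genotypes (0/1/2 alternate-allele counts) at m SNPs,
# allele frequencies released for the study pool, and reference population frequencies.
genotypes = np.array([1, 0, 2, 1, 0, 1, 2, 0])
pool_af = np.array([0.42, 0.10, 0.76, 0.51, 0.08, 0.46, 0.81, 0.05])
ref_af = np.array([0.30, 0.12, 0.60, 0.40, 0.10, 0.35, 0.65, 0.07])

# Log-likelihood of the genotypes under each frequency set, assuming Hardy-Weinberg
# equilibrium: P(g | f) = Binomial(n=2, p=f).pmf(g).
ll_pool = np.sum(binom.logpmf(genotypes, 2, pool_af))
ll_ref = np.sum(binom.logpmf(genotypes, 2, ref_af))

llr = ll_pool - ll_ref  # large positive values suggest membership in the released pool
print(f"Log-likelihood ratio: {llr:.2f}")
```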
19.
Sci Rep ; 13(1): 6932, 2023 04 28.
Article in English | MEDLINE | ID: mdl-37117219

ABSTRACT

As recreational genomics continues to grow in its popularity, many people are afforded the opportunity to share their genomes in exchange for various services, including third-party interpretation (TPI) tools, to understand their predisposition to health problems and, based on genome similarity, to find extended family members. At the same time, these services have increasingly been reused by law enforcement to track down potential criminals through family members who disclose their genomic information. While it has been observed that many potential users shy away from such data sharing when they learn that their privacy cannot be assured, it remains unclear how potential users' valuations of the service will affect a population's behavior. In this paper, we present a game theoretic framework to model interdependent privacy challenges in genomic data sharing online. Through simulations, we find that in addition to the boundary cases when (1) no player and (2) every player joins, there exist pure-strategy Nash equilibria when a relatively small portion of players choose to join the genomic database. The result is consistent under different parametric settings. We further examine the stability of Nash equilibria and illustrate that the only equilibrium that is resistant to a random dropping of players is when all players join the genomic database. Finally, we show that when players consider the impact that their data sharing may have on their relatives, the only pure strategy Nash equilibria are when either no player or every player shares their genomic data.


Subjects
Non-alcoholic Fatty Liver Disease, Privacy, Humans, Information Dissemination, Family, Genomics
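A toy sketch of locating pure-strategy Nash equilibria in a symmetric data-sharing game like the one in entry 19 using best-response dynamics. The utility function (a benefit that grows with overall participation minus a fixed privacy cost) is an invented stand-in for the paper's interdependent-privacy model, which also accounts for relatives' exposure.

```python
def utility(join, n_joined, n_players, benefit=1.0, privacy_cost=0.4):
    """Hypothetical payoff: joiners gain a benefit proportional to overall participation,
    minus a privacy cost; non-joiners get nothing."""
    if not join:
        return 0.0
    return benefit * (n_joined / n_players) - privacy_cost

def best_response_dynamics(n_players, start_joined, max_rounds=100):
    joined = set(start_joined)
    for _ in range(max_rounds):
        changed = False
        for player in range(n_players):
            others = len(joined - {player})
            should_join = utility(True, others + 1, n_players) > utility(False, others, n_players)
            if should_join and player not in joined:
                joined.add(player)
                changed = True
            elif not should_join and player in joined:
                joined.discard(player)
                changed = True
        if not changed:  # no player wants to deviate: a pure-strategy Nash equilibrium
            return joined
    return joined

# Two boundary equilibria emerge depending on the starting point, echoing the abstract.
print(sorted(best_response_dynamics(n_players=10, start_joined=range(6))))  # everyone joins
print(sorted(best_response_dynamics(n_players=10, start_joined=[])))        # nobody joins
```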
20.
J Med Internet Res ; 25: e43251, 2023 03 24.
Article in English | MEDLINE | ID: mdl-36961506

ABSTRACT

The potential of artificial intelligence (AI) to reduce health care disparities and inequities is recognized, but it can also exacerbate these issues if not implemented in an equitable manner. This perspective identifies potential biases in each stage of the AI life cycle, including data collection, annotation, machine learning model development, evaluation, deployment, operationalization, monitoring, and feedback integration. To mitigate these biases, we suggest involving a diverse group of stakeholders, using human-centered AI principles. Human-centered AI can help ensure that AI systems are designed and used in a way that benefits patients and society, which can reduce health disparities and inequities. By recognizing and addressing biases at each stage of the AI life cycle, AI can achieve its potential in health care.


Subjects
Artificial Intelligence, Machine Learning, Humans, Healthcare Disparities, Bias