1.
medRxiv ; 2024 Apr 10.
Article in English | MEDLINE | ID: mdl-38645167

ABSTRACT

Apart from ancestry, personal or environmental covariates may contribute to differences in polygenic score (PGS) performance. We analyzed effects of covariate stratification and interaction on body mass index (BMI) PGS (PGSBMI) across four cohorts of European (N=491,111) and African (N=21,612) ancestry. Stratifying on binary covariates and quintiles for continuous covariates, 18/62 covariates had significant and replicable R2 differences among strata. Covariates with the largest differences included age, sex, blood lipids, physical activity, and alcohol consumption, with R2 being nearly double between best and worst performing quintiles for certain covariates. 28 covariates had significant PGSBMI-covariate interaction effects, modifying PGSBMI effects by nearly 20% per standard deviation change. We observed overlap between covariates that had significant R2 differences among strata and interaction effects - across all covariates, their main effects on BMI were correlated with their maximum R2 differences and interaction effects (0.56 and 0.58, respectively), suggesting high-PGSBMI individuals have highest R2 and increase in PGS effect. Using quantile regression, we show the effect of PGSBMI increases as BMI itself increases, and that these differences in effects are directly related to differences in R2 when stratifying by different covariates. Given significant and replicable evidence for context-specific PGSBMI performance and effects, we investigated ways to increase model performance taking into account non-linear effects. Machine learning models (neural networks) increased relative model R2 (mean 23%) across datasets. Finally, creating PGSBMI directly from GxAge GWAS effects increased relative R2 by 7.8%. 
These results demonstrate that certain covariates, especially those most associated with BMI, significantly affect both PGSBMI performance and effects across diverse cohorts and ancestries, and we provide avenues to improve model performance that consider these effects.

2.
Article in English | MEDLINE | ID: mdl-38613820

ABSTRACT

OBJECTIVES: Phenotyping is a core task in observational health research utilizing electronic health records (EHRs). Developing an accurate algorithm demands substantial input from domain experts, involving extensive literature review and evidence synthesis. This burdensome process limits scalability and delays knowledge discovery. We investigate the potential for leveraging large language models (LLMs) to enhance the efficiency of EHR phenotyping by generating high-quality algorithm drafts. MATERIALS AND METHODS: We prompted four LLMs (GPT-4 and GPT-3.5 of ChatGPT, Claude 2, and Bard) in October 2023, asking them to generate executable phenotyping algorithms in the form of SQL queries adhering to a common data model (CDM) for three phenotypes (ie, type 2 diabetes mellitus, dementia, and hypothyroidism). Three phenotyping experts evaluated the returned algorithms across several critical metrics. We further implemented the top-rated algorithms and compared them against clinician-validated phenotyping algorithms from the Electronic Medical Records and Genomics (eMERGE) network. RESULTS: GPT-4 and GPT-3.5 exhibited significantly higher overall expert evaluation scores in instruction following, algorithmic logic, and SQL executability, when compared to Claude 2 and Bard. Although GPT-4 and GPT-3.5 effectively identified relevant clinical concepts, they exhibited immature capability in organizing phenotyping criteria with the proper logic, leading to phenotyping algorithms that were either excessively restrictive (with low recall) or overly broad (with low positive predictive values). CONCLUSION: GPT versions 3.5 and 4 are capable of drafting phenotyping algorithms by identifying relevant clinical criteria aligned with a CDM. However, expertise in informatics and clinical experience is still required to assess and further refine generated algorithms.

3.
Article in English | MEDLINE | ID: mdl-38497958

ABSTRACT

OBJECTIVE: This study aimed to develop and assess the performance of fine-tuned large language models for generating responses to patient messages sent via an electronic health record patient portal. MATERIALS AND METHODS: Utilizing a dataset of messages and responses extracted from the patient portal at a large academic medical center, we developed a model (CLAIR-Short) based on a pre-trained large language model (LLaMA-65B). In addition, we used the OpenAI API to update physician responses from an open-source dataset into a format with informative paragraphs that offered patient education while emphasizing empathy and professionalism. By combining with this dataset, we further fine-tuned our model (CLAIR-Long). To evaluate fine-tuned models, we used 10 representative patient portal questions in primary care to generate responses. We asked primary care physicians to review generated responses from our models and ChatGPT and to rate them for empathy, responsiveness, accuracy, and usefulness. RESULTS: The dataset consisted of 499 794 pairs of patient messages and corresponding responses from the patient portal, with 5000 patient messages and ChatGPT-updated responses from an online platform. Four primary care physicians participated in the survey. CLAIR-Short exhibited the ability to generate concise responses similar to providers' responses. CLAIR-Long responses provided increased patient educational content compared to CLAIR-Short and were rated similarly to ChatGPT's responses, receiving positive evaluations for responsiveness, empathy, and accuracy, while receiving a neutral rating for usefulness. CONCLUSION: This subjective analysis suggests that leveraging large language models to generate responses to patient messages demonstrates significant potential in facilitating communication between patients and healthcare providers.

4.
J Am Med Inform Assoc ; 31(4): 968-974, 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38383050

ABSTRACT

OBJECTIVE: To develop and evaluate a data-driven process to generate suggestions for improving alert criteria using explainable artificial intelligence (XAI) approaches. METHODS: We extracted data on alerts generated from January 1, 2019 to December 31, 2020, at Vanderbilt University Medical Center. We developed machine learning models to predict user responses to alerts. We applied XAI techniques to generate global explanations and local explanations. We evaluated the generated suggestions by comparing with alert's historical change logs and stakeholder interviews. Suggestions that either matched (or partially matched) changes already made to the alert or were considered clinically correct were classified as helpful. RESULTS: The final dataset included 2 991 823 firings with 2689 features. Among the 5 machine learning models, the LightGBM model achieved the highest Area under the ROC Curve: 0.919 [0.918, 0.920]. We identified 96 helpful suggestions. A total of 278 807 firings (9.3%) could have been eliminated. Some of the suggestions also revealed workflow and education issues. CONCLUSION: We developed a data-driven process to generate suggestions for improving alert criteria using XAI techniques. Our approach could identify improvements regarding clinical decision support (CDS) that might be overlooked or delayed in manual reviews. It also unveils a secondary purpose for the XAI: to improve quality by discovering scenarios where CDS alerts are not accepted due to workflow, education, or staffing issues.


Subjects
Artificial Intelligence , Decision Support Systems, Clinical , Humans , Machine Learning , Academic Medical Centers , Educational Status
5.
NPJ Digit Med ; 7(1): 46, 2024 Feb 26.
Article in English | MEDLINE | ID: mdl-38409350

ABSTRACT

Drug repurposing represents an attractive alternative to the costly and time-consuming process of new drug development, particularly for serious, widespread conditions with limited effective treatments, such as Alzheimer's disease (AD). Emerging generative artificial intelligence (GAI) technologies like ChatGPT offer the promise of expediting the review and summary of scientific knowledge. To examine the feasibility of using GAI for identifying drug repurposing candidates, we iteratively tasked ChatGPT with proposing the twenty most promising drugs for repurposing in AD, and tested the top ten for risk of incident AD in exposed and unexposed individuals over age 65 in two large clinical datasets: (1) Vanderbilt University Medical Center and (2) the All of Us Research Program. Among the candidates suggested by ChatGPT, metformin, simvastatin, and losartan were associated with lower AD risk in meta-analysis. These findings suggest GAI technologies can assimilate scientific insights from an extensive Internet-based search space, helping to prioritize drug repurposing candidates and facilitate the treatment of diseases.

6.
medRxiv ; 2024 Feb 26.
Article in English | MEDLINE | ID: mdl-38196578

ABSTRACT

Objectives: Phenotyping is a core task in observational health research utilizing electronic health records (EHRs). Developing an accurate algorithm demands substantial input from domain experts, involving extensive literature review and evidence synthesis. This burdensome process limits scalability and delays knowledge discovery. We investigate the potential for leveraging large language models (LLMs) to enhance the efficiency of EHR phenotyping by generating high-quality algorithm drafts. Materials and Methods: We prompted four LLMs (GPT-4 and GPT-3.5 of ChatGPT, Claude 2, and Bard) in October 2023, asking them to generate executable phenotyping algorithms in the form of SQL queries adhering to a common data model (CDM) for three phenotypes (i.e., type 2 diabetes mellitus, dementia, and hypothyroidism). Three phenotyping experts evaluated the returned algorithms across several critical metrics. We further implemented the top-rated algorithms and compared them against clinician-validated phenotyping algorithms from the Electronic Medical Records and Genomics (eMERGE) network. Results: GPT-4 and GPT-3.5 exhibited significantly higher overall expert evaluation scores in instruction following, algorithmic logic, and SQL executability, when compared to Claude 2 and Bard. Although GPT-4 and GPT-3.5 effectively identified relevant clinical concepts, they exhibited immature capability in organizing phenotyping criteria with the proper logic, leading to phenotyping algorithms that were either excessively restrictive (with low recall) or overly broad (with low positive predictive values). Conclusion: GPT versions 3.5 and 4 are capable of drafting phenotyping algorithms by identifying relevant clinical criteria aligned with a CDM. However, expertise in informatics and clinical experience is still required to assess and further refine generated algorithms.

7.
Genet Med ; 26(4): 101074, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38243783

ABSTRACT

PURPOSE: Diagnostic delay in monogenic disease is reportedly common. We conducted a scoping review investigating variability in study design, results, and conclusions. METHODS: We searched the academic literature on January 17, 2023, for original peer-reviewed journal and conference articles that quantified diagnostic delay in monogenic disease. We abstracted the reported diagnostic delay, relevant study design features, and definitions. RESULTS: Our search identified 259 articles quantifying diagnostic delay in 111 distinct monogenic diseases. Median reported diagnostic delay for all studies collectively in monogenic diseases was 5.0 years (IQR 2-10). There was major variation in the reported delay within individual monogenic diseases. Shorter delay was associated with disorders of childhood metabolism, immunity, and development. The majority (67.6%) of articles that studied delay reported an improvement with calendar time. Study design and definitions of delay were highly heterogeneous. Three gaps were identified: (1) no studies were conducted in the least developed countries, (2) delay has not been studied for the majority of known genetic diseases, or (3) for the most prevalent genetic diseases. CONCLUSION: Heterogeneous study design and definitions of diagnostic delay inhibit comparison across studies. Future efforts should focus on standardizing delay measurements, while expanding the research to low-income countries.


Subjects
Delayed Diagnosis , Research Design , Humans , Developing Countries
10.
Am J Hum Genet ; 110(11): 1950-1958, 2023 11 02.
Article in English | MEDLINE | ID: mdl-37883979

ABSTRACT

As large-scale genomic screening becomes increasingly prevalent, understanding the influence of actionable results on healthcare utilization is key to estimating the potential long-term clinical impact. The eMERGE network sequenced individuals for actionable genes in multiple genetic conditions and returned results to individuals, providers, and the electronic health record. Differences in recommended health services (laboratory, imaging, and procedural testing) delivered within 12 months of return were compared among individuals with pathogenic or likely pathogenic (P/LP) findings to matched individuals with negative findings before and after return of results. Of 16,218 adults, 477 unselected individuals were found to have a monogenic risk for arrhythmia (n = 95), breast cancer (n = 96), cardiomyopathy (n = 95), colorectal cancer (n = 105), or familial hypercholesterolemia (n = 86). Individuals with P/LP results more frequently received services after return (43.8%) compared to before return (25.6%) of results and compared to individuals with negative findings (24.9%; p < 0.0001). The annual cost of qualifying healthcare services increased from an average of $162 before return to $343 after return of results among the P/LP group (p < 0.0001); differences in the negative group were non-significant. The mean difference-in-differences was $149 (p < 0.0001), which describes the increased cost within the P/LP group corrected for cost changes in the negative group. When stratified by individual conditions, significant cost differences were observed for arrhythmia, breast cancer, and cardiomyopathy. In conclusion, less than half of individuals received billed health services after monogenic return, which modestly increased healthcare costs for payors in the year following return.
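The difference-in-differences figure reported in this abstract can be read as simple arithmetic. The following is a minimal Python sketch, assuming an unadjusted difference of mean changes (the study's actual estimator may be model-based and is not described in the abstract). The function name `diff_in_diff` is hypothetical, and the change in the negative group is back-calculated from the reported $149 rather than taken from the abstract.

```python
def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Unadjusted difference-in-differences of group means."""
    return (treated_after - treated_before) - (control_after - control_before)


# Mean annual cost of qualifying services in the P/LP group (USD), from the abstract.
plp_before, plp_after = 162.0, 343.0
reported_did = 149.0  # reported mean difference-in-differences (USD)

# Raw before/after change within the P/LP group.
plp_change = plp_after - plp_before

# Assumption: with an unadjusted estimator, the negative group's change is
# whatever remains after subtracting the reported DiD from the P/LP change.
implied_negative_change = plp_change - reported_did

# Consistency check; the control baseline of 0.0 is arbitrary because only
# the before/after change of the control group enters the formula.
did = diff_in_diff(plp_before, plp_after, 0.0, implied_negative_change)

print(plp_change, implied_negative_change, did)
```

Under that assumption, the P/LP group's $181 increase would correspond to a change of roughly $32 in the negative group, which is consistent with the abstract describing that group's differences as non-significant.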


Subjects
Breast Neoplasms , Cardiomyopathies , Adult , Humans , Female , Prospective Studies , Patient Acceptance of Health Care , Arrhythmias, Cardiac , Breast Neoplasms/genetics , Cardiomyopathies/genetics
11.
Am J Hum Genet ; 110(9): 1522-1533, 2023 09 07.
Article in English | MEDLINE | ID: mdl-37607538

ABSTRACT

Population-scale biobanks linked to electronic health record data provide vast opportunities to extend our knowledge of human genetics and discover new phenotype-genotype associations. Given their dense phenotype data, biobanks can also facilitate replication studies on a phenome-wide scale. Here, we introduce the phenotype-genotype reference map (PGRM), a set of 5,879 genetic associations from 523 GWAS publications that can be used for high-throughput replication experiments. PGRM phenotypes are standardized as phecodes, ensuring interoperability between biobanks. We applied the PGRM to five ancestry-specific cohorts from four independent biobanks and found evidence of robust replications across a wide array of phenotypes. We show how the PGRM can be used to detect data corruption and to empirically assess parameters for phenome-wide studies. Finally, we use the PGRM to explore factors associated with replicability of GWAS results.


Subjects
Biological Specimen Banks , Data Science , Humans , Phenomics , Phenotype , Genotype
12.
medRxiv ; 2023 Jul 16.
Article in English | MEDLINE | ID: mdl-37503263

ABSTRACT

Objective: This study aimed to develop and assess the performance of fine-tuned large language models for generating responses to patient messages sent via an electronic health record patient portal. Methods: Utilizing a dataset of messages and responses extracted from the patient portal at a large academic medical center, we developed a model (CLAIR-Short) based on a pre-trained large language model (LLaMA-65B). In addition, we used the OpenAI API to update physician responses from an open-source dataset into a format with informative paragraphs that offered patient education while emphasizing empathy and professionalism. By combining with this dataset, we further fine-tuned our model (CLAIR-Long). To evaluate the fine-tuned models, we used ten representative patient portal questions in primary care to generate responses. We asked primary care physicians to review generated responses from our models and ChatGPT and to rate them for empathy, responsiveness, accuracy, and usefulness. Results: The dataset consisted of a total of 499,794 pairs of patient messages and corresponding responses from the patient portal, with 5,000 patient messages and ChatGPT-updated responses from an online platform. Four primary care physicians participated in the survey. CLAIR-Short exhibited the ability to generate concise responses similar to providers' responses. CLAIR-Long responses provided increased patient educational content compared to CLAIR-Short and were rated similarly to ChatGPT's responses, receiving positive evaluations for responsiveness, empathy, and accuracy, while receiving a neutral rating for usefulness. Conclusion: Leveraging large language models to generate responses to patient messages demonstrates significant potential in facilitating communication between patients and primary care providers.

14.
medRxiv ; 2023 Jul 08.
Article in English | MEDLINE | ID: mdl-37461512

ABSTRACT

Drug repurposing represents an attractive alternative to the costly and time-consuming process of new drug development, particularly for serious, widespread conditions with limited effective treatments, such as Alzheimer's disease (AD). Emerging generative artificial intelligence (GAI) technologies like ChatGPT offer the promise of expediting the review and summary of scientific knowledge. To examine the feasibility of using GAI for identifying drug repurposing candidates, we iteratively tasked ChatGPT with proposing the twenty most promising drugs for repurposing in AD, and tested the top ten for risk of incident AD in exposed and unexposed individuals over age 65 in two large clinical datasets: 1) Vanderbilt University Medical Center and 2) the All of Us Research Program. Among the candidates suggested by ChatGPT, metformin, simvastatin, and losartan were associated with lower AD risk in meta-analysis. These findings suggest GAI technologies can assimilate scientific insights from an extensive Internet-based search space, helping to prioritize drug repurposing candidates and facilitate the treatment of diseases.

15.
Am J Hum Genet ; 110(7): 1021-1033, 2023 07 06.
Article in English | MEDLINE | ID: mdl-37343562

ABSTRACT

Two major goals of the Electronic Medical Record and Genomics (eMERGE) Network are to learn how best to return research results to patient/participants and the clinicians who care for them and also to assess the impact of placing these results in clinical care. Yet since its inception, the Network has confronted a host of challenges in achieving these goals, many of which had ethical, legal, or social implications (ELSIs) that required consideration. Here, we share impediments we encountered in recruiting participants, returning results, and assessing their impact, all of which affected our ability to achieve the goals of eMERGE, as well as the steps we took to attempt to address these obstacles. We divide the domains in which we experienced challenges into four broad categories: (1) study design, including recruitment of more diverse groups; (2) consent; (3) returning results to participants and their health care providers (HCPs); and (4) assessment of follow-up care of participants and measuring the impact of research on participants and their families. Since most phases of eMERGE have included children as well as adults, we also address the particular ELSI posed by including pediatric populations in this research. We make specific suggestions for improving translational genomic research to ensure that future projects can effectively return results and assess their impact on patient/participants and providers if the goals of genomic-informed medicine are to be achieved.


Subjects
Electronic Health Records , Genomics , Child , Adult , Humans , Genome , Translational Research, Biomedical , Population Groups
16.
Ann Intern Med ; 176(5): 585-595, 2023 05.
Article in English | MEDLINE | ID: mdl-37155986

ABSTRACT

BACKGROUND: The cost-effectiveness of screening the U.S. population for Centers for Disease Control and Prevention (CDC) Tier 1 genomic conditions is unknown. OBJECTIVE: To estimate the cost-effectiveness of simultaneous genomic screening for Lynch syndrome (LS), hereditary breast and ovarian cancer syndrome (HBOC), and familial hypercholesterolemia (FH). DESIGN: Decision analytic Markov model. DATA SOURCES: Published literature. TARGET POPULATION: Separate age-based cohorts (ages 20 to 60 years at time of screening) of racially and ethnically representative U.S. adults. TIME HORIZON: Lifetime. PERSPECTIVE: U.S. health care payer. INTERVENTION: Population genomic screening using clinical sequencing with a restricted panel of high-evidence genes, cascade testing of first-degree relatives, and recommended preventive interventions for identified probands. OUTCOME MEASURES: Incident breast, ovarian, and colorectal cancer cases; incident cardiovascular events; quality-adjusted survival; and costs. RESULTS OF BASE-CASE ANALYSIS: Screening 100 000 unselected 30-year-olds resulted in 101 (95% uncertainty interval [UI], 77 to 127) fewer overall cancer cases and 15 (95% UI, 4 to 28) fewer cardiovascular events and an increase of 495 quality-adjusted life-years (QALYs) (95% UI, 401 to 757) at an incremental cost of $33.9 million (95% UI, $27.0 million to $41.1 million). The incremental cost-effectiveness ratio was $68 600 per QALY gained (95% UI, $41 800 to $88 900). RESULTS OF SENSITIVITY ANALYSIS: Screening 30-, 40-, and 50-year-old cohorts was cost-effective in 99%, 88%, and 19% of probabilistic simulations, respectively, at a $100 000-per-QALY threshold. The test costs at which screening 30-, 40-, and 50-year-olds reached the $100 000-per-QALY threshold were $413, $290, and $166, respectively. Variant prevalence and adherence to preventive interventions were also highly influential parameters. 
LIMITATIONS: Population averages for model inputs, which were derived predominantly from European populations, vary across ancestries and health care environments. CONCLUSION: Population genomic screening with a restricted panel of high-evidence genes associated with 3 CDC Tier 1 conditions is likely to be cost-effective in U.S. adults younger than 40 years if the testing cost is relatively low and probands have access to preventive interventions. PRIMARY FUNDING SOURCE: National Human Genome Research Institute.
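The incremental cost-effectiveness ratio (ICER) in this abstract is incremental cost divided by incremental QALYs. The short Python sketch below recomputes it from the rounded base-case point estimates; the function name `icer` is hypothetical, and the model's own unrounded inputs are not available here. The quotient of the rounded figures comes out slightly below the reported $68 600 per QALY, which may simply reflect rounding of the published inputs.

```python
def icer(incremental_cost, incremental_qalys):
    """Incremental cost-effectiveness ratio: extra cost per QALY gained."""
    if incremental_qalys == 0:
        raise ValueError("incremental QALYs must be non-zero")
    return incremental_cost / incremental_qalys


# Base-case point estimates for screening 100 000 unselected 30-year-olds.
incremental_cost_usd = 33.9e6  # reported incremental cost
qalys_gained = 495             # reported QALY gain
cohort_size = 100_000

base_case_icer = icer(incremental_cost_usd, qalys_gained)
cost_per_person_screened = incremental_cost_usd / cohort_size

print(f"ICER from rounded inputs: ${base_case_icer:,.0f} per QALY")
print(f"Incremental cost per person screened: ${cost_per_person_screened:,.0f}")
```

Comparing the result with a willingness-to-pay threshold (the abstract uses $100 000 per QALY) is then a single inequality, which is the comparison the probabilistic sensitivity analysis repeats across simulations.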


Subjects
Cardiovascular Diseases , Hyperlipoproteinemia Type II , Adult , Humans , Young Adult , Middle Aged , Cost-Effectiveness Analysis , Cost-Benefit Analysis , Metagenomics , Quality-Adjusted Life Years , Mass Screening
17.
Circ Genom Precis Med ; 16(2): e003816, 2023 04.
Article in English | MEDLINE | ID: mdl-37071725

ABSTRACT

BACKGROUND: The implications of secondary findings detected in large-scale sequencing projects remain uncertain. We assessed prevalence and penetrance of pathogenic familial hypercholesterolemia (FH) variants, their association with coronary heart disease (CHD), and 1-year outcomes following return of results in phase III of the electronic medical records and genomics network. METHODS: Adult participants (n=18 544) at 7 sites were enrolled in a prospective cohort study to assess the clinical impact of returning results from targeted sequencing of 68 actionable genes, including LDLR, APOB, and PCSK9. FH variant prevalence and penetrance (defined as low-density lipoprotein cholesterol >155 mg/dL) were estimated after excluding participants enrolled on the basis of hypercholesterolemia. Multivariable logistic regression was used to estimate the odds of CHD compared to age- and sex-matched controls without FH-associated variants. Process (eg, referral to a specialist or ordering new tests), intermediate (eg, new diagnosis of FH), and clinical (eg, treatment modification) outcomes within 1 year after return of results were ascertained by electronic health record review. RESULTS: The prevalence of FH-associated pathogenic variants was 1 in 188 (69 of 13,019 unselected participants). Penetrance was 87.5%. The presence of an FH variant was associated with CHD (odds ratio, 3.02 [2.00-4.53]) and premature CHD (odds ratio, 3.68 [2.34-5.78]). At least 1 outcome occurred in 92% of participants; 44% received a new diagnosis of FH and 26% had treatment modified following return of results. CONCLUSIONS: In a multisite cohort of electronic health record-linked biobanks, monogenic FH was prevalent, penetrant, and associated with presence of CHD. Nearly half of participants with an FH-associated variant received a new diagnosis of FH and a quarter had treatment modified after return of results. 
These results highlight the potential utility of sequencing electronic health record-linked biobanks to detect FH.


Subjects
Cardiovascular Diseases , Coronary Artery Disease , Hyperlipoproteinemia Type II , Adult , Humans , Proprotein Convertase 9/genetics , Electronic Health Records , Penetrance , Prevalence , Prospective Studies , Risk Factors , Hyperlipoproteinemia Type II/diagnosis , Hyperlipoproteinemia Type II/epidemiology , Hyperlipoproteinemia Type II/genetics , Coronary Artery Disease/genetics , Heart Disease Risk Factors , Genomics
18.
Sci Rep ; 13(1): 1971, 2023 02 03.
Article in English | MEDLINE | ID: mdl-36737471

ABSTRACT

The electronic Medical Records and Genomics (eMERGE) Network assessed the feasibility of deploying portable phenotype rule-based algorithms with natural language processing (NLP) components added to improve performance of existing algorithms using electronic health records (EHRs). Based on scientific merit and predicted difficulty, eMERGE selected six existing phenotypes to enhance with NLP. We assessed performance, portability, and ease of use. We summarized lessons learned by: (1) challenges; (2) best practices to address challenges based on existing evidence and/or eMERGE experience; and (3) opportunities for future research. Adding NLP resulted in improved, or the same, precision and/or recall for all but one algorithm. Portability, phenotyping workflow/process, and technology were major themes. With NLP, development and validation took longer. Besides portability of NLP technology and algorithm replicability, factors to ensure success include privacy protection, technical infrastructure setup, intellectual property agreement, and efficient communication. Workflow improvements can improve communication and reduce implementation time. NLP performance varied mainly due to clinical document heterogeneity; therefore, we suggest using semi-structured notes, comprehensive documentation, and customization options. NLP portability is possible with improved phenotype algorithm performance, but careful planning and architecture of the algorithms is essential to support local customizations.


Subjects
Electronic Health Records , Natural Language Processing , Genomics , Algorithms , Phenotype
19.
J Biomed Inform ; 138: 104294, 2023 02.
Article in English | MEDLINE | ID: mdl-36706849

ABSTRACT

OBJECTIVE: The study aims to investigate whether machine learning-based predictive models for cardiovascular disease (CVD) risk assessment show equivalent performance across demographic groups (such as race and gender) and if bias mitigation methods can reduce any bias present in the models. This is important as systematic bias may be introduced when collecting and preprocessing health data, which could affect the performance of the models on certain demographic sub-cohorts. The study is to investigate this using electronic health records data and various machine learning models. METHODS: The study used large de-identified Electronic Health Records data from Vanderbilt University Medical Center. Machine learning (ML) algorithms including logistic regression, random forest, gradient-boosting trees, and long short-term memory were applied to build multiple predictive models. Model bias and fairness were evaluated using equal opportunity difference (EOD, 0 indicates fairness) and disparate impact (DI, 1 indicates fairness). In our study, we also evaluated the fairness of a non-ML baseline model, the American Heart Association (AHA) Pooled Cohort Risk Equations (PCEs). Moreover, we compared the performance of three different de-biasing methods: removing protected attributes (e.g., race and gender), resampling the imbalanced training dataset by sample size, and resampling by the proportion of people with CVD outcomes. RESULTS: The study cohort included 109,490 individuals (mean [SD] age 47.4 [14.7] years; 64.5% female; 86.3% White; 13.7% Black). The experimental results suggested that most ML models had smaller EOD and DI than PCEs. For ML models, the mean EOD ranged from -0.001 to 0.018 and the mean DI ranged from 1.037 to 1.094 across race groups. There was a larger EOD and DI across gender groups, with EOD ranging from 0.131 to 0.136 and DI ranging from 1.535 to 1.587. 
For debiasing methods, removing protected attributes did not significantly reduce the bias for most ML models. Resampling by sample size also did not consistently decrease bias. Resampling by case proportion reduced the EOD and DI for gender groups but slightly reduced accuracy in many cases. CONCLUSIONS: Among the VUMC cohort, both PCEs and ML models were biased against women, suggesting the need to investigate and correct gender disparities in CVD risk prediction. Resampling by proportion reduced the bias for gender groups but not for race groups.
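The two fairness metrics named in this abstract can be illustrated with a small Python sketch. It assumes the commonly used definitions: equal opportunity difference (EOD) as the difference in true positive rates between two groups, and disparate impact (DI) as the ratio of their positive prediction rates. The abstract does not state which group served as the reference or which software was used, so the function name `fairness_metrics`, the group labels, and the toy data below are all hypothetical.

```python
def _mean(values):
    """Mean of a list of 0/1 values; 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def fairness_metrics(y_true, y_pred, group, unprivileged, privileged):
    """Return (EOD, DI) for binary labels and predictions.

    EOD = TPR(unprivileged) - TPR(privileged); 0 indicates fairness.
    DI  = P(pred=1 | unprivileged) / P(pred=1 | privileged); 1 indicates fairness.
    Raises ZeroDivisionError if the privileged group has no positive predictions.
    """
    def positive_rate(g):
        return _mean([p for p, a in zip(y_pred, group) if a == g])

    def true_positive_rate(g):
        return _mean([p for t, p, a in zip(y_true, y_pred, group)
                      if a == g and t == 1])

    eod = true_positive_rate(unprivileged) - true_positive_rate(privileged)
    di = positive_rate(unprivileged) / positive_rate(privileged)
    return eod, di


# Toy data (illustrative only, not from the study): 4 people per group.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
group = ["a", "a", "a", "a", "b", "b", "b", "b"]

eod, di = fairness_metrics(y_true, y_pred, group, unprivileged="a", privileged="b")
print(f"EOD = {eod:.3f}, DI = {di:.3f}")
```

In this toy case group "a" has a lower true positive rate and a lower positive prediction rate than group "b", so EOD is negative and DI is below 1. The sign and direction of both metrics depend on which group is treated as the reference.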


Subjects
Cardiovascular Diseases , Humans , Female , Middle Aged , Male , Machine Learning , Algorithms , Random Forest , Logistic Models
20.
J Am Med Inform Assoc ; 30(2): 233-244, 2023 01 18.
Article in English | MEDLINE | ID: mdl-36005898

ABSTRACT

OBJECTIVE: COVID-19 survivors are at risk for long-term health effects, but assessing the sequelae of COVID-19 at large scales is challenging. High-throughput methods to efficiently identify new medical problems arising after acute medical events using the electronic health record (EHR) could improve surveillance for long-term consequences of acute medical problems like COVID-19. MATERIALS AND METHODS: We augmented an existing high-throughput phenotyping method (PheWAS) to identify new diagnoses occurring after an acute temporal event in the EHR. We then used the temporal-informed phenotypes to assess development of new medical problems among COVID-19 survivors enrolled in an EHR cohort of adults tested for COVID-19 at Vanderbilt University Medical Center. RESULTS: The study cohort included 186 105 adults tested for COVID-19 from March 5, 2020 to November 1, 2021; of which 30 088 (16.2%) tested positive. Median follow-up after testing was 412 days (IQR 274-528). Our temporal-informed phenotyping was able to distinguish phenotype chapters based on chronicity of their constituent diagnoses. PheWAS with temporal-informed phenotypes identified increased risk for 43 diagnoses among COVID-19 survivors during outpatient follow-up, including multiple new respiratory, cardiovascular, neurological, and pregnancy-related conditions. Findings were robust to sensitivity analyses, and several phenotypic associations were supported by changes in outpatient vital signs or laboratory tests from the pretesting to postrecovery period. CONCLUSION: Temporal-informed PheWAS identified new diagnoses affecting multiple organ systems among COVID-19 survivors. These findings can inform future efforts to enable longitudinal health surveillance for survivors of COVID-19 and other acute medical conditions using the EHR.


Subjects
COVID-19 , Humans , Phenotype , Electronic Health Records