Search | BVS (Virtual Health Library) - Ministry of Health

1.

An electronic health record cohort of Veterans with amyotrophic lateral sclerosis.

Reimer, Richard J; Goncalves, Andre; Soper, Braden; Cadena, Jose; Wilson, Jennifer L; Gryshuk, Amy L; Suarez, Paola; Osborne, Thomas F; Grimes, Kevin V; Ray, Priyadip.

Amyotroph Lateral Scler Frontotemporal Degener ; : 1-7, 2023 Aug 09.

Article in English | MEDLINE | ID: mdl-37555559

ABSTRACT

Objective: To assemble and characterize an electronic health record (EHR) dataset for a large cohort of US military Veterans diagnosed with amyotrophic lateral sclerosis (ALS). Methods: An EHR dataset for 19,662 Veterans diagnosed with ALS between January 1, 2000 and December 31, 2020 was compiled from the Veterans Health Administration (VHA) EHR database by a query for ICD9 diagnosis (335.20) or ICD10 diagnosis (G12.21) for ALS. Results: The cohort is predominantly male (98.94%) and white (72.37%) with a median age at disease onset of 68 years and median survival from the date of diagnosis of 590 days. With the designation of ALS as a compensable illness in 2009, there was a subsequent increase in the number of Veterans diagnosed per year in the VHA, but no change in median survival. The cohort included a greater-than-expected proportion of individuals whose branch of service at the time of separation was the Army. Conclusions: The composition of the cohort reflects the VHA population who are at greatest risk for ALS. The greater-than-expected proportion of individuals whose branch of service at the time of separation was the Army suggests the possibility of a branch-specific risk factor for ALS.
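
The Methods sentence above describes selecting a cohort by diagnosis code and date range. A minimal sketch of that kind of filter follows, assuming a hypothetical in-memory list of diagnosis records; the field names (`patient_id`, `system`, `code`, `date`), the function name, and the sample rows are illustrative assumptions and do not reflect the VHA database schema or the authors' actual query.

```python
from datetime import date

# ICD codes for ALS and the study window, as quoted in the abstract.
ALS_CODES = {("ICD9", "335.20"), ("ICD10", "G12.21")}
START, END = date(2000, 1, 1), date(2020, 12, 31)


def select_als_cohort(records):
    """Return sorted patient ids having an ALS code dated within the window."""
    cohort = set()
    for rec in records:
        has_als_code = (rec["system"], rec["code"]) in ALS_CODES
        in_window = START <= rec["date"] <= END
        if has_als_code and in_window:
            cohort.add(rec["patient_id"])
    return sorted(cohort)


# Made-up rows for illustration only (hypothetical schema).
records = [
    {"patient_id": 1, "system": "ICD9", "code": "335.20", "date": date(2005, 3, 1)},
    {"patient_id": 2, "system": "ICD10", "code": "G12.21", "date": date(2021, 1, 5)},
    {"patient_id": 3, "system": "ICD10", "code": "G35", "date": date(2010, 6, 1)},
    {"patient_id": 4, "system": "ICD10", "code": "G12.21", "date": date(2018, 9, 9)},
]

print(select_als_cohort(records))
```

Patient 2 is excluded because the diagnosis date falls outside the window, and patient 3 because the code is not an ALS code.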

2.

Unsupervised probabilistic models for sequential Electronic Health Records.

Kaplan, Alan D; Greene, John D; Liu, Vincent X; Ray, Priyadip.

J Biomed Inform ; 134: 104163, 2022 10.

Article in English | MEDLINE | ID: mdl-36038064

ABSTRACT

We develop an unsupervised probabilistic model for heterogeneous Electronic Health Record (EHR) data. Utilizing a mixture model formulation, our approach directly models sequences of arbitrary length, such as medications and laboratory results. This allows for subgrouping and incorporation of the dynamics underlying heterogeneous data types. The model consists of a layered set of latent variables that encode underlying structure in the data. These variables represent subject subgroups at the top layer, and unobserved states for sequences in the second layer. We train this model on episodic data from subjects receiving medical care in the Kaiser Permanente Northern California integrated healthcare delivery system. The resulting properties of the trained model generate novel insight from these complex and multifaceted data. In addition, we show how the model can be used to analyze sequences that contribute to assessment of mortality likelihood.

Subjects

Delivery of Health Care, Integrated , Electronic Health Records , Humans , Models, Statistical , Probability

3.

COVID-19 outcomes in patients with cancer: Findings from the University of California health system database.

Kwon, Daniel H; Cadena, Jose; Nguyen, Sam; Chan, Kwan Ho Ryan; Soper, Braden; Gryshuk, Amy L; Hong, Julian C; Ray, Priyadip; Huang, Franklin W.

Cancer Med ; 11(11): 2204-2215, 2022 06.

Article in English | MEDLINE | ID: mdl-35261195

ABSTRACT

BACKGROUND: The interaction between cancer diagnoses and COVID-19 infection and outcomes is unclear. We leveraged a state-wide, multi-institutional database to assess cancer-related risk factors for poor COVID-19 outcomes. METHODS: We conducted a retrospective cohort study using the University of California Health COVID Research Dataset, which includes electronic health data of patients tested for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) at 17 California medical centers. We identified adults tested for SARS-CoV-2 from 2/1/2020-12/31/2020 and selected a cohort of patients with cancer. We obtained demographic, clinical, cancer type, and antineoplastic therapy data. The primary outcome was hospitalization within 30d after the first positive SARS-CoV-2 test. Secondary outcomes were SARS-CoV-2 positivity and severe COVID-19 (intensive care, mechanical ventilation, or death within 30d after the first positive test). We used multivariable logistic regression to identify cancer-related factors associated with outcomes. RESULTS: We identified 409,462 patients undergoing SARS-CoV-2 testing. Of 49,918 patients with cancer, 1781 (3.6%) tested positive. Patients with cancer were less likely to test positive (RR 0.70, 95% CI: 0.67-0.74, p < 0.001). Among the 1781 SARS-CoV-2-positive patients with cancer, BCR/ABL-negative myeloproliferative neoplasms (RR 2.15, 95% CI: 1.25-3.41, p = 0.007), venetoclax (RR 2.96, 95% CI: 1.14-5.66, p = 0.028), and methotrexate (RR 2.72, 95% CI: 1.10-5.19, p = 0.032) were associated with greater hospitalization risk. Cancer and therapy types were not associated with severe COVID-19. CONCLUSIONS: In this large, diverse cohort, cancer was associated with a decreased risk of SARS-CoV-2 positivity. Patients with BCR/ABL-negative myeloproliferative neoplasm or receiving methotrexate or venetoclax may be at increased risk of hospitalization following SARS-CoV-2 infection. Mechanistic and comparative studies are needed to validate findings.

Subjects

COVID-19 , Neoplasms , Adult , COVID-19/epidemiology , COVID-19 Testing , Hospitalization , Humans , Methotrexate , Neoplasms/epidemiology , Retrospective Studies , SARS-CoV-2

4.

Dynamic modeling of hospitalized COVID-19 patients reveals disease state-dependent risk factors.

Soper, Braden C; Cadena, Jose; Nguyen, Sam; Chan, Kwan Ho Ryan; Kiszka, Paul; Womack, Lucas; Work, Mark; Duggan, Joan M; Haller, Steven T; Hanrahan, Jennifer A; Kennedy, David J; Mukundan, Deepa; Ray, Priyadip.

J Am Med Inform Assoc ; 29(5): 864-872, 2022 04 13.

Article in English | MEDLINE | ID: mdl-35137149

ABSTRACT

OBJECTIVE: The study sought to investigate the disease state-dependent risk profiles of patient demographics and medical comorbidities associated with adverse outcomes of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections. MATERIALS AND METHODS: A covariate-dependent, continuous-time hidden Markov model with 4 states (moderate, severe, discharged, and deceased) was used to model the dynamic progression of COVID-19 during the course of hospitalization. All model parameters were estimated using the electronic health records of 1362 patients from ProMedica Health System admitted between March 20, 2020 and December 29, 2020 with a positive nasopharyngeal PCR test for SARS-CoV-2. Demographic characteristics, comorbidities, vital signs, and laboratory test results were retrospectively evaluated to infer a patient's clinical progression. RESULTS: The association between patient-level covariates and risk of progression was found to be disease state dependent. Specifically, while being male, being Black or having a medical comorbidity were all associated with an increased risk of progressing from the moderate disease state to the severe disease state, these same factors were associated with a decreased risk of progressing from the severe disease state to the deceased state. DISCUSSION: Recent studies have not included analyses of the temporal progression of COVID-19, making the current study a unique modeling-based approach to understand the dynamics of COVID-19 in hospitalized patients. CONCLUSION: Dynamic risk stratification models have the potential to improve clinical outcomes not only in COVID-19, but also in a myriad of other acute and chronic diseases that, to date, have largely been assessed only by static modeling techniques.

Subjects

COVID-19 , Comorbidity , Female , Hospitalization , Humans , Male , Retrospective Studies , Risk Factors , SARS-CoV-2

5.

Budget constrained machine learning for early prediction of adverse outcomes for COVID-19 patients.

Nguyen, Sam; Chan, Ryan; Cadena, Jose; Soper, Braden; Kiszka, Paul; Womack, Lucas; Work, Mark; Duggan, Joan M; Haller, Steven T; Hanrahan, Jennifer A; Kennedy, David J; Mukundan, Deepa; Ray, Priyadip.

Sci Rep ; 11(1): 19543, 2021 10 01.

Article in English | MEDLINE | ID: mdl-34599200

ABSTRACT

The combination of machine learning (ML) and electronic health records (EHR) data may be able to improve outcomes of hospitalized COVID-19 patients through improved risk stratification and patient outcome prediction. However, in resource constrained environments the clinical utility of such data-driven predictive tools may be limited by the cost or unavailability of certain laboratory tests. We leveraged EHR data to develop an ML-based tool for predicting adverse outcomes that optimizes clinical utility under a given cost structure. We further gained insights into the decision-making process of the ML models through an explainable AI tool. This cohort study was performed using deidentified EHR data from COVID-19 patients from ProMedica Health System in northwest Ohio and southeastern Michigan. We tested the performance of various ML approaches for predicting either increasing ventilatory support or mortality. We performed post hoc analysis to obtain optimal feature sets under various budget constraints. We demonstrate that it is possible to achieve a significant reduction in cost at the expense of a small reduction in predictive performance. For example, when predicting ventilation, it is possible to achieve a 43% reduction in cost with only a 3% reduction in performance. Similarly, when predicting mortality, it is possible to achieve a 50% reduction in cost with only a 1% reduction in performance. This study presents a quick, accurate, and cost-effective method to evaluate risk of deterioration for patients with SARS-CoV-2 infection at the time of clinical evaluation.

Subjects

Budgets , COVID-19/pathology , COVID-19/virology , Machine Learning , Outcome Assessment, Health Care , SARS-CoV-2/isolation & purification , Humans

6.

Nonstationary multivariate Gaussian processes for electronic health records.

Meng, Rui; Soper, Braden; Lee, Herbert K H; Liu, Vincent X; Greene, John D; Ray, Priyadip.

J Biomed Inform ; 117: 103698, 2021 05.

Article in English | MEDLINE | ID: mdl-33617985

ABSTRACT

Advances in the modeling and analysis of electronic health records (EHR) have the potential to improve patient risk stratification, leading to better patient outcomes. The modeling of complex temporal relations across the multiple clinical variables inherent in EHR data is largely unexplored. Existing approaches to modeling EHR data often lack the flexibility to handle time-varying correlations across multiple clinical variables, or they are too complex for clinical interpretation. Therefore, we propose a novel nonstationary multivariate Gaussian process model for EHR data to address the aforementioned drawbacks of existing methodologies. Our proposed model is able to capture time-varying scale, correlation and smoothness across multiple clinical variables. We also provide details on two inference approaches: maximum a posteriori and Hamiltonian Monte Carlo. Our model is validated on synthetic data and then we demonstrate its effectiveness on EHR data from Kaiser Permanente Division of Research (KPDOR). Finally, we use the KPDOR EHR data to investigate the relationships between a clinical patient risk metric and the latent processes of our proposed model and demonstrate statistically significant correlations between these entities.

Subjects

Electronic Health Records , Humans , Normal Distribution

7.

Improving five-year survival prediction via multitask learning across HPV-related cancers.

Goncalves, Andre; Soper, Braden; Nygård, Mari; Nygård, Jan F; Ray, Priyadip; Widemann, David; Sales, Ana Paula.

PLoS One ; 15(11): e0241225, 2020.

Article in English | MEDLINE | ID: mdl-33196642

ABSTRACT

Oncology is a highly siloed field of research in which sub-disciplinary specialization has limited the amount of information shared between researchers of distinct cancer types. This can be attributed to legitimate differences in the physiology and carcinogenesis of cancers affecting distinct anatomical sites. However, underlying processes that are shared across seemingly disparate cancers probably affect prognosis. The objective of the current study is to investigate whether multitask learning improves 5-year survival prediction for cancer patients by leveraging information across anatomically distinct HPV-related cancers. Data were obtained from the Surveillance, Epidemiology, and End Results (SEER) program database. The study cohort consisted of 29,768 primary cancer cases diagnosed in the United States between 2004 and 2015. Ten different cancer diagnoses were selected, all with a known association with HPV risk. In the analysis, the cancer diagnoses were categorized into three distinct topography groups of varying specificity. The most specific topography grouping consisted of 10 original cancer diagnoses differentiated by the first two digits of the ICD-O-3 topography code. The second topography grouping consisted of cancer diagnoses categorized into six distinct organ groups. Finally, the third topography grouping consisted of just two groups, head-neck cancers and ano-genital cancers. The tasks were to predict 5-year survival for patients within the different topography groups using 14 predictive features which were selected among descriptive variables available in the SEER database. The information from the predictive features was shared between tasks in three different ways, resulting in three distinct predictive models: 1) Information was not shared between patients assigned to different tasks (single task learning); 2) Information was shared between all patients, regardless of task (pooled model); 3) Only relevant information was shared between patients grouped to different tasks (multitask learning). Prediction performance was evaluated with Brier scores. All three models were evaluated against one another on each of the three distinct topography-defined tasks. The results showed that multitask classifiers achieved relative improvement for the majority of the scenarios studied compared to single task learning and pooled baseline methods. In this study, we have demonstrated that sharing information among anatomically distinct cancer types can lead to improved predictive survival models.
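
The abstract reports that prediction performance was evaluated with Brier scores. As a reminder of what that metric measures, here is a minimal sketch of the standard Brier score for binary outcomes (the mean squared difference between predicted probability and observed outcome). The function name and the example values are illustrative assumptions, not the study's code or data.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes.

    Lower is better: 0.0 is a perfect forecast, and always predicting 0.5
    gives 0.25.
    """
    if len(probabilities) != len(outcomes) or not probabilities:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - y) ** 2 for p, y in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Made-up predictions of 5-year survival (1 = survived, 0 = did not).
predicted = [0.5, 0.5, 1.0, 0.0]
observed = [1, 0, 1, 0]
print(brier_score(predicted, observed))
```

Comparing this score across single task, pooled, and multitask models on the same held-out patients is one way such a relative improvement could be quantified.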

Subjects

Learning , Multitasking Behavior , Neoplasms/mortality , Neoplasms/virology , Papillomavirus Infections/mortality , Adult , Aged , Aged, 80 and over , Algorithms , Cohort Studies , Female , Humans , Male , Middle Aged , SEER Program , Sample Size , Survival Analysis , Young Adult

8.

Generation and evaluation of synthetic patient data.

Goncalves, Andre; Ray, Priyadip; Soper, Braden; Stevens, Jennifer; Coyle, Linda; Sales, Ana Paula.

BMC Med Res Methodol ; 20(1): 108, 2020 05 07.

Article in English | MEDLINE | ID: mdl-32381039

ABSTRACT

BACKGROUND: Machine learning (ML) has made a significant impact in medicine and cancer research; however, its impact in these areas has been undeniably slower and more limited than in other application domains. A major reason for this has been the lack of availability of patient data to the broader ML research community, in large part due to patient privacy protection concerns. High-quality, realistic, synthetic datasets can be leveraged to accelerate methodological developments in medicine. By and large, medical data is high dimensional and often categorical. These characteristics pose multiple modeling challenges. METHODS: In this paper, we evaluate three classes of synthetic data generation approaches: probabilistic models, classification-based imputation models, and generative adversarial neural networks. Metrics for evaluating the quality of the generated synthetic datasets are presented and discussed. RESULTS: While the results and discussions are broadly applicable to medical data, for demonstration purposes we generate synthetic datasets for cancer based on the publicly available cancer registry data from the Surveillance Epidemiology and End Results (SEER) program. Specifically, our cohort consists of breast, respiratory, and non-solid cancer cases diagnosed between 2010 and 2015, which includes over 360,000 individual cases. CONCLUSIONS: We discuss the trade-offs of the different methods and metrics, providing guidance on considerations for the generation and usage of medical synthetic data.

Subjects

Machine Learning , Neoplasms , Humans , Neoplasms/diagnosis , Neoplasms/epidemiology , Neural Networks, Computer

9.

Bayesian multitask learning regression for heterogeneous patient cohorts.

Goncalves, Andre; Ray, Priyadip; Soper, Braden; Widemann, David; Nygård, Mari; Nygård, Jan F; Sales, Ana Paula.

J Biomed Inform ; 100S: 100059, 2019.

Article in English | MEDLINE | ID: mdl-34384572

ABSTRACT

Multitask learning (MTL) leverages commonalities across related tasks with the aim of improving individual task performance. A key modeling choice in designing MTL models is the structure of the tasks' relatedness, which may not be known. Here we propose a Bayesian multitask learning model that is able to infer the task relationship structure directly from the data. We present two variations of the model in terms of a priori information of task relatedness. First, a diffuse Wishart prior is placed on a task precision matrix so that all tasks are assumed to be equally related a priori. Second, a Bayesian graphical LASSO prior is used on the task precision matrix to impose sparsity in the task relatedness. Motivated by machine learning applications in the biomedical domain, we emphasize interpretability and uncertainty quantification in our models. To encourage model interpretability, linear mappings from the shared input spaces to task-dependent output spaces are used. To encourage uncertainty quantification, conjugate priors are used so that full posterior inference is possible. Using synthetic data, we show that our model is able to recover the underlying task relationships as well as features jointly relevant for all tasks. We demonstrate the utility of our model on three distinct biomedical applications: Alzheimer's disease progression, Parkinson's disease assessment, and cervical cancer screening compliance. We show that our model outperforms Single Task (STL) models in terms of predictive performance, and performs better than existing MTL methods for the majority of the scenarios.

10.

Bayesian joint analysis of heterogeneous genomics data.

Ray, Priyadip; Zheng, Lingling; Lucas, Joseph; Carin, Lawrence.

Bioinformatics ; 30(10): 1370-6, 2014 May 15.

Article in English | MEDLINE | ID: mdl-24489367

ABSTRACT

SUMMARY: A non-parametric Bayesian factor model is proposed for joint analysis of multi-platform genomics data. The approach is based on factorizing the latent space (feature space) into a shared component and a data-specific component with the dimensionality of these components (spaces) inferred via a beta-Bernoulli process. The proposed approach is demonstrated by jointly analyzing gene expression/copy number variations and gene expression/methylation data for ovarian cancer patients, showing that the proposed model can potentially uncover key drivers related to cancer. AVAILABILITY AND IMPLEMENTATION: The source code for this model is written in MATLAB and has been made publicly available at https://sites.google.com/site/jointgenomics/. CONTACT: catherine.ll.zheng@gmail.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Genomics/methods , Bayes Theorem , DNA Copy Number Variations , DNA Methylation , Female , Gene Expression Regulation , Humans , Ovarian Neoplasms/genetics , Software
