Results 1 - 16 of 16
1.
medRxiv ; 2024 May 16.
Article in English | MEDLINE | ID: mdl-38798457

ABSTRACT

Importance: Randomized clinical trials (RCTs) are the standard for defining an evidence-based approach to managing disease, but their generalizability to real-world patients remains challenging to quantify. Objective: To develop a multidimensional patient variable mapping algorithm to quantify the similarity and representation of electronic health record (EHR) patients corresponding to an RCT and estimate the putative treatment effects in real-world settings based on individual treatment effects observed in an RCT. Design: A retrospective analysis of the Treatment of Preserved Cardiac Function Heart Failure with an Aldosterone Antagonist Trial (TOPCAT; 2006-2012) and a multi-hospital patient cohort from the electronic health record (EHR) in the Yale New Haven Hospital System (YNHHS; 2015-2023). Setting: A multicenter international RCT (TOPCAT) and multi-hospital patient cohort (YNHHS). Participants: All TOPCAT participants and patients with heart failure with preserved ejection fraction (HFpEF) and ≥1 hospitalization within YNHHS. Exposures: 63 pre-randomization characteristics measured across the TOPCAT and YNHHS cohorts. Main Outcomes and Measures: Real-world generalizability of the RCT TOPCAT using a multidimensional phenotypic distance metric between TOPCAT and YNHHS cohorts. Estimation of the individualized treatment effect of spironolactone use on all-cause mortality within the YNHHS cohort based on phenotypic distance from the TOPCAT cohort. Results: There were 3,445 patients in TOPCAT and 11,712 HFpEF patients across five hospital sites. Across the 63 TOPCAT variables mapped by clinicians to the EHR, there were larger differences between TOPCAT and each of the 5 EHR sites (median SMD 0.200, IQR 0.037-0.410) than between the 5 EHR sites (median SMD 0.062, IQR 0.010-0.130). The synthesis of these differences across covariates using our multidimensional similarity score also suggested substantial phenotypic dissimilarity between the TOPCAT and EHR cohorts.
By phenotypic distance, a majority (55%) of TOPCAT participants were closer to each other than any individual EHR patient. Using a TOPCAT-derived model of individualized treatment benefit from spironolactone, those predicted to derive benefit and receiving spironolactone in the EHR cohorts had substantially better outcomes compared with predicted benefit and not receiving the medication (HR 0.74, 95% CI 0.62-0.89). Conclusions and Relevance: We propose a novel approach to evaluating the real-world representativeness of RCT participants against corresponding patients in the EHR across the full multidimensional spectrum of the represented phenotypes. This enables the evaluation of the implications of RCTs for real-world patients. KEY POINTS: Question: How can we examine the multi-dimensional generalizability of randomized clinical trials (RCT) to real-world patient populations? Findings: We demonstrate a novel phenotypic distance metric comparing an RCT to real-world populations in a large multicenter RCT of heart failure patients and the corresponding patients in multisite electronic health records (EHRs). Across 63 pre-randomization characteristics, pairwise assessments of members of the RCT and EHR cohorts were more discordant from each other than between members of the EHR cohort (median standardized mean difference 0.200 [0.037-0.410] vs 0.062 [0.010-0.130]), with a majority (55%) of RCT participants closer to each other than any individual EHR patient. The approach also enabled the quantification of expected real world outcomes based on effects observed in the RCT. Meaning: A multidimensional phenotypic distance metric quantifies the generalizability of RCTs to a given population while also offering an avenue to examine expected real-world patient outcomes based on treatment effects observed in the RCT.
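
The standardized mean difference (SMD) reported above can be illustrated with a short sketch. It assumes the common pooled-standard-deviation definition for a continuous covariate; the abstract does not state which variant the authors used, and the function names and toy values below are hypothetical, not TOPCAT or YNHHS data.

```python
import math
import statistics


def standardized_mean_difference(group_a, group_b):
    """Absolute SMD for one continuous covariate, using the pooled standard deviation."""
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    pooled_sd = math.sqrt((var_a + var_b) / 2)
    if pooled_sd == 0:
        # No spread in either group: identical means give 0, otherwise the SMD is unbounded.
        return 0.0 if mean_a == mean_b else math.inf
    return abs(mean_a - mean_b) / pooled_sd


def median_smd(cohort_a, cohort_b):
    """Median SMD over the covariates that two cohorts share (dicts of name -> values)."""
    shared = sorted(set(cohort_a) & set(cohort_b))
    return statistics.median(
        standardized_mean_difference(cohort_a[name], cohort_b[name]) for name in shared
    )


# Hypothetical toy cohorts with two shared covariates.
trial = {"age": [60, 65, 70, 75], "sbp": [120, 130, 125, 135]}
ehr = {"age": [70, 75, 80, 85], "sbp": [121, 131, 126, 136]}
print(round(median_smd(trial, ehr), 3))
```

Summarizing by the median across covariates mirrors how the abstract reports its comparison (median SMD with IQR); binary covariates would need a proportion-based variant that this sketch does not cover.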

2.
medRxiv ; 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38633789

ABSTRACT

Introduction: Serial functional status assessments are critical to heart failure (HF) management but are often described narratively in documentation, limiting their use in quality improvement or patient selection for clinical trials. We developed and validated a deep learning-based natural language processing (NLP) strategy to extract functional status assessments from unstructured clinical notes. Methods: We identified 26,577 HF patients across outpatient services at Yale New Haven Hospital (YNHH), Greenwich Hospital (GH), and Northeast Medical Group (NMG) (mean age 76.1 years; 52.0% women). We used expert annotated notes from YNHH for model development/internal testing and from GH and NMG for external validation. The primary outcomes were NLP models to detect (a) explicit New York Heart Association (NYHA) classification, (b) HF symptoms during activity or rest, and (c) functional status assessment frequency. Results: Among 3,000 expert-annotated notes, 13.6% mentioned NYHA class, and 26.5% described HF symptoms. The model to detect NYHA classes achieved a class-weighted AUROC of 0.99 (95% CI: 0.98-1.00) at YNHH, 0.98 (0.96-1.00) at NMG, and 0.98 (0.92-1.00) at GH. The activity-related HF symptom model achieved an AUROC of 0.94 (0.89-0.98) at YNHH, 0.94 (0.91-0.97) at NMG, and 0.95 (0.92-0.99) at GH. Deploying the NYHA model among 166,655 unannotated notes from YNHH identified 21,528 (12.9%) with NYHA mentions and 17,642 encounters (10.5%) classifiable into functional status groups based on activity-related symptoms. Conclusions: We developed and validated an NLP approach to extract NYHA classification and activity-related HF symptoms from clinical notes, enhancing the ability to track optimal care and identify trial-eligible patients.
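
As an illustration of the class-weighted AUROC reported above, the sketch below assumes it means a one-vs-rest AUROC per class, averaged with weights proportional to class frequency. The abstract does not spell out the definition, so this interpretation, the function names, and the toy scores are all assumptions.

```python
from collections import Counter


def binary_auroc(is_positive, scores):
    """AUROC as the probability that a positive outranks a negative (ties count half)."""
    pos = [s for flag, s in zip(is_positive, scores) if flag]
    neg = [s for flag, s in zip(is_positive, scores) if not flag]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def class_weighted_auroc(labels, class_scores):
    """One-vs-rest AUROC per class, weighted by how often each class occurs."""
    counts = Counter(labels)
    total = len(labels)
    result = 0.0
    for cls, n in counts.items():
        flags = [label == cls for label in labels]
        scores = [row[cls] for row in class_scores]
        result += (n / total) * binary_auroc(flags, scores)
    return result


# Hypothetical model outputs for four notes and three NYHA classes.
labels = ["I", "II", "II", "III"]
class_scores = [
    {"I": 0.7, "II": 0.2, "III": 0.1},
    {"I": 0.1, "II": 0.8, "III": 0.1},
    {"I": 0.2, "II": 0.5, "III": 0.3},
    {"I": 0.1, "II": 0.3, "III": 0.6},
]
print(class_weighted_auroc(labels, class_scores))
```

The pairwise-comparison form is quadratic in the number of notes, which is fine for a sketch but would be replaced by a rank-based computation on a dataset of the size described.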

3.
medRxiv ; 2024 Mar 26.
Article in English | MEDLINE | ID: mdl-38585929

ABSTRACT

Randomized clinical trials (RCTs) are essential to guide medical practice; however, their generalizability to a given population is often uncertain. We developed a statistically informed Generative Adversarial Network (GAN) model, RCT-Twin-GAN, that leverages relationships between covariates and outcomes and generates a digital twin of an RCT (RCT-Twin) conditioned on covariate distributions from a second patient population. We used RCT-Twin-GAN to reproduce treatment effect outcomes of the Systolic Blood Pressure Intervention Trial (SPRINT) and the Action to Control Cardiovascular Risk in Diabetes (ACCORD) Blood Pressure Trial, which tested the same intervention but had different treatment effect results. To demonstrate treatment effect estimates of each RCT conditioned on the other RCT patient population, we evaluated the cardiovascular event-free survival of SPRINT digital twins conditioned on the ACCORD cohort and vice versa (SPRINT-conditioned ACCORD twins). The conditioned digital twins were balanced by the intervention arm (mean absolute standardized mean difference (MASMD) of covariates between treatment arms 0.019 (SD 0.018)), and the conditioned covariates of the SPRINT-Twin on ACCORD were more similar to ACCORD than to SPRINT (MASMD 0.0082 (SD 0.016) vs. 0.46 (SD 0.20)). Most importantly, across iterations, SPRINT conditioned ACCORD-Twin datasets reproduced the overall non-significant effect size seen in ACCORD (5-year cardiovascular outcome hazard ratio (95% confidence interval) of 0.88 (0.73-1.06) in ACCORD vs median 0.87 (0.68-1.13) in the SPRINT conditioned ACCORD-Twin), while the ACCORD conditioned SPRINT-Twins reproduced the significant effect size seen in SPRINT (0.75 (0.64-0.89) in SPRINT vs median 0.79 (0.72-0.86) in the ACCORD conditioned SPRINT-Twin). Finally, we describe the translation of this approach to real-world populations by conditioning the trials on an electronic health record population.
Therefore, RCT-Twin-GAN simulates the direct translation of RCT-derived treatment effects across various patient populations with varying covariate distributions.

4.
medRxiv ; 2024 Feb 08.
Article in English | MEDLINE | ID: mdl-38370787

ABSTRACT

Background: SGLT2 inhibitors (SGLT2is) and GLP-1 receptor agonists (GLP1-RAs) reduce major adverse cardiovascular events (MACE) in patients with type 2 diabetes mellitus (T2DM). However, their effectiveness relative to each other and other second-line antihyperglycemic agents is unknown, without any major ongoing head-to-head trials. Methods: Across the LEGEND-T2DM network, we included ten federated international data sources, spanning 1992-2021. We identified 1,492,855 patients with T2DM and established cardiovascular disease (CVD) on metformin monotherapy who initiated one of four second-line agents (SGLT2is, GLP1-RAs, dipeptidyl peptidase 4 inhibitor [DPP4is], sulfonylureas [SUs]). We used large-scale propensity score models to conduct an active comparator, target trial emulation for pairwise comparisons. After evaluating empirical equipoise and population generalizability, we fit on-treatment Cox proportional hazard models for 3-point MACE (myocardial infarction, stroke, death) and 4-point MACE (3-point MACE + heart failure hospitalization) risk, and combined hazard ratio (HR) estimates in a random-effects meta-analysis. Findings: Across cohorts, 16·4%, 8·3%, 27·7%, and 47·6% of individuals with T2DM initiated SGLT2is, GLP1-RAs, DPP4is, and SUs, respectively. Over 5·2 million patient-years of follow-up and 489 million patient-days of time at-risk, there were 25,982 3-point MACE and 41,447 4-point MACE events. SGLT2is and GLP1-RAs were associated with a lower risk for 3-point MACE compared with DPP4is (HR 0·89 [95% CI, 0·79-1·00] and 0·83 [0·70-0·98]), and SUs (HR 0·76 [0·65-0·89] and 0·71 [0·59-0·86]). DPP4is were associated with a lower 3-point MACE risk versus SUs (HR 0·87 [0·79-0·95]). The pattern was consistent for 4-point MACE for the comparisons above. There were no significant differences between SGLT2is and GLP1-RAs for 3-point or 4-point MACE (HR 1·06 [0·96-1·17] and 1·05 [0·97-1·13]). 
Interpretation: In patients with T2DM and established CVD, we found comparable cardiovascular risk reduction with SGLT2is and GLP1-RAs, with both agents more effective than DPP4is, which in turn were more effective than SUs. These findings suggest that the use of GLP1-RAs and SGLT2is should be prioritized as second-line agents in those with established CVD. Funding: National Institutes of Health, United States Department of Veterans Affairs.
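
The step of combining per-source hazard ratios in a random-effects meta-analysis can be sketched as follows. This assumes the DerSimonian-Laird estimator and recovers each standard error from the reported 95% confidence interval; the abstract does not say which estimator was used, and the function name and inputs are hypothetical.

```python
import math

Z95 = 1.959964  # normal quantile for a two-sided 95% interval


def pooled_hazard_ratio(estimates):
    """DerSimonian-Laird random-effects pooling of (hr, ci_low, ci_high) tuples."""
    log_hrs = [math.log(hr) for hr, _, _ in estimates]
    # Standard error on the log scale, back-calculated from the CI width.
    variances = [((math.log(hi) - math.log(lo)) / (2 * Z95)) ** 2 for _, lo, hi in estimates]

    # Fixed-effect (inverse-variance) weights and Cochran's Q.
    w = [1 / v for v in variances]
    fixed = sum(wi * y for wi, y in zip(w, log_hrs)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_hrs))

    # Between-source variance tau^2, truncated at zero.
    k = len(estimates)
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-source variance.
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * y for wi, y in zip(w_star, log_hrs)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return (math.exp(pooled), math.exp(pooled - Z95 * se), math.exp(pooled + Z95 * se))


# Hypothetical per-database estimates, not values from the LEGEND-T2DM network.
sources = [(0.85, 0.70, 1.03), (0.92, 0.80, 1.06), (0.78, 0.60, 1.01)]
hr, low, high = pooled_hazard_ratio(sources)
print(f"pooled HR {hr:.2f} [{low:.2f}-{high:.2f}]")
```

Pooling on the log scale keeps the interval symmetric around the log hazard ratio, which is why the inputs are log-transformed before weighting.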

5.
medRxiv ; 2023 Dec 15.
Article in English | MEDLINE | ID: mdl-38106089

ABSTRACT

Background: Randomized clinical trials (RCTs) are designed to produce evidence in selected populations. Assessing their effects in the real world is essential to change medical practice; however, key populations are historically underrepresented in RCTs. We define an approach to simulate RCT-based effects in real-world settings using RCT digital twins reflecting the covariate patterns in an electronic health record (EHR). Methods: We developed a Generative Adversarial Network (GAN) model, RCT-Twin-GAN, which generates a digital twin of an RCT (RCT-Twin) conditioned on covariate distributions from an EHR cohort. We improved upon a traditional tabular conditional GAN, CTGAN, with a loss function adapted for data distributions and by conditioning on multiple discrete and continuous covariates simultaneously. We assessed the similarity between a Heart Failure with preserved Ejection Fraction (HFpEF) RCT (TOPCAT), a Yale HFpEF EHR cohort, and RCT-Twin. We also evaluated cardiovascular event-free survival stratified by spironolactone (treatment) use. Results: By applying RCT-Twin-GAN to 3445 TOPCAT participants and conditioning on 3445 Yale EHR HFpEF patients, we generated RCT-Twin datasets ranging from 1141 to 3445 patients in size, depending on covariate conditioning and model parameters. RCT-Twin randomly allocated spironolactone (S)/placebo (P) arms like an RCT, was similar to the RCT by a multi-dimensional distance metric, and balanced covariates (median absolute standardized mean difference (MASMD) 0.017, IQR 0.0034-0.030). The 5 EHR-conditioned covariates in RCT-Twin were closer to the EHR compared with the RCT (MASMD 0.008 vs 0.63, IQR 0.005-0.018 vs 0.59-1.11). RCT-Twin reproduced the overall effect size seen in TOPCAT (5-year cardiovascular composite outcome odds ratio (95% confidence interval) of 0.89 (0.75-1.06) in RCT vs 0.85 (0.69-1.04) in RCT-Twin).
Conclusions: RCT-Twin-GAN simulates RCT-derived effects in real-world patients by translating these effects to the covariate distributions of EHR patients. This key methodological advance may enable the direct translation of RCT-derived effects into real-world patient populations and may enable causal inference in real-world settings.

6.
medRxiv ; 2023 Nov 01.
Article in English | MEDLINE | ID: mdl-37961715

ABSTRACT

Randomized controlled trials (RCTs) represent the cornerstone of evidence-based medicine but are resource-intensive. We propose and evaluate a machine learning (ML) strategy of adaptive predictive enrichment through computational trial phenomaps to optimize RCT enrollment. In simulated group sequential analyses of two large cardiovascular outcomes RCTs of (1) a therapeutic drug (pioglitazone versus placebo; Insulin Resistance Intervention after Stroke (IRIS) trial), and (2) a disease management strategy (intensive versus standard systolic blood pressure reduction in the Systolic Blood Pressure Intervention Trial (SPRINT)), we constructed dynamic phenotypic representations to infer response profiles during interim analyses and examined their association with study outcomes. Across three interim timepoints, our strategy learned dynamic phenotypic signatures predictive of individualized cardiovascular benefit. By conditioning a prospective candidate's probability of enrollment on their predicted benefit, we estimate that our approach would have enabled a reduction in the final trial size across ten simulations (IRIS: -14.8% ± 3.1%, one-sample t-test p = 0.001; SPRINT: -17.6% ± 3.6%, one-sample t-test p < 0.001), while preserving the original average treatment effect (IRIS: hazard ratio of 0.73 ± 0.01 for pioglitazone vs placebo, vs 0.76 in the original trial; SPRINT: hazard ratio of 0.72 ± 0.01 for intensive vs standard systolic blood pressure, vs 0.75 in the original trial; all with one-sample t-test p < 0.01). This adaptive framework has the potential to maximize RCT enrollment efficiency.

7.
NPJ Digit Med ; 6(1): 217, 2023 Nov 25.
Article in English | MEDLINE | ID: mdl-38001154

ABSTRACT

Randomized clinical trials (RCTs) represent the cornerstone of evidence-based medicine but are resource-intensive. We propose and evaluate a machine learning (ML) strategy of adaptive predictive enrichment through computational trial phenomaps to optimize RCT enrollment. In simulated group sequential analyses of two large cardiovascular outcomes RCTs of (1) a therapeutic drug (pioglitazone versus placebo; Insulin Resistance Intervention after Stroke (IRIS) trial), and (2) a disease management strategy (intensive versus standard systolic blood pressure reduction in the Systolic Blood Pressure Intervention Trial (SPRINT)), we constructed dynamic phenotypic representations to infer response profiles during interim analyses and examined their association with study outcomes. Across three interim timepoints, our strategy learned dynamic phenotypic signatures predictive of individualized cardiovascular benefit. By conditioning a prospective candidate's probability of enrollment on their predicted benefit, we estimate that our approach would have enabled a reduction in the final trial size across ten simulations (IRIS: -14.8% ± 3.1%, one-sample t-test p = 0.001; SPRINT: -17.6% ± 3.6%, one-sample t-test p < 0.001), while preserving the original average treatment effect (IRIS: hazard ratio of 0.73 ± 0.01 for pioglitazone vs placebo, vs 0.76 in the original trial; SPRINT: hazard ratio of 0.72 ± 0.01 for intensive vs standard systolic blood pressure, vs 0.75 in the original trial; all simulations with Cox regression-derived p value of < 0.01 for the effect of the intervention on the respective primary outcome). This adaptive framework has the potential to maximize RCT enrollment efficiency.

8.
medRxiv ; 2023 Sep 19.
Article in English | MEDLINE | ID: mdl-37790355

ABSTRACT

Importance: Elevated lipoprotein(a) [Lp(a)] is associated with atherosclerotic cardiovascular disease (ASCVD) and major adverse cardiovascular events (MACE). However, fewer than 0.5% of patients undergo Lp(a) testing, limiting the evaluation and use of novel targeted therapeutics currently under development. Objective: We developed and validated a machine learning model to enable targeted screening for elevated Lp(a). Design: Cross-sectional. Setting: 4 multinational population-based cohorts. Participants: We included 456,815 participants from the UK Biobank (UKB), the largest cohort with protocolized Lp(a) testing for model development. The model's external validity was assessed in Atherosclerosis Risk in Communities (ARIC) (N=14,484), Coronary Artery Risk Development in Young Adults (CARDIA) (N=4,124), and Multi-Ethnic Study of Atherosclerosis (MESA) (N=4,672) cohorts. Exposures: Demographics, medications, diagnoses, procedures, vitals, and laboratory measurements from UKB and linked electronic health records (EHR) were candidate input features to predict high Lp(a). We used the pooled cohort equations (PCE), an ASCVD risk marker, as a comparator to identify elevated Lp(a). Main Outcomes and Measures: The main outcome was elevated Lp(a) (≥150 nmol/L), and the number-needed-to-test (NNT) to find one case with elevated Lp(a). We explored the association of the model's prediction probabilities with all-cause and cardiovascular mortality, and MACE. Results: The Algorithmic Risk Inspection for Screening Elevated Lp(a) (ARISE) used low-density lipoprotein cholesterol, statin use, triglycerides, high-density lipoprotein cholesterol, history of ASCVD, and anti-hypertensive medication use as input features. ARISE outperformed cardiovascular risk stratification through PCE for predicting elevated Lp(a) with a significantly lower NNT (4.0 versus 8.0 [with or without PCE], P<0.001). 
ARISE performed comparably across external validation cohorts and subgroups, reducing the NNT by up to 67.3%, depending on the probability threshold. Over a median follow-up of 4.2 years, a high ARISE probability was also associated with a greater hazard of all-cause death and MACE (age/sex-adjusted hazard ratio [aHR], 1.35, and 1.38, respectively, P<0.001), with a greater increase in cardiovascular mortality (aHR, 2.17, P<0.001). Conclusions and Relevance: ARISE optimizes screening for elevated Lp(a) using commonly available clinical features. ARISE can be deployed in EHR and other settings to encourage greater Lp(a) testing and to improve identifying cases eligible for novel targeted therapeutics in trials. KEY POINTS: Question: How can we optimize the identification of individuals with elevated lipoprotein(a) [Lp(a)] who may be eligible for novel targeted therapeutics? Findings: Using 4 multinational population-based cohorts, we developed and validated a machine learning model, Algorithmic Risk Inspection for Screening Elevated Lp(a) (ARISE), to enable targeted screening for elevated Lp(a). In contrast to the pooled cohort equations that do not identify those with elevated Lp(a), ARISE reduces the "number-needed-to-test" to find one case with elevated Lp(a) by up to 67.3%. Meaning: ARISE can be deployed in electronic health records and other settings to enable greater yield of Lp(a) testing, thereby improving the identification of individuals with elevated Lp(a).
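
The number-needed-to-test arithmetic above can be made concrete with a small sketch. It assumes NNT is the number of people tested divided by the number of elevated Lp(a) cases found, which matches the abstract's wording ("to find one case"); the function names and the screening counts are hypothetical.

```python
def number_needed_to_test(tested, cases_found):
    """People who must be tested, on average, to find one case."""
    if cases_found <= 0:
        raise ValueError("at least one case is needed to compute an NNT")
    return tested / cases_found


def relative_reduction(baseline_nnt, model_nnt):
    """Fractional reduction in NNT achieved by targeted screening."""
    return (baseline_nnt - model_nnt) / baseline_nnt


# Hypothetical screening counts chosen to reproduce the reported NNTs of 8.0 and 4.0.
untargeted = number_needed_to_test(tested=800, cases_found=100)
targeted = number_needed_to_test(tested=400, cases_found=100)
print(untargeted, targeted, relative_reduction(untargeted, targeted))
```

Going from an NNT of 8.0 to 4.0 is a 50% reduction; the 67.3% figure quoted in the abstract corresponds to a different probability threshold.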

10.
BioData Min ; 13(1): 21, 2020 Dec 07.
Article in English | MEDLINE | ID: mdl-33372632

ABSTRACT

BACKGROUND: Accurate identification of acute ischemic stroke (AIS) patient cohorts is essential for a wide range of clinical investigations. Automated phenotyping methods that leverage electronic health records (EHRs) represent a fundamentally new approach to cohort identification without the current laborious and ungeneralizable generation of phenotyping algorithms. We systematically compared and evaluated the ability of machine learning algorithms and case-control combinations to phenotype acute ischemic stroke patients using data from an EHR. MATERIALS AND METHODS: Using structured patient data from the EHR at a tertiary-care hospital system, we built and evaluated machine learning models to identify patients with AIS based on 75 different case-control and classifier combinations. We then estimated the prevalence of AIS patients across the EHR. Finally, we externally validated the ability of the models to detect AIS patients without AIS diagnosis codes using the UK Biobank. RESULTS: Across all models, we found that the mean AUROC for detecting AIS was 0.963 ± 0.0520 and average precision score 0.790 ± 0.196 with minimal feature processing. Classifiers trained with cases with AIS diagnosis codes and controls with no cerebrovascular disease codes had the best average F1 score (0.832 ± 0.0383). In the external validation, we found that the top probabilities from a model-predicted AIS cohort were significantly enriched for AIS patients without AIS diagnosis codes (60-150 fold over expected). CONCLUSIONS: Our findings support machine learning algorithms as a generalizable way to accurately identify AIS patients without using process-intensive manual feature curation. When a set of AIS patients is unavailable, diagnosis codes may be used to train classifier models.
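
For readers less familiar with the F1 score used to rank the case-control combinations above, here is a minimal sketch of how precision, recall, and F1 follow from predicted and true labels. The labels are made up and the helper name is hypothetical; it is not the authors' evaluation code.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = AIS case, 0 = control)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical predictions for five patients.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0]
print(precision_recall_f1(y_true, y_pred))
```

Because F1 ignores true negatives, it is a reasonable summary when cases are rare relative to controls, as is typical for a single diagnosis in an EHR.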

11.
Nat Med ; 26(10): 1609-1615, 2020 Oct.
Article in English | MEDLINE | ID: mdl-32747830

ABSTRACT

Understanding the pathophysiology of SARS-CoV-2 infection is critical for therapeutic and public health strategies. Viral-host interactions can guide discovery of disease regulators, and protein structure function analysis points to several immune pathways, including complement and coagulation, as targets of coronaviruses. To determine whether conditions associated with dysregulated complement or coagulation systems impact disease, we performed a retrospective observational study and found that history of macular degeneration (a proxy for complement-activation disorders) and history of coagulation disorders (thrombocytopenia, thrombosis and hemorrhage) are risk factors for SARS-CoV-2-associated morbidity and mortality, effects that are independent of age, sex or history of smoking. Transcriptional profiling of nasopharyngeal swabs demonstrated that in addition to type-I interferon and interleukin-6-dependent inflammatory responses, infection results in robust engagement of the complement and coagulation pathways. Finally, in a candidate-driven genetic association study of severe SARS-CoV-2 disease, we identified putative complement and coagulation-associated loci including missense, eQTL and sQTL variants of critical complement and coagulation regulators. In addition to providing evidence that complement function modulates SARS-CoV-2 infection outcome, the data point to putative transcriptional genetic markers of susceptibility. The results highlight the value of using a multimodal analytical approach to reveal determinants and predictors of immunity, susceptibility and clinical outcome associated with infection.


Subjects
Complement Activation/immunology , Coronavirus Infections/mortality , Hemorrhage/epidemiology , Macular Degeneration/epidemiology , Viral Pneumonia/mortality , Thrombocytopenia/epidemiology , Thrombosis/epidemiology , Adult , Age Factors , Aged , Aged 80 and over , Betacoronavirus , Blood Coagulation/genetics , Blood Coagulation Disorders/epidemiology , COVID-19 , Complement Activation/genetics , Coronavirus Infections/blood , Coronavirus Infections/genetics , Coronavirus Infections/immunology , Type 2 Diabetes Mellitus/epidemiology , Female , Gene Expression , Hemorrhage/blood , Hemorrhage/immunology , Hereditary Complement Deficiency Diseases/epidemiology , Hereditary Complement Deficiency Diseases/immunology , Humans , Hypertension/epidemiology , Intratracheal Intubation , Male , Middle Aged , New York City/epidemiology , Obesity/epidemiology , Pandemics , Viral Pneumonia/blood , Viral Pneumonia/genetics , Viral Pneumonia/immunology , Proportional Hazards Models , Artificial Respiration , Retrospective Studies , Risk Factors , SARS-CoV-2 , Severity of Illness Index , Sex Factors , Thrombocytopenia/blood , Thrombosis/blood
12.
medRxiv ; 2020 Jun 06.
Article in English | MEDLINE | ID: mdl-32511494

ABSTRACT

Understanding the pathophysiology of SARS-CoV-2 infection is critical for therapeutics and public health intervention strategies. Viral-host interactions can guide discovery of regulators of disease outcomes, and protein structure function analysis points to several immune pathways, including complement and coagulation, as targets of the coronavirus proteome. To determine if conditions associated with dysregulation of the complement or coagulation systems impact adverse clinical outcomes, we performed a retrospective observational study of 11,116 patients who presented with suspected SARS-CoV-2 infection. We found that history of macular degeneration (a proxy for complement activation disorders) and history of coagulation disorders (thrombocytopenia, thrombosis, and hemorrhage) are risk factors for morbidity and mortality in SARS-CoV-2 infected patients - effects that could not be explained by age, sex, or history of smoking. Further, transcriptional profiling of nasopharyngeal (NP) swabs from 650 control and SARS-CoV-2 infected patients demonstrated that in addition to innate Type-I interferon and IL-6 dependent inflammatory immune responses, infection results in robust engagement and activation of the complement and coagulation pathways. Finally, we conducted a candidate driven genetic association study of severe SARS-CoV-2 disease. Among the findings, our scan identified putative complement and coagulation associated loci including missense, eQTL and sQTL variants of critical regulators of the complement and coagulation cascades. In addition to providing evidence that complement function modulates SARS-CoV-2 infection outcome, the data point to putative transcriptional genetic markers of susceptibility. 
The results highlight the value of using a multi-modal analytical approach, combining molecular information from virus protein structure-function analysis with clinical informatics, transcriptomics, and genomics to reveal determinants and predictors of immunity, susceptibility, and clinical outcome associated with infection.

13.
AMIA Annu Symp Proc ; 2020: 1080-1089, 2020.
Article in English | MEDLINE | ID: mdl-33936484

ABSTRACT

Phenotyping algorithms are essential tools for conducting clinical research on observational data. Manually developed phenotyping algorithms, such as those curated within the eMERGE (electronic Medical Records and Genomics) Network, represent the gold standard but are time consuming to create. In this work, we propose a framework for learning from the structure of eMERGE phenotype concept sets to assist construction of novel phenotype definitions. We use eMERGE phenotypes as a source of reference concept sets and engineer rich features characterizing the concept pairs within each set. We treat these pairwise relationships as edges in a concept graph, train models to perform edge prediction, and identify candidate phenotype concept sets as highly connected subgraphs. Candidate concept sets may then be interrogated and composed to construct novel phenotype definitions.


Subjects
Algorithms , Electronic Health Records , Genomics , Phenotype , Humans , Probability
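
The idea in the abstract above of treating predicted concept-pair relationships as graph edges and reading off candidate concept sets can be sketched as follows. This simplification thresholds hypothetical edge-prediction scores and returns connected components, a looser criterion than the "highly connected subgraphs" the authors describe; the scores, threshold, and function name are assumptions.

```python
from collections import defaultdict


def candidate_concept_sets(edge_scores, threshold=0.5, min_size=2):
    """Group concepts linked by edges whose predicted score meets the threshold."""
    adjacency = defaultdict(set)
    for (a, b), score in edge_scores.items():
        if score >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first traversal collects every concept reachable from `start`.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        if len(component) >= min_size:
            components.append(sorted(component))
    return components


# Hypothetical edge-prediction scores between concept pairs.
edge_scores = {
    ("hypertension", "high blood pressure"): 0.9,
    ("high blood pressure", "elevated BP"): 0.8,
    ("asthma", "hypertension"): 0.1,
    ("asthma", "wheezing"): 0.7,
}
print(candidate_concept_sets(edge_scores))
```

A stricter notion of connectivity, such as cliques or dense communities, would yield tighter concept sets at the cost of more computation.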
14.
JAMIA Open ; 2(1): 10-14, 2019 Apr.
Article in English | MEDLINE | ID: mdl-31633087

ABSTRACT

OBJECTIVES: Electronic health record (EHR) data are increasingly used for biomedical discoveries. The nature of the data, however, requires expertise in both data science and EHR structure. The Observational Medical Outcomes Partnership (OMOP) common data model (CDM) standardizes the language and structure of EHR data to promote interoperability of EHR data for research. While the OMOP CDM is valuable and more attuned to research purposes, it still requires extensive domain knowledge to utilize effectively, potentially limiting more widespread adoption of EHR data for research and quality improvement. MATERIALS AND METHODS: We have created ROMOP: an R package for direct interfacing with EHR data in the OMOP CDM format. RESULTS: ROMOP streamlines typical EHR-related data processes. Its functions include exploration of data types, extraction and summarization of patient clinical and demographic data, and patient searches using any CDM vocabulary concept. CONCLUSION: ROMOP is freely available under the Massachusetts Institute of Technology (MIT) license and can be obtained from GitHub (http://github.com/BenGlicksberg/ROMOP). We detail instructions for setup and use in the Supplementary Materials. Additionally, we provide a public sandbox server containing synthesized clinical data for users to explore OMOP data and ROMOP (http://romop.ucsf.edu).
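
To give a sense of what a patient search by vocabulary concept involves in the OMOP CDM, the sketch below builds a tiny in-memory database and runs the join. It is not ROMOP's API (ROMOP is an R package, and none of its functions are reproduced here). The table and column names follow the OMOP CDM (`person`, `concept`, `condition_occurrence`) but only a small subset of columns is included, and the concept IDs and rows are illustrative placeholders.

```python
import sqlite3


def build_toy_cdm():
    """Create an in-memory database with a minimal subset of OMOP CDM tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (person_id INTEGER PRIMARY KEY, year_of_birth INTEGER);
        CREATE TABLE concept (concept_id INTEGER PRIMARY KEY, concept_name TEXT);
        CREATE TABLE condition_occurrence (
            condition_occurrence_id INTEGER PRIMARY KEY,
            person_id INTEGER,
            condition_concept_id INTEGER
        );
        """
    )
    # Placeholder rows; concept IDs are made up for illustration.
    conn.executemany("INSERT INTO person VALUES (?, ?)", [(1, 1950), (2, 1962), (3, 1978)])
    conn.executemany(
        "INSERT INTO concept VALUES (?, ?)",
        [(1001, "Heart failure"), (1002, "Type 2 diabetes mellitus")],
    )
    conn.executemany(
        "INSERT INTO condition_occurrence VALUES (?, ?, ?)",
        [(1, 1, 1001), (2, 2, 1002), (3, 3, 1001), (4, 1, 1002)],
    )
    return conn


def find_patients_by_concept(conn, concept_name):
    """Return (person_id, year_of_birth) for patients with a condition matching the concept."""
    query = """
        SELECT DISTINCT p.person_id, p.year_of_birth
        FROM person AS p
        JOIN condition_occurrence AS co ON co.person_id = p.person_id
        JOIN concept AS c ON c.concept_id = co.condition_concept_id
        WHERE c.concept_name = ?
        ORDER BY p.person_id
    """
    return conn.execute(query, (concept_name,)).fetchall()


conn = build_toy_cdm()
print(find_patients_by_concept(conn, "Heart failure"))
```

Hiding this kind of three-table join behind a single call is the sort of convenience the abstract attributes to ROMOP's patient-search functions.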

15.
Bioinformatics ; 35(21): 4515-4518, 2019 Nov 01.
Article in English | MEDLINE | ID: mdl-31214700

ABSTRACT

MOTIVATION: Electronic health records (EHRs) are quickly becoming omnipresent in healthcare, but interoperability issues and technical demands limit their use for biomedical and clinical research. Interactive and flexible software that interfaces directly with EHR data structured around a common data model (CDM) could accelerate more EHR-based research by making the data more accessible to researchers who lack computational expertise and/or domain knowledge. RESULTS: We present PatientExploreR, an extensible application built on the R/Shiny framework that interfaces with a relational database of EHR data in the Observational Medical Outcomes Partnership CDM format. PatientExploreR produces patient-level interactive and dynamic reports and facilitates visualization of clinical data without any programming required. It allows researchers to easily construct and export patient cohorts from the EHR for analysis with other software. This application could enable easier exploration of patient-level data for physicians and researchers. PatientExploreR can incorporate EHR data from any institution that employs the CDM for users with approved access. The software code is free and open source under the MIT license, enabling institutions to install and users to expand and modify the application for their own purposes. AVAILABILITY AND IMPLEMENTATION: PatientExploreR can be freely obtained from GitHub: https://github.com/BenGlicksberg/PatientExploreR. We provide instructions for how researchers with approved access to their institutional EHR can use this package. We also release an open sandbox server of synthesized patient data for users without EHR access to explore: http://patientexplorer.ucsf.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Electronic Health Records , Software , Computers , Factual Databases , Humans , Observational Studies as Topic
16.
PLoS Comput Biol ; 12(5): e1004903, 2016 May.
Article in English | MEDLINE | ID: mdl-27138214

ABSTRACT

It has been shown that the same canonical cortical circuit model with mutual inhibition and a fatigue process can explain perceptual rivalry and other neurophysiological responses to a range of static stimuli. However, it has been proposed that this model cannot explain responses to dynamic inputs such as found in intermittent rivalry and rivalry memory, where maintenance of a percept when the stimulus is absent is required. This challenges the universality of the basic canonical cortical circuit. Here, we show that by including an overlooked realistic small nonspecific background neural activity, the same basic model can reproduce intermittent rivalry and rivalry memory without compromising static rivalry and other cortical phenomena. The background activity induces a mutual-inhibition mechanism for short-term memory, which is robust to noise and where fine-tuning of recurrent excitation or inclusion of sub-threshold currents or synaptic facilitation is unnecessary. We prove existence conditions for the mechanism and show that it can explain experimental results from the quartet apparent motion illusion, which is a prototypical intermittent rivalry stimulus.


Subjects
Short-Term Memory/physiology , Neurological Models , Visual Cortex/physiology , Adult , Asthenopia/physiopathology , Computational Biology , Humans , Male , Nerve Net/physiology , Optical Illusions/physiology , Photic Stimulation , Visual Perception/physiology
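
The kind of circuit discussed in the abstract above, two populations with mutual inhibition and a slow fatigue process, can be sketched as a simple rate model. This is a generic textbook-style formulation, not the authors' model: the equations, the sigmoid gain, every parameter value, and the way `background` enters as an additive drive are assumptions made for illustration.

```python
import math


def sigmoid(x, threshold=0.2, slope=0.1):
    """Smooth firing-rate nonlinearity bounded between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-(x - threshold) / slope))


def simulate_rivalry(stimulus=1.0, background=0.0, inhibition=2.0, fatigue_gain=0.0,
                     fatigue_tau=50.0, u_start=(0.6, 0.4), dt=0.01, steps=2000):
    """Euler integration of two mutually inhibiting populations with slow fatigue."""
    u1, u2 = u_start
    a1 = a2 = 0.0  # fatigue (adaptation) variables
    trace = []
    for _ in range(steps):
        # Each population is driven by its input, suppressed by the other, and fatigued by itself.
        drive1 = stimulus + background - inhibition * u2 - fatigue_gain * a1
        drive2 = stimulus + background - inhibition * u1 - fatigue_gain * a2
        du1 = -u1 + sigmoid(drive1)
        du2 = -u2 + sigmoid(drive2)
        da1 = (u1 - a1) / fatigue_tau
        da2 = (u2 - a2) / fatigue_tau
        u1, u2 = u1 + dt * du1, u2 + dt * du2
        a1, a2 = a1 + dt * da1, a2 + dt * da2
        trace.append((u1, u2))
    return trace


# With fatigue switched off, the initially stronger population suppresses the other.
final_u1, final_u2 = simulate_rivalry()[-1]
print(round(final_u1, 3), round(final_u2, 3))
```

With `fatigue_gain` set above zero the dominant population's advantage erodes over time; whether that produces alternation depends on the parameter choices, which this sketch does not tune.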