Results 1 - 20 of 44
1.
Bioinformatics ; 39(1), 2023 01 01.
Article in English | MEDLINE | ID: mdl-36472455

ABSTRACT

MOTIVATION: Making sense of networked multivariate association patterns is vitally important to many areas of high-dimensional analysis. Unfortunately, as the data-space dimensions grow, the number of association pairs increases in O(n²); this means that traditional visualizations such as heatmaps quickly become too complicated to parse effectively. RESULTS: Here, we present associationSubgraphs: a new interactive visualization method to quickly and intuitively explore high-dimensional association datasets using network percolation and clustering. The goal is to provide an efficient investigation of association subgraphs, each containing a subset of variables with stronger and more frequent associations among themselves than the remaining variables outside the subset, by showing the entire clustering dynamics and providing subgraphs under all possible cutoff values at once. In particular, we apply associationSubgraphs to a phenome-wide multimorbidity association matrix generated from an electronic health record and provide an online, interactive demonstration for exploring multimorbidity subgraphs. AVAILABILITY AND IMPLEMENTATION: An R package implementing both the algorithm and visualization components of associationSubgraphs is available at https://github.com/tbilab/associationsubgraphs. Online documentation is available at https://prod.tbilab.org/associationsubgraphs_info/. A demo using a multimorbidity association matrix is available at https://prod.tbilab.org/associationsubgraphs-example/.


Subjects
Multimorbidity , Software , Algorithms , Cluster Analysis , Phenomics
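
The percolation idea in the abstract above (subgraphs "under all possible cutoff values at once") can be illustrated with a small sketch. This is not the associationSubgraphs R package's implementation; it is a minimal union-find example, assuming an edge list of (variable, variable, strength) triples. The function name `subgraphs_at_each_cutoff` and the toy edges are hypothetical.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, float]  # (variable_a, variable_b, association strength)


def subgraphs_at_each_cutoff(edges: List[Edge]) -> List[Tuple[float, List[List[str]]]]:
    """Add edges from strongest to weakest and record the connected
    components (subgraphs) present after each edge is added."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    snapshots = []
    for a, b, strength in sorted(edges, key=lambda e: e[2], reverse=True):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two subgraphs
        groups: Dict[str, List[str]] = {}
        for node in parent:
            groups.setdefault(find(node), []).append(node)
        components = sorted(sorted(g) for g in groups.values())
        snapshots.append((strength, components))
    return snapshots


# Hypothetical toy association edges, not data from the paper.
toy_edges = [
    ("A", "B", 0.9),
    ("C", "D", 0.8),
    ("B", "C", 0.3),
    ("D", "E", 0.1),
]
for cutoff, components in subgraphs_at_each_cutoff(toy_edges):
    print(cutoff, components)
```

Each snapshot corresponds to one cutoff value, so scanning the list shows how tight clusters form first and merge as the cutoff is lowered.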
2.
Am J Epidemiol ; 192(2): 283-295, 2023 02 01.
Article in English | MEDLINE | ID: mdl-36331289

ABSTRACT

We sought to determine whether machine learning and natural language processing (NLP) applied to electronic medical records could improve performance of automated health-care claims-based algorithms to identify anaphylaxis events using data on 516 patients with outpatient, emergency department, or inpatient anaphylaxis diagnosis codes during 2015-2019 in 2 integrated health-care institutions in the Northwest United States. We used one site's manually reviewed gold-standard outcomes data for model development and the other's for external validation based on cross-validated area under the receiver operating characteristic curve (AUC), positive predictive value (PPV), and sensitivity. In the development site 154 (64%) of 239 potential events met adjudication criteria for anaphylaxis compared with 180 (65%) of 277 in the validation site. Logistic regression models using only structured claims data achieved a cross-validated AUC of 0.58 (95% CI: 0.54, 0.63). Machine learning improved cross-validated AUC to 0.62 (0.58, 0.66); incorporating NLP-derived covariates further increased cross-validated AUCs to 0.70 (0.66, 0.75) in development and 0.67 (0.63, 0.71) in external validation data. A classification threshold with cross-validated PPV of 79% and cross-validated sensitivity of 66% in development data had cross-validated PPV of 78% and cross-validated sensitivity of 56% in external data. Machine learning and NLP-derived data improved identification of validated anaphylaxis events.


Subjects
Anaphylaxis , Natural Language Processing , Humans , Anaphylaxis/diagnosis , Anaphylaxis/epidemiology , Machine Learning , Algorithms , Emergency Service, Hospital , Electronic Health Records
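
The anaphylaxis abstract above reports positive predictive value (PPV) and sensitivity at a classification threshold. As a minimal sketch of how those two quantities follow from model scores and adjudicated labels, assuming a binary outcome coded 1/0: the scores and labels below are hypothetical toy values, not the study's data, and no cross-validation is performed here.

```python
from typing import Sequence, Tuple


def ppv_and_sensitivity(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Flag a record as positive when score >= threshold, then compute
    PPV = TP / (TP + FP) and sensitivity = TP / (TP + FN)."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return ppv, sensitivity


# Hypothetical predicted probabilities and gold-standard labels.
toy_scores = [0.95, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10]
toy_labels = [1, 1, 0, 1, 1, 0, 0, 0]

for threshold in (0.5, 0.75):
    print(threshold, ppv_and_sensitivity(toy_scores, toy_labels, threshold))
```

Raising the threshold in this toy example increases PPV and lowers sensitivity, which is the trade-off a fixed threshold has to balance when it is carried over to external data.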
3.
Eur J Contracept Reprod Health Care ; 28(1): 17-22, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36537554

ABSTRACT

PURPOSE: Although non-barrier contraception is commonly prescribed, the risk of urinary tract infections (UTI) with contraceptive exposure is unclear. MATERIALS AND METHODS: Using data from Vanderbilt University Medical Centre's deidentified electronic health record (EHR), women ages 18-52 were randomly sampled and matched based on age and length of EHR. This case-control analysis tested for association between contraception exposure and outcome using UTI-positive (UTI+) as cases and upper respiratory infection+ (URI+) as controls. RESULTS: 24,563 UTI+ cases (mean EHR: 64.2 months; mean age: 31.2 years) and 48,649 UTI-/URI+ controls (mean EHR: 63.2 months; mean age: 31.9 years) were analysed. In the primary analysis, UTI risk was statistically significantly increased for the oral contraceptive pill (OCP; OR = 1.10 [95% CI = 1.02-1.11], p ≤ 0.05), intrauterine device (IUD; OR = 1.13 [95% CI = 1.04-1.23], p ≤ 0.05), etonogestrel implant (Nexplanon®; OR = 1.56 [95% CI = 1.24-1.96], p ≤ 0.05), and medroxyprogesterone acetate injectable (Depo-Provera®; OR = 2.16 [95% CI = 1.99-2.33], p ≤ 0.05) use compared to women not prescribed contraception. A secondary analysis that included any non-IUD contraception, which could serve as a proxy for sexual activity, demonstrated a small attenuation for the association between UTI and IUD (OR = 1.09 [95% CI = 0.98-1.21], p = 0.13). CONCLUSION: This study notes potential for a small increase in UTIs with contraceptive use. Prospective studies are required before this information is applied in clinical settings. CONDENSATION: Although non-barrier contraception is commonly prescribed, the risk of urinary tract infections (UTI) with contraceptive exposure is poorly understood. This large-cohort, case-control study notes potential for a small increase in UTIs with contraceptive use.


Subjects
Contraceptive Agents, Female , Urinary Tract Infections , Female , Humans , Adult , Adolescent , Young Adult , Middle Aged , Case-Control Studies , Medroxyprogesterone Acetate , Contraceptives, Oral , Contraception/adverse effects , Urinary Tract Infections/epidemiology , Urinary Tract Infections/etiology , Contraceptive Agents, Female/adverse effects
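
The contraception abstract above reports odds ratios with 95% confidence intervals. As a rough illustration of how such figures arise, the sketch below computes an unadjusted odds ratio and a Wald interval from a 2x2 table. This is a simplification: the study used a matched design, and its actual model is not described in the abstract. The counts are hypothetical.

```python
import math
from typing import Tuple


def odds_ratio_with_ci(
    exposed_cases: int,
    unexposed_cases: int,
    exposed_controls: int,
    unexposed_controls: int,
    z: float = 1.96,  # normal quantile for a 95% interval
) -> Tuple[float, float, float]:
    """Unadjusted odds ratio with a Wald confidence interval on the log scale."""
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls
    )
    standard_error = math.sqrt(
        1 / exposed_cases
        + 1 / unexposed_cases
        + 1 / exposed_controls
        + 1 / unexposed_controls
    )
    lower = math.exp(math.log(odds_ratio) - z * standard_error)
    upper = math.exp(math.log(odds_ratio) + z * standard_error)
    return odds_ratio, lower, upper


# Hypothetical counts: 30 of 100 cases exposed, 20 of 100 controls exposed.
estimate, lower, upper = odds_ratio_with_ci(30, 70, 20, 80)
print(f"OR = {estimate:.2f} [95% CI = {lower:.2f}-{upper:.2f}]")
```

An interval that includes 1, as in the abstract's secondary IUD analysis, means the data are compatible with no association at the chosen confidence level.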
4.
BMC Med ; 20(1): 333, 2022 09 28.
Article in English | MEDLINE | ID: mdl-36167547

ABSTRACT

BACKGROUND: Identifying pregnancies at risk for preterm birth, one of the leading causes of worldwide infant mortality, has the potential to improve prenatal care. However, we lack broadly applicable methods to accurately predict preterm birth risk. The dense longitudinal information present in electronic health records (EHRs) is enabling scalable and cost-efficient risk modeling of many diseases, but EHR resources have been largely untapped in the study of pregnancy. METHODS: Here, we apply machine learning to diverse data from EHRs with 35,282 deliveries to predict singleton preterm birth. RESULTS: We find that machine learning models based on billing codes alone can predict preterm birth risk at various gestational ages (e.g., ROC-AUC = 0.75, PR-AUC = 0.40 at 28 weeks of gestation) and outperform comparable models trained using known risk factors (e.g., ROC-AUC = 0.65, PR-AUC = 0.25 at 28 weeks). Examining the patterns learned by the model reveals it stratifies deliveries into interpretable groups, including high-risk preterm birth subtypes enriched for distinct comorbidities. Our machine learning approach also predicts preterm birth subtypes (spontaneous vs. indicated), mode of delivery, and recurrent preterm birth. Finally, we demonstrate the portability of our approach by showing that the prediction models maintain their accuracy on a large, independent cohort (5978 deliveries) from a different healthcare system. CONCLUSIONS: By leveraging rich phenotypic and genetic features derived from EHRs, we suggest that machine learning algorithms have great potential to improve medical care during pregnancy. However, further work is needed before these models can be applied in clinical settings.


Subjects
Premature Birth , Algorithms , Electronic Health Records , Female , Gestational Age , Humans , Infant, Newborn , Machine Learning , Pregnancy , Premature Birth/diagnosis , Premature Birth/epidemiology
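
The preterm birth abstract above summarizes model performance as ROC-AUC. One way to read that number: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it directly from that definition on hypothetical toy scores; it is quadratic in the sample size and meant only as an illustration, not as the study's evaluation code.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs in which the
    positive case scores higher; ties count as half a win."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical risk scores and outcomes (1 = preterm, 0 = term).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
toy_labels = [1, 1, 0, 1, 0, 0]
print(roc_auc(toy_scores, toy_labels))
```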
5.
Med Care ; 60(8): 570-578, 2022 08 01.
Article in English | MEDLINE | ID: mdl-35658116

ABSTRACT

BACKGROUND: Persons with multimorbidity (≥2 chronic conditions) face an increased risk of poor health outcomes, especially as they age. Psychosocial factors such as social isolation, chronic stress, housing insecurity, and financial insecurity have been shown to exacerbate these outcomes, but are not routinely assessed during the clinical encounter. Our objective was to extract these concepts from chart notes using natural language processing and predict their impact on health care utilization for patients with multimorbidity. METHODS: A cohort study to predict the 1-year likelihood of hospitalizations and emergency department visits for patients 65+ with multimorbidity with and without psychosocial factors. Psychosocial factors were extracted from narrative notes; all other covariates were extracted from electronic health record data from a large academic medical center using validated algorithms and concept sets. Logistic regression was performed to predict the likelihood of hospitalization and emergency department visit in the next year. RESULTS: In all, 76,479 patients were eligible; the majority were White (89%), 54% were female, with mean age 73. Those with psychosocial factors were older, had higher baseline utilization, and more chronic illnesses. The 4 psychosocial factors all independently predicted future utilization (odds ratio=1.27-2.77, C-statistic=0.63). Accounting for demographics, specific conditions, and previous utilization, 3 of 4 of the extracted factors remained predictive (odds ratio=1.13-1.86) for future utilization. Compared with models with no psychosocial factors, they had improved discrimination. Individual predictions were mixed, with social isolation predicting depression and morbidity; stress predicting atherosclerotic cardiovascular disease onset; and housing insecurity predicting substance use disorder morbidity.
DISCUSSION: Psychosocial factors are known to have adverse health impacts, but are rarely measured; using natural language processing, we extracted factors that identified a higher risk segment of older adults with multimorbidity. Combining these extraction techniques with other measures of social determinants may help catalyze population health efforts to address psychosocial factors to mitigate their health impacts.


Subjects
Hospitalization , Patient Acceptance of Health Care , Aged , Chronic Disease , Cohort Studies , Emergency Service, Hospital , Female , Humans , Male , Multimorbidity , Patient Acceptance of Health Care/psychology
6.
J Biomed Inform ; 117: 103777, 2021 05.
Article in English | MEDLINE | ID: mdl-33838341

ABSTRACT

From the start of the coronavirus disease 2019 (COVID-19) pandemic, researchers have looked to electronic health record (EHR) data as a way to study possible risk factors and outcomes. To ensure the validity and accuracy of research using these data, investigators need to be confident that the phenotypes they construct are reliable and accurate, reflecting the healthcare settings from which they are ascertained. We developed a COVID-19 registry at a single academic medical center and used data from March 1 to June 5, 2020 to assess differences in population-level characteristics in pandemic and non-pandemic years respectively. Median EHR length, previously shown to impact phenotype performance in type 2 diabetes, was significantly shorter in the SARS-CoV-2 positive group relative to a 2019 influenza tested group (median 3.1 years vs 8.7; Wilcoxon rank sum P = 1.3e-52). Using three phenotyping methods of increasing complexity (billing codes alone and domain-specific algorithms provided by an EHR vendor and clinical experts), common medical comorbidities were abstracted from COVID-19 EHRs, defined by the presence of a positive laboratory test (positive predictive value 100%, recall 93%). After combining performance data across phenotyping methods, we observed significantly lower false negative rates for those records billed for a comprehensive care visit (p = 4e-11) and those with complete demographics data recorded (p = 7e-5). In an early COVID-19 cohort, we found that phenotyping performance of nine common comorbidities was influenced by median EHR length, consistent with previous studies, as well as by data density, which can be measured using portable metrics including CPT codes. Here we present those challenges and potential solutions to creating deeply phenotyped, acute COVID-19 cohorts.


Subjects
COVID-19/diagnosis , Electronic Health Records , Phenotype , Comorbidity , Diabetes Mellitus, Type 2 , Global Health , Humans , Influenza, Human , Likelihood Functions , Pandemics
7.
J Allergy Clin Immunol ; 144(1): 183-192, 2019 07.
Article in English | MEDLINE | ID: mdl-30776417

ABSTRACT

BACKGROUND: Vancomycin is a prevalent cause of the severe hypersensitivity syndrome drug reaction with eosinophilia and systemic symptoms (DRESS), which leads to significant morbidity and mortality and commonly occurs in the setting of combination antibiotic therapy, affecting future treatment choices. Variations in HLA class I in particular have been associated with serious T cell-mediated adverse drug reactions, which has led to preventive screening strategies for some drugs. OBJECTIVE: We sought to determine whether variation in the HLA region is associated with vancomycin-induced DRESS. METHODS: Probable vancomycin-induced DRESS cases were matched 1:2 with tolerant control subjects based on sex, race, and age by using BioVU, Vanderbilt's deidentified electronic health record database. Associations between DRESS and carriage of HLA class I and II alleles were assessed by means of conditional logistic regression. An extended sample set from BioVU was used to conduct a time-to-event analysis of those exposed to vancomycin with and without the identified HLA risk allele. RESULTS: Twenty-three subjects met the inclusion criteria for vancomycin-associated DRESS. Nineteen (82.6%) of 23 cases carried HLA-A*32:01 compared with 0 (0%) of 46 of the matched vancomycin-tolerant control subjects (P = 1 × 10⁻⁸) and 6.3% of the BioVU population (n = 54,249, P = 2 × 10⁻¹⁶). Time-to-event analysis of DRESS development during vancomycin treatment among the HLA-A*32:01-positive group indicated that 19.2% had DRESS and did so within 4 weeks. CONCLUSIONS: HLA-A*32:01 is strongly associated with vancomycin-induced DRESS in a population of predominantly European ancestry. HLA-A*32:01 testing could improve antibiotic safety, help implicate vancomycin as the causal drug, and preserve future treatment options with coadministered antibiotics.


Subjects
Anti-Bacterial Agents/adverse effects , Drug Hypersensitivity Syndrome/immunology , HLA-A Antigens/immunology , Vancomycin/adverse effects , Adolescent , Adult , Aged , Anti-Bacterial Agents/chemistry , Drug Hypersensitivity Syndrome/etiology , Female , HLA-A Antigens/chemistry , Humans , Male , Middle Aged , Molecular Docking Simulation , Vancomycin/chemistry , Young Adult
8.
Eur Urol Open Sci ; 67: 38-44, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39156495

ABSTRACT

Background and objective: Previous studies have reported a strong genetic contribution to kidney stone risk. This study aims to identify genetic associations of kidney stone disease within a large-scale electronic health record system. Methods: We performed genome-wide association studies (GWASs) for nephrolithiasis from genotyped samples of 5571 cases and 83 692 controls. This analysis included a primary GWAS focused on nephrolithiasis and subsequent subgroup GWASs stratified by stone composition types. For significant risk variants, we performed association analyses with stone composition and first-time 24-h urine parameters. To assess disease severity, we investigated the associations with age at first stone diagnosis, age at first stone-related procedure, and time between first and second stone-related procedures. Key findings and limitations: The primary GWAS analysis identified ten significant loci, all located on chromosome 16 within coding regions of the UMOD gene. The strongest signal was rs28544423 (odds ratio 1.17, 95% confidence interval 1.11-1.23, p = 2.7 × 10⁻⁹). In subgroup GWASs stratified by six kidney stone composition subtypes, 19 significant loci were identified including two loci in coding regions (brushite; NXPH1, rs79970906 and rs4725104). The UMOD single nucleotide polymorphism rs28544423 was associated with differences in 24-h excretion of urinary analytes, and the minor allele was positively associated with calcium oxalate dihydrate stone composition (p < 0.05). No associations were found between UMOD variants and disease severity. Limitations include an omitted variable bias and a misclassification bias. Conclusions and clinical implications: We replicated germline variants associated with kidney stone disease risk at UMOD and reported novel variants associated with stone composition. Genetic variants of UMOD are associated with differences in 24-h urine parameters and stone composition, but not disease severity.
Patient summary: We identify genetic variants linked to kidney stone disease within an electronic health record (EHR) system. These findings suggest a role for the EHR to enable a precision-medicine approach for stone disease.

9.
medRxiv ; 2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38343797

ABSTRACT

Introduction and Objective: We sought to replicate and discover genetic associations of kidney stone disease within a large-scale electronic health record (EHR) system. Methods: We performed genome-wide association studies (GWASs) for nephrolithiasis from genotyped samples of 5,571 cases and 83,692 controls. Among the significant risk variants, we performed association analyses of stone composition and first-time 24-hour urine parameters. To assess disease severity, we investigated the associations of risk variants with age at first stone diagnosis, age at first procedure, and time from first to second procedure. Results: The main GWAS analysis identified 10 significant loci, each located on chromosome 16 within coding regions of the UMOD gene, which codes for uromodulin, a urine protein with inhibitory activity for calcium crystallization. The strongest signal was from SNP 16:20359633-C-T (odds ratio [OR] 1.17, 95% CI 1.11-1.23), with the remaining significant SNPs having similar effect sizes. In subgroup GWASs by stone composition, 19 significant loci were identified, of which two loci were located in coding regions (brushite; NXPH1, rs79970906 and rs4725104). The UMOD SNP 16:20359633-C-T was associated with differences in 24-hour excretion of urinary calcium, uric acid, phosphorus, and sulfate, and the minor allele was positively associated with calcium oxalate dihydrate stone composition (p<0.05). No associations were found between UMOD variants and disease severity. Conclusions: We replicated germline variants associated with kidney stone disease risk at UMOD and reported novel variants associated with stone composition. Genetic variants of UMOD are associated with differences in 24-hour urine parameters and stone composition, but not disease severity.

10.
medRxiv ; 2024 Aug 13.
Article in English | MEDLINE | ID: mdl-39211884

ABSTRACT

Background: Recent advancements of large language models (LLMs) like Generative Pre-trained Transformer 4 (GPT-4) have generated significant interest among the scientific community. Yet, the potential of these models to be utilized in clinical settings remains largely unexplored. This study investigated the abilities of multiple LLMs and traditional machine learning models to analyze emergency department (ED) reports and determine if the corresponding visits were caused by symptomatic kidney stones. Methods: Leveraging a dataset of manually annotated ED reports, we developed strategies to enhance the performance of GPT-4, GPT-3.5, and Llama-2 including prompt optimization, zero- and few-shot prompting, fine-tuning, and prompt augmentation. Further, we implemented fairness assessment and bias mitigation methods to investigate the potential disparities by these LLMs with respect to race and gender. A clinical expert manually assessed the explanations generated by GPT-4 for its predictions to determine if they were sound, factually correct, unrelated to the input prompt, or potentially harmful. The evaluation includes a comparison between LLMs, traditional machine learning models (logistic regression, extreme gradient boosting, and light gradient boosting machine), and a baseline system utilizing International Classification of Diseases (ICD) codes for kidney stones. Results: The best results were achieved by GPT-4 (macro-F1=0.833, 95% confidence interval [CI]=0.826-0.841) and GPT-3.5 (macro-F1=0.796, 95% CI=0.796-0.796), both being statistically significantly better than the ICD-based baseline result (macro-F1=0.71). Ablation studies revealed that the initial pre-trained GPT-3.5 model benefits from fine-tuning when using the same parameter configuration. Adding demographic information and prior disease history to the prompts allows LLMs to make more accurate decisions. 
The evaluation of bias found that GPT-4 exhibited no racial or gender disparities, in contrast to GPT-3.5, which failed to effectively model racial diversity. The analysis of explanations provided by GPT-4 demonstrates advanced capabilities of this model in understanding clinical text and reasoning with medical knowledge.
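
The LLM abstract above compares systems by macro-F1. As a brief illustration of that metric, the sketch below computes a per-class F1 and averages the classes with equal weight, so a rare class counts as much as a common one. The labels are hypothetical toy values; the study's own evaluation code and data are not shown in the abstract.

```python
from typing import Hashable, List, Sequence


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 scores, where
    F1 = 2*TP / (2*TP + FP + FN) for each class."""
    classes = sorted(set(y_true) | set(y_pred))
    per_class: List[float] = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denominator = 2 * tp + fp + fn
        per_class.append(2 * tp / denominator if denominator else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical labels: 1 = visit caused by a symptomatic kidney stone, 0 = not.
toy_true = [1, 1, 1, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 1]
print(macro_f1(toy_true, toy_pred))
```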

11.
J Mol Diagn ; 26(7): 563-573, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38588769

ABSTRACT

Clonal hematopoiesis of indeterminate potential (CHIP) is a common age-related phenomenon in which hematopoietic stem cells acquire mutations in a select set of genes commonly mutated in myeloid neoplasia which then expand clonally. Current sequencing assays to detect CHIP mutations are not optimized for the detection of these variants and can be cost-prohibitive when applied to large cohorts or to serial sequencing. In this study, an affordable (approximately US $8 per sample), accurate, and scalable sequencing assay for CHIP is introduced and validated. The efficacy of the assay was demonstrated by identifying CHIP mutations in a cohort of 456 individuals with DNA collected at multiple time points in Vanderbilt University's biobank and quantifying clonal expansion rates over time. A total of 101 individuals with CHIP/clonal cytopenia of undetermined significance were identified, and individual-level clonal expansion rate was calculated using the variant allele fraction at both time points. Differences in clonal expansion rate by driver gene were observed, but there was also significant individual-level heterogeneity, emphasizing the multifactorial nature of clonal expansion. Additionally, mutation co-occurrence and clonal competition between multiple driver mutations were explored.


Subjects
Clonal Hematopoiesis , Mutation , Humans , Clonal Hematopoiesis/genetics , Male , Female , Aged , Middle Aged , Adult , High-Throughput Nucleotide Sequencing/methods , High-Throughput Nucleotide Sequencing/economics , Cost-Benefit Analysis , Hematopoietic Stem Cells/metabolism , Hematopoietic Stem Cells/cytology , Clonal Evolution/genetics , Aged, 80 and over , Hematopoiesis/genetics
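
The CHIP assay abstract above says the clonal expansion rate was calculated from the variant allele fraction (VAF) at two time points, but it does not give the formula. The sketch below shows one common convention, assuming exponential growth between the two samples; this assumption, the function name `annualized_growth_rate`, and the example values are all hypothetical and may differ from what the authors did.

```python
def annualized_growth_rate(
    vaf_first: float, vaf_second: float, years_between: float
) -> float:
    """Annual fractional growth under an assumed exponential model:
    vaf_second = vaf_first * (1 + rate) ** years_between."""
    if vaf_first <= 0 or vaf_second <= 0:
        raise ValueError("VAF values must be positive")
    if years_between <= 0:
        raise ValueError("the second sample must be later than the first")
    return (vaf_second / vaf_first) ** (1 / years_between) - 1


# Hypothetical clone: VAF rises from 2% to 8% over two years.
rate = annualized_growth_rate(0.02, 0.08, 2.0)
print(f"{rate:.0%} per year")
```

A negative result indicates a shrinking clone, which is one way clonal competition between driver mutations could show up in serial samples.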
12.
Blood Cancer J ; 14(1): 6, 2024 01 15.
Article in English | MEDLINE | ID: mdl-38225345

ABSTRACT

Clonal hematopoiesis (CH) can be caused by either single gene mutations (e.g., point mutations in JAK2 causing CHIP) or mosaic chromosomal alterations (mCAs; e.g., loss of heterozygosity at chromosome 9p). CH is associated with a significantly increased risk of hematologic malignancies. However, the absolute rate of transformation on an annualized basis is low. Improved prognostication of transformation risk is urgently needed for routine clinical practice. We hypothesized that the co-occurrence of CHIP and mCAs at the same locus (e.g., transforming a heterozygous JAK2 CHIP mutation into a homozygous mutation through concomitant loss of heterozygosity at chromosome 9) might have important prognostic implications for malignancy transformation risk. We tested this hypothesis using our discovery cohort, the UK Biobank (n = 451,180), and subsequently validated it in the BioVU cohort (n = 91,335). We find that individuals with a concurrent somatic mutation and mCA were at significantly increased risk of hematologic malignancy (for example, in the BioVU cohort, the incidence of hematologic malignancies is higher in individuals with co-occurring JAK2 V617F and 9p CN-LOH [HR = 54.76, 95% CI = 33.92-88.41, P < 0.001] vs. JAK2 V617F alone [HR = 44.05, 95% CI = 35.06-55.35, P < 0.001]). Currently, the 'zygosity' of the CHIP mutation is not routinely reported in clinical assays or considered in prognosticating CHIP transformation risk. Based on these observations, we propose that clinical reports should include 'zygosity' status of CHIP mutations and that future prognostication systems should take mutation 'zygosity' into account.


Subjects
Clonal Hematopoiesis , Hematologic Neoplasms , Humans , Mutation , Point Mutation , Chromosome Aberrations , Hematologic Neoplasms/genetics
13.
Sci Rep ; 14(1): 23429, 2024 10 08.
Article in English | MEDLINE | ID: mdl-39379449

ABSTRACT

Post-marketing safety surveillance depends in part on the ability to detect concerning clinical events at scale. Spontaneous reporting might be an effective component of safety surveillance, but it requires awareness and understanding among healthcare professionals to achieve its potential. Reliance on readily available structured data such as diagnostic codes risks under-coding and imprecision. Clinical textual data might bridge these gaps, and natural language processing (NLP) has been shown to aid in scalable phenotyping across healthcare records in multiple clinical domains. In this study, we developed and validated a novel incident phenotyping approach using unstructured clinical textual data agnostic to Electronic Health Record (EHR) and note type. It is based on a published, validated approach (PheRe) used to ascertain social determinants of health and suicidality across entire healthcare records. To demonstrate generalizability, we validated this approach on two separate phenotypes that share common challenges with respect to accurate ascertainment: (1) suicide attempt; (2) sleep-related behaviors. With samples of 89,428 records and 35,863 records for suicide attempt and sleep-related behaviors, respectively, we conducted silver standard (diagnostic coding) and gold standard (manual chart review) validation. We showed Area Under the Precision-Recall Curve (AUPR) of ~0.77 (95% CI 0.75-0.78) for suicide attempt and AUPR of ~0.31 (95% CI 0.28-0.34) for sleep-related behaviors. We also evaluated performance by coded race and demonstrated that differences in performance by race varied across phenotypes. Scalable phenotyping models, like most healthcare AI, require algorithmovigilance and debiasing prior to implementation.


Subjects
Electronic Health Records , Natural Language Processing , Humans , Models, Statistical , Female , Male , Suicide, Attempted , Adult , Middle Aged
14.
J Am Med Inform Assoc ; 31(11): 2440-2446, 2024 Nov 01.
Article in English | MEDLINE | ID: mdl-39127052

ABSTRACT

OBJECTIVES: To address the need for interactive visualization tools and databases in characterizing multimorbidity patterns across different populations, we developed the Phenome-wide Multi-Institutional Multimorbidity Explorer (PheMIME). This tool leverages three large-scale EHR systems to facilitate efficient analysis and visualization of disease multimorbidity, aiming to reveal both robust and novel disease associations that are consistent across different systems and to provide insight for enhancing personalized healthcare strategies. MATERIALS AND METHODS: PheMIME integrates summary statistics from phenome-wide analyses of disease multimorbidities, utilizing data from Vanderbilt University Medical Center, Mass General Brigham, and the UK Biobank. It offers interactive and multifaceted visualizations for exploring multimorbidity. Incorporating an enhanced version of associationSubgraphs, PheMIME also enables dynamic analysis and inference of disease clusters, promoting the discovery of complex multimorbidity patterns. A case study on schizophrenia demonstrates its capability for generating interactive visualizations of multimorbidity networks within and across multiple systems. Additionally, PheMIME supports diverse multimorbidity-based discoveries, detailed further in online case studies. RESULTS: The PheMIME is accessible at https://prod.tbilab.org/PheMIME/. A comprehensive tutorial and multiple case studies for demonstration are available at https://prod.tbilab.org/PheMIME_supplementary_materials/. The source code can be downloaded from https://github.com/tbilab/PheMIME. DISCUSSION: PheMIME represents a significant advancement in medical informatics, offering an efficient solution for accessing, analyzing, and interpreting the complex and noisy real-world patient data in electronic health records. 
CONCLUSION: PheMIME provides an extensive multimorbidity knowledge base that consolidates data from three EHR systems, and it is a novel interactive tool designed to analyze and visualize multimorbidities across multiple EHR datasets. It stands out as the first of its kind to offer extensive multimorbidity knowledge integration with substantial support for efficient online analysis and interactive visualization.


Subjects
Electronic Health Records , Multimorbidity , Humans , Knowledge Bases , Software , Schizophrenia , Phenomics , User-Computer Interface , Internet , Phenotype
15.
Nat Commun ; 15(1): 2568, 2024 Mar 26.
Article in English | MEDLINE | ID: mdl-38531883

ABSTRACT

Immune checkpoint inhibitor-mediated colitis (IMC) is a common adverse event of treatment with immune checkpoint inhibitors (ICI). We hypothesize that genetic susceptibility to Crohn's disease (CD) and ulcerative colitis (UC) predisposes to IMC. In this study, we first develop polygenic risk scores for CD (PRS_CD) and UC (PRS_UC) in cancer-free individuals and then test these PRSs on IMC in a cohort of 1316 patients with ICI-treated non-small cell lung cancer and perform a replication in 873 ICI-treated pan-cancer patients. In a meta-analysis, the PRS_UC predicts all-grade IMC (OR_meta = 1.35 per standard deviation [SD], 95% CI = 1.12-1.64, P = 2 × 10⁻³) and severe IMC (OR_meta = 1.49 per SD, 95% CI = 1.18-1.88, P = 9 × 10⁻⁴). PRS_CD is not associated with IMC. Furthermore, PRS_UC predicts severe IMC among patients treated with combination ICIs (OR_meta = 2.20 per SD, 95% CI = 1.07-4.53, P = 0.03). Overall, PRS_UC can identify patients receiving ICI at risk of developing IMC and may be useful to monitor patients and improve patient outcomes.


Subjects
Carcinoma, Non-Small-Cell Lung , Colitis, Ulcerative , Colitis , Crohn Disease , Lung Neoplasms , Humans , Colitis, Ulcerative/genetics , Immune Checkpoint Inhibitors , Genetic Risk Score , Crohn Disease/genetics
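
The colitis abstract above reports odds ratios "per standard deviation" of a polygenic risk score. In its simplest form, such a score is a weighted sum of risk-allele dosages that is then standardized across the cohort so that one unit equals one SD. The sketch below shows only that construction, with hypothetical weights and genotypes; the abstract does not describe the study's own scoring method, and the regression that yields the odds ratio is not shown.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def polygenic_risk_scores(
    dosages: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Raw score per person: sum over variants of weight * allele dosage (0, 1, or 2)."""
    scores = []
    for person in dosages:
        if len(person) != len(weights):
            raise ValueError("each person needs one dosage per variant")
        scores.append(sum(w * d for w, d in zip(weights, person)))
    return scores


def standardize(scores: Sequence[float]) -> List[float]:
    """Rescale scores to mean 0 and SD 1, so one unit is one standard deviation."""
    mu = mean(scores)
    sd = pstdev(scores)
    if sd == 0:
        raise ValueError("scores have no variation")
    return [(s - mu) / sd for s in scores]


# Hypothetical per-variant weights (log odds ratios) and genotypes for three people.
toy_weights = [0.2, -0.1, 0.05]
toy_dosages = [[0, 1, 2], [1, 1, 0], [2, 0, 1]]

raw = polygenic_risk_scores(toy_dosages, toy_weights)
z = standardize(raw)
print(raw)
print(z)
```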
16.
medRxiv ; 2024 May 27.
Article in English | MEDLINE | ID: mdl-38585743

ABSTRACT

Background: Electronic health records (EHR) are increasingly used for studying multimorbidities. However, concerns about accuracy, completeness, and EHRs being primarily designed for billing and administrative purposes raise questions about the consistency and reproducibility of EHR-based multimorbidity research. Methods: Utilizing phecodes to represent the disease phenome, we analyzed pairwise comorbidity strengths using a dual logistic regression approach and constructed multimorbidity as an undirected weighted graph. We assessed the consistency of the multimorbidity networks within and between two major EHR systems at local (nodes and edges), meso (neighboring patterns), and global (network statistics) scales. We present case studies to identify disease clusters and uncover clinically interpretable disease relationships. We provide an interactive web tool and a knowledge base combining data from multiple sources for online multimorbidity analysis. Findings: Analyzing data from 500,000 patients across Vanderbilt University Medical Center and Mass General Brigham health systems, we observed a strong correlation in disease frequencies (Kendall's τ = 0.643) and comorbidity strengths (Pearson ρ = 0.79). Consistent network statistics across EHRs suggest similar structures of multimorbidity networks at various scales. Comorbidity strengths and similarities of multimorbidity connection patterns align with the disease genetic correlations. Graph-theoretic analyses revealed a consistent core-periphery structure, implying efficient network clustering through threshold graph construction. Using hydronephrosis as a case study, we demonstrated the network's ability to uncover clinically relevant disease relationships and provide novel insights. Interpretation: Our findings demonstrate the robustness of large-scale EHR data for studying phenome-wide multimorbidities. The alignment of multimorbidity patterns with genetic data suggests the potential utility for uncovering shared biology of diseases. The consistent core-periphery structure offers analytical insights to discover complex disease interactions. This work also sets the stage for advanced disease modeling, with implications for precision medicine. Funding: VUMC Biostatistics Development Award, the National Institutes of Health, and the VA CSRD.

17.
medRxiv ; 2024 Jul 29.
Article in English | MEDLINE | ID: mdl-39132474

ABSTRACT

Background: Standardized definitions of suicidality phenotypes, including suicidal ideation (SI), attempt (SA), and death (SD), are a critical step towards improving understanding and comparison of results in suicide research. The complexity of suicidality contributes to heterogeneity in phenotype definitions, impeding evaluation of clinical and genetic risk factors across studies and efforts to combine samples within consortia. Here, we present expert and data-supported recommendations for defining suicidality and control phenotypes to facilitate merging current/legacy samples with definition variability and aid future sample creation. Methods: A subgroup of clinician researchers and experts from the Suicide Workgroup of the Psychiatric Genomics Consortium (PGC) reviewed existing PGC definitions for SI, SA, SD, and control groups and generated preliminary consensus guidelines for instrument-derived and International Classification of Diseases (ICD) data. ICD lists were validated in two independent datasets (N = 9,151 and 12,394). Results: Recommendations are provided for evaluated instruments for SA and SI, emphasizing selection of lifetime measures and phenotype-specific wording. Recommendations are also provided for defining SI and SD from ICD data. As the SA ICD definition is complex, SA code list recommendations were validated against instrument results with sensitivity (range = 15.4% to 80.6%), specificity (range = 67.6% to 97.4%), and positive predictive values (range = 0.59-0.93) reported. Conclusions: Best-practice guidelines are presented for the use of existing information to define SI/SA/SD in consortia research. These proposed definitions are expected to facilitate more homogeneous data aggregation for genetic and multisite studies. Future research should involve refinement, improved generalizability, and validation in diverse populations.

18.
J Infect Public Health ; 16(9): 1333-1340, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37429097

ABSTRACT

BACKGROUND: The first human monkeypox (MPX) case was identified in the Democratic Republic of Congo (DRC) in 1970, with an outbreak in 2010 and the first human MPX case in the UK in 2022. In this study, we conducted a bibliometric analysis of the literature on monkeypox based on the Web of Science Core Collection (WOSCC) of the Institute for Scientific Information (ISI) to identify relevant topics and trends in monkeypox research. METHODS: We searched the Web of Science from 1964 until July 14, 2022, for all publications using the keywords "Monkeypox" and "Monkeypox virus." Results were compared using numerous bibliometric methodologies and stratified by journal, author, year, institution, and country-specific metrics. RESULTS: Out of 1170 publications initially selected, 1163 entered our analysis, with 65.26 % (n = 759) being original research articles and 9.37 % (n = 109) being review articles. The largest share of MPX publications appeared in 2010, at 6.02 % (n = 70), followed by 2009 and 2022 at 5.67 % (n = 66) each. The USA was the country with the highest number of publications, with n = 662 (56.92 %) of total publications, followed by Germany with n = 82 (7.05 %), the UK with n = 74 (6.36 %), and Congo with n = 65 (5.59 %). Journal of Virology published the highest number of MPX publications, followed by Virology Journal and Emerging Infectious Diseases, with n = 52 (9.25 %), n = 43 (7.65 %), and n = 32 (5.69 %) publications, respectively. The top contributing institutions were the Centers for Disease Control and Prevention (CDC), the US Army Medical Research Institute of Infectious Diseases, and the National Institutes of Health (NIH) National Institute of Allergy and Infectious Diseases (NIAID). CONCLUSION: Our analysis provides an objective and robust overview of the current literature on MPX and its global trends; this information could serve as a reference guide for those aiming to conduct further MPX-related research and as a source for those seeking information about MPX.


Subjects
Mpox , Humans , Bibliometrics , Disease Outbreaks , Germany , Mpox/epidemiology , Monkeypox virus
19.
Urology ; 173: 55-60, 2023 03.
Article in English | MEDLINE | ID: mdl-36435346

ABSTRACT

OBJECTIVE: To compare rates of patient-reported kidney stone disease to Electronic Health Records (EHR) kidney stone diagnosis using a common dataset to evaluate for socio-demographic differences, including between those with and without active care. METHODS: From the All of Us research database, we identified 21,687 adult participants with both patient-reported and EHR data. We compared differences in age, sex, race, education, employment status and healthcare access between patients with self-reported kidney stone history without EHR data to those with EHR-based diagnoses. RESULTS: In this population, the self-reported prevalence of kidney stones was 8.6% overall (n = 1877), including 4.6% (n = 1004) who had self-reported diagnoses but no EHR data. Among those with self-reported kidney stone diagnoses only, the median age was 66. The EHR-based prevalence of kidney stones was 5.7% (n = 1231), median age 67. No differences were observed in age, sex, education, employment status, rural/urban status, or ability to afford healthcare between groups with EHR diagnosis or self-reported diagnosis only. Of patients who had a self-reported history of kidney stones, 24% reported actively seeing a provider for kidney stones. CONCLUSION: Kidney stone prevalence by self-report is higher than EHR-based prevalence in this national dataset. Using either method alone to estimate kidney stone prevalence may exclude some patients with the condition, although the demographic profile of both groups is similar. Approximately 1 in 4 patients report actively seeing a provider for stone disease.


Subjects
Kidney Calculi , Humans , Kidney Calculi/diagnosis , Kidney Calculi/epidemiology , Kidney Calculi/therapy , Male , Female , Adult , Middle Aged , Aged , Electronic Health Records , Prevalence , Population Health
20.
medRxiv ; 2023 Dec 01.
Article in English | MEDLINE | ID: mdl-38076830

ABSTRACT

Post-marketing safety surveillance depends in part on the ability to detect concerning clinical events at scale. Spontaneous reporting might be an effective component of safety surveillance, but it requires awareness and understanding among healthcare professionals to achieve its potential. Reliance on readily available structured data such as diagnostic codes risks under-coding and imprecision. Clinical textual data might bridge these gaps, and natural language processing (NLP) has been shown to aid in scalable phenotyping across healthcare records in multiple clinical domains. In this study, we developed and validated a novel incident phenotyping approach using unstructured clinical textual data agnostic to Electronic Health Record (EHR) and note type. It is based on a published, validated approach (PheRe) used to ascertain social determinants of health and suicidality across entire healthcare records. To demonstrate generalizability, we validated this approach on two separate phenotypes that share common challenges with respect to accurate ascertainment: 1) suicide attempt; 2) sleep-related behaviors. With samples of 89,428 records and 35,863 records for suicide attempt and sleep-related behaviors, respectively, we conducted silver standard (diagnostic coding) and gold standard (manual chart review) validation. We showed Area Under the Precision-Recall Curve (AUPR) of ∼ 0.77 (95% CI 0.75-0.78) for suicide attempt and AUPR ∼ 0.31 (95% CI 0.28-0.34) for sleep-related behaviors. We also evaluated performance by coded race and demonstrated that differences in performance by race were dissimilar across phenotypes and require algorithmovigilance and debiasing prior to implementation.
