Search | BVS IEC

1.

Genetic heterogeneity: Challenges, impacts, and methods through an associative lens.

Woodward, Alexa A; Urbanowicz, Ryan J; Naj, Adam C; Moore, Jason H.

Genet Epidemiol ; 46(8): 555-571, 2022 12.

Article in English | MEDLINE | ID: mdl-35924480

ABSTRACT

Genetic heterogeneity describes the occurrence of the same or similar phenotypes through different genetic mechanisms in different individuals. Robustly characterizing and accounting for genetic heterogeneity is crucial to pursuing the goals of precision medicine, for discovering novel disease biomarkers, and for identifying targets for treatments. Failure to account for genetic heterogeneity may lead to missed associations and incorrect inferences. Thus, it is critical to review the impact of genetic heterogeneity on the design and analysis of population level genetic studies, aspects that are often overlooked in the literature. In this review, we first contextualize our approach to genetic heterogeneity by proposing a high-level categorization of heterogeneity into "feature," "outcome," and "associative" heterogeneity, drawing on perspectives from epidemiology and machine learning to illustrate distinctions between them. We highlight the unique nature of genetic heterogeneity as a heterogeneous pattern of association that warrants specific methodological considerations. We then focus on the challenges that preclude effective detection and characterization of genetic heterogeneity across a variety of epidemiological contexts. Finally, we discuss systems heterogeneity as an integrated approach to using genetic and other high-dimensional multi-omic data in complex disease research.

Subjects

Genetic Heterogeneity, Precision Medicine, Humans, Precision Medicine/methods, Machine Learning, Phenotype

2.

Heterogeneity of treatment effects by risk in pulmonary arterial hypertension.

Pan, Hao-Min; McClelland, Robyn L; Moutchia, Jude; Appleby, Dina H; Fritz, Jason S; Holmes, John H; Minhas, Jasleen; Palevsky, Harold I; Urbanowicz, Ryan J; Kawut, Steven M; Al-Naamani, Nadine.

Eur Respir J ; 62(1), 2023 07.

Article in English | MEDLINE | ID: mdl-37169384

ABSTRACT

BACKGROUND: It is currently unknown if disease severity modifies response to therapy in pulmonary arterial hypertension (PAH). We aimed to explore if disease severity, as defined by established risk-prediction algorithms, modified response to therapy in randomised clinical trials in PAH. METHODS: We performed a meta-analysis using individual participant data from 18 randomised clinical trials of therapy for PAH submitted to the United States Food and Drug Administration to determine if predicted risk of 1-year mortality at randomisation modified the treatment effect on three outcomes: change in 6-min walk distance (6MWD), clinical worsening at 12 weeks and time to clinical worsening. RESULTS: Of 6561 patients with a baseline US Registry to Evaluate Early and Long-Term PAH Disease Management (REVEAL 2.0) score, we found that individuals with higher baseline risk had higher probabilities of clinical worsening but no difference in change in 6MWD. We detected a significant interaction of REVEAL 2.0 risk and treatment assignment on change in 6MWD. For every 3-point increase in REVEAL 2.0 score, there was a 12.49 m (95% CI 5.86-19.12 m; p=0.001) greater treatment effect in change in 6MWD. We did not detect a significant risk by treatment interaction on clinical worsening with most of the risk-prediction algorithms. CONCLUSIONS: We found that predicted risk of 1-year mortality in PAH modified treatment effect as measured by 6MWD, but not clinical worsening. Our findings highlight the importance of identifying sources of treatment heterogeneity by predicted risk to tailor studies to patients most likely to have the greatest treatment response.

Subjects

Pulmonary Hypertension, Pulmonary Arterial Hypertension, Humans, Pulmonary Arterial Hypertension/drug therapy, Familial Primary Pulmonary Hypertension/drug therapy, Treatment Outcome, Antihypertensive Agents/therapeutic use

3.

HLA amino acid Mismatch-Based risk stratification of kidney allograft failure using a novel Machine learning algorithm.

Dasariraju, Satvik; Gragert, Loren; Wager, Grace L; McCullough, Keith; Brown, Nicholas K; Kamoun, Malek; Urbanowicz, Ryan J.

J Biomed Inform ; 142: 104374, 2023 06.

Article in English | MEDLINE | ID: mdl-37120046

ABSTRACT

OBJECTIVE: While associations between HLA antigen-level mismatches (Ag-MM) and kidney allograft failure are well established, HLA amino acid-level mismatches (AA-MM) have been less explored. Ag-MM fails to consider the substantial variability in the number of MMs at polymorphic amino acid (AA) sites within any given Ag-MM category, which may conceal variable impact on allorecognition. In this study we aim to develop a novel Feature Inclusion Bin Evolver for Risk Stratification (FIBERS) and apply it to automatically discover bins of HLA amino acid mismatches that stratify donor-recipient pairs into low versus high graft survival risk groups. METHODS: Using data from the Scientific Registry of Transplant Recipients, we applied FIBERS on a multiethnic population of 166,574 kidney transplants between 2000 and 2017. FIBERS was applied (1) across all HLA-A, B, C, DRB1, and DQB1 locus AA-MMs with comparison to 0-ABDR Ag-MM risk stratification, (2) on AA-MMs within each HLA locus individually, and (3) using cross validation to evaluate FIBERS generalizability. The predictive power of graft failure risk stratification was evaluated while adjusting for donor/recipient characteristics and HLA-A, B, C, DRB1, and DQB1 Ag-MMs as covariates. RESULTS: FIBERS's best-performing bin (on AA-MMs across all loci) added significant predictive power (hazard ratio = 1.10, Bonferroni adj. p < 0.001) in stratifying graft failure risk (where low-risk is defined as zero AA-MMs and high-risk is one or more AA-MMs) even after adjusting for Ag-MMs and donor/recipient covariates. The best bin also categorized more than twice as many patients to the low-risk category, compared to traditional 0-ABDR Ag mismatching (~24.4% vs ~9.1%). When HLA loci were binned individually, the bin for DRB1 exhibited the strongest risk stratification; relative to zero AA-MM, one or more MMs in the bin yielded HR = 1.11, p < 0.005 in a fully adjusted Cox model. AA-MMs at HLA-DRB1 peptide contact sites contributed most to incremental risk of graft failure. Additionally, FIBERS points to possible risk associated with HLA-DQB1 AA-MMs at positions that determine specificity of peptide anchor residues and HLA-DQ heterodimer stability. CONCLUSION: FIBERS's performance suggests potential for discovery of HLA immunogenetics-based risk stratification of kidney graft failure that outperforms traditional assessment.

Subjects

Amino Acids, HLA-A Antigens, Humans, Histocompatibility Testing, Allografts, Risk Assessment, Kidney

4.

Embracing study heterogeneity for finding genetic interactions in large-scale research consortia.

Liu, Yulun; Huang, Jing; Urbanowicz, Ryan J; Chen, Kun; Manduchi, Elisabetta; Greene, Casey S; Moore, Jason H; Scheet, Paul; Chen, Yong.

Genet Epidemiol ; 44(1): 52-66, 2020 01.

Article in English | MEDLINE | ID: mdl-31583758

ABSTRACT

Genetic interactions have been recognized as a potentially important contributor to the heritability of complex diseases. Nevertheless, due to small effect sizes and stringent multiple-testing correction, identifying genetic interactions in complex diseases is particularly challenging. To address the above challenges, many genomic research initiatives collaborate to form large-scale consortia and develop open access to enable sharing of genome-wide association study (GWAS) data. Despite the perceived benefits of data sharing from large consortia, a number of practical issues have arisen, such as privacy concerns on individual genomic information and heterogeneous data sources from distributed GWAS databases. In the context of large consortia, we demonstrate that the heterogeneously appearing marginal effects over distributed GWAS databases can offer new insights into genetic interactions for which conventional methods have had limited success. In this paper, we develop a novel two-stage testing procedure, named phylogenY-based effect-size tests for interactions using first 2 moments (YETI2), to detect genetic interactions through both pooled marginal effects, in terms of averaging site-specific marginal effects, and heterogeneity in marginal effects across sites, using a meta-analytic framework. YETI2 can not only be applied to large consortia without shared personal information but also can be used to leverage underlying heterogeneity in marginal effects to prioritize potential genetic interactions. We investigate the performance of YETI2 through simulation studies and apply YETI2 to bladder cancer data from dbGaP.

Subjects

Genetic Epistasis/genetics, Genome-Wide Association Study/methods, Urinary Bladder Neoplasms/genetics, Humans, Information Dissemination, Genetic Models, Single Nucleotide Polymorphism/genetics
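
The abstract of record 4 describes combining a pooled marginal effect with the heterogeneity of marginal effects across sites. The Python sketch below shows one conventional way to compute those two quantities from site-level summaries: fixed-effect inverse-variance pooling and Cochran's Q. It is not the YETI2 procedure itself, whose details are not given in the abstract; the function name `pooled_effect_and_q` and the example numbers are hypothetical.

```python
def pooled_effect_and_q(effects, std_errors):
    """Fixed-effect pooled estimate, its standard error, and Cochran's Q.

    effects     -- per-site marginal effect estimates (e.g. log odds ratios)
    std_errors  -- per-site standard errors of those estimates
    """
    # Inverse-variance weights: more precise sites count for more.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)

    # Pooled effect is the weighted average of the site-specific effects.
    pooled = sum(w * b for w, b in zip(weights, effects)) / total_weight
    pooled_se = (1.0 / total_weight) ** 0.5

    # Cochran's Q: weighted squared deviations of each site from the pooled
    # effect. Under homogeneity it is approximately chi-square distributed
    # with (number of sites - 1) degrees of freedom.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, effects))
    return pooled, pooled_se, q


# Hypothetical summaries from four sites: the effects cancel out on average
# but differ noticeably from site to site.
site_effects = [0.30, -0.25, 0.05, -0.10]
site_std_errors = [0.10, 0.10, 0.10, 0.10]

pooled, pooled_se, q = pooled_effect_and_q(site_effects, site_std_errors)
print(f"pooled={pooled:.3f} se={pooled_se:.3f} Q={q:.2f}")
```

In this made-up example the pooled effect is essentially zero while Q is large relative to its three degrees of freedom, which is the kind of pattern the abstract suggests can hint at an underlying interaction.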

5.

STatistical Inference Relief (STIR) feature selection.

Le, Trang T; Urbanowicz, Ryan J; Moore, Jason H; McKinney, Brett A.

Bioinformatics ; 35(8): 1358-1365, 2019 04 15.

Article in English | MEDLINE | ID: mdl-30239600

ABSTRACT

MOTIVATION: Relief is a family of machine learning algorithms that uses nearest-neighbors to select features whose association with an outcome may be due to epistasis or statistical interactions with other features in high-dimensional data. Relief-based estimators are non-parametric in the statistical sense that they do not have a parameterized model with an underlying probability distribution for the estimator, making it difficult to determine the statistical significance of Relief-based attribute estimates. Thus, a statistical inferential formalism is needed to avoid imposing arbitrary thresholds to select the most important features. We reconceptualize the Relief-based feature selection algorithm to create a new family of STatistical Inference Relief (STIR) estimators that retains the ability to identify interactions while incorporating sample variance of the nearest neighbor distances into the attribute importance estimation. This variance permits the calculation of statistical significance of features and adjustment for multiple testing of Relief-based scores. Specifically, we develop a pseudo t-test version of Relief-based algorithms for case-control data. RESULTS: We demonstrate the statistical power and control of type I error of the STIR family of feature selection methods on a panel of simulated data that exhibits properties reflected in real gene expression data, including main effects and network interaction effects. We compare the performance of STIR when the adaptive radius method is used as the nearest neighbor constructor with STIR when the fixed-k nearest neighbor constructor is used. We apply STIR to real RNA-Seq data from a study of major depressive disorder and discuss STIR's straightforward extension to genome-wide association studies. AVAILABILITY AND IMPLEMENTATION: Code and data available at http://insilico.utulsa.edu/software/STIR. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Genome-Wide Association Study, Software, Algorithms, Cluster Analysis, Major Depressive Disorder, Humans, Machine Learning, Statistical Models

6.

Benchmarking relief-based feature selection methods for bioinformatics data mining.

Urbanowicz, Ryan J; Olson, Randal S; Schmitt, Peter; Meeker, Melissa; Moore, Jason H.

J Biomed Inform ; 85: 168-188, 2018 09.

Article in English | MEDLINE | ID: mdl-30030120

ABSTRACT

Modern biomedical data mining requires feature selection methods that can (1) be applied to large scale feature spaces (e.g. 'omics' data), (2) function in noisy problems, (3) detect complex patterns of association (e.g. gene-gene interactions), (4) be flexibly adapted to various problem domains and data types (e.g. genetic variants, gene expression, and clinical data) and (5) be computationally tractable. To that end, this work examines a set of filter-style feature selection algorithms inspired by the 'Relief' algorithm, i.e. Relief-Based algorithms (RBAs). We implement and expand these RBAs in an open source framework called ReBATE (Relief-Based Algorithm Training Environment). We apply a comprehensive genetic simulation study comparing existing RBAs, a proposed RBA called MultiSURF, and other established feature selection methods, over a variety of problems. The results of this study (1) support the assertion that RBAs are particularly flexible, efficient, and powerful feature selection methods that differentiate relevant features having univariate, multivariate, epistatic, or heterogeneous associations, (2) confirm the efficacy of expansions for classification vs. regression, discrete vs. continuous features, missing data, multiple classes, or class imbalance, (3) identify previously unknown limitations of specific RBAs, and (4) suggest that while MultiSURF* performs best for explicitly identifying pure 2-way interactions, MultiSURF yields the most reliable feature selection performance across a wide range of problem types.

Subjects

Computational Biology/methods, Data Mining/methods, Algorithms, Benchmarking, Computational Biology/standards, Computer Simulation, Data Mining/standards, Genetic Databases, Genetic Epistasis, Humans

7.

Relief-based feature selection: Introduction and review.

Urbanowicz, Ryan J; Meeker, Melissa; La Cava, William; Olson, Randal S; Moore, Jason H.

J Biomed Inform ; 85: 189-203, 2018 09.

Article in English | MEDLINE | ID: mdl-30031057

ABSTRACT

Feature selection plays a critical role in biomedical data mining, driven by increasing feature dimensionality in target problems and growing interest in advanced but computationally expensive methodologies able to model complex associations. Specifically, there is a need for feature selection methods that are computationally efficient, yet sensitive to complex patterns of association, e.g. interactions, so that informative features are not mistakenly eliminated prior to downstream modeling. This paper focuses on Relief-based algorithms (RBAs), a unique family of filter-style feature selection algorithms that have gained appeal by striking an effective balance between these objectives while flexibly adapting to various data characteristics, e.g. classification vs. regression. First, this work broadly examines types of feature selection and defines RBAs within that context. Next, we introduce the original Relief algorithm and associated concepts, emphasizing the intuition behind how it works, how feature weights generated by the algorithm can be interpreted, and why it is sensitive to feature interactions without evaluating combinations of features. Lastly, we include an expansive review of RBA methodological research beyond Relief and its popular descendant, ReliefF. In particular, we characterize branches of RBA research, and provide comparative summaries of RBA algorithms including contributions, strategies, functionality, time complexity, adaptation to key data characteristics, and software availability.

Subjects

Algorithms, Computational Biology/methods, Data Mining/methods, Humans, Statistical Models, Regression Analysis, Software
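
The abstract of record 7 notes that Relief is sensitive to feature interactions without evaluating combinations of features. The Python sketch below tries to make that intuition concrete with a simplified version of the original Relief weight update for binary classes and discrete features. It is an illustration under stated assumptions, not the ReBATE implementation: the function name `relief`, the 0/1 `diff` function, and the choice to visit every instance once (instead of sampling instances at random) are simplifications made here.

```python
from itertools import product


def relief(X, y):
    """Simplified Relief: one weight per feature, binary class, discrete features."""
    n, p = len(X), len(X[0])
    weights = [0.0] * p

    def diff(a, b, j):
        # 0/1 difference for a discrete feature.
        return 0.0 if a[j] == b[j] else 1.0

    def distance(a, b):
        # Distance is the number of features on which two instances differ.
        return sum(diff(a, b, j) for j in range(p))

    for i in range(n):
        target = X[i]
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        # Nearest neighbour of the same class and of the other class.
        near_hit = X[min(hits, key=lambda k: distance(target, X[k]))]
        near_miss = X[min(misses, key=lambda k: distance(target, X[k]))]
        for j in range(p):
            # Reward features that differ from the nearest miss,
            # penalise features that differ from the nearest hit.
            weights[j] += (diff(target, near_miss, j) - diff(target, near_hit, j)) / n
    return weights


# Toy data: the class is the XOR of features 0 and 1; feature 2 is noise.
# Neither feature 0 nor feature 1 has a marginal effect on its own.
X = [list(row) for row in product([0, 1], repeat=3)]
y = [row[0] ^ row[1] for row in X]

weights = relief(X, y)
print(weights)
```

On this toy XOR problem the two interacting features end up with higher weights than the noise feature, even though no pair of features is ever scored jointly: the neighbourhood comparison does the work.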

8.

Health activism, vaccine, and mpox discourse: BERTopic based mixed-method analyses of tweets from sexual minority men and gender diverse (SMMGD) individuals in the U.S.

Wang, Yunwen; O'Connor, Karen; Flores, Ivan; Berdahl, Carl T; Urbanowicz, Ryan J; Stevens, Robin; Bauermeister, José A; Gonzalez-Hernandez, Graciela.

medRxiv ; 2024 Mar 19.

Article in English | MEDLINE | ID: mdl-38562836

ABSTRACT

Objectives: To synthesize discussions among sexual minority men and gender diverse (SMMGD) individuals on mpox, given limited representation of SMMGD voices in existing mpox literature. Methods: BERTopic (a topic modeling technique) was employed with human validations to analyze mpox-related tweets (n = 8,688; October 2020-September 2022) from 2,326 self-identified SMMGD individuals in the U.S., followed by content analysis and geographic analysis. Results: BERTopic identified 11 topics: health activism (29.81%); mpox vaccination (25.81%) and adverse events (0.98%); sarcasm, jokes, emotional expressions (14.04%); COVID-19 and mpox (7.32%); government/public health response (6.12%); mpox symptoms (2.74%); case reports (2.21%); puns on the virus' naming (i.e., monkeypox; 0.86%); media publicity (0.68%); mpox in children (0.67%). Mpox health activism negatively correlated with LGB social climate index at U.S. state level, ρ = -0.322, p = 0.031. Conclusions: SMMGD discussions on mpox encompassed utilitarian themes (e.g., vaccine access, case reports, mpox symptoms) and emotionally charged themes, including advocacy against homophobia, misinformation, and stigma. Mpox health activism was more prevalent in states with lower LGB social acceptance. Public Health Implications: Findings illuminate SMMGD engagement with mpox discourse, underscoring the need for more inclusive health communication strategies in infectious disease outbreaks to control associated stigma.

9.

ChatGPT and large language models in academia: opportunities and challenges.

Meyer, Jesse G; Urbanowicz, Ryan J; Martin, Patrick C N; O'Connor, Karen; Li, Ruowang; Peng, Pei-Chen; Bright, Tiffani J; Tatonetti, Nicholas; Won, Kyoung Jae; Gonzalez-Hernandez, Graciela; Moore, Jason H.

BioData Min ; 16(1): 20, 2023 Jul 13.

Article in English | MEDLINE | ID: mdl-37443040

ABSTRACT

The introduction of large language models (LLMs) that allow iterative "chat" in late 2022 is a paradigm shift that enables generation of text often indistinguishable from that written by humans. LLM-based chatbots have immense potential to improve academic work efficiency, but the ethical implications of their fair use and inherent bias must be considered. In this editorial, we discuss this technology from the academic's perspective with regard to its limitations and utility for academic writing, education, and programming. We end with our stance with regard to using LLMs and chatbots in academia, which is summarized as (1) we must find ways to effectively use them, (2) their use does not constitute plagiarism (although they may produce plagiarized text), (3) we must quantify their bias, (4) users must be cautious of their poor accuracy, and (5) the future is bright for their application to research and as an academic tool.

10.

Baseline Sex Differences in Pulmonary Arterial Hypertension Randomized Clinical Trials.

Ventetuolo, Corey E; Moutchia, Jude; Baird, Grayson L; Appleby, Dina H; McClelland, Robyn L; Minhas, Jasleen; Min, Jeff; Holmes, John H; Urbanowicz, Ryan J; Al-Naamani, Nadine; Kawut, Steven M.

Ann Am Thorac Soc ; 20(1): 58-66, 2023 01.

Article in English | MEDLINE | ID: mdl-36053665

ABSTRACT

Rationale: Sex-based differences in pulmonary arterial hypertension (PAH) are known, but the contribution to disease measures is understudied. Objectives: We examined whether sex was associated with baseline 6-minute-walk distance (6MWD), hemodynamics, and functional class. Methods: We conducted a secondary analysis of participant-level data from randomized clinical trials of investigational PAH therapies conducted between 1998 and 2014 and provided by the U.S. Food and Drug Administration. Outcomes were modeled as a function of an interaction between sex and age or sex and body mass index (BMI), respectively, with generalized mixed modeling. Results: We included a total of 6,633 participants from 18 randomized clinical trials. A total of 5,197 (78%) were female, with a mean age of 49.1 years and a mean BMI of 27.0 kg/m2. Among 1,436 males, the mean age was 49.7 years, and the mean BMI was 26.4 kg/m2. The most common etiology of PAH was idiopathic. Females had shorter 6MWD. For every 1 kg/m2 increase in BMI for females, 6MWD decreased 2.3 (1.6-3.0) meters (P < 0.001), whereas 6MWD did not significantly change with BMI in males (0.31 m [-0.30 to 0.92]; P = 0.32). Females had lower right atrial pressure (RAP) and mean pulmonary artery pressure, and higher cardiac index than males (all P < 0.03). Age significantly modified the sex by RAP and mean pulmonary artery pressure relationships. For every 10-year increase in age, RAP was lower in males (0.5 mm Hg [0.3-0.7]; P < 0.001), but not in females (0.13 [-0.03 to 0.28]; P = 0.10). There was a significant decrease in pulmonary vascular resistance (PVR) with increasing age regardless of sex (P < 0.001). For every 1 kg/m2 increase in BMI, there was a 3% decrease in PVR for males (P < 0.001), compared with a 2% decrease in PVR in females (P < 0.001). Conclusions: Sexual dimorphism in subjects enrolled in clinical trials extends to 6MWD and hemodynamics; these relationships are modified by age and BMI. 
Sex, age, and body size should be considered in the evaluation and interpretation of surrogate outcomes in PAH.

Subjects

Pulmonary Hypertension, Pulmonary Arterial Hypertension, Humans, Female, Male, Middle Aged, Sex Characteristics, Randomized Controlled Trials as Topic, Familial Primary Pulmonary Hypertension, Hemodynamics

11.

Is low-risk status a surrogate outcome in pulmonary arterial hypertension? An analysis of three randomised trials.

Blette, Bryan S; Moutchia, Jude; Al-Naamani, Nadine; Ventetuolo, Corey E; Cheng, Chao; Appleby, Dina; Urbanowicz, Ryan J; Fritz, Jason; Mazurek, Jeremy A; Li, Fan; Kawut, Steven M; Harhay, Michael O.

Lancet Respir Med ; 11(10): 873-882, 2023 10.

Article in English | MEDLINE | ID: mdl-37230098

ABSTRACT

BACKGROUND: Targeting short-term improvements in multicomponent risk scores for mortality in patients with pulmonary arterial hypertension (PAH) could result in improved long-term outcomes. We aimed to determine whether PAH risk scores were adequate surrogates for clinical worsening or mortality outcomes in PAH randomised clinical trials (RCTs). METHODS: We performed an individual participant data meta-analysis of RCTs selected from PAH trials provided by the US Food and Drug Administration (FDA). We calculated predicted risk using the COMPERA, COMPERA 2.0, non-invasive FPHR, REVEAL 2.0, and REVEAL Lite 2 risk scores. The primary outcome of interest was time to clinical worsening, a composite endpoint composed of any of the following events: all-cause death, hospitalisation for worsening PAH, lung transplantation, atrial septostomy, discontinuation of study treatment (or study withdrawal) for worsening PAH, initiation of parenteral prostacyclin analogue therapy, or decrease of at least 15% in 6-min walk distance from baseline, combined with either worsening of WHO functional class from baseline or the addition of an approved PAH treatment. The secondary outcome of interest was time to all-cause mortality. We assessed the surrogacy of these risk scores, parameterised as attainment of low-risk status by 16 weeks, for improvement in long-term clinical worsening and survival using mediation and meta-analysis frameworks. FINDINGS: Of 28 trials received from the FDA, three RCTs (AMBITION, GRIPHON, and SERAPHIN; n=2508) had the data necessary to assess long-term surrogacy. The mean age was 49 years (SD 16), 1956 (78%) participants were women, 1704 (68%) were classified as White, and 280 (11%) were Hispanic or Latino. 1388 (55%) of 2503 participants with available data had idiopathic PAH and 776 (31%) of 2503 had PAH associated with connective tissue disease. 
In a mediation analysis, the proportions of treatment effects explained by attainment of low-risk status ranged only from 7% to 13%. In a meta-analysis of trial-regions, the treatment effects on low-risk status were not predictive of the treatment effects on time to clinical worsening (R2 values 0·01-0·19) nor the treatment effects on time to all-cause mortality (R2 values 0-0·2). A leave-one-out analysis suggested that the use of these risk scores as surrogates might lead to biased inferences regarding the effect of therapies on clinical outcomes in PAH RCTs. Results were similar when using absolute risk scores at 16 weeks as the potential surrogates. INTERPRETATION: Multicomponent risk scores have utility for the prediction of outcomes in patients with PAH. Clinical surrogacy for long-term outcomes cannot be inferred from observational studies of outcomes. Our analyses of three PAH trials with long-term follow-up suggest that further study is necessary before using these or other scores as surrogate outcomes in PAH RCTs or clinical care. FUNDING: Cardiovascular Medical Research and Education Fund, US National Institutes of Health.

Subjects

Pulmonary Arterial Hypertension, Female, Humans, Middle Aged, Male, Pulmonary Arterial Hypertension/drug therapy, Familial Primary Pulmonary Hypertension, Epoprostenol, Risk Factors, Randomized Controlled Trials as Topic

12.

Gene-Interaction-Sensitive enrichment analysis in congenital heart disease.

Woodward, Alexa A; Taylor, Deanne M; Goldmuntz, Elizabeth; Mitchell, Laura E; Agopian, A J; Moore, Jason H; Urbanowicz, Ryan J.

BioData Min ; 15(1): 4, 2022 Feb 12.

Article in English | MEDLINE | ID: mdl-35151364

ABSTRACT

BACKGROUND: Gene set enrichment analysis (GSEA) uses gene-level univariate associations to identify gene set-phenotype associations for hypothesis generation and interpretation. We propose that GSEA can be adapted to incorporate SNP and gene-level interactions. To this end, gene scores are derived by Relief-based feature importance algorithms that efficiently detect both univariate and interaction effects (MultiSURF) or exclusively interaction effects (MultiSURF*). We compare these interaction-sensitive GSEA approaches to traditional χ2 rankings in simulated genome-wide array data, and in a target and replication cohort of congenital heart disease patients with conotruncal defects (CTDs). RESULTS: In the simulation study and for both CTD datasets, both Relief-based approaches to GSEA captured more relevant and significant gene ontology terms compared to the univariate GSEA. Key terms and themes of interest include cell adhesion, migration, and signaling. A leading edge analysis highlighted semaphorins and their receptors, the Slit-Robo pathway, and other genes with roles in the secondary heart field and outflow tract development. CONCLUSIONS: Our results indicate that interaction-sensitive approaches to enrichment analysis can improve upon traditional univariate GSEA. This approach replicated univariate findings and identified additional and more robust support for the role of the secondary heart field and cardiac neural crest cell migration in the development of CTDs.

13.

A Semi-Automated Term Harmonization Pipeline Applied to Pulmonary Arterial Hypertension Clinical Trials.

Urbanowicz, Ryan J; Holmes, John H; Appleby, Dina; Narasimhan, Vanamala; Durborow, Stephen; Al-Naamani, Nadine; Fernando, Melissa; Kawut, Steven M.

Methods Inf Med ; 61(1-02): 3-10, 2022 05.

Article in English | MEDLINE | ID: mdl-34820791

ABSTRACT

OBJECTIVE: Data harmonization is essential to integrate individual participant data from multiple sites, time periods, and trials for meta-analysis. The process of mapping terms and phrases to an ontology is complicated by typographic errors, abbreviations, truncation, and plurality. We sought to harmonize medical history (MH) and adverse events (AE) term records across 21 randomized clinical trials in pulmonary arterial hypertension and chronic thromboembolic pulmonary hypertension. METHODS: We developed and applied a semi-automated harmonization pipeline for use with domain-expert annotators to resolve ambiguous term mappings using exact and fuzzy matching. We summarized MH and AE term mapping success, including map quality measures, and imputation of a generalizing term hierarchy as defined by the applied Medical Dictionary for Regulatory Activities (MedDRA) ontology standard. RESULTS: Over 99.6% of both MH (N = 37,105) and AE (N = 58,170) records were successfully mapped to MedDRA low-level terms. Automated exact matching accounted for 74.9% of MH and 85.5% of AE mappings. Term recommendations from fuzzy matching in the pipeline facilitated annotator mapping of the remaining 24.9% of MH and 13.8% of AE records. Imputation of the generalized MedDRA term hierarchy was unambiguous in 85.2% of high-level terms, 99.4% of high-level group terms, and 99.5% of system organ class in MH, and 75% of high-level terms, 98.3% of high-level group terms, and 98.4% of system organ class in AE. CONCLUSION: This pipeline dramatically reduced the burden of manual annotation for MH and AE term harmonization and could be adapted to other data integration efforts.

Subjects

Adverse Drug Reaction Reporting Systems, Pulmonary Arterial Hypertension, Humans, Pulmonary Arterial Hypertension/drug therapy, Randomized Controlled Trials as Topic
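
The abstract of record 13 describes a two-step mapping: automated exact matching first, then fuzzy-match recommendations that a domain expert reviews. The Python sketch below illustrates that general pattern using the standard-library `difflib` module. It is a minimal sketch, not the authors' pipeline: the function names `normalize` and `map_terms`, the similarity cutoff of 0.8, and the three-term vocabulary are hypothetical placeholders rather than actual MedDRA content or settings from the paper.

```python
import difflib


def normalize(term):
    """Lower-case a term and collapse runs of whitespace."""
    return " ".join(term.lower().split())


def map_terms(records, ontology_terms, cutoff=0.8, n_suggestions=3):
    """Split raw term records into exact matches and fuzzy suggestions.

    Returns (mapped, needs_review):
      mapped        -- raw term -> ontology term, for exact matches
      needs_review  -- raw term -> ranked list of suggested ontology terms
                       (possibly empty) for an annotator to resolve
    """
    lookup = {normalize(t): t for t in ontology_terms}
    mapped, needs_review = {}, {}
    for raw in records:
        key = normalize(raw)
        if key in lookup:
            # Step 1: exact match after normalization.
            mapped[raw] = lookup[key]
        else:
            # Step 2: fuzzy match; candidates are ranked by similarity ratio.
            candidates = difflib.get_close_matches(
                key, list(lookup), n=n_suggestions, cutoff=cutoff
            )
            needs_review[raw] = [lookup[c] for c in candidates]
    return mapped, needs_review


# Hypothetical vocabulary and raw records with a typo and a spelling variant.
vocabulary = ["Headache", "Nausea", "Dyspnoea"]
raw_records = ["headache", "Nausae", "Dyspnea", "xyz"]

mapped, needs_review = map_terms(raw_records, vocabulary)
print(mapped)
print(needs_review)
```

Keeping the fuzzy step as a list of suggestions, rather than accepting the top candidate automatically, mirrors the semi-automated design in the abstract, where annotators resolve the ambiguous mappings.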

14.

BMI and Treatment Response in Patients With Pulmonary Arterial Hypertension: A Meta-analysis.

McCarthy, Breanne E; McClelland, Robyn L; Appleby, Dina H; Moutchia, Jude S; Minhas, Jasleen K; Min, Jeff; Mazurek, Jeremy A; Smith, K Akaya; Fritz, Jason S; Pugliese, Steven C; Urbanowicz, Ryan J; Holmes, John H; Palevsky, Harold I; Kawut, Steven M; Al-Naamani, Nadine.

Chest ; 162(2): 436-447, 2022 08.

Article in English | MEDLINE | ID: mdl-35247393

ABSTRACT

BACKGROUND: Obesity is increasingly prevalent in pulmonary arterial hypertension (PAH) but is associated with improved survival, creating an "obesity paradox" in PAH. It is unknown if the improved outcomes could be attributable to obese patients deriving a greater benefit from PAH therapies. RESEARCH QUESTION: Does BMI modify treatment effectiveness in PAH? STUDY DESIGN AND METHODS: Using individual participant data, a meta-analysis was conducted of phase III, randomized, placebo-controlled trials of treatments for PAH submitted for approval to the U.S. Food and Drug Administration from 2000 to 2015. Primary outcomes were change in 6-min walk distance (6MWD) and World Health Organization (WHO) functional class. RESULTS: A total of 5,440 participants from 17 trials were included. Patients with overweight and obesity had lower baseline 6MWD and were more likely to be WHO functional class III or IV. Treatment was associated with a 27.01-m increase in 6MWD (95% CI, 21.58-32.45; P < .001) and lower odds of worse WHO functional class (OR, 0.58; 95% CI, 0.48-0.70; P < .001). For every 1 kg/m2 increase in BMI, 6MWD was reduced by 0.66 m (P = .07); there was no significant effect modification of treatment response in 6MWD according to BMI (P for interaction = .34). Higher BMI was not associated with odds of WHO functional class at end of follow-up; however, higher BMI attenuated the treatment response such that every 1 kg/m2 increase in BMI increased odds of worse WHO functional class by 3% (OR, 1.03; P for interaction = .06). INTERPRETATION: Patients with overweight and obesity had lower baseline 6MWD and worse WHO functional class than patients with normal weight with PAH. Higher BMI did not modify the treatment response for change in 6MWD, but it attenuated the treatment response for WHO functional class. PAH trials should include participants representative of all weight groups to allow for assessment of treatment heterogeneity and mechanisms.

Subjects

Pulmonary Hypertension, Pulmonary Arterial Hypertension, Antihypertensive Agents/therapeutic use, Phase III Clinical Trials as Topic, Familial Primary Pulmonary Hypertension, Humans, Obesity/complications, Obesity/epidemiology, Overweight, Randomized Controlled Trials as Topic, Treatment Outcome

15.

Secular and Regional Trends among Pulmonary Arterial Hypertension Clinical Trial Participants.

Min, Jeff; Appleby, Dina H; McClelland, Robyn L; Minhas, Jasleen; Holmes, John H; Urbanowicz, Ryan J; Pugliese, Steven C; Mazurek, Jeremy A; Smith, K Akaya; Fritz, Jason S; Palevsky, Harold I; Suh, Jude Moutchia; Al-Naamani, Nadine; Kawut, Steven M.

Ann Am Thorac Soc ; 19(6): 952-961, 2022 06.

Article in English | MEDLINE | ID: mdl-34936541

ABSTRACT

Rationale: The population of patients with pulmonary arterial hypertension (PAH) has evolved over time from predominantly young White women to an older, more racially diverse and obese population. Whether these changes are reflected in clinical trials is not known. Objectives: To determine secular and regional trends among PAH trial participants. Methods: We performed a pooled cohort analysis using harmonized data from phase III clinical trials of PAH therapies submitted to the U.S. Food and Drug Administration. We used mixed-effects linear and logistic regression to assess regional differences in participant age, sex, body habitus, and hemodynamics over time. Results: A total of 6,599 participants were enrolled in 18 trials between 1998 and 2013; 78% were female. The mean age of participants in North America, Europe, and Latin America at the time of study start increased by 2.09 (95% confidence interval [CI], 0.67-3.51), 1.62 (95% CI, 0.24-3.00), and 4.75 (95% CI, 2.29-7.21) years per 5 years, respectively (P = 0.01). Body mass index at the time of study start increased by 0.72 kg/m2 per 5 years (95% CI, 0.44-0.99; P < 0.001) across all regions. Eighty-five percent of participants in early studies were non-Hispanic White, but this decreased over time to 70%. Ninety-seven percent of Asians and 74% of Hispanics in the sample were recruited from Asia and Latin America. Conclusions: Patients enrolled in more recent PAH therapy trials are older and more obese, mirroring the changing epidemiology of observational cohorts. However, these trends varied by geographic region. PAH cohorts remain predominantly female, presenting challenges for generalizability to male patients. Although the proportion of non-White participants increased over time, this was primarily through recruitment in Asia and Latin America.

Subjects

Pulmonary Arterial Hypertension , Cohort Studies , Europe/epidemiology , Familial Primary Pulmonary Hypertension , Female , Humans , Male , Obesity , Pulmonary Arterial Hypertension/drug therapy , Pulmonary Arterial Hypertension/epidemiology , United States/epidemiology

16.

Using Machine Learning on Home Health Care Assessments to Predict Fall Risk.

Lo, Yancy; Lynch, Selah F; Urbanowicz, Ryan J; Olson, Randal S; Ritter, Ashley Z; Whitehouse, Christina R; O'Connor, Melissa; Keim, Susan K; McDonald, Margaret; Moore, Jason H; Bowles, Kathryn H.

Stud Health Technol Inform ; 264: 684-688, 2019 Aug 21.

Article in English | MEDLINE | ID: mdl-31438011

ABSTRACT

Falls are the leading cause of injuries among older adults, particularly in the more vulnerable home health care (HHC) population. Existing standardized fall risk assessments often require supplemental data collection and tend to have low specificity. We applied a random forest algorithm on readily available HHC data from the mandated Outcomes and Assessment Information Set (OASIS) with over 100 items from 59,006 HHC patients to identify factors that predict and quantify fall risks. Our ultimate goal is to build clinical decision support for fall prevention. Our model achieves higher precision and balanced accuracy than the commonly used multifactorial Missouri Alliance for Home Care fall risk assessment. This is the first known attempt to determine fall risk factors from the extensive OASIS data from a large sample. Our quantitative prediction of fall risks can aid clinical discussions of risk factors and prevention strategies for lowering fall incidence.

Subjects

Accidental Falls , Home Care Services , Machine Learning , Humans , Missouri , Risk Assessment , Risk Factors

17.

Preparing next-generation scientists for biomedical big data: artificial intelligence approaches.

Moore, Jason H; Boland, Mary Regina; Camara, Pablo G; Chervitz, Hannah; Gonzalez, Graciela; Himes, Blanca E; Kim, Dokyoon; Mowery, Danielle L; Ritchie, Marylyn D; Shen, Li; Urbanowicz, Ryan J; Holmes, John H.

Per Med ; 16(3): 247-257, 2019 05 01.

Article in English | MEDLINE | ID: mdl-30760118

ABSTRACT

Personalized medicine is being realized by our ability to measure biological and environmental information about patients. Much of these data are being stored in electronic health records yielding big data that presents challenges for its management and analysis. Here, we review several areas of knowledge that are necessary for next-generation scientists to fully realize the potential of biomedical big data. We begin with an overview of big data and its storage and management. We then review statistics and data science as foundational topics followed by a core curriculum of artificial intelligence, machine learning and natural language processing that are needed to develop predictive models for clinical decision making. We end with some specific training recommendations for preparing next-generation scientists for biomedical big data.

Subjects

Data Science/methods , Precision Medicine/methods , Big Data , Clinical Decision-Making , Data Mining , Electronic Health Records , Humans , Machine Learning , Natural Language Processing

18.

Analysis of Gene-Gene Interactions.

Cole, Brian S; Hall, Molly A; Urbanowicz, Ryan J; Gilbert-Diamond, Diane; Moore, Jason H.

Curr Protoc Hum Genet ; 95: 1.14.1-1.14.10, 2017 10 18.

Article in English | MEDLINE | ID: mdl-29044470

ABSTRACT

The goal of this unit is to introduce epistasis, or gene-gene interactions, as a significant contributor to the genetic architecture of complex traits, including disease susceptibility. This unit begins with an historical overview of the concept of epistasis and the challenges inherent in the identification of potential gene-gene interactions. Then, it reviews statistical and machine learning methods for discovering epistasis in the context of genetic studies of quantitative and categorical traits. This unit concludes with a discussion of meta-analysis, replication, and other topics of active research. © 2017 by John Wiley & Sons, Inc.

Subjects

Epistasis, Genetic , Gene Expression Regulation , Genomics , Algorithms , Animals , Genetic Association Studies/methods , Genetic Predisposition to Disease , Genomics/methods , Humans , Machine Learning , Models, Statistical

19.

PMLB: a large benchmark suite for machine learning evaluation and comparison.

Olson, Randal S; La Cava, William; Orzechowski, Patryk; Urbanowicz, Ryan J; Moore, Jason H.

BioData Min ; 10: 36, 2017.

Article in English | MEDLINE | ID: mdl-29238404

ABSTRACT

BACKGROUND: The selection, development, or comparison of machine learning methods in data mining can be a difficult task based on the target problem and goals of a particular study. Numerous publicly available real-world and simulated benchmark datasets have emerged from different sources, but their organization and adoption as standards have been inconsistent. As such, selecting and curating specific benchmarks remains an unnecessary burden on machine learning practitioners and data scientists. RESULTS: The present study introduces an accessible, curated, and developing public benchmark resource to facilitate identification of the strengths and weaknesses of different machine learning methodologies. We compare meta-features among the current set of benchmark datasets in this resource to characterize the diversity of available data. Finally, we apply a number of established machine learning methods to the entire benchmark suite and analyze how datasets and algorithms cluster in terms of performance. From this study, we find that existing benchmarks lack the diversity to properly benchmark machine learning algorithms, and there are several gaps in benchmarking problems that still need to be considered. CONCLUSIONS: This work represents another important step towards understanding the limitations of popular benchmarking suites and developing a resource that connects existing benchmarking standards to more diverse and efficient standards in the future.

20.

ExSTraCS 2.0: Description and Evaluation of a Scalable Learning Classifier System.

Urbanowicz, Ryan J; Moore, Jason H.

Evol Intell ; 8(2): 89-116, 2015 Sep.

Article in English | MEDLINE | ID: mdl-26417393

ABSTRACT

Algorithmic scalability is a major concern for any machine learning strategy in this age of 'big data'. A large number of potentially predictive attributes is emblematic of problems in bioinformatics, genetic epidemiology, and many other fields. Previously, ExSTraCS was introduced as an extended Michigan-style supervised learning classifier system that combined a set of powerful heuristics to successfully tackle the challenges of classification, prediction, and knowledge discovery in complex, noisy, and heterogeneous problem domains. While Michigan-style learning classifier systems are powerful and flexible learners, they are not considered to be particularly scalable. For the first time, this paper presents a complete description of the ExSTraCS algorithm and introduces an effective strategy to dramatically improve learning classifier system scalability. ExSTraCS 2.0 addresses scalability with (1) a rule specificity limit, (2) new approaches to expert knowledge guided covering and mutation mechanisms, and (3) the implementation and utilization of the TuRF algorithm for improving the quality of expert knowledge discovery in larger datasets. Performance over a complex spectrum of simulated genetic datasets demonstrated that these new mechanisms dramatically improve nearly every performance metric on datasets with 20 attributes and made it possible for ExSTraCS to reliably scale up to perform on related 200 and 2000-attribute datasets. ExSTraCS 2.0 was also able to reliably solve the 6, 11, 20, 37, 70, and 135 multiplexer problems, and did so in similar or fewer learning iterations than previously reported, with smaller finite training sets, and without using building blocks discovered from simpler multiplexer problems. Furthermore, ExSTraCS usability was made simpler through the elimination of previously critical run parameters.
