Search | BVS CLAP/SMR-OPAS/OMS

High-dimensional genomic feature selection with the ordered stereotype logit model.

Seffernick, Anna Eames; Mrózek, Krzysztof; Nicolet, Deedra; Stone, Richard M; Eisfeld, Ann-Kathrin; Byrd, John C; Archer, Kellie J.

Brief Bioinform ; 23(6), 2022 Nov 19.

Article in English | MEDLINE | ID: mdl-36184192

ABSTRACT

For many high-dimensional genomic and epigenomic datasets, the outcome of interest is ordinal. While these ordinal outcomes are often thought of as the observed cutpoints of some latent continuous variable, some ordinal outcomes are truly discrete and comprise the subjective combination of several factors. The nonlinear stereotype logistic model, which does not assume proportional odds, was developed for these 'assessed' ordinal variables. It has previously been extended to the frequentist high-dimensional feature selection setting, but the Bayesian framework provides some distinct advantages in terms of simultaneous uncertainty quantification and variable selection. Here, we review the stereotype model and Bayesian variable selection methods and demonstrate how to combine them to select genomic features associated with discrete ordinal outcomes. We compared the Bayesian and frequentist methods in terms of variable selection performance. We additionally applied the Bayesian stereotype method to an acute myeloid leukemia RNA-sequencing dataset to further demonstrate its variable selection abilities by identifying features associated with the European LeukemiaNet prognostic risk score.
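
As an illustration of the stereotype logit model named in this abstract, the following is a minimal sketch of how category probabilities could be computed under the standard formulation log(P(Y=k)/P(Y=K)) = alpha_k + phi_k * (beta . x), with the last category as reference. The function name, parameter names, and numeric values are hypothetical and are not taken from the article; the sketch shows only the likelihood form, not the Bayesian variable selection procedure.

```python
import math

def stereotype_probs(x, beta, alphas, phis):
    """Category probabilities under a stereotype logit model (sketch).

    Assumes the last category is the reference, so alphas[-1] == 0 and
    phis[-1] == 0. Ordinality is usually imposed by constraining the
    phis to be monotone (for example 1 = phi_1 >= ... >= phi_K = 0).
    """
    # A single linear predictor is shared across all categories.
    eta = sum(b * xi for b, xi in zip(beta, x))
    # Each category rescales the shared predictor by its own phi.
    scores = [a + p * eta for a, p in zip(alphas, phis)]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical values: three ordered categories, two features.
probs = stereotype_probs(x=[0.5, -1.0], beta=[0.8, 0.3],
                         alphas=[0.2, -0.1, 0.0], phis=[1.0, 0.4, 0.0])
```

Because every category shares the same coefficient vector and differs only through its scale phi_k, the model needs far fewer parameters than an unconstrained multinomial logit, without imposing proportional odds.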

Subjects

Genomics, Logistic Models, Bayes Theorem, Risk Factors

Penalized Bayesian forward continuation ratio model with application to high-dimensional data with discrete survival outcomes.

Seffernick, Anna Eames; Archer, Kellie J.

PLoS One ; 19(3): e0300638, 2024.

Article in English | MEDLINE | ID: mdl-38547174

ABSTRACT

While time-to-event data are often continuous, there are several instances where discrete survival data, which are inherently ordinal, may be available or are more appropriate or useful. Several discrete survival models exist, but the forward continuation ratio model with a complementary log-log link has a survival interpretation and is closely related to the Cox proportional hazards model, despite being an ordinal model. This model has previously been implemented in the high-dimensional setting using the ordinal generalized monotone incremental forward stagewise algorithm. Here, we propose a Bayesian penalized forward continuation ratio model with a complementary log-log link and explore different priors to perform variable selection and regularization. Through simulations, we show that our Bayesian model outperformed the existing frequentist method in terms of variable selection performance, and that a 10% prior inclusion probability performed better than 1% or 50%. We also illustrate our model on a publicly available acute myeloid leukemia dataset to identify genomic features associated with discrete survival. We identified nine features that map to ten unique genes, five of which have been previously associated with leukemia in the literature. In conclusion, our proposed Bayesian model is flexible, allows simultaneous variable selection and uncertainty quantification, and performed well in simulation studies and application to real data.
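
To make the survival interpretation of the forward continuation ratio model concrete, here is a minimal sketch assuming the usual complementary log-log form, in which the discrete hazard for interval k is h_k = P(Y = k | Y >= k, x) = 1 - exp(-exp(alpha_k + beta . x)). The function names and numeric values are hypothetical; the priors and penalization described in the abstract are not shown.

```python
import math

def discrete_hazards(x, beta, alphas):
    """Discrete hazards under a complementary log-log link (sketch)."""
    eta = sum(b * xi for b, xi in zip(beta, x))
    # Inverse cloglog: h = 1 - exp(-exp(alpha_k + eta)).
    return [1.0 - math.exp(-math.exp(a + eta)) for a in alphas]

def survival_curve(hazards):
    """Probability of surviving past each interval, given the hazards."""
    surv, s = [], 1.0
    for h in hazards:
        s *= 1.0 - h  # survive this interval given survival so far
        surv.append(s)
    return surv

# Hypothetical values: three intervals, one feature.
alphas = [-2.0, -1.5, -1.0]
baseline = survival_curve(discrete_hazards([0.0], [0.5], alphas))
shifted = survival_curve(discrete_hazards([1.0], [0.5], alphas))
```

Under this link the covariate effect acts as a power on the baseline survival curve, S(k | x) = S0(k) ** exp(beta . x), which is the sense in which the model is closely related to the Cox proportional hazards model.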

Subjects

Algorithms, Genomics, Bayes Theorem, Proportional Hazards Models, Computer Simulation

Computing Power and Sample Size for the False Discovery Rate in Multiple Applications.

Ni, Yonghui; Seffernick, Anna Eames; Onar-Thomas, Arzu; Pounds, Stanley B.

Genes (Basel) ; 15(3), 2024 Mar 07.

Article in English | MEDLINE | ID: mdl-38540403

ABSTRACT

The false discovery rate (FDR) is a widely used metric of statistical significance for genomic data analyses that involve multiple hypothesis testing. Power and sample size considerations are important in planning studies that perform these types of genomic data analyses. Here, we propose a three-rectangle approximation of a p-value histogram to derive a formula to compute the statistical power and sample size for analyses that involve the FDR. We also introduce the R package FDRsamplesize2, which incorporates these and other power calculation formulas to compute power for a broad variety of studies not covered by other FDR power calculation software. A few illustrative examples are provided. The FDRsamplesize2 package is available on CRAN.

Subjects

Algorithms, Software, Sample Size, Research Design, Genomics

ordinalbayes: Fitting Ordinal Bayesian Regression Models to High-Dimensional Data Using R.

Archer, Kellie J; Seffernick, Anna Eames; Sun, Shuai; Zhang, Yiran.

Stats (Basel) ; 5(2): 371-384, 2022 Jun.

Article in English | MEDLINE | ID: mdl-35574500

ABSTRACT

The stage of cancer is a discrete ordinal response that indicates the aggressiveness of disease and is often used by physicians to determine the type and intensity of treatment to be administered. For example, the FIGO stage in cervical cancer is based on the size and depth of the tumor as well as the level of spread. It may be of clinical relevance to identify molecular features from high-throughput genomic assays that are associated with the stage of cervical cancer to elucidate pathways related to tumor aggressiveness, identify improved molecular features that may be useful for staging, and identify therapeutic targets. High-throughput RNA-Seq data and corresponding clinical data (including stage) for cervical cancer patients have been made available through The Cancer Genome Atlas Project (TCGA). We recently described penalized Bayesian ordinal response models that can be used for variable selection for over-parameterized datasets, such as the TCGA-CESC dataset. Herein, we describe our ordinalbayes R package, available from the Comprehensive R Archive Network (CRAN), which enhances the runjags R package by enabling users to easily fit cumulative logit models when the outcome is ordinal and the number of predictors exceeds the sample size, P > N, such as for TCGA and other high-throughput genomic data. We demonstrate the use of this package by applying it to the TCGA cervical cancer dataset. Our ordinalbayes package can be used to fit models to high-dimensional datasets, and it effectively performs variable selection.
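
For readers unfamiliar with the cumulative logit model mentioned in this abstract, the following is a minimal sketch of its category probabilities, assuming the common parameterization P(Y <= k | x) = 1 / (1 + exp(-(c_k - beta . x))) with increasing cutpoints c_k. It is written in Python for illustration only and is not the ordinalbayes API; the function name, sign convention, and values are assumptions.

```python
import math

def cumulative_logit_probs(x, beta, cutpoints):
    """Category probabilities under a cumulative logit model (sketch).

    Assumes cutpoints are strictly increasing; K - 1 cutpoints define
    K ordered categories.
    """
    eta = sum(b * xi for b, xi in zip(beta, x))
    # Cumulative probabilities P(Y <= k); the final one is 1 by definition.
    cum = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints] + [1.0]
    # Differences of adjacent cumulative probabilities give P(Y = k).
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]

# Hypothetical values: three ordered categories (for example, stages), one feature.
stage_probs = cumulative_logit_probs(x=[0.0], beta=[1.0], cutpoints=[-1.0, 1.0])
```

A single coefficient vector shifts all cumulative probabilities together, which is the proportional odds assumption that distinguishes this model from the stereotype logit model.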

Race/ethnicity-associated blood DNA methylation differences between Japanese and European American women: an exploratory study.

Song, Min-Ae; Seffernick, Anna Eames; Archer, Kellie J; Mori, Kellie M; Park, Song-Yi; Chang, Linda; Ernst, Thomas; Tiirikainen, Maarit; Peplowska, Karolina; Wilkens, Lynne R; Le Marchand, Loïc; Lim, Unhee.

Clin Epigenetics ; 13(1): 188, 2021 Oct 11.

Article in English | MEDLINE | ID: mdl-34635168

ABSTRACT

BACKGROUND: Racial/ethnic disparities in health reflect a combination of genetic and environmental causes, and DNA methylation may be an important mediator. We compared in an exploratory manner the blood DNA methylome of Japanese Americans (JPA) versus European Americans (EUA). METHODS: Genome-wide buffy coat DNA methylation was profiled among healthy Multiethnic Cohort participant women who were Japanese (JPA; n = 30) or European (EUA; n = 28) Americans aged 60-65. Differentially methylated CpGs by race/ethnicity (DM-CpGs) were identified by linear regression (Bonferroni-corrected P < 0.1) and analyzed in relation to corresponding gene expression, a priori selected single nucleotide polymorphisms (SNPs), and blood biomarkers of inflammation and metabolism using Pearson or Spearman correlations (FDR < 0.1). RESULTS: We identified 174 DM-CpGs, the majority of which were hypermethylated in JPA compared to EUA (n = 133), often in promoter regions (n = 48). Half (51%) of the genes corresponding to the DM-CpGs were involved in liver function and liver disease, and the methylation in nine genes was significantly correlated with gene expression for DM-CpGs. A total of 156 DM-CpGs were associated with rs7489665 (SH2B1). Methylation of DM-CpGs was correlated with blood levels of the cytokine MIP1B (n = 146). We confirmed some of the DM-CpGs in the TCGA adjacent non-tumor liver tissue of Asians versus EUA. CONCLUSION: We found a number of differentially methylated CpGs in blood DNA between JPA and EUA women with a potential link to liver disease, specific SNPs, and systemic inflammation. These findings may support further research on the role of DNA methylation in mediating some of the higher risk of liver disease among JPA.

Subjects

Asian People/ethnology, DNA Methylation/genetics, Ethnicity/genetics, White People/ethnology, Signal Transducing Adaptor Proteins/analysis, Signal Transducing Adaptor Proteins/blood, Aged, Asian People/statistics & numerical data, Cohort Studies, DNA Methylation/physiology, Ethnicity/statistics & numerical data, Female, Genome-Wide Association Study, Humans, Japan/ethnology, Male, Middle Aged, United States/ethnology, White People/statistics & numerical data
