Gene set bagging for estimating the probability a statistically significant result will replicate.
BMC Bioinformatics; 14: 360, 2013 Dec 12.
Article in English | MEDLINE | ID: mdl-24330332
ABSTRACT
BACKGROUND:
Significance analysis plays a major role in identifying and ranking genes, transcription factor binding sites, DNA methylation regions, and other high-throughput features associated with illness. We propose a new approach, called gene set bagging, for measuring the probability that a gene set replicates in future studies. Gene set bagging involves resampling the original high-throughput data, performing gene-set analysis on the resampled data, and confirming that biological categories replicate in the bagged samples.
RESULTS:
Using both simulated and publicly available genomics data, we demonstrate that significant categories in a gene set enrichment analysis may be unstable when subjected to resampling. We show that our method estimates the replication probability (R), the probability that a gene set will replicate as a significant result in future studies, and show in simulations that this method reflects replication better than each set's p-value.
CONCLUSIONS:
Our results suggest that gene lists based on p-values are not necessarily stable, and therefore additional steps like gene set bagging may improve biological inference on gene sets.
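The three steps named in the abstract (resample the data, rerun the gene-set analysis, count how often each set is significant) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the Welch t-statistic cutoff, the hypergeometric enrichment test, resampling within each group, the simulated data, and all function and parameter names. The estimate of R for each set is simply the fraction of bootstrap replicates in which that set is called significant.

```python
import math
import random


def welch_t(a, b):
    """Welch t-statistic for two lists of numbers (0.0 if both variances are 0)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    se = math.sqrt(va / len(a) + vb / len(b))
    return (ma - mb) / se if se > 0 else 0.0


def hypergeom_tail(k, n_total, n_hits, n_set):
    """P(X >= k) when n_set genes are drawn from n_total genes, n_hits of them hits."""
    denom = math.comb(n_total, n_set)
    upper = min(n_hits, n_set)
    num = sum(math.comb(n_hits, i) * math.comb(n_total - n_hits, n_set - i)
              for i in range(k, upper + 1))
    return num / denom


def significant_sets(expr, labels, gene_sets, t_cut=2.5, alpha=0.05):
    """Names of gene sets enriched for differentially expressed genes.

    expr maps gene name -> list of values (one per sample); labels holds 0/1
    group membership. The cutoffs are illustrative assumptions.
    """
    hits = set()
    for gene, values in expr.items():
        g0 = [v for v, lab in zip(values, labels) if lab == 0]
        g1 = [v for v, lab in zip(values, labels) if lab == 1]
        if abs(welch_t(g0, g1)) > t_cut:
            hits.add(gene)
    enriched = set()
    for name, members in gene_sets.items():
        k = len(hits & members)
        if hypergeom_tail(k, len(expr), len(hits), len(members)) < alpha:
            enriched.add(name)
    return enriched


def gene_set_bagging(expr, labels, gene_sets, n_boot=50, seed=0):
    """Estimate R per gene set as the fraction of bootstrap replicates
    in which the set is called significant."""
    rng = random.Random(seed)
    idx0 = [i for i, lab in enumerate(labels) if lab == 0]
    idx1 = [i for i, lab in enumerate(labels) if lab == 1]
    counts = {name: 0 for name in gene_sets}
    for _ in range(n_boot):
        # Assumption: resample samples with replacement within each group.
        boot = rng.choices(idx0, k=len(idx0)) + rng.choices(idx1, k=len(idx1))
        boot_expr = {g: [v[i] for i in boot] for g, v in expr.items()}
        boot_labels = [labels[i] for i in boot]
        for name in significant_sets(boot_expr, boot_labels, gene_sets):
            counts[name] += 1
    return {name: c / n_boot for name, c in counts.items()}


def simulate(n_genes=100, n_per_group=10, n_signal=20, shift=2.0, seed=1):
    """Toy data: the first n_signal genes are shifted in group 1."""
    rng = random.Random(seed)
    labels = [0] * n_per_group + [1] * n_per_group
    expr = {}
    for g in range(n_genes):
        delta = shift if g < n_signal else 0.0
        expr[f"gene{g}"] = [rng.gauss(delta * lab, 1.0) for lab in labels]
    gene_sets = {
        "signal_set": {f"gene{g}" for g in range(n_signal)},
        "null_set": {f"gene{g}" for g in range(n_signal, 2 * n_signal)},
    }
    return expr, labels, gene_sets


if __name__ == "__main__":
    expr, labels, gene_sets = simulate()
    print(gene_set_bagging(expr, labels, gene_sets))
```

In this toy setup a set built from truly shifted genes should receive a much higher estimated R than a set of unshifted genes. A set whose significance in the original data depends on a few borderline genes would land somewhere in between, which is the kind of instability the abstract describes.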
Full text:
1
Database:
MEDLINE
Main subject:
DNA Methylation
/
Genomics
/
DNA Replication
Type of study:
Prognostic_studies
Limits:
Humans
Language:
English
Journal:
BMC Bioinformatics
Journal subject:
MEDICAL INFORMATICS
Publication year:
2013
Document type:
Article