Fast and covariate-adaptive method amplifies detection power in large-scale multiple hypothesis testing.
Nat Commun; 10(1): 3433, 2019 07 31.
Article | En | MEDLINE | ID: mdl-31366926
ABSTRACT
Multiple hypothesis testing is an essential component of modern data science. In many settings, in addition to the p-value, additional covariates for each hypothesis are available, e.g., functional annotation of variants in genome-wide association studies. Such information is ignored by popular multiple testing approaches such as the Benjamini-Hochberg procedure (BH). Here we introduce AdaFDR, a fast and flexible method that adaptively learns the optimal p-value threshold from covariates to significantly improve detection power. On eQTL analysis of the GTEx data, AdaFDR discovers 32% more associations than BH at the same false discovery rate. We prove that AdaFDR controls false discovery proportion and show that it makes substantially more discoveries while controlling false discovery rate (FDR) in extensive experiments. AdaFDR is computationally efficient and allows multi-dimensional covariates with both numeric and categorical values, making it broadly useful across many applications.
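For orientation, the Benjamini-Hochberg baseline named in the abstract can be sketched in a few lines. This is the standard BH step-up procedure only, not AdaFDR: it applies a single p-value cutoff to all hypotheses and ignores covariates, which is the limitation the abstract describes. The function name `benjamini_hochberg`, the `alpha` parameter, and the example p-values are illustrative assumptions and do not come from the paper.

```python
def benjamini_hochberg(p_values, alpha=0.1):
    """Standard BH step-up procedure (illustrative sketch, not AdaFDR).

    Returns a list of booleans, True where the hypothesis is rejected
    at nominal FDR level `alpha`.
    """
    n = len(p_values)
    if n == 0:
        return []

    # Indices of the hypotheses, ordered by ascending p-value.
    order = sorted(range(n), key=lambda i: p_values[i])

    # Find the largest 1-based rank k with p_(k) <= alpha * k / n.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / n:
            k_max = rank

    # Reject the k_max hypotheses with the smallest p-values.
    rejected = [False] * n
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, chosen only to exercise the function.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(example_p, alpha=0.05))
```

Because the cutoff here depends only on the rank of each p-value, two hypotheses with the same p-value are always treated identically. A covariate-adaptive method, as described in the abstract, instead lets the threshold vary with the covariates of each hypothesis.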
Full text: 1
Collection: 01-international
Database: MEDLINE
Main subject: Research Design / Algorithms / Data Interpretation, Statistical
Study type: Diagnostic_studies
Limits: Humans
Language: En
Journal: Nat Commun
Journal subject: BIOLOGY / SCIENCE
Year: 2019
Document type: Article
Affiliation country: United States