Computing the Statistical Significance of Overlap between Genome Annotations with iStat.
Cell Syst; 8(6): 523-529.e4, 2019 06 26.
Article | En | MEDLINE | ID: mdl-31202632
Genome annotation remains a fundamental effort in modern biology. With falling costs and new forms of sequencing technologies, annotations specific to tissue type and experimental conditions are continually being generated (e.g., histone methylation marks). Computing the statistical significance of overlap between two different annotations is key to many biological findings but has not been systematically addressed previously. We formalize the problem as follows: let I and I_f describe collections of n and m intervals of a genome, respectively, each with a particular annotation. Under the null hypothesis that genomic intervals in I are randomly arranged with respect to I_f, what is the significance of k of the m intervals of I_f intersecting with intervals in I? We describe a tool, iSTAT, that implements a combinatorial algorithm to accurately compute p values. We applied iSTAT to simulated and real datasets to obtain precise estimates and contrasted them against previous results using permutation or parametric tests.
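To make the problem statement concrete, the following Python sketch estimates the significance of k by simulation. It is not iSTAT's combinatorial algorithm, which the abstract does not detail; it is a minimal permutation-style baseline of the kind the abstract contrasts against. It assumes a single linear genome, half-open intervals, and a null model in which each interval of I is placed uniformly at random and independently of the others (so shuffled intervals may overlap). The function names and parameters are hypothetical.

```python
import random


def count_hits(ref, query):
    """Count intervals in ref (I_f) that intersect at least one interval in query (I).

    Intervals are half-open (start, end) pairs on one linear genome.
    """
    return sum(
        any(q_start < r_end and r_start < q_end for q_start, q_end in query)
        for r_start, r_end in ref
    )


def permutation_p_value(ref, query, genome_length, n_sim=1000, seed=0):
    """Monte Carlo estimate of P(hits >= k) under random placement of query.

    Assumption (not from the source): each query interval keeps its length and
    is re-placed uniformly at random, independently of the other intervals.
    """
    rng = random.Random(seed)
    k_obs = count_hits(ref, query)
    lengths = [end - start for start, end in query]
    at_least = 0
    for _ in range(n_sim):
        shuffled = []
        for length in lengths:
            # randint is inclusive, so the interval always fits in the genome.
            start = rng.randint(0, genome_length - length)
            shuffled.append((start, start + length))
        if count_hits(ref, shuffled) >= k_obs:
            at_least += 1
    # The +1 correction keeps the estimate strictly above zero.
    return k_obs, (1 + at_least) / (1 + n_sim)


if __name__ == "__main__":
    i_f = [(0, 10), (20, 30), (500, 520)]   # m = 3 reference intervals
    i = [(5, 8), (25, 40), (900, 910)]      # n = 3 query intervals
    k, p = permutation_p_value(i_f, i, genome_length=1000)
    print(f"k = {k} of m = {len(i_f)}, estimated p = {p:.4f}")
```

The smallest p value such a simulation can report is 1 / (n_sim + 1), so resolving very small p values requires a correspondingly large number of simulations.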
Full text: 1
Collection: 01-internacional
Database: MEDLINE
Main subject: Software / Genome, Human / Models, Statistical / Molecular Sequence Annotation
Type of study: Evaluation_studies / Prognostic_studies / Risk_factors_studies
Limits: Humans
Language: En
Journal: Cell Syst
Year: 2019
Document type: Article
Country of affiliation: United States