Training-free measures based on algorithmic probability identify high nucleosome occupancy in DNA sequences.

Zenil, Hector; Minary, Peter

Affiliation

Zenil H; Oxford Immune Algorithmics, Oxford University Innovation, Oxford, UK.
Minary P; Algorithmic Dynamics Lab, Unit of Computational Medicine, SciLifeLab, Center for Molecular Medicine, Karolinska Institute, Stockholm, Sweden.

Nucleic Acids Res; 47(20): e129, 2019 Nov 18.

Article in English | MEDLINE | ID: mdl-31511887

ABSTRACT

We introduce and study a set of training-free methods, information-theoretic and algorithmic-complexity in nature, that we apply to DNA sequences to assess their potential to identify nucleosomal binding sites. We test the measures on well-studied genomic sequences of different sizes drawn from different sources. The measures reveal the known in vivo versus in vitro predictive discrepancies and show potential to pinpoint high and low nucleosome occupancy. We explore different possible signals within and beyond the nucleosome length and find that the complexity indices are informative of nucleosome occupancy. We find that, while the gold-standard Kaplan model is clearly driven by GC content (by design) and by k-mer training, entropy and complexity-based scores are also informative for high occupancy and can complement the Kaplan model.
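As a rough illustration of what a training-free score looks like, the sketch below computes GC content and Shannon entropy over sliding windows of a DNA sequence. It is not the authors' implementation and does not implement the algorithmic-probability measures themselves; the function names, the 147 bp default window (the commonly cited nucleosome core length), and the step size are assumptions chosen for the example.

```python
import math
from collections import Counter


def gc_content(seq):
    """Fraction of bases in seq that are G or C (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for base in seq if base in "GC") / len(seq)


def kmer_entropy(seq, k=1):
    """Shannon entropy (bits) of the overlapping k-mer distribution in seq."""
    seq = seq.upper()
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return 0.0
    counts = Counter(kmers)
    total = len(kmers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def sliding_scores(seq, window=147, step=1, k=1):
    """Return (start, gc_content, entropy) for each window along seq.

    window=147 is an assumed default, not a value taken from the abstract.
    """
    scores = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        scores.append((start, gc_content(chunk), kmer_entropy(chunk, k)))
    return scores


if __name__ == "__main__":
    # Hypothetical toy sequence: a GC-rich stretch followed by an AT-rich one.
    toy = "GCGCGGCC" * 25 + "ATATTAAT" * 25
    for start, gc, h in sliding_scores(toy, window=147, step=50):
        print(start, round(gc, 3), round(h, 3))
```

Neither score requires fitting parameters to occupancy data, which is the sense in which such measures are training-free. Comparing them against measured occupancy, or against a trained model's predictions, would be a separate step.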

Subjects

Nucleosomes/genetics; Sequence Analysis, DNA/methods; Algorithms; Animals; Base Composition; DNA/chemistry; DNA/genetics; Humans; Nucleosomes/chemistry; Probability

Full text: 1 Database: MEDLINE Main subject: Nucleosomes / Sequence Analysis, DNA Language: En Publication year: 2019 Document type: Article