Modular discovery of monomeric and dimeric transcription factor binding motifs for large data sets.
Toivonen, Jarkko; Kivioja, Teemu; Jolma, Arttu; Yin, Yimeng; Taipale, Jussi; Ukkonen, Esko.
Affiliation
  • Toivonen J; Department of Computer Science, P.O. Box 68, FI-00014 University of Helsinki, Helsinki, Finland.
  • Kivioja T; Genome-Scale Biology Program, P.O. Box 63, FI-00014 University of Helsinki, Helsinki, Finland.
  • Jolma A; Division of Functional Genomics and Systems Biology, Department of Medical Biochemistry and Biophysics, and Department of Biosciences and Nutrition, Karolinska Institutet, SE 141 83 Stockholm, Sweden.
  • Yin Y; Division of Functional Genomics and Systems Biology, Department of Medical Biochemistry and Biophysics, and Department of Biosciences and Nutrition, Karolinska Institutet, SE 141 83 Stockholm, Sweden.
  • Taipale J; Genome-Scale Biology Program, P.O. Box 63, FI-00014 University of Helsinki, Helsinki, Finland.
  • Ukkonen E; Division of Functional Genomics and Systems Biology, Department of Medical Biochemistry and Biophysics, and Department of Biosciences and Nutrition, Karolinska Institutet, SE 141 83 Stockholm, Sweden.
Nucleic Acids Res; 46(8): e44, 2018 May 4.
Article in English | MEDLINE | ID: mdl-29385521
ABSTRACT
In some cases of dimeric transcription factor (TF) binding, the specificity of the dimeric motif has been observed to differ notably from what would be expected if the two factors bound to DNA independently of each other. Current motif discovery methods are unable to learn monomeric and dimeric motifs in a modular fashion, such that deviations from the expected motif become explicit and noise from dimeric occurrences does not corrupt the monomeric models. We propose a novel modeling technique and an expectation maximization algorithm, implemented as the software tool MODER, for discovering monomeric TF binding motifs and their dimeric combinations. Given training data and seeds for monomeric motifs, the algorithm learns, within a single probabilistic framework, a mixture model that represents monomeric motifs as standard position-specific probability matrices (PPMs) and dimeric motifs as pairs of monomeric PPMs with associated orientation and spacing preferences. For dimers, the model represents deviations from the purely modular model of two independent monomers, thus making cooperative binding effects explicit. MODER can analyze tens of megabase pairs (Mbp) of training data in reasonable time. We validated the tool on HT-SELEX and ChIP-seq data. Our findings include some TFs whose expected model has palindromic symmetry but whose observed model is directional.
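As an illustration of the modular idea described in the abstract, the following minimal Python sketch builds the dimeric PPM that would be expected if two monomers bound independently, and then measures how an observed dimeric PPM deviates from it. This is not MODER's code or interface: the function names, the column layout (A, C, G, T), the uniform background used for gap positions, and the restriction to non-overlapping spacings are all assumptions made for the example.

```python
# Hypothetical sketch; not MODER's actual code or API.
# A PPM is a list of columns; each column holds P(A), P(C), P(G), P(T).

UNIFORM = [0.25, 0.25, 0.25, 0.25]  # assumed background for gap positions


def reverse_complement(ppm):
    """Reverse the column order and swap A<->T, C<->G within each column."""
    return [col[::-1] for col in reversed(ppm)]


def expected_dimer(first, second, gap, flip_second=False):
    """Expected dimeric PPM if the two monomers bound independently.

    gap is the number of unconstrained positions between the monomers.
    Overlapping arrangements (negative gap) are not handled in this sketch.
    """
    if gap < 0:
        raise ValueError("overlapping monomers are not handled in this sketch")
    if flip_second:
        second = reverse_complement(second)
    return (
        [list(col) for col in first]
        + [list(UNIFORM) for _ in range(gap)]
        + [list(col) for col in second]
    )


def deviation(observed, expected):
    """Per-position, per-base difference between observed and expected PPMs."""
    if len(observed) != len(expected):
        raise ValueError("PPMs must have the same number of columns")
    return [[o - e for o, e in zip(oc, ec)] for oc, ec in zip(observed, expected)]


# Toy monomer that prefers the sequence "AC".
monomer = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.7, 0.1, 0.1]]

# The same monomer twice, second copy flipped, one free position in between.
dimer = expected_dimer(monomer, monomer, gap=1, flip_second=True)
```

Nonzero entries in `deviation(observed, dimer)` would mark the positions where an observed dimeric motif departs from the independent-binding expectation, which is the kind of cooperative effect the abstract says the model makes explicit.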
Subjects

Full text: 1 Database: MEDLINE Main subject: Transcription Factors / DNA Study type: Prognostic_studies / Risk_factors_studies Language: En Publication year: 2018 Document type: Article