A weighted two-stage sequence alignment framework to identify motifs from ChIP-exo data.
Li, Yang; Wang, Yizhong; Wang, Cankun; Ma, Anjun; Ma, Qin; Liu, Bingqiang.
Affiliation
  • Li Y; Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA.
  • Wang Y; School of Mathematics, Shandong University, Jinan, Shandong 250100, China.
  • Wang C; Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA.
  • Ma A; Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA.
  • Ma Q; Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA.
  • Liu B; Pelotonia Institute for Immuno-Oncology, The James Comprehensive Cancer Center, The Ohio State University, Columbus, OH 43210, USA.
Patterns (N Y) ; 5(3): 100927, 2024 Mar 08.
Article in En | MEDLINE | ID: mdl-38487805
ABSTRACT
In this study, we introduce TESA (weighted two-stage alignment), an innovative motif prediction tool that refines the identification of DNA-binding protein motifs, essential for deciphering transcriptional regulatory mechanisms. Unlike traditional algorithms that rely solely on sequence data, TESA integrates the high-resolution chromatin immunoprecipitation (ChIP) signal, specifically from ChIP-exonuclease (ChIP-exo), by assigning weights to sequence positions, thereby enhancing motif discovery. TESA employs a nuanced approach combining a binomial distribution model with a graph model, further supported by a "bookend" model, to improve the accuracy of predicting motifs of varying lengths. Our evaluation, utilizing an extensive compilation of 90 prokaryotic ChIP-exo datasets from proChIPdb and 167 H. sapiens datasets, compared TESA's performance against seven established tools. The results indicate TESA's improved precision in motif identification, suggesting its valuable contribution to the field of genomic research.
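
The idea of weighting sequence positions by ChIP-exo signal can be illustrated with a minimal sketch. This is not TESA's implementation: the abstract does not specify the weighting scheme, and the binomial, graph, and "bookend" models are not reproduced here. The function name `weighted_pfm`, the per-position `weights` input, and the toy values are all hypothetical. The sketch only shows how a position frequency matrix changes when each base contributes a signal-derived weight instead of a count of 1.

```python
BASES = "ACGT"


def weighted_pfm(kmers, weights):
    """Build a position frequency matrix from aligned k-mers.

    Each base adds its per-position weight (a hypothetical stand-in for
    ChIP-exo signal) rather than a unit count. Columns are normalized
    so that the four base frequencies sum to 1.
    """
    if not kmers:
        raise ValueError("at least one k-mer is required")
    if len(kmers) != len(weights):
        raise ValueError("kmers and weights must have the same length")
    k = len(kmers[0])
    matrix = [{base: 0.0 for base in BASES} for _ in range(k)]
    for seq, seq_weights in zip(kmers, weights):
        if len(seq) != k or len(seq_weights) != k:
            raise ValueError("every k-mer and weight vector must have length k")
        for column, base, weight in zip(matrix, seq, seq_weights):
            column[base] += weight
    # Normalize each column; an all-zero column is left as zeros.
    for column in matrix:
        total = sum(column.values())
        if total > 0:
            for base in BASES:
                column[base] /= total
    return matrix


# Toy input (assumed values): three aligned 3-mers with per-position weights.
kmers = ["ACG", "ACT", "TCG"]
weights = [[1, 1, 1], [1, 1, 3], [2, 1, 1]]
pfm = weighted_pfm(kmers, weights)
```

In the toy input, G occurs twice and T once in the last column, yet the heavily weighted T ends up with the higher frequency. That is the effect signal weighting is meant to have: positions with strong signal count for more than positions with weak signal.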

Full text: 1 Database: MEDLINE Language: En Journal: Patterns (N Y) Year: 2024 Type: Article Affiliation country: United States
