Supervised learning of enhancer-promoter specificity based on genome-wide perturbation studies highlights areas for improvement in learning.
Bioinformatics; 40(6), 2024 06 03.
Article in En | MEDLINE | ID: mdl-38870532
ABSTRACT
MOTIVATION: Understanding the rules that govern enhancer-driven transcription remains a central unsolved problem in genomics. Now that multiple massively parallel enhancer perturbation assays have been published, there are enough data to learn to predict enhancer-promoter (EP) relationships in a data-driven manner.
RESULTS: We applied machine learning to one of the largest enhancer perturbation studies, integrated with transcription factor (TF) and histone modification ChIP-seq data. The results uncovered a discrepancy between prediction on genome-wide data and prediction on data from targeted experiments. Relative strength of contact was important for prediction, confirming the basic principle of EP regulation. Novel features such as the density of enhancers and promoters in the genomic region were also found to be important, highlighting our lack of understanding of how other elements in the region contribute to regulation. Several TF peaks improved the prediction by identifying negatives and reducing false positives. In summary, integrating genomic assays with enhancer perturbation studies increased the accuracy of the model and provided novel insights into enhancer-driven transcription.
AVAILABILITY AND IMPLEMENTATION: The trained models, data, and source code are available at http://doi.org/10.5281/zenodo.11290386 and https://github.com/HanLabUNLV/sleps.
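ILLUSTRATIVE SKETCH: The abstract names two kinds of features, relative strength of contact and the density of enhancers/promoters in a genomic region. The snippet below is a minimal sketch of how such features might be computed for one promoter and its candidate enhancers. The function names, the 500 kb window, and the toy values are assumptions made for illustration and are not taken from the authors' code or data.

```python
from bisect import bisect_left, bisect_right


def element_density(sorted_positions, center, window=500_000):
    """Count elements whose position lies within +/- window bp of center.

    sorted_positions must be sorted in ascending order (hypothetical input:
    midpoints of annotated enhancers or promoters on one chromosome).
    """
    lo = bisect_left(sorted_positions, center - window)
    hi = bisect_right(sorted_positions, center + window)
    return hi - lo


def relative_contact(contacts):
    """Scale each candidate enhancer's contact value by the total contact
    over all candidates for the same promoter, so the values sum to 1."""
    total = sum(contacts.values())
    if total == 0:
        return {name: 0.0 for name in contacts}
    return {name: value / total for name, value in contacts.items()}


# Toy values, invented for this example only.
enhancer_midpoints = sorted([100_000, 450_000, 900_000, 1_600_000])
density = element_density(enhancer_midpoints, center=500_000)

raw_contacts = {"e1": 6.0, "e2": 3.0, "e3": 1.0}
scaled = relative_contact(raw_contacts)
```

Features of this kind could then be placed in a table alongside TF and histone ChIP-seq signals and passed to any supervised classifier; which learner and which exact feature definitions the authors used is not stated in this record.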
Full text:
1
Database:
MEDLINE
Main subject:
Enhancer Elements, Genetic / Promoter Regions, Genetic / Supervised Machine Learning
Language:
En
Publication year:
2024
Document type:
Article