Boosting tissue-specific prediction of active cis-regulatory regions through deep learning and Bayesian optimization techniques.
Cappelletti, Luca; Petrini, Alessandro; Gliozzo, Jessica; Casiraghi, Elena; Schubach, Max; Kircher, Martin; Valentini, Giorgio.
Affiliation
  • Cappelletti L; AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Milan, Italy.
  • Petrini A; AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Milan, Italy.
  • Gliozzo J; AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Milan, Italy.
  • Casiraghi E; AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Milan, Italy.
  • Schubach M; Berlin Institute of Health at Charité, Universitätsmedizin Berlin, Berlin, Germany.
  • Kircher M; Berlin Institute of Health at Charité, Universitätsmedizin Berlin, Berlin, Germany.
  • Valentini G; AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Milan, Italy. valentini@di.unimi.it.
BMC Bioinformatics ; 23(Suppl 2): 154, 2022 Dec 12.
Article in English | MEDLINE | ID: mdl-36510125
ABSTRACT

BACKGROUND:

Cis-regulatory regions (CRRs) are non-coding regions of DNA that finely control the spatio-temporal pattern of transcription; they are involved in a wide range of pivotal processes, such as the development of specific cell lines and tissues and the dynamic cell response to physiological stimuli. Recent studies have shown that genetic variants occurring in CRRs are strongly correlated with pathogenicity or deleteriousness. Given the central role of CRRs in regulating physiological and pathological conditions, correctly identifying CRRs and their tissue-specific activity status through machine learning methods plays a major role in dissecting the impact of genetic variants on human diseases. The problem remains open, although promising results have already been reported by (deep) machine-learning-based methods that predict active promoters and enhancers in specific tissues or cell lines by encoding epigenetic or spectral features directly extracted from DNA sequences.

RESULTS:

We present experiments comparing two deep neural networks, a feed-forward neural network working on epigenomic features and a convolutional neural network working only on genomic sequence, both aimed at identifying enhancer and promoter activity in specific cell lines. To understand how the experimental setup influences prediction performance, we focused in particular on (1) automatic model selection performed by Bayesian optimization and (2) different data rebalancing setups for reducing the negative effects of class imbalance.
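The abstract does not say which rebalancing scheme was used. As a rough illustration of one common setup, the sketch below randomly undersamples negatives in a training split so that there are at most a fixed number of negatives per positive. The function name `rebalance_training_split`, the parameter `max_neg_per_pos`, and the toy labels are hypothetical and are not taken from the paper.

```python
import random


def rebalance_training_split(labels, max_neg_per_pos, seed=0):
    """Keep every positive and at most `max_neg_per_pos` negatives per positive.

    Hypothetical helper: random undersampling is an assumed scheme,
    not necessarily the one used by the authors.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n_keep = min(len(neg), max_neg_per_pos * len(pos))
    # Sample negatives without replacement and return indices in order.
    return sorted(pos + rng.sample(neg, n_keep))


# Toy training labels: 10 active regions, 90 inactive ones.
labels = [1] * 10 + [0] * 90
train_idx = rebalance_training_split(labels, max_neg_per_pos=2)

n_pos = sum(labels[i] for i in train_idx)
n_neg = len(train_idx) - n_pos
print(n_pos, n_neg)  # 10 20
```

Applying such a helper to the training indices only leaves the test split at its original class ratio.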

CONCLUSIONS:

The results show that (1) automatic model selection by Bayesian optimization improves the quality of the learner; (2) data rebalancing considerably affects the prediction performance of the models, and test set rebalancing may yield over-optimistic results and should therefore be applied cautiously; (3) despite working only on sequence data, convolutional models achieve performance close to that of feed-forward models working on epigenomic information, which suggests that sequence data also carries informative content for CRR-activity prediction. We therefore suggest combining both models and data types in future work.
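One way to see why test set rebalancing can look over-optimistic is to hold a classifier's true positive rate and false positive rate fixed and vary only the fraction of positives in the test set. The sketch below does this with Bayes' rule. The rates and prevalences are made-up illustrative values, not results from the paper, and `precision_at_prevalence` is a hypothetical helper name.

```python
def precision_at_prevalence(tpr, fpr, prevalence):
    """Expected precision of a classifier with fixed TPR and FPR
    when a fraction `prevalence` of test examples is positive."""
    true_pos = tpr * prevalence
    false_pos = fpr * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Illustrative (assumed) operating point: TPR = 0.8, FPR = 0.1.
tpr, fpr = 0.8, 0.1

balanced = precision_at_prevalence(tpr, fpr, 0.50)  # rebalanced 1:1 test set
skewed = precision_at_prevalence(tpr, fpr, 0.10)    # 1 positive per 9 negatives
rare = precision_at_prevalence(tpr, fpr, 0.01)      # 1 positive per 99 negatives

print(f"{balanced:.3f} {skewed:.3f} {rare:.3f}")  # 0.889 0.471 0.075
```

The same classifier appears far more precise on the balanced test set than on the imbalanced ones, so precision-based metrics measured after rebalancing the test set need not reflect performance at the real class ratio.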

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Deep Learning Type of study: Prognostic_studies / Risk_factors_studies Limits: Humans Language: En Journal: BMC Bioinformatics Journal subject: MEDICAL INFORMATICS Year: 2022 Document type: Article Country of affiliation: Italy

...