Meta-i6mA: an interspecies predictor for identifying DNA N6-methyladenine sites of plant genomes by exploiting informative features in an integrative machine-learning framework.
Hasan, Md Mehedi; Basith, Shaherin; Khatun, Mst Shamima; Lee, Gwang; Manavalan, Balachandran; Kurata, Hiroyuki.
Affiliation
  • Hasan MM; China Agricultural University, Beijing.
  • Basith S; Department of Physiology, Ajou University School of Medicine, Republic of Korea.
  • Khatun MS; Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Japan.
  • Lee G; Department of Physiology, Ajou University School of Medicine, Republic of Korea.
  • Manavalan B; Ajou University, Republic of Korea.
  • Kurata H; Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Japan.
Brief Bioinform; 22(3), 2021 May 20.
Article in English | MEDLINE | ID: mdl-32910169
ABSTRACT
DNA N6-methyladenine (6mA) is an important epigenetic modification involved in various cellular processes. Accurate identification of 6mA sites is a challenging task in genome analysis and a prerequisite for understanding their biological functions. To date, several species-specific machine learning (ML)-based models have been proposed, but the majority of them have not been tested on other species; hence, their practical application to other plant species is quite limited. In this study, we explored 10 different feature encoding schemes with the goal of capturing key characteristics around 6mA sites. We selected five feature encoding schemes, based on physicochemical and position-specific information, that possess high discriminative capability. The resulting feature sets were fed into six commonly used ML methods (random forest, support vector machine, extremely randomized tree, logistic regression, naïve Bayes and AdaBoost). The Rosaceae genome was used to train these classifiers, generating 30 baseline models. To integrate their individual strengths, we propose Meta-i6mA, which combines the baseline models using a meta-predictor approach. In extensive independent tests, Meta-i6mA achieved high Matthews correlation coefficient values of 0.918, 0.827 and 0.635 on Rosaceae, rice and Arabidopsis thaliana, respectively, and outperformed existing predictors. We anticipate that Meta-i6mA can be applied across different plant species. Furthermore, we developed a user-friendly online web server, available at http://kurata14.bio.kyutech.ac.jp/Meta-i6mA/.
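As a rough illustration of two quantities mentioned in the abstract, the sketch below enumerates the 5 × 6 = 30 encoding/classifier pairings and computes the Matthews correlation coefficient from binary labels. It is not the authors' code: the encoding names are placeholders (the abstract does not list the five selected schemes), and the helper names `baseline_grid` and `mcc` are hypothetical.

```python
from itertools import product
from math import sqrt

# Classifier names as listed in the abstract.
CLASSIFIERS = [
    "random forest",
    "support vector machine",
    "extremely randomized tree",
    "logistic regression",
    "naive Bayes",
    "AdaBoost",
]

# Placeholder names (assumption): the abstract does not name the five encodings.
ENCODINGS = [f"encoding_{i}" for i in range(1, 6)]


def baseline_grid(encodings, classifiers):
    """Return every (encoding, classifier) pairing, one per baseline model."""
    return list(product(encodings, classifiers))


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = 6mA site, 0 = non-site)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    grid = baseline_grid(ENCODINGS, CLASSIFIERS)
    print(len(grid))  # number of baseline models
    print(mcc([1, 1, 0, 0], [1, 0, 0, 0]))  # toy labels, not data from the study
```

In a meta-predictor setup of the kind described, each pairing in the grid would presumably yield one baseline model whose predicted probability becomes an input feature for a second-stage model; the abstract does not specify that second stage, so it is left out here.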

Full text: 1 | Databases: MEDLINE | Main subject: Adenosine / Genome, Plant / DNA, Plant / Computational Biology / Epigenesis, Genetic / Machine Learning | Study type: Clinical_trials / Prognostic_studies / Risk_factors_studies | Language: En | Journal: Brief Bioinform | Journal subject: BIOLOGY / MEDICAL INFORMATICS | Year: 2021 | Document type: Article
