iLearn: an integrated platform and meta-learner for feature engineering, machine-learning analysis and modeling of DNA, RNA and protein sequence data.
Chen, Zhen; Zhao, Pei; Li, Fuyi; Marquez-Lago, Tatiana T; Leier, André; Revote, Jerico; Zhu, Yan; Powell, David R; Akutsu, Tatsuya; Webb, Geoffrey I; Chou, Kuo-Chen; Smith, A Ian; Daly, Roger J; Li, Jian; Song, Jiangning.
Affiliation
  • Chen Z; School of Basic Medical Science, Qingdao University, 38 Dengzhou Road, Qingdao, 266021, Shandong, China.
  • Zhao P; State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences (CAAS), Anyang, 455000, China.
  • Li F; Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia.
  • Marquez-Lago TT; Department of Genetics, School of Medicine, University of Alabama at Birmingham, USA.
  • Leier A; Department of Cell, Developmental and Integrative Biology, School of Medicine, University of Alabama at Birmingham, AL, USA.
  • Revote J; Department of Genetics, School of Medicine, University of Alabama at Birmingham, USA.
  • Zhu Y; Department of Cell, Developmental and Integrative Biology, School of Medicine, University of Alabama at Birmingham, AL, USA.
  • Powell DR; Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia.
  • Akutsu T; Biomedicine Discovery Institute and Department of Microbiology, Monash University, Melbourne, VIC 3800, Australia.
  • Webb GI; Biomedicine Discovery Institute and Department of Microbiology, Monash University, Melbourne, VIC 3800, Australia.
  • Chou KC; Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto 611-0011, Japan.
  • Smith AI; Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia.
  • Daly RJ; Gordon Life Science Institute, Boston, MA 02478, USA.
  • Li J; Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 610054, China.
  • Song J; Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia.
Brief Bioinform; 21(3): 1047-1057, 2020 05 21.
Article in En | MEDLINE | ID: mdl-31067315
ABSTRACT
With the explosive growth of biological sequences generated in the post-genomic era, one of the most challenging problems in bioinformatics and computational biology is to computationally characterize sequences, structures and functions in an efficient, accurate and high-throughput manner. A number of online web servers and stand-alone tools have been developed to address this to date; however, all these tools have their limitations and drawbacks in terms of their effectiveness, user-friendliness and capacity. Here, we present iLearn, a comprehensive and versatile Python-based toolkit, integrating the functionality of feature extraction, clustering, normalization, selection, dimensionality reduction, predictor construction, best descriptor/model selection, ensemble learning and results visualization for DNA, RNA and protein sequences. iLearn was designed for users that only want to upload their data set and select the functions they need calculated from it, while all necessary procedures and optimal settings are completed automatically by the software. iLearn includes a variety of descriptors for DNA, RNA and proteins, and four feature output formats are supported so as to facilitate direct output usage or communication with other computational tools. In total, iLearn encompasses 16 different types of feature clustering, selection, normalization and dimensionality reduction algorithms, and five commonly used machine-learning algorithms, thereby greatly facilitating feature analysis and predictor construction. iLearn is made freely available via an online web server and a stand-alone toolkit.
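To make the feature extraction and normalization steps named in the abstract concrete, here is a minimal sketch of one common descriptor type, k-mer composition, followed by min-max scaling. It does not use iLearn's own code or API: the function names `kmer_composition` and `min_max_normalize` and the example sequences are hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import product


def kmer_composition(seq, k=2, alphabet="ACGT"):
    """Return k-mer frequencies in a fixed lexicographic order (hypothetical helper)."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only windows made entirely of alphabet symbols contribute to the total.
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def min_max_normalize(rows):
    """Scale each feature column to [0, 1]; constant columns map to 0.0."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in rows
    ]


# Illustrative sequences only, not data from the paper.
features = [kmer_composition(s) for s in ("ACGTACGT", "AAAACCCC", "GGGTTTAA")]
scaled = min_max_normalize(features)
```

Each sequence becomes a fixed-length vector (16 values for dinucleotides), which is the form that downstream clustering, selection and classification steps generally expect.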

Full text: 1 Database: MEDLINE Main subject: DNA / RNA / Proteins / Sequence Analysis / Machine Learning Type of study: Prognostic_studies Language: En Publication year: 2020 Document type: Article