POSSUM: a bioinformatics toolkit for generating numerical sequence feature descriptors based on PSSM profiles.

Wang, Jiawei; Yang, Bingjiao; Revote, Jerico; Leier, André; Marquez-Lago, Tatiana T; Webb, Geoffrey; Song, Jiangning; Chou, Kuo-Chen; Lithgow, Trevor

Affiliation

Wang J; Biomedicine Discovery Institute, Monash University, VIC 3800, Australia.
Yang B; College of Mechanical Engineering, Yanshan University, Qinhuangdao 066004, China.
Revote J; Biomedicine Discovery Institute, Monash University, VIC 3800, Australia.
Leier A; Informatics Institute and Department of Genetics, School of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA.
Marquez-Lago TT; Informatics Institute and Department of Genetics, School of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA.
Webb G; Monash Centre for Data Science, Faculty of Information Technology.
Song J; Biomedicine Discovery Institute, Monash University, VIC 3800, Australia.
Chou KC; Monash Centre for Data Science, Faculty of Information Technology.
Lithgow T; ARC Centre of Excellence for Advanced Molecular Imaging, Monash University, VIC 3800, Australia.

Bioinformatics; 33(17): 2756-2758, 2017 Sep 01.

Article in English | MEDLINE | ID: mdl-28903538

ABSTRACT

SUMMARY: Evolutionary information in the form of a Position-Specific Scoring Matrix (PSSM) is a widely used and highly informative representation of protein sequences. Accordingly, PSSM-based feature descriptors have been successfully applied to improve the performance of various predictors of protein attributes. Even though a number of algorithms have been proposed in previous studies, there is currently no universal web server or toolkit available for generating this wide variety of descriptors. Here, we present POSSUM (Position-Specific Scoring matrix-based feature generator for machine learning), a versatile toolkit with an online web server that can generate 21 types of PSSM-based feature descriptors, thereby addressing a crucial need for bioinformaticians and computational biologists. We envisage that this comprehensive toolkit will be widely used as a powerful tool to facilitate feature extraction, selection, and benchmarking of machine learning-based models, thereby contributing to a more effective analysis and modeling pipeline for bioinformatics research. AVAILABILITY AND IMPLEMENTATION: http://possum.erc.monash.edu/. CONTACT: trevor.lithgow@monash.edu or jiangning.song@monash.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
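To make the idea of a PSSM-based feature descriptor concrete, the sketch below turns a variable-length PSSM (one row per residue, one column per amino acid type) into a fixed-length vector by averaging each column. This is an illustration only: the function name `column_mean_descriptor` and the toy matrix are hypothetical, and the abstract does not say which 21 descriptors POSSUM implements or how it computes them.

```python
from typing import List


def column_mean_descriptor(pssm: List[List[float]]) -> List[float]:
    """Average each PSSM column over all sequence positions.

    Hypothetical helper for illustration; not taken from POSSUM.
    A real PSSM has 20 columns, so the result would have 20 values
    regardless of the protein's length.
    """
    if not pssm:
        raise ValueError("PSSM must contain at least one row")
    n_cols = len(pssm[0])
    if any(len(row) != n_cols for row in pssm):
        raise ValueError("all PSSM rows must have the same length")
    n_rows = len(pssm)
    # One mean per column: sum the column's scores, divide by sequence length.
    return [sum(row[j] for row in pssm) / n_rows for j in range(n_cols)]


# Toy "PSSM" for a 3-residue sequence, truncated to 4 columns for brevity.
toy_pssm = [
    [1.0, -2.0, 0.0, 3.0],
    [3.0, 0.0, -1.0, 1.0],
    [2.0, -1.0, 1.0, -1.0],
]
descriptor = column_mean_descriptor(toy_pssm)
print(descriptor)  # [2.0, -1.0, 0.0, 1.0]
```

Because the output length depends only on the number of columns, proteins of different lengths map to vectors of the same size, which is what most machine learning models require as input.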

Subjects

Computational Biology/methods; Machine Learning; Position-Specific Scoring Matrices; Sequence Analysis, Protein/methods; Software

Full text: 1 Collections: 01-international Database: MEDLINE Study type: Prognostic_studies Language: En Publication year: 2017 Document type: Article
