Twenty years of bioinformatics research for protease-specific substrate and cleavage site prediction: a comprehensive revisit and benchmarking of existing methods.
Li, Fuyi; Wang, Yanan; Li, Chen; Marquez-Lago, Tatiana T; Leier, André; Rawlings, Neil D; Haffari, Gholamreza; Revote, Jerico; Akutsu, Tatsuya; Chou, Kuo-Chen; Purcell, Anthony W; Pike, Robert N; Webb, Geoffrey I; Ian Smith, A; Lithgow, Trevor; Daly, Roger J; Whisstock, James C; Song, Jiangning.
Affiliation
  • Li F; Biomedicine Discovery Institute and Department of Biochemistry & Molecular Biology, Monash University, Melbourne, VIC 3800, Australia.
  • Wang Y; Biomedicine Discovery Institute and Department of Biochemistry & Molecular Biology, Monash University, Melbourne, VIC 3800, Australia.
  • Li C; Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai 200240, China.
  • Marquez-Lago TT; Biomedicine Discovery Institute and Department of Biochemistry & Molecular Biology, Monash University, Melbourne, VIC 3800, Australia.
  • Leier A; Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, Zürich 8093, Switzerland.
  • Rawlings ND; Department of Genetics and Department of Cell, Developmental and Integrative Biology, School of Medicine, University of Alabama at Birmingham, AL, USA.
  • Haffari G; Department of Genetics and Department of Cell, Developmental and Integrative Biology, School of Medicine, University of Alabama at Birmingham, AL, USA.
  • Revote J; EMBL European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK.
  • Akutsu T; Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia.
  • Chou KC; Biomedicine Discovery Institute and Department of Biochemistry & Molecular Biology, Monash University, Melbourne, VIC 3800, Australia.
  • Purcell AW; Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto 611-0011, Japan.
  • Pike RN; Gordon Life Science Institute, Boston, MA 02478, USA.
  • Webb GI; Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China.
  • Ian Smith A; Biomedicine Discovery Institute and Department of Biochemistry & Molecular Biology, Monash University, Melbourne, VIC 3800, Australia.
  • Lithgow T; La Trobe Institute for Molecular Science, La Trobe University, Melbourne, VIC 3086, Australia.
  • Daly RJ; ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, Melbourne, VIC 3800, Australia.
  • Whisstock JC; Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia.
  • Song J; Biomedicine Discovery Institute and Department of Biochemistry & Molecular Biology, Monash University, Melbourne, VIC 3800, Australia.
Brief Bioinform; 20(6): 2150-2166, 2019 11 27.
Article in En | MEDLINE | ID: mdl-30184176
ABSTRACT
The roles of proteolytic cleavage have been intensively investigated and discussed during the past two decades. This irreversible chemical process has been frequently reported to influence a number of crucial biological processes (BPs), such as cell cycle, protein regulation and inflammation. A number of advanced studies have been published aiming at deciphering the mechanisms of proteolytic cleavage. Given its significance and the large number of functionally enriched substrates targeted by specific proteases, many computational approaches have been established for accurate prediction of protease-specific substrates and their cleavage sites. Consequently, there is an urgent need to systematically assess the state-of-the-art computational approaches for protease-specific cleavage site prediction to further advance the existing methodologies and to improve the prediction performance. With this goal in mind, in this article, we carefully evaluated a total of 19 computational methods (including 8 scoring function-based methods and 11 machine learning-based methods) in terms of their underlying algorithm, calculated features, performance evaluation and software usability. Then, extensive independent tests were performed to assess the robustness and scalability of the reviewed methods using our carefully prepared independent test data sets with 3641 cleavage sites (specific to 10 proteases). The comparative experimental results demonstrate that PROSPERous is the most accurate generic method for predicting eight protease-specific cleavage sites, while GPS-CCD and LabCaS outperformed other predictors for calpain-specific cleavage sites. Based on our review, we then outlined some potential ways to improve the prediction performance and ease the computational burden by applying ensemble learning, deep learning, positive unlabeled learning and parallel and distributed computing techniques. 
We anticipate that our study will serve as a practical and useful guide for interested readers to further advance next-generation bioinformatics tools for protease-specific cleavage site prediction.

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Peptide Hydrolases / Research / Computational Biology / Benchmarking Type of study: Prognostic_studies / Risk_factors_studies Language: En Journal: Brief Bioinform Journal subject: BIOLOGIA / INFORMATICA MEDICA Year: 2019 Type: Article Affiliation country: Australia
