Evidential deep learning for trustworthy prediction of enzyme commission number.
Han, So-Ra; Park, Mingyu; Kosaraju, Sai; Lee, JeungMin; Lee, Hyun; Lee, Jun Hyuck; Oh, Tae-Jin; Kang, Mingon.
Affiliation
  • Han SR; Department of Life Science and Biochemical Engineering, Sun Moon University, Asan, Republic of Korea.
  • Park M; Bio Big Data-based Chungnam Smart Clean Research Leader Training Program, Sun Moon University, Asan, Republic of Korea.
  • Kosaraju S; Bio Big Data-based Chungnam Smart Clean Research Leader Training Program, Sun Moon University, Asan, Republic of Korea.
  • Lee J; Division of Computer Science and Engineering, Sun Moon University, Asan, Republic of Korea.
  • Lee H; Department of Computer Science, University of Nevada, Las Vegas, NV, USA.
  • Lee JH; Bio Big Data-based Chungnam Smart Clean Research Leader Training Program, Sun Moon University, Asan, Republic of Korea.
  • Oh TJ; Division of Computer Science and Engineering, Sun Moon University, Asan, Republic of Korea.
  • Kang M; Bio Big Data-based Chungnam Smart Clean Research Leader Training Program, Sun Moon University, Asan, Republic of Korea.
Brief Bioinform; 25(1), 2023 Nov 22.
Article in English | MEDLINE | ID: mdl-37991247
ABSTRACT
The rapid growth in the number of uncharacterized enzymes, together with their functional diversity, calls for accurate and trustworthy computational tools for functional annotation. However, current state-of-the-art models lack trustworthiness when predicting in a multilabel classification problem with thousands of classes. Here, we demonstrate that a novel evidential deep learning model (named ECPICK) makes trustworthy predictions of enzyme commission (EC) numbers with data-driven, domain-relevant evidence, which results in significantly enhanced predictive power and the capability to discover potential new motif sites. ECPICK learns complex sequential patterns of amino acids and their hierarchical structures from 20 million enzyme data. ECPICK identifies significant amino acids that contribute to the prediction without multiple sequence alignment. Our intensive assessment showed not only an outstanding improvement in predictive performance on the largest databases, UniProt, Protein Data Bank (PDB) and Kyoto Encyclopedia of Genes and Genomes (KEGG), but also a capability to discover new motif sites in microorganisms. ECPICK is a reliable EC number prediction tool for identifying the protein functions of an increasing number of uncharacterized enzymes.

Full text: 1 Database: MEDLINE Main subject: Deep Learning Language: En Publication year: 2023 Document type: Article
