DR-BERT: A protein language model to annotate disordered regions.
Nambiar, Ananthan; Forsyth, John Malcolm; Liu, Simon; Maslov, Sergei.
Affiliation
  • Nambiar A; Department of Bioengineering, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA; Carl R. Woese Institute for Genomic Biology, Urbana, IL 61801, USA. Electronic address: nambiar4@illinois.edu.
  • Forsyth JM; Carl R. Woese Institute for Genomic Biology, Urbana, IL 61801, USA; Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA.
  • Liu S; Carl R. Woese Institute for Genomic Biology, Urbana, IL 61801, USA; Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA.
  • Maslov S; Department of Bioengineering, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA; Carl R. Woese Institute for Genomic Biology, Urbana, IL 61801, USA; Department of Physics, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA; Computing, Environment and Life Sciences, Argonne N
Structure; 32(8): 1260-1268.e3, 2024 Aug 08.
Article in English | MEDLINE | ID: mdl-38701796
ABSTRACT
Despite their lack of a rigid structure, intrinsically disordered regions (IDRs) in proteins play important roles in cellular functions, including mediating protein-protein interactions. Therefore, it is important to computationally annotate IDRs with high accuracy. In this study, we present Disordered Region prediction using Bidirectional Encoder Representations from Transformers (DR-BERT), a compact protein language model. Unlike most popular tools, DR-BERT is pretrained on unannotated proteins and trained to predict IDRs without relying on explicit evolutionary or biophysical data. Despite this, DR-BERT demonstrates significant improvement over existing methods on the Critical Assessment of protein Intrinsic Disorder (CAID) evaluation dataset and outperforms competitors on two out of four test cases in the CAID 2 dataset, while maintaining competitiveness in the others. This performance is due to the information learned during pretraining and DR-BERT's ability to use contextual information.

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Intrinsically Disordered Proteins Language: En Journal: Structure Journal subject: MOLECULAR BIOLOGY / BIOCHEMISTRY / BIOTECHNOLOGY Year: 2024 Document type: Article Country of publication: United States
