Knot or not? Identifying unknotted proteins in knotted families with sequence-based Machine Learning model.
Sikora, Maciej; Klimentova, Eva; Uchal, Dawid; Sramkova, Denisa; Perlinska, Agata P; Nguyen, Mai Lan; Korpacz, Marta; Malinowska, Roksana; Nowakowski, Szymon; Rubach, Pawel; Simecek, Petr; Sulkowska, Joanna I.
Affiliation
  • Sikora M; Centre of New Technologies, University of Warsaw, Warsaw, Poland.
  • Klimentova E; Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland.
  • Uchal D; Central European Institute of Technology, Masaryk University, Brno, Czech Republic.
  • Sramkova D; National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Brno, Czech Republic.
  • Perlinska AP; Centre of New Technologies, University of Warsaw, Warsaw, Poland.
  • Nguyen ML; Faculty of Physics, University of Warsaw, Warsaw, Poland.
  • Korpacz M; Central European Institute of Technology, Masaryk University, Brno, Czech Republic.
  • Malinowska R; National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Brno, Czech Republic.
  • Nowakowski S; Centre of New Technologies, University of Warsaw, Warsaw, Poland.
  • Rubach P; Centre of New Technologies, University of Warsaw, Warsaw, Poland.
  • Simecek P; Centre of New Technologies, University of Warsaw, Warsaw, Poland.
  • Sulkowska JI; Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland.
Protein Sci ; 33(7): e4998, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38888487
ABSTRACT
Knotted proteins, although scarce, are crucial structural components of certain protein families, and their roles continue to be a topic of intense research. Capitalizing on the vast collection of protein structure predictions offered by AlphaFold (AF), this study computationally examines the entire UniProt database to create a robust dataset of knotted and unknotted proteins. Utilizing this dataset, we develop a machine learning (ML) model capable of accurately predicting the presence of knots in protein structures solely from their amino acid sequences. We tested the model's capabilities on 100 proteins whose structures had not yet been predicted by AF and found agreement with our local prediction in 92% of cases. From the point of view of structural biology, we found that all potentially knotted proteins predicted by AF can be classified into only 17 families. This allows us to discover the presence of unknotted proteins in families with a highly conserved knot. We found only three new protein families, UCH, DUF4253, and DUF2254, that contain both knotted and unknotted proteins, and we demonstrate that deletions within the knot core could potentially account for the observed unknotted (trivial) topology. Finally, we have shown that in the majority of knotted families (11 out of 15), the knotted topology is strictly conserved in functional proteins with very low sequence similarity. We have conclusively demonstrated that proteins AF predicts as unknotted are structurally accurate in their unknotted configurations. However, these proteins often represent nonfunctional fragments, lacking significant portions of the knot core (amino acid sequence).

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Proteins / Models, Molecular / Databases, Protein / Machine Learning Language: En Journal: Protein Sci Journal subject: BIOCHEMISTRY Year of publication: 2024 Document type: Article Country of affiliation: Poland
