Rapid and accurate prediction of protein homo-oligomer symmetry with Seq2Symm.
Kshirsagar, Meghana; Meller, Artur; Humphreys, Ian; Sledzieski, Samuel; Xu, Yixi; Dodhia, Rahul; Horvitz, Eric; Berger, Bonnie; Bowman, Gregory; Ferres, Juan Lavista; Baker, David; Baek, Minkyung.
Affiliation
  • Kshirsagar M; Microsoft AI for Good Research Lab.
  • Meller A; Washington University in St. Louis.
  • Humphreys I; University of Washington.
  • Sledzieski S; Massachusetts Institute of Technology.
  • Xu Y; Microsoft AI for Good Research Lab.
  • Dodhia R; Microsoft AI for Good Research Lab.
  • Horvitz E; Microsoft.
  • Berger B; Massachusetts Institute of Technology.
  • Bowman G; University of Pennsylvania.
  • Ferres JL; Microsoft AI for Good Research Lab.
  • Baker D; University of Washington.
  • Baek M; Seoul National University.
Res Sq; 2024 Apr 26.
Article in En | MEDLINE | ID: mdl-38746169
ABSTRACT
The majority of proteins must form higher-order assemblies to perform their biological functions. Despite the importance of protein quaternary structure, few machine learning models can accurately and rapidly predict the symmetry of assemblies involving multiple copies of the same protein chain. Here, we address this gap by training several classes of protein foundation models, including ESM-MSA, ESM2, and RoseTTAFold2, to predict homo-oligomer symmetry. Our best model, Seq2Symm, which uses ESM2, outperforms existing template-based and deep learning methods. It achieves an average PR-AUC of 0.48 and 0.44 across homo-oligomer symmetries on two different held-out test sets, compared to 0.32 and 0.23 for the template-based method. Because Seq2Symm can rapidly predict homo-oligomer symmetries using a single sequence as input (~80,000 proteins/hour), we have applied it to 5 entire proteomes and ~3.5 million unlabeled protein sequences to identify patterns in protein assembly complexity across biological kingdoms and species.

Full text: 1 Database: MEDLINE Language: En Publication year: 2024 Document type: Article
