Sliding Window INteraction Grammar (SWING): a generalized interaction language model for peptide and protein interactions.
Omelchenko, Alisa A; Siwek, Jane C; Chhibbar, Prabal; Arshad, Sanya; Nazarali, Iliyan; Nazarali, Kiran; Rosengart, AnnaElaine; Rahimikollu, Javad; Tilstra, Jeremy; Shlomchik, Mark J; Koes, David R; Joglekar, Alok V; Das, Jishnu.
Affiliation
  • Omelchenko AA; Center for Systems Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA.
  • Siwek JC; Department of Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA.
  • Chhibbar P; Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, PA, USA.
  • Arshad S; The joint CMU-Pitt PhD program in computational biology, School of Medicine, University of Pittsburgh, PA, USA.
  • Nazarali I; Center for Systems Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA.
  • Nazarali K; Department of Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA.
  • Rosengart A; Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, PA, USA.
  • Rahimikollu J; The joint CMU-Pitt PhD program in computational biology, School of Medicine, University of Pittsburgh, PA, USA.
  • Tilstra J; Center for Systems Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA.
  • Shlomchik MJ; Department of Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA.
  • Koes DR; Integrative systems biology PhD program, School of Medicine, University of Pittsburgh, PA, USA.
  • Joglekar AV; Center for Systems Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA.
  • Das J; Department of Immunology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA.
bioRxiv; 2024 May 04.
Article in En | MEDLINE | ID: mdl-38746274
ABSTRACT
The explosion of sequence data has enabled the rapid growth of protein language models (pLMs). pLMs have now been employed in many frameworks, including variant-effect and peptide-specificity prediction. Traditionally, for protein-protein or peptide-protein interactions (PPIs), the corresponding sequences are either co-embedded and then integrated post hoc, or concatenated prior to embedding. Interestingly, no method uses a language representation of the interaction itself. We developed an interaction LM (iLM), which uses a novel language to represent interactions between protein/peptide sequences. Sliding Window Interaction Grammar (SWING) leverages differences in amino acid properties to generate an interaction vocabulary. This vocabulary is the input to an LM, followed by a supervised prediction step in which the LM's representations are used as features. SWING was first applied to predicting peptide-MHC (pMHC) interactions. SWING not only generated Class I and Class II models with predictive performance comparable to state-of-the-art approaches, but its unique Mixed Class model also jointly predicted both classes. Further, the SWING model trained only on Class I alleles was predictive for Class II, a complex prediction task not attempted by any existing approach. On de novo data, using only Class I or Class II data, SWING also accurately predicted Class II pMHC interactions in murine models of SLE (MRL/lpr model) and T1D (NOD model); these predictions were validated experimentally. To further evaluate SWING's generalizability, we tested its ability to predict the disruption of specific protein-protein interactions by missense mutations. Although modern methods like AlphaMissense and ESM1b can predict interfaces and variant effects/pathogenicity per mutation, they are unable to predict interaction-specific disruptions.
SWING accurately predicted the impact of both Mendelian mutations and population variants on PPIs. This is the first generalizable approach that can accurately predict interaction-specific disruptions by missense mutations using only sequence information. Overall, SWING is a first-in-class generalizable zero-shot iLM that learns the language of PPIs.
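
The idea of an interaction vocabulary built from differences in amino acid properties can be illustrated with a minimal Python sketch. The abstract does not specify which property, how differences are discretized, or how words are formed, so everything here is an assumption made for illustration: the Kyte-Doolittle hydropathy scale as the property, rounded absolute differences as tokens, overlapping k-mers as words, and the hypothetical function names `interaction_tokens` and `to_words`. It is not the authors' implementation.

```python
# Kyte-Doolittle hydropathy scale (assumed choice of amino acid property).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def interaction_tokens(peptide, protein):
    """Slide the peptide along the protein one residue at a time.

    At each offset, every aligned residue pair is encoded as the rounded
    absolute hydropathy difference (a single digit from 0 to 9).
    This encoding is an illustrative assumption.
    """
    tokens = []
    width = len(peptide)
    for start in range(len(protein) - width + 1):
        window = protein[start:start + width]
        for pep_res, prot_res in zip(peptide, window):
            diff = abs(HYDROPATHY[pep_res] - HYDROPATHY[prot_res])
            tokens.append(str(round(diff)))
    return tokens


def to_words(tokens, k=3):
    """Group the token stream into overlapping k-mers ("words")."""
    return ["".join(tokens[i:i + k]) for i in range(len(tokens) - k + 1)]


if __name__ == "__main__":
    words = to_words(interaction_tokens("AL", "ALA"), k=3)
    print(words)
```

Under these assumptions, the resulting words would play the role of a document fed to a language model, whose learned representations then serve as features for a separate supervised classifier, as the abstract describes.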

Full text: 1 Collections: 01-international Database: MEDLINE Language: En Journal: BioRxiv Publication year: 2024 Document type: Article Affiliation country: United States Publication country: United States