SAAMBE-SEQ: a sequence-based method for predicting mutation effect on protein-protein binding affinity.
Li, Gen; Pahari, Swagata; Murthy, Adithya Krishna; Liang, Siqi; Fragoza, Robert; Yu, Haiyuan; Alexov, Emil.
Affiliation
  • Li G; Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA.
  • Pahari S; Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA.
  • Murthy AK; Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA.
  • Liang S; Department of Computational Biology, Cornell University, Ithaca, NY 14850, USA.
  • Fragoza R; Department of Computational Biology, Cornell University, Ithaca, NY 14850, USA.
  • Yu H; Department of Computational Biology, Cornell University, Ithaca, NY 14850, USA.
  • Alexov E; Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA.
Bioinformatics; 37(7): 992-999, 2021 May 17.
Article in English | MEDLINE | ID: mdl-32866236
ABSTRACT
MOTIVATION:

The vast majority of human genetic disorders are associated with mutations that affect protein-protein interactions by altering wild-type binding affinity. It is therefore important to assess the effect of mutations on protein-protein binding free energy to assist the development of therapeutic solutions. Currently, the most popular approaches rely on structural information to make their predictions, which prevents them from being applied to genome-scale investigations. Indeed, with the progress of genomic sequencing, researchers frequently need to assess the effect of mutations for which no structure is available.

RESULTS:

Here, we report SAAMBE-SEQ, a Gradient Boosting Decision Tree machine learning algorithm that is completely sequence-based and requires no structural information. SAAMBE-SEQ uses 80 features representing evolutionary information, sequence-based features and the change in physical properties upon mutation at the mutation site. The approach achieves a Pearson correlation coefficient (PCC) of 0.83 in 5-fold cross-validation in a benchmarking test against experimentally determined binding free energy changes (ΔΔG). Further, a blind test set (no-STRUC) was compiled by collecting experimental ΔΔG values upon mutation for protein complexes for which no structure is available; benchmarking SAAMBE-SEQ on it yields a PCC in the range of 0.37-0.46. The accuracy of SAAMBE-SEQ is found to be better than or comparable to that of the most advanced structure-based methods. SAAMBE-SEQ is very fast, is available as a webserver and as stand-alone code, and uses only sequence information; it is thus applicable to genome-scale investigations of the effect of mutations on protein-protein interactions.

AVAILABILITY AND IMPLEMENTATION:

SAAMBE-SEQ is available at http://compbio.clemson.edu/saambe_webserver/indexSEQ.php#started.

SUPPLEMENTARY INFORMATION:

Supplementary data are available at Bioinformatics online.
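
The evaluation protocol named in the abstract, a gradient boosting regressor scored by PCC under 5-fold cross-validation, can be illustrated with a small pure-Python sketch. This is not the authors' implementation: the one-split trees ("stumps"), the hyperparameters (`n_trees`, `lr`), the three synthetic features and the synthetic target are all assumptions chosen only to keep the example self-contained, and none of the numbers it produces relate to SAAMBE-SEQ or to real ΔΔG data.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def fit_stump(X, residuals):
    """Find the single-feature threshold split that minimises squared error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted(set(row[j] for row in X)):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            err = (sum((r - ml) ** 2 for r in left)
                   + sum((r - mr) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, j, t, ml, mr)
    return best[1:] if best else None


def fit_gbdt(X, y, n_trees=40, lr=0.1):
    """Gradient boosting for squared loss: each stump fits the residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        j, t, ml, mr = stump
        stumps.append(stump)
        pred = [p + lr * (ml if row[j] <= t else mr)
                for p, row in zip(pred, X)]
    return base, lr, stumps


def predict(model, row):
    """Sum the base value and the shrunken contribution of every stump."""
    base, lr, stumps = model
    return base + sum(lr * (ml if row[j] <= t else mr)
                      for j, t, ml, mr in stumps)


def cross_val_pcc(X, y, k=5, seed=0):
    """Pool out-of-fold predictions over k folds and return their PCC."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    oof = [0.0] * len(y)
    for fold in folds:
        held = set(fold)
        train = [i for i in idx if i not in held]
        model = fit_gbdt([X[i] for i in train], [y[i] for i in train])
        for i in fold:
            oof[i] = predict(model, X[i])
    return pearson(oof, y)


# Synthetic stand-in data: 3 made-up features and a noisy target.
rng = random.Random(1)
X = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(60)]
y = [2.0 * row[0] - row[1] + rng.gauss(0, 0.2) for row in X]
pcc = cross_val_pcc(X, y)
print(f"5-fold cross-validated PCC: {pcc:.2f}")
```

Pooling the out-of-fold predictions before computing a single PCC is one common convention; averaging per-fold PCCs is another, and the abstract does not say which was used.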

Full text: 1 | Collection: 01-international | Database: MEDLINE | Main subject: Software / Proteins | Type of study: Prognostic_studies / Risk_factors_studies | Limits: Humans | Language: English | Journal: Bioinformatics | Journal subject: MEDICAL INFORMATICS | Year: 2021 | Document type: Article | Affiliation country: United States
