ProteoMutaMetrics: machine learning approaches for solute carrier family 6 mutation pathogenicity prediction.
Huang, Jiahui; Osthushenrich, Tanja; MacNamara, Aidan; Mälarstig, Anders; Brocchetti, Silvia; Bradberry, Samuel; Scarabottolo, Lia; Ferrada, Evandro; Sosnin, Sergey; Digles, Daniela; Superti-Furga, Giulio; Ecker, Gerhard F.
Affiliation
  • Huang J; University of Vienna, Department of Pharmaceutical Sciences Vienna Austria gerhard.f.ecker@univie.ac.at.
  • Osthushenrich T; Bayer AG, Division Pharmaceuticals, Biomedical Data Science II Wuppertal Germany.
  • MacNamara A; Bayer AG, Division Pharmaceuticals, Biomedical Data Science II Wuppertal Germany.
  • Mälarstig A; Emerging Science & Innovation, Pfizer Worldwide Research, Development and Medical Cambridge MA USA.
  • Brocchetti S; Axxam SpA Bresso Milan Italy.
  • Bradberry S; Axxam SpA Bresso Milan Italy.
  • Scarabottolo L; Axxam SpA Bresso Milan Italy.
  • Ferrada E; CeMM, Research Center for Molecular Medicine of the Austrian Academy of Sciences Vienna Austria.
  • Sosnin S; University of Vienna, Department of Pharmaceutical Sciences Vienna Austria gerhard.f.ecker@univie.ac.at.
  • Digles D; University of Vienna, Department of Pharmaceutical Sciences Vienna Austria gerhard.f.ecker@univie.ac.at.
  • Superti-Furga G; CeMM, Research Center for Molecular Medicine of the Austrian Academy of Sciences Vienna Austria.
  • Ecker GF; University of Vienna, Department of Pharmaceutical Sciences Vienna Austria gerhard.f.ecker@univie.ac.at.
RSC Adv ; 14(19): 13083-13094, 2024 Apr 22.
Article in En | MEDLINE | ID: mdl-38655474
ABSTRACT
The solute carrier transporter family 6 (SLC6) is of key interest because of its members' critical role in the transport of small amino acids and amino acid-like molecules. Their dysfunction is strongly associated with human diseases, including schizophrenia, depression, and Parkinson's disease. Linking single point mutations to disease may provide insights into the structure-function relationship of these transporters. This work aimed to develop a computational model for predicting the potential pathogenic effect of single point mutations in the SLC6 family. Missense mutation data were retrieved from UniProt, LitVar, and ClinVar, covering multiple protein-coding transcripts. As the encoding approach, amino acid descriptors were used to calculate the average sequence properties of both the original and the mutated sequences. In addition to the full-sequence calculation, the sequences were cut into twelve domains, defined according to the transmembrane domains of the SLC6 transporters, to analyse each region's contribution to the pathogenicity prediction. Subsequently, several classification models were built, namely Support Vector Machine (SVM), Logistic Regression (LR), Random Forest (RF), and Extreme Gradient Boosting (XGBoost), with hyperparameters optimized through grid search. Model performance was estimated with repeated stratified k-fold cross-validation. The accuracy values of the generated models range from 0.72 to 0.80. Analysis of feature importance indicates that mutations in distinct regions of SLC6 transporters are associated with an increased risk of pathogenicity. When the model was applied to an independent validation set, accuracy dropped to about 0.6 on average, with high precision but low sensitivity.
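The encoding step described in the abstract (averaging amino acid descriptors over the original and the mutated sequence, for the full sequence and for each domain) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the descriptors used, so the Kyte-Doolittle hydropathy scale stands in as an assumption; the toy sequence, the two segment boundaries, and the function names `apply_mutation`, `mean_descriptor`, and `encode` are hypothetical.

```python
from statistics import fmean

# Kyte-Doolittle hydropathy values, used only as a stand-in descriptor.
# Assumption: the abstract does not state which descriptors were used.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def apply_mutation(seq, pos, ref, alt):
    """Return seq with the residue at 1-based position pos changed from ref to alt."""
    if seq[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {seq[pos - 1]}")
    return seq[:pos - 1] + alt + seq[pos:]


def mean_descriptor(seq):
    """Average descriptor value over all residues of seq."""
    return fmean(KD[aa] for aa in seq)


def encode(seq, pos, ref, alt, segments):
    """Build a feature vector of wild-type and mutant averages.

    The first two features cover the full sequence; each (start, end)
    pair in segments (1-based, inclusive) adds two more features.
    """
    mutant = apply_mutation(seq, pos, ref, alt)
    features = [mean_descriptor(seq), mean_descriptor(mutant)]
    for start, end in segments:
        features.append(mean_descriptor(seq[start - 1:end]))
        features.append(mean_descriptor(mutant[start - 1:end]))
    return features


# Hypothetical toy input: a 10-residue sequence split into two segments.
# The study itself uses twelve domains derived from the transmembrane regions.
toy_seq = "MALGIVLAGK"
toy_segments = [(1, 5), (6, 10)]
features = encode(toy_seq, 3, "L", "D", toy_segments)
print(features)
```

In this sketch, only the segment that contains the mutated position yields different wild-type and mutant averages. That is one way a per-domain encoding could let a classifier relate pathogenicity to particular regions of the transporter.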

Full text: 1 Collection: 01-international Database: MEDLINE Language: En Journal: RSC Adv Year: 2024 Document type: Article
