A random forest model for predicting exosomal proteins using evolutionary information and motifs.
Arora, Akanksha; Patiyal, Sumeet; Sharma, Neelam; Devi, Naorem Leimarembi; Kaur, Dashleen; Raghava, Gajendra P S.
  • Arora A; Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India.
  • Patiyal S; Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India.
  • Sharma N; Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India.
  • Devi NL; Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India.
  • Kaur D; Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India.
  • Raghava GPS; Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India.
Proteomics; 24(6): e2300231, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37525341
ABSTRACT
Non-invasive diagnostics and therapies are crucial to spare patients from painful procedures. Exosomal proteins can serve as important biomarkers for such advancements. In this study, we attempted to build a model to predict exosomal proteins. All models were trained, tested, and evaluated on a non-redundant dataset comprising 2831 exosomal and 2831 non-exosomal proteins, in which no two proteins share more than 40% similarity. Initially, the standard similarity-based method, Basic Local Alignment Search Tool (BLAST), was used to predict exosomal proteins; it failed because of the low level of similarity in the dataset. To overcome this challenge, machine learning (ML) based models were developed using compositional and evolutionary features of proteins, achieving an area under the receiver operating characteristic curve (AUROC) of 0.73. Our analysis also indicated that exosomal proteins contain a variety of sequence-based motifs that can be used to predict them. Hence, we developed a hybrid method combining motif-based and ML-based approaches for predicting exosomal proteins, achieving a maximum AUROC of 0.85 and a Matthews correlation coefficient (MCC) of 0.56 on an independent dataset. This hybrid model performs better than presently available methods when assessed on an independent dataset. A web server and standalone software, ExoProPred (https://webs.iiitd.edu.in/raghava/exopropred/), have been created to help scientists predict and discover exosomal proteins and find the functional motifs present in them.
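The abstract does not say how the motif-based and ML-based predictions are combined, so the following is only a minimal sketch of one plausible scheme, not the ExoProPred implementation. It assumes the ML model (for example, a random forest) has already produced a probability, and it shifts that probability by a fixed weight when a class-specific motif is found. The function names, the motif strings, and the weight of 0.5 are all hypothetical and chosen for illustration. The MCC function follows the standard definition.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Return the fraction of each standard residue in seq (a 20-value vector)."""
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    if not residues:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(a) / len(residues) for a in AMINO_ACIDS]


def hybrid_score(ml_prob, seq, exo_motifs, non_exo_motifs, weight=0.5):
    """Combine an ML probability with motif hits (assumed additive scheme).

    ml_prob        -- probability from any trained classifier (assumption)
    exo_motifs     -- hypothetical motifs enriched in exosomal proteins
    non_exo_motifs -- hypothetical motifs enriched in non-exosomal proteins
    """
    score = ml_prob
    if any(m in seq for m in exo_motifs):
        score += weight
    if any(m in seq for m in non_exo_motifs):
        score -= weight
    return score


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

In this sketch a sequence with a borderline ML probability is pushed toward the exosomal class only when an exosome-associated motif is present. The composition vector stands in for the compositional features mentioned above; evolutionary features such as profile-based descriptors would need an external alignment tool and are not shown.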

Full text: 1 Database: MEDLINE Main subject: Sequence Analysis, Protein / Random Forest Study type: Clinical_trials / Prognostic_studies / Risk_factors_studies Limits: Humans Language: En Year: 2024 Document type: Article
