A protein pre-trained model-based approach for the identification of the liquid-liquid phase separation (LLPS) proteins.
Ahmed, Zahoor; Shahzadi, Kiran; Temesgen, Sebu Aboma; Ahmad, Basharat; Chen, Xiang; Ning, Lin; Zulfiqar, Hasan; Lin, Hao; Jin, Yan-Ting.
Affiliations
  • Ahmed Z; Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China, Huzhou, China. Electronic address: raz_45@hotmail.com.
  • Shahzadi K; Department of Biotechnology, Women University of Azad Jammu and Kashmir, Bagh, Azad Kashmir, Pakistan. Electronic address: shahzadibio786@gmail.com.
  • Temesgen SA; School of Life Science and Technology, University of Electronic Science and Technology of China, 611731 Chengdu, China. Electronic address: 202214140105@std.uestc.edu.cn.
  • Ahmad B; School of Life Science and Technology, University of Electronic Science and Technology of China, 611731 Chengdu, China. Electronic address: basharatahmad3674@gmail.com.
  • Chen X; Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China, Huzhou, China. Electronic address: 21728002@zju.edu.cn.
  • Ning L; School of Life Science and Technology, University of Electronic Science and Technology of China, 611731 Chengdu, China; School of Healthcare Technology, Chengdu Neusoft University, Chengdu, China. Electronic address: ninglin@nsu.edu.cn.
  • Zulfiqar H; Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China, Huzhou, China. Electronic address: hasanzulfiqar66@gmail.com.
  • Lin H; Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China, Huzhou, China. Electronic address: hlin@uestc.edu.cn.
  • Jin YT; School of Life Science and Technology, University of Electronic Science and Technology of China, 611731 Chengdu, China. Electronic address: jinyanting@uestc.edu.cn.
Int J Biol Macromol ; 277(Pt 4): 134146, 2024 Oct.
Article in English | MEDLINE | ID: mdl-39067723
ABSTRACT
Liquid-liquid phase separation (LLPS) regulates many biological processes, including RNA metabolism, chromatin rearrangement, and signal transduction. Aberrant LLPS can lead to serious diseases, so identifying LLPS proteins is crucial. Biochemistry-based methods for identifying LLPS proteins are costly, time-consuming, and laborious. Artificial intelligence-based approaches, by contrast, are fast and cost-effective and can be a better alternative. Previous methods employed word2vec in conjunction with machine learning or deep learning algorithms. Although word2vec captures word semantics and relationships, it might not capture features relevant to protein classification, such as physicochemical properties, evolutionary relationships, or structural features. Other studies often trained their models on a limited set of features, including planar π-contact frequency, π-π, and β-pairing propensities. To overcome these shortcomings, this study first constructed a reliable dataset of 1206 protein sequences, comprising 603 LLPS and 603 non-LLPS sequences. A computational model was then proposed to identify LLPS proteins efficiently by capturing the semantic information of protein sequences directly, using an ESM2-36 pre-trained model based on the transformer architecture in conjunction with a convolutional neural network. The model achieved an accuracy of 85.68% on the training data and 89.67% on the test data, surpassing the accuracy of previous studies. This performance demonstrates the potential of our computational method as an efficient alternative for identifying LLPS proteins.
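The balanced dataset and the accuracy figures reported above can be illustrated with a minimal Python sketch. The abstract does not say how the training and test sets were divided, so the 80/20 stratified split, the random seed, and the placeholder sequence identifiers below are assumptions made purely for illustration. Only the class sizes (603 LLPS and 603 non-LLPS) come from the abstract.

```python
import random


def stratified_split(positives, negatives, test_fraction=0.2, seed=0):
    """Split each class separately so train and test keep the same class ratio.

    test_fraction and seed are illustrative assumptions, not values from the study.
    """
    rng = random.Random(seed)
    train, test = [], []
    for label, items in ((1, positives), (0, negatives)):
        shuffled = list(items)
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_fraction)
        test += [(seq_id, label) for seq_id in shuffled[:n_test]]
        train += [(seq_id, label) for seq_id in shuffled[n_test:]]
    return train, test


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical identifiers standing in for the 603 LLPS and 603 non-LLPS sequences.
llps_ids = [f"llps_{i}" for i in range(603)]
non_llps_ids = [f"non_llps_{i}" for i in range(603)]

train, test = stratified_split(llps_ids, non_llps_ids)
print(len(train), len(test))
```

Because the two classes are the same size, a classifier that guesses at random would score about 50% on this metric, which gives a baseline for reading the reported accuracies.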

Full text: 1 Database: MEDLINE Main subject: Proteins Language: En Journal: Int J Biol Macromol Year: 2024 Document type: Article
