FreeProtMap: waiting-free prediction method for protein distance map.

Huang, Jiajian; Li, Jinpeng; Chen, Qinchang; Wang, Xia; Chen, Guangyong; Tang, Jin

Affiliation

Huang J; Zhejiang Lab, Zhejiang, China. jiajianapply@gmail.com.
Li J; Dalian University of Technology, Liaoning, China. jiajianapply@gmail.com.
Chen Q; Zhejiang Lab, Zhejiang, China.
Wang X; The Chinese University of Hong Kong, Hong Kong, China.
Chen G; Zhejiang Lab, Zhejiang, China.
Tang J; Zhejiang Lab, Zhejiang, China. wxia2005@163.com.

BMC Bioinformatics ; 25(1): 176, 2024 May 04.

Article in English | MEDLINE | ID: mdl-38704533

ABSTRACT

BACKGROUND:

Protein residue-residue distance maps are used for remote homology detection, protein information estimation, and protein structure research. However, existing prediction approaches are time-consuming, and with hundreds of millions of proteins discovered each year, a rapid and reliable method for predicting residue-residue distances is needed. Moreover, because many proteins lack known homologous sequences, a waiting-free, alignment-free deep learning method is required.

RESULT:

In this study, we propose a learning framework named FreeProtMap. For protein representation processing, the group pooling proposed in FreeProtMap effectively mitigates the problems caused by high-dimensional sparseness in protein representations. For the model structure, we make several careful design choices. First, the model is designed around the locality of protein structures and triangle-inequality distance constraints to improve prediction accuracy. Second, inference speed is improved through additive attention and a lightweight design. Third, generalization ability is improved through bottlenecks and a neural network block named local microformer. As a result, FreeProtMap can predict protein residue-residue distances in tens of milliseconds and has higher precision than the best structure prediction method.
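The triangle-inequality constraint mentioned above can be illustrated with a small sketch. This is not the FreeProtMap implementation, which the abstract does not describe in detail; it only shows, under simple assumptions, what it means for a residue-residue distance map to satisfy d(i, j) <= d(i, k) + d(k, j). The function names `distance_map` and `triangle_violations`, the toy coordinates, and the tolerance value are all hypothetical choices made for illustration.

```python
import math
from itertools import permutations


def distance_map(coords):
    """Pairwise Euclidean distances between residue coordinates.

    `coords` is a list of (x, y, z) tuples, e.g. one point per residue.
    """
    return [[math.dist(a, b) for b in coords] for a in coords]


def triangle_violations(dmap, tol=1e-9):
    """Return all index triples (i, j, k) with d[i][j] > d[i][k] + d[k][j] + tol."""
    n = len(dmap)
    return [
        (i, j, k)
        for i, j, k in permutations(range(n), 3)
        if dmap[i][j] > dmap[i][k] + dmap[k][j] + tol
    ]


# Toy example: four points on a square with 3.8 Angstrom sides (illustrative only).
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
true_map = distance_map(coords)

# A map derived from real coordinates always satisfies the constraint.
print(triangle_violations(true_map))

# A "predicted" map with one implausible entry breaks it.
bad_map = [row[:] for row in true_map]
bad_map[0][2] = bad_map[2][0] = 20.0
print(triangle_violations(bad_map))
```

A check like this is cubic in the number of residues, so it is suited to validation or illustration rather than to the fast inference path the abstract describes.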

CONCLUSION:

Several groups of comparative and ablation experiments verify the effectiveness of these designs. The results demonstrate that FreeProtMap significantly outperforms other state-of-the-art methods in the accuracy of protein residue-residue distance prediction, which benefits a wide range of protein research. Notably, FreeProtMap could be used to scan all proteins discovered each year and find structurally similar proteins in a short time, because computing structural similarity from distance maps is much less time-consuming than algorithms based on 3D structures.
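To make the last point concrete, the sketch below compares two distance maps directly using the distance-map RMSD (dRMSD), a standard measure that needs no 3D superposition step. The abstract does not say which similarity measure the authors have in mind, so dRMSD is swapped in here purely as an example; the function name `drmsd` and the toy maps are hypothetical, and the sketch assumes both proteins have the same number of residues.

```python
import math


def drmsd(map_a, map_b):
    """Root-mean-square difference over the upper triangle of two distance maps.

    Both maps must be square, of equal size, and contain at least two residues.
    """
    n = len(map_a)
    if n != len(map_b) or n < 2:
        raise ValueError("maps must have the same size, with at least two residues")
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    squared = sum((map_a[i][j] - map_b[i][j]) ** 2 for i, j in pairs)
    return math.sqrt(squared / len(pairs))


# Toy two-residue maps (distances in Angstrom, illustrative only).
map_a = [[0.0, 3.0], [3.0, 0.0]]
map_b = [[0.0, 5.0], [5.0, 0.0]]

print(drmsd(map_a, map_a))  # identical maps give 0.0
print(drmsd(map_a, map_b))  # maps differing by 2 Angstrom give 2.0
```

Because the comparison works on the maps alone, it involves only elementwise arithmetic over residue pairs, with no search for an optimal rotation and translation.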

Subject(s)

Proteins; Proteins/chemistry; Computational Biology/methods; Databases, Protein; Protein Conformation; Algorithms; Sequence Analysis, Protein/methods; Neural Networks, Computer

Keywords

Feature representation; Residue-residue distance prediction; Waiting-free

Database: MEDLINE; Main subject: Proteins; Language: English; Journal: BMC Bioinformatics; Journal subject: Medical Informatics; Year: 2024; Document type: Article; Country of affiliation: China
