Improving deep learning protein monomer and complex structure prediction using DeepMSA2 with huge metagenomics data.
Nat Methods; 21(2): 279-289, 2024 Feb.
Article in En | MEDLINE | ID: mdl-38167654
ABSTRACT
Leveraging iterative alignment search through genomic and metagenome sequence databases, we report the DeepMSA2 pipeline for uniform protein single- and multichain multiple-sequence alignment (MSA) construction. Large-scale benchmarks show that DeepMSA2 MSAs can remarkably increase the accuracy of protein tertiary and quaternary structure predictions compared with current state-of-the-art methods. An integrated pipeline with DeepMSA2 participated in the most recent CASP15 experiment and created complex structural models with considerably higher quality than the AlphaFold2-Multimer server (v.2.2.0). Detailed data analyses show that the major advantage of DeepMSA2 lies in its balanced alignment search and effective model selection, and in the power of integrating huge metagenomics databases. These results demonstrate a new avenue to improve deep learning protein structure prediction through advanced MSA construction and provide additional evidence that optimization of input information to deep learning-based structure prediction methods must be considered with as much care as the design of the predictor itself.
Collection: 01-internacional
Database: MEDLINE
Main subject: Deep Learning
Type of study: Prognostic_studies / Risk_factors_studies
Language: En
Journal: Nat Methods
Journal subject: Laboratory Techniques and Procedures
Year: 2024
Type: Article
Affiliation country: United States