Searching for Structure: Characterizing the Protein Conformational Landscape with Clustering-Based Algorithms.
Macke, Amanda C; Stump, Jacob E; Kelly, Maria S; Rowley, Jamie; Herath, Vageesha; Mullen, Sarah; Dima, Ruxandra I.
  • Macke AC; Department of Chemistry, University of Cincinnati, Cincinnati, Ohio 45221, United States.
  • Stump JE; Department of Chemistry, University of Cincinnati, Cincinnati, Ohio 45221, United States.
  • Kelly MS; Department of Chemistry, University of Cincinnati, Cincinnati, Ohio 45221, United States.
  • Rowley J; Department of Chemistry, University of Cincinnati, Cincinnati, Ohio 45221, United States.
  • Herath V; Department of Chemistry, University of Cincinnati, Cincinnati, Ohio 45221, United States.
  • Mullen S; Department of Chemistry, Emory University, Atlanta, Georgia 30322, United States.
  • Dima RI; Department of Chemistry, The College of Wooster, Wooster, Ohio 44691, United States.
J Chem Inf Model; 64(2): 470-482, 2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38173388
ABSTRACT
Identifying and characterizing the main conformations in a protein population is a challenging and inherently high-dimensional problem. Here, we evaluate the performance of the Secondary sTructural Ensembles with machine LeArning (StELa) double-clustering method, which clusters protein structures based on the relationship between the φ and ψ backbone dihedral angles and the secondary structure of the protein, thus focusing on the local properties of protein structures. States are first classified as vectors composed of the indices of the clusters that arise naturally from the Ramachandran plot; hierarchical clustering of these vectors then allows the main features of the corresponding free energy landscape (FEL) to be identified. We compare the performance of StELa with the established root-mean-square-deviation (RMSD)-based clustering algorithm, which focuses on global properties of protein structures, and with Combinatorial Averaged Transient Structure (CATS), a clustering method based on distributions of the φ and ψ dihedral angle coordinates. Using ensembles of conformations from molecular dynamics simulations of intrinsically disordered proteins (IDPs) of various lengths (tau protein fragments) or short fragments from a globular protein, we show that, of the three methods, StELa is the one that identifies many of the minima, and the relevant energy states around those minima, in the corresponding FELs. In contrast, the RMSD-based algorithm yields a large number of clusters that usually cover most of the FEL and is therefore unable to distinguish between states, while CATS does not sample the FELs well for long IDPs and fragments from globular proteins.
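The two-stage idea described in the abstract (label each residue by its Ramachandran cluster, then hierarchically cluster the resulting index vectors) can be illustrated with a small sketch. This is not the StELa implementation. The fixed region boundaries in `ramachandran_region`, the Hamming distance, the average-linkage rule, the `cutoff` value, and the toy angles are all assumptions made for illustration; the abstract does not specify how the Ramachandran clusters or the hierarchical step are computed.

```python
from itertools import combinations


def ramachandran_region(phi, psi):
    """Assign one residue to a coarse Ramachandran region (angles in degrees).

    ASSUMPTION: fixed, hand-picked boundaries stand in for the data-driven
    Ramachandran clusters mentioned in the abstract.
    """
    if phi >= 0:
        return 2  # left-handed region
    if -120 <= psi < 50:
        return 0  # alpha-helical basin
    return 1  # beta / extended basin


def encode(conformation):
    """Stage 1: turn a list of (phi, psi) pairs into a vector of region indices."""
    return tuple(ramachandran_region(phi, psi) for phi, psi in conformation)


def hamming(u, v):
    """Fraction of residues whose region index differs between two vectors."""
    return sum(a != b for a, b in zip(u, v)) / len(u)


def agglomerate(vectors, cutoff):
    """Stage 2: average-linkage agglomerative clustering of index vectors.

    Repeatedly merges the two closest clusters and stops once the closest
    pair is farther apart than `cutoff` (a hypothetical tuning parameter).
    Returns clusters as lists of indices into `vectors`.
    """
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        total = sum(hamming(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if linkage(clusters[i], clusters[j]) > cutoff:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return clusters


# Toy ensemble: four 4-residue conformations (made-up angles, not simulation data).
ensemble = [
    [(-60, -45), (-60, -45), (-60, -45), (-60, -45)],      # helical
    [(-65, -40), (-60, -45), (-58, -47), (-120, 130)],     # helical, frayed end
    [(-120, 130), (-120, 130), (-120, 130), (-120, 130)],  # extended
    [(-130, 140), (-120, 130), (-110, 125), (60, 45)],     # extended, one turn
]

vectors = [encode(conf) for conf in ensemble]
clusters = agglomerate(vectors, cutoff=0.5)
print(vectors)
print(clusters)
```

On this toy ensemble the two mostly helical conformations end up in one cluster and the two mostly extended ones in another. For real trajectories, the dihedral angles would come from a trajectory analysis tool, and a library routine for hierarchical clustering would be preferable to this quadratic pure-Python loop.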
Subject(s)

Full text: 1 Database: MEDLINE Main subject: Molecular Dynamics Simulation / Intrinsically Disordered Proteins Language: English Year: 2024 Document type: Article