Results 1 - 7 of 7
1.
Crit Rev Biochem Mol Biol ; 53(1): 1-28, 2018 02.
Article in English | MEDLINE | ID: mdl-28976219

ABSTRACT

Prediction of protein tertiary structures from amino acid sequence and understanding the mechanisms of how proteins fold, collectively known as "the protein folding problem," have been a grand challenge in molecular biology for over half a century. Theories have been developed that provide us with an unprecedented understanding of protein folding mechanisms. However, computational simulation of protein folding is still difficult, and prediction of protein tertiary structure from amino acid sequence is an unsolved problem. Progress toward a satisfying solution has been slow due to challenges in sampling the vast conformational space and deriving sufficiently accurate energy functions. Nevertheless, several techniques and algorithms have been adopted to overcome these challenges, and the last two decades have seen exciting advances in enhanced sampling algorithms, computational power and tertiary structure prediction methodologies. This review aims to summarize these computational techniques, specifically conformational sampling algorithms and energy approximations that have been frequently used to study protein-folding mechanisms or to predict protein tertiary structures de novo. We hope that this review can serve as an overview of how the protein-folding problem can be studied computationally and, in cases where experimental approaches are prohibitive, help the researcher choose the most relevant computational approach for the problem at hand. We conclude with a summary of current challenges faced and an outlook on potential future directions.


Subject(s)
Protein Folding ; Proteins/chemistry ; Algorithms ; Animals ; Humans ; Kinetics ; Molecular Dynamics Simulation ; Protein Structure, Tertiary ; Thermodynamics
2.
Proteins ; 83(3): 547-63, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25581562

ABSTRACT

During CASP10 in summer 2012, we tested BCL::Fold for prediction of free modeling (FM) and template-based modeling (TBM) targets. BCL::Fold assembles the tertiary structure of a protein from predicted secondary structure elements (SSEs), omitting more flexible loop regions early on. This approach enables the sampling of conformational space for larger proteins with more complex topologies. In preparation for CASP11, we analyzed the quality of CASP10 models throughout the prediction pipeline to understand BCL::Fold's ability to sample the native topology and to identify native-like models by scoring and/or clustering approaches, as well as our ability to add loop regions and side chains to initial SSE-only models. The standout observation is that BCL::Fold sampled topologies de novo with a GDT_TS score > 33% for 12 of 18 test cases and with a topology score > 0.8 for 11 of 18. Despite the sampling success of BCL::Fold, significant challenges still exist in the clustering and loop generation stages of the pipeline. The clustering approach employed for model selection often failed to identify the most native-like assembly of SSEs for further refinement and submission. It was also observed that for some β-strand proteins model refinement failed because β-strands were not properly aligned to form hydrogen bonds, which removed otherwise accurate models from the pool. Further, BCL::Fold frequently samples non-natural topologies that require loop regions to pass through the center of the protein.


Subject(s)
Computational Biology/methods ; Protein Folding ; Proteins/chemistry ; Proteins/metabolism ; Sequence Analysis, Protein/methods ; Algorithms ; Computer Simulation ; Models, Molecular ; Protein Conformation
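The GDT_TS threshold quoted in the abstract above can be made concrete with a small sketch. The function below is a simplified illustration, not the CASP implementation: it assumes the model and native C-alpha traces are already superimposed and uses the commonly cited 1, 2, 4 and 8 Å cutoffs, whereas the full measure searches over many superpositions to maximize each fraction. The function name and input layout are hypothetical.

```python
import math


def gdt_ts(model, native, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS (in percent) for two pre-superimposed C-alpha traces.

    model, native: equal-length lists of (x, y, z) coordinates in Angstrom.
    Assumption: a single fixed superposition is used; no superposition search.
    """
    if len(model) != len(native) or not native:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    # Per-residue distance between corresponding atoms.
    distances = [math.dist(m, n) for m, n in zip(model, native)]
    # Fraction of residues within each distance cutoff.
    fractions = [sum(d <= c for d in distances) / len(distances) for c in cutoffs]
    # GDT_TS is the mean of the four fractions, expressed as a percentage.
    return 100.0 * sum(fractions) / len(cutoffs)


# Toy example: four residues displaced by 0.5, 1.5, 3.0 and 10.0 Angstrom.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
model = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]
print(gdt_ts(model, native))
```

Under this simplification, a model passes the abstract's "GDT_TS > 33%" criterion when the averaged fraction exceeds one third.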
3.
Structure ; 30(2): 313-320.e3, 2022 02 03.
Article in English | MEDLINE | ID: mdl-34739840

ABSTRACT

Hydrogen-deuterium exchange (HDX) measured by nuclear magnetic resonance (NMR) provides structural information for proteins relating to solvent accessibility and flexibility. While this structural information is beneficial, the data cannot be used exclusively to elucidate structures. However, the structural information provided by the HDX-NMR data can be supplemented by computational methods. In previous work, we developed an algorithm in Rosetta to predict structures using qualitative HDX-NMR data (categories of exchange rate). Here we expand on the effort, and utilize quantitative protection factors (PFs) from HDX-NMR for structure prediction. From observed correlations between PFs and solvent accessibility/flexibility measures, we present a scoring function to quantify the agreement with HDX data. Using a benchmark set of 10 proteins, an average improvement of 5.13 Å in root-mean-square deviation (RMSD) is observed for cases of inaccurate Rosetta predictions. Ultimately, seven out of 10 predictions are accurate without including HDX data, and nine out of 10 are accurate when using our PF-based HDX score.


Subject(s)
Computational Biology/methods ; Deuterium Exchange Measurement/methods ; Proteins/chemistry ; Algorithms ; Models, Molecular ; Protein Conformation
4.
Nat Commun ; 13(1): 4377, 2022 07 28.
Article in English | MEDLINE | ID: mdl-35902583

ABSTRACT

Ion mobility (IM) mass spectrometry provides structural information about protein shape and size in the form of an orientationally averaged collision cross-section (CCS_IM). While IM data have been used with various computational methods, they have not yet been utilized to predict monomeric protein structure from sequence. Here, we show that IM data can significantly improve protein structure determination using the modelling suite Rosetta. We develop the Rosetta Projection Approximation using Rough Circular Shapes (PARCS) algorithm that allows for fast and accurate prediction of CCS_IM from structure. Following successful testing of the PARCS algorithm, we use an integrative modelling approach to utilize IM data for protein structure prediction. Additionally, we propose a confidence metric that identifies near-native models in the absence of a known structure. The results of this study demonstrate the ability of IM data to consistently improve protein structure prediction.


Subject(s)
Ion Mobility Spectrometry ; Proteins ; Algorithms ; Ion Mobility Spectrometry/methods ; Mass Spectrometry/methods ; Proteins/chemistry
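To illustrate what an orientationally averaged collision cross-section means, here is a minimal sketch of a generic projection approximation. It is not the PARCS algorithm from the abstract above: it treats atoms as hard spheres, projects them onto randomly oriented planes, and estimates each shadow area by Monte Carlo sampling. The function names, the default probe radius of 1.0 Å, and the sample counts are all assumptions chosen for illustration.

```python
import math
import random


def _cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def _plane_basis(rng):
    """Return two orthonormal vectors spanning a randomly oriented plane."""
    while True:
        # A normalised Gaussian triple gives a uniformly random direction.
        d = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in d))
        if norm > 1e-12:
            break
    d = [c / norm for c in d]
    helper = (1.0, 0.0, 0.0) if abs(d[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = _cross(d, helper)
    u_norm = math.sqrt(sum(c * c for c in u))
    u = [c / u_norm for c in u]
    v = _cross(d, u)  # already unit length because d and u are orthonormal
    return u, v


def projected_ccs(atoms, probe_radius=1.0, orientations=20, samples=2000, seed=0):
    """Mean shadow area (A^2) of atoms given as (x, y, z, radius) tuples.

    Assumption: probe_radius (buffer-gas radius) is simply added to each atom.
    """
    rng = random.Random(seed)
    total = 0.0
    for _orientation in range(orientations):
        u, v = _plane_basis(rng)
        # Project every sphere onto the plane as a circle.
        circles = [(x * u[0] + y * u[1] + z * u[2],
                    x * v[0] + y * v[1] + z * v[2],
                    r + probe_radius) for x, y, z, r in atoms]
        min_x = min(cx - r for cx, _cy, r in circles)
        max_x = max(cx + r for cx, _cy, r in circles)
        min_y = min(cy - r for _cx, cy, r in circles)
        max_y = max(cy + r for _cx, cy, r in circles)
        hits = 0
        for _sample in range(samples):
            px = rng.uniform(min_x, max_x)
            py = rng.uniform(min_y, max_y)
            # A sample counts if it falls inside the union of the circles.
            if any((px - cx) ** 2 + (py - cy) ** 2 <= r * r for cx, cy, r in circles):
                hits += 1
        total += (max_x - min_x) * (max_y - min_y) * hits / samples
    return total / orientations
```

For a single sphere the estimate should approach the area of a circle, which gives a quick sanity check: `projected_ccs([(0.0, 0.0, 0.0, 2.0)], probe_radius=0.0)` should be close to 4π Å².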
5.
Gene ; 422(1-2): 41-6, 2008 Oct 01.
Article in English | MEDLINE | ID: mdl-18601985

ABSTRACT

BCL::Align is a multiple sequence alignment tool that utilizes the dynamic programming method in combination with a customizable scoring function for sequence alignment and fold recognition. The scoring function is a weighted sum of the traditional PAM and BLOSUM scoring matrices, position-specific scoring matrices output by PSI-BLAST, secondary structure predicted by a variety of methods, chemical properties, and gap penalties. By adjusting the weights, the method can be tailored for fold recognition or sequence alignment tasks at different levels of sequence identity. A Monte Carlo algorithm was used to determine optimized weight sets for sequence alignment and fold recognition that most accurately reproduced the SABmark reference alignment test set. In an evaluation of sequence alignment performance, BCL::Align ranked best in alignment accuracy (Cline score of 22.90 for sequences in the Twilight Zone) when compared with Align-m, ClustalW, T-Coffee, and MUSCLE. ROC curve analysis indicates BCL::Align's ability to correctly recognize protein folds with over 80% accuracy. The flexibility of the program allows it to be optimized for specific classes of proteins (e.g. membrane proteins) or fold families (e.g. TIM-barrel proteins). BCL::Align is free for academic use and available online at http://www.meilerlab.org/.


Subject(s)
Algorithms ; Protein Folding ; Proteins/genetics ; Sequence Alignment ; Sequence Analysis, Protein/methods ; Internet ; Monte Carlo Method ; Protein Structure, Secondary/physiology ; Proteins/chemistry
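The idea of a weighted-sum scoring function inside dynamic programming, as described in the abstract above, can be sketched as follows. This is a toy pairwise global alignment (Needleman-Wunsch), not BCL::Align itself: it combines only two hypothetical terms, residue identity and agreement of predicted secondary structure, with a linear gap penalty. All names, weights, and score values are assumptions for illustration.

```python
def weighted_pair_score(a, b, ss_a, ss_b, weights):
    """Weighted sum of two toy terms for aligning residue a with residue b."""
    identity = 1.0 if a == b else -1.0          # stand-in for a PAM/BLOSUM term
    ss_match = 1.0 if ss_a == ss_b else 0.0     # predicted secondary structure term
    return (weights["identity"] * identity
            + weights["secondary_structure"] * ss_match)


def global_alignment_score(seq1, ss1, seq2, ss2, weights, gap=-2.0):
    """Optimal global alignment score under the weighted scoring function."""
    rows, cols = len(seq1) + 1, len(seq2) + 1
    dp = [[0.0] * cols for _ in range(rows)]
    # Leading gaps along the first column and first row.
    for i in range(1, rows):
        dp[i][0] = i * gap
    for j in range(1, cols):
        dp[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            match = dp[i - 1][j - 1] + weighted_pair_score(
                seq1[i - 1], seq2[j - 1], ss1[i - 1], ss2[j - 1], weights)
            dp[i][j] = max(match, dp[i - 1][j] + gap, dp[i][j - 1] + gap)
    return dp[-1][-1]


weights = {"identity": 1.0, "secondary_structure": 0.5}
print(global_alignment_score("ACD", "HHH", "ACD", "HHH", weights))
```

Changing the entries of `weights` shifts the balance between the terms, which is the knob the abstract describes tuning for fold recognition versus sequence alignment.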
6.
PLoS One ; 12(5): e0177866, 2017.
Article in English | MEDLINE | ID: mdl-28542325

ABSTRACT

De novo membrane protein structure prediction is limited to small proteins because the conformational search space expands quickly with length. Long-range contacts (24+ amino acid separation), residue positions distant in sequence but in close proximity in the structure, are arguably the most effective way to restrict this conformational space. Inverse methods for co-evolutionary analysis predict a global set of position-pair couplings that best explain the observed amino acid co-occurrences, thus distinguishing between evolutionarily explained co-variances and those arising from spurious transitive effects. Here, we show that applying machine learning approaches and custom descriptors improves evolutionary contact prediction accuracy, resulting in an improvement of average precision by 6 percentage points for the top 1L non-local contacts. Further, we demonstrate that predicted contacts improve protein folding with BCL::Fold. The mean RMSD100 metric for the top 10 models folded was reduced by an average of 2 Å for a benchmark of 25 membrane proteins.


Subject(s)
Machine Learning ; Membrane Proteins/metabolism ; Models, Molecular ; Protein Folding ; Protein Structure, Secondary/physiology ; Algorithms ; Amino Acid Sequence ; Humans
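The "precision for the top 1L contacts" figure in the abstract above can be illustrated with a short sketch. The function below is a hypothetical helper, not the authors' evaluation code: it keeps predicted pairs whose sequence separation is at least 24 residues (the long-range definition given in the abstract, used here as an assumed threshold for "non-local"), takes the L highest-scoring ones, and reports the fraction that are true contacts.

```python
def top_l_long_range_precision(predicted, native_contacts, length, min_separation=24):
    """Precision of the top-L long-range predicted contacts.

    predicted: list of (i, j, score) tuples with residue indices i and j.
    native_contacts: set of (i, j) pairs with i < j that are in contact.
    length: protein length L; the top L predictions are evaluated.
    """
    # Keep only pairs that are far apart in sequence.
    long_range = [(i, j, s) for i, j, s in predicted if abs(i - j) >= min_separation]
    # Highest-confidence predictions first.
    long_range.sort(key=lambda contact: contact[2], reverse=True)
    top = long_range[:length]
    if not top:
        return 0.0
    hits = sum((min(i, j), max(i, j)) in native_contacts for i, j, _score in top)
    return hits / len(top)


# Toy example with L = 2: the short-range pair (3, 5) is ignored.
predicted = [(1, 30, 0.9), (2, 40, 0.8), (3, 5, 0.95), (4, 50, 0.1)]
native = {(1, 30), (4, 50)}
print(top_l_long_range_precision(predicted, native, length=2))
```

An improvement of "6 percentage points" then corresponds to this value rising by 0.06 on average across the benchmark.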
7.
PLoS One ; 11(4): e0152517, 2016.
Article in English | MEDLINE | ID: mdl-27046050

ABSTRACT

In silico prediction of a protein's tertiary structure remains an unsolved problem. The community-wide Critical Assessment of Protein Structure Prediction (CASP) experiment provides a double-blind study to evaluate improvements in protein structure prediction algorithms. We developed a protein structure prediction pipeline employing a three-stage approach, consisting of low-resolution topology search, high-resolution refinement, and molecular dynamics simulation, to predict the tertiary structure of proteins from the primary structure alone or including distance restraints from predicted residue-residue contacts, nuclear magnetic resonance (NMR) nuclear Overhauser effect (NOE) experiments, or mass spectrometry (MS) cross-linking (XL) data. The protein structure prediction pipeline was evaluated in the CASP11 experiment on twenty regular protein targets as well as thirty-three 'assisted' protein targets, which also had distance restraints available. Although the low-resolution topology search module was able to sample models with a global distance test total score (GDT_TS) value greater than 30% for twelve out of twenty proteins, it was frequently not possible to select the most accurate models for refinement, resulting in a general decay of model quality over the course of the prediction pipeline. In this study, we provide a detailed overall analysis, study one target protein in more detail as it travels through the protein structure prediction pipeline, and evaluate the impact of limited experimental data.


Subject(s)
Protein Structure, Tertiary ; Magnetic Resonance Spectroscopy ; Mass Spectrometry ; Molecular Dynamics Simulation ; Protein Folding