Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 4 de 4
Filtrar
Mais filtros

Base de dados
Tipo de documento
País de afiliação
Intervalo de ano de publicação
1.
Bioinformatics ; 29(5): 580-7, 2013 Mar 01.
Artigo em Inglês | MEDLINE | ID: mdl-23314126

RESUMO

MOTIVATION: The problem of ab initio protein folding is one of the most difficult in modern computational biology. The prediction of residue contacts within a protein provides a more tractable immediate step. Recently introduced maximum entropy-based correlated mutation measures (CMMs), such as direct information, have been successful in predicting residue contacts. However, most correlated mutation studies focus on proteins that have large good-quality multiple sequence alignments (MSA) because the power of correlated mutation analysis falls as the size of the MSA decreases. However, even with small autogenerated MSAs, maximum entropy-based CMMs contain information. To make use of this information, in this article, we focus not on general residue contacts but contacts between residues in ß-sheets. The strong constraints and prior knowledge associated with ß-contacts are ideally suited for prediction using a method that incorporates an often noisy CMM. RESULTS: Using contrastive divergence, a statistical machine learning technique, we have calculated a maximum entropy-based CMM. We have integrated this measure with a new probabilistic model for ß-contact prediction, which is used to predict both residue- and strand-level contacts. Using our model on a standard non-redundant dataset, we significantly outperform a 2D recurrent neural network architecture, achieving a 5% improvement in true positives at the 5% false-positive rate at the residue level. At the strand level, our approach is competitive with the state-of-the-art single methods achieving precision of 61.0% and recall of 55.4%, while not requiring residue solvent accessibility as an input. AVAILABILITY: http://www2.warwick.ac.uk/fac/sci/systemsbiology/research/software/


Assuntos
Inteligência Artificial , Modelos Estatísticos , Estrutura Secundária de Proteína , Entropia , Modelos Moleculares , Mutação , Redes Neurais de Computação , Dobramento de Proteína , Proteínas/química , Proteínas/genética , Alinhamento de Sequência , Análise de Sequência de Proteína
2.
Biophys J ; 102(4): 878-86, 2012 Feb 22.
Artigo em Inglês | MEDLINE | ID: mdl-22385859

RESUMO

Nested sampling is a Bayesian sampling technique developed to explore probability distributions localized in an exponentially small area of the parameter space. The algorithm provides both posterior samples and an estimate of the evidence (marginal likelihood) of the model. The nested sampling algorithm also provides an efficient way to calculate free energies and the expectation value of thermodynamic observables at any temperature, through a simple post processing of the output. Previous applications of the algorithm have yielded large efficiency gains over other sampling techniques, including parallel tempering. In this article, we describe a parallel implementation of the nested sampling algorithm and its application to the problem of protein folding in a Go-like force field of empirical potentials that were designed to stabilize secondary structure elements in room-temperature simulations. We demonstrate the method by conducting folding simulations on a number of small proteins that are commonly used for testing protein-folding procedures. A topological analysis of the posterior samples is performed to produce energy landscape charts, which give a high-level description of the potential energy surface for the protein folding simulations. These charts provide qualitative insights into both the folding process and the nature of the model and force field used.


Assuntos
Modelos Moleculares , Dobramento de Proteína , Proteínas de Bactérias/química , Teorema de Bayes , Peptídeos/química , Estrutura Secundária de Proteína , Termodinâmica
3.
PLoS One ; 12(2): e0169356, 2017.
Artigo em Inglês | MEDLINE | ID: mdl-28166227

RESUMO

Evolutionary information stored in multiple sequence alignments (MSAs) has been used to identify the interaction interface of protein complexes, by measuring either co-conservation or co-mutation of amino acid residues across the interface. Recently, maximum entropy related correlated mutation measures (CMMs) such as direct information, decoupling direct from indirect interactions, have been developed to identify residue pairs interacting across the protein complex interface. These studies have focussed on carefully selected protein complexes with large, good-quality MSAs. In this work, we study protein complexes with a more typical MSA consisting of fewer than 400 sequences, using a set of 79 intramolecular protein complexes. Using a maximum entropy based CMM at the residue level, we develop an interface level CMM score to be used in re-ranking docking decoys. We demonstrate that our interface level CMM score compares favourably to the complementarity trace score, an evolutionary information-based score measuring co-conservation, when combined with the number of interface residues, a knowledge-based potential and the variability score of individual amino acid sites. We also demonstrate, that, since co-mutation and co-complementarity in the MSA contain orthogonal information, the best prediction performance using evolutionary information can be achieved by combining the co-mutation information of the CMM with co-conservation information of a complementarity trace score, predicting a near-native structure as the top prediction for 41% of the dataset. The method presented is not restricted to small MSAs, and will likely improve interface prediction also for complexes with large and good-quality MSAs.


Assuntos
Biologia Computacional/métodos , Mapeamento de Interação de Proteínas , Alinhamento de Sequência , Algoritmos , Sequência de Aminoácidos , Conjuntos de Dados como Assunto , Evolução Molecular , Modelos Moleculares , Mutação , Ligação Proteica , Conformação Proteica , Mapeamento de Interação de Proteínas/métodos , Proteínas/química , Proteínas/genética , Proteínas/metabolismo , Alinhamento de Sequência/métodos
4.
J Chem Theory Comput ; 9(12): 5718-5733, 2013 Dec 10.
Artigo em Inglês | MEDLINE | ID: mdl-24683370

RESUMO

Maximum Likelihood (ML) optimization schemes are widely used for parameter inference. They maximize the likelihood of some experimentally observed data, with respect to the model parameters iteratively, following the gradient of the logarithm of the likelihood. Here, we employ a ML inference scheme to infer a generalizable, physics-based coarse-grained protein model (which includes Go̅-like biasing terms to stabilize secondary structure elements in room-temperature simulations), using native conformations of a training set of proteins as the observed data. Contrastive divergence, a novel statistical machine learning technique, is used to efficiently approximate the direction of the gradient ascent, which enables the use of a large training set of proteins. Unlike previous work, the generalizability of the protein model allows the folding of peptides and a protein (protein G) which are not part of the training set. We compare the same force field with different van der Waals (vdW) potential forms: a hard cutoff model, and a Lennard-Jones (LJ) potential with vdW parameters inferred or adopted from the CHARMM or AMBER force fields. Simulations of peptides and protein G show that the LJ model with inferred parameters outperforms the hard cutoff potential, which is consistent with previous observations. Simulations using the LJ potential with inferred vdW parameters also outperforms the protein models with adopted vdW parameter values, demonstrating that model parameters generally cannot be used with force fields with different energy functions. The software is available at https://sites.google.com/site/crankite/.

SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA