1.
Test (Madr) ; 30(1): 59-63, 2021.
Article in English | MEDLINE | ID: mdl-33758495
2.
Biometrics ; 74(3): 845-854, 2018 Sep.
Article in English | MEDLINE | ID: mdl-29569225

ABSTRACT

Motivated by a cutting-edge problem related to the shape of α-helices in proteins, we formulate a parametric statistical model that incorporates the cylindrical nature of the helix. Our focus is to detect a "kink," which is a drastic change in the axial direction of the helix. We propose a statistical model for the straight α-helix and derive the maximum likelihood estimation procedure. The cylinder is an accepted geometric model for α-helices, but our statistical formulation, for the first time, quantifies the uncertainty in atom positions around the cylinder. We propose a change point technique, "Kink-Detector," to detect a kink location along the helix. Unlike classical change point problems, the change in direction of a helix depends on a simultaneous shift of multiple data points rather than a single data point, and is less straightforward. Our biological building block is crowdsourced data on straight and kinked helices, which has set a gold standard. We use these data to identify salient features to construct Kink-Detector, test its performance, and gain some insights. We find the performance of Kink-Detector comparable to that of its computational competitor, "Kink-Finder." We highlight that identification of kinks by visual assessment can have limitations and that Kink-Detector may help in such cases. Further, an analysis of crowdsourced curved α-helices finds that Kink-Detector is also effective in detecting moderate changes in axial direction.


Subjects
Models, Statistical; Protein Conformation, alpha-Helical; Proteins/chemistry; Likelihood Functions; Uncertainty
3.
Mol Biol Evol ; 34(8): 2085-2100, 2017 08 01.
Article in English | MEDLINE | ID: mdl-28453724

ABSTRACT

Recently described stochastic models of protein evolution have demonstrated that the inclusion of structural information in addition to amino acid sequences leads to a more reliable estimation of evolutionary parameters. We present a generative, evolutionary model of protein structure and sequence that is valid on a local length scale. The model concerns the local dependencies between sequence and structure evolution in a pair of homologous proteins. The evolutionary trajectory between the two structures in the protein pair is treated as a random walk in dihedral angle space, which is modeled using a novel angular diffusion process on the two-dimensional torus. Coupling sequence and structure evolution in our model allows for modeling both "smooth" conformational changes and "catastrophic" conformational jumps, conditioned on the amino acid changes. The model has interpretable parameters and is comparatively more realistic than previous stochastic models, providing new insights into the relationship between sequence and structure evolution. For example, using the trained model we were able to identify an apparent sequence-structure evolutionary motif present in a large number of homologous protein pairs. The generative nature of our model enables us to evaluate its validity and its ability to simulate aspects of protein evolution conditioned on an amino acid sequence, a related amino acid sequence, a related structure or any combination thereof.


Subjects
Proteins/genetics; Sequence Alignment/methods; Sequence Analysis, Protein/methods; Amino Acid Sequence; Computer Simulation; Evolution, Molecular; Models, Genetic; Models, Molecular; Protein Conformation; Protein Structural Elements/genetics; Proteins/metabolism; Sequence Analysis, Protein/statistics & numerical data
4.
Biometrics ; 72(4): 1266-1274, 2016 12.
Article in English | MEDLINE | ID: mdl-26991351

ABSTRACT

Applications of circular regression models appear in many different fields such as evolutionary psychology, motor behavior, biology, and, in particular, in the analysis of gene expressions in oscillatory systems. Specifically, for the gene expression problem, a researcher may be interested in modeling the relationship among the phases of cell-cycle genes in two species with differing periods. This challenging problem reduces to the problem of constructing a piecewise circular regression model and, with this objective in mind, we propose a flexible circular regression model which allows different parameter values depending on sectors along the circle. We give a detailed interpretation of the parameters in the model and provide maximum likelihood estimators. We also provide a model selection procedure based on the concept of generalized degrees of freedom. The model is then applied to the analysis of two different cell-cycle data sets and through these examples we highlight the power of our new methodology.


Subjects
Biological Clocks; Models, Statistical; Regression Analysis; Cell Cycle/genetics; Computer Simulation; Likelihood Functions; Models, Biological; Saccharomyces/cytology; Saccharomyces/genetics
5.
Proteins ; 82(2): 288-99, 2014 Feb.
Article in English | MEDLINE | ID: mdl-23934827

ABSTRACT

We propose a method to formulate probabilistic models of protein structure in atomic detail, for a given amino acid sequence, based on Bayesian principles, while retaining a close link to physics. We start from two previously developed probabilistic models of protein structure on a local length scale, which concern the dihedral angles in main chain and side chains, respectively. Conceptually, this constitutes a probabilistic and continuous alternative to the use of discrete fragment and rotamer libraries. The local model is combined with a nonlocal model that involves a small number of energy terms according to a physical force field, and some information on the overall secondary structure content. In this initial study we focus on the formulation of the joint model and the evaluation of the use of an energy vector as a descriptor of a protein's nonlocal structure; hence, we derive the parameters of the nonlocal model from the native structure without loss of generality. The local and nonlocal models are combined using the reference ratio method, which is a well-justified probabilistic construction. For evaluation, we use the resulting joint models to predict the structure of four proteins. The results indicate that the proposed method and the probabilistic models show considerable promise for probabilistic protein structure prediction and related applications.


Subjects
Models, Molecular; Models, Statistical; Algorithms; Amino Acid Sequence; Bacterial Proteins/chemistry; Bayes Theorem; Hydrogen Bonding; Protein Structure, Secondary; Protein Structure, Tertiary; Structural Homology, Protein; Thermodynamics
6.
PLoS One ; 8(11): e79439, 2013.
Article in English | MEDLINE | ID: mdl-24244505

ABSTRACT

We present the theoretical foundations of a general principle to infer structure ensembles of flexible biomolecules from spatially and temporally averaged data obtained in biophysical experiments. The central idea is to compute the Kullback-Leibler optimal modification of a given prior distribution τ(x) with respect to the experimental data and its uncertainty. This principle generalizes the successful inferential structure determination method and recently proposed maximum entropy methods. Tractability of the protocol is demonstrated through the analysis of simulated nuclear magnetic resonance spectroscopy data of a small peptide.


Subjects
Biophysics; Models, Theoretical; Algorithms; Computer Simulation
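
As a rough illustration of the idea in the abstract above (PLoS One 2013), the sketch below treats only the simplest special case: a discrete set of sampled conformations with prior weights, one averaged observable, and an exact constraint on its mean with no experimental uncertainty. In that case the distribution closest to the prior in Kullback-Leibler divergence has the form prior × exp(λ·f), and λ can be found by bisection. The function name `reweight` and all parameters are hypothetical, and this is not the authors' full method, which also accounts for uncertainty in the data.

```python
import math


def reweight(values, prior_weights, target, lo=-50.0, hi=50.0, tol=1e-10):
    """Return normalized weights w_i proportional to prior_i * exp(lam * f_i)
    such that the weighted mean of f matches `target`.

    Assumes `target` lies strictly between min(values) and max(values).
    """
    def tilted(lam):
        # Subtract the largest exponent before exponentiating for stability.
        exps = [lam * v for v in values]
        shift = max(exps)
        raw = [p * math.exp(e - shift) for p, e in zip(prior_weights, exps)]
        total = sum(raw)
        return [r / total for r in raw]

    def mean(lam):
        return sum(w * v for w, v in zip(tilted(lam), values))

    # The tilted mean is nondecreasing in lam (its derivative is a variance),
    # so plain bisection on lam is enough.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean(mid) < target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return tilted(0.5 * (lo + hi))


# Hypothetical example: four conformations with observable values 0..3,
# a uniform prior, and a measured ensemble average of 2.0.
observable = [0.0, 1.0, 2.0, 3.0]
weights = reweight(observable, [0.25] * 4, target=2.0)
```

The prior mean here is 1.5, so matching the target of 2.0 shifts weight toward conformations with larger observable values while staying as close to the prior as the constraint allows.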
7.
Ann Appl Stat ; 7(2): 989-1009, 2013.
Article in English | MEDLINE | ID: mdl-24052809

ABSTRACT

We develop a Bayesian model for the alignment of two point configurations under the full similarity transformations of rotation, translation and scaling. Other work in this area has concentrated on rigid body transformations, where scale information is preserved, motivated by problems involving molecular data; this is known as form analysis. We concentrate on a Bayesian formulation for statistical shape analysis. We generalize the model introduced by Green and Mardia for the pairwise alignment of two unlabeled configurations to full similarity transformations by introducing a scaling factor to the model. The generalization is not straightforward, since the model needs to be reformulated to give good performance when scaling is included. We illustrate our method on the alignment of rat growth profiles and a novel application to the alignment of protein domains. Here, scaling is applied to secondary structure elements when comparing protein folds; additionally, we find that one global scaling factor is not in general sufficient to model these data and, hence, we develop a model in which multiple scale factors can be included to handle different scalings of shape components.

8.
Stat Appl Genet Mol Biol ; 10: Article 8, 2011.
Article in English | MEDLINE | ID: mdl-21291418

ABSTRACT

It has long been known that the amino-acid sequence of a protein determines its 3-dimensional structure, but accurate ab initio prediction of structure from sequence remains elusive. We gain insight into local protein structure conformation by studying the relationship of dihedral angles in pairs of residues in protein sequences (dipeptides). We adopt a contingency table approach, exploring a targeted set of hypotheses through log-linear modelling to detect patterns of association of these dihedral angles in all dipeptides considered. Our models indicate a substantial association of the side-chain conformation of the first residue with the backbone conformation of the second residue (side-to-back interaction) as well as a weaker but still significant association of the backbone conformation of the first residue with the side-chain conformation of the second residue (back-to-side interaction). To compare these interactions across different dipeptides, we cluster the parameter estimates for the corresponding association terms. This reveals a striking pattern. For the side-to-back term, all dipeptides which have the same first residue jointly appear in distinct clusters whereas for the back-to-side term, we observe a much weaker pattern. This suggests that the conformation of the first residue affects the conformation of the second.


Subjects
Dipeptides/chemistry; Models, Chemical; Proteins/chemistry; Sequence Analysis, Protein/statistics & numerical data; Amino Acid Sequence; Cluster Analysis; Computer Simulation; Protein Conformation
9.
Biometrics ; 67(2): 611-9, 2011 Jun.
Article in English | MEDLINE | ID: mdl-20618307

ABSTRACT

One of the key ingredients in drug discovery is the derivation of conceptual templates called pharmacophores. A pharmacophore model characterizes the physicochemical properties common to all active molecules, called ligands, bound to a particular protein receptor, together with their relative spatial arrangement. Motivated by this important application, we develop a Bayesian hierarchical model for the derivation of pharmacophore templates from multiple configurations of point sets, partially labeled by the atom type of each point. The model is implemented through a multistage template hunting algorithm that produces a series of templates that capture the geometrical relationship of atoms matched across multiple configurations. Chemical information is incorporated by distinguishing between atoms of different elements, whereby different elements are less likely to be matched than atoms of the same element. We illustrate our method through examples of deriving templates from sets of ligands that all bind structurally related protein active sites and show that the model is able to retrieve the key pharmacophore features in two test cases.


Subjects
Bayes Theorem; Computational Biology/methods; Drug Design; Algorithms; Biometry/methods; Catalytic Domain; Drug Discovery; Proteins/chemistry; Structure-Activity Relationship
10.
PLoS Comput Biol ; 5(6): e1000406, 2009 Jun.
Article in English | MEDLINE | ID: mdl-19543381

ABSTRACT

The increasing importance of non-coding RNA in biology and medicine has led to a growing interest in the problem of RNA 3-D structure prediction. As is the case for proteins, RNA 3-D structure prediction methods require two key ingredients: an accurate energy function and a conformational sampling procedure. Both are only partly solved problems. Here, we focus on the problem of conformational sampling. The current state-of-the-art solution is based on fragment assembly methods, which construct plausible conformations by stringing together short fragments obtained from experimental structures. However, the discrete nature of the fragments necessitates the use of carefully tuned, unphysical energy functions, and their non-probabilistic nature impairs unbiased sampling. We offer a solution to the sampling problem that removes these important limitations: a probabilistic model of RNA structure that allows efficient sampling of RNA conformations in continuous space, and with associated probabilities. We show that the model captures several key features of RNA structure, such as its rotameric nature and the distribution of the helix lengths. Furthermore, the model readily generates native-like 3-D conformations for 9 out of 10 test structures, solely using coarse-grained base-pairing information. In conclusion, the method provides a theoretical and practical solution for a major bottleneck on the way to routine prediction and simulation of RNA structure and dynamics in atomic detail.


Subjects
Models, Statistical; Nucleic Acid Conformation; RNA/chemistry; Algorithms; Bayes Theorem; Computer Simulation; Databases, Nucleic Acid; Imaging, Three-Dimensional/methods; Markov Chains; Models, Molecular; Monte Carlo Method; Software
11.
J Comput Biol ; 15(9): 1209-20, 2008 Nov.
Article in English | MEDLINE | ID: mdl-18973436

ABSTRACT

We propose a simple procedure for generating virtual protein Cα traces. One of the key ingredients of our method, which builds a three-dimensional structure from a random sequence of amino acids, is to work directly on the torsional angles of the chain, which we sample from a von Mises distribution. With simple modeling of the hydrophobic effect in protein folding, the procedure produces compact and globular structures. Some characteristics of real proteins (i.e., compactness and globularity) are well mimicked by this procedure. These virtual traces are used to assess algorithms for matching protein structures or functional sites.


Subjects
Computer Simulation; Models, Statistical; Protein Folding; Algorithms; Amino Acid Sequence; Hydrophobic and Hydrophilic Interactions; Models, Molecular; Protein Conformation
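
The torsion-angle sampling described in the abstract above (J Comput Biol 2008) can be sketched roughly as follows: draw each virtual torsion angle from a von Mises distribution and convert the internal coordinates to Cartesian positions one atom at a time. The 3.8 Å spacing is the usual approximate distance between consecutive Cα atoms; the mean direction `mu`, concentration `kappa`, and fixed virtual bond angle are illustrative assumptions, not values from the paper, and the hydrophobic-effect modeling mentioned in the abstract is left out entirely.

```python
import math
import random

CA_BOND = 3.8  # approximate distance (angstroms) between consecutive C-alpha atoms


def _sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def _unit(u):
    norm = math.sqrt(u[0] ** 2 + u[1] ** 2 + u[2] ** 2)
    return (u[0] / norm, u[1] / norm, u[2] / norm)


def place_next(a, b, c, bond, angle, torsion):
    """Place a new atom d after atoms a, b, c so that |cd| = bond, the
    angle b-c-d equals `angle`, and the dihedral about b-c is `torsion`."""
    bc = _unit(_sub(c, b))
    n = _unit(_cross(_sub(b, a), bc))  # normal to the plane through a, b, c
    m = _cross(n, bc)                  # completes the local orthonormal frame
    x = -bond * math.cos(angle)
    y = bond * math.sin(angle) * math.cos(torsion)
    z = bond * math.sin(angle) * math.sin(torsion)
    return tuple(c[i] + x * bc[i] + y * m[i] + z * n[i] for i in range(3))


def sample_trace(n_atoms, mu=math.radians(50.0), kappa=2.0,
                 angle=math.radians(100.0), seed=0):
    """Generate a virtual C-alpha trace with von Mises distributed torsions.

    mu, kappa and angle are hypothetical illustration values.
    """
    rng = random.Random(seed)
    # The first three atoms fix the frame; they lie in the xy-plane.
    trace = [
        (0.0, 0.0, 0.0),
        (CA_BOND, 0.0, 0.0),
        (CA_BOND - CA_BOND * math.cos(angle), CA_BOND * math.sin(angle), 0.0),
    ]
    while len(trace) < n_atoms:
        torsion = rng.vonmisesvariate(mu, kappa)  # angle in [0, 2*pi)
        trace.append(place_next(trace[-3], trace[-2], trace[-1],
                                CA_BOND, angle, torsion))
    return trace[:n_atoms]


trace = sample_trace(30)
```

A larger `kappa` concentrates the torsions around `mu` and gives a more regular, helix-like chain, while `kappa` near zero approaches uniformly random torsions.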
12.
Proc Natl Acad Sci U S A ; 105(26): 8932-7, 2008 Jul 01.
Article in English | MEDLINE | ID: mdl-18579771

ABSTRACT

Despite significant progress in recent years, protein structure prediction maintains its status as one of the prime unsolved problems in computational biology. One of the key remaining challenges is an efficient probabilistic exploration of the structural space that correctly reflects the relative conformational stabilities. Here, we present a fully probabilistic, continuous model of local protein structure in atomic detail. The generative model makes efficient conformational sampling possible and provides a framework for the rigorous analysis of local sequence-structure correlations in the native state. Our method represents a significant theoretical and practical improvement over the widely used fragment assembly technique by avoiding the drawbacks associated with a discrete and nonprobabilistic approach.


Subjects
Models, Molecular; Models, Statistical; Proteins/chemistry; Amino Acid Motifs
13.
Biometrics ; 63(2): 505-12, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17688502

ABSTRACT

A fundamental problem in bioinformatics is to characterize the secondary structure of a protein, which has traditionally been carried out by examining a scatterplot (Ramachandran plot) of the conformational angles. We examine two natural bivariate von Mises distributions, referred to as the Sine and Cosine models, which have five parameters and, for concentrated data, tend to a bivariate normal distribution. These are analyzed and their main properties derived. Conditions on the parameters are established which result in bimodal behavior for the joint density and the marginal distribution, and we note an interesting situation in which the joint density is bimodal but the marginal distributions are unimodal. We carry out comparisons of the two models, and it is seen that the Cosine model may be preferred. Mixture distributions of the Cosine model are fitted to two representative protein datasets using the expectation maximization algorithm, which results in an objective partition of the scatterplot into a number of components. Our results are consistent with empirical observations; new insights are discussed.


Subjects
Computational Biology/methods; Proteins/chemistry; Algorithms; Likelihood Functions; Malate Dehydrogenase/chemistry; Models, Statistical; Myoglobin/chemistry; Protein Conformation; Protein Structure, Secondary
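
The abstract above (Biometrics 2007) gives no formulas, so the sketch below writes the two five-parameter models in the form in which they are commonly stated, up to a normalizing constant that is omitted here. The function and parameter names are assumptions chosen for illustration: `mu` and `nu` are the mean directions of the two angles, `k1` and `k2` the concentrations, and `lam` or `k3` the dependence parameter.

```python
import math


def sine_log_density(phi, psi, mu, nu, k1, k2, lam):
    """Unnormalized log-density of the bivariate von Mises Sine model."""
    return (k1 * math.cos(phi - mu)
            + k2 * math.cos(psi - nu)
            + lam * math.sin(phi - mu) * math.sin(psi - nu))


def cosine_log_density(phi, psi, mu, nu, k1, k2, k3):
    """Unnormalized log-density of the bivariate von Mises Cosine model."""
    return (k1 * math.cos(phi - mu)
            + k2 * math.cos(psi - nu)
            - k3 * math.cos(phi - mu - psi + nu))


# Hypothetical parameter values: a Sine model with strong dependence.
at_mean = sine_log_density(0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 3.0)
t = math.acos(1.0 / 3.0)
off_mean = sine_log_density(t, t, 0.0, 0.0, 1.0, 1.0, 3.0)
```

With these example values the density is higher at φ = ψ = arccos(1/3) than at the mean direction, a small instance of the bimodal behavior the abstract mentions; with a weak dependence parameter the mean direction is the mode.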
14.
BMC Bioinformatics ; 8: 257, 2007 Jul 17.
Article in English | MEDLINE | ID: mdl-17640336

ABSTRACT

BACKGROUND: Matching functional sites is a key problem for the understanding of protein function and evolution. The commonly used graph-theoretic approach, and other related approaches, require adjustment of a matching distance threshold a priori according to the noise in atomic positions. This is difficult to pre-determine when matching sites related by varying evolutionary distances and crystallographic precision. Furthermore, sometimes the graph method is unable to identify alternative but important solutions in the neighbourhood of the distance-based solution because of strict distance constraints. We consider the Bayesian approach to improve graph-based solutions. In principle this approach applies to other methods with strict distance matching constraints. The Bayesian method can flexibly incorporate all types of prior information on specific binding sites (e.g. amino acid types) in contrast to combinatorial formulations.

RESULTS: We present a new meta-algorithm for matching protein functional sites (active sites and ligand binding sites) based on an initial graph matching followed by refinement using a Markov chain Monte Carlo (MCMC) procedure. This procedure is an innovative extension to our recent work. The method accounts for the 3-dimensional structure of the site as well as the physico-chemical properties of the constituent amino acids. The MCMC procedure can lead to a significant increase in the number of significant matches compared to the graph method as measured independently by rigorously derived p-values.

CONCLUSION: The MCMC refinement step is able to significantly improve graph-based matches. We apply the method to matching NAD(P)(H) binding sites within single Rossmann fold families, between different families in the same superfamily, and in different folds. Within families, sites are often well conserved, but there are examples where significant shape-based matches do not retain similar amino acid chemistry, indicating that even within families the same ligand may be bound using substantially different physico-chemistry. We also show that the procedure finds significant matches between binding sites for the same co-factor in different families and different folds.


Subjects
Bayes Theorem; Proteins/chemistry; 17-Hydroxysteroid Dehydrogenases/chemistry; Alcohol Dehydrogenase/chemistry; Algorithms; Amino Acid Motifs; Amino Acid Sequence; Binding Sites; Databases, Factual; Flavin-Adenine Dinucleotide/chemistry; Ligands; Likelihood Functions; Markov Chains; Monte Carlo Method; NADP/chemistry; Protein Binding; Protein Structure, Tertiary; Sequence Alignment