1.
PLoS Comput Biol ; 20(5): e1012061, 2024 May.
Article in English | MEDLINE | ID: mdl-38701099

ABSTRACT

To optimize proteins for particular traits holds great promise for industrial and pharmaceutical purposes. Machine Learning is increasingly applied in this field to predict properties of proteins, thereby guiding the experimental optimization process. A natural question is: How much progress are we making with such predictions, and how important is the choice of regressor and representation? In this paper, we demonstrate that different assessment criteria for regressor performance can lead to dramatically different conclusions, depending on the choice of metric, and how one defines generalization. We highlight the fundamental issues of sample bias in typical regression scenarios and how this can lead to misleading conclusions about regressor performance. Finally, we make the case for the importance of calibrated uncertainty in this domain.


Subjects
Computational Biology , Machine Learning , Protein Engineering , Protein Engineering/methods , Regression Analysis , Computational Biology/methods , Proteins/chemistry , Algorithms
2.
Cell Mol Life Sci ; 80(6): 143, 2023 May 09.
Article in English | MEDLINE | ID: mdl-37160462

ABSTRACT

In terms of its relative frequency, lysine is a common amino acid in the human proteome. However, by bioinformatics we find hundreds of proteins that contain long and evolutionarily conserved stretches completely devoid of lysine residues. These so-called lysine deserts show a high prevalence in intrinsically disordered proteins with known or predicted functions within the ubiquitin-proteasome system (UPS), including many E3 ubiquitin-protein ligases and UBL domain proteasome substrate shuttles, such as BAG6, RAD23A, UBQLN1 and UBQLN2. We show that introduction of lysine residues into the deserts leads to a striking increase in ubiquitylation of some of these proteins. In case of BAG6, we show that ubiquitylation is catalyzed by the E3 RNF126, while RAD23A is ubiquitylated by E6AP. Despite the elevated ubiquitylation, mutant RAD23A appears stable, but displays a partial loss of function phenotype in fission yeast. In case of UBQLN1 and BAG6, introducing lysine leads to a reduced abundance due to proteasomal degradation of the proteins. For UBQLN1 we show that arginine residues within the lysine depleted region are critical for its ability to form cytosolic speckles/inclusions. We propose that selective pressure to avoid lysine residues may be a common evolutionary mechanism to prevent unwarranted ubiquitylation and/or perhaps other lysine post-translational modifications. This may be particularly relevant for UPS components as they closely and frequently encounter the ubiquitylation machinery and are thus more susceptible to nonspecific ubiquitylation.


Subjects
Proteasome Endopeptidase Complex , Schizosaccharomyces , Humans , Ubiquitin , Lysine , Cytoplasm , Ubiquitination , Schizosaccharomyces/genetics , Molecular Chaperones , Autophagy-Related Proteins , Adaptor Proteins, Signal Transducing , Ubiquitin-Protein Ligases
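
The abstract above describes "lysine deserts" as long stretches completely devoid of lysine. A minimal sketch of how such a stretch could be located in a single sequence is given below. The function name and the toy sequence are hypothetical, and the sketch ignores the evolutionary-conservation criterion that the authors also apply.

```python
def longest_lysine_free_stretch(sequence: str) -> tuple[int, int]:
    """Return (start, length) of the longest run without lysine (K).

    `start` is a 0-based index into `sequence`. Illustrative helper only;
    it is not taken from the paper.
    """
    best_start, best_len = 0, 0
    run_start = 0
    # A sentinel "K" is appended so that the final run is also closed.
    for i, residue in enumerate(sequence.upper() + "K"):
        if residue == "K":
            run_len = i - run_start
            if run_len > best_len:
                best_start, best_len = run_start, run_len
            run_start = i + 1
    return best_start, best_len


# Toy sequence (hypothetical, not a real protein).
start, length = longest_lysine_free_stretch("MKAAAAKGG")
print(start, length)
```

A proteome-wide scan would apply the same function to every sequence and keep those whose longest lysine-free run exceeds a chosen length threshold; the threshold itself is not stated in the abstract.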
3.
Proc Natl Acad Sci U S A ; 118(31), 2021 08 03.
Article in English | MEDLINE | ID: mdl-34321355

ABSTRACT

Single-particle tracking (SPT) is a key tool for quantitative analysis of dynamic biological processes and has provided unprecedented insights into a wide range of systems such as receptor localization, enzyme propulsion, bacteria motility, and drug nanocarrier delivery. The inherently complex diffusion in such biological systems can vary drastically both in time and across systems, consequently imposing considerable analytical challenges, and currently requires an a priori knowledge of the system. Here we introduce a method for SPT data analysis, processing, and classification, which we term "diffusional fingerprinting." This method allows for dissecting the features that underlie diffusional behavior and establishing molecular identity, regardless of the underlying diffusion type. The method operates by isolating 17 descriptive features for each observed motion trajectory and generating a diffusional map of all features for each type of particle. Precise classification of the diffusing particle identity is then obtained by training a simple logistic regression model. A linear discriminant analysis generates a feature ranking that outputs the main differences among diffusional features, providing key mechanistic insights. Fingerprinting operates by both training on and predicting experimental data, without the need for pretraining on simulated data. We found this approach to work across a wide range of simulated and experimentally diverse systems, such as tracked lipases on fat substrates, transcription factors diffusing in cells, and nanoparticles diffusing in mucus. This flexibility ultimately supports diffusional fingerprinting's utility as a universal paradigm for SPT diffusional analysis and prediction.


Subjects
Machine Learning , Single Molecule Imaging/methods , Computer Simulation , Diffusion , Image Interpretation, Computer-Assisted , Movement , Particle Size
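
The abstract above describes extracting descriptive features from each trajectory before classification. As a rough illustration of what one such feature might look like, the sketch below estimates the anomalous diffusion exponent from the mean squared displacement (MSD). Whether this exact quantity is among the 17 features is an assumption, and the function names are hypothetical.

```python
import math


def mean_squared_displacement(track, lag):
    """MSD of a trajectory (list of coordinate tuples) at a given frame lag."""
    disp = [
        sum((b - a) ** 2 for a, b in zip(track[i], track[i + lag]))
        for i in range(len(track) - lag)
    ]
    return sum(disp) / len(disp)


def anomalous_exponent(track, max_lag=4):
    """Least-squares slope of log(MSD) versus log(lag).

    A slope near 1 suggests normal diffusion, below 1 subdiffusion and
    near 2 directed motion. Requires len(track) > max_lag and a
    trajectory that actually moves (MSD > 0).
    """
    lags = range(1, max_lag + 1)
    xs = [math.log(lag) for lag in lags]
    ys = [math.log(mean_squared_displacement(track, lag)) for lag in lags]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Toy directed trajectory moving one unit per frame along x.
track = [(float(t), 0.0) for t in range(20)]
print(anomalous_exponent(track))
```

A feature vector built from several such quantities per trajectory could then be passed to a classifier such as the logistic regression mentioned in the abstract.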
4.
Cell Mol Life Sci ; 79(9): 484, 2022 Aug 16.
Article in English | MEDLINE | ID: mdl-35974206

ABSTRACT

Ubiquitin is a small, globular protein that is conjugated to other proteins as a posttranslational event. A palette of small, folded domains recognizes and binds ubiquitin to translate and effectuate this posttranslational signal. Recent computational studies have suggested that protein regions can recognize ubiquitin via a process of folding upon binding. Using peptide binding arrays, bioinformatics, and NMR spectroscopy, we have uncovered a disordered ubiquitin-binding motif that likely remains disordered when bound and thus expands the palette of ubiquitin-binding proteins. We term this motif Disordered Ubiquitin-Binding Motif (DisUBM) and find it to be present in many proteins with known or predicted functions in degradation and transcription. We decompose the determinants of the motif showing it to rely on features of aromatic and negatively charged residues, and less so on distinct sequence positions in line with its disordered nature. We show that the affinity of the motif is low and moldable by the surrounding disordered chain, allowing for an enhanced interaction surface with ubiquitin, whereby the affinity increases approximately tenfold. Further affinity optimization using peptide arrays pushed the affinity into the low micromolar range, but compromised context dependence. Finally, we find that DisUBMs can emerge from unbiased screening of randomized peptide libraries, featuring in de novo cyclic peptides selected to bind ubiquitin chains. We suggest that naturally occurring DisUBMs can recognize ubiquitin as a posttranslational signal to act as affinity enhancers in IDPs that bind to folded and ubiquitylated binding partners.


Subjects
Intrinsically Disordered Proteins , Proteins , Amino Acid Sequence , Intrinsically Disordered Proteins/chemistry , Peptides/metabolism , Protein Binding , Proteins/metabolism , Ubiquitin/metabolism
5.
RNA ; 25(2): 219-231, 2019 02.
Article in English | MEDLINE | ID: mdl-30420522

ABSTRACT

RNA molecules are highly dynamic systems characterized by a complex interplay between sequence, structure, dynamics, and function. Molecular simulations can potentially provide powerful insights into the nature of these relationships. The analysis of structures and molecular trajectories of nucleic acids can be nontrivial because it requires processing very high-dimensional data that are not easy to visualize and interpret. Here we introduce Barnaba, a Python library aimed at facilitating the analysis of nucleic acid structures and molecular simulations. The software consists of a variety of analysis tools that allow the user to (i) calculate distances between three-dimensional structures using different metrics, (ii) back-calculate experimental data from three-dimensional structures, (iii) perform cluster analysis and dimensionality reductions, (iv) search three-dimensional motifs in PDB structures and trajectories, and (v) construct elastic network models for nucleic acids and nucleic acids-protein complexes. In addition, Barnaba makes it possible to calculate torsion angles, pucker conformations, and to detect base-pairing/base-stacking interactions. Barnaba produces graphics that conveniently visualize both extended secondary structure and dynamics for a set of molecular conformations. The software is available as a command-line tool as well as a library, and supports a variety of file formats such as PDB, dcd, and xtc files. Source code, documentation, and examples are freely available at https://github.com/srnas/barnaba under GNU GPLv3 license.


Subjects
Computational Biology/methods , Nucleic Acid Conformation , RNA/ultrastructure , Software , Base Pairing/genetics , Databases, Protein , Models, Molecular
6.
Cell Commun Signal ; 18(1): 132, 2020 08 24.
Article in English | MEDLINE | ID: mdl-32831102

ABSTRACT

BACKGROUND: Class 1 cytokine receptors (C1CRs) are single-pass transmembrane proteins responsible for transmitting signals between the outside and the inside of cells. Remarkably, they orchestrate key biological processes such as proliferation, differentiation, immunity and growth through long disordered intracellular domains (ICDs), but without having intrinsic kinase activity. Despite these key roles, their characteristics remain rudimentarily understood. METHODS: The current paper asks the question of why disorder has evolved to govern signaling of C1CRs by reviewing the literature in combination with new sequence and biophysical analyses of chain properties across the family. RESULTS: We uncover that the C1CR-ICDs are fully disordered and brimming with SLiMs. Many of these short linear motifs (SLiMs) are overlapping, jointly signifying a complex regulation of interactions, including network rewiring by isoforms. The C1CR-ICDs have unique properties that distinguish them from most IDPs and we forward the perception that the C1CR-ICDs are far from simple strings with constitutively bound kinases. Rather, they carry both organizational and operational features left uncovered within their disorder, including mechanisms and complexities of regulatory functions. CONCLUSIONS: Critically, the understanding of the fascinating ability of these long, completely disordered chains to orchestrate complex cellular signaling pathways is still in its infancy, and we urge a perceptional shift away from the current simplistic view towards uncovering their full functionalities and potential. Video abstract.


Subjects
Intrinsically Disordered Proteins/chemistry , Intrinsically Disordered Proteins/metabolism , Receptors, Cytokine/chemistry , Receptors, Cytokine/metabolism , Signal Transduction , Amino Acid Motifs , Amino Acid Sequence , Humans , Protein Conformation , Protein Isoforms/chemistry , Protein Isoforms/metabolism
7.
Cell Mol Life Sci ; 76(24): 4923-4943, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31134302

ABSTRACT

Proliferating cell nuclear antigen (PCNA) is a cellular hub in DNA metabolism and a potential drug target. Its binding partners carry a short linear motif (SLiM) known as the PCNA-interacting protein-box (PIP-box), but sequence-divergent motifs have been reported to bind to the same binding pocket. To investigate how PCNA accommodates motif diversity, we assembled a set of 77 experimentally confirmed PCNA-binding proteins and analyzed features underlying their binding affinity. Combining NMR spectroscopy, affinity measurements and computational analyses, we corroborate that most PCNA-binding motifs reside in intrinsically disordered regions, that structure preformation is unrelated to affinity, and that the sequence-patterns that encode binding affinity extend substantially beyond the boundaries of the PIP-box. Our systematic multidisciplinary approach expands current views on PCNA interactions and reveals that the PIP-box affinity can be modulated over four orders of magnitude by positive charges in the flanking regions. Including the flanking regions as part of the motif is expected to have broad implications, particularly for interpretation of disease-causing mutations and drug-design, targeting DNA-replication and -repair.


Subjects
Amino Acid Motifs/genetics , DNA-Binding Proteins/chemistry , DNA/chemistry , Proliferating Cell Nuclear Antigen/chemistry , DNA/genetics , DNA Repair/genetics , DNA Replication/genetics , DNA-Binding Proteins/genetics , Humans , Intrinsically Disordered Proteins/chemistry , Intrinsically Disordered Proteins/genetics , Magnetic Resonance Spectroscopy , Proliferating Cell Nuclear Antigen/genetics , Protein Conformation
8.
J Biomol NMR ; 73(12): 713-725, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31598803

ABSTRACT

Phosphorylation is one of the main regulators of cellular signaling typically occurring in flexible parts of folded proteins and in intrinsically disordered regions. It can have distinct effects on the chemical environment as well as on the structural properties near the modification site. Secondary chemical shift analysis is the main NMR method for detection of transiently formed secondary structure in intrinsically disordered proteins (IDPs) and the reliability of the analysis depends on an appropriate choice of random coil model. Random coil chemical shifts and sequence correction factors were previously determined for an Ac-QQXQQ-NH2-peptide series with X being any of the 20 common amino acids. However, a matching dataset on the phosphorylated states has so far only been incompletely determined or determined only at a single pH value. Here we extend the database by the addition of the random coil chemical shifts of the phosphorylated states of serine, threonine and tyrosine measured over a range of pH values covering the pKas of the phosphates and at several temperatures (www.bio.ku.dk/sbinlab/randomcoil). The combined results allow for accurate random coil chemical shift determination of phosphorylated regions at any pH and temperature, minimizing systematic biases of the secondary chemical shifts. Comparison of chemical shifts using random coil sets with and without inclusion of the phosphoryl group, revealed under/over estimations of helicity of up to 33%. The expanded set of random coil values will improve the reliability in detection and quantification of transient secondary structure in phosphorylation-modified IDPs.


Subjects
Amino Acids/metabolism , Intrinsically Disordered Proteins/chemistry , Nuclear Magnetic Resonance, Biomolecular/methods , Hydrogen-Ion Concentration , Phosphorylation , Protein Structure, Secondary , Serine/metabolism , Temperature , Threonine/metabolism , Tyrosine/metabolism
9.
Proc Natl Acad Sci U S A ; 113(12): 3227-32, 2016 Mar 22.
Article in English | MEDLINE | ID: mdl-26957604

ABSTRACT

Formation of correct disulfide bonds in the endoplasmic reticulum is a crucial step for folding proteins destined for secretion. Protein disulfide isomerases (PDIs) play a central role in this process. We report a previously unidentified, hypervariable family of PDIs that represents the most diverse gene family of oxidoreductases described in a single genus to date. These enzymes are highly expressed specifically in the venom glands of predatory cone snails, animals that synthesize a remarkably diverse set of cysteine-rich peptide toxins (conotoxins). Enzymes in this PDI family, termed conotoxin-specific PDIs, significantly and differentially accelerate the kinetics of disulfide-bond formation of several conotoxins. Our results are consistent with a unique biological scenario associated with protein folding: The diversification of a family of foldases can be correlated with the rapid evolution of an unprecedented diversity of disulfide-rich structural domains expressed by venomous marine snails in the superfamily Conoidea.


Subjects
Mollusk Venoms/chemistry , Peptides/chemistry , Protein Disulfide-Isomerases/genetics , Amino Acid Sequence , Animals , Conus Snail , Molecular Sequence Data , Protein Disulfide-Isomerases/chemistry , Protein Folding , Sequence Homology, Amino Acid
10.
Proc Natl Acad Sci U S A ; 111(38): 13852-7, 2014 Sep 23.
Article in English | MEDLINE | ID: mdl-25192938

ABSTRACT

Methods of protein structure determination based on NMR chemical shifts are becoming increasingly common. The most widely used approaches adopt the molecular fragment replacement strategy, in which structural fragments are repeatedly reassembled into different complete conformations in molecular simulations. Although these approaches are effective in generating individual structures consistent with the chemical shift data, they do not enable the sampling of the conformational space of proteins with correct statistical weights. Here, we present a method of molecular fragment replacement that makes it possible to perform equilibrium simulations of proteins, and hence to determine their free energy landscapes. This strategy is based on the encoding of the chemical shift information in a probabilistic model in Markov chain Monte Carlo simulations. First, we demonstrate that with this approach it is possible to fold proteins to their native states starting from extended structures. Second, we show that the method satisfies the detailed balance condition and hence it can be used to carry out an equilibrium sampling from the Boltzmann distribution corresponding to the force field used in the simulations. Third, by comparing the results of simulations carried out with and without chemical shift restraints we describe quantitatively the effects that these restraints have on the free energy landscapes of proteins. Taken together, these results demonstrate that the molecular fragment replacement strategy can be used in combination with chemical shift information to characterize not only the native structures of proteins but also their conformational fluctuations.


Subjects
Computer Simulation , Models, Molecular , Nuclear Magnetic Resonance, Biomolecular/methods , Proteins/chemistry , Markov Chains
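
The abstract above relies on Markov chain Monte Carlo moves that satisfy detailed balance, so that the simulation samples the Boltzmann distribution. The sketch below shows the generic Metropolis acceptance rule on a one-dimensional toy energy. It is not the fragment-replacement move or the chemical-shift model from the paper, and all names and parameter values are illustrative assumptions.

```python
import math
import random


def metropolis(energy, x0, n_steps, step=2.0, kT=1.0, seed=0):
    """Sample exp(-energy(x)/kT) with a symmetric random-walk proposal."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    samples = []
    for _ in range(n_steps):
        # Symmetric proposal: moving x -> x_new is as likely as x_new -> x.
        x_new = x + rng.uniform(-step, step)
        e_new = energy(x_new)
        # Metropolis rule: accept with probability min(1, exp(-dE/kT)).
        # Together with the symmetric proposal this gives detailed balance.
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / kT):
            x, e = x_new, e_new
        samples.append(x)  # a rejected move repeats the current state
    return samples


def harmonic(x):
    """Toy energy whose Boltzmann distribution at kT = 1 is a unit Gaussian."""
    return 0.5 * x * x


samples = metropolis(harmonic, x0=0.0, n_steps=50000)
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, variance)
```

For this toy energy the sampled mean and variance should approach 0 and 1, which is a simple check that the chain is sampling the intended equilibrium distribution.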
11.
Biophys J ; 110(11): 2342-2348, 2016 06 07.
Article in English | MEDLINE | ID: mdl-27276252

ABSTRACT

Bactofilins constitute a recently discovered class of bacterial proteins that form cytoskeletal filaments. They share a highly conserved domain (DUF583) of which the structure remains unknown, in part due to the large size and noncrystalline nature of the filaments. Here, we describe the atomic structure of a bactofilin domain from Caulobacter crescentus. To determine the structure, we developed an approach that combines a biophysical model for proteins with recently obtained solid-state NMR spectroscopy data and amino acid contacts predicted from a detailed analysis of the evolutionary history of bactofilins. Our structure reveals a triangular β-helical (solenoid) conformation with conserved residues forming the tightly packed core and polar residues lining the surface. The repetitive structure explains the presence of internal repeats as well as strongly conserved positions, and is reminiscent of other fibrillar proteins. Our work provides a structural basis for future studies of bactofilin biology and for designing molecules that target them, as well as a starting point for determining the organization of the entire bactofilin filament. Finally, our approach presents new avenues for determining structures that are difficult to obtain by traditional means.


Subjects
Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Cytoskeleton/chemistry , Cytoskeleton/genetics , Amino Acid Sequence , Caulobacter crescentus , Computer Simulation , Models, Molecular , Monte Carlo Method , Nuclear Magnetic Resonance, Biomolecular , Protein Structure, Secondary , Surface Properties
12.
PLoS Comput Biol ; 11(10): e1004415, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26505632

ABSTRACT

There is increasing evidence that protein dynamics and conformational changes can play an important role in modulating biological function. As a result, experimental and computational methods are being developed, often synergistically, to study the dynamical heterogeneity of a protein or other macromolecules in solution. Thus, methods such as molecular dynamics simulations or ensemble refinement approaches have provided conformational ensembles that can be used to understand protein function and biophysics. These developments have in turn created a need for algorithms and software that can be used to compare structural ensembles in the same way as the root-mean-square-deviation is often used to compare static structures. Although a few such approaches have been proposed, these can be difficult to implement efficiently, hindering broader application and further development. Here, we present an easily accessible software toolkit, called ENCORE, which can be used to compare conformational ensembles generated either from simulations alone or synergistically with experiments. ENCORE implements three previously described methods for ensemble comparison, each of which can be used to quantify the similarity between conformational ensembles by estimating the overlap between the probability distributions that underlie them. We demonstrate the kinds of insights that can be obtained by providing examples of three typical use-cases: comparing ensembles generated with different molecular force fields, assessing convergence in molecular simulations, and calculating differences and similarities in structural ensembles refined with various sources of experimental data. We also demonstrate efficient computational scaling for typical analyses, and robustness against both the size and sampling of the ensembles. ENCORE is freely available and extendable, integrates with the established MDAnalysis software package, reads ensemble data in many common formats, and can work with large trajectory files.


Subjects
Algorithms , Molecular Dynamics Simulation , Pattern Recognition, Automated/methods , Proteins/chemistry , Proteins/ultrastructure , Software , Programming Languages , Protein Conformation , Software Validation
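
The abstract above describes comparing ensembles by estimating the overlap between their underlying probability distributions. One common way to express such an overlap is the Jensen-Shannon divergence. The sketch below applies it to histograms of a single scalar observable, which is a simplifying assumption: it does not reproduce ENCORE's API or its three methods, and the function names and toy values are hypothetical.

```python
import math


def histogram(values, lo, hi, n_bins):
    """Normalised histogram of `values` on [lo, hi]; outliers go to the edge bins."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        idx = max(0, min(int((v - lo) / width), n_bins - 1))
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts]


def jensen_shannon(p, q):
    """Jensen-Shannon divergence in nats.

    0 means identical distributions; ln(2) means no overlap at all.
    """
    def kl(a, b):
        # Terms with a_i == 0 contribute nothing; b_i > 0 whenever a_i > 0.
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Toy "ensembles": one scalar observable per conformation (hypothetical values).
ensemble_a = [0.10, 0.15, 0.20, 0.30, 0.35]
ensemble_b = [0.60, 0.70, 0.75, 0.80, 0.90]
p = histogram(ensemble_a, 0.0, 1.0, 4)
q = histogram(ensemble_b, 0.0, 1.0, 4)
print(jensen_shannon(p, q))
```

Real ensemble comparisons work in a much higher-dimensional conformational space, so the density estimate would replace the simple histogram used here.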
13.
J Am Chem Soc ; 137(1): 22-5, 2015 Jan 14.
Article in English | MEDLINE | ID: mdl-25415595

ABSTRACT

Functional amyloid fibers, called curli, play a critical role in adhesion and invasion of many bacteria. Unlike pathological amyloids, curli structures are formed by polypeptide sequences whose amyloid structure has been selected for during evolution. This important distinction provides us with an opportunity to obtain structural insights from an unexpected source: the covariation of amino acids in sequences of different curli proteins. We used recently developed methods to extract amino acid contacts from a multiple sequence alignment of homologues of the curli subunit protein, CsgA. Together with an efficient force field, these contacts allow us to determine structural models of CsgA. We find that CsgA forms a β-helical structure, where each turn corresponds to previously identified repeat sequences in CsgA. The proposed structure is validated by previously measured solid-state NMR, electron microscopy, and X-ray diffraction data and agrees with an earlier proposed model derived by complementary means.


Subjects
Amyloid/chemistry , Escherichia coli Proteins/chemistry , Models, Molecular , Protein Conformation
14.
PLoS Comput Biol ; 10(2): e1003406, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24586124

ABSTRACT

A key component of computational biology is to compare the results of computer modelling with experimental measurements. Despite substantial progress in the models and algorithms used in many areas of computational biology, such comparisons sometimes reveal that the computations are not in quantitative agreement with experimental data. The principle of maximum entropy is a general procedure for constructing probability distributions in the light of new data, making it a natural tool in cases when an initial model provides results that are at odds with experiments. The number of maximum entropy applications in our field has grown steadily in recent years, in areas as diverse as sequence analysis, structural modelling, and neurobiology. In this Perspectives article, we give a broad introduction to the method, in an attempt to encourage its further adoption. The general procedure is explained in the context of a simple example, after which we proceed with a real-world application in the field of molecular simulations, where the maximum entropy procedure has recently provided new insight. Given the limited accuracy of force fields, macromolecular simulations sometimes produce results that are not in complete and quantitative accordance with experiments. A common solution to this problem is to explicitly ensure agreement between the two by perturbing the potential energy function towards the experimental data. So far, a general consensus for how such perturbations should be implemented has been lacking. Three very recent papers have explored this problem using the maximum entropy approach, providing both new theoretical and practical insights to the problem. We highlight each of these contributions in turn and conclude with a discussion on remaining challenges.


Subjects
Entropy , Models, Biological , Computational Biology , Computer Simulation , Macromolecular Substances/chemistry , Models, Molecular , Molecular Dynamics Simulation , Uncertainty
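
The abstract above discusses using the maximum entropy principle to bring a model into agreement with an experimental average. For a single averaged observable, the minimally perturbed distribution reweights each sample by a factor exp(-lambda * f), with lambda chosen so that the reweighted average matches the target. The sketch below solves for lambda by bisection. The function name, the search bounds and the toy data are assumptions for illustration, not the article's worked example.

```python
import math


def maxent_weights(observable, target, lam_lo=-50.0, lam_hi=50.0, tol=1e-10):
    """Return (lambda, weights) so that sum(w_i * f_i) matches `target`.

    Weights have the maximum entropy form w_i proportional to
    exp(-lambda * f_i), starting from uniform prior weights.
    """
    def weights(lam):
        exps = [-lam * f for f in observable]
        shift = max(exps)  # subtract the maximum for numerical stability
        w = [math.exp(e - shift) for e in exps]
        z = sum(w)
        return [wi / z for wi in w]

    def average(lam):
        return sum(wi * fi for wi, fi in zip(weights(lam), observable))

    if not (min(observable) < target < max(observable)):
        raise ValueError("target must lie strictly inside the sampled range")

    lo, hi = lam_lo, lam_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # The reweighted average decreases monotonically as lambda grows.
        if average(mid) > target:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return lam, weights(lam)


# Toy data: four sampled values of an observable with unweighted mean 1.5.
values = [0.0, 1.0, 2.0, 3.0]
lam, w = maxent_weights(values, target=1.0)
print(lam, sum(wi * fi for wi, fi in zip(w, values)))
```

In a simulation setting the same exponential factor corresponds to adding a term linear in the observable to the potential energy, which is the kind of perturbation the abstract refers to.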
15.
Proteins ; 82(2): 288-99, 2014 Feb.
Article in English | MEDLINE | ID: mdl-23934827

ABSTRACT

We propose a method to formulate probabilistic models of protein structure in atomic detail, for a given amino acid sequence, based on Bayesian principles, while retaining a close link to physics. We start from two previously developed probabilistic models of protein structure on a local length scale, which concern the dihedral angles in main chain and side chains, respectively. Conceptually, this constitutes a probabilistic and continuous alternative to the use of discrete fragment and rotamer libraries. The local model is combined with a nonlocal model that involves a small number of energy terms according to a physical force field, and some information on the overall secondary structure content. In this initial study we focus on the formulation of the joint model and the evaluation of the use of an energy vector as a descriptor of a protein's nonlocal structure; hence, we derive the parameters of the nonlocal model from the native structure without loss of generality. The local and nonlocal models are combined using the reference ratio method, which is a well-justified probabilistic construction. For evaluation, we use the resulting joint models to predict the structure of four proteins. The results indicate that the proposed method and the probabilistic models show considerable promise for probabilistic protein structure prediction and related applications.


Subjects
Models, Molecular , Models, Statistical , Algorithms , Amino Acid Sequence , Bacterial Proteins/chemistry , Bayes Theorem , Hydrogen Bonding , Protein Structure, Secondary , Protein Structure, Tertiary , Structural Homology, Protein , Thermodynamics
16.
Res Sq ; 2024 Feb 02.
Article in English | MEDLINE | ID: mdl-38352328

ABSTRACT

Sub-cellular diffusion in living systems reflects cellular processes and interactions. Recent advances in optical microscopy allow the tracking of this nanoscale diffusion of individual objects with an unprecedented level of precision. However, the agnostic and automated extraction of functional information from the diffusion of molecules and organelles within the sub-cellular environment, is labor-intensive and poses a significant challenge. Here we introduce DeepSPT, a deep learning framework to interpret the diffusional 2D or 3D temporal behavior of objects in a rapid and efficient manner, agnostically. Demonstrating its versatility, we have applied DeepSPT to automated mapping of the early events of viral infections, identifying distinct types of endosomal organelles, and clathrin-coated pits and vesicles with up to 95% accuracy and within seconds instead of weeks. The fact that DeepSPT effectively extracts biological information from diffusion alone illustrates that besides structure, motion encodes function at the molecular and subcellular level.

17.
J Comput Chem ; 34(19): 1697-705, 2013 Jul 15.
Article in English | MEDLINE | ID: mdl-23619610

ABSTRACT

We present a new software framework for Markov chain Monte Carlo sampling for simulation, prediction, and inference of protein structure. The software package contains implementations of recent advances in Monte Carlo methodology, such as efficient local updates and sampling from probabilistic models of local protein structure. These models form a probabilistic alternative to the widely used fragment and rotamer libraries. Combined with an easily extendible software architecture, this makes PHAISTOS well suited for Bayesian inference of protein structure from sequence and/or experimental data. Currently, two force-fields are available within the framework: PROFASI and OPLS-AA/L, the latter including the generalized Born surface area solvent model. A flexible command-line and configuration-file interface allows users quickly to set up simulations with the desired configuration. PHAISTOS is released under the GNU General Public License v3.0. Source code and documentation are freely available from http://phaistos.sourceforge.net. The software is implemented in C++ and has been tested on Linux and OSX platforms.


Subjects
Markov Chains , Monte Carlo Method , Proteins/chemistry , Software , Bayes Theorem , Computer Simulation , Models, Chemical , Protein Conformation
18.
Bioinformatics ; 28(4): 510-5, 2012 Feb 15.
Article in English | MEDLINE | ID: mdl-22199383

ABSTRACT

MOTIVATION: Clustering protein structures is an important task in structural bioinformatics. De novo structure prediction, for example, often involves a clustering step for finding the best prediction. Other applications include assigning proteins to fold families and analyzing molecular dynamics trajectories. RESULTS: We present Pleiades, a novel approach to clustering protein structures with a rigorous mathematical underpinning. The method approximates clustering based on the root mean square deviation by first mapping structures to Gauss integral vectors (which were introduced by Røgen and co-workers) and subsequently performing K-means clustering. CONCLUSIONS: Compared to current methods, Pleiades dramatically improves on the time needed to perform clustering, and can cluster a significantly larger number of structures, while providing state-of-the-art results. The number of low energy structures generated in a typical folding study, which is in the order of 50,000 structures, can be clustered within seconds to minutes.


Subjects
Cluster Analysis , Computational Biology/methods , Proteins/chemistry , Adenylate Kinase/chemistry , Candida/chemistry , Escherichia coli/enzymology , Fungal Proteins/chemistry , Molecular Dynamics Simulation
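
The abstract above describes a two-step scheme: map each structure to a fixed-length vector, then run K-means on those vectors. The sketch below illustrates only the second step, using plain Lloyd iterations on toy two-dimensional vectors. The Gauss integral mapping itself is not shown, and the function name and data are hypothetical rather than taken from Pleiades.

```python
import random


def kmeans(vectors, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(n_iter):
        # Assignment step: each vector joins its nearest center.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centers[c])))
            for v in vectors
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return labels, centers


# Toy vectors standing in for per-structure descriptors (hypothetical values).
vectors = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
labels, centers = kmeans(vectors, k=2)
print(labels, centers)
```

Because each structure is reduced to a vector once, the clustering cost no longer depends on pairwise structural superposition, which is the source of the speed-up the abstract reports.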
19.
Elife ; 12, 2023 05 15.
Article in English | MEDLINE | ID: mdl-37184062

ABSTRACT

Predicting the thermodynamic stability of proteins is a common and widely used step in protein engineering, and when elucidating the molecular mechanisms behind evolution and disease. Here, we present RaSP, a method for making rapid and accurate predictions of changes in protein stability by leveraging deep learning representations. RaSP performs on-par with biophysics-based methods and enables saturation mutagenesis stability predictions in less than a second per residue. We use RaSP to calculate ~230 million stability changes for nearly all single amino acid changes in the human proteome, and examine variants observed in the human population. We find that variants that are common in the population are substantially depleted for severe destabilization, and that there are substantial differences between benign and pathogenic variants, highlighting the role of protein stability in genetic diseases. RaSP is freely available, including via a Web interface, and enables large-scale analyses of stability in experimental and predicted protein structures.


Subjects
Deep Learning , Humans , Proteins/metabolism , Mutagenesis , Amino Acids/genetics , Protein Stability , Computational Biology/methods
20.
bioRxiv ; 2023 Nov 17.
Article in English | MEDLINE | ID: mdl-38014323

ABSTRACT

Sub-cellular diffusion in living systems reflects cellular processes and interactions. Recent advances in optical microscopy allow the tracking of this nanoscale diffusion of individual objects with an unprecedented level of precision. However, the agnostic and automated extraction of functional information from the diffusion of molecules and organelles within the sub-cellular environment, is labor-intensive and poses a significant challenge. Here we introduce DeepSPT, a deep learning framework to interpret the diffusional 2D or 3D temporal behavior of objects in a rapid and efficient manner, agnostically. Demonstrating its versatility, we have applied DeepSPT to automated mapping of the early events of viral infections, identifying distinct types of endosomal organelles, and clathrin-coated pits and vesicles with up to 95% accuracy and within seconds instead of weeks. The fact that DeepSPT effectively extracts biological information from diffusion alone indicates that besides structure, motion encodes function at the molecular and subcellular level.
