Search | Biblioteca Virtual em Saúde (Virtual Health Library)

1.

Differentiable simulation to develop molecular dynamics force fields for disordered proteins.

Greener, Joe G.

Chem Sci; 15(13): 4897-4909, 2024 Mar 27.

Article in English | MEDLINE | ID: mdl-38550690

ABSTRACT

Implicit solvent force fields are computationally efficient but can be unsuitable for running molecular dynamics on disordered proteins. Here I improve the a99SB-disp force field and the GBNeck2 implicit solvent model to better describe disordered proteins. Differentiable molecular simulations with 5 ns trajectories are used to jointly optimise 108 parameters to better match explicit solvent trajectories. Simulations with the improved force field better reproduce the radius of gyration and secondary structure content seen in experiments, whilst showing slightly degraded performance on folded proteins and protein complexes. The force field, called GB99dms, reproduces the results of a small molecule binding study and improves agreement with experiment for the aggregation of amyloid peptides. GB99dms, which can be used in OpenMM, is available at https://github.com/greener-group/GB99dms. This work is the first to show that gradients can be obtained directly from nanosecond-length differentiable simulations of biomolecules and highlights the effectiveness of this approach to training whole force fields to match desired properties.

2.

Julia for biologists.

Roesch, Elisabeth; Greener, Joe G; MacLean, Adam L; Nassar, Huda; Rackauckas, Christopher; Holy, Timothy E; Stumpf, Michael P H.

Nat Methods; 20(5): 655-664, 2023 May.

Article in English | MEDLINE | ID: mdl-37024649

ABSTRACT

Major computational challenges exist in relation to the collection, curation, processing and analysis of large genomic and imaging datasets, as well as the simulation of larger and more realistic models in systems biology. Here we discuss how a relative newcomer among programming languages, Julia, is poised to meet the current and emerging demands in the computational biosciences and beyond. Speed, flexibility, a thriving package ecosystem and readability are major factors that make high-performance computing and data analysis available to an unprecedented degree. We highlight how Julia's design is already enabling new ways of analyzing biological data and systems, and we provide a list of resources that can facilitate the transition into Julian computing.

Subjects

Ecosystem, Programming Languages, Computer Simulation, Computing Methodologies, Systems Biology, Software

3.

Author Correction: Julia for biologists.

Roesch, Elisabeth; Greener, Joe G; MacLean, Adam L; Nassar, Huda; Rackauckas, Christopher; Holy, Timothy E; Stumpf, Michael P H.

Nat Methods; 20(5): 771, 2023 May.

Article in English | MEDLINE | ID: mdl-37120675

4.

Ultrafast end-to-end protein structure prediction enables high-throughput exploration of uncharacterized proteins.

Kandathil, Shaun M; Greener, Joe G; Lau, Andy M; Jones, David T.

Proc Natl Acad Sci U S A; 119(4), 2022 Jan 25.

Article in English | MEDLINE | ID: mdl-35074909

ABSTRACT

Deep learning-based prediction of protein structure usually begins by constructing a multiple sequence alignment (MSA) containing homologs of the target protein. The most successful approaches combine large feature sets derived from MSAs, and considerable computational effort is spent deriving these input features. We present a method that greatly reduces the amount of preprocessing required for a target MSA, while producing main chain coordinates as a direct output of a deep neural network. The network makes use of just three recurrent networks and a stack of residual convolutional layers, making the predictor very fast to run, and easy to install and use. Our approach constructs a directly learned representation of the sequences in an MSA, starting from a one-hot encoding of the sequences. When supplemented with an approximate precision matrix, the learned representation can be used to produce structural models of comparable or greater accuracy as compared to our original DMPfold method, while requiring less than a second to produce a typical model. This level of accuracy and speed allows very large-scale three-dimensional modeling of proteins on minimal hardware, and we demonstrate this by producing models for over 1.3 million uncharacterized regions of proteins extracted from the BFD sequence clusters. After constructing an initial set of approximate models, we select a confident subset of over 30,000 models for further refinement and analysis, revealing putative novel protein folds. We also provide updated models for over 5,000 Pfam families studied in the original DMPfold paper.

Subjects

Molecular Models, Protein Conformation, Software, Algorithms, Caspases/chemistry, Computational Biology, Protein Databases, Deep Learning, High-Throughput Screening Assays, Proteins/chemistry
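The abstract of entry 4 mentions starting from a one-hot encoding of the sequences in an MSA. As a rough illustration only, the sketch below shows what such an encoding could look like in plain Python. The 21-symbol alphabet (20 standard amino acids plus a gap), the treatment of unknown residues as gaps, and the name `one_hot_msa` are assumptions made for this example; the abstract does not specify the alphabet or layout used by the published network.

```python
# Assumed alphabet: 20 standard amino acids plus "-" for gaps (21 symbols).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
ALPHABET = AMINO_ACIDS + "-"


def one_hot_msa(msa):
    """Encode an aligned list of sequences as nested lists of 0/1 values.

    The result has shape [n_sequences][alignment_length][len(ALPHABET)].
    Symbols outside the assumed alphabet are mapped to the gap channel.
    """
    if not msa:
        return []
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("all sequences in an MSA must have the same length")

    index = {symbol: i for i, symbol in enumerate(ALPHABET)}
    gap = index["-"]

    encoded = []
    for seq in msa:
        rows = []
        for residue in seq:
            vec = [0] * len(ALPHABET)
            vec[index.get(residue.upper(), gap)] = 1  # exactly one channel set
            rows.append(vec)
        encoded.append(rows)
    return encoded
```

Calling `one_hot_msa(["AC-", "AXY"])` returns two sequences of three positions, each position a 21-element vector with a single 1.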

5.

A guide to machine learning for biologists.

Greener, Joe G; Kandathil, Shaun M; Moffat, Lewis; Jones, David T.

Nat Rev Mol Cell Biol; 23(1): 40-55, 2022 Jan.

Article in English | MEDLINE | ID: mdl-34518686

ABSTRACT

The expanding scale and inherent complexity of biological data have encouraged a growing use of machine learning in biology to build informative and predictive models of the underlying biological processes. All machine learning techniques fit models to data; however, the specific methods are quite varied and can at first glance seem bewildering. In this Review, we aim to provide readers with a gentle introduction to a few key machine learning techniques, including the most recently developed and widely used techniques involving deep neural networks. We describe how different techniques may be suited to specific types of biological data, and also discuss some best practices and points to consider when one is embarking on experiments involving machine learning. Some emerging directions in machine learning methodology are also discussed.

Subjects

Biology, Machine Learning, Animals, Deep Learning, Humans, Neural Networks (Computer)

6.

Differentiable molecular simulation can learn all the parameters in a coarse-grained force field for proteins.

Greener, Joe G; Jones, David T.

PLoS One; 16(9): e0256990, 2021.

Article in English | MEDLINE | ID: mdl-34473813

ABSTRACT

Finding optimal parameters for force fields used in molecular simulation is a challenging and time-consuming task, partly due to the difficulty of tuning multiple parameters at once. Automatic differentiation presents a general solution: run a simulation, obtain gradients of a loss function with respect to all the parameters, and use these to improve the force field. This approach takes advantage of the deep learning revolution whilst retaining the interpretability and efficiency of existing force fields. We demonstrate that this is possible by parameterising a simple coarse-grained force field for proteins, based on training simulations of up to 2,000 steps learning to keep the native structure stable. The learned potential matches chemical knowledge and PDB data, can fold and reproduce the dynamics of small proteins, and shows ability in protein design and model scoring applications. Problems in applying differentiable molecular simulation to all-atom models of proteins are discussed along with possible solutions and the variety of available loss functions. The learned potential, simulation scripts and training code are made available at https://github.com/psipred/cgdms.

Subjects

Deep Learning, Molecular Dynamics Simulation, Proteins/chemistry, Temperature, Crystallization, Nuclear Energy, Alpha-Helical Protein Conformation, Beta-Strand Protein Conformation, Tertiary Protein Structure
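The abstract of entry 6 describes the general recipe of differentiable simulation: run a simulation, obtain gradients of a loss with respect to the parameters, and use them to improve the force field. The toy sketch below illustrates that loop and is not the authors' code. It assumes a single unit-mass particle in a harmonic potential with one parameter `k`, uses forward-mode automatic differentiation via a small hand-written dual-number class, and fits `k` by plain gradient descent. All names (`Dual`, `simulate`, `loss_and_grad`, `fit`) and settings are hypothetical choices for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value together with its derivative w.r.t. one parameter (forward-mode AD)."""
    val: float
    der: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.der + other.der)
    __radd__ = __add__

    def __sub__(self, other):
        other = _lift(other)
        return Dual(self.val - other.val, self.der - other.der)

    def __rsub__(self, other):
        return _lift(other) - self

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)
    __rmul__ = __mul__

    def __neg__(self):
        return Dual(-self.val, -self.der)


def _lift(x):
    """Wrap plain numbers as constants (zero derivative)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def simulate(k, n_steps=100, dt=0.01, x0=1.0, v0=0.0):
    """Velocity Verlet for a unit-mass particle in the potential 0.5*k*x^2.

    Returns the final position as a Dual, so its derivative with respect
    to k is carried through every integration step.
    """
    x, v = _lift(x0), _lift(v0)
    force = -(k * x)
    for _ in range(n_steps):
        v = v + 0.5 * dt * force
        x = x + dt * v
        force = -(k * x)
        v = v + 0.5 * dt * force
    return x


def loss_and_grad(k_value, target):
    """Squared error of the final position and its gradient w.r.t. k."""
    k = Dual(k_value, 1.0)  # seed: dk/dk = 1
    diff = simulate(k) - target
    loss = diff * diff
    return loss.val, loss.der


def fit(k_init, target, lr=2.0, n_iters=200):
    """Plain gradient descent on k using gradients from the simulation."""
    k = k_init
    for _ in range(n_iters):
        _, grad = loss_and_grad(k, target)
        k -= lr * grad
    return k
```

As a usage example, `target = simulate(4.0).val` produces a reference final position, and `fit(2.0, target)` then moves `k` from 2.0 towards 4.0. Real force fields have many parameters, for which reverse-mode differentiation is usually the more practical choice; forward mode is used here only because it fits in a few lines.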

7.

BioStructures.jl: read, write and manipulate macromolecular structures in Julia.

Greener, Joe G; Selvaraj, Joel; Ward, Ben J.

Bioinformatics; 36(14): 4206-4207, 2020 Aug 15.

Article in English | MEDLINE | ID: mdl-32407511

ABSTRACT

SUMMARY: Robust, flexible and fast software to read, write and manipulate macromolecular structures is a prerequisite for productively doing structural bioinformatics. We present BioStructures.jl, the first dedicated package in the Julia programming language for dealing with macromolecular structures and the Protein Data Bank. BioStructures.jl builds on the lessons learned with similar packages to provide a large feature set, a flexible object representation and high performance. AVAILABILITY AND IMPLEMENTATION: BioStructures.jl is freely available under the MIT license. Source code and documentation are available at https://github.com/BioJulia/BioStructures.jl. BioStructures.jl is compatible with Julia versions 0.6 and later and is system-independent. CONTACT: j.greener@ucl.ac.uk.

Subjects

Computational Biology, Software, Programming Languages

8.

Recent developments in deep learning applied to protein structure prediction.

Kandathil, Shaun M; Greener, Joe G; Jones, David T.

Proteins; 87(12): 1179-1189, 2019 Dec.

Article in English | MEDLINE | ID: mdl-31589782

ABSTRACT

Although many structural bioinformatics tools have been using neural network models for a long time, deep neural network (DNN) models have attracted considerable interest in recent years. Methods employing DNNs have had a significant impact in recent CASP experiments, notably in CASP12 and especially CASP13. In this article, we offer a brief introduction to some of the key principles and properties of DNN models and discuss why they are naturally suited to certain problems in structural bioinformatics. We also briefly discuss methodological improvements that have enabled these successes. Using the contact prediction task as an example, we also speculate why DNN models are able to produce reasonably accurate predictions even in the absence of many homologues for a given target sequence, a result that can at first glance appear surprising given the lack of input information. We end on some thoughts about how and why these types of models can be so effective, as well as a discussion on potential pitfalls.

Subjects

Computational Biology, Deep Learning, Protein Conformation, Molecular Models, Neural Networks (Computer), Proteins/chemistry, Proteins/genetics, Proteins/ultrastructure, Structural Homology of Proteins

9.

Deep learning extends de novo protein modelling coverage of genomes using iteratively predicted structural constraints.

Greener, Joe G; Kandathil, Shaun M; Jones, David T.

Nat Commun; 10(1): 3977, 2019 Sep 04.

Article in English | MEDLINE | ID: mdl-31484923

ABSTRACT

The inapplicability of amino acid covariation methods to small protein families has limited their use for structural annotation of whole genomes. Recently, deep learning has shown promise in allowing accurate residue-residue contact prediction even for shallow sequence alignments. Here we introduce DMPfold, which uses deep learning to predict inter-atomic distance bounds, the main chain hydrogen bond network, and torsion angles, which it uses to build models in an iterative fashion. DMPfold produces more accurate models than two popular methods for a test set of CASP12 domains, and works just as well for transmembrane proteins. Applied to all Pfam domains without known structures, confident models for 25% of these so-called dark families were produced in under a week on a small 200 core cluster. DMPfold provides models for 16% of human proteome UniProt entries without structures, generates accurate models with fewer than 100 sequences in some cases, and is freely available.

Subjects

Computational Biology/methods, Deep Learning, Molecular Models, Protein Conformation, Proteome/chemistry, Proteomics/methods, Algorithms, Animals, Binding Sites/genetics, Humans, Proteome/genetics, Proteome/metabolism, Reproducibility of Results

10.

Prediction of interresidue contacts with DeepMetaPSICOV in CASP13.

Kandathil, Shaun M; Greener, Joe G; Jones, David T.

Proteins; 87(12): 1092-1099, 2019 Dec.

Article in English | MEDLINE | ID: mdl-31298436

ABSTRACT

In this article, we describe our efforts in contact prediction in the CASP13 experiment. We employed a new deep learning-based contact prediction tool, DeepMetaPSICOV (or DMP for short), together with new methods and data sources for alignment generation. DMP evolved from MetaPSICOV and DeepCov and combines the input feature sets used by these methods as input to a deep, fully convolutional residual neural network. We also improved our method for multiple sequence alignment generation and included metagenomic sequences in the search. We discuss successes and failures of our approach and identify areas where further improvements may be possible. DMP is freely available at: https://github.com/psipred/DeepMetaPSICOV.

Subjects

Computational Biology, Protein Conformation, Proteins/ultrastructure, Algorithms, Amino Acid Sequence/genetics, Deep Learning, Machine Learning, Metagenome/genetics, Neural Networks (Computer), Proteins/chemistry, Proteins/genetics, Protein Sequence Analysis

11.

Design of metalloproteins and novel protein folds using variational autoencoders.

Greener, Joe G; Moffat, Lewis; Jones, David T.

Sci Rep; 8(1): 16189, 2018 Nov 01.

Article in English | MEDLINE | ID: mdl-30385875

ABSTRACT

The design of novel proteins has many applications but remains an attritional process with success in isolated cases. Meanwhile, deep learning technologies have exploded in popularity in recent years and are increasingly applicable to biology due to the rise in available data. We attempt to link protein design and deep learning by using variational autoencoders to generate protein sequences conditioned on desired properties. Potential copper and calcium binding sites are added to non-metal binding proteins without human intervention and compared to a hidden Markov model. In another use case, a grammar of protein structures is developed and used to produce sequences for a novel protein topology. One candidate structure is found to be stable by molecular dynamics simulation. The ability of our model to confine the vast search space of protein sequences and to scale easily has the potential to assist in a variety of protein design tasks.

Subjects

Computational Biology, Metalloproteins/chemistry, Protein Folding, Amino Acid Sequence/genetics, Binding Sites, Deep Learning, Humans, Molecular Dynamics Simulation

12.

High-Throughput Kinetic Analysis for Target-Directed Covalent Ligand Discovery.

Craven, Gregory B; Affron, Dominic P; Allen, Charlotte E; Matthies, Stefan; Greener, Joe G; Morgan, Rhodri M L; Tate, Edward W; Armstrong, Alan; Mann, David J.

Angew Chem Int Ed Engl; 57(19): 5257-5261, 2018 May 04.

Article in English | MEDLINE | ID: mdl-29480525

ABSTRACT

Cysteine-reactive small molecules are used as chemical probes of biological systems and as medicines. Identifying high-quality covalent ligands requires comprehensive kinetic analysis to distinguish selective binders from pan-reactive compounds. Quantitative irreversible tethering (qIT), a general method for screening cysteine-reactive small molecules based upon the maximization of kinetic selectivity, is described. This method was applied prospectively to discover covalent fragments that target the clinically important cell cycle regulator Cdk2. Crystal structures of the inhibitor complexes validate the approach and guide further optimization. The power of this technique is highlighted by the identification of a Cdk2-selective allosteric (type IV) kinase inhibitor whose novel mode-of-action could be exploited therapeutically.

Subjects

Cyclin-Dependent Kinase 2/antagonists & inhibitors, Cysteine/pharmacology, Drug Discovery, High-Throughput Screening Assays, Ligands, Protein Kinase Inhibitors/pharmacology, Small Molecule Libraries/pharmacology, Cyclin-Dependent Kinase 2/metabolism, Cysteine/chemistry, Kinetics, Molecular Structure, Protein Kinase Inhibitors/analysis, Protein Kinase Inhibitors/chemical synthesis, Small Molecule Libraries/analysis, Small Molecule Libraries/chemical synthesis

13.

Structure-based prediction of protein allostery.

Greener, Joe G; Sternberg, Michael J E.

Curr Opin Struct Biol; 50: 1-8, 2018 Jun.

Article in English | MEDLINE | ID: mdl-29080471

ABSTRACT

Allostery is the functional change at one site on a protein caused by a change at a distant site. In order for the benefits of allostery to be taken advantage of, both for basic understanding of proteins and to develop new classes of drugs, the structure-based prediction of allosteric binding sites, modulators and communication pathways is necessary. Here we review the recently emerging field of allosteric prediction, focusing mainly on computational methods. We also describe the search for cryptic binding pockets and attempts to design allostery into proteins. The development and adoption of such methods is essential or the long-preached potential of allostery will remain elusive.

Subjects

Allosteric Regulation, Allosteric Site, Proteins/chemistry, Quantitative Structure-Activity Relationship, Computational Biology/methods, Molecular Models, Protein Binding, Software

14.

Predicting Protein Dynamics and Allostery Using Multi-Protein Atomic Distance Constraints.

Greener, Joe G; Filippis, Ioannis; Sternberg, Michael J E.

Structure; 25(3): 546-558, 2017 Mar 07.

Article in English | MEDLINE | ID: mdl-28190781

ABSTRACT

The related concepts of protein dynamics, conformational ensembles and allostery are often difficult to study with molecular dynamics (MD) due to the timescales involved. We present ExProSE (Exploration of Protein Structural Ensembles), a distance geometry-based method that generates an ensemble of protein structures from two input structures. ExProSE provides a unified framework for the exploration of protein structure and dynamics in a fast and accessible way. Using a dataset of apo/holo pairs it is shown that existing coarse-grained methods often cannot span large conformational changes. For T4-lysozyme, ExProSE is able to generate ensembles that are more native-like than tCONCOORD and NMSim, and comparable with targeted MD. By adding additional constraints representing potential modulators, ExProSE can predict allosteric sites. ExProSE ranks an allosteric pocket first or second for 27 out of 58 allosteric proteins, which is similar and complementary to existing methods. The ExProSE source code is freely available.

Subjects

Computational Biology/methods, Proteins/chemistry, Allosteric Regulation, Binding Sites, Molecular Models, Molecular Dynamics Simulation, Protein Conformation

15.

AlloPred: prediction of allosteric pockets on proteins using normal mode perturbation analysis.

Greener, Joe G; Sternberg, Michael J E.

BMC Bioinformatics; 16: 335, 2015 Oct 23.

Article in English | MEDLINE | ID: mdl-26493317

ABSTRACT

BACKGROUND: Despite being hugely important in biological processes, allostery is poorly understood and no universal mechanism has been discovered. Allosteric drugs are a largely unexplored prospect with many potential advantages over orthosteric drugs. Computational methods to predict allosteric sites on proteins are needed to aid the discovery of allosteric drugs, as well as to advance our fundamental understanding of allostery. RESULTS: AlloPred, a novel method to predict allosteric pockets on proteins, was developed. AlloPred uses perturbation of normal modes alongside pocket descriptors in a machine learning approach that ranks the pockets on a protein. AlloPred ranked an allosteric pocket top for 23 out of 40 known allosteric proteins, showing comparable and complementary performance to two existing methods. In 28 of 40 cases an allosteric pocket was ranked first or second. The AlloPred web server, freely available at http://www.sbg.bio.ic.ac.uk/allopred/home, allows visualisation and analysis of predictions. The source code and dataset information are also available from this site. CONCLUSIONS: Perturbation of normal modes can enhance our ability to predict allosteric sites on proteins. Computational methods such as AlloPred assist drug discovery efforts by suggesting sites on proteins for further experimental study.

Subjects

Proteins/metabolism, Algorithms, Allosteric Site, Ligands, Protein Conformation
