Results 1 - 11 of 11
1.
J Chem Inf Model ; 64(12): 4912-4927, 2024 Jun 24.
Article in English | MEDLINE | ID: mdl-38860513

ABSTRACT

Bottom-up coarse-grained (CG) models have proved essential to complement, and sometimes even replace, all-atom representations of soft matter systems and biological macromolecules. The development of low-resolution models starts from a reduction of the degrees of freedom employed, that is, the definition of a mapping between a system's high-resolution description and its simplified counterpart. Even in the absence of an explicit parametrization and simulation of a CG model, the observation of the atomistic system in simpler terms can be informative: this idea is leveraged by the mapping entropy, a measure of the information loss inherent to the process of coarsening. Mapping entropy lies at the heart of the extensible coarse-graining toolbox, EXCOGITO, developed to perform a number of operations and analyses on molecular systems pivoting around the properties of mappings. EXCOGITO can process an all-atom trajectory to compute the mapping entropy, identify the mapping that minimizes it, and establish quantitative relations between a low-resolution representation and the geometrical, structural, and energetic features of the system. Here, the software, which is available free of charge under an open-source license, is presented and showcased to introduce potential users to its capabilities and usage.
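
To make the notion of "information loss inherent to the process of coarsening" concrete, here is a minimal sketch for a small discrete system. It assumes one common form of the mapping entropy, namely the Kullback-Leibler divergence between the microstate probabilities and the distribution obtained by spreading each coarse-grained state's probability uniformly over the microstates that map onto it. This form is an assumption on our part and is not stated in the abstract; the function name `mapping_entropy` and the two-spin toy system are hypothetical and are not part of EXCOGITO.

```python
import math
from collections import defaultdict


def mapping_entropy(p_micro, mapping):
    """Toy mapping entropy (in units of k_B) for a discrete system.

    p_micro: dict microstate -> probability (assumed to sum to 1)
    mapping: function microstate -> coarse-grained state
    Assumed definition: KL divergence between p_micro and the "smeared"
    distribution P(CG state) / (number of microstates in that CG state).
    """
    p_macro = defaultdict(float)  # probability of each coarse-grained state
    omega = defaultdict(int)      # number of microstates per coarse-grained state
    for state, p in p_micro.items():
        cg = mapping(state)
        p_macro[cg] += p
        omega[cg] += 1

    s_map = 0.0
    for state, p in p_micro.items():
        if p > 0.0:
            cg = mapping(state)
            p_bar = p_macro[cg] / omega[cg]
            s_map += p * math.log(p / p_bar)
    return s_map


# Hypothetical two-spin system in which the spins tend to be aligned.
correlated = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
uniform = {state: 0.25 for state in correlated}


def keep_first(state):
    """Coarse-grain by retaining only the first spin."""
    return state[0]


s_correlated = mapping_entropy(correlated, keep_first)
s_uniform = mapping_entropy(uniform, keep_first)
s_identity = mapping_entropy(correlated, lambda state: state)
```

In this toy case, discarding the second spin loses information only when the spins are correlated; for the uniform distribution, or when no degree of freedom is discarded, the quantity vanishes.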


Subject(s)
Entropy, Software, Molecular Dynamics Simulation, Molecular Models
2.
Proteins ; 91(12): 1658-1683, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37905971

ABSTRACT

We present the results for CAPRI Round 54, the 5th joint CASP-CAPRI protein assembly prediction challenge. The Round offered 37 targets, including 14 homodimers, 3 homotrimers, 13 heterodimers including 3 antibody-antigen complexes, and 7 large assemblies. On average ~70 CASP and CAPRI predictor groups, including more than 20 automatic servers, submitted models for each target. A total of 21 941 models submitted by these groups and by 15 CAPRI scorer groups were evaluated using the CAPRI model quality measures and the DockQ score consolidating these measures. The prediction performance was quantified by a weighted score based on the number of models of acceptable quality or higher submitted by each group among their five best models. Results show substantial progress achieved across a significant fraction of the 60+ participating groups. High-quality models were produced for about 40% of the targets compared to 8% two years earlier. This remarkable improvement is due to the wide use of the AlphaFold2 and AlphaFold2-Multimer software and the confidence metrics they provide. Notably, expanded sampling of candidate solutions by manipulating these deep learning inference engines, enriching multiple sequence alignments, or integration of advanced modeling tools, enabled top performing groups to exceed the performance of a standard AlphaFold2-Multimer version used as a yardstick. This notwithstanding, performance remained poor for complexes with antibodies and nanobodies, where evolutionary relationships between the binding partners are lacking, and for complexes featuring conformational flexibility, clearly indicating that the prediction of protein complexes remains a challenging problem.
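
The weighted scoring idea can be illustrated with a rough sketch. Everything specific below is an assumption for illustration and is not taken from the abstract: the DockQ cut-offs (0.23, 0.49, 0.80) are the commonly quoted thresholds for acceptable, medium, and high quality; the weights 1, 2, 3 and the rule "best category among the first five models per target" are hypothetical choices. The actual CAPRI assessment classifies models with several interface measures rather than DockQ alone.

```python
# Assumed DockQ thresholds and illustrative weights (not from the abstract).
WEIGHTS = {"incorrect": 0, "acceptable": 1, "medium": 2, "high": 3}


def dockq_category(dockq):
    """Map a DockQ value in [0, 1] onto a quality category."""
    if dockq >= 0.80:
        return "high"
    if dockq >= 0.49:
        return "medium"
    if dockq >= 0.23:
        return "acceptable"
    return "incorrect"


def group_score(models_by_target):
    """Sum, over targets, the weight of the best model among the top five.

    models_by_target: dict target id -> list of DockQ values in the
    order the group ranked its submissions.
    """
    total = 0
    for ranked_scores in models_by_target.values():
        top_five = ranked_scores[:5]
        best = max((WEIGHTS[dockq_category(s)] for s in top_five), default=0)
        total += best
    return total


# Hypothetical submissions for three targets.
example = {
    "T1": [0.10, 0.50, 0.20],        # best of top five: medium
    "T2": [0.85],                    # high
    "T3": [0.10] * 5 + [0.90],       # the high-quality model is ranked sixth
}
score = group_score(example)
```

Restricting the count to the five top-ranked models rewards groups that can rank their own predictions well, not only generate a good model somewhere in a large pool.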


Subject(s)
Algorithms, Protein Interaction Mapping, Protein Interaction Mapping/methods, Protein Conformation, Protein Binding, Molecular Docking Simulation, Computational Biology/methods, Software
3.
Article in English | MEDLINE | ID: mdl-30628533

ABSTRACT

We introduce here ML4Tox, a framework offering Deep Learning and Support Vector Machine models to predict agonist, antagonist, and binding activities of chemical compounds, in this case for the estrogen receptor ligand-binding domain. The ML4Tox models have been developed with a 10 × 5-fold cross-validation scheme on the training portion of the CERAPP ToxCast dataset, formed by 1677 chemicals, each described by 777 molecular features. On the CERAPP "All Literature" evaluation set (agonist: 6319 compounds; antagonist: 6539; binding: 7283), ML4Tox significantly improved sensitivity over published results on all three tasks, with agonist: 0.78 vs 0.56; antagonist: 0.69 vs 0.11; binding: 0.66 vs 0.26.
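
As a reminder of what the two technical terms mean, the sketch below computes sensitivity (true positives divided by all actual positives) and enumerates the splits of a 10 × 5-fold cross-validation, read here as 5-fold cross-validation repeated 10 times with reshuffling. It is a plain-Python illustration, not the ML4Tox code; the function names and the seed are hypothetical, and the splitter is not stratified by class, which the original study may well have been.

```python
import random


def sensitivity(y_true, y_pred):
    """True-positive rate: TP / (TP + FN), with labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else float("nan")


def repeated_kfold(n_samples, n_splits=5, n_repeats=10, seed=0):
    """Yield (train_indices, validation_indices) for repeated k-fold CV."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # new random partition for every repetition
        folds = [indices[i::n_splits] for i in range(n_splits)]
        for k in range(n_splits):
            validation = folds[k]
            train = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield train, validation


# 1677 is the training-set size quoted in the abstract.
splits = list(repeated_kfold(1677))
example_sensitivity = sensitivity([1, 1, 0, 1], [1, 0, 0, 1])
```

With these settings every compound is used for validation exactly once per repetition, so each model configuration is evaluated on 50 train/validation splits in total.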


Subject(s)
Computer Simulation, Endocrine Disruptors/toxicity, Environmental Pollutants/toxicity, Machine Learning, Toxicity Tests/methods, Protein Binding, Quantitative Structure-Activity Relationship, Estrogen Receptors, Support Vector Machine
4.
Commun Biol ; 7(1): 49, 2024 01 06.
Article in English | MEDLINE | ID: mdl-38184711

ABSTRACT

The formation of a stable complex between proteins lies at the core of a wide variety of biological processes and has been the focus of countless experiments. The huge amount of information contained in the protein structural interactome in the Protein Data Bank can now be used to characterise and classify the existing biological interfaces. We here introduce ARCTIC-3D, a fast and user-friendly data mining and clustering software to retrieve data and rationalise the interface information associated with the protein input data. We demonstrate its use by various examples ranging from showing the increased interaction complexity of eukaryotic proteins, 20% of which on average have more than 3 different interfaces compared to only 10% for prokaryotes, to associating different functions to different interfaces. In the context of modelling biomolecular assemblies, we introduce the concept of "recognition entropy", related to the number of possible interfaces of the components of a protein-protein complex, which we demonstrate to correlate with the modelling difficulty in classical docking approaches. The identified interface clusters can also be used to generate various combinations of interface-specific restraints for integrative modelling. The ARCTIC-3D software is freely available at github.com/haddocking/arctic3d and can be accessed as a web-service at wenmr.science.uu.nl/arctic3d.


Subject(s)
Data Mining, Prokaryotic Cells, Cluster Analysis, Protein Databases, Entropy
5.
Nat Protoc ; 2024 Jun 17.
Article in English | MEDLINE | ID: mdl-38886530

ABSTRACT

Interactions between macromolecules, such as proteins and nucleic acids, are essential for cellular functions. Experimental methods can fail to provide all the information required to fully model biomolecular complexes at atomic resolution, particularly for large and heterogeneous assemblies. Integrative computational approaches have, therefore, gained popularity, complementing traditional experimental methods in structural biology. Here, we introduce HADDOCK2.4, an integrative modeling platform, and its updated web interface ( https://wenmr.science.uu.nl/haddock2.4 ). The platform seamlessly integrates diverse experimental and theoretical data to generate high-quality models of macromolecular complexes. The user-friendly web server offers automated parameter settings, access to distributed computing resources, and pre- and post-processing steps that enhance the user experience. To present the web server's various interfaces and features, we demonstrate two different applications: (i) we predict the structure of an antibody-antigen complex by using NMR data for the antigen and knowledge of the hypervariable loops for the antibody, and (ii) we perform coarse-grained modeling of PRC1 with a nucleosome particle guided by mutagenesis and functional data. The described protocols require some basic familiarity with molecular modeling and the Linux command shell. This new version of our widely used HADDOCK web server allows structural biologists and non-experts to explore intricate macromolecular assemblies encompassing various molecule types.

6.
Phys Rev E ; 106(4-1): 044101, 2022 Oct.
Article in English | MEDLINE | ID: mdl-36397524

ABSTRACT

Complex systems are characterized by a tight, nontrivial interplay of their constituents, which gives rise to a multiscale spectrum of emergent properties. In this scenario, it is practically and conceptually difficult to identify those degrees of freedom that mostly determine the behavior of the system and separate them from less prominent players. Here, we tackle this problem making use of three measures of statistical information: resolution, relevance, and mapping entropy. We address the links existing among them, starting from the established relation between resolution and relevance and further developing novel connections between resolution and mapping entropy; by these means we can identify, in a quantitative manner, the number and selection of degrees of freedom of the system that preserve the largest information content about the generative process that underlies an empirical dataset. The method, which is implemented in freely available software, is fully general, as shown through its application to three very diverse systems, namely, a toy model of independent binary spins, a coarse-grained representation of the financial stock market, and a fully atomistic simulation of a protein.
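
For readers unfamiliar with resolution and relevance, the following sketch computes both from an empirical sample of discrete states. The definitions used are assumptions drawn from the resolution-relevance literature rather than from the abstract: resolution is taken as the entropy of the empirical state frequencies, and relevance as the entropy of the distribution of those frequencies. The function name is hypothetical and this is not the authors' software.

```python
import math
from collections import Counter


def resolution_and_relevance(samples):
    """Return (resolution, relevance) in nats for a list of hashable states.

    Assumed definitions, with N samples, k_s occurrences of state s, and
    m_k distinct states observed exactly k times:
      resolution = -sum_s (k_s / N) * ln(k_s / N)
      relevance  = -sum_k (k * m_k / N) * ln(k * m_k / N)
    """
    n = len(samples)
    counts = Counter(samples)                 # k_s for every observed state
    multiplicity = Counter(counts.values())   # m_k for every observed count k
    resolution = -sum((k / n) * math.log(k / n) for k in counts.values())
    relevance = -sum(
        (k * m_k / n) * math.log(k * m_k / n) for k, m_k in multiplicity.items()
    )
    return resolution, relevance


# Hypothetical sample: state "a" seen twice, "b" and "c" once each.
h_s, h_k = resolution_and_relevance(["a", "a", "b", "c"])

# Limiting cases: every sample distinct, and every sample identical.
h_s_fine, h_k_fine = resolution_and_relevance(["a", "b", "c", "d"])
h_s_coarse, h_k_coarse = resolution_and_relevance(["a", "a", "a", "a"])
```

Under these definitions the relevance vanishes at both extremes, when every sample falls in its own state and when all samples fall in the same state, which is why intermediate levels of detail can be singled out as the most informative.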

7.
Eur Phys J B ; 94(10): 204, 2021.
Article in English | MEDLINE | ID: mdl-34720709

ABSTRACT

A mapping of a macromolecule is a prescription to construct a simplified representation of the system in which only a subset of its constituent atoms is retained. As the specific choice of the mapping affects the analysis of all-atom simulations as well as the construction of coarse-grained models, the characterisation of the mapping space has recently attracted increasing attention. We here introduce a notion of scalar product and distance between reduced representations, which allows the study of the metric and topological properties of their space in a quantitative manner. Making use of a Wang-Landau enhanced sampling algorithm, we exhaustively explore such space, and examine the qualitative features of mappings in terms of their squared norm. A one-to-one correspondence with an interacting lattice gas on a finite volume leads to the emergence of discontinuous phase transitions in mapping space, which mark the boundaries between qualitatively different reduced representations of the same molecule.

8.
Front Mol Biosci ; 8: 676976, 2021.
Article in English | MEDLINE | ID: mdl-34164432

ABSTRACT

The ever-increasing computer power, together with the improved accuracy of atomistic force fields, enables researchers to investigate biological systems at the molecular level with remarkable detail. However, the relevant length and time scales of many processes of interest are still hardly within reach even for state-of-the-art hardware, thus leaving important questions often unanswered. The computer-aided investigation of many biological physics problems thus largely benefits from the usage of coarse-grained models, that is, simplified representations of a molecule at a level of resolution that is lower than atomistic. A plethora of coarse-grained models have been developed, which differ most notably in their granularity; this latter aspect determines one of the crucial open issues in the field, i.e. the identification of an optimal degree of coarsening, which enables the greatest simplification at the expense of the smallest information loss. In this review, we present the problem of coarse-grained modeling in biophysics from the viewpoint of system representation and information content. In particular, we discuss two distinct yet complementary aspects of protein modeling: on the one hand, the relationship between the resolution of a model and its capacity to accurately reproduce the properties of interest; on the other hand, the possibility of employing a lower resolution description of a detailed model to extract simple, useful, and intelligible information from the latter.

9.
Front Mol Biosci ; 8: 637396, 2021.
Article in English | MEDLINE | ID: mdl-33996896

ABSTRACT

The limits of molecular dynamics (MD) simulations of macromolecules are steadily pushed forward by the relentless development of computer architectures and algorithms. The consequent explosion in the number and extent of MD trajectories induces the need for automated methods to rationalize the raw data and make quantitative sense of them. Recently, an algorithmic approach was introduced by some of us to identify the subset of a protein's atoms, or mapping, that enables the most informative description of the system. This method relies on the computation, for a given reduced representation, of the associated mapping entropy, that is, a measure of the information loss due to such simplification; albeit relatively straightforward, this calculation can be time-consuming. Here, we describe the implementation of a deep learning approach aimed at accelerating the calculation of the mapping entropy. We rely on Deep Graph Networks, which provide extreme flexibility in handling structured input data and whose predictions prove to be accurate and remarkably efficient. The trained network produces a speedup factor as large as 10^5 with respect to the algorithmic computation of the mapping entropy, enabling the reconstruction of its landscape by means of the Wang-Landau sampling scheme. Applications of this method reach much further than this, as the proposed pipeline is easily transferable to the computation of arbitrary properties of a molecular structure.

10.
J Chem Theory Comput ; 16(11): 6795-6813, 2020 Nov 10.
Article in English | MEDLINE | ID: mdl-33108737

ABSTRACT

In theoretical modeling of a physical system, a crucial step consists of the identification of those degrees of freedom that enable a synthetic yet informative representation of it. While in some cases this selection can be carried out on the basis of intuition and experience, straightforward discrimination of the important features from the negligible ones is difficult for many complex systems, most notably heteropolymers and large biomolecules. We here present a thermodynamics-based theoretical framework to gauge the effectiveness of a given simplified representation by measuring its information content. We employ this method to identify those reduced descriptions of proteins, in terms of a subset of their atoms, that retain the largest amount of information from the original model; we show that these highly informative representations share common features that are intrinsically related to the biological properties of the proteins under examination, thereby establishing a bridge between protein structure, energetics, and function.


Subject(s)
Molecular Models, Proteins/chemistry, Proteins/metabolism, Thermodynamics
11.
Interface Focus ; 9(3): 20190003, 2019 Jun 06.
Article in English | MEDLINE | ID: mdl-31065348

ABSTRACT

Deep learning (DL) algorithms hold great promise for applications in the field of computational biophysics. In fact, the vast amount of available molecular structures, as well as their notable complexity, constitutes an ideal context in which DL-based approaches can be profitably employed. To express the full potential of these techniques, though, it is a prerequisite to express the information contained in a molecule's atomic positions and distances in a set of input quantities that the network can process. Many of the molecular descriptors devised so far are effective and manageable for relatively small structures, but become complex and cumbersome for larger ones. Furthermore, most of them are defined locally, a feature that could represent a limit for those applications where global properties are of interest. Here, we build a DL architecture capable of predicting non-trivial and intrinsically global quantities, that is, the eigenvalues of a protein's lowest-energy fluctuation modes. This application represents a first, relatively simple test bed for the development of a neural network approach to the quantitative analysis of protein structures, and demonstrates unexpected use in the identification of mechanically relevant regions of the molecule.
