1.
Bioinformatics ; 39(9), 2023 09 02.
Article in English | MEDLINE | ID: mdl-37725353

ABSTRACT

MOTIVATION: In the current Big Data era of biomedicine, there is an unmet need to systematically assess experimental observations in the context of available information. This assessment would offer a means for a comprehensive and robust validation of biomedical data results and provide an initial estimate of the potential novelty of the findings. RESULTS: Here we present BQsupports, a web-based tool built upon the Bioteque biomedical descriptors that systematically analyzes and quantifies the current support to a given set of observations. The tool relies on over 1000 distinct types of biomedical descriptors, covering over 11 different biological and chemical entities, including genes, cell lines, diseases, and small molecules. By exploring hundreds of descriptors, BQsupports provides support scores for each observation across a wide variety of biomedical contexts. These scores are then aggregated to summarize the biomedical support of the assessed dataset as a whole. Finally, BQsupports also suggests predictive features of the given dataset, which can be exploited in downstream machine learning applications. AVAILABILITY AND IMPLEMENTATION: The web application and underlying data are available online (https://bqsupports.irbbarcelona.org).


Subject(s)
Machine Learning, Software, Big Data
2.
Nat Commun ; 13(1): 5304, 2022 09 09.
Article in English | MEDLINE | ID: mdl-36085310

ABSTRACT

Biomedical data is accumulating at a fast pace and integrating it into a unified framework is a major challenge, so that multiple views of a given biological event can be considered simultaneously. Here we present the Bioteque, a resource of unprecedented size and scope that contains pre-calculated biomedical descriptors derived from a gigantic knowledge graph, displaying more than 450 thousand biological entities and 30 million relationships between them. The Bioteque integrates, harmonizes, and formats data collected from over 150 data sources, including 12 biological entities (e.g., genes, diseases, drugs) linked by 67 types of associations (e.g., 'drug treats disease', 'gene interacts with gene'). We show how Bioteque descriptors facilitate the assessment of high-throughput protein-protein interactome data, the prediction of drug response and new repurposing opportunities, and demonstrate that they can be used off-the-shelf in downstream machine learning tasks without loss of performance with respect to using original data. The Bioteque thus offers a thoroughly processed, tractable, and highly optimized assembly of the biomedical knowledge available in the public domain.


Subject(s)
Knowledge, Automated Pattern Recognition, Knowledge Bases, Machine Learning, Proteins
3.
Cell Rep Med ; 3(1): 100492, 2022 01 18.
Article in English | MEDLINE | ID: mdl-35106508

ABSTRACT

The Columbia Cancer Target Discovery and Development (CTD2) Center is developing PANACEA, a resource comprising dose-responses and RNA sequencing (RNA-seq) profiles of 25 cell lines perturbed with ∼400 clinical oncology drugs, to study a tumor-specific drug mechanism of action. Here, this resource serves as the basis for a DREAM Challenge assessing the accuracy and sensitivity of computational algorithms for de novo drug polypharmacology predictions. Dose-response and perturbational profiles for 32 kinase inhibitors are provided to 21 teams who are blind to the identity of the compounds. The teams are asked to predict high-affinity binding targets of each compound among ∼1,300 targets cataloged in DrugBank. The best performing methods leverage gene expression profile similarity analysis as well as deep-learning methodologies trained on individual datasets. This study lays the foundation for future integrative analyses of pharmacogenomic data, reconciliation of polypharmacology effects in different tumor contexts, and insights into network-based assessments of drug mechanisms of action.


Subject(s)
Neoplasms/drug therapy, Polypharmacology, Algorithms, Gene Expression Profiling, Neoplastic Gene Expression Regulation, Humans, Computer Neural Networks, Protein Kinases/metabolism, Messenger RNA/genetics, Messenger RNA/metabolism, Genetic Transcription
4.
Nat Commun ; 12(1): 3932, 2021 06 24.
Article in English | MEDLINE | ID: mdl-34168145

ABSTRACT

Chemical descriptors encode the physicochemical and structural properties of small molecules, and they are at the core of chemoinformatics. The broad release of bioactivity data has prompted enriched representations of compounds, reaching beyond chemical structures and capturing their known biological properties. Unfortunately, bioactivity descriptors are not available for most small molecules, which limits their applicability to a few thousand well characterized compounds. Here we present a collection of deep neural networks able to infer bioactivity signatures for any compound of interest, even when little or no experimental information is available for them. Our signaturizers relate to bioactivities of 25 different types (including target profiles, cellular response and clinical outcomes) and can be used as drop-in replacements for chemical descriptors in day-to-day chemoinformatics tasks. Indeed, we illustrate how inferred bioactivity signatures are useful to navigate the chemical space in a biologically relevant manner, unveiling higher-order organization in natural product collections, and to enrich mostly uncharacterized chemical libraries for activity against the drug-orphan target Snail1. Moreover, we implement a battery of signature-activity relationship (SigAR) models and show a substantial improvement in performance, with respect to chemistry-based classifiers, across a series of biophysics and physiology activity prediction benchmarks.


Subject(s)
Small Molecule Libraries/chemistry, Small Molecule Libraries/pharmacology, Structure-Activity Relationship, Tumor Cell Line, Pharmaceutical Databases, Preclinical Drug Evaluation/methods, Humans, Snail Family Transcription Factors/antagonists & inhibitors, Snail Family Transcription Factors/genetics, Snail Family Transcription Factors/metabolism
5.
J Chem Inf Model ; 60(12): 5730-5734, 2020 12 28.
Article in English | MEDLINE | ID: mdl-32672454

ABSTRACT

Until a vaccine becomes available, the current repertoire of drugs is our only therapeutic asset to fight the SARS-CoV-2 outbreak. Indeed, emergency clinical trials have been launched to assess the effectiveness of many marketed drugs, tackling the decrease of viral load through several mechanisms. Here, we present an online resource, based on small-molecule bioactivity signatures and natural language processing, to expand the portfolio of compounds with potential to treat COVID-19. By comparing the set of drugs reported to be potentially active against SARS-CoV-2 to a universe of 1 million bioactive molecules, we identify compounds that display analogous chemical and functional features to the current COVID-19 candidates. Searches can be filtered by level of evidence and mechanism of action, and results can be restricted to drug molecules or include the much broader space of bioactive compounds. Moreover, we allow users to contribute COVID-19 drug candidates, which are automatically incorporated into the pipeline once per day. The computational platform, as well as the source code, is available at https://sbnb.irbbarcelona.org/covid19.


Subject(s)
Antiviral Agents/chemistry, COVID-19 Drug Treatment, Drug Repositioning/methods, SARS-CoV-2/drug effects, Antiviral Agents/pharmacology, Computer Simulation, Drug Design, Humans, Molecular Models, Molecular Structure, Small Molecule Libraries/chemistry, Small Molecule Libraries/pharmacology
6.
Nat Biotechnol ; 38(9): 1098, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32440008

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

7.
Nat Biotechnol ; 38(9): 1087-1096, 2020 09.
Article in English | MEDLINE | ID: mdl-32440005

ABSTRACT

Small molecules are usually compared by their chemical structure, but there is no unified analytic framework for representing and comparing their biological activity. We present the Chemical Checker (CC), which provides processed, harmonized and integrated bioactivity data on ~800,000 small molecules. The CC divides data into five levels of increasing complexity, from the chemical properties of compounds to their clinical outcomes. In between, it includes targets, off-targets, networks and cell-level information, such as omics data, growth inhibition and morphology. Bioactivity data are expressed in a vector format, extending the concept of chemical similarity to similarity between bioactivity signatures. We show how CC signatures can aid drug discovery tasks, including target identification and library characterization. We also demonstrate the discovery of compounds that reverse and mimic biological signatures of disease models and genetic perturbations in cases that could not be addressed using chemical information alone. Overall, the CC signatures facilitate the conversion of bioactivity data to a format that is readily amenable to machine learning methods.


Subject(s)
Pharmaceutical Preparations/metabolism, Small Molecule Libraries/metabolism, Biological Products/chemistry, Biological Products/metabolism, Biological Products/therapeutic use, Pharmacological Biomarkers/metabolism, Factual Databases, Drug Discovery, Drug Therapy, Humans, Pharmaceutical Preparations/chemistry, Small Molecule Libraries/chemistry, Small Molecule Libraries/therapeutic use
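
The Chemical Checker abstract above describes bioactivity data expressed as vectors, so that compounds can be compared by signature similarity rather than by structure alone. A minimal sketch of that idea follows. It assumes cosine similarity as the comparison measure and uses made-up three-dimensional signatures; neither the metric, the function names, nor the compound names come from the abstract.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("signatures must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, library):
    """Return (name, similarity) pairs, most similar signature first."""
    scored = [(name, cosine_similarity(query, sig)) for name, sig in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy signatures; real signatures would have many more dimensions.
library = {
    "compound_a": [1.0, 0.0, 0.5],
    "compound_b": [-1.0, 0.2, -0.4],
    "compound_c": [0.9, 0.1, 0.6],
}
query = [1.0, 0.0, 0.5]
ranking = rank_by_similarity(query, library)
```

Ranking a library against a query signature in this way is the vector analogue of a chemical similarity search; any other vector distance could be substituted without changing the structure of the code.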
8.
Proteins ; 87(12): 1378-1387, 2019 12.
Article in English | MEDLINE | ID: mdl-31571280

ABSTRACT

Critical blind assessment of structure prediction techniques is crucial for the scientific community to establish the state of the art, identify bottlenecks, and guide future developments. In Critical Assessment of Techniques in Structure Prediction (CASP), human experts assess the performance of participating methods in relation to the difficulty of the prediction task in a biennial experiment on approximately 100 targets. Yet, the development of automated computational modeling methods requires more frequent evaluation cycles and larger sets of data. The "Continuous Automated Model EvaluatiOn (CAMEO)" platform complements CASP by conducting fully automated blind prediction evaluations based on the weekly pre-release of sequences of those structures that are going to be published in the next release of the Protein Data Bank (PDB). Each week, CAMEO publishes benchmarking results for predictions corresponding to a set of about 20 targets collected during a 4-day prediction window. CAMEO benchmarking data are generated consistently for all methods at the same point in time, enabling developers to cross-validate their method's performance and to refer to these results in publications. Many successful participants of CASP have used CAMEO, either by directly benchmarking their methods within the system or by comparing their own performance to CAMEO reference data. CAMEO offers a variety of scores reflecting different aspects of structure modeling, for example, binding site accuracy, homo-oligomer interface quality, or accuracy of local model confidence estimates. By introducing the "bestSingleTemplate" method based on structure superpositions as a reference for the accuracy of 3D modeling predictions, CAMEO facilitates objective comparison of techniques and fosters the development of advanced methods.


Subject(s)
Computational Biology, Protein Conformation, Proteins/ultrastructure, Software, Algorithms, Benchmarking, Binding Sites, Protein Databases, Humans, Molecular Models, Protein Folding, Proteins/chemistry, Proteins/genetics, Protein Sequence Analysis
9.
Methods Mol Biol ; 1851: 301-316, 2019.
Article in English | MEDLINE | ID: mdl-30298405

ABSTRACT

Proteins are subject to evolutionary forces that shape their three-dimensional structure to meet specific functional demands. The knowledge of the structure of a protein is therefore instrumental to gain information about the molecular basis of its function. However, experimental structure determination is inherently time consuming and expensive, making it impossible to follow the explosion of sequence data deriving from genome-scale projects. As a consequence, computational structural modeling techniques have received much attention and established themselves as a valuable complement to experimental structural biology efforts. Among these, comparative modeling remains the method of choice to model the three-dimensional structure of a protein when homology to a protein of known structure can be detected. The general strategy consists of using experimentally determined structures of proteins as templates for the generation of three-dimensional models of related family members (targets) of which the structure is unknown. This chapter provides a description of the individual steps needed to obtain a comparative model using SWISS-MODEL, one of the most widely used automated servers for protein structure homology modeling.


Subject(s)
Proteins/chemistry, Computational Biology, Molecular Models, Proteins/classification, Amino Acid Sequence Homology, Protein Structural Homology
10.
J Mol Biol ; 430(21): 4431-4438, 2018 10 19.
Article in English | MEDLINE | ID: mdl-30274705

ABSTRACT

Multi-protein machines are responsible for most cellular tasks, and many efforts have been invested in the systematic identification and characterization of thousands of these macromolecular assemblies. Unfortunately, the (quasi) atomic details necessary to understand their function are available only for a tiny fraction of the known complexes. The computational biology community is developing strategies to integrate structural data of different nature, from electron microscopy to X-ray crystallography, to model large molecular machines, as has been done for individual proteins and interactions with remarkable success. However, unlike for binary interactions, there is no reliable gold-standard set of three-dimensional (3D) complexes to benchmark the performance of these methodologies and detect their limitations. Here, we present a strategy to dynamically generate non-redundant sets of 3D heteromeric complexes with three or more components. By changing the values of sequence identity and component overlap between assemblies required to define complex redundancy, we can create sets of representative complexes with known 3D structure (i.e., target complexes). Using an identity threshold of 20% and imposing a fraction of component overlap of <0.5, we identify 495 unique target complexes, which represent a real non-redundant set of heteromeric assemblies with known 3D structure. Moreover, for each target complex, we also identify a set of assemblies, of varying degrees of identity and component overlap, that can be readily used as input in a complex modeling exercise (i.e., template subcomplexes). We hope that resources like this will significantly help the development and progress assessment of novel methodologies, as docking benchmarks and blind prediction contests did. The interactive resource is accessible at https://DynBench3D.irbbarcelona.org.


Subject(s)
Computational Biology/methods, Multiprotein Complexes/chemistry, Benchmarking, X-Ray Crystallography, Protein Databases, Internet, Electron Microscopy, Molecular Models, Molecular Weight, Software
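
The abstract above defines complex redundancy through a sequence-identity threshold and a component-overlap cutoff (<0.5). The sketch below illustrates one way such a filter could work; it is not the authors' procedure. It assumes that components have already been grouped into sequence clusters (for example at 20% identity), that overlap is measured relative to the smaller complex, and that complexes are picked greedily from largest to smallest. The function names and the toy complexes are hypothetical.

```python
def component_overlap(a, b):
    """Fraction of shared component clusters, relative to the smaller complex.

    This definition is an assumption; the abstract does not spell it out.
    """
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def select_non_redundant(complexes, max_overlap=0.5, min_components=3):
    """Greedily keep complexes whose overlap with every kept complex is below max_overlap."""
    kept = []
    # Larger complexes first; ties broken by name so the result is deterministic.
    ordered = sorted(complexes.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for name, components in ordered:
        if len(components) < min_components:
            continue  # the abstract considers complexes with three or more components
        if all(component_overlap(components, complexes[k]) < max_overlap for k in kept):
            kept.append(name)
    return kept


# Hypothetical complexes, each a set of component cluster identifiers.
complexes = {
    "cx1": {"c1", "c2", "c3", "c4"},
    "cx2": {"c1", "c2", "c3"},        # fully contained in cx1, overlap 1.0
    "cx3": {"c4", "c5", "c6", "c7"},  # shares one of four components with cx1, overlap 0.25
    "cx4": {"c8", "c9"},              # fewer than three components
}
targets = select_non_redundant(complexes)
```

Raising or lowering `max_overlap` changes how many representatives survive, which mirrors the adjustable redundancy definition described in the abstract.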
11.
Nucleic Acids Res ; 46(W1): W296-W303, 2018 07 02.
Article in English | MEDLINE | ID: mdl-29788355

ABSTRACT

Homology modelling has matured into an important technique in structural biology, significantly contributing to narrowing the gap between known protein sequences and experimentally determined structures. Fully automated workflows and servers simplify and streamline the homology modelling process, also allowing users without specific computational expertise to generate reliable protein models and have easy access to modelling results, their visualization and interpretation. Here, we present an update to the SWISS-MODEL server, which pioneered the field of automated modelling 25 years ago and has been continuously developed since. Recently, its functionality has been extended to the modelling of homo- and heteromeric complexes. Starting from the amino acid sequences of the interacting proteins, both the stoichiometry and the overall structure of the complex are inferred by homology modelling. Other major improvements include the implementation of a new modelling engine, ProMod3, and the introduction of a new local model quality estimation method, QMEANDisCo. SWISS-MODEL is freely available at https://swissmodel.expasy.org.


Subject(s)
Internet, Protein Conformation, Proteins/genetics, Software, Protein Databases, Chemical Models, Molecular Dynamics Simulation, Proteins/chemistry, Amino Acid Sequence Homology, Protein Structural Homology
12.
Proteins ; 86 Suppl 1: 387-398, 2018 03.
Article in English | MEDLINE | ID: mdl-29178137

ABSTRACT

Every second year, the community experiment "Critical Assessment of Techniques for Structure Prediction" (CASP) conducts an independent blind assessment of structure prediction methods, providing a framework for comparing the performance of different approaches and discussing the latest developments in the field. Yet, developers of automated computational modeling methods clearly benefit from more frequent evaluations based on larger sets of data. The "Continuous Automated Model EvaluatiOn (CAMEO)" platform complements the CASP experiment by conducting fully automated blind prediction assessments based on the weekly pre-release of sequences of those structures that are going to be published in the next release of the Protein Data Bank (PDB). CAMEO publishes weekly benchmarking results based on models collected during a 4-day prediction window, on average assessing ca. 100 targets during a time frame of 5 weeks. CAMEO benchmarking data are generated consistently for all participating methods at the same point in time, enabling developers to benchmark and cross-validate their method's performance, and directly refer to the benchmarking results in publications. In order to facilitate server development and promote shorter release cycles, CAMEO sends weekly emails with submission statistics and low-performance warnings. Many participants of CASP have successfully employed CAMEO when preparing their methods for upcoming community experiments. CAMEO offers a variety of scores to allow benchmarking diverse aspects of structure prediction methods. By introducing new scoring schemes, CAMEO facilitates new development in areas of active research, for example, modeling quaternary structure, complexes, or ligand binding sites.


Subject(s)
Computational Biology/methods, Molecular Models, Protein Conformation, Proteins/chemistry, Proteins/metabolism, Protein Sequence Analysis/methods, Binding Sites, Protein Databases, Humans, Ligands, Protein Binding
13.
Proteins ; 86 Suppl 1: 247-256, 2018 03.
Article in English | MEDLINE | ID: mdl-29071742

ABSTRACT

We present the results of the first independent assessment of protein assemblies in CASP. A total of 1624 oligomeric models were submitted by 108 predictor groups for the 30 oligomeric targets in the CASP12 edition. We evaluated the accuracy of oligomeric predictions by comparison to their reference structures at the interface patch and residue contact levels. We find that interface patches are more reliably predicted than the specific residue contacts. Whereas none of the 15 hard oligomeric targets have successful predictions for the residue contacts at the interface, six have models with resemblance in the interface patch. Successful predictions of interface patch and contacts exist for all targets suitable for homology modeling, with at least one group improving over the best available template for each target. However, the participation in protein assembly prediction is low and uneven. Three human groups are closely ranked at the top by overall performance, but a server outperforms all other predictors for targets suitable for homology modeling. The state of the art of protein assembly prediction methods is in development and has apparent room for improvement, especially for assemblies without templates.


Subject(s)
Computational Biology/methods, Protein Databases, Molecular Models, Molecular Dynamics Simulation, Protein Conformation, Proteins/chemistry, Algorithms, Humans, Protein Folding, Protein Sequence Analysis
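
The CASP12 assembly abstract above compares models to reference structures at two levels, the interface patch and the residue contacts, and reports that patches are predicted more reliably. The sketch below shows why the two levels can disagree. It assumes a Jaccard index over interface residues for the patch level and an F1 score over residue pairs for the contact level; these scoring choices, the function names, and the residue labels are illustrative assumptions, not details given in the abstract.

```python
def jaccard(a, b):
    """Jaccard index of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def f1_score(predicted, reference):
    """Harmonic mean of precision and recall over two sets of items."""
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


def patch_residues(contacts):
    """Interface patch: every residue that takes part in at least one contact."""
    return {residue for pair in contacts for residue in pair}


# Hypothetical contacts between chains A and B, each an unordered residue pair.
reference = {frozenset({"A:10", "B:20"}), frozenset({"A:11", "B:21"})}
model = {frozenset({"A:10", "B:21"}), frozenset({"A:11", "B:20"})}

patch_score = jaccard(patch_residues(model), patch_residues(reference))
contact_score = f1_score(model, reference)
```

In this toy case the model places the right residues at the interface but pairs them incorrectly, so the patch-level score is perfect while the contact-level score is zero.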
14.
Sci Rep ; 7(1): 10480, 2017 09 05.
Article in English | MEDLINE | ID: mdl-28874689

ABSTRACT

Cellular processes often depend on interactions between proteins and the formation of macromolecular complexes. The impairment of such interactions can lead to deregulation of pathways resulting in disease states, and it is hence crucial to gain insights into the nature of macromolecular assemblies. Detailed structural knowledge about complexes and protein-protein interactions is growing, but experimentally determined three-dimensional multimeric assemblies are outnumbered by complexes supported by non-structural experimental evidence. Here, we aim to fill this gap by modeling multimeric structures by homology, only using amino acid sequences to infer the stoichiometry and the overall structure of the assembly. We ask which properties of proteins within a family can assist in the prediction of correct quaternary structure. Specifically, we introduce a description of protein-protein interface conservation as a function of evolutionary distance to reduce the noise in deep multiple sequence alignments. We also define a distance measure to structurally compare homologous multimeric protein complexes. This allows us to hierarchically cluster protein structures and quantify the diversity of alternative biological assemblies known today. We find that a combination of conservation scores, structural clustering, and classical interface descriptors, can improve the selection of homologous protein templates leading to reliable models of protein complexes.


Subject(s)
Protein Multimerization, Protein Sequence Analysis/methods, Animals, Fructose-Bisphosphate Aldolase/chemistry, Humans, Protein Binding, Protein Conformation, Amino Acid Sequence Homology
15.
Nucleic Acids Res ; 42(Web Server issue): W252-8, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24782522

ABSTRACT

Protein structure homology modelling has become a routine technique to generate 3D models for proteins when experimental structures are not available. Fully automated servers such as SWISS-MODEL with user-friendly web interfaces generate reliable models without the need for complex software packages or downloading large databases. Here, we describe the latest version of the SWISS-MODEL expert system for protein structure modelling. The SWISS-MODEL template library provides annotation of quaternary structure and essential ligands and co-factors to allow for building of complete structural models, including their oligomeric structure. The improved SWISS-MODEL pipeline makes extensive use of model quality estimation for selection of the most suitable templates and provides estimates of the expected accuracy of the resulting models. The accuracy of the models generated by SWISS-MODEL is continuously evaluated by the CAMEO system. The new web site allows users to interactively search for templates, cluster them by sequence similarity, structurally compare alternative templates and select the ones to be used for model building. In cases where multiple alternative template structures are available for a protein of interest, a user-guided template selection step allows building models in different functional states. SWISS-MODEL is available at http://swissmodel.expasy.org/.


Subject(s)
Molecular Models, Quaternary Protein Structure, Tertiary Protein Structure, Software, Protein Structural Homology, Molecular Evolution, Internet
...