ABSTRACT
The causative agent of the coronavirus disease 2019 (COVID-19) pandemic, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has infected millions and killed hundreds of thousands of people worldwide, highlighting an urgent need to develop antiviral therapies. Here we present a quantitative mass spectrometry-based phosphoproteomics survey of SARS-CoV-2 infection in Vero E6 cells, revealing dramatic rewiring of phosphorylation on host and viral proteins. SARS-CoV-2 infection promoted casein kinase II (CK2) and p38 MAPK activation, production of diverse cytokines, and shutdown of mitotic kinases, resulting in cell cycle arrest. Infection also stimulated a marked induction of CK2-containing filopodial protrusions possessing budding viral particles. Eighty-seven drugs and compounds were identified by mapping global phosphorylation profiles to dysregulated kinases and pathways. We found pharmacologic inhibition of the p38, CK2, CDK, AXL, and PIKFYVE kinases to possess antiviral efficacy, representing potential COVID-19 therapies.
Subject(s)
Betacoronavirus/metabolism , Coronavirus Infections/metabolism , Drug Evaluation, Preclinical/methods , Pneumonia, Viral/metabolism , Proteomics/methods , A549 Cells , Angiotensin-Converting Enzyme 2 , Animals , Antiviral Agents/pharmacology , COVID-19 , Caco-2 Cells , Casein Kinase II/antagonists & inhibitors , Casein Kinase II/metabolism , Chlorocebus aethiops , Coronavirus Infections/virology , Cyclin-Dependent Kinases/antagonists & inhibitors , Cyclin-Dependent Kinases/metabolism , HEK293 Cells , Host-Pathogen Interactions , Humans , Pandemics , Peptidyl-Dipeptidase A/genetics , Peptidyl-Dipeptidase A/metabolism , Phosphatidylinositol 3-Kinases/metabolism , Phosphoinositide-3 Kinase Inhibitors/pharmacology , Phosphorylation , Pneumonia, Viral/virology , Protein Kinase Inhibitors/pharmacology , Proto-Oncogene Proteins/antagonists & inhibitors , Proto-Oncogene Proteins/metabolism , Receptor Protein-Tyrosine Kinases/antagonists & inhibitors , Receptor Protein-Tyrosine Kinases/metabolism , SARS-CoV-2 , Spike Glycoprotein, Coronavirus/metabolism , Vero Cells , p38 Mitogen-Activated Protein Kinases/antagonists & inhibitors , p38 Mitogen-Activated Protein Kinases/metabolism , Axl Receptor Tyrosine Kinase
ABSTRACT
ChEMBL (https://www.ebi.ac.uk/chembl/) is a manually curated, high-quality, large-scale, open, FAIR and Global Core Biodata Resource of bioactive molecules with drug-like properties, previously described in the 2012, 2014, 2017 and 2019 Nucleic Acids Research Database Issues. Since its introduction in 2009, ChEMBL's content has changed dramatically in size and diversity of data types. Through incorporation of multiple new datasets from depositors since the 2019 update, ChEMBL now contains slightly more bioactivity data from deposited data vs data extracted from literature. In collaboration with the EUbOPEN consortium, chemical probe data is now regularly deposited into ChEMBL. Release 27 made curated data available for compounds screened for potential anti-SARS-CoV-2 activity from several large-scale drug repurposing screens. In addition, new patent bioactivity data have been added to the latest ChEMBL releases, and various new features have been incorporated, including a Natural Product likeness score, updated flags for Natural Products, a new flag for Chemical Probes, and the initial annotation of the action type for ~270 000 bioactivity measurements.
Subject(s)
Drug Discovery , Databases, Factual , Time Factors
ABSTRACT
The safety of marketed drugs is an ongoing concern, with some of the more frequently prescribed medicines resulting in serious or life-threatening adverse effects in some patients. Safety-related information for approved drugs has been curated to include the assignment of toxicity class(es) based on their withdrawn status and/or black box warning information described on medicinal product labels. The ChEMBL resource contains a wide range of bioactivity data types, from early "Discovery" stage preclinical data for individual compounds through to postclinical data on marketed drugs; the inclusion of the curated drug safety data set within this framework can support a wide range of safety-related drug discovery questions. The curated drug safety data set will be made freely available through ChEMBL and updated in future database releases.
Subject(s)
Pharmaceutical Preparations/chemistry , Data Curation , Drug Approval , Drug-Related Side Effects and Adverse Reactions , Humans , Models, Molecular
ABSTRACT
ChEMBL is a large, open-access bioactivity database (https://www.ebi.ac.uk/chembl), previously described in the 2012, 2014 and 2017 Nucleic Acids Research Database Issues. In the last two years, several important improvements have been made to the database and are described here. These include more robust capture and representation of assay details; a new data deposition system, allowing updating of data sets and deposition of supplementary data; and a completely redesigned web interface, with enhanced search and filtering capabilities.
Subject(s)
Databases, Pharmaceutical , Drug Discovery , Biological Assay , Periodicals as Topic , User-Computer Interface
ABSTRACT
ChEMBL is an open large-scale bioactivity database (https://www.ebi.ac.uk/chembl), previously described in the 2012 and 2014 Nucleic Acids Research Database Issues. Since then, alongside the continued extraction of data from the medicinal chemistry literature, new sources of bioactivity data have also been added to the database. These include: deposited data sets from neglected disease screening; crop protection data; drug metabolism and disposition data and bioactivity data from patents. A number of improvements and new features have also been incorporated. These include the annotation of assays and targets using ontologies, the inclusion of targets and indications for clinical candidates, addition of metabolic pathways for drugs and calculation of structural alerts. The ChEMBL data can be accessed via a web-interface, RDF distribution, data downloads and RESTful web-services.
Subject(s)
Databases, Chemical , Databases, Nucleic Acid , Search Engine , Computational Biology/methods , Crop Protection , Drug Discovery , Gene Ontology , Humans , Molecular Sequence Annotation , Pharmacology/methods , User-Computer Interface , Web Browser
ABSTRACT
The 'druggable genome' encompasses several protein families, but only a subset of targets within them have attracted significant research attention and thus have information about them publicly available. The Illuminating the Druggable Genome (IDG) program, initiated in 2014, has the goal of developing experimental techniques and a Knowledge Management Center (KMC) that would collect and organize information about protein targets from four families, representing the most common druggable targets, with an emphasis on understudied proteins. Here, we describe two resources developed by the KMC: the Target Central Resource Database (TCRD), which collates many heterogeneous gene/protein datasets, and Pharos (https://pharos.nih.gov), a multimodal web interface that presents the data from TCRD. We briefly describe the types and sources of data considered by the KMC and then highlight features of the Pharos interface designed to enable intuitive access to the IDG knowledgebase. The aim of Pharos is to encourage 'serendipitous browsing', whereby related, relevant information is made easily discoverable. We conclude by describing two use cases that highlight the utility of Pharos and TCRD.
Subject(s)
Databases, Genetic , Drug Discovery , Genomics , Pharmacogenetics , Search Engine , Cluster Analysis , Computational Biology/methods , Drug Discovery/methods , Genomics/methods , Humans , Obesity/drug therapy , Obesity/genetics , Obesity/metabolism , Pharmacogenetics/methods , Software , Web Browser
ABSTRACT
We have designed and developed a data integration and visualization platform that provides evidence about the association of known and potential drug targets with diseases. The platform is designed to support identification and prioritization of biological targets for follow-up. Each drug target is linked to a disease using integrated genome-wide data from a broad range of data sources. The platform provides either a target-centric workflow to identify diseases that may be associated with a specific target, or a disease-centric workflow to identify targets that may be associated with a specific disease. Users can easily transition between these target- and disease-centric workflows. The Open Targets Validation Platform is accessible at https://www.targetvalidation.org.
Subject(s)
Computational Biology/methods , Molecular Targeted Therapy , Search Engine , Software , Databases, Factual , Humans , Molecular Targeted Therapy/methods , Reproducibility of Results , Web Browser , Workflow
ABSTRACT
SureChEMBL is a publicly available large-scale resource containing compounds extracted from the full text, images and attachments of patent documents. The data are extracted from the patent literature according to an automated text and image-mining pipeline on a daily basis. SureChEMBL provides access to a previously unavailable, open and timely set of annotated compound-patent associations, complemented with sophisticated combined structure and keyword-based search capabilities against the compound repository and patent document corpus; given the wealth of knowledge hidden in patent documents, analysis of SureChEMBL data has immediate applications in drug discovery, medicinal chemistry and other commercial areas of chemical science. Currently, the database contains 17 million compounds extracted from 14 million patent documents. Access is available through a dedicated web-based interface and data downloads at: https://www.surechembl.org/.
Subject(s)
Databases, Chemical , Patents as Topic , Data Mining , Pharmaceutical Preparations/chemistry
ABSTRACT
ChEMBL is now a well-established resource in the fields of drug discovery and medicinal chemistry research. The ChEMBL database curates and stores standardized bioactivity, molecule, target and drug data extracted from multiple sources, including the primary medicinal chemistry literature. Programmatic access to ChEMBL data has been improved by a recent update to the ChEMBL web services (version 2.0.x, https://www.ebi.ac.uk/chembl/api/data/docs), which exposes significantly more data from the underlying database and introduces new functionality. To complement the data-focused services, a utility service (version 1.0.x, https://www.ebi.ac.uk/chembl/api/utils/docs), which provides RESTful access to commonly used cheminformatics methods, has also been concurrently developed. The ChEMBL web services can be used together or independently to build applications and data processing workflows relevant to drug discovery and chemical biology.
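As a rough illustration of the RESTful access described above, the following sketch only assembles request URLs for the data web service; it does not contact the server. The base path comes from the documentation URL quoted in the abstract. The resource names (`molecule`, `activity`), the `.json` suffix, the filter names and the example identifiers are assumptions made for illustration and should be checked against the service documentation. `build_data_url` is a hypothetical helper, not part of any ChEMBL client.

```python
from urllib.parse import urlencode

# Base path derived from the documentation URL quoted in the abstract.
DATA_BASE = "https://www.ebi.ac.uk/chembl/api/data"


def build_data_url(resource, chembl_id=None, fmt="json", **filters):
    """Build a request URL for one resource of the data web service.

    `resource` (e.g. "molecule", "activity") and the filter names are
    assumptions for illustration; verify them against the service docs.
    """
    path = f"{DATA_BASE}/{resource}"
    if chembl_id is not None:
        # Address a single record by its identifier.
        path += f"/{chembl_id}"
    url = f"{path}.{fmt}"
    if filters:
        # Sort the filters so the generated URL is deterministic.
        url += "?" + urlencode(sorted(filters.items()))
    return url


if __name__ == "__main__":
    # Single record lookup (identifier is only an example).
    print(build_data_url("molecule", "CHEMBL25"))
    # Filtered list query (filter names are assumed, not from the abstract).
    print(build_data_url("activity", target_chembl_id="CHEMBL240", limit=5))
```

The resulting strings can be passed to any HTTP client; keeping URL construction separate from the network call makes a workflow easier to test offline.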
Subject(s)
Databases, Chemical , Drug Discovery , Internet , Systems Integration , User-Computer Interface
ABSTRACT
The IntAct molecular interaction database has created a new, free, open-source, manually curated resource, the Complex Portal (www.ebi.ac.uk/intact/complex), through which protein complexes from major model organisms are being collated and made available for search, viewing and download. It has been built in close collaboration with other bioinformatics services and populated with data from ChEMBL, MatrixDB, PDBe, Reactome and UniProtKB. Each entry contains information about the participating molecules (including small molecules and nucleic acids), their stoichiometry, topology and structural assembly. Complexes are annotated with details about their function, properties and complex-specific Gene Ontology (GO) terms. Consistent nomenclature is used throughout the resource with systematic names, recommended names and a list of synonyms all provided. The use of the Evidence Code Ontology allows us to indicate for which entries direct experimental evidence is available or if the complex has been inferred based on homology or orthology. The data are searchable using standard identifiers, such as UniProt, ChEBI and GO IDs, protein, gene and complex names or synonyms. This reference resource will be maintained and grow to encompass an increasing number of organisms. Input from groups and individuals with specific areas of expertise is welcome.
Subject(s)
Databases, Protein , Proteins/chemistry , Animals , Binding Sites , Humans , Internet , Macromolecular Substances/chemistry , Mice , Protein Binding , Proteins/genetics , Proteins/metabolism
ABSTRACT
PPDMs is a resource that maps small molecule bioactivities to protein domains from the Pfam-A collection of protein families. Small molecule bioactivities mapped to protein domains add important precision to approaches that use protein sequence searches and alignments to assist applications in computational drug discovery and systems and chemical biology. We have previously proposed a mapping heuristic for a subset of bioactivities stored in ChEMBL with the Pfam-A domain most likely to mediate small molecule binding. We have since refined this mapping using a manual procedure. Here, we present a resource that provides up-to-date mappings and the possibility to review assigned mappings as well as to participate in their assignment and curation. We also describe how mappings provided through the PPDMs resource are made accessible through the main schema of the ChEMBL database. AVAILABILITY AND IMPLEMENTATION: The PPDMs resource and curation interface is available at https://www.ebi.ac.uk/chembl/research/ppdms/pfam_maps. The source code for PPDMs is available under the Apache license at https://github.com/chembl/pfam_maps. Source code is available at https://github.com/chembl/pfam_map_loader to demonstrate the integration process with the main schema of ChEMBL.
Subject(s)
Databases, Chemical , Databases, Protein , Drug Discovery/methods , Proteins/chemistry , Small Molecule Libraries/pharmacology , Software , Humans , Protein Structure, Tertiary , Small Molecule Libraries/chemistry
ABSTRACT
ChEMBL is an open large-scale bioactivity database (https://www.ebi.ac.uk/chembl), previously described in the 2012 Nucleic Acids Research Database Issue. Since then, a variety of new data sources and improvements in functionality have contributed to the growth and utility of the resource. In particular, more comprehensive tracking of compounds from research stages through clinical development to market is provided through the inclusion of data from United States Adopted Name applications; a new richer data model for representing drug targets has been developed; and a number of methods have been put in place to allow users to more easily identify reliable data. Finally, access to ChEMBL is now available via a new Resource Description Framework format, in addition to the web-based interface, data downloads and web services.
Subject(s)
Databases, Chemical , Drug Discovery , Binding Sites , Humans , Internet , Ligands , Pharmaceutical Preparations/chemistry , Proteins/chemistry , Proteins/drug effects
ABSTRACT
MOTIVATION: Resource description framework (RDF) is an emerging technology for describing, publishing and linking life science data. As a major provider of bioinformatics data and services, the European Bioinformatics Institute (EBI) is committed to making data readily accessible to the community in ways that meet existing demand. The EBI RDF platform has been developed to meet an increasing demand to coordinate RDF activities across the institute and provides a new entry point to querying and exploring integrated resources available at the EBI.
Subject(s)
Computational Biology/methods , Databases, Genetic , Academies and Institutes , Biomedical Research , Internet
ABSTRACT
Allosteric modulators are ligands for proteins that exert their effects via a different binding site than the natural (orthosteric) ligand site and hence form a conceptually distinct class of ligands for a target of interest. Here, the physicochemical and structural features of a large set of allosteric and non-allosteric ligands from the ChEMBL database of bioactive molecules are analyzed. In general allosteric modulators are relatively smaller, more lipophilic and more rigid compounds, though large differences exist between different targets and target classes. Furthermore, there are differences in the distribution of targets that bind these allosteric modulators. Allosteric modulators are over-represented in membrane receptors, ligand-gated ion channels and nuclear receptor targets, but are underrepresented in enzymes (primarily proteases and kinases). Moreover, allosteric modulators tend to bind to their targets with a slightly lower potency (5.96 log units versus 6.66 log units, p<0.01). However, this lower absolute affinity is compensated by their lower molecular weight and more lipophilic nature, leading to similar binding efficiency and surface efficiency indices. Subsequently a series of classifier models are trained, initially target class independent models followed by finer-grained target (architecture/functional class) based models using the target hierarchy of the ChEMBL database. Applications of these insights include the selection of likely allosteric modulators from existing compound collections, the design of novel chemical libraries biased towards allosteric regulators and the selection of targets potentially likely to yield allosteric modulators on screening. All data sets used in the paper are available for download.
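The compensation between lower potency and lower molecular weight can be made concrete with a small calculation. The sketch below assumes the commonly used definition of the binding efficiency index, potency in negative log10 molar units divided by molecular weight in kDa; the abstract does not spell out the formula. The two potencies are the means quoted above, while the molecular weights are hypothetical values chosen only to show the effect and are not taken from the paper.

```python
def binding_efficiency_index(p_activity, mol_weight_da):
    """Return potency (negative log10 molar units) per kDa of molecular weight.

    This is the commonly used BEI definition, assumed here for illustration.
    """
    if mol_weight_da <= 0:
        raise ValueError("molecular weight must be positive")
    return p_activity / (mol_weight_da / 1000.0)


# Mean potencies quoted in the abstract (log units).
ALLOSTERIC_POTENCY = 5.96
NON_ALLOSTERIC_POTENCY = 6.66

# Hypothetical molecular weights in Da, chosen only to illustrate the effect.
ALLOSTERIC_MW = 340.0
NON_ALLOSTERIC_MW = 380.0

bei_allosteric = binding_efficiency_index(ALLOSTERIC_POTENCY, ALLOSTERIC_MW)
bei_non_allosteric = binding_efficiency_index(NON_ALLOSTERIC_POTENCY, NON_ALLOSTERIC_MW)

if __name__ == "__main__":
    print(f"allosteric BEI:     {bei_allosteric:.2f}")
    print(f"non-allosteric BEI: {bei_non_allosteric:.2f}")
```

With these assumed weights, a potency gap of 0.7 log units is offset by a difference of about 40 Da, which is the kind of trade-off the abstract describes.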
Subject(s)
Models, Chemical , Allosteric Regulation , Databases, Chemical , Ligands , Molecular Weight
ABSTRACT
The emergence of a number of publicly available bioactivity databases, such as ChEMBL, PubChem BioAssay and BindingDB, has raised awareness about the topics of data curation, quality and integrity. Here we provide an overview and discussion of the current and future approaches to activity, assay and target data curation of the ChEMBL database. This curation process involves several manual and automated steps and aims to: (1) maximise data accessibility and comparability; (2) improve data integrity and flag outliers, ambiguities and potential errors; and (3) add further curated annotations and mappings thus increasing the usefulness and accuracy of the ChEMBL data for all users and modellers in particular. Issues related to activity, assay and target data curation and integrity along with their potential impact for users of the data are discussed, alongside robust selection and filter strategies in order to avoid or minimise these, depending on the desired application.
Subject(s)
Biological Assay , Data Accuracy , Databases, Chemical , Data Curation/standards , Databases, Chemical/standards , Databases, Factual , Inhibitory Concentration 50
ABSTRACT
There is a wealth of valuable chemical information in publicly available databases for use by scientists undertaking drug discovery. However, finite curation resources, limitations of chemical structure software and differences in individual database applications mean that exact chemical structure equivalence between databases is unlikely to ever be a reality. The ability to identify compound equivalence has been made significantly easier by the use of the International Chemical Identifier (InChI), a non-proprietary line notation for describing a chemical structure. More importantly, advances in methods to identify compounds that are the same at various levels of similarity, such as those containing the same parent component or having the same connectivity, are now enabling related compounds to be linked between databases where the structure matches are not exact.
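One way to picture matching "at various levels" is to compare standard InChIKeys, the hashed form of the InChI. In a standard InChIKey the first hyphen-separated block hashes the molecular skeleton, so two keys that share it but differ afterwards describe compounds with the same connectivity and different stereochemistry or isotopes. The sketch below uses that property; `match_level` is a hypothetical helper, the InChIKeys are illustrative inputs, and matching on parent component (for example after salt stripping) would need a cheminformatics toolkit and is not covered.

```python
def match_level(key_a, key_b):
    """Classify two standard InChIKeys as 'exact', 'connectivity' or 'none'."""
    if key_a == key_b:
        return "exact"
    # The first block of the key hashes the skeleton (connectivity) layers.
    if key_a.split("-")[0] == key_b.split("-")[0]:
        return "connectivity"
    return "none"


# Illustrative InChIKeys; verify against a trusted source before relying on them.
L_ALANINE = "QNAYBMKLOCPYGJ-REOHSQNJSA-N"
D_ALANINE = "QNAYBMKLOCPYGJ-UWTATZPHSA-N"
ASPIRIN = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"

if __name__ == "__main__":
    print(match_level(L_ALANINE, L_ALANINE))
    print(match_level(L_ALANINE, D_ALANINE))
    print(match_level(L_ALANINE, ASPIRIN))
```

Because the key is a hash, a shared first block is strong evidence of shared connectivity and not a proof, so links made this way are best treated as candidates for review.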
Subject(s)
Databases, Chemical , Drug Discovery , Molecular Structure , Software
ABSTRACT
Currently, there are more than 800 well-characterized human membrane transport proteins (including channels and transporters) and there are estimates that about 10% (approx. 2000) of all human genes are related to transport. Membrane transport proteins are of interest as potential drug targets, for drug delivery, and as a cause of side effects and drug-drug interactions. In light of the development of Open PHACTS, which provides an open pharmacological space, we analyzed selected membrane transport protein classification schemes (Transporter Classification Database, ChEMBL, IUPHAR/BPS Guide to Pharmacology, and Gene Ontology) for their ability to serve as a basis for pharmacology-driven protein classification. A comparison of these membrane transport protein classification schemes, using a set of clinically relevant transporters as a use case, reveals the strengths and weaknesses of the different taxonomy approaches.
Subject(s)
Databases, Pharmaceutical , Databases, Protein , Membrane Transport Proteins/chemistry , Membrane Transport Proteins/classification , Classification , Drug Discovery , Gene Ontology , Humans , Membrane Transport Proteins/genetics
ABSTRACT
Transport proteins represent an eminent class of drug targets and ADMET (absorption, distribution, metabolism, excretion, toxicity) associated genes. There is a large number of distinct activity assays for transport proteins, owing not only to the measurement needed (e.g. transport activity, strength of ligand-protein interaction) but also to the heterogeneous assay setups used by different research groups. Efforts to systematically organize these (divergent) bioassay data have large potential impact in public-private partnerships and conventional commercial drug discovery. In this short review, we highlight some of the frequently used high-throughput assays for transport proteins, and we discuss emerging assay ontologies and their application to this field. Focusing on human P-glycoprotein (Multidrug resistance protein 1; gene name: ABCB1, MDR1), we exemplify how annotation of bioassay data per target class could improve and add to existing ontologies, and we propose to include an additional layer of metadata supporting data fusion across different bioassays.
Subject(s)
Biological Ontologies , Drug Discovery/methods , High-Throughput Screening Assays , Membrane Transport Proteins , Membrane Transport Proteins/chemistry , Membrane Transport Proteins/classification , Membrane Transport Proteins/metabolism , Pharmaceutical Preparations/chemistry , Pharmaceutical Preparations/metabolism
ABSTRACT
ChEMBL is an Open Data database containing binding, functional and ADMET information for a large number of drug-like bioactive compounds. These data are manually abstracted from the primary published literature on a regular basis, then further curated and standardized to maximize their quality and utility across a wide range of chemical biology and drug-discovery research problems. Currently, the database contains 5.4 million bioactivity measurements for more than 1 million compounds and 5200 protein targets. Access is available through a web-based interface, data downloads and web services at: https://www.ebi.ac.uk/chembldb.
Subject(s)
Databases, Factual , Drug Discovery , Databases, Protein , Humans , Pharmaceutical Preparations/chemistry , Proteins/chemistry , Proteins/metabolism , User-Computer Interface
ABSTRACT
The patent literature is a potentially valuable source of bioactivity data. In this article we describe a process to prioritise 3.7 million life science relevant patents obtained from the SureChEMBL database (https://www.surechembl.org/), according to how likely they were to contain bioactivity data for potent small molecules on less-studied targets, based on the classification developed by the Illuminating the Druggable Genome (IDG) project. The overall goal was to select a smaller number of patents that could be manually curated and incorporated into the ChEMBL database. Using relatively simple annotation and filtering pipelines, we have been able to identify a substantial number of patents containing quantitative bioactivity data for understudied targets that had not previously been reported in the peer-reviewed medicinal chemistry literature. We quantify the added value of such methods in terms of the numbers of targets that are so identified, and provide some specific illustrative examples. Our work underlines the potential value in searching the patent corpus in addition to the more traditional peer-reviewed literature. The small molecules found in these patents, together with their measured activity against the targets, are now accessible via the ChEMBL database.