Search | VHL CLAP/WR-PAHO/WHO

1.

The Global Phosphorylation Landscape of SARS-CoV-2 Infection.

Bouhaddou, Mehdi; Memon, Danish; Meyer, Bjoern; White, Kris M; Rezelj, Veronica V; Correa Marrero, Miguel; Polacco, Benjamin J; Melnyk, James E; Ulferts, Svenja; Kaake, Robyn M; Batra, Jyoti; Richards, Alicia L; Stevenson, Erica; Gordon, David E; Rojc, Ajda; Obernier, Kirsten; Fabius, Jacqueline M; Soucheray, Margaret; Miorin, Lisa; Moreno, Elena; Koh, Cassandra; Tran, Quang Dinh; Hardy, Alexandra; Robinot, Rémy; Vallet, Thomas; Nilsson-Payant, Benjamin E; Hernandez-Armenta, Claudia; Dunham, Alistair; Weigang, Sebastian; Knerr, Julian; Modak, Maya; Quintero, Diego; Zhou, Yuan; Dugourd, Aurelien; Valdeolivas, Alberto; Patil, Trupti; Li, Qiongyu; Hüttenhain, Ruth; Cakir, Merve; Muralidharan, Monita; Kim, Minkyu; Jang, Gwendolyn; Tutuncuoglu, Beril; Hiatt, Joseph; Guo, Jeffrey Z; Xu, Jiewei; Bouhaddou, Sophia; Mathy, Christopher J P; Gaulton, Anna; Manners, Emma J.

Cell ; 182(3): 685-712.e19, 2020 08 06.

Article in English | MEDLINE | ID: mdl-32645325

ABSTRACT

The causative agent of the coronavirus disease 2019 (COVID-19) pandemic, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has infected millions and killed hundreds of thousands of people worldwide, highlighting an urgent need to develop antiviral therapies. Here we present a quantitative mass spectrometry-based phosphoproteomics survey of SARS-CoV-2 infection in Vero E6 cells, revealing dramatic rewiring of phosphorylation on host and viral proteins. SARS-CoV-2 infection promoted casein kinase II (CK2) and p38 MAPK activation, production of diverse cytokines, and shutdown of mitotic kinases, resulting in cell cycle arrest. Infection also stimulated a marked induction of CK2-containing filopodial protrusions possessing budding viral particles. Eighty-seven drugs and compounds were identified by mapping global phosphorylation profiles to dysregulated kinases and pathways. We found pharmacologic inhibition of the p38, CK2, CDK, AXL, and PIKFYVE kinases to possess antiviral efficacy, representing potential COVID-19 therapies.

Subject(s)

Betacoronavirus/metabolism , Coronavirus Infections/metabolism , Drug Evaluation, Preclinical/methods , Pneumonia, Viral/metabolism , Proteomics/methods , A549 Cells , Angiotensin-Converting Enzyme 2 , Animals , Antiviral Agents/pharmacology , COVID-19 , Caco-2 Cells , Casein Kinase II/antagonists & inhibitors , Casein Kinase II/metabolism , Chlorocebus aethiops , Coronavirus Infections/virology , Cyclin-Dependent Kinases/antagonists & inhibitors , Cyclin-Dependent Kinases/metabolism , HEK293 Cells , Host-Pathogen Interactions , Humans , Pandemics , Peptidyl-Dipeptidase A/genetics , Peptidyl-Dipeptidase A/metabolism , Phosphatidylinositol 3-Kinases/metabolism , Phosphoinositide-3 Kinase Inhibitors/pharmacology , Phosphorylation , Pneumonia, Viral/virology , Protein Kinase Inhibitors/pharmacology , Proto-Oncogene Proteins/antagonists & inhibitors , Proto-Oncogene Proteins/metabolism , Receptor Protein-Tyrosine Kinases/antagonists & inhibitors , Receptor Protein-Tyrosine Kinases/metabolism , SARS-CoV-2 , Spike Glycoprotein, Coronavirus/metabolism , Vero Cells , p38 Mitogen-Activated Protein Kinases/antagonists & inhibitors , p38 Mitogen-Activated Protein Kinases/metabolism , Axl Receptor Tyrosine Kinase

2.

The ChEMBL Database in 2023: a drug discovery platform spanning multiple bioactivity data types and time periods.

Zdrazil, Barbara; Felix, Eloy; Hunter, Fiona; Manners, Emma J; Blackshaw, James; Corbett, Sybilla; de Veij, Marleen; Ioannidis, Harris; Lopez, David Mendez; Mosquera, Juan F; Magarinos, Maria Paula; Bosc, Nicolas; Arcila, Ricardo; Kizilören, Tevfik; Gaulton, Anna; Bento, A Patrícia; Adasme, Melissa F; Monecke, Peter; Landrum, Gregory A; Leach, Andrew R.

Nucleic Acids Res ; 52(D1): D1180-D1192, 2024 Jan 05.

Article in English | MEDLINE | ID: mdl-37933841

ABSTRACT

ChEMBL (https://www.ebi.ac.uk/chembl/) is a manually curated, high-quality, large-scale, open, FAIR and Global Core Biodata Resource of bioactive molecules with drug-like properties, previously described in the 2012, 2014, 2017 and 2019 Nucleic Acids Research Database Issues. Since its introduction in 2009, ChEMBL's content has changed dramatically in size and diversity of data types. Through incorporation of multiple new datasets from depositors since the 2019 update, ChEMBL now contains slightly more bioactivity data from deposited data vs data extracted from literature. In collaboration with the EUbOPEN consortium, chemical probe data is now regularly deposited into ChEMBL. Release 27 made curated data available for compounds screened for potential anti-SARS-CoV-2 activity from several large-scale drug repurposing screens. In addition, new patent bioactivity data have been added to the latest ChEMBL releases, and various new features have been incorporated, including a Natural Product likeness score, updated flags for Natural Products, a new flag for Chemical Probes, and the initial annotation of the action type for ~270 000 bioactivity measurements.

Subject(s)

Drug Discovery , Databases, Factual , Time Factors

3.

Drug Safety Data Curation and Modeling in ChEMBL: Boxed Warnings and Withdrawn Drugs.

Hunter, Fiona M I; Bento, A Patrícia; Bosc, Nicolas; Gaulton, Anna; Hersey, Anne; Leach, Andrew R.

Chem Res Toxicol ; 34(2): 385-395, 2021 02 15.

Article in English | MEDLINE | ID: mdl-33507738

ABSTRACT

The safety of marketed drugs is an ongoing concern, with some of the more frequently prescribed medicines resulting in serious or life-threatening adverse effects in some patients. Safety-related information for approved drugs has been curated to include the assignment of toxicity class(es) based on their withdrawn status and/or black box warning information described on medicinal product labels. The ChEMBL resource contains a wide range of bioactivity data types, from early "Discovery" stage preclinical data for individual compounds through to postclinical data on marketed drugs; the inclusion of the curated drug safety data set within this framework can support a wide range of safety-related drug discovery questions. The curated drug safety data set will be made freely available through ChEMBL and updated in future database releases.

Subject(s)

Pharmaceutical Preparations/chemistry , Data Curation , Drug Approval , Drug-Related Side Effects and Adverse Reactions , Humans , Models, Molecular

4.

ChEMBL: towards direct deposition of bioassay data.

Mendez, David; Gaulton, Anna; Bento, A Patrícia; Chambers, Jon; De Veij, Marleen; Félix, Eloy; Magariños, María Paula; Mosquera, Juan F; Mutowo, Prudence; Nowotka, Michal; Gordillo-Marañón, María; Hunter, Fiona; Junco, Laura; Mugumbate, Grace; Rodriguez-Lopez, Milagros; Atkinson, Francis; Bosc, Nicolas; Radoux, Chris J; Segura-Cabrera, Aldo; Hersey, Anne; Leach, Andrew R.

Nucleic Acids Res ; 47(D1): D930-D940, 2019 01 08.

Article in English | MEDLINE | ID: mdl-30398643

ABSTRACT

ChEMBL is a large, open-access bioactivity database (https://www.ebi.ac.uk/chembl), previously described in the 2012, 2014 and 2017 Nucleic Acids Research Database Issues. In the last two years, several important improvements have been made to the database and are described here. These include more robust capture and representation of assay details; a new data deposition system, allowing updating of data sets and deposition of supplementary data; and a completely redesigned web interface, with enhanced search and filtering capabilities.

Subject(s)

Databases, Pharmaceutical , Drug Discovery , Biological Assay , Periodicals as Topic , User-Computer Interface

5.

The ChEMBL database in 2017.

Gaulton, Anna; Hersey, Anne; Nowotka, Michal; Bento, A Patrícia; Chambers, Jon; Mendez, David; Mutowo, Prudence; Atkinson, Francis; Bellis, Louisa J; Cibrián-Uhalte, Elena; Davies, Mark; Dedman, Nathan; Karlsson, Anneli; Magariños, María Paula; Overington, John P; Papadatos, George; Smit, Ines; Leach, Andrew R.

Nucleic Acids Res ; 45(D1): D945-D954, 2017 01 04.

Article in English | MEDLINE | ID: mdl-27899562

ABSTRACT

ChEMBL is an open large-scale bioactivity database (https://www.ebi.ac.uk/chembl), previously described in the 2012 and 2014 Nucleic Acids Research Database Issues. Since then, alongside the continued extraction of data from the medicinal chemistry literature, new sources of bioactivity data have also been added to the database. These include: deposited data sets from neglected disease screening; crop protection data; drug metabolism and disposition data and bioactivity data from patents. A number of improvements and new features have also been incorporated. These include the annotation of assays and targets using ontologies, the inclusion of targets and indications for clinical candidates, addition of metabolic pathways for drugs and calculation of structural alerts. The ChEMBL data can be accessed via a web-interface, RDF distribution, data downloads and RESTful web-services.

Subject(s)

Databases, Chemical , Databases, Nucleic Acid , Search Engine , Computational Biology/methods , Crop Protection , Drug Discovery , Gene Ontology , Humans , Molecular Sequence Annotation , Pharmacology/methods , User-Computer Interface , Web Browser

6.

Pharos: Collating protein information to shed light on the druggable genome.

Nguyen, Dac-Trung; Mathias, Stephen; Bologa, Cristian; Brunak, Soren; Fernandez, Nicolas; Gaulton, Anna; Hersey, Anne; Holmes, Jayme; Jensen, Lars Juhl; Karlsson, Anneli; Liu, Guixia; Ma'ayan, Avi; Mandava, Geetha; Mani, Subramani; Mehta, Saurabh; Overington, John; Patel, Juhee; Rouillard, Andrew D; Schürer, Stephan; Sheils, Timothy; Simeonov, Anton; Sklar, Larry A; Southall, Noel; Ursu, Oleg; Vidovic, Dusica; Waller, Anna; Yang, Jeremy; Jadhav, Ajit; Oprea, Tudor I; Guha, Rajarshi.

Nucleic Acids Res ; 45(D1): D995-D1002, 2017 01 04.

Article in English | MEDLINE | ID: mdl-27903890

ABSTRACT

The 'druggable genome' encompasses several protein families, but only a subset of targets within them have attracted significant research attention and thus have information about them publicly available. The Illuminating the Druggable Genome (IDG) program, initiated in 2014, has the goal of developing experimental techniques and a Knowledge Management Center (KMC) that would collect and organize information about protein targets from four families, representing the most common druggable targets with an emphasis on understudied proteins. Here, we describe two resources developed by the KMC: the Target Central Resource Database (TCRD) which collates many heterogeneous gene/protein datasets and Pharos (https://pharos.nih.gov), a multimodal web interface that presents the data from TCRD. We briefly describe the types and sources of data considered by the KMC and then highlight features of the Pharos interface designed to enable intuitive access to the IDG knowledgebase. The aim of Pharos is to encourage 'serendipitous browsing', whereby related, relevant information is made easily discoverable. We conclude by describing two use cases that highlight the utility of Pharos and TCRD.

Subject(s)

Databases, Genetic , Drug Discovery , Genomics , Pharmacogenetics , Search Engine , Cluster Analysis , Computational Biology/methods , Drug Discovery/methods , Genomics/methods , Humans , Obesity/drug therapy , Obesity/genetics , Obesity/metabolism , Pharmacogenetics/methods , Software , Web Browser

7.

Open Targets: a platform for therapeutic target identification and validation.

Koscielny, Gautier; An, Peter; Carvalho-Silva, Denise; Cham, Jennifer A; Fumis, Luca; Gasparyan, Rippa; Hasan, Samiul; Karamanis, Nikiforos; Maguire, Michael; Papa, Eliseo; Pierleoni, Andrea; Pignatelli, Miguel; Platt, Theo; Rowland, Francis; Wankar, Priyanka; Bento, A Patrícia; Burdett, Tony; Fabregat, Antonio; Forbes, Simon; Gaulton, Anna; Gonzalez, Cristina Yenyxe; Hermjakob, Henning; Hersey, Anne; Jupe, Steven; Kafkas, Senay; Keays, Maria; Leroy, Catherine; Lopez, Francisco-Javier; Magarinos, Maria Paula; Malone, James; McEntyre, Johanna; Munoz-Pomer Fuentes, Alfonso; O'Donovan, Claire; Papatheodorou, Irene; Parkinson, Helen; Palka, Barbara; Paschall, Justin; Petryszak, Robert; Pratanwanich, Naruemon; Sarntivijal, Sirarat; Saunders, Gary; Sidiropoulos, Konstantinos; Smith, Thomas; Sondka, Zbyslaw; Stegle, Oliver; Tang, Y Amy; Turner, Edward; Vaughan, Brendan; Vrousgou, Olga; Watkins, Xavier.

Nucleic Acids Res ; 45(D1): D985-D994, 2017 01 04.

Article in English | MEDLINE | ID: mdl-27899665

ABSTRACT

We have designed and developed a data integration and visualization platform that provides evidence about the association of known and potential drug targets with diseases. The platform is designed to support identification and prioritization of biological targets for follow-up. Each drug target is linked to a disease using integrated genome-wide data from a broad range of data sources. The platform provides either a target-centric workflow to identify diseases that may be associated with a specific target, or a disease-centric workflow to identify targets that may be associated with a specific disease. Users can easily transition between these target- and disease-centric workflows. The Open Targets Validation Platform is accessible at https://www.targetvalidation.org.

Subject(s)

Computational Biology/methods , Molecular Targeted Therapy , Search Engine , Software , Databases, Factual , Humans , Molecular Targeted Therapy/methods , Reproducibility of Results , Web Browser , Workflow

8.

SureChEMBL: a large-scale, chemically annotated patent document database.

Papadatos, George; Davies, Mark; Dedman, Nathan; Chambers, Jon; Gaulton, Anna; Siddle, James; Koks, Richard; Irvine, Sean A; Pettersson, Joe; Goncharoff, Nicko; Hersey, Anne; Overington, John P.

Nucleic Acids Res ; 44(D1): D1220-8, 2016 Jan 04.

Article in English | MEDLINE | ID: mdl-26582922

ABSTRACT

SureChEMBL is a publicly available large-scale resource containing compounds extracted from the full text, images and attachments of patent documents. The data are extracted from the patent literature according to an automated text and image-mining pipeline on a daily basis. SureChEMBL provides access to a previously unavailable, open and timely set of annotated compound-patent associations, complemented with sophisticated combined structure and keyword-based search capabilities against the compound repository and patent document corpus; given the wealth of knowledge hidden in patent documents, analysis of SureChEMBL data has immediate applications in drug discovery, medicinal chemistry and other commercial areas of chemical science. Currently, the database contains 17 million compounds extracted from 14 million patent documents. Access is available through a dedicated web-based interface and data downloads at: https://www.surechembl.org/.

Subject(s)

Databases, Chemical , Patents as Topic , Data Mining , Pharmaceutical Preparations/chemistry

9.

ChEMBL web services: streamlining access to drug discovery data and utilities.

Davies, Mark; Nowotka, Michal; Papadatos, George; Dedman, Nathan; Gaulton, Anna; Atkinson, Francis; Bellis, Louisa; Overington, John P.

Nucleic Acids Res ; 43(W1): W612-20, 2015 Jul 01.

Article in English | MEDLINE | ID: mdl-25883136

ABSTRACT

ChEMBL is now a well-established resource in the fields of drug discovery and medicinal chemistry research. The ChEMBL database curates and stores standardized bioactivity, molecule, target and drug data extracted from multiple sources, including the primary medicinal chemistry literature. Programmatic access to ChEMBL data has been improved by a recent update to the ChEMBL web services (version 2.0.x, https://www.ebi.ac.uk/chembl/api/data/docs), which exposes significantly more data from the underlying database and introduces new functionality. To complement the data-focused services, a utility service (version 1.0.x, https://www.ebi.ac.uk/chembl/api/utils/docs), which provides RESTful access to commonly used cheminformatics methods, has also been concurrently developed. The ChEMBL web services can be used together or independently to build applications and data processing workflows relevant to drug discovery and chemical biology.

Subject(s)

Databases, Chemical , Drug Discovery , Internet , Systems Integration , User-Computer Interface
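
As a rough illustration of the RESTful access described in record 9, the sketch below builds the URL for a single record under the data-services base path quoted in the abstract. The `molecule` resource name, the `.json` suffix and the identifier `CHEMBL25` are assumptions about the service layout rather than details stated in the abstract, and `build_resource_url` is a hypothetical helper; the documentation at the URL above should be treated as authoritative.

```python
from urllib.parse import quote

# Base path of the data web services, derived from the docs URL in the abstract.
CHEMBL_DATA_BASE = "https://www.ebi.ac.uk/chembl/api/data"


def build_resource_url(resource: str, chembl_id: str, fmt: str = "json") -> str:
    """Return the URL for one record of the given resource type.

    Hypothetical helper: the resource name and format suffix are assumed,
    not taken from the abstract.
    """
    # quote() percent-encodes any unexpected characters in caller-supplied values.
    return (
        f"{CHEMBL_DATA_BASE}/{quote(resource, safe='')}/"
        f"{quote(chembl_id, safe='')}.{fmt}"
    )


url = build_resource_url("molecule", "CHEMBL25")
print(url)
```

The sketch stops at URL construction, so it runs without network access; fetching the URL with any HTTP client would be the next step in a real workflow.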

10.

The complex portal--an encyclopaedia of macromolecular complexes.

Meldal, Birgit H M; Forner-Martinez, Oscar; Costanzo, Maria C; Dana, Jose; Demeter, Janos; Dumousseau, Marine; Dwight, Selina S; Gaulton, Anna; Licata, Luana; Melidoni, Anna N; Ricard-Blum, Sylvie; Roechert, Bernd; Skyzypek, Marek S; Tiwari, Manu; Velankar, Sameer; Wong, Edith D; Hermjakob, Henning; Orchard, Sandra.

Nucleic Acids Res ; 43(Database issue): D479-84, 2015 Jan.

Article in English | MEDLINE | ID: mdl-25313161

ABSTRACT

The IntAct molecular interaction database has created a new, free, open-source, manually curated resource, the Complex Portal (www.ebi.ac.uk/intact/complex), through which protein complexes from major model organisms are being collated and made available for search, viewing and download. It has been built in close collaboration with other bioinformatics services and populated with data from ChEMBL, MatrixDB, PDBe, Reactome and UniProtKB. Each entry contains information about the participating molecules (including small molecules and nucleic acids), their stoichiometry, topology and structural assembly. Complexes are annotated with details about their function, properties and complex-specific Gene Ontology (GO) terms. Consistent nomenclature is used throughout the resource with systematic names, recommended names and a list of synonyms all provided. The use of the Evidence Code Ontology allows us to indicate for which entries direct experimental evidence is available or if the complex has been inferred based on homology or orthology. The data are searchable using standard identifiers, such as UniProt, ChEBI and GO IDs, protein, gene and complex names or synonyms. This reference resource will be maintained and grow to encompass an increasing number of organisms. Input from groups and individuals with specific areas of expertise is welcome.

Subject(s)

Databases, Protein , Proteins/chemistry , Animals , Binding Sites , Humans , Internet , Macromolecular Substances/chemistry , Mice , Protein Binding , Proteins/genetics , Proteins/metabolism

11.

PPDMs-a resource for mapping small molecule bioactivities from ChEMBL to Pfam-A protein domains.

Kruger, Felix A; Gaulton, Anna; Nowotka, Michal; Overington, John P.

Bioinformatics ; 31(5): 776-8, 2015 Mar 01.

Article in English | MEDLINE | ID: mdl-25348214

ABSTRACT

PPDMs is a resource that maps small molecule bioactivities to protein domains from the Pfam-A collection of protein families. Small molecule bioactivities mapped to protein domains add important precision to approaches that use protein sequence searches and alignments to assist applications in computational drug discovery and systems and chemical biology. We have previously proposed a mapping heuristic for a subset of bioactivities stored in ChEMBL with the Pfam-A domain most likely to mediate small molecule binding. We have since refined this mapping using a manual procedure. Here, we present a resource that provides up-to-date mappings and the possibility to review assigned mappings as well as to participate in their assignment and curation. We also describe how mappings provided through the PPDMs resource are made accessible through the main schema of the ChEMBL database. AVAILABILITY AND IMPLEMENTATION: The PPDMs resource and curation interface is available at https://www.ebi.ac.uk/chembl/research/ppdms/pfam_maps. The source-code for PPDMs is available under the Apache license at https://github.com/chembl/pfam_maps. Source code is available at https://github.com/chembl/pfam_map_loader to demonstrate the integration process with the main schema of ChEMBL.

Subject(s)

Databases, Chemical , Databases, Protein , Drug Discovery/methods , Proteins/chemistry , Small Molecule Libraries/pharmacology , Software , Humans , Protein Structure, Tertiary , Small Molecule Libraries/chemistry

12.

The ChEMBL bioactivity database: an update.

Bento, A Patrícia; Gaulton, Anna; Hersey, Anne; Bellis, Louisa J; Chambers, Jon; Davies, Mark; Krüger, Felix A; Light, Yvonne; Mak, Lora; McGlinchey, Shaun; Nowotka, Michal; Papadatos, George; Santos, Rita; Overington, John P.

Nucleic Acids Res ; 42(Database issue): D1083-90, 2014 Jan.

Article in English | MEDLINE | ID: mdl-24214965

ABSTRACT

ChEMBL is an open large-scale bioactivity database (https://www.ebi.ac.uk/chembl), previously described in the 2012 Nucleic Acids Research Database Issue. Since then, a variety of new data sources and improvements in functionality have contributed to the growth and utility of the resource. In particular, more comprehensive tracking of compounds from research stages through clinical development to market is provided through the inclusion of data from United States Adopted Name applications; a new richer data model for representing drug targets has been developed; and a number of methods have been put in place to allow users to more easily identify reliable data. Finally, access to ChEMBL is now available via a new Resource Description Framework format, in addition to the web-based interface, data downloads and web services.

Subject(s)

Databases, Chemical , Drug Discovery , Binding Sites , Humans , Internet , Ligands , Pharmaceutical Preparations/chemistry , Proteins/chemistry , Proteins/drug effects

13.

The EBI RDF platform: linked open data for the life sciences.

Jupp, Simon; Malone, James; Bolleman, Jerven; Brandizi, Marco; Davies, Mark; Garcia, Leyla; Gaulton, Anna; Gehant, Sebastien; Laibe, Camille; Redaschi, Nicole; Wimalaratne, Sarala M; Martin, Maria; Le Novère, Nicolas; Parkinson, Helen; Birney, Ewan; Jenkinson, Andrew M.

Bioinformatics ; 30(9): 1338-9, 2014 May 01.

Article in English | MEDLINE | ID: mdl-24413672

ABSTRACT

MOTIVATION: Resource description framework (RDF) is an emerging technology for describing, publishing and linking life science data. As a major provider of bioinformatics data and services, the European Bioinformatics Institute (EBI) is committed to making data readily accessible to the community in ways that meet existing demand. The EBI RDF platform has been developed to meet an increasing demand to coordinate RDF activities across the institute and provides a new entry point to querying and exploring integrated resources available at the EBI.

Subject(s)

Computational Biology/methods , Databases, Genetic , Academies and Institutes , Biomedical Research , Internet

14.

Chemical, target, and bioactive properties of allosteric modulation.

van Westen, Gerard J P; Gaulton, Anna; Overington, John P.

PLoS Comput Biol ; 10(4): e1003559, 2014 Apr.

Article in English | MEDLINE | ID: mdl-24699297

ABSTRACT

Allosteric modulators are ligands for proteins that exert their effects via a different binding site than the natural (orthosteric) ligand site and hence form a conceptually distinct class of ligands for a target of interest. Here, the physicochemical and structural features of a large set of allosteric and non-allosteric ligands from the ChEMBL database of bioactive molecules are analyzed. In general allosteric modulators are relatively smaller, more lipophilic and more rigid compounds, though large differences exist between different targets and target classes. Furthermore, there are differences in the distribution of targets that bind these allosteric modulators. Allosteric modulators are over-represented in membrane receptors, ligand-gated ion channels and nuclear receptor targets, but are underrepresented in enzymes (primarily proteases and kinases). Moreover, allosteric modulators tend to bind to their targets with a slightly lower potency (5.96 log units versus 6.66 log units, p<0.01). However, this lower absolute affinity is compensated by their lower molecular weight and more lipophilic nature, leading to similar binding efficiency and surface efficiency indices. Subsequently a series of classifier models are trained, initially target class independent models followed by finer-grained target (architecture/functional class) based models using the target hierarchy of the ChEMBL database. Applications of these insights include the selection of likely allosteric modulators from existing compound collections, the design of novel chemical libraries biased towards allosteric regulators and the selection of targets potentially likely to yield allosteric modulators on screening. All data sets used in the paper are available for download.

Subject(s)

Models, Chemical , Allosteric Regulation , Databases, Chemical , Ligands , Molecular Weight
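
To put the potency gap reported in record 14 into concentration terms, the sketch below converts the two averages (5.96 and 6.66 log units) into nanomolar values. It assumes the log units are negative base-10 logarithms of a molar potency, which the abstract does not state explicitly; `p_to_nanomolar` is a hypothetical helper name.

```python
def p_to_nanomolar(p_value: float) -> float:
    """Convert a -log10(molar) potency into a nanomolar concentration.

    Assumes the input is a negative base-10 logarithm of a molar value.
    """
    # 1 M = 1e9 nM, so 10**(-p) M equals 10**(9 - p) nM.
    return 10 ** (9 - p_value)


allosteric_nm = p_to_nanomolar(5.96)      # average reported for allosteric ligands
non_allosteric_nm = p_to_nanomolar(6.66)  # average reported for non-allosteric ligands

# A difference of 0.70 log units corresponds to a 10**0.70 fold gap.
fold_difference = allosteric_nm / non_allosteric_nm

print(f"allosteric: {allosteric_nm:.0f} nM")
print(f"non-allosteric: {non_allosteric_nm:.0f} nM")
print(f"fold difference: {fold_difference:.1f}")
```

Under that assumption the 0.70 log-unit gap amounts to roughly a five-fold difference in concentration, which is the scale of effect the abstract describes as "slightly lower potency".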

15.

Activity, assay and target data curation and quality in the ChEMBL database.

Papadatos, George; Gaulton, Anna; Hersey, Anne; Overington, John P.

J Comput Aided Mol Des ; 29(9): 885-96, 2015 Sep.

Article in English | MEDLINE | ID: mdl-26201396

ABSTRACT

The emergence of a number of publicly available bioactivity databases, such as ChEMBL, PubChem BioAssay and BindingDB, has raised awareness about the topics of data curation, quality and integrity. Here we provide an overview and discussion of the current and future approaches to activity, assay and target data curation of the ChEMBL database. This curation process involves several manual and automated steps and aims to: (1) maximise data accessibility and comparability; (2) improve data integrity and flag outliers, ambiguities and potential errors; and (3) add further curated annotations and mappings thus increasing the usefulness and accuracy of the ChEMBL data for all users and modellers in particular. Issues related to activity, assay and target data curation and integrity along with their potential impact for users of the data are discussed, alongside robust selection and filter strategies in order to avoid or minimise these, depending on the desired application.

Subject(s)

Biological Assay , Data Accuracy , Databases, Chemical , Data Curation/standards , Databases, Chemical/standards , Databases, Factual , Inhibitory Concentration 50

16.

Chemical databases: curation or integration by user-defined equivalence?

Hersey, Anne; Chambers, Jon; Bellis, Louisa; Bento, A Patrícia; Gaulton, Anna; Overington, John P.

Drug Discov Today Technol ; 14: 17-24, 2015 Jul.

Article in English | MEDLINE | ID: mdl-26194583

ABSTRACT

There is a wealth of valuable chemical information in publicly available databases for use by scientists undertaking drug discovery. However finite curation resource, limitations of chemical structure software and differences in individual database applications mean that exact chemical structure equivalence between databases is unlikely to ever be a reality. The ability to identify compound equivalence has been made significantly easier by the use of the International Chemical Identifier (InChI), a non-proprietary line-notation for describing a chemical structure. More importantly, advances in methods to identify compounds that are the same at various levels of similarity, such as those containing the same parent component or having the same connectivity, are now enabling related compounds to be linked between databases where the structure matches are not exact.

Subject(s)

Databases, Chemical , Drug Discovery , Molecular Structure , Software
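
The idea in record 16 of linking compounds that share the same connectivity, even when their full structures differ, can be sketched by comparing only the formula and connection layers of two InChI strings. This is a simplified illustration and not the method used by any particular database: `connectivity_key` is a hypothetical helper, and the two alanine InChI strings are quoted from memory as examples and should be checked against an authoritative source before reuse.

```python
def connectivity_key(inchi: str) -> tuple[str, str]:
    """Return (formula layer, connection layer) of an InChI string.

    Simplified sketch: stereo, charge and isotope layers are ignored, so
    stereoisomers of one skeleton map to the same key.
    """
    layers = inchi.split("/")
    formula = layers[1] if len(layers) > 1 else ""
    # The connection layer is the one prefixed with 'c'; it may be absent.
    connections = next((layer for layer in layers[2:] if layer.startswith("c")), "")
    return formula, connections


# Illustrative strings for the two enantiomers of alanine (assumed correct).
l_alanine = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
d_alanine = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m1/s1"

exact_match = l_alanine == d_alanine
same_connectivity = connectivity_key(l_alanine) == connectivity_key(d_alanine)

print(exact_match, same_connectivity)
```

An exact string comparison treats the two entries as different compounds, while the connectivity-level key links them, which is the kind of user-defined equivalence the abstract discusses.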

17.

Transporter taxonomy - a comparison of different transport protein classification schemes.

Viereck, Michael; Gaulton, Anna; Digles, Daniela; Ecker, Gerhard F.

Drug Discov Today Technol ; 12: e37-46, 2014 Jun.

Article in English | MEDLINE | ID: mdl-25027374

ABSTRACT

Currently, there are more than 800 well characterized human membrane transport proteins (including channels and transporters) and there are estimates that about 10% (approx. 2000) of all human genes are related to transport. Membrane transport proteins are of interest as potential drug targets, for drug delivery, and as a cause of side effects and drug-drug interactions. In light of the development of Open PHACTS, which provides an open pharmacological space, we analyzed selected membrane transport protein classification schemes (Transporter Classification Database, ChEMBL, IUPHAR/BPS Guide to Pharmacology, and Gene Ontology) for their ability to serve as a basis for pharmacology driven protein classification. A comparison of these membrane transport protein classification schemes by using a set of clinically relevant transporters as use-case reveals the strengths and weaknesses of the different taxonomy approaches.

Subject(s)

Databases, Pharmaceutical , Databases, Protein , Membrane Transport Proteins/chemistry , Membrane Transport Proteins/classification , Classification , Drug Discovery , Gene Ontology , Humans , Membrane Transport Proteins/genetics

18.

Transporter assays and assay ontologies: useful tools for drug discovery.

Zdrazil, Barbara; Chichester, Christine; Zander Balderud, Linda; Engkvist, Ola; Gaulton, Anna; Overington, John P.

Drug Discov Today Technol ; 12: e47-54, 2014 Jun.

Article in English | MEDLINE | ID: mdl-25027375

ABSTRACT

Transport proteins represent an eminent class of drug targets and ADMET (absorption, distribution, metabolism, excretion, toxicity) associated genes. There exists a large number of distinct activity assays for transport proteins, owing not only to the measurement needed (e.g. transport activity, strength of ligand-protein interaction) but also to the heterogeneous assay setups used by different research groups. Efforts to systematically organize this (divergent) bioassay data have large potential impact in Public-Private partnership and conventional commercial drug discovery. In this short review, we highlight some of the frequently used high-throughput assays for transport proteins, and we discuss emerging assay ontologies and their application to this field. Focusing on human P-glycoprotein (Multidrug resistance protein 1; gene name: ABCB1, MDR1), we exemplify how annotation of bioassay data per target class could improve and add to existing ontologies, and we propose to include an additional layer of metadata supporting data fusion across different bioassays.

Subject(s)

Biological Ontologies , Drug Discovery/methods , High-Throughput Screening Assays , Membrane Transport Proteins , Membrane Transport Proteins/chemistry , Membrane Transport Proteins/classification , Membrane Transport Proteins/metabolism , Pharmaceutical Preparations/chemistry , Pharmaceutical Preparations/metabolism

19.

ChEMBL: a large-scale bioactivity database for drug discovery.

Gaulton, Anna; Bellis, Louisa J; Bento, A Patricia; Chambers, Jon; Davies, Mark; Hersey, Anne; Light, Yvonne; McGlinchey, Shaun; Michalovich, David; Al-Lazikani, Bissan; Overington, John P.

Nucleic Acids Res ; 40(Database issue): D1100-7, 2012 Jan.

Article in English | MEDLINE | ID: mdl-21948594

ABSTRACT

ChEMBL is an Open Data database containing binding, functional and ADMET information for a large number of drug-like bioactive compounds. These data are manually abstracted from the primary published literature on a regular basis, then further curated and standardized to maximize their quality and utility across a wide range of chemical biology and drug-discovery research problems. Currently, the database contains 5.4 million bioactivity measurements for more than 1 million compounds and 5200 protein targets. Access is available through a web-based interface, data downloads and web services at: https://www.ebi.ac.uk/chembldb.

Subject(s)

Databases, Factual , Drug Discovery , Databases, Protein , Humans , Pharmaceutical Preparations/chemistry , Proteins/chemistry , Proteins/metabolism , User-Computer Interface

20.

Illuminating the druggable genome through patent bioactivity data.

Magariños, Maria P; Gaulton, Anna; Félix, Eloy; Kiziloren, Tevfik; Arcila, Ricardo; Oprea, Tudor I; Leach, Andrew R.

PeerJ ; 11: e15153, 2023.

Article in English | MEDLINE | ID: mdl-37151295

ABSTRACT

The patent literature is a potentially valuable source of bioactivity data. In this article we describe a process to prioritise 3.7 million life science relevant patents obtained from the SureChEMBL database (https://www.surechembl.org/), according to how likely they were to contain bioactivity data for potent small molecules on less-studied targets, based on the classification developed by the Illuminating the Druggable Genome (IDG) project. The overall goal was to select a smaller number of patents that could be manually curated and incorporated into the ChEMBL database. Using relatively simple annotation and filtering pipelines, we have been able to identify a substantial number of patents containing quantitative bioactivity data for understudied targets that had not previously been reported in the peer-reviewed medicinal chemistry literature. We quantify the added value of such methods in terms of the numbers of targets that are so identified, and provide some specific illustrative examples. Our work underlines the potential value in searching the patent corpus in addition to the more traditional peer-reviewed literature. The small molecules found in these patents, together with their measured activity against the targets, are now accessible via the ChEMBL database.

Subject(s)

Chemistry, Pharmaceutical , Drug Discovery , Drug Discovery/methods , Databases, Factual
