Results 1 - 20 of 124
1.
Genome Med ; 15(1): 78, 2023 10 12.
Article in English | MEDLINE | ID: mdl-37821946

ABSTRACT

BACKGROUND: Genetic suppression occurs when the deleterious effects of a primary "query" mutation, such as a disease-causing mutation, are rescued by a suppressor mutation elsewhere in the genome. METHODS: To capture existing knowledge on suppression relationships between human genes, we examined 2,400 published papers for potential interactions identified through either genetic modification of cultured human cells or through association studies in patients. RESULTS: The resulting network encompassed 476 unique suppression interactions covering a wide spectrum of diseases and biological functions. The interactions frequently linked genes that operate in the same biological process. Suppressors were strongly enriched for genes with a role in stress response or signaling, suggesting that deleterious mutations can often be buffered by modulating signaling cascades or immune responses. Suppressor mutations tended to be deleterious when they occurred in the absence of the query mutation, in apparent contrast with their protective role in the presence of the query. We formulated and quantified mechanisms of genetic suppression that could explain 71% of interactions and provided mechanistic insight into disease pathology. Finally, we used these observations to predict suppressor genes in the human genome. CONCLUSIONS: The global suppression network allowed us to define principles of genetic suppression that were conserved across diseases, model systems, and species. The emerging frequency of suppression interactions among human genes and range of underlying mechanisms, together with the prevalence of suppression in model organisms, suggest that compensatory mutations may exist for most genetic diseases.


Subject(s)
Genome, Human , Suppression, Genetic , Humans , Mutation , Models, Biological , Human Genetics
2.
Bioinformatics ; 39(9)2023 09 02.
Article in English | MEDLINE | ID: mdl-37725353

ABSTRACT

MOTIVATION: In the current Big Data era in biomedicine, there is an unmet need to systematically assess experimental observations in the context of available information. This assessment would offer a means for a comprehensive and robust validation of biomedical data results and provide an initial estimate of the potential novelty of the findings. RESULTS: Here we present BQsupports, a web-based tool built upon the Bioteque biomedical descriptors that systematically analyzes and quantifies the current support for a given set of observations. The tool relies on over 1000 distinct types of biomedical descriptors, covering over 11 different biological and chemical entities, including genes, cell lines, diseases, and small molecules. By exploring hundreds of descriptors, BQsupports provides support scores for each observation across a wide variety of biomedical contexts. These scores are then aggregated to summarize the biomedical support of the assessed dataset as a whole. Finally, BQsupports also suggests predictive features of the given dataset, which can be exploited in downstream machine learning applications. AVAILABILITY AND IMPLEMENTATION: The web application and underlying data are available online (https://bqsupports.irbbarcelona.org).


Subject(s)
Machine Learning , Software , Big Data
4.
Mol Cell Proteomics ; 22(4): 100527, 2023 04.
Article in English | MEDLINE | ID: mdl-36894123

ABSTRACT

p38α (encoded by MAPK14) is a protein kinase that regulates cellular responses to almost all types of environmental and intracellular stresses. Upon activation, p38α phosphorylates many substrates both in the cytoplasm and nucleus, allowing this pathway to regulate a wide variety of cellular processes. While the role of p38α in the stress response has been widely investigated, its implication in cell homeostasis is less understood. To investigate the signaling networks regulated by p38α in proliferating cancer cells, we performed quantitative proteomic and phosphoproteomic analyses in breast cancer cells in which this pathway had been either genetically targeted or chemically inhibited. Our study identified with high confidence 35 proteins and 82 phosphoproteins (114 phosphosites) that are modulated by p38α and highlighted the implication of various protein kinases, including MK2 and mTOR, in the p38α-regulated signaling networks. Moreover, functional analyses revealed an important contribution of p38α to the regulation of cell adhesion, DNA replication, and RNA metabolism. Indeed, we provide experimental evidence supporting that p38α facilitates cancer cell adhesion and showed that this p38α function is likely mediated by the modulation of the adaptor protein ArgBP2. Collectively, our results illustrate the complexity of the p38α-regulated signaling networks, provide valuable information on p38α-dependent phosphorylation events in cancer cells, and document a mechanism by which p38α can regulate cell adhesion.


Subject(s)
Neoplasms , Proteomics , Cell Adhesion , Phosphorylation , Protein Kinases , Proteomics/methods , Signal Transduction , Mitogen-Activated Protein Kinase 14/metabolism
5.
Nat Biotechnol ; 41(1): 140-149, 2023 01.
Article in English | MEDLINE | ID: mdl-36217029

ABSTRACT

Understanding the mechanisms of coronavirus disease 2019 (COVID-19) disease severity to efficiently design therapies for emerging virus variants remains an urgent challenge of the ongoing pandemic. Infection and immune reactions are mediated by direct contacts between viral molecules and the host proteome, and the vast majority of these virus-host contacts (the 'contactome') have not been identified. Here, we present a systematic contactome map of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) with the human host encompassing more than 200 binary virus-host and intraviral protein-protein interactions. We find that host proteins genetically associated with comorbidities of severe illness and long COVID are enriched in SARS-CoV-2 targeted network communities. Evaluating contactome-derived hypotheses, we demonstrate that viral NSP14 activates nuclear factor κB (NF-κB)-dependent transcription, even in the presence of cytokine signaling. Moreover, for several tested host proteins, genetic knock-down substantially reduces viral replication. Additionally, we show for USP25 that this effect is phenocopied by the small-molecule inhibitor AZ1. Our results connect viral proteins to human genetic architecture for COVID-19 severity and offer potential therapeutic targets.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , COVID-19/genetics , Proteome/genetics , Post-Acute COVID-19 Syndrome , Virus Replication/genetics , Ubiquitin Thiolesterase/pharmacology
6.
Nat Commun ; 13(1): 5304, 2022 09 09.
Article in English | MEDLINE | ID: mdl-36085310

ABSTRACT

Biomedical data is accumulating at a fast pace and integrating it into a unified framework is a major challenge, so that multiple views of a given biological event can be considered simultaneously. Here we present the Bioteque, a resource of unprecedented size and scope that contains pre-calculated biomedical descriptors derived from a gigantic knowledge graph, displaying more than 450 thousand biological entities and 30 million relationships between them. The Bioteque integrates, harmonizes, and formats data collected from over 150 data sources, including 12 biological entities (e.g., genes, diseases, drugs) linked by 67 types of associations (e.g., 'drug treats disease', 'gene interacts with gene'). We show how Bioteque descriptors facilitate the assessment of high-throughput protein-protein interactome data, the prediction of drug response and new repurposing opportunities, and demonstrate that they can be used off-the-shelf in downstream machine learning tasks without loss of performance with respect to using original data. The Bioteque thus offers a thoroughly processed, tractable, and highly optimized assembly of the biomedical knowledge available in the public domain.


Subject(s)
Knowledge , Pattern Recognition, Automated , Knowledge Bases , Machine Learning , Proteins
7.
JHEP Rep ; 4(6): 100482, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35540106

ABSTRACT

Background & Aims: The molecular mechanisms driving the progression from early-chronic liver disease (CLD) to cirrhosis and, finally, acute-on-chronic liver failure (ACLF) are largely unknown. Our aim was to develop a protein network-based approach to investigate molecular pathways driving progression from early-CLD to ACLF. Methods: Transcriptome analysis was performed on liver biopsies from patients at different liver disease stages, including fibrosis, compensated cirrhosis, decompensated cirrhosis and ACLF, and control healthy livers. We created 9 liver-specific disease-related protein-protein interaction networks capturing key pathophysiological processes potentially related to CLD. We used these networks as a framework and performed gene set-enrichment analysis (GSEA) to identify dynamic gene profiles of disease progression. Results: Principal component analyses revealed that samples clustered according to the disease stage. GSEA of the defined processes showed an upregulation of inflammation, fibrosis and apoptosis networks throughout disease progression. Interestingly, we did not find significant gene expression differences between compensated and decompensated cirrhosis, while ACLF showed acute expression changes in all the defined liver disease-related networks. The analyses of disease progression patterns identified ascending and descending expression profiles associated with ACLF onset. Functional analyses showed that ascending profiles were associated with inflammation, fibrosis, apoptosis, senescence and carcinogenesis networks, while descending profiles were mainly related to oxidative stress and genetic factors. We confirmed by qPCR the upregulation of genes of the ascending profile and validated our findings in an independent patient cohort. Conclusion: ACLF is characterized by a specific hepatic gene expression pattern related to inflammation, fibrosis, apoptosis, senescence and carcinogenesis. Moreover, the observed profile is significantly different from that of compensated and decompensated cirrhosis, supporting the hypothesis that ACLF should be considered a distinct entity. Lay summary: By using transjugular biopsies obtained from patients at different stages of chronic liver disease, we unveil the molecular pathogenic mechanisms implicated in the progression of chronic liver disease to cirrhosis and acute-on-chronic liver failure. The most relevant finding in this study is that patients with acute-on-chronic liver failure present a specific hepatic gene expression pattern distinct from that of patients at earlier disease stages. This gene expression pattern is mostly related to inflammation, fibrosis, angiogenesis, and senescence and apoptosis pathways in the liver.

8.
Cell Rep Med ; 3(1): 100492, 2022 01 18.
Article in English | MEDLINE | ID: mdl-35106508

ABSTRACT

The Columbia Cancer Target Discovery and Development (CTD2) Center is developing PANACEA, a resource comprising dose-responses and RNA sequencing (RNA-seq) profiles of 25 cell lines perturbed with ∼400 clinical oncology drugs, to study a tumor-specific drug mechanism of action. Here, this resource serves as the basis for a DREAM Challenge assessing the accuracy and sensitivity of computational algorithms for de novo drug polypharmacology predictions. Dose-response and perturbational profiles for 32 kinase inhibitors are provided to 21 teams who are blind to the identity of the compounds. The teams are asked to predict high-affinity binding targets of each compound among ∼1,300 targets cataloged in DrugBank. The best performing methods leverage gene expression profile similarity analysis as well as deep-learning methodologies trained on individual datasets. This study lays the foundation for future integrative analyses of pharmacogenomic data, reconciliation of polypharmacology effects in different tumor contexts, and insights into network-based assessments of drug mechanisms of action.


Subject(s)
Neoplasms/drug therapy , Polypharmacology , Algorithms , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Humans , Neural Networks, Computer , Protein Kinases/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , Transcription, Genetic
9.
Curr Opin Chem Biol ; 66: 102090, 2022 02.
Article in English | MEDLINE | ID: mdl-34626922

ABSTRACT

Through the representation of small molecule structures as numerical descriptors and the exploitation of the similarity principle, chemoinformatics has made paramount contributions to drug discovery, from unveiling mechanisms of action and repurposing approved drugs to de novo crafting of molecules with desired properties and tailored targets. Yet, the inherent complexity of biological systems has fostered the implementation of large-scale experimental screenings seeking a deeper understanding of the targeted proteins, the disrupted biological processes and the systemic responses of cells to chemical perturbations. After this wealth of data, a new generation of data-driven descriptors has arisen providing a rich portrait of small molecule characteristics that goes beyond chemical properties. Here, we give an overview of biologically relevant descriptors, covering chemical compounds, proteins and other biological entities, such as diseases and cell lines, while aligning them to the major contributions in the field from disciplines, such as natural language processing or computer vision. We now envision a new scenario for chemical and biological entities where they both are translated into a common numerical format. In this computational framework, complex connections between entities can be unveiled by means of simple arithmetic operations, such as distance measures, additions, and subtractions.


Subject(s)
Drug Discovery , Proteins , Biology , Computational Biology
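The closing idea of entry 9, that entities sharing a common numerical format can be compared with distance measures, additions, and subtractions, can be illustrated with a minimal sketch. The entity names, the three-dimensional vectors, and the helper functions below are all hypothetical; real descriptors would be learned from data and have many more dimensions.

```python
import math

# Hypothetical toy descriptors (assumption: not taken from any real resource).
EMBEDDINGS = {
    "drug_a": [0.9, 0.1, 0.0],
    "drug_b": [0.8, 0.2, 0.1],
    "disease_x": [0.1, 0.9, 0.3],
}


def add(u, v):
    """Element-wise sum of two descriptors of equal length."""
    return [a + b for a, b in zip(u, v)]


def subtract(u, v):
    """Element-wise difference of two descriptors of equal length."""
    return [a - b for a, b in zip(u, v)]


def euclidean(u, v):
    """Euclidean distance between two descriptors."""
    return math.dist(u, v)


def nearest(query, embeddings, exclude=()):
    """Return the name of the entity whose descriptor is closest to `query`."""
    candidates = {k: v for k, v in embeddings.items() if k not in exclude}
    return min(candidates, key=lambda name: euclidean(query, candidates[name]))


if __name__ == "__main__":
    # Which entity lies closest to drug_a in descriptor space?
    print(nearest(EMBEDDINGS["drug_a"], EMBEDDINGS, exclude=("drug_a",)))
```

In this toy space the two drugs sit close together and far from the disease, so a nearest-neighbour query starting from one drug returns the other.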
10.
Genome Med ; 13(1): 168, 2021 10 26.
Article in English | MEDLINE | ID: mdl-34702310

ABSTRACT

BACKGROUND: In spite of many years of research, our understanding of the molecular bases of Alzheimer's disease (AD) is still incomplete, and the medical treatments available mainly target the disease symptoms and are hardly effective. Indeed, the modulation of a single target (e.g., β-secretase) has proven to be insufficient to significantly alter the physiopathology of the disease, and we should therefore move from gene-centric to systemic therapeutic strategies, where AD-related changes are modulated globally. METHODS: Here we present the complete characterization of three murine models of AD at different stages of the disease (i.e., onset, progression and advanced). We combined the cognitive assessment of these mice with histological analyses and full transcriptional and protein quantification profiling of the hippocampus. Additionally, we derived specific Aβ-related molecular AD signatures and looked for drugs able to globally revert them. RESULTS: We found that AD models show accelerated aging and that factors specifically associated with Aβ pathology are involved. We discovered a few proteins whose abundance increases with AD progression, while the corresponding transcript levels remain stable, and showed that at least two of them (i.e., Ifit3 and Syt11) co-localize with Aβ plaques in the brain. Finally, we found two NSAIDs (dexketoprofen and etodolac) and two anti-hypertensives (penbutolol and bendroflumethiazide) that overturn the cognitive impairment in AD mice while reducing Aβ plaques in the hippocampus and partially restoring the physiological levels of AD signature genes to wild-type levels. CONCLUSIONS: The characterization of three AD mouse models at different disease stages provides an unprecedented view of AD pathology and how this differs from physiological aging. Moreover, our computational strategy to chemically revert AD signatures has shown that NSAID and anti-hypertensive drugs may still have an opportunity as anti-AD agents, challenging previous reports.


Subject(s)
Alzheimer Disease/genetics , Alzheimer Disease/metabolism , Proteomics/methods , Transcriptome , Aging , Amyloid beta-Peptides , Animals , Brain/metabolism , Cognitive Dysfunction , Disease Models, Animal , Drug Discovery , Female , Gene Expression Regulation, Neoplastic , Gene Knock-In Techniques , Humans , Mice , Mice, Inbred C57BL , Mice, Transgenic , Plaque, Amyloid/metabolism
11.
Vaccines (Basel) ; 9(7)2021 Jul 19.
Article in English | MEDLINE | ID: mdl-34358215

ABSTRACT

Systems vaccinology has seldom been used in therapeutic HIV-1 vaccine research. Our aim was to identify early gene 'signatures' that predicted virus load control after analytical therapy interruption (ATI) in participants of a dendritic cell-based HIV-1 vaccine trial (DCV2). mRNA and miRNA were extracted from frozen post-vaccination PBMC samples; gene expression was determined by microarray. In gene set enrichment analysis, responders showed an up-regulation of 14 gene sets (TNF-alpha/NFkB pathway, inflammatory response, the complement system, IL6 and IL2 JAK-STAT signaling, among others) and a down-regulation of 7 gene sets (such as E2F targets or interferon alpha response). The expression of genes regulated by three (miR-223-3p, miR-1183 and miR-8063) of the 9 differentially expressed miRNAs was significantly down-regulated in responders. The deregulation of certain gene sets related to inflammatory processes seems fundamental for viral control, and certain miRNAs may be important in fine-tuning these processes.

12.
Nat Commun ; 12(1): 3932, 2021 06 24.
Article in English | MEDLINE | ID: mdl-34168145

ABSTRACT

Chemical descriptors encode the physicochemical and structural properties of small molecules, and they are at the core of chemoinformatics. The broad release of bioactivity data has prompted enriched representations of compounds, reaching beyond chemical structures and capturing their known biological properties. Unfortunately, bioactivity descriptors are not available for most small molecules, which limits their applicability to a few thousand well characterized compounds. Here we present a collection of deep neural networks able to infer bioactivity signatures for any compound of interest, even when little or no experimental information is available for them. Our signaturizers relate to bioactivities of 25 different types (including target profiles, cellular response and clinical outcomes) and can be used as drop-in replacements for chemical descriptors in day-to-day chemoinformatics tasks. Indeed, we illustrate how inferred bioactivity signatures are useful to navigate the chemical space in a biologically relevant manner, unveiling higher-order organization in natural product collections, and to enrich mostly uncharacterized chemical libraries for activity against the drug-orphan target Snail1. Moreover, we implement a battery of signature-activity relationship (SigAR) models and show a substantial improvement in performance, with respect to chemistry-based classifiers, across a series of biophysics and physiology activity prediction benchmarks.


Subject(s)
Small Molecule Libraries/chemistry , Small Molecule Libraries/pharmacology , Structure-Activity Relationship , Cell Line, Tumor , Databases, Pharmaceutical , Drug Evaluation, Preclinical/methods , Humans , Snail Family Transcription Factors/antagonists & inhibitors , Snail Family Transcription Factors/genetics , Snail Family Transcription Factors/metabolism
13.
Science ; 372(6542)2021 05 07.
Article in English | MEDLINE | ID: mdl-33958448

ABSTRACT

Phenotypes associated with genetic variants can be altered by interactions with other genetic variants (GxG), with the environment (GxE), or both (GxGxE). Yeast genetic interactions have been mapped on a global scale, but the environmental influence on the plasticity of genetic networks has not been examined systematically. To assess environmental rewiring of genetic networks, we examined 14 diverse conditions and scored 30,000 functionally representative yeast gene pairs for dynamic, differential interactions. Different conditions revealed novel differential interactions, which often uncovered functional connections between distantly related gene pairs. However, the majority of observed genetic interactions remained unchanged in different conditions, suggesting that the global yeast genetic interaction network is robust to environmental perturbation and captures the fundamental functional architecture of a eukaryotic cell.


Subject(s)
Gene Regulatory Networks , Gene-Environment Interaction , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae/genetics , Alleles , Genetic Fitness , Mutation
14.
Mol Syst Biol ; 17(5): e10138, 2021 05.
Article in English | MEDLINE | ID: mdl-34042294

ABSTRACT

The consequence of a mutation can be influenced by the context in which it operates. For example, loss of gene function may be tolerated in one genetic background, and lethal in another. The extent to which mutant phenotypes are malleable, the architecture of modifiers and the identities of causal genes remain largely unknown. Here, we measure the fitness effects of ~1,100 temperature-sensitive alleles of yeast essential genes in the context of variation from ten different natural genetic backgrounds and map the modifiers for 19 combinations. Altogether, fitness defects for 149 of the 580 tested genes (26%) could be suppressed by genetic variation in at least one yeast strain. Suppression was generally driven by gain-of-function of a single, strong modifier gene, and involved both genes encoding complex or pathway partners suppressing specific temperature-sensitive alleles, as well as general modifiers altering the effect of many alleles. The emerging frequency of suppression and range of possible mechanisms suggest that a substantial fraction of monogenic diseases could be managed by modulating other gene products.


Subject(s)
Gain of Function Mutation , Genes, Essential , Saccharomyces cerevisiae/growth & development , Gene Expression Regulation, Fungal , Genes, Modifier , Genetic Variation , Mutation , Phenotype , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae Proteins/genetics
15.
Nucleic Acids Res ; 49(6): 3156-3167, 2021 04 06.
Article in English | MEDLINE | ID: mdl-33677561

ABSTRACT

The EMBL-EBI Complex Portal is a knowledgebase of macromolecular complexes providing persistent stable identifiers. Entries are linked to literature evidence and provide details of complex membership, function, structure and complex-specific Gene Ontology annotations. Data are freely available and downloadable in HUPO-PSI community standards and missing entries can be requested for curation. In collaboration with Saccharomyces Genome Database and UniProt, the yeast complexome, a compendium of all known heteromeric assemblies from the model organism Saccharomyces cerevisiae, was curated. This expansion of knowledge and scope has led to a 50% increase in curated complexes compared to the previously published dataset, CYC2008. The yeast complexome is used as a reference resource for the analysis of complexes from large-scale experiments. Our analysis showed that genes coding for proteins in complexes tend to have more genetic interactions, are co-expressed with more genes, are more multifunctional, localize more often in the nucleus, and are more often involved in nucleic acid-related metabolic processes and processes where large machineries are the predominant functional drivers. A comparison to genetic interactions showed that about 40% of expanded co-complex pairs also have genetic interactions, suggesting strong functional links between complex members.


Subject(s)
Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/metabolism , Datasets as Topic , Gene Ontology , Knowledge Bases , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae Proteins/genetics
16.
Mol Syst Biol ; 16(9): e9828, 2020 09.
Article in English | MEDLINE | ID: mdl-32939983

ABSTRACT

Essential genes tend to be highly conserved across eukaryotes, but, in some cases, their critical roles can be bypassed through genetic rewiring. From a systematic analysis of 728 different essential yeast genes, we discovered that 124 (17%) were dispensable essential genes. Through whole-genome sequencing and detailed genetic analysis, we investigated the genetic interactions and genome alterations underlying bypass suppression. Dispensable essential genes often had paralogs, were enriched for genes encoding membrane-associated proteins, and were depleted for members of protein complexes. Functionally related genes frequently drove the bypass suppression interactions. These gene properties were predictive of essential gene dispensability and of specific suppressors among hundreds of genes on aneuploid chromosomes. Our findings identify yeast's core essential gene set and reveal that the properties of dispensable essential genes are conserved from yeast to human cells, correlating with human genes that display cell line-specific essentiality in the Cancer Dependency Map (DepMap) project.


Subject(s)
Genes, Essential , Genes, Fungal , Saccharomyces cerevisiae/genetics , Suppression, Genetic , Aneuploidy , Evolution, Molecular , Gene Deletion , Gene Duplication , Gene Regulatory Networks , Genes, Suppressor , Multiprotein Complexes/metabolism
17.
Genome Med ; 12(1): 78, 2020 09 09.
Article in English | MEDLINE | ID: mdl-32907621

ABSTRACT

Identification of actionable genomic vulnerabilities is key to precision oncology. Utilizing a large-scale drug screening in patient-derived xenografts, we uncover driver gene alteration connections, derive driver co-occurrence (DCO) networks, and relate these to drug sensitivity. Our collection of 53 drug-response predictors attains an average balanced accuracy of 58% in a cross-validation setting, rising to 66% for a subset of high-confidence predictions. We experimentally validated 12 out of 14 predictions in mice and adapted our strategy to obtain drug-response models from patients' progression-free survival data. Our strategy reveals links between oncogenic alterations, increasing the clinical impact of genomic profiling.


Subject(s)
Models, Theoretical , Neoplasms/etiology , Neoplasms/therapy , Precision Medicine , Algorithms , Antineoplastic Agents/pharmacology , Antineoplastic Agents/therapeutic use , Biomarkers, Tumor , Clinical Decision-Making , Databases, Factual , Disease Management , Drug Resistance, Neoplasm/drug effects , Gene Expression Regulation, Neoplastic/drug effects , Genomics/methods , Humans , Neoplasms/pathology , Oncogenes , Precision Medicine/methods , Reproducibility of Results , Translational Research, Biomedical , Treatment Outcome
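Entry 17 reports predictor quality as balanced accuracy. As a reminder of what that metric measures, here is a minimal sketch assuming binary labels (1 = responder, 0 = non-responder); the abstract does not state how labels were encoded, so this encoding and the function name are assumptions made for illustration only.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    positives = sum(1 for t, _ in pairs if t == 1)
    negatives = len(pairs) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present in y_true")
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    true_neg = sum(1 for t, p in pairs if t == 0 and p == 0)
    sensitivity = true_pos / positives   # recall on the positive class
    specificity = true_neg / negatives   # recall on the negative class
    return 0.5 * (sensitivity + specificity)


if __name__ == "__main__":
    # Hypothetical labels: 3 of 4 responders and 1 of 2 non-responders are correct.
    print(balanced_accuracy([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 1]))
```

Because each class contributes equally regardless of its size, a predictor that always outputs the majority class scores 0.5, which is the baseline against which figures such as 58% or 66% are usually read.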
18.
J Chem Inf Model ; 60(12): 5730-5734, 2020 12 28.
Article in English | MEDLINE | ID: mdl-32672454

ABSTRACT

Until a vaccine becomes available, the current repertoire of drugs is our only therapeutic asset to fight the SARS-CoV-2 outbreak. Indeed, emergency clinical trials have been launched to assess the effectiveness of many marketed drugs, tackling the decrease of viral load through several mechanisms. Here, we present an online resource, based on small-molecule bioactivity signatures and natural language processing, to expand the portfolio of compounds with potential to treat COVID-19. By comparing the set of drugs reported to be potentially active against SARS-CoV-2 to a universe of 1 million bioactive molecules, we identify compounds that display analogous chemical and functional features to the current COVID-19 candidates. Searches can be filtered by level of evidence and mechanism of action, and results can be restricted to drug molecules or include the much broader space of bioactive compounds. Moreover, we allow users to contribute COVID-19 drug candidates, which are automatically incorporated to the pipeline once per day. The computational platform, as well as the source code, is available at https://sbnb.irbbarcelona.org/covid19.


Subject(s)
Antiviral Agents/chemistry , COVID-19 Drug Treatment , Drug Repositioning/methods , SARS-CoV-2/drug effects , Antiviral Agents/pharmacology , Computer Simulation , Drug Design , Humans , Models, Molecular , Molecular Structure , Small Molecule Libraries/chemistry , Small Molecule Libraries/pharmacology
19.
Nat Biotechnol ; 38(9): 1098, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32440008

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

20.
Nat Biotechnol ; 38(9): 1087-1096, 2020 09.
Article in English | MEDLINE | ID: mdl-32440005

ABSTRACT

Small molecules are usually compared by their chemical structure, but there is no unified analytic framework for representing and comparing their biological activity. We present the Chemical Checker (CC), which provides processed, harmonized and integrated bioactivity data on ~800,000 small molecules. The CC divides data into five levels of increasing complexity, from the chemical properties of compounds to their clinical outcomes. In between, it includes targets, off-targets, networks and cell-level information, such as omics data, growth inhibition and morphology. Bioactivity data are expressed in a vector format, extending the concept of chemical similarity to similarity between bioactivity signatures. We show how CC signatures can aid drug discovery tasks, including target identification and library characterization. We also demonstrate the discovery of compounds that reverse and mimic biological signatures of disease models and genetic perturbations in cases that could not be addressed using chemical information alone. Overall, the CC signatures facilitate the conversion of bioactivity data to a format that is readily amenable to machine learning methods.


Subject(s)
Pharmaceutical Preparations/metabolism , Small Molecule Libraries/metabolism , Biological Products/chemistry , Biological Products/metabolism , Biological Products/therapeutic use , Biomarkers, Pharmacological/metabolism , Databases, Factual , Drug Discovery , Drug Therapy , Humans , Pharmaceutical Preparations/chemistry , Small Molecule Libraries/chemistry , Small Molecule Libraries/therapeutic use
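Entry 20 describes expressing bioactivity data as vectors so that chemical similarity extends to similarity between bioactivity signatures. The sketch below shows one common way to compare such vectors, cosine similarity, and to rank compounds against a query. The compound names and signature values are hypothetical, and cosine similarity is chosen here for illustration; the abstract does not specify which measure the Chemical Checker uses.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two signature vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


def rank_by_similarity(query, signatures):
    """Return compound names ordered from most to least similar to `query`."""
    return sorted(
        signatures,
        key=lambda name: cosine_similarity(query, signatures[name]),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical signatures (assumption: toy values, not real data).
    signatures = {
        "compound_1": [1.0, 0.0, 0.2],
        "compound_2": [0.0, 1.0, 0.0],
        "compound_3": [-1.0, 0.1, 0.0],
    }
    print(rank_by_similarity([1.0, 0.1, 0.1], signatures))
```

Sorting in ascending order instead would surface the most anti-correlated signatures first, which is the kind of query one would use when looking for compounds that reverse, rather than mimic, a given signature.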