ABSTRACT
The categorization of human diseases is mainly based on the affected organ system and phenotypic characteristics. This limits the view to pathological manifestations and neglects mechanistic relationships that are crucial for developing therapeutic strategies. This work aims to advance the understanding of diseases and their relatedness beyond traditional phenotypic views. Hence, the similarity among 502 diseases is mapped using six data dimensions encompassing molecular, clinical, and pharmacological information retrieved from public sources. Multiple distance measures and multi-view clustering are used to assess the patterns of disease relatedness. The integration of all six dimensions into a consensus map of disease relationships reveals a disease view that diverges from the International Classification of Diseases (ICD), emphasizing the novel insights offered by a multi-view disease map. Disease features such as genes, pathways, and chemicals that are enriched in distinct disease groups are identified. Finally, an evaluation of the top similar diseases for three candidate diseases common in the Western population shows concordance with known epidemiological associations and reveals rare features shared between Type 2 diabetes (T2D) and Alzheimer's disease. A revision of disease relationships holds promise for facilitating the reconstruction of comorbidity patterns, repurposing drugs, and advancing drug discovery in the future.
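The multi-view integration step described above can be illustrated with a minimal, purely hypothetical sketch: per-dimension disease-distance matrices are averaged into a consensus matrix, from which the most similar diseases are ranked. All data, names, and the simple averaging scheme are invented for illustration; the actual study uses multiple distance measures and multi-view clustering.

```python
# Hypothetical sketch of multi-view consensus integration: average several
# per-view distance matrices, then rank diseases by consensus distance.
# (Illustrative only; not the paper's actual clustering pipeline.)

def consensus_distance(views):
    """Element-wise average of equally-sized square distance matrices."""
    n = len(views[0])
    k = len(views)
    return [[sum(v[i][j] for v in views) / k for j in range(n)]
            for i in range(n)]

def nearest_neighbours(dist, i, top=2):
    """Indices of the `top` items most similar to item i."""
    order = sorted((d, j) for j, d in enumerate(dist[i]) if j != i)
    return [j for _, j in order[:top]]

# Three toy "views" over four diseases (0 = identical, 1 = unrelated)
v1 = [[0, .1, .9, .8], [.1, 0, .8, .9], [.9, .8, 0, .2], [.8, .9, .2, 0]]
v2 = [[0, .2, .7, .9], [.2, 0, .9, .7], [.7, .9, 0, .1], [.9, .7, .1, 0]]
v3 = [[0, .3, .8, .7], [.3, 0, .7, .8], [.8, .7, 0, .3], [.7, .8, .3, 0]]

cons = consensus_distance([v1, v2, v3])
print(nearest_neighbours(cons, 0)[0])  # disease 1 is closest to disease 0
```

The design point is that each view may disagree slightly, while the consensus matrix smooths out view-specific noise before neighbourhood ranking.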
Subjects
Type 2 Diabetes Mellitus , Humans , Type 2 Diabetes Mellitus/genetics , Cluster Analysis , Alzheimer Disease/genetics , Disease/genetics , Phenotype , International Classification of Diseases
ABSTRACT
Hazard assessment is the first step in evaluating the potential adverse effects of chemicals. Traditionally, toxicological assessment has focused on the exposure, overlooking the impact of the exposed system on the observed toxicity. Systems toxicology, in contrast, emphasizes how system properties significantly shape the observed response. Indeed, systems theory states that interactions store more information than individual elements, which has led to the adoption of network-based models to represent complex systems in many fields of the life sciences. Here, we develop a network-based approach to characterize toxicological responses in the context of a biological system, inferring biological-system-specific networks. We directly link molecular alterations to the adverse outcome pathway (AOP) framework, establishing direct connections between omics data and toxicologically relevant phenotypic events. We apply this framework to a dataset of 31 engineered nanomaterials with different physicochemical properties in two in vitro models and one in vivo model and demonstrate how the biological system is the driving force of the observed response. This work highlights the potential of network-based methods to significantly improve our understanding of toxicological mechanisms from a systems biology perspective and provides relevant considerations and future data-driven approaches for the hazard assessment of nanomaterials and other advanced materials.
Subjects
Adverse Outcome Pathways , Nanostructures , Nanostructures/toxicity , Humans , Systems Biology/methods , Animals , Toxicology/methods
ABSTRACT
MOTIVATION: De novo drug development is a long and expensive process that poses significant challenges from design to preclinical testing, making the introduction of new drugs into the market slow and difficult. This limitation paved the way for drug repurposing, which consists of reusing already approved drugs originally developed for other therapeutic indications. Although several efforts have been carried out in the last decade to achieve clinically relevant drug repurposing predictions, the number of repurposed drugs employed in actual pharmacological therapies is still limited. On one hand, mechanistic approaches, including profile-based and network-based methods, exploit the wealth of data on drug sensitivity and perturbational profiles as well as disease transcriptomics profiles. On the other hand, chemocentric approaches, including structure-based methods, take into consideration the intrinsic structural properties of the drugs and their molecular targets. The poor integration between mechanistic and chemocentric approaches is one of the main factors behind the poor translatability of drug repurposing predictions into the clinic. RESULTS: In this work, we introduce DREAM, an R package that integrates mechanistic and chemocentric approaches into a unified computational workflow. DREAM evaluates the druggability of pathological conditions of interest, leveraging robust drug repurposing predictions. In addition, the user can derive optimized sets of drugs putatively suitable for combination therapy. To showcase the functionalities of the DREAM package, we report a case study on atopic dermatitis. AVAILABILITY AND IMPLEMENTATION: DREAM is freely available at https://github.com/fhaive/dream. The docker image of DREAM is available at https://hub.docker.com/r/fhaive/dream.
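The general idea of combining mechanistic and chemocentric evidence can be sketched, under strong simplifying assumptions, as a rank aggregation over two hypothetical score lists. Drug names and scores are invented, and the average-rank scheme is only one of many possible aggregations; DREAM's actual workflow is considerably more elaborate.

```python
# Hypothetical sketch: aggregate a mechanistic score (e.g. disease-signature
# reversal) and a chemocentric score (e.g. a structural/docking score) by
# summing per-drug ranks. Not DREAM's actual integration method.

def ranks(scores):
    """Rank drugs so that the best (highest) score gets rank 1."""
    order = sorted(scores, key=scores.get, reverse=True)
    return {drug: i + 1 for i, drug in enumerate(order)}

mechanistic = {"drugA": 0.9, "drugB": 0.4, "drugC": 0.7}
chemocentric = {"drugA": 0.8, "drugB": 0.6, "drugC": 0.7}

r1, r2 = ranks(mechanistic), ranks(chemocentric)
combined = sorted(r1, key=lambda d: r1[d] + r2[d])
print(combined[0])  # best-ranked candidate across both evidence types
```

A drug that scores well on only one axis is penalized relative to one that is consistently good on both, which is the intuition behind integrating the two approaches.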
Subjects
Drug Repositioning , Transcriptome , Humans , Drug Repositioning/methods
ABSTRACT
SUMMARY: Biological data repositories are an invaluable source of publicly available research evidence. Unfortunately, the lack of convergence of the scientific community on a common metadata annotation strategy has resulted in large amounts of data with low FAIRness (Findable, Accessible, Interoperable and Reusable). The possibility of generating high-quality insights from their integration relies on data curation, a typically error-prone process that is also expensive in terms of time and human labour. Here, we present ESPERANTO, an innovative framework that enables standardized, semi-supervised harmonization and integration of toxicogenomics metadata and increases their FAIRness in a Good Laboratory Practice-compliant fashion. Harmonization across metadata is guaranteed by the definition of an ad hoc vocabulary. The tool's interface is designed to support the user in metadata harmonization in a user-friendly manner, regardless of their background and type of expertise. AVAILABILITY AND IMPLEMENTATION: ESPERANTO and its user manual are freely available for academic purposes at https://github.com/fhaive/esperanto. The input and the results showcased in Supplementary File S1 are available at the same link.
Subjects
Metadata , Software , Humans , Toxicogenetics , Language , Data Curation
ABSTRACT
Adverse outcome pathways (AOPs) are emerging as a central framework in modern toxicology and other fields of biomedicine. They serve as an extension of pathway-based concepts by depicting biological mechanisms as causally linked sequences of key events (KEs), from a molecular initiating event (MIE) to an adverse outcome. AOPs guide the use and development of new approach methodologies (NAMs) aimed at reducing animal experimentation. While AOPs model systemic mechanisms at various levels of biological organisation, toxicogenomics provides the means to study the molecular mechanisms of chemical exposures. Systematic integration of these two concepts would improve the application of AOP-based knowledge while also supporting the interpretation of complex omics data. Hence, we established this link through rigorous curation of molecular annotations for the KEs of human-relevant AOPs. We further expanded and consolidated the annotations of the biological context of KEs. These curated annotations pave the way to embedding AOPs in molecular data interpretation, facilitating the emergence of new knowledge in biomedicine.
Subjects
Adverse Outcome Pathways , Humans , Knowledge Bases , Toxicogenetics
ABSTRACT
MOTIVATION: Transcriptomic data can be used to describe the mechanism of action (MOA) of a chemical compound. However, omics data tend to be complex and prone to noise, making the comparison of different datasets challenging. Often, transcriptomic profiles are compared at the level of individual gene expression values or sets of differentially expressed genes. Such approaches can suffer from underlying technical and biological variance, such as the biological system exposed or the platform used to measure gene expression, and they neglect the relationships between genes. We propose a network mapping approach for knowledge-driven comparison of transcriptomic profiles (KNeMAP), which combines genes into similarity groups based on multiple levels of prior information, hence adding a higher-level view on top of the individual gene view. When comparing KNeMAP with fold-change-based and deregulated-gene-set-based methods, KNeMAP was able to group compounds with higher accuracy with respect to prior information and was less prone to noise-corrupted data. RESULTS: We applied KNeMAP to analyze the Connectivity Map dataset, where the gene expression changes of three cell lines were analyzed after treatment with 676 drugs, as well as the Fortino et al. dataset, where two cell lines exposed to 31 nanomaterials were analyzed. Although the expression profiles across the biological systems are highly different, KNeMAP was able to identify sets of compounds that induce similar molecular responses when applied to the same biological system. AVAILABILITY AND IMPLEMENTATION: Relevant data and the KNeMAP function are available at https://github.com/fhaive/KNeMAP and 10.5281/zenodo.7334711.
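The idea of comparing profiles at the level of knowledge-derived gene groups rather than individual genes can be sketched as follows. The gene groups, genes, and fold changes are invented for illustration, and the simple within-group mean is only a stand-in: KNeMAP's grouping relies on multiple levels of prior information.

```python
# Hypothetical sketch: collapse gene-level fold changes into prior-knowledge
# group scores, then compare two compounds at the group level.
# (Illustrative only; not KNeMAP's actual similarity computation.)

def group_profile(fold_changes, groups):
    """Mean fold change per gene group."""
    return {g: sum(fold_changes[x] for x in members) / len(members)
            for g, members in groups.items()}

def similarity(p, q):
    """Cosine similarity between two group-level profiles."""
    keys = sorted(p)
    num = sum(p[k] * q[k] for k in keys)
    den = (sum(p[k] ** 2 for k in keys) ** .5 *
           sum(q[k] ** 2 for k in keys) ** .5)
    return num / den

groups = {"inflammation": ["IL6", "TNF"], "stress": ["HSPA1A", "HMOX1"]}
drug_a = {"IL6": 2.0, "TNF": 1.5, "HSPA1A": -0.2, "HMOX1": 0.1}
drug_b = {"IL6": 1.8, "TNF": 1.2, "HSPA1A": -0.1, "HMOX1": 0.2}

sim = similarity(group_profile(drug_a, groups), group_profile(drug_b, groups))
print(round(sim, 3))  # the two toy compounds are highly similar
```

Because gene-level noise partially cancels within a group, group-level comparison is more robust than matching individual expression values.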
Subjects
Gene Expression Profiling , Transcriptome
ABSTRACT
Mechanistic toxicology provides a powerful approach to inform the safety assessment of chemicals and the development of safe-by-design compounds. Although toxicogenomics supports the mechanistic evaluation of chemical exposures, its implementation in the regulatory framework is hindered by uncertainties in the analysis and interpretation of such data. The use of mechanistic evidence through the adverse outcome pathway (AOP) concept is promoted for the development of new approach methodologies (NAMs) that can reduce animal experimentation. However, to unleash the full potential of AOPs and build confidence in toxicogenomics, robust associations between AOPs and patterns of molecular alteration need to be established. Systematic curation of molecular events to AOPs will create the much-needed link between toxicogenomics and the systemic mechanisms depicted by the AOPs. This, in turn, will introduce novel ways of benefitting from the AOPs, including predictive models and targeted assays, while also reducing the need for multiple testing strategies. Hence, a multi-step strategy to annotate AOPs is developed, and the resulting associations are applied to successfully highlight relevant adverse outcomes for chemical exposures with strong in vitro and in vivo convergence, supporting chemical grouping and other data-driven approaches. Finally, a panel of AOP-derived in vitro biomarkers for pulmonary fibrosis (PF) is identified and experimentally validated.
Subjects
Adverse Outcome Pathways , Chemical Safety , Animals , Risk Assessment/methods , Toxicogenetics
ABSTRACT
There is an urgent need to apply effective, data-driven approaches to reliably predict engineered nanomaterial (ENM) toxicity. Here we introduce a predictive computational framework based on the molecular and phenotypic effects of a large panel of ENMs across multiple in vitro and in vivo models. Our methodology allows for the grouping of ENMs based on multi-omics approaches combined with robust toxicity tests. Importantly, we identify mRNA-based toxicity markers and extensively replicate them in multiple independent datasets. We find that models based on combinations of omics-derived features and material intrinsic properties display significantly improved predictive accuracy as compared to physicochemical properties alone.
Subjects
Nanostructures , Biomarkers , Nanostructures/toxicity , Messenger RNA/genetics
ABSTRACT
Despite remarkable efforts of computational and predictive pharmacology to improve therapeutic strategies for complex diseases, in only a few cases have the predictions eventually been employed in the clinic. One reason for this drawback is that current predictive approaches are based only on the integration of molecular perturbations of a certain disease with drug sensitivity signatures, neglecting intrinsic properties of the drugs. Here we integrate mechanistic and chemocentric approaches to drug repositioning by developing an innovative network pharmacology strategy. We developed a multilayer network-based computational framework integrating perturbational signatures of the disease as well as intrinsic characteristics of the drugs, such as their mechanism of action and chemical structure. We present five case studies carried out on public data from The Cancer Genome Atlas, including invasive breast cancer, colon adenocarcinoma, lung squamous cell carcinoma, hepatocellular carcinoma and prostate adenocarcinoma. Our results highlight paclitaxel as a suitable drug for combination therapy for many of the considered cancer types. In addition, several non-cancer-related genes representing unusual drug targets were identified as potential candidates for pharmacological treatment of cancer.
ABSTRACT
The recent advancements in toxicogenomics have led to the availability of large omics data sets, which represent the starting point for studying exposure mechanisms of action and identifying candidate biomarkers for toxicity prediction. The current lack of standard methods in data generation and analysis hampers the full exploitation of toxicogenomics-based evidence in regulatory risk assessment. Moreover, the pipelines for the preprocessing and downstream analyses of toxicogenomic data sets can be quite challenging to implement. Over the years, we have developed a number of software packages to address specific questions related to multiple steps of toxicogenomics data analysis and modelling. In this review we present the Nextcast software collection and discuss how its individual tools can be combined into efficient pipelines to answer specific biological questions. Nextcast components are of great support to the scientific community for analysing and interpreting large data sets for the toxicity evaluation of compounds in an unbiased, straightforward, and reliable manner. The Nextcast software suite is available at https://github.com/fhaive/nextcast.
ABSTRACT
Biomarkers are valuable indicators of the state of a biological system. Microarray technology has been extensively used to identify biomarkers and build computational predictive models for disease prognosis, drug sensitivity and toxicity evaluations. Activation biomarkers can be used to understand the underlying signaling cascades, mechanisms of action and biological cross talk. Biomarker detection from microarray data requires several considerations both from the biological and computational points of view. In this chapter, we describe the main methodology used in biomarkers discovery and predictive modeling and we address some of the related challenges. Moreover, we discuss biomarker validation and give some insights into multiomics strategies for biomarker detection.
Subjects
Microarray Analysis , Biomarkers , Biomedical Research
ABSTRACT
The amount of data made available by microarrays gives researchers the opportunity to delve into the complexity of biological systems. However, the noisy and extremely high-dimensional nature of this kind of data poses significant challenges. Microarrays allow for the parallel measurement of thousands of molecular objects spanning different layers of interactions. In order to be able to discover hidden patterns, the most disparate analytical techniques have been proposed. Here, we describe the basic methodologies to approach the analysis of microarray datasets that focus on the task of (sub)group discovery.
Subjects
Algorithms , Cluster Analysis , Gene Expression Profiling , Oligonucleotide Array Sequence Analysis
ABSTRACT
The pharmacological arsenal against the COVID-19 pandemic is largely based on generic anti-inflammatory strategies or poorly scalable solutions. Moreover, as the ongoing vaccination campaign is rolling out more slowly than hoped, affordable and effective therapeutics are needed. To this end, there is increasing attention toward computational methods for drug repositioning and de novo drug design. Here, multiple data-driven computational approaches are systematically integrated to perform a virtual screening and prioritize candidate drugs for the treatment of COVID-19. From the list of prioritized drugs, a subset of representative candidates to test in human cells is selected. Two compounds, 7-hydroxystaurosporine and bafetinib, show synergistic antiviral effects in vitro and strongly inhibit viral-induced syncytia formation. Moreover, since existing drug repositioning methods provide limited usable information for de novo drug design, the relevant chemical substructures of the identified drugs are extracted to provide a chemical vocabulary that may help to design new effective drugs.
Subjects
Antiviral Agents/pharmacology , COVID-19 Drug Treatment , COVID-19 , Giant Cells , Pyrimidines/pharmacology , SARS-CoV-2/metabolism , Staurosporine/analogs & derivatives , A549 Cells , COVID-19/metabolism , Computational Biology , Preclinical Drug Evaluation , Drug Repositioning , Giant Cells/metabolism , Giant Cells/virology , Humans , Staurosporine/pharmacology
ABSTRACT
De novo drug design is a computational approach that generates novel molecular structures from atomic building blocks with no a priori relationships. Conventional methods include structure-based and ligand-based design, which depend on the properties of the active site of a biological target or of its known active binders, respectively. Artificial intelligence, including machine learning, is an emerging field that has positively impacted the drug discovery process. Deep reinforcement learning is a subdivision of machine learning that combines artificial neural networks with reinforcement-learning architectures. This method has successfully been employed to develop novel de novo drug design approaches using a variety of artificial networks, including recurrent neural networks, convolutional neural networks, generative adversarial networks, and autoencoders. This review article summarizes advances in de novo drug design, from conventional growth algorithms to advanced machine-learning methodologies, and highlights hot topics for further development.
Subjects
Drug Design , Machine Learning , Neural Networks (Computer) , Pharmaceutical Preparations/chemistry , Animals , Humans
ABSTRACT
BACKGROUND: Omics technologies have been widely applied in toxicology studies to investigate the effects of different substances on exposed biological systems. A classical toxicogenomic study consists of testing the effects of a compound at different dose levels and time points. The main challenge lies in identifying the gene alteration patterns that are correlated with doses and time points. The majority of existing methods for toxicogenomics data analysis allow the study of molecular alterations after the exposure (or treatment) at each time point individually. However, this kind of analysis cannot identify dynamic (time-dependent) events of dose responsiveness. RESULTS: We propose TinderMIX, an approach that simultaneously models the effects of time and dose on the transcriptome to investigate the course of molecular alterations exerted in response to the exposure. Starting from gene log fold-changes, TinderMIX fits different integrated time and dose models to each gene, selects the optimal one, and computes its time and dose effect map; a user-selected threshold is then applied to identify the responsive area of each map and verify whether the gene shows a dynamic (time-dependent) and dose-dependent response; eventually, responsive genes are labelled according to their integrated time and dose point of departure. CONCLUSIONS: To showcase the TinderMIX method, we analysed two drugs from the Open TG-GATEs dataset, namely cyclosporin A and thioacetamide. We first identified the dynamic dose-dependent mechanism of action of each drug and then compared them. Our analysis highlights that the time- and dose-integrated points of departure recapitulate the toxicity potential of the compounds as well as their dynamic dose-dependent mechanisms of action.
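A toy version of joint dose-time modelling for a single gene might look like the following sketch, which fits one plane lfc = a + b·dose + c·time by least squares on invented data. TinderMIX fits and compares several candidate models and derives an effect map with a point of departure; none of that machinery is reproduced here.

```python
# Hypothetical sketch: fit a single plane lfc = a + b*dose + c*time to one
# gene's log fold-changes by solving the 3x3 normal equations.
# (Illustrative only; not TinderMIX's model family or selection step.)

def fit_plane(doses, times, lfc):
    """Least-squares coefficients (a, b, c) for lfc = a + b*dose + c*time."""
    X = [[1.0, d, t] for d, t in zip(doses, times)]
    A = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
    y = [sum(r[i] * v for r, v in zip(X, lfc)) for i in range(3)]
    # Gaussian elimination with partial pivoting, then back substitution
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        y[col], y[piv] = y[piv], y[col]
        for r in range(col + 1, 3):
            f = A[r][col] / A[col][col]
            for c in range(col, 3):
                A[r][c] -= f * A[col][c]
            y[r] -= f * y[col]
    coef = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        coef[r] = (y[r] - sum(A[r][c] * coef[c]
                              for c in range(r + 1, 3))) / A[r][r]
    return coef

# Synthetic gene: responds to dose, and more strongly at later times
doses = [0, 1, 2, 0, 1, 2, 0, 1, 2]
times = [4, 4, 4, 8, 8, 8, 24, 24, 24]
lfc   = [0.0, 0.5, 1.0, 0.0, 0.9, 1.9, 0.1, 1.6, 3.0]

a, b, c = fit_plane(doses, times, lfc)
print(round(b, 2), round(c, 3))  # positive dose and time effects
```

A gene whose fitted dose coefficient exceeds a user-chosen threshold in some dose-time region would be flagged as dose-responsive there; that thresholding step is the analogue of TinderMIX's responsive-area detection.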
Subjects
Computational Biology/methods , Software , Toxicogenetics/methods , Algorithms , Dose-Response Relationship (Drug) , Gene Expression Profiling , Gene Expression Regulation/drug effects , Humans , Pharmacogenomic Tests , Pharmacogenomic Variants
ABSTRACT
Preprocessing of transcriptomics data plays a pivotal role in the development of toxicogenomics-driven tools for chemical toxicity assessment. The generation and exploitation of large volumes of molecular profiles, following an appropriate experimental design, allows the employment of toxicogenomics (TGx) approaches for a thorough characterisation of the mechanism of action (MOA) of different compounds. To date, a plethora of data preprocessing methodologies have been suggested. However, in most cases, building the optimal analytical workflow is not straightforward. A careful selection of the right tools must be carried out, since it will affect the downstream analyses and modelling approaches. Transcriptomics data preprocessing spans multiple steps, such as quality check, filtering, normalization, and batch effect detection and correction. Currently, there is a lack of standard guidelines for data preprocessing in the TGx field. Defining the optimal tools and procedures to be employed in transcriptomics data preprocessing will lead to the generation of homogeneous and unbiased data, allowing the development of more reliable, robust and accurate predictive models. In this review, we outline methods for the preprocessing of three main transcriptomic technologies: microarray, bulk RNA sequencing (RNA-Seq), and single-cell RNA sequencing (scRNA-Seq). Moreover, we discuss the most common methods for identifying differentially expressed genes and for performing functional enrichment analysis. This review is the second part of a three-article series on Transcriptomics in Toxicogenomics.
ABSTRACT
The starting point of successful hazard assessment is the generation of unbiased and trustworthy data. Conventional toxicity testing deals with extensive observations of phenotypic endpoints in vivo and complementing in vitro models. The increasing development of novel materials and chemical compounds dictates the need for a better understanding of the molecular changes occurring in exposed biological systems. Transcriptomics enables the exploration of organisms' responses to environmental, chemical, and physical agents by observing the molecular alterations in more detail. Toxicogenomics integrates classical toxicology with omics assays, thus allowing the characterization of the mechanism of action (MOA) of chemical compounds, novel small molecules, and engineered nanomaterials (ENMs). Lack of standardization in data generation and analysis currently hampers the full exploitation of toxicogenomics-based evidence in risk assessment. To fill this gap, TGx methods need to take into account appropriate experimental design and possible pitfalls in the transcriptomic analyses as well as data generation and sharing that adhere to the FAIR (Findable, Accessible, Interoperable, and Reusable) principles. In this review, we summarize the recent advancements in the design and analysis of DNA microarray, RNA sequencing (RNA-Seq), and single-cell RNA-Seq (scRNA-Seq) data. We provide guidelines on exposure time, dose, and complex endpoint selection, as well as sample quality considerations and sample randomization. Furthermore, we summarize publicly available data resources and highlight applications of TGx data to understand and predict chemical toxicity potential. Additionally, we discuss the efforts to implement TGx into regulatory decision making to promote alternative methods for risk assessment and to support the 3R (reduction, refinement, and replacement) concept. This review is the first part of a three-article series on Transcriptomics in Toxicogenomics.
These initial considerations on Experimental Design, Technologies, Publicly Available Data, and Regulatory Aspects are the starting point for the rigorous and reliable data preprocessing and modelling described in the second and third parts of the review series.
ABSTRACT
Transcriptomics data are relevant to address a number of challenges in Toxicogenomics (TGx). After careful planning of exposure conditions and data preprocessing, the TGx data can be used in predictive toxicology, where more advanced modelling techniques are applied. The large volume of molecular profiles produced by omics-based technologies allows the development and application of artificial intelligence (AI) methods in TGx. Indeed, the publicly available omics datasets are constantly increasing, together with a plethora of different methods made available to facilitate their analysis, interpretation and the generation of accurate and stable predictive models. In this review, we present the state-of-the-art of data modelling applied to transcriptomics data in TGx. We show how benchmark dose (BMD) analysis can be applied to TGx data. We review read-across and adverse outcome pathway (AOP) modelling methodologies. We discuss how network-based approaches can be successfully employed to clarify the mechanism of action (MOA) or specific biomarkers of exposure. We also describe the main AI methodologies applied to TGx data to create predictive classification and regression models, and we address current challenges. Finally, we present a short description of deep learning (DL) and data integration methodologies applied in these contexts. Modelling of TGx data represents a valuable tool for more accurate chemical safety assessment. This review is the third part of a three-article series on Transcriptomics in Toxicogenomics.
ABSTRACT
MOTIVATION: The analysis of dose-dependent effects on gene expression is gaining attention in the field of toxicogenomics. Currently available computational methods are usually limited to specific omics platforms or biological annotations and can analyse only one experiment at a time. RESULTS: We developed BMDx, a software with a graphical user interface for the benchmark dose (BMD) analysis of transcriptomics data. We implemented an approach based on the fitting of multiple models and the selection of the optimal model based on the Akaike information criterion. The BMDx tool takes as input a gene expression matrix and a phenotype table, computes the BMD and its related values, and provides IC50/EC50 estimations. It reports interactive tables and plots that the user can investigate for further details of the fitting, dose effects and functional enrichment. BMDx allows a fast and convenient comparison of the BMD values of a transcriptomics experiment at different time points and an effortless way to interpret the results. Furthermore, BMDx allows the user to analyse and compare multiple experiments at once. AVAILABILITY AND IMPLEMENTATION: BMDx is implemented as an R/Shiny application and is available at https://github.com/Greco-Lab/BMDx/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
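The core BMD workflow described above, fitting several dose-response models, selecting one by the Akaike information criterion, and reading off the benchmark dose, can be sketched with two toy models and invented data. BMDx fits a much larger model family, and the benchmark response definition used here (one standard deviation of the observed responses) is a simplification.

```python
import math

# Hypothetical sketch of AIC-based model selection for one gene's
# dose-response data, followed by a simple benchmark dose estimate.
# (Illustrative only; not BMDx's actual model family or BMD definition.)

def fit_linear(d, y):
    """Closed-form least-squares fit of y = a + b*d."""
    n = len(d)
    mx, my = sum(d) / n, sum(y) / n
    b = sum((x - mx) * (v - my) for x, v in zip(d, y)) / \
        sum((x - mx) ** 2 for x in d)
    return my - b * mx, b

def aic(y, pred, k):
    """AIC from the residual sum of squares with k fitted parameters."""
    n = len(y)
    rss = sum((v - p) ** 2 for v, p in zip(y, pred))
    return n * math.log(rss / n) + 2 * k

doses = [0, 0.5, 1, 2, 4]
expr  = [1.0, 1.1, 1.4, 1.9, 3.1]

a, b = fit_linear(doses, expr)
lin_pred = [a + b * d for d in doses]
const_pred = [sum(expr) / len(expr)] * len(expr)

best = "linear" if aic(expr, lin_pred, 2) < aic(expr, const_pred, 1) else "constant"

# Benchmark dose: dose at which the best model departs from the control
# response by a benchmark response (BMR); here BMR = one SD of the data
mean = sum(expr) / len(expr)
bmr = (sum((v - mean) ** 2 for v in expr) / len(expr)) ** .5
bmd = bmr / b if best == "linear" else float("inf")
print(best, round(bmd, 2))
```

When no dose-dependent model beats the flat model on AIC, the gene is treated as non-responsive and no finite BMD is reported, mirroring the filtering role that model selection plays in BMD pipelines.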
Subjects
Benchmarking , Computational Biology , Software , Toxicogenetics , Transcriptome
ABSTRACT
Magnetic resonance imaging allows the acquisition of functional and structural connectivity data from which high-density whole-brain networks can be derived to carry out connectome-wide analyses in normal and clinical populations. Graph theory has been widely applied to investigate the modular structure of brain connections, using centrality measures to identify the "hubs" of human connectomes and community detection methods to delineate subnetworks associated with diverse cognitive and sensorimotor functions. These analyses typically rely on a preprocessing step (pruning) to reduce computational complexity and remove the weakest edges, which are most likely affected by experimental noise. However, weak links may contain relevant information about brain connectivity; therefore, the identification of the optimal trade-off between retained and discarded edges is a subject of active research. We introduce a pruning algorithm that identifies the edges carrying the highest information content. The algorithm selects both strong edges (i.e. edges belonging to shortest paths) and weak edges that are topologically relevant in weakly connected subnetworks. The newly developed "strong-weak" pruning (SWP) algorithm was validated on simulated networks that mimic the structure of human brain networks. It was then applied to the analysis of a real dataset of subjects affected by amyotrophic lateral sclerosis (ALS), both at the early (ALS2) and late (ALS3) stages of the disease, and of healthy control subjects. SWP preprocessing allowed the identification of statistically significant differences in the path length of networks between patients and healthy subjects. ALS patients showed decreased connectivity between the frontal and temporal cortex, between the frontal and parietal cortex, and between the temporal and occipital cortex. Moreover, degree centrality measures revealed significantly different hub and centrality scores between patient subgroups.
These findings suggest a widespread alteration of network topology in ALS associated with disease progression.
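The "strong edge" ingredient of SWP, retaining edges that belong to shortest paths, can be illustrated with a small stdlib-only sketch on an invented weighted graph. The weak-edge rescue step of SWP and the validation on brain networks are not reproduced here.

```python
import heapq

# Hypothetical sketch: keep only edges used by shortest-path trees rooted
# at every node (one component of "strong-weak" pruning; the weak-edge
# selection step is omitted).

def dijkstra_edges(adj, src):
    """Set of edges used by a shortest-path tree from src (lazy Dijkstra)."""
    dist = {src: 0.0}
    parent = {}
    heap = [(0.0, src)]
    seen = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in seen:
            continue
        seen.add(u)
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return {frozenset((v, parent[v])) for v in parent}

# Toy weighted graph; edge weight = distance (e.g. 1 / connection strength)
adj = {
    "A": [("B", 1.0), ("C", 5.0)],
    "B": [("A", 1.0), ("C", 1.0), ("D", 4.0)],
    "C": [("A", 5.0), ("B", 1.0), ("D", 1.0)],
    "D": [("B", 4.0), ("C", 1.0)],
}

strong = set()
for node in adj:
    strong |= dijkstra_edges(adj, node)
pruned = {tuple(sorted(e)) for e in strong}
print(sorted(pruned))  # the long detour edges A-C and B-D are dropped
```

Edges that never lie on any shortest path (here A-C and B-D) are discarded, while the backbone of the network is preserved, which is why shortest-path membership is a natural criterion for "strong" edges.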