ABSTRACT
With over 450 genes, solute carriers (SLCs) constitute the largest transporter superfamily responsible for the uptake and efflux of nutrients, metabolites, and xenobiotics in human cells. SLCs are associated with a wide variety of human diseases, including cancer, diabetes, and metabolic and neurological disorders. They represent an important therapeutic target class that remains only partly exploited, as therapeutics that target SLCs are scarce. Additionally, many small molecules reported in the literature to target SLCs are poorly characterized. Both features may be due to the difficulty of developing SLC transport assays that fulfill the quality criteria for high-throughput screening. Here, we report one of the main limitations hampering assay development within the RESOLUTE consortium: the lack of a resource providing high-quality information on SLC tool compounds. To address this, we provide a systematic annotation of tool compounds targeting SLCs. We first provide an overview of RESOLUTE assays. Next, we present a list of SLC-targeting compounds collected from the literature and public databases; we found that most data sources lacked specificity data. Finally, we report on experimental tests of 19 selected compounds against a panel of 13 SLCs from seven different families. Except for a few inhibitors, which were active on unrelated SLCs, the tested inhibitors demonstrated high selectivity for their reported targets. To make this knowledge easily accessible to the scientific community, we created an interactive dashboard displaying the collected data in the RESOLUTE web portal (https://re-solute.eu). We anticipate that our open-access resources on assays and compounds will support the development of future drug discovery campaigns for SLCs.
ABSTRACT
The solute carrier family 6 (SLC6) transporters are of key interest because of their critical role in the transport of small amino acids or amino acid-like molecules. Their dysfunction is strongly associated with human diseases, including schizophrenia, depression, and Parkinson's disease. Linking single point mutations to disease may provide insights into the structure-function relationship of these transporters. This work aimed to develop a computational model for predicting the potential pathogenic effect of single point mutations in the SLC6 family. Missense mutation data were retrieved from UniProt, LitVar, and ClinVar, covering multiple protein-coding transcripts. As an encoding approach, amino acid descriptors were used to calculate the average sequence properties for both original and mutated sequences. In addition to the full-sequence calculation, the sequences were cut into twelve domains, defined according to the transmembrane domains of the SLC6 transporters, to analyse each region's contribution to the pathogenicity prediction. Subsequently, several classification models, namely Support Vector Machine (SVM), Logistic Regression (LR), Random Forest (RF), and Extreme Gradient Boosting (XGBoost), were built, with hyperparameters optimized through grid search. Model performance was estimated by repeated stratified k-fold cross-validation. The accuracy values of the generated models are in the range of 0.72 to 0.80. Analysis of feature importance indicates that mutations in distinct regions of SLC6 transporters are associated with an increased risk of pathogenicity. When the model was applied to an independent validation set, accuracy dropped to about 0.6 on average, with high precision but low sensitivity.
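The encoding step described above can be illustrated with a minimal sketch. The abstract does not say which amino acid descriptors were used, so the Kyte-Doolittle hydropathy scale stands in here purely as an assumption; the toy sequence, the two domain boundaries, and all function names are hypothetical and chosen only to keep the example short.

```python
from statistics import mean

# Kyte-Doolittle hydropathy index, used only as a stand-in descriptor
# (assumption: the abstract does not name the descriptors actually used).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def apply_mutation(sequence, position, original, mutated):
    """Return the sequence with one substitution at a 1-based position."""
    if sequence[position - 1] != original:
        raise ValueError("reference residue does not match the sequence")
    return sequence[:position - 1] + mutated + sequence[position:]


def average_descriptor(sequence, scale=KYTE_DOOLITTLE):
    """Average a per-residue descriptor over a sequence or a sequence segment."""
    return mean(scale[residue] for residue in sequence)


def encode_mutation(sequence, position, original, mutated, domains):
    """Feature vector: full-sequence averages for wild type and mutant,
    followed by the same pair of averages for every domain."""
    mutant = apply_mutation(sequence, position, original, mutated)
    features = [average_descriptor(sequence), average_descriptor(mutant)]
    for start, end in domains:  # 1-based, inclusive boundaries
        features.append(average_descriptor(sequence[start - 1:end]))
        features.append(average_descriptor(mutant[start - 1:end]))
    return features


# Hypothetical toy input: an 8-residue sequence split into two "domains".
toy_sequence = "MALLGVIA"
toy_domains = [(1, 4), (5, 8)]
features = encode_mutation(toy_sequence, 3, "L", "D", toy_domains)
```

Only the domain that contains the substituted residue changes its average, which is what lets a downstream classifier attribute importance to individual regions.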
ABSTRACT
In recent years, interest in solute carrier (SLC) transporters has increased due to their potential as drug targets. At the same time, macrocycles have demonstrated promising activities as therapeutic agents. However, the overall macrocycle/SLC transporter interaction landscape has not yet been fully revealed. In this study, we present a statistical analysis of macrocycles with measured activity against SLC transporters. Using a data mining pipeline based on KNIME, we retrieved a total of 825 bioactivity data points of macrocycles interacting with SLC transporters. For further analysis of the SLC inhibitor profiles, we developed an interactive KNIME workflow as well as an interactive map of the chemical space coverage utilizing parametric t-SNE models. The parametric t-SNE models discriminate well among the targets of several SLC subfamilies. The KNIME workflow, the dataset, and the visualization tool are freely available to the community.
Subjects
Macrocyclic Compounds, Macrocyclic Compounds/chemistry, Macrocyclic Compounds/pharmacology, Humans, Solute Carrier Proteins/antagonists & inhibitors, Data Mining
ABSTRACT
WikiPathways (wikipathways.org) is an open-source biological pathway database. Collaboration and open science are pivotal to the success of WikiPathways. Here we highlight the continuing efforts supporting WikiPathways, content growth and collaboration among pathway researchers. As an evolving database, there is a growing need for WikiPathways to address and overcome technical challenges. In this direction, WikiPathways has undergone major restructuring, enabling a renewed approach for sharing and curating pathway knowledge, thus providing stability for the future of community pathway curation. The website has been redesigned to improve and enhance user experience. This next generation of WikiPathways continues to support existing features while improving maintainability of the database and facilitating community input by providing new functionality and leveraging automation.
Subjects
Factual Databases
ABSTRACT
The solute carrier (SLC) superfamily represents the biggest family of transporters with important roles in health and disease. Despite being attractive and druggable targets, the majority of SLCs remain understudied. One major hurdle in research on SLCs is the lack of tools, such as cell-based assays, for investigating their biological role and for drug discovery. Another challenge is the dispersed and anecdotal information on assay strategies that are suitable for SLCs. This review provides a comprehensive overview of state-of-the-art cellular assay technologies for SLC research and discusses relevant SLC characteristics enabling the choice of an optimal assay technology. The Innovative Medicines Initiative consortium RESOLUTE intends to accelerate research on SLCs by providing the scientific community with high-quality reagents, assay technologies and data sets, and to ultimately unlock SLCs for drug discovery.
ABSTRACT
WikiPathways (https://www.wikipathways.org) is a biological pathway database known for its collaborative nature and open science approaches. With the core idea of the scientific community developing and curating biological knowledge in pathway models, WikiPathways lowers all barriers for accessing and using its content. A growing number of content creators, initiatives, projects and tools have started using WikiPathways. Central to this growth and increased use of WikiPathways are the various communities that focus on particular subsets of molecular pathways, such as those for rare diseases and lipid metabolism. Knowledge from published pathway figures helps prioritize pathway development, using optical character and named entity recognition. We show the growth of WikiPathways over the last three years, highlight the new communities and collaborations of pathway authors and curators, and describe various technologies to connect to external resources and initiatives. The road toward a sustainable, community-driven pathway database goes through integration with other resources such as Wikidata and allowing more use, curation and redistribution of WikiPathways content.
Subjects
Factual Databases, COVID-19/pathology, Data Curation, Humans, Publications, User-Computer Interface
ABSTRACT
BACKGROUND: The KNIME platform offers several tools for the analysis of chem- and pharmacoinformatics data. Unless one has sufficient in-house data available for the analysis of interest, it is necessary to fetch third-party data into KNIME. Many data sources offer valuable data, but including these data in a workflow is not always straightforward. OBJECTIVE: Here we discuss different ways of accessing public data sources. We give an overview of KNIME nodes for different sources, with references to available example workflows. For data sources with no individual KNIME node available, we present a general approach to accessing a web interface via KNIME. In addition, we discuss necessary steps before the data can be analysed, such as data curation, chemical standardisation and the merging of datasets.
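The curation and merging steps mentioned above are carried out in KNIME in the paper; the following Python sketch only illustrates the general idea outside KNIME. It assumes records that already carry a structure key (here an `inchikey` field), an activity value, and a unit. All field names, source labels, and placeholder keys are hypothetical, and chemical standardisation is not performed, since that would need a cheminformatics toolkit.

```python
from collections import defaultdict
from statistics import median

# Conversion factors to nanomolar; units outside this table are treated as unusable.
UNIT_TO_NM = {"nM": 1.0, "uM": 1_000.0, "mM": 1_000_000.0}


def curate(records):
    """Drop incomplete records and convert every activity value to nM."""
    cleaned = []
    for record in records:
        key = (record.get("inchikey") or "").strip().upper()
        value, unit = record.get("value"), record.get("unit")
        if not key or value is None or unit not in UNIT_TO_NM:
            continue  # curation step: skip records that cannot be compared
        cleaned.append({
            "inchikey": key,
            "value_nm": float(value) * UNIT_TO_NM[unit],
            "source": record.get("source", "unknown"),
        })
    return cleaned


def merge(*datasets):
    """Merge curated datasets on the structure key, keeping the median value
    and the list of contributing sources for each compound."""
    grouped = defaultdict(list)
    for dataset in datasets:
        for record in curate(dataset):
            grouped[record["inchikey"]].append(record)
    return {
        key: {
            "value_nm": median(r["value_nm"] for r in group),
            "sources": sorted({r["source"] for r in group}),
        }
        for key, group in grouped.items()
    }


# Hypothetical toy records with placeholder keys instead of real InChIKeys.
source_a = [
    {"inchikey": "key-a", "value": 0.5, "unit": "uM", "source": "source_a"},
    {"inchikey": "key-b", "value": 20, "unit": "nM", "source": "source_a"},
]
source_b = [
    {"inchikey": "KEY-A", "value": 1500, "unit": "nM", "source": "source_b"},
    {"inchikey": "KEY-C", "value": 3, "unit": None, "source": "source_b"},
]
merged = merge(source_a, source_b)
```

Taking the median of duplicate measurements is one common choice; whether to average, keep all values, or discard conflicting entries depends on the analysis at hand.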
Subjects
Factual Databases, Software, Workflow
ABSTRACT
Open PHACTS is a pre-competitive project to answer scientific questions developed recently by the pharmaceutical industry. High-quality biological interaction information in the Open PHACTS Discovery Platform is needed to answer multiple pathway-related questions. To address this, updated WikiPathways data has been added to the platform. This data includes information about biological interactions, such as stimulation and inhibition. The platform's Application Programming Interface (API) was extended with appropriate calls to reference these interactions. These new methods of the Open PHACTS API are available now.
Subjects
Antineoplastic Agents/pharmacology, Biomedical Research, Computational Biology/methods, Drug Discovery, Information Storage and Retrieval/methods, Signal Transduction, Software, Drug Industry, Humans, Hypertrophy/drug therapy, Hypertrophy/pathology, Cardiac Myocytes/cytology, Cardiac Myocytes/drug effects, Neoplasms/drug therapy, Neoplasms/pathology
ABSTRACT
The Open PHACTS Discovery Platform integrates several public databases, which can be of interest when annotating the results of a phenotypic screening campaign. Workflow tools provide easy-to-customize possibilities to access the platform. Here, we describe how to create such workflows for two different workflow tools (KNIME and Pipeline Pilot), including a protocol to annotate compounds (e.g., phenotypic screening hits) with compound classification, known protein targets, and classifications of the targets.
Subjects
Computational Biology/methods, Drug Discovery, Software, Drug Discovery/methods, Molecular Sequence Annotation, Phenotype, Web Browser, Workflow
ABSTRACT
WikiPathways (wikipathways.org) captures the collective knowledge represented in biological pathways. Providing this knowledge as a curated, machine-readable database enables omics data analysis and visualization. WikiPathways and other pathway databases are used to analyze experimental data by research groups in many fields. Due to the open and collaborative nature of the WikiPathways platform, our content keeps growing and is getting more accurate, making WikiPathways a reliable and rich pathway database. Previously, however, the focus was primarily on genes and proteins, leaving many metabolites with only limited annotation. Recent curation efforts focused on improving the annotation of metabolism and metabolic pathways by associating unmapped metabolites with database identifiers and providing more detailed interaction knowledge. Here, we report the outcomes of the continued growth and curation efforts, such as a doubling of the number of annotated metabolite nodes in WikiPathways. Furthermore, we introduce an OpenAPI documentation of our web services and the FAIR (Findable, Accessible, Interoperable and Reusable) annotation of resources to increase the interoperability of the knowledge encoded in these pathways and experimental omics data. New search options, monthly downloads, more links to metabolite databases, and new portals make pathway knowledge more easily accessible to individual researchers and research communities.
Subjects
Chemical Databases, Metabolomics, Animals, Data Curation, Data Mining, Chemical Databases/standards, Genetic Databases, Humans, Metabolic Networks and Pathways, Quality Control, Search Engine, Software
ABSTRACT
With the public availability of large data sources such as ChEMBLdb and the Open PHACTS Discovery Platform, retrieval of data sets for certain protein targets of interest with consistent assay conditions is no longer a time-consuming process. In particular, workflow engines such as KNIME or Pipeline Pilot allow complex queries and make it possible to search for several targets simultaneously. Data can then directly be used as input to various ligand- and structure-based studies. In this contribution, using in-house projects on P-gp inhibition, transporter selectivity, and TRPV1 modulation, we outline how the incorporation of linked life science data in the daily execution of projects allowed us to expand our approaches from conventional Hansch analysis to complex, integrated multilayer models.
Subjects
Biological Science Disciplines, Computational Biology/methods, Drug Design, Drug Industry, Software, Molecular Structure, Proteins/chemistry, Structure-Activity Relationship, Workflow
ABSTRACT
BACKGROUND: The human ATP binding cassette transporters Breast Cancer Resistance Protein (BCRP) and Multidrug Resistance Protein 1 (P-gp) are co-expressed in many tissues and barriers, especially at the blood-brain barrier and at the hepatocyte canalicular membrane. Understanding their interplay in affecting the pharmacokinetics of drugs is of prime interest. In silico tools to predict inhibition and substrate profiles towards BCRP and P-gp might serve as early filters in the drug discovery and development process. However, to build such models, pharmacological data must be collected for both targets, which is a tedious task, often involving manual and poorly reproducible steps. RESULTS: Compounds with inhibitory activity measured against BCRP and/or P-gp were retrieved by combining Open Data and manually curated data from literature using a KNIME workflow. After determination of compound overlap, machine learning approaches were used to establish multi-label classification models for BCRP/P-gp. Different ways of addressing multi-label problems are explored and compared: label-powerset, binary relevance and classifiers chain. Label-powerset revealed important molecular features for selective or polyspecific inhibitory activity. In our dataset, only two descriptors (the numbers of hydrophobic and aromatic atoms) were sufficient to separate selective BCRP inhibitors from selective P-gp inhibitors. Also, dual inhibitors share properties with both groups of selective inhibitors. Binary relevance and classifiers chain allow improving the predictivity of the models. CONCLUSIONS: The KNIME workflow proved a useful tool to merge data from diverse sources. It could be used for building multi-label datasets of any set of pharmacological targets for which there is data available either in the open domain or in-house. By applying various multi-label learning algorithms, important molecular features driving transporter selectivity could be retrieved. 
Finally, using the dataset with missing annotations, predictive models can be derived in cases where no accurate dense dataset is available (not enough data overlap or no well-balanced class distribution).
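The three multi-label strategies compared in this abstract differ mainly in how the two activity labels are turned into learning problems. The sketch below shows only that transformation step on hypothetical toy data; it is not the authors' KNIME workflow, no classifier is trained, and the descriptor values and function names are invented for illustration.

```python
def label_powerset(labels):
    """Treat each (BCRP, P-gp) label combination as one class
    of a single multi-class problem."""
    names = {
        (0, 0): "inactive",
        (1, 0): "BCRP-selective",
        (0, 1): "P-gp-selective",
        (1, 1): "dual",
    }
    return [names[pair] for pair in labels]


def binary_relevance(features, labels):
    """Split the task into one independent binary problem per target."""
    bcrp = [(x, y[0]) for x, y in zip(features, labels)]
    pgp = [(x, y[1]) for x, y in zip(features, labels)]
    return bcrp, pgp


def classifier_chain(features, labels):
    """The first problem is as in binary relevance; the second receives the
    first label as an extra input feature. At prediction time the predicted
    first label would take the place of the true one."""
    first = [(x, y[0]) for x, y in zip(features, labels)]
    second = [(x + [y[0]], y[1]) for x, y in zip(features, labels)]
    return first, second


# Hypothetical toy data: [number of hydrophobic atoms, number of aromatic atoms]
features = [[12, 6], [25, 18], [20, 12], [5, 0]]
# Labels as (BCRP inhibitor, P-gp inhibitor)
labels = [(1, 0), (0, 1), (1, 1), (0, 0)]

powerset_classes = label_powerset(labels)
bcrp_task, pgp_task = binary_relevance(features, labels)
chain_first, chain_second = classifier_chain(features, labels)
```

Label-powerset keeps the label combinations intact, which is why it lends itself to asking what distinguishes selective from dual inhibitors, whereas binary relevance and classifier chains keep each target as its own prediction task.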
ABSTRACT
Modern data-driven drug discovery requires integrated resources to support decision-making and enable new discoveries. The Open PHACTS Discovery Platform (http://dev.openphacts.org) was built to address this requirement by focusing on drug discovery questions that are of high priority to the pharmaceutical industry. Although complex, most of these frequently asked questions (FAQs) revolve around the combination of data concerning compounds, targets, pathways and diseases. Computational drug discovery using workflow tools and the integrated resources of Open PHACTS can deliver answers to most of these questions. Here, we report on a selection of workflows used for solving these use cases and discuss some of the research challenges. The workflows are accessible online from myExperiment (http://www.myexperiment.org) and are available for reuse by the scientific community.
Subjects
Computational Biology, Chemical Databases, Pharmaceutical Databases, Decision Support Techniques, Drug Discovery/methods, Pharmaceutical Preparations/chemistry, Workflow, Access to Information, Data Mining, Humans, Molecular Structure, Signal Transduction/drug effects, Structure-Activity Relationship, Systems Integration
ABSTRACT
Integration of open-access, curated, high-quality information from multiple disciplines in the Life and Biomedical Sciences provides a holistic understanding of the domain. Additionally, the effective linking of diverse data sources can unearth hidden relationships and guide potential research strategies. However, given the lack of consistency between descriptors and identifiers used in different resources and the absence of a simple mechanism to link them, gathering and combining relevant, comprehensive information from diverse databases remains a challenge. The Open Pharmacological Concepts Triple Store (Open PHACTS) is an Innovative Medicines Initiative project that uses semantic web technology approaches to enable scientists to easily access and process data from multiple sources to solve real-world drug discovery problems. The project draws together sources of publicly available pharmacological, physicochemical and biomolecular data, represents them in a stable infrastructure and provides well-defined information exploration and retrieval methods. Here, we highlight the utility of this platform in conjunction with workflow tools to solve pharmacological research questions that require interoperability between target, compound, and pathway data. Use cases presented herein cover 1) the comprehensive identification of chemical matter for a dopamine receptor drug discovery program, 2) the identification of compounds active against all targets in the Epidermal growth factor receptor (ErbB) signaling pathway that have a relevance to disease, and 3) the evaluation of established targets in the Vitamin D metabolism pathway to aid novel Vitamin D analogue design. The example workflows presented illustrate how the Open PHACTS Discovery Platform can be used to exploit existing knowledge and generate new hypotheses in the process of drug discovery.
Subjects
Databases as Topic, Drug Discovery/organization & administration, Software, Drug Discovery/methods, Drug Discovery/statistics & numerical data
ABSTRACT
Currently, there are more than 800 well-characterized human membrane transport proteins (including channels and transporters), and there are estimates that about 10% (approx. 2000) of all human genes are related to transport. Membrane transport proteins are of interest as potential drug targets, for drug delivery, and as a cause of side effects and drug-drug interactions. In light of the development of Open PHACTS, which provides an open pharmacological space, we analyzed selected membrane transport protein classification schemes (Transporter Classification Database, ChEMBL, IUPHAR/BPS Guide to Pharmacology, and Gene Ontology) for their ability to serve as a basis for pharmacology-driven protein classification. A comparison of these membrane transport protein classification schemes, using a set of clinically relevant transporters as a use case, reveals the strengths and weaknesses of the different taxonomy approaches.
Subjects
Pharmaceutical Databases, Protein Databases, Membrane Transport Proteins/chemistry, Membrane Transport Proteins/classification, Classification, Drug Discovery, Gene Ontology, Humans, Membrane Transport Proteins/genetics
ABSTRACT
There is strong evidence that ATP-binding cassette (ABC) transporters play a critical role in the pharmacokinetic and pharmacodynamic properties of many drugs and xenobiotics. Due to their pharmacological role, several computational approaches have been developed to understand and predict the interaction between ABC transporters and their ligands. Here, we provide an overview of the current state of the art of the ligand-based models that, derived from the transport and inhibitory activities of a set of ligands, have been published for ABC transporters.
Subjects
ATP-Binding Cassette Transporters/metabolism, Computational Biology/methods, Biological Models, ATP-Binding Cassette Transporters/chemistry, Animals, Forecasting, Humans, Ligands, Pharmaceutical Preparations/chemistry, Pharmaceutical Preparations/metabolism, Quantitative Structure-Activity Relationship, Substrate Specificity, Xenobiotics/chemistry, Xenobiotics/pharmacokinetics
ABSTRACT
Self-organizing maps, which are unsupervised artificial neural networks, have become a very useful tool in a wide range of disciplines, including medicinal chemistry. Here, we focus on two applications of self-organizing maps: in silico screening, and the clustering and visualisation of large datasets. Additionally, the importance of parameter selection is discussed and some modifications to the original algorithm are summarised.
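For readers unfamiliar with the method, a minimal sketch of the basic self-organizing map training loop is given below. It is a plain textbook variant and not taken from the review: the linear decay of the learning rate and neighbourhood width, the Gaussian neighbourhood function, the grid size, and the toy data are all assumptions made for illustration.

```python
import math
import random


def best_matching_unit(weights, sample):
    """Grid node whose weight vector is closest to the sample (squared Euclidean)."""
    return min(
        weights,
        key=lambda node: sum((w - s) ** 2 for w, s in zip(weights[node], sample)),
    )


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a rectangular SOM with linearly decaying learning rate and radius."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 or max(rows, cols) / 2
    # One randomly initialised weight vector per grid node.
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for sample in rng.sample(data, len(data)):  # shuffled pass over the data
            frac = step / total_steps
            lr = lr0 * (1 - frac)
            sigma = sigma0 * (1 - frac) + 1e-3  # small offset avoids division by zero
            bmu = best_matching_unit(weights, sample)
            for node, w in weights.items():
                grid_dist2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                influence = math.exp(-grid_dist2 / (2 * sigma ** 2))
                # Move the node towards the sample, scaled by its grid distance to the BMU.
                for i in range(dim):
                    w[i] += lr * influence * (sample[i] - w[i])
            step += 1
    return weights


# Hypothetical toy data: two well-separated groups of 2D points.
toy_data = [
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [1.0, 1.0], [0.9, 1.0], [1.0, 0.9],
]
som = train_som(toy_data, rows=3, cols=3)
node_a = best_matching_unit(som, [0.0, 0.0])
node_b = best_matching_unit(som, [1.0, 1.0])
```

The parameters `epochs`, `lr0`, and `sigma0` are the kind of settings whose selection the abstract refers to; on real descriptor data they would normally be tuned rather than left at these illustrative defaults.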