Abstract
While link prediction methods in knowledge graphs are increasingly used to locate potential associations between compounds and diseases, they often lack sufficient evidence to explain why a drug may be indicated for a disease. This is especially true for knowledge graph embedding (KGE) based methods, where a drug-disease indication is linked only by information gleaned from a vector representation. Complementary path-walking algorithms can increase the confidence of drug repurposing candidates by traversing a knowledge graph. However, these methods heavily weight the relatedness of drugs through their targets, pharmacology, or shared diseases. Furthermore, these methods can rely on arbitrarily extracted paths as evidence of a compound-to-disease indication and lack the ability to make predictions on rare diseases. In this paper, we evaluate seven link prediction methods on a vast biomedical knowledge graph for drug repurposing. We follow the principle of consilience and combine the reasoning paths and predictions provided by path-based reasoning approaches with those of KGE methods to identify putative drug repurposing indications. Finally, we highlight the utility of our approach through a potential repurposing indication.
Abstract
Motivation: Data reuse is a common and vital practice in molecular biology and enables the knowledge gathered over recent decades to drive discovery and innovation in the life sciences. Much of this knowledge has been collated into molecular biology databases, such as UniProtKB, and these resources derive enormous value from sharing data among themselves. However, quantifying and documenting this kind of data reuse remains a challenge. Results: The article reports on a one-day virtual workshop hosted by the UniProt Consortium in March 2023, attended by representatives from biodata resources, experts in data management, and NIH program managers. Workshop discussions focused on strategies for tracking data reuse, best practices for reusing data, and the challenges associated with data reuse and tracking. Surveys and discussions showed that data reuse is widespread, but critical information for reproducibility is sometimes lacking. Challenges include costs of tracking data reuse, tensions between tracking data and open sharing, restrictive licenses, and difficulties in tracking commercial data use. Recommendations that emerged from the discussion include: development of standardized formats for documenting data reuse, education about the obstacles posed by restrictive licenses, and continued recognition by funding agencies that data management is a critical activity that requires dedicated resources. Availability and implementation: Summaries of survey results are available at: https://docs.google.com/forms/d/1j-VU2ifEKb9C-sW6l3ATB79dgHdRk5v_lESv2hawnso/viewanalytics (survey of data providers) and https://docs.google.com/forms/d/18WbJFutUd7qiZoEzbOytFYXSfWFT61hVce0vjvIwIjk/viewanalytics (survey of users).
Abstract
Type 1 diabetes (T1D) is a prototypic T cell-mediated autoimmune disease. Because the islets of Langerhans are insulated from blood vessels by a double basement membrane and lack detectable lymphatic drainage, interactions between endocrine cells and circulating T cells are not permitted. Thus, we hypothesized that initiation and progression of anti-islet immunity required islet neolymphangiogenesis to allow T cell access to the islet. Combining microscopy and single cell approaches, the timing of this phenomenon in mice was situated between 5 and 8 wk of age when activated anti-insulin CD4 T cells became detectable in peripheral blood while peri-islet pathology developed. This "peri-insulitis," dominated by CD4 T cells, respected the islet basement membrane and was limited on the outside by lymphatic endothelial cells that gave it the attributes of a tertiary lymphoid structure. As in most tissues, lymphangiogenesis seemed to be secondary to local segmental endothelial inflammation at the collecting postcapillary venule. In addition to classic markers of inflammation such as CD29, V-CAM, and NOS, MHC class II molecules were expressed by nonhematopoietic cells in the same location both in mouse and human islets. This CD45- MHC class II+ cell population was capable of spontaneously presenting islet Ags to CD4 T cells. Altogether, these observations favor an alternative model for the initiation of T1D, outside of the islet, in which a vascular-associated cell appears to be an important MHC class II-expressing and -presenting cell.
Subjects
Diabetes Mellitus, Type 1 , Islets of Langerhans , Humans , Mice , Animals , Endothelial Cells , Histocompatibility Antigens Class II , Inflammation/pathology , Mice, Inbred NOD
Abstract
Retinal image slip during head rotation drives motor learning in the rotational vestibulo-ocular reflex (VOR) and forms the basis of gaze-stability exercises that treat vestibular dysfunction. Clinical exercises, however, are unengaging, cannot easily be titrated to the level of impairment, and provide neither direct feedback nor tracking of the patient's adherence, performance, and progress. To address this, we have developed a custom application for VOR training based on an interactive computer game. In this study, we tested the ability of this game to induce VOR learning in individuals with normal vestibular function, and we compared the efficacy of single-step and incremental learning protocols. Eighteen participants played the game twice on different days. All participants tolerated the game and were able to complete both sessions. The game scenario incorporated a series of brief head rotations, similar to active head impulses, that were paired with a dynamic acuity task and with a visual-vestibular mismatch (VVM) intended to increase VOR gain (single-step: 300 successful trials at ×1.5 viewing; incremental: 100 trials each of ×1.13, ×1.33, and ×1.5 viewing). Overall, VOR gain increased by 15 ± 4.7% (mean ± 95% CI, P < 0.001). Gains increased similarly for active and passive head rotations, and, contrary to our hypothesis, there was little effect of the learning strategy. This study shows that an interactive computer game provides robust VOR training and has the potential to deliver effective, engaging, and trackable gaze-stability exercises to patients with a range of vestibular dysfunctions.
NEW & NOTEWORTHY: This study demonstrates the feasibility and efficacy of a customized computer game to induce motor learning in the high-frequency rotational vestibulo-ocular reflex. It provides a physiological basis for the deployment of this technology to clinical vestibular rehabilitation.
Subjects
Reflex, Vestibulo-Ocular , Vestibule, Labyrinth , Humans , Reflex, Vestibulo-Ocular/physiology , Adaptation, Physiological/physiology , Exercise Therapy , Head Movements/physiology
Abstract
Objective: Gastric intestinal metaplasia (GIM) is a precancerous lesion that increases gastric cancer (GC) risk. The Operative Link on GIM (OLGIM) is a combined clinical-histopathologic system to risk-stratify patients with GIM. The identification of molecular biomarkers that are indicators for advanced OLGIM lesions may improve cancer prevention efforts. Methods: This study was based on clinical and genomic data from four cohorts: 1) GAPS, a GIM cohort with detailed OLGIM severity scoring (N=303 samples); 2) the Cancer Genome Atlas (N=198); 3) a collation of in-house and publicly available scRNA-seq data (N=40); and 4) a spatial validation cohort (N=5) consisting of annotated histology slides of patients with either GC or advanced GIM. We used a multi-omics pipeline to identify, validate and sequentially parse a highly refined signature of 26 genes that characterize high-risk GIM. Results: Using standard RNA-seq, we analyzed two separate, non-overlapping discovery (N=88) and validation (N=215) sets of GIM. In the discovery phase, we identified 105 upregulated genes specific for high-risk GIM (defined as OLGIM III-IV), of which 100 genes were independently confirmed in the validation set. Spatial transcriptomic profiling revealed 36 of these 100 genes to be expressed in metaplastic foci in GIM. Comparison with bulk GC sequencing data revealed 26 of these genes to be expressed in intestinal-type GC. Single-cell profiling resolved the 26-gene signature to both mature intestinal lineages (goblet cells, enterocytes) and immature intestinal lineages (stem-like cells). A subset of these genes was further validated using single-molecule multiplex fluorescence in situ hybridization. We found certain genes (TFF3 and ANPEP) to mark differentiated intestinal lineages, whereas others (OLFM4 and CPS1) localized to immature cells in the isthmic/crypt region of metaplastic glands, consistent with the findings from scRNA-seq analysis.
Conclusions: Using an integrated multi-omics approach, we identified a novel 26-gene expression signature for high-OLGIM precursors at increased risk for GC. We found this signature localizes to aberrant intestinal stem-like cells within the metaplastic microenvironment. These findings hold important translational significance for future prevention and early detection efforts.
Abstract
Knowledge graphs have become a common approach for knowledge representation. Yet, the application of graph methodology is elusive due to the sheer number and complexity of knowledge sources. In addition, semantic incompatibilities hinder efforts to harmonize and integrate across these diverse sources. As part of The Biomedical Translator Consortium, we have developed a knowledge graph-based question-answering system designed to augment human reasoning and accelerate translational scientific discovery: the Translator system. We have applied the Translator system to answer biomedical questions in the context of a broad array of diseases and syndromes, including Fanconi anemia, primary ciliary dyskinesia, multiple sclerosis, and others. A variety of collaborative approaches have been used to research and develop the Translator system. One recent approach involved the establishment of a monthly "Question-of-the-Month (QotM) Challenge" series. Herein, we describe the structure of the QotM Challenge; the six challenges that have been conducted to date on drug-induced liver injury, cannabidiol toxicity, coronavirus infection, diabetes, psoriatic arthritis, and ATP1A3-related phenotypes; the scientific insights that have been gleaned during the challenges; and the technical issues that were identified over the course of the challenges and that can now be addressed to foster further development of the prototype Translator system. We close with a discussion on Large Language Models such as ChatGPT and highlight differences between those models and the Translator system.
Abstract
SUMMARY: Knowledge graphs are an increasingly common data structure for representing biomedical information. These knowledge graphs can easily represent heterogeneous types of information, and many algorithms and tools exist for querying and analyzing graphs. Biomedical knowledge graphs have been used in a variety of applications, including drug repurposing, identification of drug targets, prediction of drug side effects, and clinical decision support. Typically, knowledge graphs are constructed by centralization and integration of data from multiple disparate sources. Here, we describe BioThings Explorer, an application that can query a virtual, federated knowledge graph derived from the aggregated information in a network of biomedical web services. BioThings Explorer leverages semantically precise annotations of the inputs and outputs for each resource, and automates the chaining of web service calls to execute multi-step graph queries. Because there is no large, centralized knowledge graph to maintain, BioThings Explorer is distributed as a lightweight application that dynamically retrieves information at query time. AVAILABILITY AND IMPLEMENTATION: More information can be found at https://explorer.biothings.io and code is available at https://github.com/biothings/biothings_explorer.
Subjects
Algorithms , Pattern Recognition, Automated
Abstract
Computational drug repositioning methods have emerged as an attractive and effective solution to find new candidates for existing therapies, reducing the time and cost of drug development. Repositioning methods based on biomedical knowledge graphs typically offer useful supporting biological evidence. This evidence is based on reasoning chains or subgraphs that connect a drug to a disease prediction. However, there are no databases of drug mechanisms that can be used to train and evaluate such methods. Here, we introduce the Drug Mechanism Database (DrugMechDB), a manually curated database that describes drug mechanisms as paths through a knowledge graph. DrugMechDB integrates a diverse range of authoritative free-text resources to describe 4,583 drug indications with 32,249 relationships, representing 14 major biological scales. DrugMechDB can be employed as a benchmark dataset for assessing computational drug repositioning models or as a valuable resource for training such models.
Subjects
Benchmarking , Drug Development , Databases, Factual , Drug Repositioning , Knowledge
Abstract
The endocrine pancreas is one of the most inaccessible organs of the human body. Its autoimmune attack leads to type 1 diabetes (T1D) in a genetically susceptible population and a lifelong need for exogenous insulin replacement. Monitoring disease progression by sampling peripheral blood would provide key insights into T1D immune-mediated mechanisms and potentially change preclinical diagnosis and the evaluation of therapeutic interventions. This effort has been limited to the measurement of circulating anti-islet antibodies, which, despite a recognized diagnostic value, remain poorly predictive at the individual level for a fundamentally CD4 T cell-dependent disease. Here, peptide-major histocompatibility complex tetramers were used to profile blood anti-insulin CD4 T cells in mice and humans. While percentages of these were not directly informative, the state of activation of anti-insulin T cells measured by RNA and protein profiling was able to distinguish the absence of autoimmunity versus disease progression. Activated anti-insulin CD4 T cells were detected not only at time of diagnosis but also in patients with established disease and in some at-risk individuals. These results support the concept that antigen-specific CD4 T cells might be used to monitor autoimmunity in real time. This advance can inform our approach to T1D diagnosis and therapeutic interventions in the preclinical phase of anti-islet autoimmunity.
Subjects
Diabetes Mellitus, Type 1 , Islets of Langerhans , Humans , Mice , Animals , CD4-Positive T-Lymphocytes , Diabetes Mellitus, Type 1/metabolism , Autoimmunity , Islets of Langerhans/metabolism , Antigens/metabolism , Insulin/metabolism , Mice, Inbred NOD
Abstract
In the ongoing effort to discover treatments for Alzheimer's disease (AD), there has been considerable focus on investigating the use of repurposed drug candidates. Mining of electronic health record data has the potential to identify novel correlated effects between commonly used drugs and AD. In this study, claims from members with commercial health insurance coverage were analyzed to determine the correlation between the use of various drugs and AD incidence and claim frequency. We found that, within the insured population, several medications for psychotic and mental illnesses were associated with higher disease incidence and frequency, while, to a lesser extent, antibiotics and anti-inflammatory drugs were associated with lower AD incidence rates. The observations thus provide a general overview of the prescription and claim relationships between various drug types and Alzheimer's disease, with insights into which drugs have possible implications for a resulting AD diagnosis.
Subjects
Alzheimer Disease , Insurance , Humans , Alzheimer Disease/drug therapy , Alzheimer Disease/epidemiology , Drug Prescriptions
Abstract
Knowledge graphs are an increasingly common data structure for representing biomedical information. These knowledge graphs can easily represent heterogeneous types of information, and many algorithms and tools exist for querying and analyzing graphs. Biomedical knowledge graphs have been used in a variety of applications, including drug repurposing, identification of drug targets, prediction of drug side effects, and clinical decision support. Typically, knowledge graphs are constructed by centralization and integration of data from multiple disparate sources. Here, we describe BioThings Explorer, an application that can query a virtual, federated knowledge graph derived from the aggregated information in a network of biomedical web services. BioThings Explorer leverages semantically precise annotations of the inputs and outputs for each resource, and automates the chaining of web service calls to execute multi-step graph queries. Because there is no large, centralized knowledge graph to maintain, BioThings Explorer is distributed as a lightweight application that dynamically retrieves information at query time. More information can be found at https://explorer.biothings.io, and code is available at https://github.com/biothings/biothings_explorer.
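The idea of chaining annotated web services into a multi-step query can be illustrated with a minimal sketch. It does not call BioThings Explorer or any real API: the `Service` class, the two toy services, their lookup tables, and the identifiers are all hypothetical stand-ins for remote resources annotated with input and output types.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass(frozen=True)
class Service:
    """A hypothetical web service annotated with semantic input/output types."""
    name: str
    input_type: str
    output_type: str
    lookup: Callable[[str], List[str]]


# Toy lookup tables standing in for remote API responses (illustrative only).
_DISEASE_TO_GENE = {"disease:D1": ["gene:G1", "gene:G2"]}
_GENE_TO_DRUG = {"gene:G1": ["drug:X"], "gene:G2": ["drug:X", "drug:Y"]}

SERVICES = [
    Service("disease2gene", "Disease", "Gene", lambda i: _DISEASE_TO_GENE.get(i, [])),
    Service("gene2drug", "Gene", "Drug", lambda i: _GENE_TO_DRUG.get(i, [])),
]


def plan(type_path: List[str]) -> List[List[Service]]:
    """For each hop in the type path, pick the services whose annotations match."""
    hops = []
    for src, dst in zip(type_path, type_path[1:]):
        matching = [s for s in SERVICES if s.input_type == src and s.output_type == dst]
        if not matching:
            raise ValueError(f"no service maps {src} -> {dst}")
        hops.append(matching)
    return hops


def execute(start_id: str, type_path: List[str]) -> Set[str]:
    """Run the planned hops in order, feeding each hop's outputs into the next."""
    frontier = {start_id}
    for services in plan(type_path):
        next_frontier: Set[str] = set()
        for node in frontier:
            for svc in services:
                next_frontier.update(svc.lookup(node))
        frontier = next_frontier
    return frontier


drugs = execute("disease:D1", ["Disease", "Gene", "Drug"])
```

In this sketch nothing is stored centrally: each hop is resolved at query time by whichever services declare the required input and output types, which is the property the abstract attributes to a federated design.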
Abstract
Computational drug repositioning methods have emerged as an attractive and effective solution to find new candidates for existing therapies, reducing the time and cost of drug development. Repositioning methods based on biomedical knowledge graphs typically offer useful supporting biological evidence. This evidence is based on reasoning chains or subgraphs that connect a drug to disease predictions. However, there are no databases of drug mechanisms that can be used to train and evaluate such methods. Here, we introduce the Drug Mechanism Database (DrugMechDB), a manually curated database that describes drug mechanisms as paths through a knowledge graph. DrugMechDB integrates a diverse range of authoritative free-text resources to describe 4,583 drug indications with 32,249 relationships, representing 14 major biological scales. DrugMechDB can be employed as a benchmark dataset for assessing computational drug repurposing models or as a valuable resource for training such models.
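To make "a drug mechanism as a path through a knowledge graph" concrete, here is a small sketch. It is not the DrugMechDB file format or API: the triple layout, the placeholder identifiers, the predicates, and the `path_nodes` helper are assumptions for illustration, and the sketch simplifies a mechanism to a single linear chain.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (source node, predicate, target node)

# Placeholder mechanism; node names and predicates are illustrative, not curated data.
MECHANISM: List[Triple] = [
    ("drug:A", "decreases activity of", "protein:P"),
    ("protein:P", "positively regulates", "process:B"),
    ("process:B", "causes", "disease:D"),
]


def path_nodes(links: List[Triple], drug: str, disease: str) -> List[str]:
    """Follow links from drug to disease; raise if they do not form one chain."""
    by_source: Dict[str, str] = {}
    for src, _pred, dst in links:
        if src in by_source:
            raise ValueError(f"branching at {src}: not a simple path")
        by_source[src] = dst

    nodes = [drug]
    while nodes[-1] != disease:
        nxt = by_source.get(nodes[-1])
        # Stop on a dead end or a cycle instead of looping forever.
        if nxt is None or nxt in nodes:
            raise ValueError("links do not connect drug to disease")
        nodes.append(nxt)
    return nodes


chain = path_nodes(MECHANISM, "drug:A", "disease:D")
```

A repositioning model that outputs its own reasoning chain for a drug-disease pair could be scored against such a curated path, for example by the overlap of intermediate nodes, which is one way the benchmark use described in the abstract might be realized.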
Abstract
BACKGROUND: Biomedical researchers are strongly encouraged to make their research outputs more Findable, Accessible, Interoperable, and Reusable (FAIR). While many biomedical research outputs are more readily accessible through open data efforts, finding relevant outputs remains a significant challenge. Schema.org is a metadata vocabulary standardization project that enables web content creators to make their content more FAIR. Leveraging Schema.org could benefit biomedical research resource providers, but it can be challenging to apply Schema.org standards to biomedical research outputs. We created an online browser-based tool that empowers researchers and repository developers to utilize Schema.org or other biomedical schema projects. RESULTS: Our browser-based tool includes features that can help address many of the barriers to Schema.org compliance, such as the ability to easily browse for relevant Schema.org classes, the ability to extend and customize a class to be more suitable for biomedical research outputs, the ability to create data validation to ensure adherence of a research output to a customized class, and the ability to register a custom class to our schema registry, enabling others to search and re-use it. We demonstrate the use of our tool with the creation of the Outbreak.info schema, a large multi-class schema for harmonizing various COVID-19 related resources. CONCLUSIONS: We have created a browser-based tool to empower biomedical research resource providers to leverage Schema.org classes to make their research outputs more FAIR.
Subjects
Biomedical Research , COVID-19 , Humans , Metadata
Abstract
Biomedical datasets are increasing in size, stored in many repositories, and face challenges in FAIRness (findability, accessibility, interoperability, reusability). As a Consortium of infectious disease researchers from 15 Centers, we aim to adopt open science practices to promote transparency, encourage reproducibility, and accelerate research advances through data reuse. To improve FAIRness of our datasets and computational tools, we evaluated metadata standards across established biomedical data repositories. The vast majority do not adhere to a single standard, such as Schema.org, which is widely adopted by generalist repositories. Consequently, datasets in these repositories are not findable in aggregation projects like Google Dataset Search. We addressed this gap by creating a reusable metadata schema based on Schema.org and catalogued nearly 400 datasets and computational tools we collected. The approach is easily reusable to create schemas interoperable with community standards, but customized to a particular context. Our approach enabled data discovery, increased the reusability of datasets from a large research consortium, and accelerated research. Lastly, we discuss ongoing challenges with FAIRness beyond discoverability.
Subjects
Communicable Diseases , Datasets as Topic , Metadata , Reproducibility of Results , Datasets as Topic/standards , Humans
Abstract
Outbreak.info Research Library is a standardized, searchable interface of coronavirus disease 2019 (COVID-19) and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) publications, clinical trials, datasets, protocols and other resources, built with a reusable framework. We developed a rigorous schema to enforce consistency across different sources and resource types and linked related resources. Researchers can quickly search the latest research across data repositories, regardless of resource type or repository location, via a search interface, public application programming interface (API) and R package.
Subjects
COVID-19 , Humans , SARS-CoV-2 , Disease Outbreaks
Abstract
In response to the emergence of SARS-CoV-2 variants of concern, the global scientific community, through unprecedented effort, has sequenced and shared over 11 million genomes through GISAID, as of May 2022. This extraordinarily high sampling rate provides a unique opportunity to track the evolution of the virus in near real-time. Here, we present outbreak.info, a platform that currently tracks over 40 million combinations of Pango lineages and individual mutations, across over 7,000 locations, to provide insights for researchers, public health officials and the general public. We describe the interpretable visualizations available in our web application, the pipelines that enable the scalable ingestion of heterogeneous sources of SARS-CoV-2 variant data and the server infrastructure that enables widespread data dissemination via a high-performance API that can be accessed using an R package. We show how outbreak.info can be used for genomic surveillance and as a hypothesis-generation tool to understand the ongoing pandemic at varying geographic and temporal scales.
Subjects
COVID-19 , SARS-CoV-2 , Humans , Genomics , Disease Outbreaks , Mutation
Abstract
PURPOSE: The liver is the most frequent metastatic site for colorectal cancer. Its microenvironment is modified to provide a niche that is conducive for colorectal cancer cell growth. This study focused on characterizing the cellular changes in the metastatic colorectal cancer (mCRC) liver tumor microenvironment (TME). EXPERIMENTAL DESIGN: We analyzed a series of microsatellite stable (MSS) mCRCs to the liver, paired normal liver tissue, and peripheral blood mononuclear cells using single-cell RNA sequencing (scRNA-seq). We validated our findings using multiplexed spatial imaging and bulk gene expression with cell deconvolution. RESULTS: We identified TME-specific SPP1-expressing macrophages with altered metabolism features, foam cell characteristics, and increased activity in extracellular matrix (ECM) organization. SPP1+ macrophages and fibroblasts expressed complementary ligand-receptor pairs with the potential to mutually influence their gene-expression programs. TME lacked dysfunctional CD8 T cells and contained regulatory T cells, indicative of immunosuppression. Spatial imaging validated these cell states in the TME. Moreover, TME macrophages and fibroblasts had close spatial proximity, which is a requirement for intercellular communication and networking. In an independent cohort of mCRCs in the liver, we confirmed the presence of SPP1+ macrophages and fibroblasts using gene-expression data. An increased proportion of TME fibroblasts was associated with the worst prognosis in these patients. CONCLUSIONS: We demonstrated that mCRC in the liver is characterized by transcriptional alterations of macrophages in the TME. Intercellular networking between macrophages and fibroblasts supports colorectal cancer growth in the immunosuppressed metastatic niche in the liver. These features can be used to target immune-checkpoint-resistant MSS tumors.