Results 1 - 20 of 30
1.
Bioinformatics ; 38(6): 1624-1630, 2022 03 04.
Article in English | MEDLINE | ID: mdl-34935870

ABSTRACT

MOTIVATION: Table recognition systems are widely used to extract and structure quantitative information from the vast amount of documents that are increasingly available from different open sources. While many systems already perform well on tables with a simple layout, tables in the biomedical domain are often much more complex. Benchmark and training data for such tables are however very limited. RESULTS: To address this issue, we present a novel, highly curated benchmark dataset based on a hand-curated literature corpus on neurological disorders, which can be used to tune and evaluate table extraction applications for this challenging domain. We evaluate several state-of-the-art table extraction systems based on our proposed benchmark and discuss challenges that emerged during the benchmark creation as well as factors that can impact the performance of recognition methods. For the evaluation procedure, we propose a new metric as well as several improvements that result in a better performance evaluation. AVAILABILITY AND IMPLEMENTATION: The resulting benchmark dataset (https://zenodo.org/record/5549977) as well as the source code to our novel evaluation approach can be openly accessed. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Benchmarking , Nervous System Diseases , Humans , Software
2.
Bioinformatics ; 38(6): 1648-1656, 2022 03 04.
Article in English | MEDLINE | ID: mdl-34986221

ABSTRACT

MOTIVATION: The majority of biomedical knowledge is stored in structured databases or as unstructured text in scientific publications. This vast amount of information has led to numerous machine learning-based biological applications using either text through natural language processing (NLP) or structured data through knowledge graph embedding models. However, representations based on a single modality are inherently limited. RESULTS: To generate better representations of biological knowledge, we propose STonKGs, a Sophisticated Transformer trained on biomedical text and Knowledge Graphs (KGs). This multimodal Transformer uses combined input sequences of structured information from KGs and unstructured text data from biomedical literature to learn joint representations in a shared embedding space. First, we pre-trained STonKGs on a knowledge base assembled by the Integrated Network and Dynamical Reasoning Assembler consisting of millions of text-triple pairs extracted from biomedical literature by multiple NLP systems. Then, we benchmarked STonKGs against three baseline models trained on either one of the modalities (i.e. text or KG) across eight different classification tasks, each corresponding to a different biological application. Our results demonstrate that STonKGs outperforms both baselines, especially on the more challenging tasks with respect to the number of classes, improving upon the F1-score of the best baseline by up to 0.084 (i.e. from 0.881 to 0.965). Finally, our pre-trained model as well as the model architecture can be adapted to various other transfer learning applications. AVAILABILITY AND IMPLEMENTATION: We make the source code and the Python package of STonKGs available at GitHub (https://github.com/stonkgs/stonkgs) and PyPI (https://pypi.org/project/stonkgs/). 
The pre-trained STonKGs models and the task-specific classification models are respectively available at https://huggingface.co/stonkgs/stonkgs-150k and https://zenodo.org/communities/stonkgs. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Pattern Recognition, Automated , Software , Machine Learning , Natural Language Processing , Publications
3.
Bioinformatics ; 38(24): 5466-5468, 2022 12 13.
Article in English | MEDLINE | ID: mdl-36303318

ABSTRACT

MOTIVATION: A global medical crisis like the coronavirus disease 2019 (COVID-19) pandemic requires interdisciplinary and highly collaborative research from all over the world. One of the key challenges for collaborative research is a lack of interoperability among various heterogeneous data sources. Interoperability, standardization and mapping of datasets are necessary for data analysis and applications in advanced algorithms such as developing personalized risk prediction modeling. RESULTS: To ensure the interoperability and compatibility among COVID-19 datasets, we present here a common data model (CDM) which has been built from 11 different COVID-19 datasets from various geographical locations. The current version of the CDM holds 4639 data variables related to COVID-19 such as basic patient information (age, biological sex and diagnosis) as well as disease-specific data variables, for example, Anosmia and Dyspnea. Each of the data variables in the data model is associated with specific data types, variable mappings, value ranges, data units and data encodings that could be used for standardizing any dataset. Moreover, the compatibility with established data standards like OMOP and FHIR makes the CDM a well-designed CDM for COVID-19 data interoperability. AVAILABILITY AND IMPLEMENTATION: The CDM is available in a public repo here: https://github.com/Fraunhofer-SCAI-Applied-Semantics/COVID-19-Global-Model. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
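
The idea that each variable carries a data type, variable mappings, a value range, a unit and an encoding can be illustrated with a minimal Python sketch. It is not taken from the published CDM: the class `VariableSpec`, the function `harmonize`, and all variable names, ranges and encodings below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class VariableSpec:
    """One hypothetical data-model entry: type, source-name mappings, range, unit, encoding."""
    name: str
    dtype: type
    source_names: tuple
    value_range: tuple = None
    unit: str = ""
    encoding: dict = field(default_factory=dict)

    def standardize(self, raw):
        # Translate dataset-specific codes first, then cast to the target type.
        value = self.dtype(self.encoding.get(raw, raw))
        if self.value_range is not None:
            low, high = self.value_range
            if not low <= value <= high:
                raise ValueError(f"{self.name}={value} outside [{low}, {high}]")
        return value


def harmonize(record, specs):
    """Map one source record onto the common variable names."""
    out = {}
    for spec in specs:
        for source_name in spec.source_names:
            if source_name in record:
                out[spec.name] = spec.standardize(record[source_name])
                break
    return out


# Hypothetical specs and a hypothetical source record.
cdm_specs = [
    VariableSpec("age", int, ("age", "Age_years"), value_range=(0, 120), unit="years"),
    VariableSpec("biological_sex", str, ("sex", "gender"),
                 encoding={"m": "male", "f": "female"}),
]
harmonized = harmonize({"Age_years": "63", "gender": "f"}, cdm_specs)
```

Keeping the mapping, encoding and range check on the variable itself means a new source dataset only needs additional entries in `source_names` and `encoding`, not new code.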


Subjects
COVID-19 , Humans , Algorithms , Pandemics
4.
Bioinformatics ; 38(15): 3850-3852, 2022 08 02.
Article in English | MEDLINE | ID: mdl-35652780

ABSTRACT

MOTIVATION: The importance of clinical data in understanding the pathophysiology of complex disorders has prompted the launch of multiple initiatives designed to generate patient-level data from various modalities. While these studies can reveal important findings relevant to the disease, each study captures different yet complementary aspects and modalities which, when combined, generate a more comprehensive picture of disease etiology. However, achieving this requires a global integration of data across studies, which proves to be challenging given the lack of interoperability of cohort datasets. RESULTS: Here, we present the Data Steward Tool (DST), an application that allows for semi-automatic semantic integration of clinical data into ontologies and global data models and data standards. We demonstrate the applicability of the tool in the field of dementia research by establishing a Clinical Data Model (CDM) in this domain. The CDM currently consists of 277 common variables covering demographics (e.g. age and gender), diagnostics, neuropsychological tests and biomarker measurements. The DST combined with this disease-specific data model shows how interoperability between multiple, heterogeneous dementia datasets can be achieved. AVAILABILITY AND IMPLEMENTATION: The DST source code and Docker images are respectively available at https://github.com/SCAI-BIO/data-steward and https://hub.docker.com/r/phwegner/data-steward. Furthermore, the DST is hosted at https://data-steward.bio.scai.fraunhofer.de/data-steward. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Dementia , Semantics , Humans , Software , Dementia/diagnosis
5.
Nucleic Acids Res ; 49(14): 7939-7953, 2021 08 20.
Article in English | MEDLINE | ID: mdl-34197603

ABSTRACT

We attempt to address a key question in the joint analysis of transcriptomic data: can we correlate the patterns we observe in transcriptomic datasets to known interactions and pathway knowledge to broaden our understanding of disease pathophysiology? We present a systematic approach that sheds light on the patterns observed in hundreds of transcriptomic datasets from over sixty indications by using pathways and molecular interactions as a template. Our analysis employs transcriptomic datasets to construct dozens of disease specific co-expression networks, alongside a human protein-protein interactome network. Leveraging the interoperability between these two network templates, we explore patterns both common and particular to these diseases on three different levels. Firstly, at the node-level, we identify most and least common proteins across diseases and evaluate their consistency against the interactome as a proxy for their prevalence in the scientific literature. Secondly, we overlay both network templates to analyze common correlations and interactions across diseases at the edge-level. Thirdly, we explore the similarity between patterns observed at the disease-level and pathway knowledge to identify signatures associated with specific diseases and indication areas. Finally, we present a case scenario in schizophrenia, where we show how our approach can be used to investigate disease pathophysiology.


Subjects
Disease/genetics , Gene Expression Profiling/methods , Gene Regulatory Networks , Genetic Predisposition to Disease/genetics , Signal Transduction/genetics , Transcriptome/genetics , Algorithms , Cluster Analysis , Humans , Schizophrenia/genetics
6.
BMC Bioinformatics ; 23(1): 231, 2022 Jun 15.
Article in English | MEDLINE | ID: mdl-35705903

ABSTRACT

Distinct gene expression patterns within cells are foundational for the diversity of functions and unique characteristics observed in specific contexts, such as human tissues and cell types. Though some biological processes commonly occur across contexts, by harnessing the vast amounts of available gene expression data, we can decipher the processes that are unique to a specific context. Therefore, with the goal of developing a portrait of context-specific patterns to better elucidate how they govern distinct biological processes, this work presents a large-scale exploration of transcriptomic signatures across three different contexts (i.e., tissues, cell types, and cell lines) by leveraging over 600 gene expression datasets categorized into 98 subcontexts. The strongest pairwise correlations between genes from these subcontexts are used for the construction of co-expression networks. Using a network-based approach, we then pinpoint patterns that are unique and common across these subcontexts. First, we focused on patterns at the level of individual nodes and evaluated their functional roles using a human protein-protein interactome as a referential network. Next, within each context, we systematically overlaid the co-expression networks to identify specific and shared correlations as well as relations already described in scientific literature. Additionally, in a pathway-level analysis, we overlaid node and edge sets from co-expression networks against pathway knowledge to identify biological processes that are related to specific subcontexts or groups of them. Finally, we have released our data and scripts at https://zenodo.org/record/5831786 and https://github.com/ContNeXt/ , respectively and developed ContNeXt ( https://contnext.scai.fraunhofer.de/ ), a web application to explore the networks generated in this work.
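
Building a co-expression network from the strongest pairwise correlations can be sketched in a few lines of Python. This is an illustrative sketch only, not the ContNeXt pipeline: the function names, the toy gene names and values, and the choice of Pearson correlation with a fixed `top_k` cut-off are all assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def coexpression_edges(expression, top_k):
    """Return the top_k gene pairs ranked by absolute correlation."""
    scored = [
        (g1, g2, pearson(expression[g1], expression[g2]))
        for g1, g2 in combinations(sorted(expression), 2)
    ]
    scored.sort(key=lambda edge: abs(edge[2]), reverse=True)
    return scored[:top_k]


# Toy expression profiles (hypothetical gene names and values).
toy_expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.1, 6.0, 8.2],
    "GENE_C": [4.0, 3.0, 2.0, 1.0],
    "GENE_D": [1.0, 3.0, 2.0, 2.5],
}
edges = coexpression_edges(toy_expression, top_k=2)
```

Ranking by absolute value keeps strong negative correlations as edges too; in this toy data the strongest edge links `GENE_A` and `GENE_C`, which are perfectly anti-correlated.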


Subjects
Gene Regulatory Networks , Transcriptome , Gene Expression Profiling , Humans , Software
7.
Bioinformatics ; 37(9): 1332-1334, 2021 06 09.
Article in English | MEDLINE | ID: mdl-32976572

ABSTRACT

SUMMARY: The COVID-19 crisis has elicited a global response by the scientific community that has led to a burst of publications on the pathophysiology of the virus. However, without coordinated efforts to organize this knowledge, it can remain hidden away from individual research groups. By extracting and formalizing this knowledge in a structured and computable form, as in the form of a knowledge graph, researchers can readily reason and analyze this information on a much larger scale. Here, we present the COVID-19 Knowledge Graph, an expansive cause-and-effect network constructed from scientific literature on the new coronavirus that aims to provide a comprehensive view of its pathophysiology. To make this resource available to the research community and facilitate its exploration and analysis, we also implemented a web application and released the KG in multiple standard formats. AVAILABILITY AND IMPLEMENTATION: The COVID-19 Knowledge Graph is publicly available under CC-0 license at https://github.com/covid19kg and https://bikmi.covid19-knowledgespace.de. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
COVID-19 , Software , Humans , Pattern Recognition, Automated , Publications , SARS-CoV-2
8.
Bioinformatics ; 36(24): 5703-5705, 2021 Apr 05.
Article in English | MEDLINE | ID: mdl-33346828

ABSTRACT

MOTIVATION: The COVID-19 pandemic has prompted an impressive, worldwide response by the academic community. In order to support text mining approaches as well as data description, linking and harmonization in the context of COVID-19, we have developed an ontology representing major novel coronavirus (SARS-CoV-2) entities. The ontology has a strong scope on chemical entities suited for drug repurposing, as this is a major target of ongoing COVID-19 therapeutic development. RESULTS: The ontology comprises 2270 classes of concepts and 38 987 axioms (2622 logical axioms and 2434 declaration axioms). It depicts the roles of molecular and cellular entities in virus-host interactions and in the virus life cycle, as well as a wide spectrum of medical and epidemiological concepts linked to COVID-19. The performance of the ontology has been tested on Medline and the COVID-19 corpus provided by the Allen Institute. AVAILABILITY AND IMPLEMENTATION: COVID-19 Ontology is released under a Creative Commons 4.0 License and shared via https://github.com/covid-19-ontology/covid-19. The ontology is also deposited in BioPortal at https://bioportal.bioontology.org/ontologies/COVID-19. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

9.
BMC Bioinformatics ; 20(1): 494, 2019 Oct 11.
Article in English | MEDLINE | ID: mdl-31604427

ABSTRACT

BACKGROUND: Literature-derived knowledge assemblies have been used as an effective way of representing biological phenomena and understanding disease etiology in systems biology. These include canonical pathway databases such as KEGG, Reactome and WikiPathways and disease-specific network inventories such as causal biological networks database, PD map and NeuroMMSig. The represented knowledge in these resources delineates qualitative information focusing mainly on the causal relationships between biological entities. Genes, the major constituents of knowledge representations, tend to express differentially in different conditions such as cell types, brain regions and disease stages. A classical approach of interpreting a knowledge assembly is to explore gene expression patterns of the individual genes. However, an approach that enables quantification of the overall impact of differentially expressed genes in the corresponding network is still lacking. RESULTS: Using the concept of heat diffusion, we have devised an algorithm that is able to calculate the magnitude of regulation of a biological network using expression datasets. We have demonstrated that molecular mechanisms specific to Alzheimer's disease (AD) and Parkinson's disease (PD) regulate with different intensities across spatial and temporal resolutions. Our approach shows that the mitochondrial dysfunction in PD is severe in cortex and advanced stages of PD patients. Similarly, we have shown that the intensity of aggregation of neurofibrillary tangles (NFTs) in AD increases as the disease progresses. This finding is in concordance with previous studies that explain the burden of NFTs in stages of AD. CONCLUSIONS: This study is one of the first attempts that enable quantification of mechanisms represented as biological networks. We have been able to quantify the magnitude of regulation of a biological network and illustrate that the magnitudes are different across spatial and temporal resolution.
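
The general concept of heat diffusion on a network can be illustrated with a short Python sketch. This is a generic discrete diffusion scheme, not the algorithm devised in the paper: the function `diffuse_heat`, the `rate` and `steps` parameters, the toy network and the use of initial heat as a stand-in for expression changes are all assumptions.

```python
def diffuse_heat(adjacency, initial_heat, rate=0.1, steps=50):
    """Discrete heat diffusion on an undirected graph.

    At every step each node exchanges heat with its neighbours in
    proportion to the difference between their current values.
    """
    heat = dict(initial_heat)
    for _ in range(steps):
        updated = {}
        for node, neighbours in adjacency.items():
            flow = sum(heat[nb] - heat[node] for nb in neighbours)
            updated[node] = heat[node] + rate * flow
        heat = updated
    return heat


# Toy undirected network (hypothetical gene names). The initial heat stands in
# for, e.g., absolute log fold changes taken from an expression dataset.
toy_network = {
    "GENE_A": ["GENE_B"],
    "GENE_B": ["GENE_A", "GENE_C"],
    "GENE_C": ["GENE_B"],
}
toy_heat = {"GENE_A": 2.0, "GENE_B": 0.0, "GENE_C": 0.0}
final_heat = diffuse_heat(toy_network, toy_heat)
```

On a symmetric adjacency list the total heat is conserved, so the values after diffusion show how an initial expression signal spreads through the network rather than how much signal there is overall. The `rate` should stay small relative to the highest node degree, otherwise the update can oscillate.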


Subjects
Algorithms , Brain/metabolism , Neurodegenerative Diseases/metabolism , Systems Biology/methods , Alzheimer Disease/genetics , Alzheimer Disease/metabolism , Gene Expression Regulation , Humans , Metabolic Networks and Pathways , Mitochondria/metabolism , Mitochondria/physiology , Neurodegenerative Diseases/genetics , Neurodegenerative Diseases/physiopathology , Parkinson Disease/genetics , Parkinson Disease/metabolism , Parkinson Disease/physiopathology , Protein Interaction Maps , Signal Transduction
10.
Brief Bioinform ; 17(3): 505-16, 2016 05.
Article in English | MEDLINE | ID: mdl-26249223

ABSTRACT

The work we present here is based on the recent extension of the syntax of the Biological Expression Language (BEL), which now allows for the representation of genetic variation information in cause-and-effect models. In our article, we describe how genetic variation information can be used to identify candidate disease mechanisms in diseases with complex aetiology such as Alzheimer's disease and Parkinson's disease. In those diseases, we have to assume that many genetic variants contribute moderately to the overall dysregulation, which in the case of neurodegenerative diseases has a long incubation time until the first clinical symptoms are detectable. Owing to the multilevel nature of dysregulation events, systems biomedicine modelling approaches need to combine mechanistic information from various levels, including gene expression, microRNA (miRNA) expression, protein-protein interaction, genetic variation and pathway. OpenBEL, the open source version of BEL, has recently been extended to match this requirement, and we demonstrate in our article how candidate mechanisms for early dysregulation events in Alzheimer's disease can be identified based on an integrative mining approach that identifies 'chains of causation' that include single nucleotide polymorphism information in BEL models.


Subjects
Genetic Variation , Gene Expression , Humans , Neurodegenerative Diseases
11.
Bioinformatics ; 33(22): 3679-3681, 2017 Nov 15.
Article in English | MEDLINE | ID: mdl-28651363

ABSTRACT

MOTIVATION: The concept of a 'mechanism-based taxonomy of human disease' is currently replacing the outdated paradigm of diseases classified by clinical appearance. We have tackled the paradigm of mechanism-based patient subgroup identification in the challenging area of research on neurodegenerative diseases. RESULTS: We have developed a knowledge base representing essential pathophysiology mechanisms of neurodegenerative diseases. Together with dedicated algorithms, this knowledge base forms the basis for a 'mechanism-enrichment server' that supports the mechanistic interpretation of multiscale, multimodal clinical data. AVAILABILITY AND IMPLEMENTATION: NeuroMMSig is available at http://neurommsig.scai.fraunhofer.de/. CONTACT: martin.hofmann-apitius@scai.fraunhofer.de. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms , Knowledge Bases , Neurodegenerative Diseases/metabolism , Neurodegenerative Diseases/physiopathology , Humans , Internet , Models, Biological , Neurodegenerative Diseases/genetics , Software
12.
Theor Biol Med Model ; 12: 20, 2015 Sep 22.
Article in English | MEDLINE | ID: mdl-26395080

ABSTRACT

BACKGROUND: Despite the unprecedented and increasing amount of data, relatively little progress has been made in molecular characterization of mechanisms underlying Parkinson's disease. In the area of Parkinson's research, there is a pressing need to integrate various pieces of information into a meaningful context of presumed disease mechanism(s). Disease ontologies provide a novel means for organizing, integrating, and standardizing the knowledge domains specific to disease in a compact, formalized and computer-readable form and serve as a reference for knowledge exchange or systems modeling of disease mechanism. METHODS: The Parkinson's disease ontology was built according to the life cycle of ontology building. Structural, functional, and expert evaluation of the ontology was performed to ensure the quality and usability of the ontology. A novelty metric has been introduced to measure the gain of new knowledge using the ontology. Finally, a cause-and-effect model was built around PINK1 and two gene expression studies from the Gene Expression Omnibus database were re-annotated to demonstrate the usability of the ontology. RESULTS: The Parkinson's disease ontology with a subclass-based taxonomic hierarchy covers the broad spectrum of major biomedical concepts from molecular to clinical features of the disease, and also reflects different views on disease features held by molecular biologists, clinicians and drug developers. The current version of the ontology contains 632 concepts, which are organized under nine views. The structural evaluation showed the balanced dispersion of concept classes throughout the ontology. The functional evaluation demonstrated that the ontology-driven literature search could gain novel knowledge not present in the reference Parkinson's knowledge map. The ontology was able to answer specific questions related to Parkinson's when evaluated by experts. 
Finally, the added value of the Parkinson's disease ontology is demonstrated by ontology-driven modeling of PINK1 and re-annotation of gene expression datasets relevant to Parkinson's disease. CONCLUSIONS: Parkinson's disease ontology delivers the knowledge domain of Parkinson's disease in a compact, computer-readable form, which can be further edited and enriched by the scientific community and also to be used to construct, represent and automatically extend Parkinson's-related computable models. A practical version of the Parkinson's disease ontology for browsing and editing can be publicly accessed at http://bioportal.bioontology.org/ontologies/PDON .


Subjects
Gene Ontology , Knowledge , Parkinson Disease/genetics , Software , Animals , Databases, Genetic , Disease Models, Animal , Gene Expression Regulation , Gene Regulatory Networks , Humans , Molecular Sequence Annotation , Parkinson Disease/etiology
13.
Alzheimers Dement ; 11(11): 1329-39, 2015 Nov.
Article in English | MEDLINE | ID: mdl-25849034

ABSTRACT

INTRODUCTION: The discovery and development of new treatments for Alzheimer's disease (AD) requires a profound mechanistic understanding of the disease. Here, we propose a model-driven approach supporting the systematic identification of putative disease mechanisms. METHODS: We have created a model for AD and a corresponding model for the normal physiology of neurons using biological expression language to systematically model causal and correlative relationships between biomolecules, pathways, and clinical readouts. Through model-model comparison we identify "chains of causal relationships" that lead to new insights into putative disease mechanisms. RESULTS: Using differential analysis of our models we identified a new mechanism explaining the effect of amyloid-beta on apoptosis via both the neurotrophic tyrosine kinase receptor, type 2 and nerve growth factor receptor branches of the neurotrophin signaling pathway. We also provide the example of a model-guided interpretation of genetic variation data for a comorbidity analysis between AD and type 2 diabetes mellitus. DISCUSSION: The two computable, literature-based models introduced here provide a powerful framework for the generation and validation of rational, testable hypotheses across disease areas.


Subjects
Alzheimer Disease/physiopathology , Models, Neurological , Neurons/physiology , Alzheimer Disease/complications , Alzheimer Disease/epidemiology , Alzheimer Disease/genetics , Amyloid beta-Protein Precursor/metabolism , Animals , Brain/physiology , Brain/physiopathology , Comorbidity , Humans , Polymorphism, Single Nucleotide
14.
Int J Mol Sci ; 16(12): 29179-206, 2015 Dec 07.
Article in English | MEDLINE | ID: mdl-26690135

ABSTRACT

Since the decoding of the Human Genome, techniques from bioinformatics, statistics, and machine learning have been instrumental in uncovering patterns in increasing amounts and types of different data produced by technical profiling technologies applied to clinical samples, animal models, and cellular systems. Yet, progress on unravelling biological mechanisms, causally driving diseases, has been limited, in part due to the inherent complexity of biological systems. Whereas we have witnessed progress in the areas of cancer, cardiovascular and metabolic diseases, the area of neurodegenerative diseases has proved to be very challenging. This is in part because the aetiology of neurodegenerative diseases such as Alzheimer's disease or Parkinson's disease is unknown, rendering it very difficult to discern early causal events. Here we describe a panel of bioinformatics and modeling approaches that have recently been developed to identify candidate mechanisms of neurodegenerative diseases based on publicly available data and knowledge. We identify two complementary strategies: data mining techniques using genetic data as a starting point to be further enriched using other data-types, or alternatively to encode prior knowledge about disease mechanisms in a model-based framework supporting reasoning and enrichment analysis. Our review illustrates the challenges entailed in integrating heterogeneous, multiscale and multimodal information in the area of neurology in general and neurodegeneration in particular. We conclude that progress would be accelerated by increasing efforts on performing systematic collection of multiple data-types over time from each individual suffering from neurodegenerative disease. 
The work presented here has been driven by project AETIONOMY, a project funded in the course of the Innovative Medicines Initiative (IMI), which is a public-private partnership of the European Federation of Pharmaceutical Industry Associations (EFPIA) and the European Commission (EC).


Subjects
Data Mining , Neurodegenerative Diseases/genetics , Animals , Computational Biology , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Knowledge Bases , Polymorphism, Single Nucleotide
15.
Article in English | MEDLINE | ID: mdl-36462601

ABSTRACT

Schizophrenia and bipolar disorder are characterized by highly similar neuropsychological signatures, implying shared neurobiological mechanisms between these two disorders. These disorders also have comorbidities, such as type 2 diabetes mellitus (T2DM). To date, an understanding of the mechanisms that mediate the link between these two disorders remains incomplete. In this work, we identify and investigate shared patterns across multiple schizophrenia, bipolar disorder and T2DM gene expression datasets through multiple strategies. Firstly, we investigate dysregulation patterns at the gene-level and compare our findings against disease-specific knowledge graphs (KGs). Secondly, we analyze the concordance of co-expression patterns across datasets to identify disease-specific as well as common pathways. Thirdly, we examine enriched pathways across datasets and disorders to identify common biological mechanisms between them. Lastly, we investigate the correspondence of shared genetic variants between these two disorders and T2DM as well as the disease-specific KGs. In conclusion, our work reveals several shared candidate genes and pathways, particularly those related to the immune system, such as TNF signaling pathway, IL-17 signaling pathway and NF-kappa B signaling pathway and nervous system, such as dopaminergic synapse and GABAergic synapse, which we propose mediate the link between schizophrenia and bipolar disorder and its shared comorbidity, T2DM.


Subjects
Bipolar Disorder , Diabetes Mellitus, Type 2 , Schizophrenia , Humans , Bipolar Disorder/psychology , Schizophrenia/epidemiology , Schizophrenia/genetics , Comorbidity , Signal Transduction
16.
Bioinform Adv ; 3(1): vbad033, 2023.
Article in English | MEDLINE | ID: mdl-37016683

ABSTRACT

Motivation: Epilepsy is a multifaceted complex disorder that requires a precise understanding of the classification, diagnosis, treatment and disease mechanism governing it. Although scattered resources are available on epilepsy, comprehensive and structured knowledge is missing. To promote multidisciplinary knowledge exchange and facilitate advancement in clinical management, especially in pre-clinical research, a disease-specific ontology is necessary. The presented ontology is designed to enable better interconnection between scientific community members in the epilepsy domain. Results: The Epilepsy Ontology (EPIO) is an assembly of structured knowledge on various aspects of epilepsy, developed according to Basic Formal Ontology (BFO) and Open Biological and Biomedical Ontology (OBO) Foundry principles. Concepts and definitions are collected from the latest International League Against Epilepsy (ILAE) classification, domain-specific ontologies and scientific literature. This ontology consists of 1879 classes and 28 151 axioms (2171 declaration axioms, 2219 logical axioms) from several aspects of epilepsy. This ontology is intended to be used for data management and text mining purposes. Availability and implementation: The current release of the ontology is publicly available under a Creative Commons 4.0 License and shared via http://purl.obolibrary.org/obo/epso.owl and is a community-based effort assembling various facets of the complex disease. The ontology is also deposited in BioPortal at https://bioportal.bioontology.org/ontologies/EPIO. Supplementary information: Supplementary data are available at Bioinformatics Advances online.

18.
NPJ Syst Biol Appl ; 7(1): 40, 2021 10 27.
Article in English | MEDLINE | ID: mdl-34707117

ABSTRACT

The utility of pathway signatures lies in their capability to determine whether a specific pathway or biological process is dysregulated in a given patient. These signatures have been widely used in machine learning (ML) methods for a variety of applications including precision medicine, drug repurposing, and drug discovery. In this work, we leverage highly predictive ML models for drug response simulation in individual patients by calibrating the pathway activity scores of disease samples. Using these ML models and an intuitive scoring algorithm to modify the signatures of patients, we evaluate whether a given sample that was formerly classified as diseased could be predicted as normal following drug treatment simulation. We then use this technique as a proxy for the identification of potential drug candidates. Furthermore, we demonstrate the ability of our methodology to successfully identify approved and clinically investigated drugs for four different cancers, outperforming six comparable state-of-the-art methods. We also show how this approach can deconvolute a drug's mechanism of action and propose combination therapies. Taken together, our methodology could be promising to support clinical decision-making in personalized medicine by simulating a drug's effect on a given patient.


Subjects
Biological Phenomena , Machine Learning , Algorithms , Computer Simulation , Humans , Precision Medicine
19.
J Alzheimers Dis ; 80(2): 831-840, 2021.
Article in English | MEDLINE | ID: mdl-33554913

ABSTRACT

BACKGROUND: Neuroimaging markers provide quantitative insight into brain structure and function in neurodegenerative diseases, such as Alzheimer's disease, where we lack mechanistic insights to explain pathophysiology. These mechanisms are often mediated by genes and genetic variations and are often studied through the lens of genome-wide association studies. Linking these two disparate layers (i.e., imaging and genetic variation) through causal relationships between biological entities involved in the disease's etiology would pave the way to large-scale mechanistic reasoning and interpretation. OBJECTIVE: We explore how genetic variants may lead to functional alterations of intermediate molecular traits, which can further impact neuroimaging hallmarks over a series of biological processes across multiple scales. METHODS: We present an approach in which knowledge pertaining to single nucleotide polymorphisms and imaging readouts is extracted from the literature, encoded in Biological Expression Language, and used in a novel workflow to assist in the functional interpretation of SNPs in a clinical context. RESULTS: We demonstrate our approach in a case scenario which proposes KANSL1 as a candidate gene that accounts for the clinically reported correlation between the incidence of the genetic variants and hippocampal atrophy. We find that the workflow prioritizes multiple mechanisms reported in the literature through which KANSL1 may have an impact on hippocampal atrophy such as through the dysregulation of cell proliferation, synaptic plasticity, and metabolic processes. CONCLUSION: We have presented an approach that enables pinpointing relevant genetic variants as well as investigating their functional role in biological processes spanning across several, diverse biological scales.


Subjects
Alzheimer Disease/genetics , Genetic Predisposition to Disease/genetics , Neuroimaging , Systems Biology , Alzheimer Disease/diagnostic imaging , Biomarkers/metabolism , Brain/metabolism , Brain/pathology , Genome-Wide Association Study/methods , Humans , Phenotype , Polymorphism, Single Nucleotide/genetics , Systems Biology/methods
20.
Artif Intell Life Sci ; 1: 100020, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34988543

ABSTRACT

Despite available vaccinations, COVID-19 case numbers around the world are still growing, and effective medications against severe cases are lacking. In this work, we developed a machine learning model which predicts mortality for COVID-19 patients using data from the multi-center 'Lean European Open Survey on SARS-CoV-2-infected patients' (LEOSS) observational study (>100 active sites in Europe, primarily in Germany), resulting in an AUC of almost 80%. We showed that molecular mechanisms related to dementia, one of the relevant predictors in our model, intersect with those associated with COVID-19. Most notably, among these molecules was tyrosine kinase 2 (TYK2), a protein that has been patented as a drug target in Alzheimer's Disease but is also genetically associated with severe COVID-19 outcomes. We experimentally verified that the anti-cancer drugs Sorafenib and Regorafenib showed a clear anti-cytopathic effect in Caco2 and VERO-E6 cells and can thus be regarded as potential treatments against COVID-19. Altogether, our work demonstrates that interpretation of machine learning based risk models can point towards drug targets and new treatment options, which are strongly needed for COVID-19.
