Results 1 - 20 of 135
1.
Brief Bioinform ; 23(3)2022 05 13.
Article in English | MEDLINE | ID: mdl-35453140

ABSTRACT

Pathway enrichment analysis has become a widely used knowledge-based approach for the interpretation of biomedical data. Its popularity has led to an explosion of both enrichment methods and pathway databases. While the elegance of pathway enrichment lies in its simplicity, multiple factors can impact the results of such an analysis, which may not be accounted for. Researchers may fail to give influential aspects their due, resorting instead to popular methods and gene set collections, or default settings. Despite ongoing efforts to establish set guidelines, meaningful results are still hampered by a lack of consensus or gold standards around how enrichment analysis should be conducted. Nonetheless, such concerns have prompted a series of benchmark studies specifically focused on evaluating the influence of various factors on pathway enrichment results. In this review, we organize and summarize the findings of these benchmarks to provide a comprehensive overview on the influence of these factors. Our work covers a broad spectrum of factors, spanning from methodological assumptions to those related to prior biological knowledge, such as pathway definitions and database choice. In doing so, we aim to shed light on how these aspects can lead to insignificant, uninteresting or even contradictory results. Finally, we conclude the review by proposing future benchmarks as well as solutions to overcome some of the challenges, which originate from the outlined factors.


Subjects
Databases, Factual; Factor Analysis, Statistical; Longitudinal Studies
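
As an illustration of what a basic pathway enrichment computation looks like, the following minimal sketch implements over-representation analysis with a one-sided hypergeometric test. This is only one of the many enrichment methods the review above refers to, and it is not taken from the abstract; the function name and the toy numbers are assumptions chosen for the example.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_hits, n_overlap):
    """One-sided over-representation p-value, P(X >= n_overlap).

    X is the number of pathway genes seen when n_hits genes are drawn
    without replacement from n_background genes, n_pathway of which
    belong to the pathway (hypothetical helper, not a library API).
    """
    upper = min(n_pathway, n_hits)
    total = comb(n_background, n_hits)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_hits - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers (assumed): 1000 background genes, a 50-gene pathway,
# 40 differentially expressed genes, 6 of which fall in the pathway.
p_value = hypergeom_enrichment_p(1000, 50, 40, 6)
print(f"enrichment p-value: {p_value:.4g}")
```

Changing the background size or the gene set definition changes the p-value, which is one concrete way in which the factors discussed in the review can alter results.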
2.
Bioinformatics ; 39(1)2023 01 01.
Article in English | MEDLINE | ID: mdl-36322820

ABSTRACT

MOTIVATION: Drug discovery practitioners in industry and academia use semantic tools to extract information from online scientific literature to generate new insights into targets, therapeutics and diseases. However, due to complexities in access and analysis, patent-based literature is often overlooked as a source of information. As drug discovery is a highly competitive field, naturally, tools that tap into patent literature can provide any actor in the field an advantage in terms of better informed decision-making. Hence, we aim to facilitate access to patent literature through the creation of an automatic tool for extracting information from patents described in existing public resources. RESULTS: Here, we present PEMT, a novel patent enrichment tool, that takes advantage of public databases like ChEMBL and SureChEMBL to extract relevant patent information linked to chemical structures and/or gene names described through FAIR principles and metadata annotations. PEMT aims at supporting drug discovery and research by establishing a patent landscape around genes of interest. The pharmaceutical focus of the tool is mainly due to the subselection of International Patent Classification codes, but in principle, it can be used for other patent fields, provided that a link between a concept and chemical structure is investigated. Finally, we demonstrate a use-case in rare diseases by generating a gene-patent list based on the epidemiological prevalence of these diseases and exploring their underlying patent landscapes. AVAILABILITY AND IMPLEMENTATION: PEMT is an open-source Python tool and its source code and PyPi package are available at https://github.com/Fraunhofer-ITMP/PEMT and https://pypi.org/project/PEMT/, respectively. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Metadata; Software; Databases, Factual
3.
Bioinformatics ; 38(6): 1648-1656, 2022 03 04.
Article in English | MEDLINE | ID: mdl-34986221

ABSTRACT

MOTIVATION: The majority of biomedical knowledge is stored in structured databases or as unstructured text in scientific publications. This vast amount of information has led to numerous machine learning-based biological applications using either text through natural language processing (NLP) or structured data through knowledge graph embedding models. However, representations based on a single modality are inherently limited. RESULTS: To generate better representations of biological knowledge, we propose STonKGs, a Sophisticated Transformer trained on biomedical text and Knowledge Graphs (KGs). This multimodal Transformer uses combined input sequences of structured information from KGs and unstructured text data from biomedical literature to learn joint representations in a shared embedding space. First, we pre-trained STonKGs on a knowledge base assembled by the Integrated Network and Dynamical Reasoning Assembler consisting of millions of text-triple pairs extracted from biomedical literature by multiple NLP systems. Then, we benchmarked STonKGs against three baseline models trained on either one of the modalities (i.e. text or KG) across eight different classification tasks, each corresponding to a different biological application. Our results demonstrate that STonKGs outperforms both baselines, especially on the more challenging tasks with respect to the number of classes, improving upon the F1-score of the best baseline by up to 0.084 (i.e. from 0.881 to 0.965). Finally, our pre-trained model as well as the model architecture can be adapted to various other transfer learning applications. AVAILABILITY AND IMPLEMENTATION: We make the source code and the Python package of STonKGs available at GitHub (https://github.com/stonkgs/stonkgs) and PyPI (https://pypi.org/project/stonkgs/). The pre-trained STonKGs models and the task-specific classification models are respectively available at https://huggingface.co/stonkgs/stonkgs-150k and https://zenodo.org/communities/stonkgs. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Pattern Recognition, Automated; Software; Machine Learning; Natural Language Processing; Publications
4.
Bioinformatics ; 38(15): 3850-3852, 2022 08 02.
Article in English | MEDLINE | ID: mdl-35652780

ABSTRACT

MOTIVATION: The importance of clinical data in understanding the pathophysiology of complex disorders has prompted the launch of multiple initiatives designed to generate patient-level data from various modalities. While these studies can reveal important findings relevant to the disease, each study captures different yet complementary aspects and modalities which, when combined, generate a more comprehensive picture of disease etiology. However, achieving this requires a global integration of data across studies, which proves to be challenging given the lack of interoperability of cohort datasets. RESULTS: Here, we present the Data Steward Tool (DST), an application that allows for semi-automatic semantic integration of clinical data into ontologies and global data models and data standards. We demonstrate the applicability of the tool in the field of dementia research by establishing a Clinical Data Model (CDM) in this domain. The CDM currently consists of 277 common variables covering demographics (e.g. age and gender), diagnostics, neuropsychological tests and biomarker measurements. The DST combined with this disease-specific data model shows how interoperability between multiple, heterogeneous dementia datasets can be achieved. AVAILABILITY AND IMPLEMENTATION: The DST source code and Docker images are respectively available at https://github.com/SCAI-BIO/data-steward and https://hub.docker.com/r/phwegner/data-steward. Furthermore, the DST is hosted at https://data-steward.bio.scai.fraunhofer.de/data-steward. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Dementia; Semantics; Humans; Software; Dementia/diagnosis
5.
Bioinformatics ; 38(24): 5466-5468, 2022 12 13.
Article in English | MEDLINE | ID: mdl-36303318

ABSTRACT

MOTIVATION: A global medical crisis like the coronavirus disease 2019 (COVID-19) pandemic requires interdisciplinary and highly collaborative research from all over the world. One of the key challenges for collaborative research is a lack of interoperability among various heterogeneous data sources. Interoperability, standardization and mapping of datasets are necessary for data analysis and applications in advanced algorithms such as developing personalized risk prediction modeling. RESULTS: To ensure the interoperability and compatibility among COVID-19 datasets, we present here a common data model (CDM) which has been built from 11 different COVID-19 datasets from various geographical locations. The current version of the CDM holds 4639 data variables related to COVID-19 such as basic patient information (age, biological sex and diagnosis) as well as disease-specific data variables, for example, Anosmia and Dyspnea. Each of the data variables in the data model is associated with specific data types, variable mappings, value ranges, data units and data encodings that could be used for standardizing any dataset. Moreover, the compatibility with established data standards like OMOP and FHIR makes the CDM a well-designed CDM for COVID-19 data interoperability. AVAILABILITY AND IMPLEMENTATION: The CDM is available in a public repo here: https://github.com/Fraunhofer-SCAI-Applied-Semantics/COVID-19-Global-Model. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
COVID-19; Humans; Algorithms; Pandemics
6.
Nucleic Acids Res ; 49(14): 7939-7953, 2021 08 20.
Article in English | MEDLINE | ID: mdl-34197603

ABSTRACT

We attempt to address a key question in the joint analysis of transcriptomic data: can we correlate the patterns we observe in transcriptomic datasets to known interactions and pathway knowledge to broaden our understanding of disease pathophysiology? We present a systematic approach that sheds light on the patterns observed in hundreds of transcriptomic datasets from over sixty indications by using pathways and molecular interactions as a template. Our analysis employs transcriptomic datasets to construct dozens of disease specific co-expression networks, alongside a human protein-protein interactome network. Leveraging the interoperability between these two network templates, we explore patterns both common and particular to these diseases on three different levels. Firstly, at the node-level, we identify most and least common proteins across diseases and evaluate their consistency against the interactome as a proxy for their prevalence in the scientific literature. Secondly, we overlay both network templates to analyze common correlations and interactions across diseases at the edge-level. Thirdly, we explore the similarity between patterns observed at the disease-level and pathway knowledge to identify signatures associated with specific diseases and indication areas. Finally, we present a case scenario in schizophrenia, where we show how our approach can be used to investigate disease pathophysiology.


Subjects
Disease/genetics; Gene Expression Profiling/methods; Gene Regulatory Networks; Genetic Predisposition to Disease/genetics; Signal Transduction/genetics; Transcriptome/genetics; Algorithms; Cluster Analysis; Humans; Schizophrenia/genetics
7.
BMC Bioinformatics ; 23(1): 231, 2022 Jun 15.
Article in English | MEDLINE | ID: mdl-35705903

ABSTRACT

Distinct gene expression patterns within cells are foundational for the diversity of functions and unique characteristics observed in specific contexts, such as human tissues and cell types. Though some biological processes commonly occur across contexts, by harnessing the vast amounts of available gene expression data, we can decipher the processes that are unique to a specific context. Therefore, with the goal of developing a portrait of context-specific patterns to better elucidate how they govern distinct biological processes, this work presents a large-scale exploration of transcriptomic signatures across three different contexts (i.e., tissues, cell types, and cell lines) by leveraging over 600 gene expression datasets categorized into 98 subcontexts. The strongest pairwise correlations between genes from these subcontexts are used for the construction of co-expression networks. Using a network-based approach, we then pinpoint patterns that are unique and common across these subcontexts. First, we focused on patterns at the level of individual nodes and evaluated their functional roles using a human protein-protein interactome as a referential network. Next, within each context, we systematically overlaid the co-expression networks to identify specific and shared correlations as well as relations already described in scientific literature. Additionally, in a pathway-level analysis, we overlaid node and edge sets from co-expression networks against pathway knowledge to identify biological processes that are related to specific subcontexts or groups of them. Finally, we have released our data and scripts at https://zenodo.org/record/5831786 and https://github.com/ContNeXt/ , respectively and developed ContNeXt ( https://contnext.scai.fraunhofer.de/ ), a web application to explore the networks generated in this work.


Subjects
Gene Regulatory Networks; Transcriptome; Gene Expression Profiling; Humans; Software
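
The abstract above builds co-expression networks from the strongest pairwise correlations between genes. A minimal sketch of that idea follows, assuming Pearson correlation and a simple "keep the top k pairs by absolute correlation" rule; the abstract does not state the correlation measure or the cutoff, and the gene names and expression values are invented for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (non-constant inputs)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def coexpression_edges(expression, top_k):
    """Return the top_k gene pairs ranked by absolute correlation.

    `expression` maps a gene name to its values across samples. The
    top-k rule is an assumption made for this sketch.
    """
    pairs = [
        (g1, g2, pearson(expression[g1], expression[g2]))
        for g1, g2 in combinations(sorted(expression), 2)
    ]
    pairs.sort(key=lambda edge: abs(edge[2]), reverse=True)
    return pairs[:top_k]


# Hypothetical expression values for four genes across five samples.
toy_expression = {
    "GENE_A": [1, 2, 3, 4, 5],
    "GENE_B": [2, 4, 6, 8, 10.5],
    "GENE_C": [5, 3, 4, 1, 2],
    "GENE_D": [1, 1, 2, 1, 2],
}
for gene_1, gene_2, r in coexpression_edges(toy_expression, top_k=2):
    print(gene_1, gene_2, round(r, 3))
```

Networks built this way for different tissues or cell types can then be overlaid edge by edge to find shared and context-specific correlations.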
8.
Bioinformatics ; 37(1): 137-139, 2021 Apr 09.
Article in English | MEDLINE | ID: mdl-33367476

ABSTRACT

SUMMARY: High-throughput screening yields vast amounts of biological data which can be highly challenging to interpret. In response, knowledge-driven approaches emerged as possible solutions to analyze large datasets by leveraging prior knowledge of biomolecular interactions represented in the form of biological networks. Nonetheless, given their size and complexity, their manual investigation quickly becomes impractical. Thus, computational approaches, such as diffusion algorithms, are often employed to interpret and contextualize the results of high-throughput experiments. Here, we present MultiPaths, a framework consisting of two independent Python packages for network analysis. While the first package, DiffuPy, comprises numerous commonly used diffusion algorithms applicable to any generic network, the second, DiffuPath, enables the application of these algorithms on multi-layer biological networks. To facilitate its usability, the framework includes a command line interface, reproducible examples and documentation. To demonstrate the framework, we conducted several diffusion experiments on three independent multi-omics datasets over disparate networks generated from pathway databases, thus, highlighting the ability of multi-layer networks to integrate multiple modalities. Finally, the results of these experiments demonstrate how the generation of harmonized networks from disparate databases can improve predictive performance with respect to individual resources. AVAILABILITY AND IMPLEMENTATION: DiffuPy and DiffuPath are publicly available under the Apache License 2.0 at https://github.com/multipaths. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

9.
Bioinformatics ; 37(19): 3311-3318, 2021 Oct 11.
Article in English | MEDLINE | ID: mdl-33964127

ABSTRACT

SUMMARY: As machine learning and artificial intelligence increasingly attain a larger number of applications in the biomedical domain, at their core, their utility depends on the data used to train them. Due to the complexity and high dimensionality of biomedical data, there is a need for approaches that combine prior knowledge around known biological interactions with patient data. Here, we present CLinical Embedding of Patients (CLEP), a novel approach that generates new patient representations by leveraging both prior knowledge and patient-level data. First, given a patient-level dataset and a knowledge graph containing relations across features that can be mapped to the dataset, CLEP incorporates patients into the knowledge graph as new nodes connected to their most characteristic features. Next, CLEP employs knowledge graph embedding models to generate new patient representations that can ultimately be used for a variety of downstream tasks, ranging from clustering to classification. We demonstrate how using new patient representations generated by CLEP significantly improves performance in classifying between patients and healthy controls for a variety of machine learning models, as compared to the use of the original transcriptomics data. Furthermore, we also show how incorporating patients into a knowledge graph can foster the interpretation and identification of biological features characteristic of a specific disease or patient subgroup. Finally, we released CLEP as an open source Python package together with examples and documentation. AVAILABILITY AND IMPLEMENTATION: CLEP is available to the bioinformatics community as an open source Python package at https://github.com/hybrid-kg/clep under the Apache 2.0 License. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
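
The first step described above, adding patients to a knowledge graph as nodes linked to their most characteristic features, can be sketched as follows. This is not the CLEP API: the function name, the z-score rule against healthy controls, the cutoff of 2.0 and the edge labels are all assumptions made for illustration, and the subsequent knowledge graph embedding step is not shown.

```python
from statistics import mean, stdev


def patient_edges(patients, controls, z_cutoff=2.0):
    """Link each patient to features that deviate strongly from controls.

    Both arguments map a sample id to a {feature: value} dict. A feature
    counts as characteristic when its z-score against the control
    distribution reaches the cutoff (an assumed rule for this sketch).
    """
    features = sorted(next(iter(controls.values())))
    stats = {}
    for feature in features:
        values = [sample[feature] for sample in controls.values()]
        stats[feature] = (mean(values), stdev(values))

    edges = []
    for patient_id, values in sorted(patients.items()):
        for feature in features:
            mu, sd = stats[feature]
            z = (values[feature] - mu) / sd
            if z >= z_cutoff:
                edges.append((patient_id, "up", feature))
            elif z <= -z_cutoff:
                edges.append((patient_id, "down", feature))
    return edges


# Hypothetical measurements for three controls and two patients.
toy_controls = {
    "C1": {"GENE_A": 1.0, "GENE_B": 5.0},
    "C2": {"GENE_A": 1.2, "GENE_B": 5.5},
    "C3": {"GENE_A": 0.8, "GENE_B": 4.5},
}
toy_patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 5.1},
    "P2": {"GENE_A": 1.1, "GENE_B": 3.0},
}
print(patient_edges(toy_patients, toy_controls))
```

The resulting triples could be appended to the edge list of an existing knowledge graph before an embedding model is trained on it.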

10.
Bioinformatics ; 37(9): 1332-1334, 2021 06 09.
Article in English | MEDLINE | ID: mdl-32976572

ABSTRACT

SUMMARY: The COVID-19 crisis has elicited a global response by the scientific community that has led to a burst of publications on the pathophysiology of the virus. However, without coordinated efforts to organize this knowledge, it can remain hidden away from individual research groups. By extracting and formalizing this knowledge in a structured and computable form, as in the form of a knowledge graph, researchers can readily reason and analyze this information on a much larger scale. Here, we present the COVID-19 Knowledge Graph, an expansive cause-and-effect network constructed from scientific literature on the new coronavirus that aims to provide a comprehensive view of its pathophysiology. To make this resource available to the research community and facilitate its exploration and analysis, we also implemented a web application and released the KG in multiple standard formats. AVAILABILITY AND IMPLEMENTATION: The COVID-19 Knowledge Graph is publicly available under CC-0 license at https://github.com/covid19kg and https://bikmi.covid19-knowledgespace.de. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
COVID-19; Software; Humans; Pattern Recognition, Automated; Publications; SARS-CoV-2
11.
Bioinformatics ; 36(24): 5703-5705, 2021 Apr 05.
Article in English | MEDLINE | ID: mdl-33346828

ABSTRACT

MOTIVATION: The COVID-19 pandemic has prompted an impressive, worldwide response by the academic community. In order to support text mining approaches as well as data description, linking and harmonization in the context of COVID-19, we have developed an ontology representing major novel coronavirus (SARS-CoV-2) entities. The ontology has a strong scope on chemical entities suited for drug repurposing, as this is a major target of ongoing COVID-19 therapeutic development. RESULTS: The ontology comprises 2270 classes of concepts and 38 987 axioms (2622 logical axioms and 2434 declaration axioms). It depicts the roles of molecular and cellular entities in virus-host interactions and in the virus life cycle, as well as a wide spectrum of medical and epidemiological concepts linked to COVID-19. The performance of the ontology has been tested on Medline and the COVID-19 corpus provided by the Allen Institute. AVAILABILITY AND IMPLEMENTATION: COVID-19 Ontology is released under a Creative Commons 4.0 License and shared via https://github.com/covid-19-ontology/covid-19. The ontology is also deposited in BioPortal at https://bioportal.bioontology.org/ontologies/COVID-19. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

12.
PLoS Comput Biol ; 16(12): e1008464, 2020 12.
Article in English | MEDLINE | ID: mdl-33264280

ABSTRACT

Elucidating the causal mechanisms responsible for disease can reveal potential therapeutic targets for pharmacological intervention and, accordingly, guide drug repositioning and discovery. In essence, the topology of a network can reveal the impact a drug candidate may have on a given biological state, leading the way for enhanced disease characterization and the design of advanced therapies. Network-based approaches, in particular, are highly suited for these purposes as they hold the capacity to identify the molecular mechanisms underlying disease. Here, we present drug2ways, a novel methodology that leverages multimodal causal networks for predicting drug candidates. Drug2ways implements an efficient algorithm which reasons over causal paths in large-scale biological networks to propose drug candidates for a given disease. We validate our approach using clinical trial information and demonstrate how drug2ways can be used for multiple applications to identify: i) single-target drug candidates, ii) candidates with polypharmacological properties that can optimize multiple targets, and iii) candidates for combination therapy. Finally, we make drug2ways available to the scientific community as a Python package that enables conducting these applications on multiple standard network formats.


Subjects
Drug Discovery/methods; Drug Repositioning/methods; Models, Biological; Algorithms; Antineoplastic Combined Chemotherapy Protocols/therapeutic use; Computer Simulation; Drug Therapy; Humans; Neoplasms/drug therapy; Phenotype; Polypharmacology
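
To make "reasoning over causal paths" concrete, the sketch below enumerates simple paths from a drug to a phenotype in a small signed, directed network and reports the net effect of each path as the product of its edge signs. The abstract above does not describe the actual drug2ways algorithm in this detail, so the traversal, the sign convention (+1 activates, -1 inhibits) and all node names are assumptions for illustration only.

```python
def signed_paths(graph, source, target, max_len):
    """Yield the net sign of each simple path from source to target.

    `graph` maps a node to {successor: edge_sign}, with +1 for activation
    and -1 for inhibition. Only paths with at most `max_len` edges are
    explored (hypothetical helper, not part of any package).
    """
    stack = [(source, 1, (source,))]
    while stack:
        node, sign, visited = stack.pop()
        if node == target:
            yield sign
            continue
        if len(visited) - 1 == max_len:
            continue  # path already uses the maximum number of edges
        for successor, edge_sign in graph.get(node, {}).items():
            if successor not in visited:
                stack.append((successor, sign * edge_sign, visited + (successor,)))


# Hypothetical causal network: a drug, two proteins and a phenotype.
toy_graph = {
    "drug_X": {"protein_A": -1, "protein_B": 1},
    "protein_A": {"phenotype_Y": 1},
    "protein_B": {"protein_A": 1, "phenotype_Y": -1},
}
signs = list(signed_paths(toy_graph, "drug_X", "phenotype_Y", max_len=3))
inhibiting = sum(1 for s in signs if s == -1)
print(f"{inhibiting} of {len(signs)} paths predict a decrease of phenotype_Y")
```

A drug whose paths mostly point in the direction opposite to the disease state would be a candidate under this simplified scoring.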
13.
BMC Bioinformatics ; 21(1): 231, 2020 Jun 05.
Article in English | MEDLINE | ID: mdl-32503412

ABSTRACT

BACKGROUND: During the last decade, there has been a surge towards computational drug repositioning owing to constantly increasing -omics data in the biomedical research field. While numerous existing methods focus on the integration of heterogeneous data to propose candidate drugs, it is still challenging to substantiate their results with mechanistic insights of these candidate drugs. Therefore, there is a need for more innovative and efficient methods which can enable better integration of data and knowledge for drug repositioning. RESULTS: Here, we present a customizable workflow (PS4DR) which not only integrates high-throughput data such as genome-wide association study (GWAS) data and gene expression signatures from disease and drug perturbations but also takes pathway knowledge into consideration to predict drug candidates for repositioning. We have collected and integrated publicly available GWAS data and gene expression signatures for several diseases and hundreds of FDA-approved drugs or those under clinical trial in this study. Additionally, different pathway databases were used for mechanistic knowledge integration in the workflow. Using this systematic consolidation of data and knowledge, the workflow computes pathway signatures that assist in the prediction of new indications for approved and investigational drugs. CONCLUSION: We showcase PS4DR with applications demonstrating how this tool can be used for repositioning and identifying new drugs as well as proposing drugs that can simulate disease dysregulations. We were able to validate our workflow by demonstrating its capability to predict FDA-approved drugs for their known indications for several diseases. Further, PS4DR returned many potential drug candidates for repositioning that were backed up by epidemiological evidence extracted from scientific literature. Source code is freely available at https://github.com/ps4dr/ps4dr.


Subjects
Pharmaceutical Preparations/metabolism; User-Computer Interface; Clinical Trials as Topic; Computational Biology/methods; Drug Repositioning; Genome-Wide Association Study; Humans; Transcriptome; Workflow
14.
Curr Opin Neurol ; 33(2): 249-254, 2020 04.
Article in English | MEDLINE | ID: mdl-32073441

ABSTRACT

PURPOSE OF REVIEW: With the advancement of computational approaches and abundance of biomedical data, a broad range of neurodegenerative disease models have been developed. In this review, we argue that computational models can be both relevant and useful in neurodegenerative disease research and although the current established models have limitations in clinical practice, artificial intelligence has the potential to overcome deficiencies encountered by these models, which in turn can improve our understanding of disease. RECENT FINDINGS: In recent years, diverse computational approaches have been used to shed light on different aspects of neurodegenerative disease models. For example, linear and nonlinear mixed models, self-modeling regression, differential equation models, and event-based models have been applied to provide a better understanding of disease progression patterns and biomarker trajectories. Additionally, the Cox-regression technique, Bayesian network models, and deep-learning-based approaches have been used to predict the probability of future incidence of disease, whereas nonnegative matrix factorization, nonhierarchical cluster analysis, hierarchical agglomerative clustering, and deep-learning-based approaches have been employed to stratify patients based on their disease subtypes. Furthermore, the interpretation of neurodegenerative disease data is possible through knowledge-based models which use prior knowledge to complement data-driven analyses. These knowledge-based models can include pathway-centric approaches to establish pathways perturbed in a given condition, as well as disease-specific knowledge maps, which elucidate the mechanisms involved in a given disease. Collectively, these established models have revealed high granular details and insights into neurodegenerative disease models. 
SUMMARY: In conjunction with increasingly advanced computational approaches, a wide spectrum of neurodegenerative disease models, which can be broadly categorized into data-driven and knowledge-driven, have been developed. We review the state-of-the-art data- and knowledge-driven models and discuss the steps needed to bring them into clinical application.


Subjects
Data Science; Neurodegenerative Diseases/epidemiology; Algorithms; Humans; Models, Statistical
15.
BMC Bioinformatics ; 20(1): 494, 2019 Oct 11.
Article in English | MEDLINE | ID: mdl-31604427

ABSTRACT

BACKGROUND: Literature-derived knowledge assemblies have been used as an effective way of representing biological phenomena and understanding disease etiology in systems biology. These include canonical pathway databases such as KEGG, Reactome and WikiPathways and disease-specific network inventories such as the causal biological networks database, PD map and NeuroMMSig. The represented knowledge in these resources delineates qualitative information focusing mainly on the causal relationships between biological entities. Genes, the major constituents of knowledge representations, tend to express differentially in different conditions such as cell types, brain regions and disease stages. A classical approach of interpreting a knowledge assembly is to explore gene expression patterns of the individual genes. However, an approach that enables quantification of the overall impact of differentially expressed genes in the corresponding network is still lacking. RESULTS: Using the concept of heat diffusion, we have devised an algorithm that is able to calculate the magnitude of regulation of a biological network using expression datasets. We have demonstrated that molecular mechanisms specific to Alzheimer's disease (AD) and Parkinson's disease (PD) regulate with different intensities across spatial and temporal resolutions. Our approach indicates that the mitochondrial dysfunction in PD is severe in the cortex and in advanced stages of PD patients. Similarly, we have shown that the intensity of aggregation of neurofibrillary tangles (NFTs) in AD increases as the disease progresses. This finding is in concordance with previous studies that explain the burden of NFTs in stages of AD. CONCLUSIONS: This study is one of the first attempts to enable quantification of mechanisms represented as biological networks. We have been able to quantify the magnitude of regulation of a biological network and illustrate that the magnitudes are different across spatial and temporal resolution.


Subjects
Algorithms; Brain/metabolism; Neurodegenerative Diseases/metabolism; Systems Biology/methods; Alzheimer Disease/genetics; Alzheimer Disease/metabolism; Gene Expression Regulation; Humans; Metabolic Networks and Pathways; Mitochondria/metabolism; Mitochondria/physiology; Neurodegenerative Diseases/genetics; Neurodegenerative Diseases/physiopathology; Parkinson Disease/genetics; Parkinson Disease/metabolism; Parkinson Disease/physiopathology; Protein Interaction Maps; Signal Transduction
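
The general concept of heat diffusion on a network, which the abstract above builds on, can be sketched as follows. This is a generic explicit-Euler simulation of heat flowing along the edges of an undirected graph, not the authors' algorithm for quantifying the magnitude of regulation; the step size, number of steps, gene names and initial heat values (imagined here as absolute expression changes) are assumptions.

```python
def diffuse_heat(adjacency, heat, step=0.1, n_steps=20):
    """Simulate heat diffusion on an undirected graph.

    `adjacency` maps each node to a list of neighbours and must be
    symmetric; `heat` maps each node to its initial heat. Each step moves
    heat along every edge in proportion to the difference between its
    endpoints. `step` times the largest node degree should stay below 1
    for the iteration to remain stable.
    """
    current = dict(heat)
    for _ in range(n_steps):
        updated = {}
        for node, neighbours in adjacency.items():
            net_flow = sum(current[nb] - current[node] for nb in neighbours)
            updated[node] = current[node] + step * net_flow
        current = updated
    return current


# Hypothetical three-gene chain where only GENE_A starts with heat.
toy_adjacency = {
    "GENE_A": ["GENE_B"],
    "GENE_B": ["GENE_A", "GENE_C"],
    "GENE_C": ["GENE_B"],
}
toy_heat = {"GENE_A": 3.0, "GENE_B": 0.0, "GENE_C": 0.0}
final_heat = diffuse_heat(toy_adjacency, toy_heat)
for gene, value in final_heat.items():
    print(gene, round(value, 3))
```

Total heat is conserved by this update, so the simulation only redistributes the initial signal across the network according to its topology.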
16.
BMC Bioinformatics ; 20(1): 243, 2019 May 15.
Article in English | MEDLINE | ID: mdl-31092193

ABSTRACT

BACKGROUND: The complexity of representing biological systems is compounded by an ever-expanding body of knowledge emerging from multi-omics experiments. A number of pathway databases have facilitated pathway-centric approaches that assist in the interpretation of molecular signatures yielded by these experiments. However, the lack of interoperability between pathway databases has hindered the ability to harmonize these resources and to exploit their consolidated knowledge. Such a unification of pathway knowledge is imperative in enhancing the comprehension and modeling of biological abstractions. RESULTS: Here, we present PathMe, a Python package that transforms pathway knowledge from three major pathway databases into a unified abstraction using Biological Expression Language as the pivotal, integrative schema. PathMe is complemented by a novel web application (freely available at https://pathme.scai.fraunhofer.de/ ) which allows users to comprehensively explore pathway crosstalk and compare areas of consensus and discrepancies. CONCLUSIONS: This work has harmonized three major pathway databases and transformed them into a unified schema in order to gain a holistic picture of pathway knowledge. We demonstrate the utility of the PathMe framework in: i) integrating pathway landscapes at the database level, ii) comparing the degree of consensus at the pathway level, and iii) exploring pathway crosstalk and investigating consensus at the molecular level.


Subjects
Signal Transduction; Software; Computational Biology; Databases as Topic; Databases, Factual; Humans; TOR Serine-Threonine Kinases/metabolism
17.
Bioinformatics ; 34(13): 2316-2318, 2018 07 01.
Article in English | MEDLINE | ID: mdl-29949955

ABSTRACT

Summary: While cause-and-effect knowledge assembly models encoded in Biological Expression Language are able to support generation of mechanistic hypotheses, they are static and limited in their ability to encode temporality. Here, we present BEL2ABM, a software for producing continuous, dynamic, executable agent-based models from BEL templates. Availability and implementation: The tool has been developed in Java and NetLogo. Code, data and documentation are available under the Apache 2.0 License at https://github.com/pybel/bel2abm. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Language; Software; Documentation; Humans; Models, Biological
18.
Nucleic Acids Res ; 45(16): 9290-9301, 2017 Sep 19.
Article in English | MEDLINE | ID: mdl-28934507

ABSTRACT

With this study, we provide a comprehensive reference dataset of detailed miRNA expression profiles from seven types of human peripheral blood cells (NK cells, B lymphocytes, cytotoxic T lymphocytes, T helper cells, monocytes, neutrophils and erythrocytes), serum, exosomes and whole blood. The peripheral blood cells from buffy coats were typed and sorted using FACS/MACS. The overall dataset was generated from 450 small RNA libraries using high-throughput sequencing. By employing a comprehensive bioinformatics and statistical analysis, we show that 3' trimming modifications as well as composition of 3' added non-templated nucleotides are distributed in a lineage-specific manner: the closer the hematopoietic progenitors are, the higher their similarities in sequence variation of the 3' end. Furthermore, we define the blood cell-specific miRNA and isomiR expression patterns and identify novel cell type-specific miRNA candidates. The study provides the most comprehensive contribution to date towards a complete miRNA catalogue of human peripheral blood, which can be used as a reference for future studies. The dataset has been deposited in GEO and can also be explored interactively following this link: http://134.245.63.235/ikmb-tools/bloodmiRs.


Subjects
Blood Cells/metabolism, MicroRNAs/blood, Cell Lineage, Erythrocytes/metabolism, Exosomes/metabolism, Humans, Lymphocytes/metabolism, MicroRNAs/chemistry, Myeloid Cells/metabolism, RNA Sequence Analysis, Transcriptome
19.
Brief Bioinform ; 17(3): 505-16, 2016 05.
Article in English | MEDLINE | ID: mdl-26249223

ABSTRACT

The work we present here is based on the recent extension of the syntax of the Biological Expression Language (BEL), which now allows for the representation of genetic variation information in cause-and-effect models. In our article, we describe how genetic variation information can be used to identify candidate disease mechanisms in diseases with complex aetiology such as Alzheimer's disease and Parkinson's disease. In those diseases, we have to assume that many genetic variants contribute moderately to the overall dysregulation, which, in the case of neurodegenerative diseases, has a long incubation time before the first clinical symptoms are detectable. Owing to the multilevel nature of dysregulation events, systems biomedicine modelling approaches need to combine mechanistic information from various levels, including gene expression, microRNA (miRNA) expression, protein-protein interaction, genetic variation and pathway. OpenBEL, the open source version of BEL, has recently been extended to match this requirement, and we demonstrate how candidate mechanisms for early dysregulation events in Alzheimer's disease can be identified based on an integrative mining approach that identifies 'chains of causation' that include single nucleotide polymorphism information in BEL models.


Subjects
Genetic Variation, Gene Expression, Humans, Neurodegenerative Diseases
20.
Bioinformatics ; 33(22): 3679-3681, 2017 Nov 15.
Article in English | MEDLINE | ID: mdl-28651363

ABSTRACT

MOTIVATION: The concept of a 'mechanism-based taxonomy of human disease' is currently replacing the outdated paradigm of diseases classified by clinical appearance. We have tackled the paradigm of mechanism-based patient subgroup identification in the challenging area of research on neurodegenerative diseases. RESULTS: We have developed a knowledge base representing essential pathophysiology mechanisms of neurodegenerative diseases. Together with dedicated algorithms, this knowledge base forms the basis for a 'mechanism-enrichment server' that supports the mechanistic interpretation of multiscale, multimodal clinical data. AVAILABILITY AND IMPLEMENTATION: NeuroMMSig is available at http://neurommsig.scai.fraunhofer.de/. CONTACT: martin.hofmann-apitius@scai.fraunhofer.de. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms, Knowledge Bases, Neurodegenerative Diseases/metabolism, Neurodegenerative Diseases/physiopathology, Humans, Internet, Biological Models, Neurodegenerative Diseases/genetics, Software