1.
Bioinformatics ; 38(10): 2880-2891, 2022 05 13.
Article in English | MEDLINE | ID: mdl-35561182

ABSTRACT

MOTIVATION: Drug repositioning is an attractive alternative to de novo drug discovery due to reduced time and costs to bring drugs to market. Computational repositioning methods, particularly non-black-box methods that can account for and predict a drug's mechanism, may provide great benefit for directing future development. By tuning both data and algorithm to utilize relationships important to drug mechanisms, a computational repositioning algorithm can be trained to both predict and explain mechanistically novel indications. RESULTS: In this work, we examined the 123 curated drug mechanism paths found in the drug mechanism database (DrugMechDB) and, after identifying the most important relationships, we integrated 18 data sources to produce a heterogeneous knowledge graph, MechRepoNet, capable of capturing the information in these paths. We applied the Rephetio repurposing algorithm to MechRepoNet using only a subset of relationships known to be mechanistic in nature and found adequate predictive ability on an evaluation set with an AUROC value of 0.83. The resulting repurposing model allowed us to prioritize paths in our knowledge graph to produce a predicted treatment mechanism. We found that DrugMechDB paths, when present in the network, were rated highly among predicted mechanisms. We then demonstrated MechRepoNet's ability to use mechanistic insight to identify a drug's mechanistic target, with a mean reciprocal rank of 0.525 on a test set of known drug-target interactions. Finally, we walked through repurposing examples of the anti-cancer drug imatinib for use in the treatment of asthma, and metolazone for use in the treatment of osteoporosis, to demonstrate this method's utility in providing mechanistic insight into the repurposing predictions it provides. AVAILABILITY AND IMPLEMENTATION: The Python code to reproduce the entirety of this analysis is available at: https://github.com/SuLab/MechRepoNet (archived at https://doi.org/10.5281/zenodo.6456335).
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
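
Reader's note on the metric quoted in the abstract above: mean reciprocal rank (MRR) averages, over all queries, the reciprocal of the rank at which the first correct answer appears. The sketch below is only an illustration of that definition. The function name, the drug and target identifiers, and the rankings are all hypothetical and are not taken from the paper or its code.

```python
from typing import Dict, List, Set


def mean_reciprocal_rank(
    rankings: Dict[str, List[str]], truth: Dict[str, Set[str]]
) -> float:
    """Average 1/rank of the first correct candidate per query (0 if none is found)."""
    if not rankings:
        return 0.0
    total = 0.0
    for query, ranked_candidates in rankings.items():
        correct = truth.get(query, set())
        for position, candidate in enumerate(ranked_candidates, start=1):
            if candidate in correct:
                total += 1.0 / position
                break  # only the first correct hit counts
    return total / len(rankings)


# Hypothetical ranked target predictions for three drugs (illustrative data only).
rankings = {
    "drug_a": ["target_1", "target_2", "target_3"],  # correct target at rank 1
    "drug_b": ["target_4", "target_5", "target_6"],  # correct target at rank 2
    "drug_c": ["target_7", "target_8"],              # correct target never ranked
}
truth = {
    "drug_a": {"target_1"},
    "drug_b": {"target_5"},
    "drug_c": {"target_9"},
}

mrr = mean_reciprocal_rank(rankings, truth)
print(f"MRR = {mrr:.3f}")
```

Read this way, an MRR close to 0.5 is what one would see if the correct target sat at rank 2 for every drug, although the same average can arise from many other mixtures of ranks.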


Subject(s)
Algorithms , Drug Repositioning , Databases, Pharmaceutical
2.
BMC Bioinformatics ; 20(1): 653, 2019 Dec 11.
Article in English | MEDLINE | ID: mdl-31829175

ABSTRACT

BACKGROUND: Computational compound repositioning has the potential for identifying new uses for existing drugs, and new algorithms and data source aggregation strategies provide ever-improving results via in silico metrics. However, even with these advances, the number of compounds successfully repositioned via computational screening remains low. New strategies for algorithm evaluation that more accurately reflect the repositioning potential of a compound could provide a better target for future optimizations. RESULTS: Using a text-mined database, we applied a previously described network-based computational repositioning algorithm, yielding strong results via cross-validation, averaging 0.95 AUROC on test-set indications. However, to better approximate a real-world scenario, we built a time-resolved evaluation framework. At various time points, we built networks corresponding to prior knowledge for use as a training set, and then predicted on a test set comprised of indications that were subsequently described. This framework showed a marked reduction in performance, peaking in performance metrics with the 1985 network at an AUROC of 0.797. Examining performance reductions due to removal of specific types of relationships highlighted the importance of drug-drug and disease-disease similarity metrics. Using data from future timepoints, we demonstrate that further acquisition of these kinds of data may help improve computational results. CONCLUSIONS: Evaluating a repositioning algorithm using indications unknown to the input network better tunes its ability to find emerging drug indications, rather than finding those which have been randomly withheld. Focusing efforts on improving algorithmic performance in a time-resolved paradigm may further improve computational repositioning predictions.
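
Reader's note on the evaluation design described in the abstract above: a time-resolved split can be sketched as keeping only the relationships known up to a cutoff year as the training network, and holding out the indications first described after that year as the test set. The code below is a minimal sketch under that reading. The `Edge` type, the `TREATS` relation label, the node names, and the years are hypothetical and are not drawn from the paper's data or implementation.

```python
from typing import List, NamedTuple, Tuple


class Edge(NamedTuple):
    source: str
    relation: str
    target: str
    year: int  # year the relationship was first described (assumed to be available)


def time_resolved_split(
    edges: List[Edge], cutoff_year: int, indication_relation: str = "TREATS"
) -> Tuple[List[Edge], List[Edge]]:
    """Return (training network, held-out future indications) for one cutoff year."""
    train_network = [e for e in edges if e.year <= cutoff_year]
    test_indications = [
        e for e in edges
        if e.year > cutoff_year and e.relation == indication_relation
    ]
    return train_network, test_indications


# Illustrative edges only.
edges = [
    Edge("compound_a", "TREATS", "disease_x", 1979),
    Edge("compound_a", "BINDS", "gene_1", 1982),
    Edge("gene_1", "ASSOCIATES", "disease_y", 1984),
    Edge("compound_a", "TREATS", "disease_y", 1993),
    Edge("compound_b", "BINDS", "gene_1", 1998),
]

train_network, test_indications = time_resolved_split(edges, cutoff_year=1985)
print(len(train_network), "training edges;", len(test_indications), "future indications")
```

Unlike a random hold-out, no edge dated after the cutoff can leak into the training network, which is the property the abstract contrasts with cross-validation.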


Subject(s)
Computational Biology/methods , Data Mining , Drug Repositioning , Knowledge Bases , Algorithms , Disease , Humans , Machine Learning , Reproducibility of Results , Time Factors
3.
Nucleic Acids Res ; 45(D1): D833-D839, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27924018

ABSTRACT

The information about the genetic basis of human diseases lies at the heart of precision medicine and drug discovery. However, to realize its full potential to support these goals, several problems, such as fragmentation, heterogeneity, availability and different conceptualization of the data must be overcome. To provide the community with a resource free of these hurdles, we have developed DisGeNET (http://www.disgenet.org), one of the largest available collections of genes and variants involved in human diseases. DisGeNET integrates data from expert curated repositories, GWAS catalogues, animal models and the scientific literature. DisGeNET data are homogeneously annotated with controlled vocabularies and community-driven ontologies. Additionally, several original metrics are provided to assist the prioritization of genotype-phenotype relationships. The information is accessible through a web interface, a Cytoscape App, an RDF SPARQL endpoint, scripts in several programming languages and an R package. DisGeNET is a versatile platform that can be used for different research purposes including the investigation of the molecular underpinnings of specific human diseases and their comorbidities, the analysis of the properties of disease genes, the generation of hypotheses on drug therapeutic action and drug adverse effects, the validation of computationally predicted disease genes and the evaluation of the performance of text-mining methods.


Subject(s)
Computational Biology/methods , Databases, Genetic , Genetic Association Studies/methods , Genetic Predisposition to Disease , Genetic Variation , Genomics/methods , Humans , Software , Web Browser
4.
Bioinformatics ; 33(17): 2723-2730, 2017 Sep 01.
Article in English | MEDLINE | ID: mdl-28449114

ABSTRACT

MOTIVATION: Biological data and knowledge bases increasingly rely on Semantic Web technologies and the use of knowledge graphs for data integration, retrieval and federated queries. In recent years, feature learning methods that are applicable to graph-structured data have become available, but have not yet been widely applied and evaluated on structured biological knowledge. RESULTS: We develop a novel method for feature learning on biological knowledge graphs. Our method combines symbolic methods, in particular knowledge representation using symbolic logic and automated reasoning, with neural networks to generate embeddings of nodes that encode related information within knowledge graphs. Through the use of symbolic logic, these embeddings contain both explicit and implicit information. We apply these embeddings to the prediction of edges in the knowledge graph representing problems of function prediction, finding candidate genes of diseases, protein-protein interactions, or drug target relations, and demonstrate performance that matches and sometimes outperforms traditional approaches based on manually crafted features. Our method can be applied to any biological knowledge graph, and will thereby open up the increasing amount of Semantic Web based knowledge bases in biology to use in machine learning and data analytics. AVAILABILITY AND IMPLEMENTATION: https://github.com/bio-ontology-research-group/walking-rdf-and-owl. CONTACT: robert.hoehndorf@kaust.edu.sa. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Computational Biology/methods , Knowledge Bases , Machine Learning , Neural Networks, Computer , Humans
5.
Bioinformatics ; 32(14): 2236-8, 2016 07 15.
Article in English | MEDLINE | ID: mdl-27153650

ABSTRACT

MOTIVATION: DisGeNET-RDF makes available knowledge on the genetic basis of human diseases in the Semantic Web. Gene-disease associations (GDAs) and their provenance metadata are published as human-readable and machine-processable web resources. The information on GDAs included in DisGeNET-RDF is interlinked to other biomedical databases to support the development of bioinformatics approaches for translational research through evidence-based exploitation of a rich and fully interconnected linked open data. AVAILABILITY AND IMPLEMENTATION: http://rdf.disgenet.org/ CONTACT: support@disgenet.org.


Subject(s)
Computational Biology , Disease/genetics , Semantics , Databases, Factual , Humans , Internet
6.
BMC Bioinformatics ; 16: 55, 2015 Feb 21.
Article in English | MEDLINE | ID: mdl-25886734

ABSTRACT

BACKGROUND: Current biomedical research needs to leverage and exploit the large amount of information reported in scientific publications. Automated text mining approaches, in particular those aimed at finding relationships between entities, are key for identification of actionable knowledge from free text repositories. We present the BeFree system aimed at identifying relationships between biomedical entities with a special focus on genes and their associated diseases. RESULTS: By exploiting morpho-syntactic information of the text, BeFree is able to identify gene-disease, drug-disease and drug-target associations with state-of-the-art performance. The application of BeFree to real-case scenarios shows its effectiveness in extracting information relevant for translational research. We show the value of the gene-disease associations extracted by BeFree through a number of analyses and integration with other data sources. BeFree succeeds in identifying genes associated to a major cause of morbidity worldwide, depression, which are not present in other public resources. Moreover, large-scale extraction and analysis of gene-disease associations, and integration with current biomedical knowledge, provided interesting insights on the kind of information that can be found in the literature, and raised challenges regarding data prioritization and curation. We found that only a small proportion of the gene-disease associations discovered by using BeFree is collected in expert-curated databases. Thus, there is a pressing need to find alternative strategies to manual curation, in order to review, prioritize and curate text-mining data and incorporate it into domain-specific databases. We present our strategy for data prioritization and discuss its implications for supporting biomedical research and applications. CONCLUSIONS: BeFree is a novel text mining system that performs competitively for the identification of gene-disease, drug-disease and drug-target associations. 
Our analyses show that mining only a small fraction of MEDLINE results in a large dataset of gene-disease associations, and only a small proportion of this dataset is actually recorded in curated resources (2%), raising several issues on data prioritization and curation. We propose that joint analysis of text mined data with data curated by experts appears as a suitable approach to both assess data quality and highlight novel and interesting information.


Subject(s)
Data Mining/methods , Disease/genetics , Information Storage and Retrieval , MEDLINE , Publications , Translational Research, Biomedical , Databases, Factual , Depression/genetics , Disease/classification , Humans , Knowledge Bases
7.
Sci Rep ; 14(1): 20731, 2024 09 05.
Article in English | MEDLINE | ID: mdl-39237660

ABSTRACT

Congenital Anomalies of the Kidney and Urinary Tract (CAKUT) is the leading cause of childhood chronic kidney failure and a significant cause of chronic kidney disease in adults. Genetic and environmental factors are known to influence CAKUT development, but the currently known disease mechanism remains incomplete. Our goal is to identify affected pathways and networks in CAKUT, and thereby aid in getting a better understanding of its pathophysiology. With this goal, the miRNome, peptidome, and proteome of over 30 amniotic fluid samples of patients with non-severe CAKUT were compared with those of patients with severe CAKUT. These omics data sets were made findable, accessible, interoperable, and reusable (FAIR) to facilitate their integration with external data resources. Furthermore, we analysed and integrated the omics data sets using three different bioinformatics strategies: integrative analysis with mixOmics, joint dimensionality reduction and pathway analysis. The three bioinformatics analyses provided complementary features, but all pointed towards an important role for collagen in CAKUT development and the PI3K-AKT signalling pathway. Additionally, several key genes (CSF1, IGF2, ITGB1, and RAC1) and microRNAs were identified. We published the three analysis strategies as containerized workflows. These workflows can be applied to other FAIR data sets and help gain knowledge on other rare diseases.


Subject(s)
Collagen , Phosphatidylinositol 3-Kinases , Proto-Oncogene Proteins c-akt , Signal Transduction , Humans , Proto-Oncogene Proteins c-akt/metabolism , Proto-Oncogene Proteins c-akt/genetics , Phosphatidylinositol 3-Kinases/metabolism , Phosphatidylinositol 3-Kinases/genetics , Collagen/metabolism , Collagen/genetics , Computational Biology/methods , MicroRNAs/genetics , MicroRNAs/metabolism , Vesico-Ureteral Reflux/genetics , Vesico-Ureteral Reflux/metabolism , Female , Proteome/metabolism , Amniotic Fluid/metabolism , Urinary Tract/metabolism , Multiomics , Urogenital Abnormalities
8.
J Biomed Semantics ; 14(1): 21, 2023 Dec 11.
Article in English | MEDLINE | ID: mdl-38082345

ABSTRACT

BACKGROUND: The FAIR principles recommend the use of controlled vocabularies, such as ontologies, to define data and metadata concepts. Ontologies are currently modelled following different approaches, sometimes describing conflicting definitions of the same concepts, which can affect interoperability. To cope with that, prior literature suggests organising ontologies in levels, where domain specific (low-level) ontologies are grounded in domain independent high-level ontologies (i.e., foundational ontologies). In this level-based organisation, foundational ontologies work as translators of intended meaning, thus improving interoperability. Despite their considerable acceptance in biomedical research, there are very few studies testing foundational ontologies. This paper describes a systematic literature mapping that was conducted to understand how foundational ontologies are used in biomedical research and to find empirical evidence supporting their claimed (dis)advantages. RESULTS: From a set of 79 selected papers, we identified that foundational ontologies are used for several purposes: ontology construction, repair, mapping, and ontology-based data analysis. Foundational ontologies are claimed to improve interoperability, enhance reasoning, speed up ontology development and facilitate maintainability. The complexity of using foundational ontologies is the most commonly cited downside. Despite being used for several purposes, there were hardly any experiments (1 paper) testing the claims for or against the use of foundational ontologies. In the subset of 49 papers that describe the development of an ontology, we observed low adherence to formal methods for ontology construction (16 papers) and ontology evaluation (4 papers). CONCLUSION: Our findings have two main implications. First, the lack of empirical evidence about the use of foundational ontologies indicates a need for evaluating the use of such artefacts in biomedical research.
Second, the low adherence to formal methods illustrates how the field could benefit from a more systematic approach when dealing with the development and evaluation of ontologies. The understanding of how foundational ontologies are used in the biomedical field can drive future research towards the improvement of ontologies and, consequently, data FAIRness. The adoption of formal methods can impact the quality and sustainability of ontologies, and reusing these methods from other fields is encouraged.


Subject(s)
Biological Ontologies , Biomedical Research , Vocabulary, Controlled
9.
J Biomed Semantics ; 13(1): 12, 2022 04 25.
Article in English | MEDLINE | ID: mdl-35468846

ABSTRACT

BACKGROUND: The COVID-19 pandemic has challenged healthcare systems and research worldwide. Data is collected all over the world and needs to be integrated and made available to other researchers quickly. However, the various heterogeneous information systems that are used in hospitals can result in fragmentation of health data over multiple data 'silos' that are not interoperable for analysis. Consequently, clinical observations in hospitalised patients are not prepared to be reused efficiently and timely. There is a need to adapt the research data management in hospitals to make COVID-19 observational patient data machine actionable, i.e. more Findable, Accessible, Interoperable and Reusable (FAIR) for humans and machines. We therefore applied the FAIR principles in the hospital to make patient data more FAIR. RESULTS: In this paper, we present our FAIR approach to transform COVID-19 observational patient data collected in the hospital into machine actionable digital objects to answer medical doctors' research questions. With this objective, we conducted a coordinated FAIRification among stakeholders based on ontological models for data and metadata, and a FAIR based architecture that complements the existing data management. We applied FAIR Data Points for metadata exposure, turning investigational parameters into a FAIR dataset. We demonstrated that this dataset is machine actionable by means of three different computational activities: federated query of patient data along open existing knowledge sources across the world through the Semantic Web, implementing Web APIs for data query interoperability, and building applications on top of these FAIR patient data for FAIR data analytics in the hospital. 
CONCLUSIONS: Our work demonstrates that a FAIR research data management plan based on ontological models for data and metadata, open Science, Semantic Web technologies, and FAIR Data Points is providing data infrastructure in the hospital for machine actionable FAIR Digital Objects. This FAIR data is prepared to be reused for federated analysis, linkable to other FAIR data such as Linked Open Data, and reusable to develop software applications on top of them for hypothesis generation and knowledge discovery.


Subject(s)
COVID-19 , Pandemics , COVID-19/epidemiology , Hospitals , Humans , Metadata , Semantic Web
10.
J Biomed Semantics ; 13(1): 9, 2022 03 15.
Article in English | MEDLINE | ID: mdl-35292119

ABSTRACT

BACKGROUND: The European Platform on Rare Disease Registration (EU RD Platform) aims to address the fragmentation of European rare disease (RD) patient data, scattered among hundreds of independent and non-coordinating registries, by establishing standards for integration and interoperability. The first practical output of this effort was a set of 16 Common Data Elements (CDEs) that should be implemented by all RD registries. Interoperability, however, requires decisions beyond data elements - including data models, formats, and semantics. Within the European Joint Programme on Rare Diseases (EJP RD), we aim to further the goals of the EU RD Platform by generating reusable RD semantic model templates that follow the FAIR Data Principles. RESULTS: Through a team-based iterative approach, we created semantically grounded models to represent each of the CDEs, using the SemanticScience Integrated Ontology as the core framework for representing the entities and their relationships. Within that framework, we mapped the concepts represented in the CDEs, and their possible values, into domain ontologies such as the Orphanet Rare Disease Ontology, Human Phenotype Ontology and National Cancer Institute Thesaurus. Finally, we created an exemplar, reusable ETL pipeline that we will be deploying over these non-coordinating data repositories to assist them in creating model-compliant FAIR data without requiring site-specific coding nor expertise in Linked Data or FAIR. CONCLUSIONS: Within the EJP RD project, we determined that creating reusable, expert-designed templates reduced or eliminated the requirement for our participating biomedical domain experts and rare disease data hosts to understand OWL semantics. This enabled them to publish highly expressive FAIR data using tools and approaches that were already familiar to them.


Subject(s)
Common Data Elements , Rare Diseases , Humans , Registries , Semantics , Workflow
11.
Database (Oxford) ; 2022, 2022 05 25.
Article in English | MEDLINE | ID: mdl-35616100

ABSTRACT

Despite progress in the development of standards for describing and exchanging scientific information, the lack of easy-to-use standards for mapping between different representations of the same or similar objects in different databases poses a major impediment to data integration and interoperability. Mappings often lack the metadata needed to be correctly interpreted and applied. For example, are two terms equivalent or merely related? Are they narrow or broad matches? Or are they associated in some other way? Such relationships between the mapped terms are often not documented, which leads to incorrect assumptions and makes them hard to use in scenarios that require a high degree of precision (such as diagnostics or risk prediction). Furthermore, the lack of descriptions of how mappings were done makes it hard to combine and reconcile mappings, particularly curated and automated ones. We have developed the Simple Standard for Sharing Ontological Mappings (SSSOM) which addresses these problems by: (i) Introducing a machine-readable and extensible vocabulary to describe metadata that makes imprecision, inaccuracy and incompleteness in mappings explicit. (ii) Defining an easy-to-use simple table-based format that can be integrated into existing data science pipelines without the need to parse or query ontologies, and that integrates seamlessly with Linked Data principles. (iii) Implementing open and community-driven collaborative workflows that are designed to evolve the standard continuously to address changing requirements and mapping practices. (iv) Providing reference tools and software libraries for working with the standard. In this paper, we present the SSSOM standard, describe several use cases in detail and survey some of the existing work on standardizing the exchange of mappings, with the goal of making mappings Findable, Accessible, Interoperable and Reusable (FAIR). The SSSOM specification can be found at http://w3id.org/sssom/spec. 
Database URL: http://w3id.org/sssom/spec.
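
Reader's note on the table-based format described in the abstract above: because each mapping is a row with an explicit predicate and justification, a mapping set can be filtered with ordinary tabular tooling, without parsing an ontology. The sketch below assumes a tab-separated file with the columns `subject_id`, `predicate_id`, `object_id` and `mapping_justification` and with metadata lines prefixed by `#`; that layout is an assumption based on a general reading of the SSSOM specification and should be checked against it. All identifiers in the sample rows are made up.

```python
import csv
from typing import Dict, List

# Illustrative mapping set; the "A:" and "B:" identifiers are hypothetical.
SAMPLE_TSV = (
    "# mapping_set_id: https://example.org/mappings/demo\n"
    "subject_id\tpredicate_id\tobject_id\tmapping_justification\n"
    "A:001\tskos:exactMatch\tB:101\tsemapv:ManualMappingCuration\n"
    "A:002\tskos:broadMatch\tB:102\tsemapv:LexicalMatching\n"
    "A:003\tskos:relatedMatch\tB:103\tsemapv:LexicalMatching\n"
)


def read_mappings(text: str) -> List[Dict[str, str]]:
    """Parse tab-separated mapping rows, skipping blank and '#'-prefixed metadata lines."""
    data_lines = [
        line for line in text.splitlines()
        if line.strip() and not line.startswith("#")
    ]
    return list(csv.DictReader(data_lines, delimiter="\t"))


rows = read_mappings(SAMPLE_TSV)

# Keep only mappings asserted as exact, e.g. for a use case that needs high precision.
exact = [row for row in rows if row["predicate_id"] == "skos:exactMatch"]
print(len(rows), "mappings read;", len(exact), "exact match(es)")
```

Keeping the predicate on every row is what lets a consumer separate equivalent terms from broader or merely related ones, which is the ambiguity the abstract identifies in undocumented mappings.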


Subject(s)
Metadata , Semantic Web , Data Management , Databases, Factual , Workflow
12.
Stud Health Technol Inform ; 279: 144-146, 2021 May 07.
Article in English | MEDLINE | ID: mdl-33965931

ABSTRACT

BACKGROUND: Integration of heterogeneous resources is key for Rare Disease research. Within the EJP RD, common Application Programming Interface specifications are proposed for discovery of resources and data records. This is not sufficient for automated processing between RD resources and meeting the FAIR principles. OBJECTIVE: To design a solution to improve FAIR for machines for the EJP RD API specification. METHODS: A FAIR Data Point is used to expose machine-actionable metadata of digital resources and it is configured to store its content to a semantic database to be FAIR at the source. RESULTS: A solution was designed based on grlc server as middleware to implement the EJP RD API specification on top of the FDP. CONCLUSION: grlc reduces potential API implementation overhead faced by maintainers who use FAIR at the source.


Subject(s)
Rare Diseases , Software , Databases, Factual , Humans , Internet , Metadata , Semantics
13.
Eur Biophys J ; 39(11): 1471-5, 2010 Oct.
Article in English | MEDLINE | ID: mdl-20364341

ABSTRACT

Representative crystal structures of the ligand-binding domain for the majority of nuclear receptors are currently available. A systematic comparative analysis of these structures identified an energetically favorable cation-π interaction that involves an amino acid located at the extreme C-terminal end and appears to form only in the agonist conformation of the estrogen receptor α, glucocorticoid, mineralocorticoid, progesterone, and androgen receptors. It is postulated that this cation-π interaction is used by members of the estrogen-like subfamily to provide additional stabilization to the transcriptional active conformation upon ligand binding.


Subject(s)
Computational Biology , Estrogens/metabolism , Receptors, Cytoplasmic and Nuclear/agonists , Receptors, Cytoplasmic and Nuclear/chemistry , Amino Acid Sequence , Animals , Humans , Ligands , Models, Molecular , Molecular Sequence Data , Protein Stability , Protein Structure, Tertiary , Rats , Receptors, Cytoplasmic and Nuclear/metabolism , Thermodynamics
14.
Genomics Inform ; 18(2): e17, 2020 Jun.
Article in English | MEDLINE | ID: mdl-32634871

ABSTRACT

The amount of content on social media platforms such as Twitter is expanding rapidly. Simultaneously, the lack of patient information seriously hinders the diagnosis and treatment of rare/intractable diseases. However, these patient communities are especially active on social media. Data from social media could serve as a source of patient-centric knowledge for these diseases complementary to the information collected in clinical settings and patient registries, and may also have potential for research use. To explore this question, we attempted to extract patient-centric knowledge from social media as a task for the 3-day Biomedical Linked Annotation Hackathon 6 (BLAH6). We selected amyotrophic lateral sclerosis and multiple sclerosis as use cases of rare and intractable diseases, respectively, and we extracted patient histories related to these health conditions from Twitter. Four diagnosed patients for each disease were selected. From the user timelines of these eight patients, we extracted tweets that might be related to health conditions. Based on our experiment, we show that our approach has considerable potential, although we identified problems that should be addressed in future attempts to mine information about rare/intractable diseases from Twitter.

15.
Database (Oxford) ; 2020, 2020 01 01.
Article in English | MEDLINE | ID: mdl-32283553

ABSTRACT

Hypothesis generation is a critical step in research and a cornerstone in the rare disease field. Research is most efficient when those hypotheses are based on the entirety of knowledge known to date. Systematic review articles are commonly used in biomedicine to summarize existing knowledge and contextualize experimental data. But the information contained within review articles is typically only expressed as free-text, which is difficult to use computationally. Researchers struggle to navigate, collect and remix prior knowledge as it is scattered in several silos without seamless integration and access. This lack of a structured information framework hinders research by both experimental and computational scientists. To better organize knowledge and data, we built a structured review article that is specifically focused on NGLY1 Deficiency, an ultra-rare genetic disease first reported in 2012. We represented this structured review as a knowledge graph and then stored this knowledge graph in a Neo4j database to simplify dissemination, querying and visualization of the network. Relative to free-text, this structured review better promotes the principles of findability, accessibility, interoperability and reusability (FAIR). In collaboration with domain experts in NGLY1 Deficiency, we demonstrate how this resource can improve the efficiency and comprehensiveness of hypothesis generation. We also developed a read-write interface that allows domain experts to contribute FAIR structured knowledge to this community resource. In contrast to traditional free-text review articles, this structured review exists as a living knowledge graph that is curated by humans and accessible to computational analyses. Finally, we have generalized this workflow into modular and repurposable components that can be applied to other domain areas. This NGLY1 Deficiency-focused network is publicly available at http://ngly1graph.org/. 
AVAILABILITY AND IMPLEMENTATION: Database URL: http://ngly1graph.org/. Network data files are at: https://github.com/SuLab/ngly1-graph and source code at: https://github.com/SuLab/bioknowledge-reviewer. CONTACT: asu@scripps.edu.


Subject(s)
Biomedical Research/methods , Computational Biology/methods , Databases, Factual , Knowledge Bases , Animals , Biomedical Research/statistics & numerical data , Computational Biology/statistics & numerical data , Congenital Disorders of Glycosylation/genetics , Congenital Disorders of Glycosylation/metabolism , Data Curation/methods , Data Mining/methods , Humans , Internet , Peptide-N4-(N-acetyl-beta-glucosaminyl) Asparagine Amidase/deficiency , Peptide-N4-(N-acetyl-beta-glucosaminyl) Asparagine Amidase/genetics , Peptide-N4-(N-acetyl-beta-glucosaminyl) Asparagine Amidase/metabolism , Systematic Reviews as Topic
16.
Elife ; 9, 2020 03 17.
Article in English | MEDLINE | ID: mdl-32180547

ABSTRACT

Wikidata is a community-maintained knowledge base that has been assembled from repositories in the fields of genomics, proteomics, genetic variants, pathways, chemical compounds, and diseases, and that adheres to the FAIR principles of findability, accessibility, interoperability and reusability. Here we describe the breadth and depth of the biomedical knowledge contained within Wikidata, and discuss the open-source tools we have built to add information to Wikidata and to synchronize it with source databases. We also demonstrate several use cases for Wikidata, including the crowdsourced curation of biomedical ontologies, phenotype-based diagnosis of disease, and drug repurposing.


Subject(s)
Biological Science Disciplines , Computational Biology , Databases, Factual , Genomics , Proteomics , Humans , Pattern Recognition, Automated
18.
Biomed Res Int ; 2017: 8327980, 2017.
Article in English | MEDLINE | ID: mdl-29214177

ABSTRACT

Patient registries are an essential tool to increase current knowledge regarding rare diseases. Understanding these data is a vital step to improve patient treatments and to create the most adequate tools for personalized medicine. However, the growing number of disease-specific patient registries also brings new technical challenges. Usually, these systems are developed as closed data silos, with independent formats and models, lacking comprehensive mechanisms to enable data sharing. To tackle these challenges, we developed a Semantic Web based solution that allows connecting distributed and heterogeneous registries, enabling the federation of knowledge between multiple independent environments. This semantic layer creates a holistic view over a set of anonymised registries, supporting semantic data representation, integrated access, and querying. The implemented system gave us the opportunity to answer challenging questions across dispersed rare disease patient registries. The interconnection between those registries using Semantic Web technologies benefits our final solution in a way that we can query single or multiple instances according to our needs. The outcome is a unique semantic layer, connecting miscellaneous registries and delivering a lightweight holistic perspective over the wealth of knowledge stemming from linked rare disease patient registries.


Subject(s)
Database Management Systems/statistics & numerical data , Information Storage and Retrieval/statistics & numerical data , Rare Diseases/epidemiology , Registries/statistics & numerical data , Semantic Web/statistics & numerical data , Computational Biology/methods , Databases, Factual/statistics & numerical data , Humans , Information Dissemination/methods , Internet/statistics & numerical data , Software/statistics & numerical data
20.
Database (Oxford) ; 2015: bav028, 2015.
Article in English | MEDLINE | ID: mdl-25877637

ABSTRACT

DisGeNET is a comprehensive discovery platform designed to address a variety of questions concerning the genetic underpinning of human diseases. DisGeNET contains over 380,000 associations between >16,000 genes and 13,000 diseases, which makes it one of the largest repositories currently available of its kind. DisGeNET integrates expert-curated databases with text-mined data, covers information on Mendelian and complex diseases, and includes data from animal disease models. It features a score based on the supporting evidence to prioritize gene-disease associations. It is an open access resource available through a web interface, a Cytoscape plugin and as a Semantic Web resource. The web interface supports user-friendly data exploration and navigation. DisGeNET data can also be analysed via the DisGeNET Cytoscape plugin, and enriched with the annotations of other plugins of this popular network analysis software suite. Finally, the information contained in DisGeNET can be expanded and complemented using Semantic Web technologies and linked to a variety of resources already present in the Linked Data cloud. Hence, DisGeNET offers one of the most comprehensive collections of human gene-disease associations and a valuable set of tools for investigating the molecular mechanisms underlying diseases of genetic origin, designed to fulfill the needs of different user profiles, including bioinformaticians, biologists and health-care practitioners. Database URL: http://www.disgenet.org/


Subject(s)
Databases, Genetic , Gene Regulatory Networks , Genetic Diseases, Inborn/genetics , Genome, Human , Internet , User-Computer Interface , Animals , Cloud Computing , Disease Models, Animal , Humans