Results 1 - 20 of 45
1.
BMC Med Inform Decis Mak ; 24(1): 184, 2024 Jun 27.
Article in English | MEDLINE | ID: mdl-38937817

ABSTRACT

An ever-increasing amount of data on a person's daily functioning is being collected, and it holds information that could revolutionize person-centered healthcare. However, the full potential of data on daily functioning cannot yet be exploited, as these data are mostly stored in an unstructured and inaccessible manner. Integrating these data, and thereby expediting knowledge discovery, becomes possible by introducing functionomics as a complementary 'omics' initiative that embraces the advances in data science. Functionomics is the study of high-throughput data on a person's daily functioning, which can be operationalized with the International Classification of Functioning, Disability and Health (ICF). A prerequisite for making functionomics operational is adherence to the FAIR (Findable, Accessible, Interoperable, and Reusable) principles. This paper illustrates a step-by-step application of the FAIR principles to make functionomics data machine readable and accessible, under strictly certified conditions, in a practical example. Establishing more FAIR functionomics data repositories, analyzed using a federated data infrastructure, enables new knowledge generation to improve health and person-centered healthcare. Together, as one allied health and healthcare research community, we should consider taking up the methods proposed here.


Subject(s)
Activities of Daily Living , Humans , Patient-Centered Care , International Classification of Functioning, Disability and Health
2.
Brief Bioinform ; 20(2): 540-550, 2019 03 22.
Article in English | MEDLINE | ID: mdl-28968694

ABSTRACT

This review provides a historical overview of the inception and development of bioinformatics research in the Netherlands. Rooted in theoretical biology by foundational figures such as Paulien Hogeweg (at Utrecht University since the 1970s), the developments leading to the organizational structures that support a relatively large Dutch bioinformatics community are reviewed. We show that the most valuable resource built over these years is the close-knit national expert community, which is well engaged in basic and translational life science research programmes. The Dutch bioinformatics community is accustomed to facing an ever-changing landscape of data challenges and to working towards solutions together. In addition, this community is the stable factor on the road towards sustainability, especially in times when existing funding models are challenged and change rapidly.


Subject(s)
Community Networks , Computational Biology/methods , Computational Biology/organization & administration , Sequence Analysis, DNA/standards , Translational Research, Biomedical , Humans , Netherlands
5.
BMC Bioinformatics ; 15 Suppl 1: S2, 2014.
Article in English | MEDLINE | ID: mdl-24564249

ABSTRACT

Many efforts exist to design and implement approaches and tools for data capture, integration and analysis in the life sciences. The challenges lie not only in the heterogeneity, size and distribution of information sources, but also in the danger of producing too many solutions for the same problem. Methodological, technological, infrastructural and social aspects all appear essential for developing a new generation of best practices and tools. In this paper, we analyse and discuss these aspects from different perspectives, extending some of the ideas that arose during the NETTAB 2012 Workshop, with particular reference to the European context. First, the relevance of using data and software models for the management and analysis of biological data is stressed. Second, some of the most relevant community achievements of recent years, which should be taken as a starting point for future efforts in this research domain, are presented. Third, some of the main outstanding issues, challenges and trends are analysed. Special attention is paid to the challenges arising from the tendency to fund and create large-scale international research infrastructures and public-private partnerships in order to address the complex challenges of data-intensive science. The needs and opportunities of Genomic Computing (the integration, search and display of genomic information at a very specific level, e.g. at the level of a single DNA region) are then considered. In the current data- and network-driven era, social aspects can become crucial bottlenecks. We then discuss how these may best be tackled to unleash the technical capabilities for effective data integration and validation. In particular, the apparent lack of incentives for already overwhelmed researchers appears to limit the sharing of information and knowledge with other scientists.
We also point out how the bioinformatics market is growing at unprecedented speed, owing to the impact that powerful new in silico analyses promise to have on better diagnosis, prognosis, drug discovery and treatment, moving towards personalized medicine. Finally, we discuss an open business model for bioinformatics, which appears able to reduce undue duplication of effort and to support the increased reuse of valuable data sets, tools and platforms.


Subject(s)
Computational Biology/methods , Software , Algorithms , Animals , Computational Biology/trends , Cooperative Behavior , Genome , Genomics , Humans
6.
Hum Mutat ; 34(4): 661-6, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23401191

ABSTRACT

A forum of the Human Variome Project (HVP) was held as a satellite to the 2012 Annual Meeting of the American Society of Human Genetics in San Francisco, California. The theme of this meeting was "Getting Ready for the Human Phenome Project." Understanding the genetic contribution to both rare single-gene "Mendelian" disorders and more complex common diseases will require integration of research efforts among many fields and better defined phenotypes. The HVP is dedicated to bringing together researchers and research populations throughout the world to provide the resources to investigate the impact of genetic variation on disease. To this end, there needs to be a greater sharing of phenotype and genotype data. For this to occur, many databases that currently exist will need to become interoperable to allow for the combining of cohorts with similar phenotypes to increase statistical power for studies attempting to identify novel disease genes or causative genetic variants. Improved systems and tools that enhance the collection of phenotype data from clinicians are urgently needed. This meeting begins the HVP's effort toward this important goal.


Subject(s)
Databases, Genetic , Human Genome Project , Phenotype , Computational Biology , Humans
7.
J Biomed Semantics ; 14(1): 21, 2023 Dec 11.
Article in English | MEDLINE | ID: mdl-38082345

ABSTRACT

BACKGROUND: The FAIR principles recommend the use of controlled vocabularies, such as ontologies, to define data and metadata concepts. Ontologies are currently modelled following different approaches, sometimes describing conflicting definitions of the same concepts, which can affect interoperability. To cope with this, prior literature suggests organising ontologies in levels, where domain-specific (low-level) ontologies are grounded in domain-independent high-level ontologies (i.e., foundational ontologies). In this level-based organisation, foundational ontologies work as translators of intended meaning, thus improving interoperability. Despite their considerable acceptance in biomedical research, there are very few studies testing foundational ontologies. This paper describes a systematic literature mapping that was conducted to understand how foundational ontologies are used in biomedical research and to find empirical evidence supporting their claimed (dis)advantages. RESULTS: From a set of 79 selected papers, we identified that foundational ontologies are used for several purposes: ontology construction, repair, mapping, and ontology-based data analysis. Foundational ontologies are claimed to improve interoperability, enhance reasoning, speed up ontology development and facilitate maintainability. The complexity of using foundational ontologies is the most commonly cited downside. Despite this range of uses, there were hardly any experiments (a single paper) testing the claims for or against the use of foundational ontologies. In the subset of 49 papers that describe the development of an ontology, we observed low adherence to formal methods for ontology construction (16 papers) and for ontology evaluation (4 papers). CONCLUSION: Our findings have two main implications. First, the lack of empirical evidence about the use of foundational ontologies indicates a need for evaluating the use of such artefacts in biomedical research. 
Second, the low adherence to formal methods illustrates how the field could benefit from a more systematic approach when dealing with the development and evaluation of ontologies. The understanding of how foundational ontologies are used in the biomedical field can drive future research towards the improvement of ontologies and, consequently, data FAIRness. The adoption of formal methods can impact the quality and sustainability of ontologies, and reusing these methods from other fields is encouraged.


Subject(s)
Biological Ontologies , Biomedical Research , Vocabulary, Controlled
8.
Hum Mutat ; 33(11): 1503-12, 2012 Nov.
Article in English | MEDLINE | ID: mdl-22736453

ABSTRACT

The advances in bioinformatics required to annotate human genomic variants and to place them in public data repositories have not kept pace with their discovery. Moreover, a law of diminishing returns has begun to operate both in terms of data publication and submission. Although the continued deposition of such data in the public domain is essential to maximize both their scientific and clinical utility, rewards for data sharing are few, representing a serious practical impediment to data submission. To date, two main strategies have been adopted as a means to encourage the submission of human genomic variant data: (1) database journal linkups involving the affiliation of a scientific journal with a publicly available database and (2) microattribution, involving the unambiguous linkage of data to their contributors via a unique identifier. The latter could in principle lead to the establishment of a microcitation-tracking system that acknowledges individual endeavor and achievement. Both approaches could incentivize potential data contributors, thereby encouraging them to share their data with the scientific community. Here, we summarize and critically evaluate approaches that have been proposed to address current deficiencies in data attribution and discuss ways in which they could become more widely adopted as novel scientific publication modalities.


Subject(s)
Genetic Variation , Genome, Human , Publishing , Computational Biology , Data Collection , Databases, Genetic , Humans , Peer Review, Research
9.
Hum Mutat ; 33(10): 1494-6, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22623360

ABSTRACT

The joint Open PHACTS/GEN2PHEN workshop on "Solving Bottlenecks in Data Sharing in the Life Sciences" was held in Volendam, the Netherlands, on September 19 and 20, 2011, and was attended by representatives from academia, industry, publishing, and funding agencies. The aim of the workshop was to explore the issues that influence the extent to which data in the life sciences are shared, and to explore sustainability scenarios that would enable and promote "open" data sharing. Several key challenges were identified and solutions to each of these were proposed.


Subject(s)
Biological Science Disciplines/organization & administration , Information Dissemination , Biological Science Disciplines/legislation & jurisprudence , Humans
10.
Front Big Data ; 5: 883341, 2022.
Article in English | MEDLINE | ID: mdl-35647536

ABSTRACT

Although all the technical components supporting fully orchestrated Digital Twins (DT) currently exist, what remains missing is a conceptual clarification and analysis of a more generalized concept of a DT that is made FAIR, that is, universally machine actionable. This methodological overview is a first step toward this clarification. We present a review of previously developed semantic artifacts and how they may be used to compose a higher-order data model referred to here as a FAIR Digital Twin (FDT). We propose an architectural design to compose, store and reuse FDTs supporting data intensive research, with emphasis on privacy by design and their use in GDPR compliant open science.

11.
J Biomed Semantics ; 13(1): 12, 2022 04 25.
Article in English | MEDLINE | ID: mdl-35468846

ABSTRACT

BACKGROUND: The COVID-19 pandemic has challenged healthcare systems and research worldwide. Data are collected all over the world and need to be integrated and made available to other researchers quickly. However, the heterogeneous information systems used in hospitals can fragment health data over multiple data 'silos' that are not interoperable for analysis. Consequently, clinical observations in hospitalised patients cannot be reused efficiently or in a timely manner. There is a need to adapt research data management in hospitals to make COVID-19 observational patient data machine actionable, i.e. more Findable, Accessible, Interoperable and Reusable (FAIR) for humans and machines. We therefore applied the FAIR principles in the hospital to make patient data more FAIR. RESULTS: In this paper, we present our FAIR approach to transform COVID-19 observational patient data collected in the hospital into machine-actionable digital objects that answer medical doctors' research questions. With this objective, we conducted a coordinated FAIRification among stakeholders, based on ontological models for data and metadata and on a FAIR-based architecture that complements the existing data management. We applied FAIR Data Points for metadata exposure, turning investigational parameters into a FAIR dataset. We demonstrated that this dataset is machine actionable by means of three different computational activities: federated querying of patient data along open existing knowledge sources across the world through the Semantic Web, implementing Web APIs for data query interoperability, and building applications on top of these FAIR patient data for FAIR data analytics in the hospital. 
CONCLUSIONS: Our work demonstrates that a FAIR research data management plan based on ontological models for data and metadata, open science, Semantic Web technologies, and FAIR Data Points provides a data infrastructure in the hospital for machine-actionable FAIR Digital Objects. These FAIR data are prepared to be reused for federated analysis, linkable to other FAIR data such as Linked Open Data, and reusable for developing software applications on top of them for hypothesis generation and knowledge discovery.
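The federated query activity described above can be illustrated with a sketch of a SPARQL 1.1 query that uses the standard SERVICE keyword to reach a remote endpoint from within a query. The endpoint URL and all class and property IRIs below are hypothetical placeholders, not those used in the paper.

```python
from textwrap import dedent

def federated_patient_query(external_endpoint):
    """Sketch of a federated SPARQL 1.1 query: local patient observations
    are joined with an open knowledge source through a SERVICE block.
    All IRIs are hypothetical placeholders."""
    return dedent(f"""\
        SELECT ?patient ?observation ?label
        WHERE {{
          ?patient a <http://example.org/Covid19Patient> ;
                   <http://example.org/hasObservation> ?observation .
          SERVICE <{external_endpoint}> {{
            ?observation <http://www.w3.org/2000/01/rdf-schema#label> ?label .
          }}
        }}""")

query = federated_patient_query("https://open-knowledge.example/sparql")
```

Only the triple patterns inside the SERVICE block are evaluated at the remote endpoint; the rest run against the local (hospital) FAIR endpoint, so patient data never has to be copied out to be linked with open knowledge sources.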


Subject(s)
COVID-19 , Pandemics , COVID-19/epidemiology , Hospitals , Humans , Metadata , Semantic Web
13.
Proteomics ; 11(5): 843-53, 2011 Mar.
Article in English | MEDLINE | ID: mdl-21280221

ABSTRACT

We introduce a framework for predicting novel protein-protein interactions (PPIs), based on Fisher's method for combining probabilities of predictions derived from different data sources, such as the biomedical literature, protein domain and mRNA expression information. Our method compares favorably to our previous method based on text mining alone and to other methods such as STRING. We evaluated our algorithms through the prediction of experimentally found protein interactions underlying Muscular Dystrophy, Huntington's Disease and Polycystic Kidney Disease, which had not yet been recorded in protein-protein interaction databases. We found a 1.74-fold increase in the mean average prediction precision for dysferlin and a 3.09-fold increase for huntingtin when compared to STRING. The top 10 predicted interaction partners of huntingtin were analysed in depth: five had been identified previously, and the other five were new potential interaction partners. The full matrix of human protein pairs and their prediction scores is available for download. Our framework can be extended to predict other types of relationships, such as proteins in a complex, pathway or related disease mechanisms.
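The combination step in the framework above rests on Fisher's method for pooling independent p-values. A minimal standard-library sketch (not the authors' code): for k p-values, the statistic -2·Σ ln p follows a chi-square distribution with 2k degrees of freedom, whose tail probability has a closed form for even degrees of freedom.

```python
import math

def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square
    distribution with 2k degrees of freedom under the null.
    """
    k = len(p_values)
    x = -2.0 * sum(math.log(p) for p in p_values)
    # For even degrees of freedom 2k, the chi-square survival function
    # is exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total

# Two moderately significant evidence sources combine into stronger evidence:
combined = fisher_combine([0.04, 0.03])
```

Combining 0.04 and 0.03 yields a combined p-value well below either input, which is the behaviour such a framework exploits when pooling literature, domain and expression evidence.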


Subject(s)
Huntington Disease/metabolism , Membrane Proteins/metabolism , Muscle Proteins/metabolism , Muscular Dystrophies/metabolism , Nerve Tissue Proteins/metabolism , Nuclear Proteins/metabolism , Polycystic Kidney Diseases/metabolism , Protein Interaction Mapping/methods , Algorithms , Animals , Computational Biology/methods , Databases, Protein , Drosophila , Dysferlin , Gene Expression , Humans , Huntingtin Protein , Huntington Disease/genetics , Membrane Proteins/genetics , Mice , Molecular Targeted Therapy , Muscle Proteins/genetics , Muscular Dystrophies/genetics , Nerve Tissue Proteins/genetics , Nuclear Proteins/genetics , Polycystic Kidney Diseases/genetics , Predictive Value of Tests , Probability , Protein Binding , Protein Structure, Tertiary , RNA, Messenger
14.
JAMIA Open ; 3(3): 472-486, 2020 Oct.
Article in English | MEDLINE | ID: mdl-33426479

ABSTRACT

The premise of Open Science is that research and medical management will progress faster if data and knowledge are openly shared. The value of Open Science is nowhere more important and appreciated than in the rare disease (RD) community. Research into RDs has been limited by insufficient patient data and resources, a paucity of trained disease experts, and lack of therapeutics, leading to long delays in diagnosis and treatment. These issues can be ameliorated by following the principles and practices of sharing that are intrinsic to Open Science. Here, we describe how the RD community has adopted the core pillars of Open Science, adding new initiatives to promote care and research for RD patients and, ultimately, for all of medicine. We also present recommendations that can advance Open Science more globally.

15.
BMC Bioinformatics ; 9: 291, 2008 Jun 24.
Article in English | MEDLINE | ID: mdl-18577208

ABSTRACT

BACKGROUND: Comparative analysis of expression microarray studies is difficult due to the large influence of technical factors on experimental outcome. Still, the identified differentially expressed genes may hint at the same biological processes. However, manually curated assignment of genes to biological processes, such as pursued by the Gene Ontology (GO) consortium, is incomplete and limited. We hypothesised that automatic association of genes with biological processes through thesaurus-controlled mining of Medline abstracts would be more effective. Therefore, we developed a novel algorithm (LAMA: Literature-Aided Meta-Analysis) to quantify the similarity between transcriptomics studies. We evaluated our algorithm on a large compendium of 102 microarray studies published in the field of muscle development and disease, and compared it to similarity measures based on gene overlap and over-representation of biological processes assigned by GO. RESULTS: While the overlap in both genes and overrepresented GO-terms was poor, LAMA retrieved many more biologically meaningful links between studies, with substantially lower influence of technical factors. LAMA correctly grouped muscular dystrophy, regeneration and myositis studies, and linked patient and corresponding mouse model studies. LAMA also retrieves the connecting biological concepts. Among other new discoveries, we associated cullin proteins, a class of ubiquitinylation proteins, with genes down-regulated during muscle regeneration, whereas ubiquitinylation was previously reported to be activated during the inverse process: muscle atrophy. CONCLUSION: Our literature-based association analysis is capable of finding hidden common biological denominators in microarray studies, and circumvents the need for raw data analysis or curated gene annotation databases.
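The literature-aided comparison of two studies can be pictured as comparing weighted concept vectors. The sketch below scores two studies by the cosine similarity of their concept profiles; the concepts and weights are invented for illustration, and the scoring actually used by LAMA is defined in the paper.

```python
import math

def cosine(profile_a, profile_b):
    """Cosine similarity between two sparse concept profiles,
    each a dict mapping a thesaurus concept to its weight."""
    shared = set(profile_a) & set(profile_b)
    dot = sum(profile_a[c] * profile_b[c] for c in shared)
    norm_a = math.sqrt(sum(w * w for w in profile_a.values()))
    norm_b = math.sqrt(sum(w * w for w in profile_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical concept profiles of two muscle-related studies:
study_1 = {"regeneration": 2.0, "dystrophin": 1.5, "inflammation": 0.5}
study_2 = {"regeneration": 1.8, "inflammation": 0.9, "atrophy": 0.4}
similarity = cosine(study_1, study_2)
```

Because the comparison operates on literature-derived concepts rather than on gene identifiers, two studies can score as similar even when their differentially expressed gene lists barely overlap, which is the key point of the abstract.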


Subject(s)
Meta-Analysis as Topic , Muscle Development , Muscular Diseases , Natural Language Processing , Oligonucleotide Array Sequence Analysis , Animals , Artificial Intelligence , Cluster Analysis , Gene Expression Profiling , Humans , MEDLINE , Models, Animal , Pattern Recognition, Automated/methods , Publications , Reproducibility of Results , Vocabulary, Controlled
16.
BMC Bioinformatics ; 8: 14, 2007 Jan 18.
Article in English | MEDLINE | ID: mdl-17233900

ABSTRACT

BACKGROUND: High-throughput experiments, such as with DNA microarrays, typically result in hundreds of genes potentially relevant to the process under study, rendering the interpretation of these experiments problematic. Here, we propose and evaluate an approach to find functional associations between large numbers of genes and other biomedical concepts from free-text literature. For each gene, a profile of related concepts is constructed that summarizes the context in which the gene is mentioned in literature. We assign a weight to each concept in the profile based on a likelihood ratio measure. Gene concept profiles can then be clustered to find related genes and other concepts. RESULTS: The experimental validation was done in two steps. We first applied our method on a controlled test set. After this proved to be successful the datasets from two DNA microarray experiments were analyzed in the same way and the results were evaluated by domain experts. The first dataset was a gene-expression profile that characterizes the cancer cells of a group of acute myeloid leukemia patients. For this group of patients the biological background of the cancer cells is largely unknown. Using our methodology we found an association of these cells to monocytes, which agreed with other experimental evidence. The second data set consisted of differentially expressed genes following androgen receptor stimulation in a prostate cancer cell line. Based on the analysis we put forward a hypothesis about the biological processes induced in these studied cells: secretory lysosomes are involved in the production of prostatic fluid and their development and/or secretion are androgen-regulated processes. CONCLUSION: Our method can be used to analyze DNA microarray datasets based on information explicitly and implicitly available in the literature. We provide a publicly available tool, dubbed Anni, for this purpose.
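The likelihood-ratio weighting of concepts can be sketched as follows. This is one simple instantiation of the idea, not necessarily the exact measure used for Anni: it compares a concept's frequency in the documents mentioning a gene against its background frequency in the whole corpus. All counts below are invented.

```python
import math

def concept_weight(k, n, K, N):
    """One simple likelihood-ratio-style weight (illustrative, not
    necessarily the paper's exact measure): the log of how much more
    often a concept co-occurs with a gene's documents than expected.

    k: documents about the gene that mention the concept
    n: documents about the gene
    K: documents in the whole corpus that mention the concept
    N: documents in the whole corpus
    """
    if k == 0:
        return 0.0
    return math.log2((k / n) / (K / N))

# Hypothetical counts: the concept appears in 8 of 20 gene documents,
# but in only 500 of 100,000 corpus documents.
w = concept_weight(8, 20, 500, 100_000)
```

A concept that is strongly enriched among a gene's documents receives a large positive weight, so clustering the resulting profiles groups genes that share literature context even when they are never mentioned together.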


Subject(s)
Biomarkers, Tumor/analysis , Gene Expression Profiling/methods , Leukemia, Monocytic, Acute/metabolism , Neoplasm Proteins/analysis , Oligonucleotide Array Sequence Analysis/methods , Prostatic Neoplasms/metabolism , Receptors, Androgen/analysis , Algorithms , Databases, Genetic , Diagnosis, Computer-Assisted/methods , Humans , Information Storage and Retrieval/methods , Leukemia, Monocytic, Acute/diagnosis , Male , Natural Language Processing , Prostatic Neoplasms/diagnosis , Reproducibility of Results , Sensitivity and Specificity
17.
J Biomed Inform ; 40(3): 316-24, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17079192

ABSTRACT

Gene and protein name identification in text requires a dictionary approach to relate synonyms to the same gene or protein, and to link names to external databases. However, existing dictionaries are incomplete. We investigate two complementary methods for automatically generating a comprehensive dictionary: combining information from existing gene and protein databases, and rule-based generation of spelling variations. Both methods have been reported in the literature before, but have hitherto not been combined and evaluated systematically. We combined gene and protein names from several existing databases of four different organisms. The combined dictionaries showed a substantial increase in recall on three different test sets, as compared to any single database. Applying 23 spelling-variation rules to the combined dictionaries further increased recall. However, many rules appeared to have no effect, and some appeared to have a detrimental effect on precision.
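The rule-based generation of spelling variations can be sketched as a set of string rewrites. The three rules below are illustrative stand-ins, not the 23 rules evaluated in the paper.

```python
def spelling_variants(name):
    """Generate spelling variants of a gene/protein name using a few
    illustrative rewrite rules (hyphenation and Greek-letter forms)."""
    variants = {name}
    variants.add(name.replace("-", " "))   # hyphen -> space
    variants.add(name.replace("-", ""))    # drop the hyphen
    # Replace a spelled-out Greek letter with its one-letter form:
    for greek, short in [("alpha", "a"), ("beta", "b"), ("gamma", "g")]:
        if greek in name:
            variants.add(name.replace(greek, short))
    return variants

variants = spelling_variants("TNF-alpha")
```

Each variant inherits the database identifier of the original name, so a match on any variant in text still links back to the same gene record; the abstract's caveat is that some such rules add noisy variants that hurt precision.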


Subject(s)
Dictionaries as Topic , Abstracting and Indexing , Animals , Automation , Computational Biology , Computers , Databases, Genetic , Databases, Protein , Humans , Information Storage and Retrieval , Models, Statistical , Names , Natural Language Processing , Terminology as Topic
18.
Int J Med Inform ; 75(3-4): 257-67, 2006.
Article in English | MEDLINE | ID: mdl-16198618

ABSTRACT

Examples are given of the use of large research databases for knowledge discovery. Such databases are increasingly used for research not only in the 'hard', mathematics-based disciplines such as physics and engineering, but also in 'softer' disciplines such as sociology, psychology and, in general, the humanities. Between the 'hard' and the 'soft' disciplines lie fields such as biomedicine and health care, from which we have selected our illustrations. This latter area can be subdivided into: (1) fundamental biomedical research, related to the 'hard' scientific approach; (2) clinical research, using both 'hard' and 'soft' data; and (3) population-based research, which can be subdivided into prospective and retrospective research. The examples we offer are representative of the use of computers in scientific research in general, and in medical and health informatics in particular.


Subject(s)
Biomedical Research/methods , Database Management Systems , Databases, Factual , Delivery of Health Care , Health Knowledge, Attitudes, Practice , Information Storage and Retrieval/methods , Europe , Medical Records Systems, Computerized , Research Design
19.
Orphanet J Rare Dis ; 11(1): 97, 2016 08 01.
Article in English | MEDLINE | ID: mdl-27476530

ABSTRACT

BACKGROUND: Huntington's disease (HD) is a devastating brain disorder with no effective treatment or cure available. The scarcity of brain tissue makes it hard to study changes in the brain and impossible to perform longitudinal studies. However, peripheral pathology in HD suggests that the disease can be studied using peripheral tissue as a monitoring tool for disease progression and/or the efficacy of novel therapies. In this study, we investigated whether blood can be used to monitor disease severity and progression in the brain. Since previous attempts using only gene expression proved unsuccessful, we compared blood and brain Huntington's disease signatures in a functional context. METHODS: Microarray HD gene expression profiles from three brain regions were compared to the transcriptome of HD blood generated by next-generation sequencing. The comparison was performed with a combination of weighted gene co-expression network analysis and literature-based functional analysis (Concept Profile Analysis). Uniquely, our comparison of blood and brain datasets was based not on the (very limited) gene overlap but on the similarity between the gene annotations in four different semantic categories: "biological process", "cellular component", "molecular function" and "disease or syndrome". RESULTS: We identified signatures in HD blood reflecting a broad pathophysiological spectrum, including alterations in the immune response, sphingolipid biosynthetic processes, lipid transport, cell signaling, protein modification, the spliceosome, RNA splicing, vesicle transport and synaptic transmission. Part of this spectrum was reminiscent of the brain pathology. The HD signatures in the caudate nucleus and BA4 exhibited the highest similarity with blood, irrespective of the category of semantic annotations used. BA9 exhibited an intermediate similarity, while the cerebellum had the least similarity. 
We present two signatures that were shared between blood and brain: immune response and spinocerebellar ataxias. CONCLUSIONS: Our results demonstrate that HD blood exhibits dysregulation that is similar to brain at a functional level, but not necessarily at the level of individual genes. We report two common signatures that can be used to monitor the pathology in brain of HD patients in a non-invasive manner. Our results are an exemplar of how signals in blood data can be used to represent brain disorders. Our methodology can be used to study disease specific signatures in diseases where heterogeneous tissues are involved in the pathology.


Subject(s)
Brain/metabolism , Huntington Disease/blood , Huntington Disease/metabolism , Biomarkers/blood , Biomarkers/metabolism , Brain/pathology , Disease Progression , Gene Expression/genetics , Gene Expression Profiling , High-Throughput Nucleotide Sequencing , Humans , Huntington Disease/genetics
20.
PLoS One ; 11(2): e0149621, 2016.
Article in English | MEDLINE | ID: mdl-26919047

ABSTRACT

High-throughput experimental methods such as medical sequencing and genome-wide association studies (GWAS) identify increasingly large numbers of potential relations between genetic variants and diseases. Both biological complexity (millions of potential gene-disease associations) and the accelerating rate of data production necessitate computational approaches to prioritize and rationalize potential gene-disease relations. Here, we use concept profile technology to expose from the biomedical literature both explicitly stated gene-disease relations (the explicitome) and a much larger set of implied gene-disease associations (the implicitome). Implicit relations are largely unknown to, or even unintended by, the original authors, but they vastly extend the reach of existing biomedical knowledge for the identification and interpretation of gene-disease associations. The implicitome can be used in conjunction with experimental data resources to rationalize both known and novel associations. We demonstrate the usefulness of the implicitome by rationalizing known and novel gene-disease associations, including those from GWAS. To facilitate the reuse of implicit gene-disease associations, we publish our data in compliance with the FAIR Data Publishing recommendations [https://www.force11.org/group/fairgroup] using nanopublications. An online tool (http://knowledge.bio) is available to explore established and potential gene-disease associations in the context of other biomedical relations.


Subject(s)
Computational Biology/methods , Databases, Genetic , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans