1.
J Clin Transl Sci ; 7(1): e214, 2023.
Article in English | MEDLINE | ID: mdl-37900350

ABSTRACT

Knowledge graphs have become a common approach for knowledge representation. Yet, the application of graph methodology is elusive due to the sheer number and complexity of knowledge sources. In addition, semantic incompatibilities hinder efforts to harmonize and integrate across these diverse sources. As part of The Biomedical Translator Consortium, we have developed a knowledge graph-based question-answering system designed to augment human reasoning and accelerate translational scientific discovery: the Translator system. We have applied the Translator system to answer biomedical questions in the context of a broad array of diseases and syndromes, including Fanconi anemia, primary ciliary dyskinesia, multiple sclerosis, and others. A variety of collaborative approaches have been used to research and develop the Translator system. One recent approach involved the establishment of a monthly "Question-of-the-Month (QotM) Challenge" series. Herein, we describe the structure of the QotM Challenge; the six challenges that have been conducted to date on drug-induced liver injury, cannabidiol toxicity, coronavirus infection, diabetes, psoriatic arthritis, and ATP1A3-related phenotypes; the scientific insights that have been gleaned during the challenges; and the technical issues that were identified over the course of the challenges and that can now be addressed to foster further development of the prototype Translator system. We close with a discussion on Large Language Models such as ChatGPT and highlight differences between those models and the Translator system.

2.
Health Informatics J ; 29(2): 14604582231170892, 2023.
Article in English | MEDLINE | ID: mdl-37066514

ABSTRACT

The Integrated Clinical and Environmental Exposures Service (ICEES) provides open regulatory-compliant access to clinical data, including electronic health record data, that have been integrated with environmental exposures data. While ICEES has been validated in the context of an asthma use case and several other use cases, the regulatory constraints on the ICEES open application programming interface (OpenAPI) result in data loss when using the service for multivariate analysis. In this study, we investigated the robustness of the ICEES OpenAPI through a comparative analysis, in which we applied a generalized linear model (GLM) to the OpenAPI data and the constraint-free source data to examine factors predictive of asthma exacerbations. Consistent with previous studies, we found that the main predictors identified by both analyses were sex, prednisone, race, obesity, and airborne particulate exposure. Comparison of GLM model fit revealed that data loss impacts model quality, but only with select interaction terms. We conclude that the ICEES OpenAPI supports multivariate analysis, albeit with potential data loss that users should be aware of.


Subject(s)
Asthma , Electronic Health Records , Humans , Linear Models , Environmental Exposure , Software , Asthma/epidemiology
3.
Clin Transl Sci ; 15(8): 1848-1855, 2022 08.
Article in English | MEDLINE | ID: mdl-36125173

ABSTRACT

Within clinical, biomedical, and translational science, an increasing number of projects are adopting graphs for knowledge representation. Graph-based data models elucidate the interconnectedness among core biomedical concepts, enable data structures to be easily updated, and support intuitive queries, visualizations, and inference algorithms. However, knowledge discovery across these "knowledge graphs" (KGs) has remained difficult. Data set heterogeneity and complexity; the proliferation of ad hoc data formats; poor compliance with guidelines on findability, accessibility, interoperability, and reusability; and, in particular, the lack of a universally accepted, open-access model for standardization across biomedical KGs has left the task of reconciling data sources to downstream consumers. Biolink Model is an open-source data model that can be used to formalize the relationships between data structures in translational science. It incorporates object-oriented classification and graph-oriented features. The core of the model is a set of hierarchical, interconnected classes (or categories) and relationships between them (or predicates) representing biomedical entities such as gene, disease, chemical, anatomic structure, and phenotype. The model provides class and edge attributes and associations that guide how entities should relate to one another. Here, we highlight the need for a standardized data model for KGs, describe Biolink Model, and compare it with other models. We demonstrate the utility of Biolink Model in various initiatives, including the Biomedical Data Translator Consortium and the Monarch Initiative, and show how it has supported easier integration and interoperability of biomedical KGs, bringing together knowledge from multiple sources and helping to realize the goals of translational science.


Subject(s)
Pattern Recognition, Automated , Translational Science, Biomedical , Knowledge
4.
Front Artif Intell ; 5: 918888, 2022.
Article in English | MEDLINE | ID: mdl-35837616

ABSTRACT

Research on rare diseases has received increasing attention, in part due to the realized profitability of orphan drugs. Biomedical informatics holds promise in accelerating translational research on rare disease, yet challenges remain, including the lack of diagnostic codes for rare diseases and privacy concerns that prevent research access to electronic health records when few patients exist. The Integrated Clinical and Environmental Exposures Service (ICEES) provides regulatory-compliant open access to electronic health record data that have been integrated with environmental exposures data, as well as analytic tools to explore the integrated data. We describe a proof-of-concept application of ICEES to examine demographics, clinical characteristics, environmental exposures, and health outcomes among a cohort of patients enriched for phenotypes associated with cystic fibrosis (CF), idiopathic bronchiectasis (IB), and primary ciliary dyskinesia (PCD). We then focus on a subset of patients with CF, leveraging the availability of a diagnostic code for CF and serving as a benchmark for our development work. We use ICEES to examine select demographics, co-diagnoses, and environmental exposures that may contribute to poor health outcomes among patients with CF, defined as emergency department or inpatient visits for respiratory issues. We replicate current understanding of the pathogenesis and clinical manifestations of CF by identifying co-diagnoses of asthma, chronic nasal congestion, cough, middle ear disease, and pneumonia as factors that differentiate patients with poor health outcomes from those with better health outcomes. We conclude by discussing our preliminary findings in relation to other published work, the strengths and limitations of our approach, and our future directions.

5.
Clin Transl Sci ; 2022 May 25.
Article in English | MEDLINE | ID: mdl-35611543

ABSTRACT

Clinical, biomedical, and translational science has reached an inflection point in the breadth and diversity of available data and the potential impact of such data to improve human health and well-being. However, the data are often siloed, disorganized, and not broadly accessible due to discipline-specific differences in terminology and representation. To address these challenges, the Biomedical Data Translator Consortium has developed and tested a pilot knowledge graph-based "Translator" system capable of integrating existing biomedical data sets and "translating" those data into insights intended to augment human reasoning and accelerate translational science. Having demonstrated feasibility of the Translator system, the Translator program has since moved into development, and the Translator Consortium has made significant progress in the research, design, and implementation of an operational system. Herein, we describe the current system's architecture, performance, and quality of results. We apply Translator to several real-world use cases developed in collaboration with subject-matter experts. Finally, we discuss the scientific and technical features of Translator and compare those features to other state-of-the-art, biomedical graph-based question-answering systems.

6.
JMIR Form Res ; 6(4): e32357, 2022 Apr 01.
Article in English | MEDLINE | ID: mdl-35363149

ABSTRACT

BACKGROUND: The Integrated Clinical and Environmental Exposures Service (ICEES) serves as an open-source, disease-agnostic, regulatory-compliant framework and approach for openly exposing and exploring clinical data that have been integrated at the patient level with a variety of environmental exposures data. ICEES is equipped with tools to support basic statistical exploration of the integrated data in a completely open manner. OBJECTIVE: This study aims to further develop and apply ICEES as a novel tool for openly exposing and exploring integrated clinical and environmental data. We focus on an asthma use case. METHODS: We queried the ICEES open application programming interface (OpenAPI) using a functionality that supports chi-square tests between feature variables and a primary outcome measure, with a Bonferroni correction for multiple comparisons (α=.001). We focused on 2 primary outcomes that are indicative of asthma exacerbations: annual emergency department (ED) or inpatient visits for respiratory issues; and annual prescriptions for prednisone. RESULTS: Of the 157,410 patients within the asthma cohort, 26,332 (16.73%) had 1 or more annual ED or inpatient visits for respiratory issues, and 17,056 (10.84%) had 1 or more annual prescriptions for prednisone. We found that close proximity to a major roadway or highway, exposure to high levels of particulate matter ≤2.5 µm (PM2.5) or ozone, female sex, Caucasian race, low residential density, lack of health insurance, and low household income were significantly associated with asthma exacerbations (P<.001). Asthma exacerbations did not vary by rural versus urban residence. Moreover, the results were largely consistent across outcome measures. CONCLUSIONS: Our results demonstrate that the open-source ICEES can be used to replicate and extend published findings on factors that influence asthma exacerbations. 
As a disease-agnostic, open-source approach for integrating, exposing, and exploring patient-level clinical and environmental exposures data, we believe that ICEES will have broad adoption by other institutions and application in environmental health and other biomedical fields.
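The bivariate test described in the METHODS above can be sketched as follows. This is a minimal illustration, not the ICEES implementation: the function names, the 2x2 counts, and the family-wise alpha of .05 spread over 50 comparisons are all assumptions chosen only so that the corrected threshold equals .001.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value (1 degree of freedom)
    for the 2x2 table [[a, b], [c, d]], without continuity correction."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs a nonzero total")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def bonferroni_threshold(family_alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return family_alpha / n_tests


# Hypothetical counts (NOT from the study): rows are exposed / unexposed,
# columns are with / without the outcome.
stat, p = chi_square_2x2(30, 70, 10, 90)
threshold = bonferroni_threshold(0.05, 50)  # assumed values
significant = p < threshold
```

Dividing the family-wise alpha by the number of comparisons keeps the chance of any false positive across all tests at or below that alpha, at the cost of reduced power for each individual test.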

7.
Drug Discov Today ; 27(6): 1671-1678, 2022 06.
Article in English | MEDLINE | ID: mdl-35182735

ABSTRACT

Here, we propose a broad concept of 'Clinical Outcome Pathways' (COPs), which are defined as a series of key molecular and cellular events that underlie therapeutic effects of drug molecules. We formalize COPs as a chain of the following events: molecular initiating event (MIE) → intermediate event(s) → clinical outcome. We illustrate the concept with COP examples both for primary and alternative (i.e., drug repurposing) therapeutic applications. We also describe the elucidation of COPs for several drugs of interest using the publicly accessible Reasoning Over Biomedical Objects linked in Knowledge-Oriented Pathways (ROBOKOP) biomedical knowledge graph-mining tool. We propose that broader use of COPs uncovered with the help of biomedical knowledge graph mining will likely accelerate drug discovery and repurposing efforts.


Subject(s)
Drug Repositioning , Knowledge Bases , Drug Discovery , Knowledge
8.
Article in English | MEDLINE | ID: mdl-34769911

ABSTRACT

ICEES (Integrated Clinical and Environmental Exposures Service) provides a disease-agnostic, regulatory-compliant approach for openly exposing and analyzing clinical data that have been integrated at the patient level with environmental exposures data. ICEES is equipped with basic features to support exploratory analysis using statistical approaches, such as bivariate chi-square tests. We recently developed a method for using ICEES to generate multivariate tables for subsequent application of machine learning and statistical models. The objective of the present study was to use this approach to identify predictors of asthma exacerbations through the application of three multivariate methods: conditional random forest, conditional tree, and generalized linear model. Among seven potential predictor variables, we found five to be of significant importance using both conditional random forest and conditional tree: prednisone, race, airborne particulate exposure, obesity, and sex. The conditional tree method additionally identified several significant two-way and three-way interactions among the same variables. When we applied a generalized linear model, we identified four significant predictor variables, namely prednisone, race, airborne particulate exposure, and obesity. When ranked in order by effect size, the results were in agreement with the results from the conditional random forest and conditional tree methods as well as the published literature. Our results suggest that the open multivariate analytic capabilities provided by ICEES are valid in the context of an asthma use case and likely will have broad value in advancing open research in environmental and public health.


Subject(s)
Asthma , Environmental Exposure , Asthma/epidemiology , Asthma/etiology , Humans , Machine Learning , Models, Statistical
9.
ArXiv ; 2021 Aug 25.
Article in English | MEDLINE | ID: mdl-34462722

ABSTRACT

As the COVID-19 pandemic continues to impact the world, data are being gathered and analyzed to better understand the disease. Recognizing the potential for visual analytics technologies to support exploratory analysis and hypothesis generation from longitudinal clinical data, a team of collaborators worked to apply existing event sequence visual analytics technologies to longitudinal clinical data from a cohort of 998 patients with high rates of COVID-19 infection. This paper describes the initial steps toward this goal, including: (1) the data transformation and processing work required to prepare the data for visual analysis, (2) initial findings and observations, and (3) qualitative feedback and lessons learned, which highlight key features as well as limitations to address in future work.

10.
JMIR Med Inform ; 9(7): e26714, 2021 Jul 20.
Article in English | MEDLINE | ID: mdl-34283031

ABSTRACT

BACKGROUND: Knowledge graphs are a common form of knowledge representation in biomedicine and many other fields. We developed an open biomedical knowledge graph-based system termed Reasoning Over Biomedical Objects linked in Knowledge Oriented Pathways (ROBOKOP). ROBOKOP consists of both a front-end user interface and a back-end knowledge graph. The ROBOKOP user interface allows users to posit questions and explore answer subgraphs. Users can also posit questions through direct Cypher query of the underlying knowledge graph, which currently contains roughly 6 million nodes or biomedical entities and 140 million edges or predicates describing the relationship between nodes, drawn from over 30 curated data sources. OBJECTIVE: We aimed to apply ROBOKOP to survey data on workplace exposures and immune-mediated diseases from the Environmental Polymorphisms Registry (EPR) within the National Institute of Environmental Health Sciences. METHODS: We analyzed EPR survey data and identified 45 associations between workplace chemical exposures and immune-mediated diseases, as self-reported by study participants (n= 4574), with 20 associations significant at P<.05 after false discovery rate correction. We then used ROBOKOP to (1) validate the associations by determining whether plausible connections exist within the ROBOKOP knowledge graph and (2) propose biological mechanisms that might explain them and serve as hypotheses for subsequent testing. We highlight the following three exemplar associations: carbon monoxide-multiple sclerosis, ammonia-asthma, and isopropanol-allergic disease. RESULTS: ROBOKOP successfully returned answer sets for three queries that were posed in the context of the driving examples. The answer sets included potential intermediary genes, as well as supporting evidence that might explain the observed associations. 
CONCLUSIONS: We demonstrate real-world application of ROBOKOP to generate mechanistic hypotheses for associations between workplace chemical exposures and immune-mediated diseases. We expect that ROBOKOP will find broad application across many biomedical fields and other scientific disciplines due to its generalizability, speed to discovery and generation of mechanistic hypotheses, and open nature.

11.
Clin Transl Sci ; 14(5): 1719-1724, 2021 09.
Article in English | MEDLINE | ID: mdl-33742785

ABSTRACT

"Knowledge graphs" (KGs) have become a common approach for representing biomedical knowledge. In a KG, multiple biomedical data sets can be linked together as a graph representation, with nodes representing entities, such as "chemical substance" or "genes," and edges representing predicates, such as "causes" or "treats." Reasoning and inference algorithms can then be applied to the KG and used to generate new knowledge. We developed three KG-based question-answering systems as part of the Biomedical Data Translator program. These systems are typically tested and evaluated using traditional software engineering tools and approaches. In this study, we explored a team-based approach to test and evaluate the prototype "Translator Reasoners" through the application of Medical College Admission Test (MCAT) questions. Specifically, we describe three "hackathons," in which the developers of each of the three systems worked together with a moderator to determine whether the applications could be used to solve MCAT questions. The results demonstrate progressive improvement in system performance, with 0% (0/5) correct answers during the first hackathon, 75% (3/4) correct during the second hackathon, and 100% (5/5) correct during the final hackathon. We discuss the technical and sociologic lessons learned and conclude that MCAT questions can be applied successfully in the context of moderated hackathons to test and evaluate prototype KG-based question-answering systems, identify gaps in current capabilities, and improve performance. Finally, we highlight several published clinical and translational science applications of the Translator Reasoners.
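The node-and-edge representation described in the abstract above can be sketched as follows. This is a toy illustration only, not any Translator Reasoner: the class, the method names, and the placeholder identifiers (drug_A, gene_G, disease_X) are hypothetical and assert no biomedical facts.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Toy graph: nodes carry a category, edges carry a predicate."""

    def __init__(self):
        self.categories = {}            # node id -> category label
        self.edges = defaultdict(list)  # subject -> [(predicate, object)]

    def add_node(self, node_id, category):
        self.categories[node_id] = category

    def add_edge(self, subject, predicate, obj):
        # Refuse edges that point at nodes the graph does not know about.
        for node_id in (subject, obj):
            if node_id not in self.categories:
                raise KeyError(f"unknown node: {node_id}")
        self.edges[subject].append((predicate, obj))

    def objects(self, subject, predicate):
        """All objects linked to `subject` by `predicate`."""
        return [o for p, o in self.edges.get(subject, []) if p == predicate]

    def two_hop(self, start, first_predicate, second_predicate):
        """Simple inference: follow two predicates in sequence and return
        (intermediate, end) pairs."""
        return [
            (mid, end)
            for mid in self.objects(start, first_predicate)
            for end in self.objects(mid, second_predicate)
        ]


kg = KnowledgeGraph()
kg.add_node("drug_A", "chemical substance")  # placeholder entities
kg.add_node("gene_G", "gene")
kg.add_node("disease_X", "disease")
kg.add_edge("drug_A", "affects", "gene_G")
kg.add_edge("gene_G", "causes", "disease_X")

paths = kg.two_hop("drug_A", "affects", "causes")
```

A two-hop traversal of this kind is the simplest form of the reasoning the abstract mentions: a drug and a disease become linked through a shared intermediate node even though no direct edge connects them.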


Subject(s)
Pattern Recognition, Automated/methods , Translational Science, Biomedical/methods , Algorithms , College Admission Test/statistics & numerical data , Datasets as Topic , Humans
12.
Article in English | MEDLINE | ID: mdl-35875189

ABSTRACT

The Integrated Clinical and Environmental Exposures Service (ICEES) provides regulatory-compliant open access to sensitive patient data that have been integrated with public exposures data. ICEES was designed initially to support dynamic cohort creation and bivariate contingency tests. The objective of the present study was to develop an open approach to support multivariate analyses using existing ICEES functionalities and abiding by all regulatory constraints. We first developed an open approach for generating a multivariate table that maintains contingencies between clinical and environmental variables using programmatic calls to the open ICEES application programming interface. We then applied the approach to data on a large cohort (N = 22,365) of patients with asthma or related conditions and generated an eight-feature table. Due to regulatory constraints, data loss was incurred with the incorporation of each successive feature variable, from a starting sample size of N = 22,365 to a final sample size of N = 4,556 (20.4%), but data loss was < 10% until the addition of the final two feature variables. We then applied a generalized linear model to the subsequent dataset and focused on the impact of seven select feature variables on asthma exacerbations, defined as annual emergency department or inpatient visits for respiratory issues. We identified five feature variables (sex, race, obesity, prednisone, and airborne particulate exposure) as significant predictors of asthma exacerbations. We discuss the advantages and disadvantages of ICEES open multivariate analysis and conclude that, despite limitations, ICEES can provide a valuable resource for open multivariate analysis and can serve as an exemplar for regulatory-compliant informatic solutions to open patient data, with capabilities to explore the impact of environmental exposures on health outcomes.

13.
JMIR Med Inform ; 8(11): e17964, 2020 Nov 23.
Article in English | MEDLINE | ID: mdl-33226347

ABSTRACT

BACKGROUND: Efforts are underway to semantically integrate large biomedical knowledge graphs using common upper-level ontologies to federate graph-oriented application programming interfaces (APIs) to the data. However, federation poses several challenges, including query routing to appropriate knowledge sources, generation and evaluation of answer subsets, semantic merger of those answer subsets, and visualization and exploration of results. OBJECTIVE: We aimed to develop an interactive environment for query, visualization, and deep exploration of federated knowledge graphs. METHODS: We developed a biomedical query language and web application interface, termed Translator Query Language (TranQL), to query semantically federated knowledge graphs and explore query results. TranQL uses the Biolink data model as an upper-level biomedical ontology and an API standard that has been adopted by the Biomedical Data Translator Consortium to specify a protocol for expressing a query as a graph of Biolink data elements compiled from statements in the TranQL query language. Queries are mapped to federated knowledge sources, and answers are merged into a knowledge graph, with mappings between the knowledge graph and specific elements of the query. The TranQL interactive web application includes a user interface to support user exploration of the federated knowledge graph. RESULTS: We developed 2 real-world use cases to validate TranQL and address biomedical questions of relevance to translational science. The use cases posed questions that traversed 2 federated Translator API endpoints: Integrated Clinical and Environmental Exposures Service (ICEES) and Reasoning Over Biomedical Objects linked in Knowledge Oriented Pathways (ROBOKOP). 
ICEES provides open access to observational clinical and environmental data, and ROBOKOP provides access to linked biomedical entities, such as "gene," "chemical substance," and "disease," that are derived largely from curated public data sources. We successfully posed queries to TranQL that traversed these endpoints and retrieved answers that we visualized and evaluated. CONCLUSIONS: TranQL can be used to ask questions of relevance to translational science, rapidly obtain answers that require assertions from a federation of knowledge sources, and provide valuable insights for translational research and clinical practice.

14.
Article in English | MEDLINE | ID: mdl-32708093

ABSTRACT

Environmental exposures have profound effects on health and disease. While public repositories exist for a variety of exposures data, these are generally difficult to access, navigate, and interpret. We describe the research, development, and application of three open application programming interfaces (APIs) that support access to usable, nationwide, exposures data from three public repositories: airborne pollutant estimates from the US Environmental Protection Agency; roadway data from the US Department of Transportation; and socio-environmental exposures from the US Census Bureau's American Community Survey. Three open APIs were successfully developed, deployed, and tested using random latitude/longitude values and time periods as input parameters. After confirming the accuracy of the data, we used the APIs to extract exposures data on 2550 participants from a cohort within the Environmental Polymorphisms Registry (EPR) at the National Institute of Environmental Health Sciences, and we successfully linked the exposure estimates with participant-level data derived from the EPR. We then conducted an exploratory, proof-of-concept analysis of the integrated data for a subset of participants with self-reported asthma and largely replicated our prior findings on the impact of select exposures and demographic factors on asthma exacerbations. Together, the three open exposures APIs provide a valuable resource, with application across environmental and public health fields.


Subject(s)
Air Pollutants/adverse effects , Environmental Exposure/adverse effects , Environmental Pollutants , Social Environment , Access to Information , Air Pollutants/analysis , Environmental Exposure/analysis , Female , Humans , Male , Socioeconomic Factors , United States , United States Environmental Protection Agency
15.
BMC Med Inform Decis Mak ; 20(1): 53, 2020 03 11.
Article in English | MEDLINE | ID: mdl-32160884

ABSTRACT

BACKGROUND: Informatics tools to support the integration and subsequent interrogation of spatiotemporal data such as clinical data and environmental exposures data are lacking. Such tools are needed to support research in environmental health and any biomedical field that is challenged by the need for integrated spatiotemporal data to examine individual-level determinants of health and disease. RESULTS: We have developed an open-source software application, FHIR PIT (Health Level 7 Fast Healthcare Interoperability Resources Patient data Integration Tool), to enable studies on the impact of individual-level environmental exposures on health and disease. FHIR PIT was motivated by the need to integrate patient data derived from our institution's clinical warehouse with a variety of public data sources on environmental exposures and then openly expose the data via ICEES (Integrated Clinical and Environmental Exposures Service). FHIR PIT consists of transformation steps or building blocks that can be chained together to form a transformation and integration workflow. Several transformation steps are generic and thus can be reused. As such, new types of data can be incorporated into the modular FHIR PIT pipeline by simply reusing generic steps or adding new ones. We validated FHIR PIT in the context of a driving use case designed to investigate the impact of airborne pollutant exposures on asthma. Specifically, we replicated published findings demonstrating racial disparities in the impact of airborne pollutants on asthma exacerbations. CONCLUSIONS: While FHIR PIT was developed to support our driving use case on asthma, the software can be used to integrate any type and number of spatiotemporal data sources at a level of granularity that enables individual-level study. We expect FHIR PIT to facilitate research in environmental health and numerous other biomedical disciplines.


Subject(s)
Electronic Health Records , Environmental Exposure , Health Information Interoperability/standards , Software Design , Software , Health Level Seven , Humans , Spatio-Temporal Analysis , Systems Integration , Workflow
16.
J Biomed Inform ; 100: 103325, 2019 12.
Article in English | MEDLINE | ID: mdl-31676459

ABSTRACT

This special communication describes activities, products, and lessons learned from a recent hackathon that was funded by the National Center for Advancing Translational Sciences via the Biomedical Data Translator program ('Translator'). Specifically, Translator team members self-organized and worked together to conceptualize and execute, over a five-day period, a multi-institutional clinical research study that aimed to examine, using open clinical data sources, relationships between sex, obesity, diabetes, and exposure to airborne fine particulate matter among patients with severe asthma. The goal was to develop a proof of concept that this new model of collaboration and data sharing could effectively produce meaningful scientific results and generate new scientific hypotheses. Three Translator Clinical Knowledge Sources, each of which provides open access (via Application Programming Interfaces) to data derived from the electronic health record systems of major academic institutions, served as the source of study data. Jupyter Python notebooks, shared in GitHub repositories, were used to call the knowledge sources and analyze and integrate the results. The results replicated established or suspected relationships between sex, obesity, diabetes, exposure to airborne fine particulate matter, and severe asthma. In addition, the results demonstrated specific differences across the three Translator Clinical Knowledge Sources, suggesting cohort- and/or environment-specific factors related to the services themselves or the catchment area from which each service derives patient data. Collectively, this special communication demonstrates the power and utility of intense, team-oriented hackathons and offers general technical, organizational, and scientific lessons learned.


Subject(s)
Asthma/physiopathology , Diabetes Mellitus/physiopathology , Environmental Exposure , Information Storage and Retrieval , Obesity/physiopathology , Particulate Matter/toxicity , Sex Factors , Asthma/complications , Female , Humans , Male , Obesity/complications , Severity of Illness Index
17.
J Chem Inf Model ; 59(12): 4968-4973, 2019 12 23.
Article in English | MEDLINE | ID: mdl-31769676

ABSTRACT

A proliferation of data sources has led to the notional existence of an implicit Knowledge Graph (KG) that contains vast amounts of biological knowledge contributed by distributed Application Programming Interfaces (APIs). However, challenges arise when integrating data across multiple APIs due to incompatible semantic types, identifier schemes, and data formats. We present ROBOKOP KG (http://robokopkg.renci.org), which is a KG that was initially built to support the open biomedical question-answering application, ROBOKOP (Reasoning Over Biomedical Objects linked in Knowledge-Oriented Pathways) (http://robokop.renci.org). Additionally, we present the ROBOKOP Knowledge Graph Builder (KGB), which constructs the KG and provides an extensible framework to handle graph query over and integration of federated data sources.


Subject(s)
Computer Graphics , Data Mining/methods , Knowledge Bases , Databases, Factual , User-Computer Interface
18.
JMIR Med Inform ; 7(4): e15199, 2019 Oct 16.
Article in English | MEDLINE | ID: mdl-31621639

ABSTRACT

BACKGROUND: In a multisite clinical research collaboration, institutions may or may not use the same common data model (CDM) to store clinical data. To overcome this challenge, we proposed to use Health Level 7's Fast Healthcare Interoperability Resources (FHIR) as a meta-CDM, a single standard to represent clinical data. OBJECTIVE: In this study, we aimed to create an open-source application termed the Clinical Asset Mapping Program for FHIR (CAMP FHIR) to efficiently transform clinical data to FHIR for supporting source-agnostic CDM-to-FHIR mapping. METHODS: Mapping with CAMP FHIR involves (1) mapping each source variable to its corresponding FHIR element and (2) mapping each item in the source data's value sets to the corresponding FHIR value set item for variables with strict value sets. To date, CAMP FHIR has been used to transform 108 variables from the Informatics for Integrating Biology & the Bedside (i2b2) and Patient-Centered Outcomes Research Network data models to fields across 7 FHIR resources. It is designed to allow input from any source data model and will support additional FHIR resources in the future. RESULTS: We have used CAMP FHIR to transform data on approximately 23,000 patients with asthma from our institution's i2b2 database. Data quality and integrity were validated against the origin point of the data, our enterprise clinical data warehouse. CONCLUSIONS: We believe that CAMP FHIR can serve as an alternative to implementing new CDMs on a project-by-project basis. Moreover, the use of FHIR as a CDM could support rare data sharing opportunities, such as collaborations between academic medical centers and community hospitals. We anticipate adoption and use of CAMP FHIR to foster sharing of clinical data across institutions for downstream applications in translational research.
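The two-step mapping described in the METHODS above can be sketched as follows. This is a minimal illustration, not CAMP FHIR itself: the source variable names (sex_cd, birth_date) and source codes are hypothetical, while gender and birthDate are elements of the FHIR Patient resource and male, female, and unknown are values from its administrative gender value set.

```python
# Step 1: source variable -> (FHIR resource type, FHIR element).
# The source variable names are hypothetical.
ELEMENT_MAP = {
    "sex_cd": ("Patient", "gender"),
    "birth_date": ("Patient", "birthDate"),
}

# Step 2: for variables with strict value sets, source code -> FHIR code.
VALUE_SET_MAP = {
    "sex_cd": {"M": "male", "F": "female", "U": "unknown"},
}


def map_record(record):
    """Convert one flat source record into FHIR-shaped dictionaries,
    keyed by resource type. Unmapped variables are skipped; an unmapped
    value in a strict value set raises KeyError so that it is noticed."""
    resources = {}
    for variable, value in record.items():
        if variable not in ELEMENT_MAP:
            continue
        resource_type, element = ELEMENT_MAP[variable]
        if variable in VALUE_SET_MAP:
            value = VALUE_SET_MAP[variable][value]
        resource = resources.setdefault(
            resource_type, {"resourceType": resource_type}
        )
        resource[element] = value
    return resources


mapped = map_record({"sex_cd": "F", "birth_date": "1980-01-01", "site": "A"})
```

Keeping both mappings as data rather than code is one way to stay source-agnostic: supporting another data model would then mean supplying new tables, not new logic.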

19.
Bioinformatics ; 35(24): 5382-5384, 2019 12 15.
Article in English | MEDLINE | ID: mdl-31410449

ABSTRACT

SUMMARY: Knowledge graphs (KGs) are quickly becoming a commonplace tool for storing relationships between entities from which higher-level reasoning can be conducted. KGs are typically stored in a graph-database format, and graph-database queries can be used to answer questions of interest that have been posed by users such as biomedical researchers. For simple queries, the inclusion of direct connections in the KG and the storage and analysis of query results are straightforward; however, for complex queries, these capabilities become exponentially more challenging with each increase in complexity of the query. For instance, one relatively complex query can yield a KG with hundreds of thousands of query results. Thus, the ability to efficiently query, store, rank and explore sub-graphs of a complex KG represents a major challenge to any effort designed to exploit the use of KGs for applications in biomedical research and other domains. We present Reasoning Over Biomedical Objects linked in Knowledge Oriented Pathways (ROBOKOP) as an abstraction layer and user interface to more easily query KGs and store, rank and explore query results. AVAILABILITY AND IMPLEMENTATION: An instance of the ROBOKOP UI for exploration of the ROBOKOP Knowledge Graph can be found at http://robokop.renci.org. The ROBOKOP Knowledge Graph can be accessed at http://robokopkg.renci.org. Code and instructions for building and deploying ROBOKOP are available under the MIT open software license from https://github.com/NCATS-Gamma/robokop. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Pattern Recognition, Automated , Software , Databases, Factual
20.
Article in English | MEDLINE | ID: mdl-31119199

ABSTRACT

Electronic Health Record (EHR) systems typically define laboratory test results using the Laboratory Observation Identifier Names and Codes (LOINC) and can transmit them using Fast Healthcare Interoperability Resources (FHIR) standards. LOINC has not yet been semantically integrated with computational resources for phenotype analysis. Here, we provide a method for mapping LOINC-encoded laboratory test results transmitted in FHIR standards to Human Phenotype Ontology (HPO) terms. We annotated the medical implications of 2923 commonly used laboratory tests with HPO terms. Using these annotations, our software assesses laboratory test results and converts each result into an HPO term. We validated our approach with EHR data from 15,681 patients with respiratory complaints and identified known biomarkers for asthma. Finally, we provide a freely available SMART on FHIR application that can be used within EHR systems. Our approach allows readily available laboratory tests in EHR to be reused for deep phenotyping and exploits the hierarchical structure of HPO to integrate distinct tests that have comparable medical interpretations for association studies.
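The conversion of a laboratory result into a phenotype term can be sketched as follows. This is a simplified illustration, not the authors' software or annotation file: the function names, the single annotated test, and the reference range are assumptions, and the LOINC and HPO identifiers are shown for illustration and should be checked against the current releases of both vocabularies.

```python
# Illustrative annotation table: LOINC code -> outcome -> HPO term id.
# Assumed entry: 2823-3 (serum or plasma potassium) with HP:0002153
# (hyperkalemia) and HP:0002900 (hypokalemia).
ANNOTATIONS = {
    "2823-3": {"H": "HP:0002153", "L": "HP:0002900"},
}


def interpret(value, low, high):
    """Classify a numeric result against its reference range as
    'L' (low), 'H' (high) or 'N' (normal)."""
    if value < low:
        return "L"
    if value > high:
        return "H"
    return "N"


def to_hpo(loinc_code, value, low, high):
    """Return the HPO term id annotated for this test and outcome, or
    None when the test is unannotated or the result is within range.
    (How normal results are represented is not covered by this sketch.)"""
    outcome = interpret(value, low, high)
    return ANNOTATIONS.get(loinc_code, {}).get(outcome)


# Hypothetical result: 5.9 against an assumed reference range of 3.5-5.0.
term = to_hpo("2823-3", 5.9, 3.5, 5.0)
```

Because different tests can be annotated with the same term or with terms under a common ancestor, results from distinct assays can be pooled at the phenotype level, which is the integration the abstract describes.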
