RESUMEN
BACKGROUND: Data heterogeneity is a common phenomenon related to the secondary use of electronic health records (EHR) data from different sources. The Observational Health Data Sciences and Informatics (OHDSI) Common Data Model (CDM) organizes healthcare data into standard data structures using concepts that are explicitly and formally specified through standard vocabularies, thereby facilitating large-scale analysis. The objective of this study is to design, develop, and evaluate generic survival analysis routines built using the OHDSI CDM. METHODS: We used intrahepatic cholangiocarcinoma (ICC) patient data to implement CDM-based survival analysis methods. Our methods comprise the following modules: 1) Mapping local terms to standard OHDSI concepts. The analytical expression of variables and values related to demographic characteristics, medical history, smoking status, laboratory results, and tumor feature data. These data were mapped to standard OHDSI concepts through a manual analysis; 2) Loading patient data into the CDM using the concept mappings; 3) Developing an R interface that supports the portable survival analysis on top of OHDSI CDM, and comparing the CDM-based analysis results with those using traditional statistical analysis methods. RESULTS: Our dataset contained 346 patients diagnosed with ICC. The collected clinical data contains 115 variables, of which 75 variables were mapped to the OHDSI concepts. These concepts mainly belong to four domains: condition, observation, measurement, and procedure. The corresponding standard concepts are scattered in six vocabularies: ICD10CM, ICD10PCS, SNOMED, LOINC, NDFRT, and READ. We loaded a total of 25,950 patient data records into the OHDSI CDM database. However, 40 variables failed to map to the OHDSI CDM as they mostly belong to imaging data and pathological data. CONCLUSIONS: Our study demonstrates that conducting survival analysis using the OHDSI CDM is feasible and can produce reusable analysis routines. However, challenges to be overcome include 1) semantic loss caused by inaccurate mapping and value normalization; 2) incomplete OHDSI vocabularies describing imaging data, pathological data, and modular data representation.
Asunto(s)
Macrodatos , Colangiocarcinoma , Registros Electrónicos de Salud , Almacenamiento y Recuperación de la Información , Sistemas de Información , Análisis de Supervivencia , HumanosRESUMEN
OBJECTIVE: To introduce 2 R-packages that facilitate conducting health economics research on OMOP-based data networks, aiming to standardize and improve the reproducibility, transparency, and transferability of health economic models. MATERIALS AND METHODS: We developed the software tools and demonstrated their utility by replicating a UK-based heart failure data analysis across 5 different international databases from Estonia, Spain, Serbia, and the United States. RESULTS: We examined treatment trajectories of 47 163 patients. The overall incremental cost-effectiveness ratio (ICER) for telemonitoring relative to standard of care was 57 472 /QALY. Country-specific ICERs were 60 312 /QALY in Estonia, 58 096 /QALY in Spain, 40 372 /QALY in Serbia, and 90 893 /QALY in the US, which surpassed the established willingness-to-pay thresholds. DISCUSSION: Currently, the cost-effectiveness analysis lacks standard tools, is performed in ad-hoc manner, and relies heavily on published information that might not be specific for local circumstances. Published results often exhibit a narrow focus, central to a single site, and provide only partial decision criteria, limiting their generalizability and comprehensive utility. CONCLUSION: We created 2 R-packages to pioneer cost-effectiveness analysis in OMOP CDM data networks. The first manages state definitions and database interaction, while the second focuses on Markov model learning and profile synthesis. We demonstrated their utility in a multisite heart failure study, comparing telemonitoring and standard care, finding telemonitoring not cost-effective.