ABSTRACT
BACKGROUND: Real-world evidence (RWE) plays a key role in regulatory and healthcare decision-making, but the potentially fragmented nature of generated evidence may limit its utility for clinical decision-making. Heterogeneity and a lack of reproducibility in RWE resulting from inconsistent application of methodologies across data sources should be minimized through harmonization. METHODS: This paper's aim is to describe and reflect upon a multidisciplinary research platform (FOUNTAIN; FinerenOne mUlti-database NeTwork for evidence generAtIoN) with coordinated studies using diverse RWE generation approaches and to explore the platform's strengths and limitations. With guidance from an executive advisory committee of multidisciplinary experts and patient representatives, the goal of the FOUNTAIN platform is to harmonize RWE generation across a portfolio of research projects, including research partner collaborations and a common data model (CDM)-based program. FOUNTAIN's overarching objectives as a research platform are to establish long-term collaborations among pharmacoepidemiology research partners and experts and to integrate diverse approaches for RWE generation, including global protocol execution by research partners in local data sources and common protocol execution in multiple data sources through federated data networks, while ensuring harmonization of medical definitions, methodology, and reproducible artifacts across all studies. Specifically, the aim of the multiple studies run within the frame of FOUNTAIN is to provide insight into the real-world utilization, effectiveness, and safety of finerenone across its life-cycle. RESULTS: Currently, the FOUNTAIN platform includes 9 research partner collaborations and 8 CDM-mapped data sources from 7 countries (United States, United Kingdom, China, Japan, The Netherlands, Spain, and Denmark). These databases and research partners were selected after a feasibility fit-for-purpose evaluation.
Six multicountry, multidatabase, cohort studies are ongoing to describe patient populations, current standard of care, comorbidity profiles, healthcare resource use, and treatment effectiveness and safety in different patient populations with chronic kidney disease and type 2 diabetes. Strengths and potential limitations of FOUNTAIN are described in the context of valid RWE generation. CONCLUSION: The establishment of the FOUNTAIN platform has allowed harmonized execution of multiple studies, promoting consistency both within individual studies that employ multiple data sources and across all studies run within the platform's framework. FOUNTAIN presents a proposal to efficiently improve the consistency and generalizability of RWE on finerenone.
Subject(s)
Evidence-Based Medicine , Humans , Evidence-Based Medicine/methods , Evidence-Based Medicine/standards , Databases, Factual/statistics & numerical data , Research Design , Reproducibility of Results , Mineralocorticoid Receptor Antagonists/therapeutic use
ABSTRACT
OBJECTIVES: This study aims to enhance the analysis of healthcare processes by introducing Object-Centric Process Mining (OCPM). By offering a holistic perspective that accounts for the interactions among various objects, OCPM transcends the constraints of conventional patient-centric process mining approaches, ensuring a more detailed and inclusive understanding of healthcare dynamics. METHODS: We develop a novel method to transform the Observational Medical Outcomes Partnership Common Data Models (OMOP CDM) into Object-Centric Event Logs (OCELs). First, an OMOP CDM4PM is created from the standard OMOP CDM, focusing on data relevant to generating OCEL and addressing healthcare data's heterogeneity and standardization challenges. Second, this subset is transformed into OCEL based on specified healthcare criteria, including identifying various object types, clinical activities, and their relationships. The methodology is tested on the MIMIC-IV database to evaluate its effectiveness and utility. RESULTS: Our proposed method effectively produces OCELs when applied to the MIMIC-IV dataset, allowing for the implementation of OCPM in the healthcare industry. We rigorously evaluate the comprehensiveness and level of abstraction to validate our approach's effectiveness. Additionally, we create diverse object-centric process models intricately designed to navigate the complexities inherent in healthcare processes. CONCLUSION: Our approach introduces a novel perspective by integrating multiple viewpoints simultaneously. To the best of our knowledge, this is the inaugural application of OCPM within the healthcare sector, marking a significant advancement in the field.
Subject(s)
Data Mining , Data Mining/methods , Humans , Delivery of Health Care , Process Assessment, Health Care/methods , Databases, Factual , Medical Informatics/methods , Electronic Health Records
ABSTRACT
PURPOSE: Real-world data (RWD) offers a valuable resource for generating population-level disease epidemiology metrics. We aimed to develop a well-tested and user-friendly R package to compute incidence rates and prevalence in data mapped to the observational medical outcomes partnership (OMOP) common data model (CDM). MATERIALS AND METHODS: We created IncidencePrevalence, an R package to support the analysis of population-level incidence rates and point- and period-prevalence in OMOP-formatted data. On top of unit testing, we assessed the face validity of the package. To do so, we calculated incidence rates of COVID-19 using RWD from Spain (SIDIAP) and the United Kingdom (CPRD Aurum), and replicated two previously published studies using data from the Netherlands (IPCI) and the United Kingdom (CPRD Gold). We compared the obtained results to those previously published, and measured execution times by running a benchmark analysis across databases. RESULTS: IncidencePrevalence achieved high agreement to previously published data in CPRD Gold and IPCI, and showed good performance across databases. For COVID-19, incidence calculated by the package was similar to public data after the first-wave of the pandemic. CONCLUSION: For data mapped to the OMOP CDM, the IncidencePrevalence R package can support descriptive epidemiological research. It enables reliable estimation of incidence and prevalence from large real-world data sets. It represents a simple, but extendable, analytical framework to generate estimates in a reproducible and timely manner.
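The two quantities this abstract centers on, incidence rate and point prevalence, can be sketched in a few lines. The block below is only an illustrative Python sketch and does not reproduce the IncidencePrevalence R package or its API; the record layout (`obs_start`, `obs_end`, `event_date`) and both function names are assumptions made for the example, and the condition is assumed to be a one-time event that persists once recorded.

```python
from datetime import date

def incidence_rate_per_100k(people, study_start, study_end):
    """Events per 100,000 person-years among people at risk in the study window.

    `people` is a list of dicts with hypothetical keys `obs_start`, `obs_end`
    (observation period) and `event_date` (None if the outcome never occurred).
    """
    events = 0
    person_days = 0
    for p in people:
        start = max(p["obs_start"], study_start)
        end = min(p["obs_end"], study_end)
        if start > end:
            continue  # no observation time inside the study window
        event = p.get("event_date")
        if event is not None and event < start:
            continue  # prevalent case: not at risk of a first event
        if event is not None and event <= end:
            events += 1
            end = event  # time at risk stops at the first event
        person_days += (end - start).days + 1
    person_years = person_days / 365.25
    return events / person_years * 100_000 if person_years else float("nan")

def point_prevalence(people, index_date):
    """Share of people under observation on `index_date` who already had the event."""
    observed = [p for p in people if p["obs_start"] <= index_date <= p["obs_end"]]
    if not observed:
        return float("nan")
    cases = [p for p in observed
             if p.get("event_date") is not None and p["event_date"] <= index_date]
    return len(cases) / len(observed)

# Hypothetical two-person cohort used only to exercise the functions.
cohort = [
    {"obs_start": date(2020, 1, 1), "obs_end": date(2020, 12, 31), "event_date": date(2020, 7, 1)},
    {"obs_start": date(2020, 1, 1), "obs_end": date(2020, 12, 31), "event_date": None},
]
rate = incidence_rate_per_100k(cohort, date(2020, 1, 1), date(2020, 12, 31))
prev = point_prevalence(cohort, date(2020, 8, 1))
```

Counting person-time in days and converting to years at the end keeps the denominator exact for people who enter or leave observation mid-year.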
Subject(s)
COVID-19 , Data Management , Humans , Incidence , Prevalence , Databases, Factual , COVID-19/epidemiology
ABSTRACT
PURPOSE: We aimed to develop a standardized method to calculate daily dose (i.e., the amount of drug a patient was exposed to per day) of any drug on a global scale using only drug information of typical observational data in the Observational Medical Outcomes Partnership Common Data Model (OMOP CDM) and a single reference table from Observational Health Data Sciences And Informatics (OHDSI). MATERIALS AND METHODS: The OMOP DRUG_STRENGTH reference table contains information on the strength or concentration of drugs, whereas the OMOP DRUG_EXPOSURE table contains information on patients' drug prescriptions or dispensations/claims. Based on DRUG_EXPOSURE data from the primary care databases Clinical Practice Research Datalink GOLD (United Kingdom) and Integrated Primary Care Information (IPCI, The Netherlands) and healthcare claims from PharMetrics® Plus for Academics (USA), we developed four formulas to calculate daily dose given different DRUG_STRENGTH reference table information. We tested the dose formulas by comparing the calculated median daily dose to the World Health Organization (WHO) Defined Daily Dose (DDD) for six different ingredients in those three databases and additional four international databases representing a variety of healthcare settings: MAITT (Estonia, healthcare claims and discharge summaries), IQVIA Disease Analyzer Germany (outpatient data), IQVIA Longitudinal Patient Database Belgium (outpatient data), and IMASIS Parc Salut (Spain, hospital data). Finally, in each database, we assessed the proportion of drug records for which daily dose calculations were possible using the suggested formulas. RESULTS: Applying the dose formulas, we obtained median daily doses that generally matched the WHO DDD definitions. Our dose formulas were applicable to >85% of drug records in all but one of the assessed databases. 
CONCLUSION: We have established and implemented a standardized daily dose calculation in OMOP CDM providing reliable and reproducible results.
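The abstract does not spell out its four dose formulas, so the following Python sketch shows only the simplest plausible case and should not be read as the authors' method: a drug with a fixed amount per unit (for example, a tablet), where the daily dose is taken as quantity times strength divided by the exposure duration. The field names mirror the OMOP columns `quantity`, `amount_value`, `drug_exposure_start_date`, and `drug_exposure_end_date`; the function names and example records are hypothetical.

```python
from datetime import date
from statistics import median

def daily_dose_fixed_amount(quantity, amount_value, start_date, end_date):
    """Daily dose for a fixed-amount product: quantity * strength / days exposed.

    Returns None when the inputs do not allow a calculation, mirroring the idea
    that not every drug record supports a dose estimate.
    """
    if quantity is None or amount_value is None:
        return None
    days = (end_date - start_date).days + 1  # inclusive exposure duration
    if days <= 0:
        return None
    return quantity * amount_value / days

def median_daily_dose(records):
    """Median over the records for which a daily dose could be computed."""
    doses = [
        daily_dose_fixed_amount(r["quantity"], r["amount_value"],
                                r["drug_exposure_start_date"], r["drug_exposure_end_date"])
        for r in records
    ]
    doses = [d for d in doses if d is not None]
    return median(doses) if doses else None

# Hypothetical exposure records (strength in mg per tablet).
records = [
    {"quantity": 30, "amount_value": 20.0,
     "drug_exposure_start_date": date(2021, 1, 1), "drug_exposure_end_date": date(2021, 1, 30)},
    {"quantity": 60, "amount_value": 20.0,
     "drug_exposure_start_date": date(2021, 2, 1), "drug_exposure_end_date": date(2021, 3, 2)},
    {"quantity": None, "amount_value": 20.0,
     "drug_exposure_start_date": date(2021, 4, 1), "drug_exposure_end_date": date(2021, 4, 30)},
]
result = median_daily_dose(records)
```

A median computed this way could then be compared against a reference value such as the WHO DDD for the same ingredient, which is the kind of check the abstract describes.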
Subject(s)
Databases, Factual , Humans , Databases, Factual/statistics & numerical data , United Kingdom , Drug Dosage Calculations , Netherlands , Primary Health Care , Pharmacoepidemiology/methods , World Health Organization
ABSTRACT
PURPOSE: In rare diseases, real-world evidence (RWE) generation is often restricted due to small patient numbers and global geographic distribution. A federated data network (FDN) approach brings together multiple data sources harmonized for collaboration to increase the power of observational research. In this paper, we review how to increase reproducibility and transparency of RWE studies in rare diseases through disease-specific FDNs. METHOD: To be successful, a multiple stakeholder scientific FDN collaboration requires a strong governance model in place. In such a model, each database owner remains in full control regarding the use of and access to patient-level data and is responsible for data privacy, ethical, and legal compliance. Provided that all this is well documented and good database descriptions are in place, such a governance model results in increased transparency, while reproducibility is achieved through data curation and harmonization, and distributed analytical methods. RESULTS: Leveraging the OHDSI community set of methods and tools, two rare disease-specific FDNs are discussed in more detail. For multiple myeloma, HONEUR-the Haematology Outcomes Network in Europe-has built a strong community among the data partners dedicated to scientific exchange and research. To advance scientific knowledge in pulmonary hypertension (PH) an FDN, called PHederation, was established to form a partnership of research institutions with PH databases coming from diverse origins.
Subject(s)
Rare Diseases , Humans , Rare Diseases/epidemiology , Reproducibility of Results , Databases, Factual , Europe
ABSTRACT
BACKGROUND: Lumbar spinal stenosis (LSS) and spondylolisthesis (SPL) are characterized as degenerative spinal pathologies and share considerable similarities. However, opinions vary on whether to recommend exercise or restrict it for these diseases. Few studies have objectively compared the effects of daily physical activity on LSS and SPL because it is impossible to restrict activities ethically and practically. We investigated the effect of restricting physical activity due to social distancing (SoD) on LSS and SPL, focusing on changes in healthcare burden during the pandemic period. METHODS: We included first-visit patients diagnosed exclusively with LSS and SPL in 2017 and followed them up for two years before and after the implementation of the SoD policy. As controls, patients who first visited in 2015 and were followed for four years without SoD were analyzed. The common data model was employed to analyze each patient's diagnostic codes and treatments. Hospital visits and medical costs were analyzed by regression discontinuity in time to control for temporal effects on dependent variables. RESULTS: Among 33,484 patients, 2,615 with LSS and 446 with SPL were included. A significant decrease in hospital visits was observed in the LSS (difference, -3.94 times/month·100 patients; p = 0.023) and SPL (difference, -3.44 times/month·100 patients; p = 0.026) groups after SoD. This decrease was not observed in the data from the control group. Concerning medical costs, the LSS group showed a statistically significant reduction in median copayment (difference, -$45/month·patient; p < 0.001) after SoD, whereas a significant change was not observed in the SPL group (difference, -$19/month·patient; p = 0.160). CONCLUSION: Restricted physical activity during the SoD period decreased the healthcare burden for patients with LSS, whereas it did not significantly affect patients with SPL.
Under circumstances of physical inactivity, patients with LSS may underrate their symptoms, while maintaining an appropriate activity level may be beneficial for patients with SPL.
Subject(s)
COVID-19 , Exercise , Lumbar Vertebrae , Spinal Stenosis , Spondylolisthesis , Humans , COVID-19/epidemiology , Spondylolisthesis/epidemiology , Male , Female , Retrospective Studies , Middle Aged , Aged , Health Care Costs/statistics & numerical data , SARS-CoV-2 , Physical Distancing , Hospitalization/statistics & numerical data , Hospitalization/economics , Pandemics
ABSTRACT
BACKGROUND: The population diagnosed with renal cell carcinoma, especially in Asia, represents 36.6% of global cases, with the incidence rate of renal cell carcinoma in Korea steadily increasing annually. However, treatment options for renal cell carcinoma are diverse, depending on clinical stage and histologic characteristics. Hence, this study aims to develop a machine learning based clinical decision-support system that recommends personalized treatment tailored to the individual health condition of each patient. RESULTS: We reviewed the real-world medical data of 1,867 participants diagnosed with renal cell carcinoma between November 2008 and June 2021 at the Pusan National University Yangsan Hospital in South Korea. Data were manually divided into a follow-up group where the patients did not undergo surgery or chemotherapy (Surveillance), a group where the patients underwent surgery (Surgery), and a group where the patients received chemotherapy before or after surgery (Chemotherapy). Feature selection was conducted to identify the significant clinical factors influencing renal cell carcinoma treatment decisions from 2,058 features. These features included subsets of 20, 50, 75, 100, and 150, as well as the complete set and an additional 50 expert-selected features. We applied representative machine learning algorithms, namely Decision Tree, Random Forest, and Gradient Boosting Machine (GBM). We analyzed the performance of three applied machine learning algorithms, among which the GBM algorithm achieved an accuracy score of 95% (95% CI, 92-98%) for the 100 and 150 feature sets. The GBM algorithm using 100 and 150 features achieved better performance than the algorithm using features selected by clinical experts (93%, 95% CI 89-97%). 
CONCLUSIONS: We developed a preliminary personalized treatment decision-support system (TDSS) called "RCC-Supporter" by applying machine learning (ML) algorithms to determine personalized treatment for the various clinical situations of RCC patients. Our results demonstrate the feasibility of using machine learning-based clinical decision support systems for treatment decisions in real clinical settings.
Subject(s)
Carcinoma, Renal Cell , Decision Support Systems, Clinical , Kidney Neoplasms , Machine Learning , Humans , Carcinoma, Renal Cell/therapy , Carcinoma, Renal Cell/drug therapy , Kidney Neoplasms/therapy , Kidney Neoplasms/drug therapy , Male , Female , Middle Aged , Republic of Korea , Clinical Decision-Making , Aged , Adult
ABSTRACT
BACKGROUND: In this era of big data, data harmonization is an important step to ensure reproducible, scalable, and collaborative research. Thus, terminology mapping is a necessary step to harmonize heterogeneous data. For example, the mapping between the Medical Dictionary for Regulatory Activities (MedDRA) and the International Classification of Diseases (ICD) is essential for drug safety and pharmacovigilance research. Our main objective is to provide a quantitative and qualitative analysis of the mapping status between MedDRA and ICD. We focus on evaluating the current mapping status between MedDRA and ICD through the Unified Medical Language System (UMLS) and Observational Medical Outcomes Partnership Common Data Model (OMOP CDM). We summarized the current mapping statistics and evaluated the quality of the current MedDRA-ICD mapping; for unmapped terms, we used our self-developed algorithm to rank the best possible mapping candidates for additional mapping coverage. RESULTS: The identified MedDRA-ICD mapped pairs cover 27.23% of the overall MedDRA preferred terms (PT). The systematic quality analysis demonstrated that, among the mapped pairs provided by UMLS, only 51.44% are considered an exact match. Of the 2400 sampled unmapped MedDRA PTs, 56 could have exact match terms from ICD. CONCLUSION: Some of the mapped pairs between MedDRA and ICD are not exact matches due to differences in granularity and focus. For 72% of the unmapped PT terms, the identified exact match pairs illustrate the possibility of identifying additional mapped pairs. Referring to its own mapping standard, some of the unmapped terms should qualify for the expansion of MedDRA to ICD mapping in UMLS.
Subject(s)
Adverse Drug Reaction Reporting Systems , International Classification of Diseases , Humans , Unified Medical Language System , Pharmacovigilance , Algorithms
ABSTRACT
The need for an accurate country-specific real-world-based fracture prediction model is increasing. Thus, we developed scoring systems for osteoporotic fractures from hospital-based cohorts and validated them in an independent cohort in Korea. The model includes history of fracture, age, lumbar spine and total hip T-score, and cardiovascular disease. PURPOSE: Osteoporotic fractures are a substantial health and economic burden. Therefore, the need for an accurate real-world-based fracture prediction model is increasing. We aimed to develop and validate an accurate and user-friendly model to predict major osteoporotic and hip fractures using a common data model database. METHODS: The study included 20,107 and 13,353 participants aged ≥ 50 years with data on bone mineral density using dual-energy X-ray absorptiometry from the CDM database between 2008 and 2011 from the discovery and validation cohort, respectively. The main outcomes were major osteoporotic and hip fracture events. DeepHit and Cox proportional hazard models were used to identify predictors of fractures and to build scoring systems, respectively. RESULTS: The mean age was 64.5 years, and 84.3% were women. During a mean of 7.6 years of follow-up, 1990 major osteoporotic and 309 hip fracture events were observed. In the final scoring model, history of fracture, age, lumbar spine T-score, total hip T-score, and cardiovascular disease were selected as predictors for major osteoporotic fractures. For hip fractures, history of fracture, age, total hip T-score, cerebrovascular disease, and diabetes mellitus were selected. Harrell's C-index values for osteoporotic and hip fractures were 0.789 and 0.860 in the discovery cohort and 0.762 and 0.773 in the validation cohort, respectively. The estimated 10-year risks of major osteoporotic and hip fractures were 2.0% and 0.2% at score 0 and 68.8% and 18.8% at their maximum scores, respectively.
CONCLUSION: We developed scoring systems for osteoporotic fractures from hospital-based cohorts and validated them in an independent cohort. These simple scoring models may help predict fracture risks in real-world practice.
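Harrell's C-index, the discrimination measure reported above, is the share of comparable patient pairs in which the patient with the higher predicted risk has the event sooner. The Python sketch below is a plain pairwise implementation for illustration; it is not the authors' code, the function name and toy data are hypothetical, and tied event times are treated as not comparable for simplicity.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.

    A pair (i, j) is comparable when subject i had the event and was followed
    for a shorter time than subject j. The pair is concordant when subject i
    also has the higher risk score; tied scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")

# Hypothetical follow-up times (years), event indicators, and risk scores.
times = [1.0, 2.0, 3.0, 4.0]
events = [1, 1, 0, 1]
scores = [4, 3, 2, 1]
c_index = harrell_c_index(times, events, scores)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.762 to 0.860 should be read.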
Subject(s)
Cardiovascular Diseases , Hip Fractures , Osteoporotic Fractures , Humans , Female , Middle Aged , Male , Osteoporotic Fractures/epidemiology , Osteoporotic Fractures/etiology , Bone Density , Hip Fractures/epidemiology , Hip Fractures/etiology , Absorptiometry, Photon , Algorithms , Risk Factors , Risk Assessment
ABSTRACT
PURPOSE: Quetiapine is a drug used to treat schizophrenia, bipolar disorder, and major depressive disorder. However, it can cause mild or severe hepatic adverse events and rarely fatal liver damage. This study was aimed at investigating hepatic toxicity caused by quetiapine use by analyzing the information captured from hospital electronic health records by using the Observational Medical Outcomes Partnership common data model (CDM). METHODS: This was a retrospective observational study involving a nested case-control method. A CDM based on an electronic health record database from five hospitals between January 2009 and May 2020 was used. We analyzed the status of quetiapine use, adverse events, and hepatic impairment. RESULTS: The numbers of patients with non-serious and severe hepatic adverse reactions were 2566 (5.05%) and 835 (1.64%) out of 50 766 patients, respectively. After adjusting for covariates, the odds ratio of hepatic adverse events was 2.35 (95% CI: 2.03-2.72), and the odds ratio of severe hepatic adverse events was 1.76 (95% CI: 1.16-2.66). CONCLUSION: Our findings suggest that quetiapine should be cautiously used, and hepatic function should be monitored in patients using quetiapine because it can cause mild or severe hepatic adverse events, complications, and in rare cases, fatal liver damage.
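For readers less familiar with how an odds ratio and its 95% CI are obtained, the Python sketch below computes a crude odds ratio with a Wald interval from a 2x2 table. It is deliberately simpler than the covariate-adjusted estimates reported in the abstract, which would normally come from a regression model, and the function name and counts are hypothetical rather than taken from the study.

```python
import math

def odds_ratio_wald(a, b, c, d, z=1.96):
    """Crude odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed cases, b: exposed controls,
    c: unexposed cases, d: unexposed controls.
    All four cells must be positive for the log-based interval to exist.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts used only to exercise the function.
or_est, or_low, or_high = odds_ratio_wald(20, 80, 10, 90)
```

The interval is built on the log scale because the log odds ratio is approximately normally distributed, which is also why reported intervals such as 2.03-2.72 are asymmetric around the point estimate.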
Subject(s)
Antipsychotic Agents , Bipolar Disorder , Depressive Disorder, Major , Humans , Quetiapine Fumarate/adverse effects , Antipsychotic Agents/adverse effects , Depressive Disorder, Major/drug therapy , Bipolar Disorder/drug therapy , Liver
ABSTRACT
BACKGROUND: Older adults are at an increased risk of postoperative morbidity. Numerous risk stratification tools exist, but effort and manpower are required. OBJECTIVE: This study aimed to develop a predictive model of postoperative adverse outcomes in older patients following general surgery with an open-source, patient-level prediction from the Observational Health Data Sciences and Informatics for internal and external validation. METHODS: We used the Observational Medical Outcomes Partnership common data model and machine learning algorithms. The primary outcome was a composite of 90-day postoperative all-cause mortality and emergency department visits. Secondary outcomes were postoperative delirium, prolonged postoperative stay (≥75th percentile), and prolonged hospital stay (≥21 days). An 80% versus 20% split of the data from the Seoul National University Bundang Hospital (SNUBH) and Seoul National University Hospital (SNUH) common data model was used for model training and testing versus external validation. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC) with a 95% CI. RESULTS: Data from 27,197 (SNUBH) and 32,857 (SNUH) patients were analyzed. Compared to the random forest, Adaboost, and decision tree models, the least absolute shrinkage and selection operator logistic regression model showed good internal discriminative accuracy (internal AUC 0.723, 95% CI 0.701-0.744) and transportability (external AUC 0.703, 95% CI 0.692-0.714) for the primary outcome. The model also possessed good internal and external AUCs for postoperative delirium (internal AUC 0.754, 95% CI 0.713-0.794; external AUC 0.750, 95% CI 0.727-0.772), prolonged postoperative stay (internal AUC 0.813, 95% CI 0.800-0.825; external AUC 0.747, 95% CI 0.741-0.753), and prolonged hospital stay (internal AUC 0.770, 95% CI 0.749-0.792; external AUC 0.707, 95% CI 0.696-0.718). 
Compared with age or the Charlson comorbidity index, the model showed better prediction performance. CONCLUSIONS: The derived model shall assist clinicians and patients in understanding the individualized risks and benefits of surgery.
Subject(s)
Emergence Delirium , Humans , Aged , Prognosis , Retrospective Studies , Algorithms , Machine Learning
ABSTRACT
BACKGROUND: Status epilepticus (SE) is a critical neurological emergency in patients with neurological and nonneurological diseases. Mortality rises with SE severity. However, whether brain injury or systemic organ dysfunction causes death after SE remains unclear. We studied clinical outcomes and systemic dysfunctions associated with SE using standardized data from the common data model. This model includes clinical evaluations and treatments that provide real-world evidence for standard practice. METHODS: This retrospective cohort study used the common data model database of a single tertiary academic medical center. Patients diagnosed with SE (corresponding to G41 of the International Classification of Diseases 10 and administration of antiseizure medication) between January 1, 2001, and January 1, 2018, were enrolled. Demographics, classifications of SE severity, and outcomes were collected as operational definitions by using a common data model format. Systemic complications were defined based on the Sequential Organ Failure Assessment criteria. RESULTS: The electronic medical records of 1,825,196 patients were transformed into a common data model, and 410 patients were enrolled. The proportion of patients classified as having nonrefractory SE was 65.4% (268/410), followed by refractory (28.5%, 117/410) and super-refractory SE (6.1%, 25/410). Patients with more severe SE had longer intensive care unit and hospital stays. Renal dysfunction and thrombocytopenia were higher in the in-hospital death group (P = 0.002 and 0.003, respectively). In multivariable analysis, the Acute Physiology and Chronic Health Evaluation II score and platelet count were significantly different in the in-hospital death group (odds ratio, 1.169, P = 0.004; and 0.989, P = 0.043). CONCLUSIONS: Systemic complications after SE, especially low platelet counts, were linked to worse outcomes and increased mortality in a common data model. 
The common data model offers expandability and comprehensive analysis, making it a potentially valuable tool for SE research.
ABSTRACT
BACKGROUND: A paucity of data addressing real-world treatment of myopic choroidal neovascularization (mCNV) in the era of anti-vascular endothelial growth factor (VEGF) drugs led us to investigate real-world treatment intensity and treatment patterns in patients with mCNV. METHODS: This is a retrospective, observational study using the Observational Medical Outcomes Partnership-Common Data Model database of treatment-naïve patients with mCNV over the 18-year study period (2003-2020). Outcomes were treatment intensity (time trends of total/average number of prescriptions, mean number of prescriptions in the first year and the second year after initiating treatment, proportion of patients with no treatment in the second year) and treatment patterns (subsequent patterns of treatment according to the initial treatment). RESULTS: Our final cohort included 94 patients with an observation period of at least 1 year. Overall, 96.8% of patients received anti-VEGF drugs as first-line treatment, with most injections being bevacizumab. The number of anti-VEGF injections in each calendar year showed an increasing trend over time; however, the mean number of injections dropped from 2.09 in the first year to 0.47 in the second year. About 77% of patients did not receive any treatment in their second year of treatment, regardless of drug. Most patients (86.2%) followed non-switching monotherapy only, and bevacizumab was the most popular choice both as first-line (68.1%) and as second-line (53.8%) treatment. Aflibercept was increasingly used as the first-line treatment for patients with mCNV. CONCLUSION: Anti-VEGF drugs have become the treatment of choice and second-line treatment for mCNV over the past decade. Anti-VEGF drugs are effective for the treatment of mCNV as the non-switching monotherapy is the main treatment regimen in most cases and the number of treatments decreases significantly in the second year of treatment.
Subject(s)
Ophthalmology , Humans , Bevacizumab , Databases, Factual , Patients
ABSTRACT
BACKGROUND: The CVD-COVID-UK consortium was formed to understand the relationship between COVID-19 and cardiovascular diseases through analyses of harmonised electronic health records (EHRs) across the four UK nations. Beyond COVID-19, data harmonisation and common approaches enable analysis within and across independent Trusted Research Environments. Here we describe the reproducible harmonisation method developed using large-scale EHRs in Wales to accommodate the fast and efficient implementation of cross-nation analysis in England and Wales as part of the CVD-COVID-UK programme. We characterise current challenges and share lessons learnt. METHODS: Serving the scope and scalability of multiple study protocols, we used linked, anonymised individual-level EHR, demographic and administrative data held within the SAIL Databank for the population of Wales. The harmonisation method was implemented as a four-layer reproducible process, starting from raw data in the first layer. Then each of the layers two to four is framed by, but not limited to, the characterised challenges and lessons learnt. We achieved curated data as part of our second layer, followed by extracting phenotyped data in the third layer. We captured any project-specific requirements in the fourth layer. RESULTS: Using the implemented four-layer harmonisation method, we retrieved approximately 100 health-related variables for the 3.2 million individuals in Wales, which are harmonised with corresponding variables for > 56 million individuals in England. We processed 13 data sources into the first layer of our harmonisation method: five of these are updated daily or weekly, and the rest at various frequencies providing sufficient data flow updates for frequent capturing of up-to-date demographic, administrative and clinical information. CONCLUSIONS: We implemented an efficient, transparent, scalable, and reproducible harmonisation method that enables multi-nation collaborative research. 
With a current focus on COVID-19 and its relationship with cardiovascular outcomes, the harmonised data has supported a wide range of research activities across the UK.
Subject(s)
COVID-19 , Electronic Health Records , Humans , COVID-19/epidemiology , Wales/epidemiology , England
ABSTRACT
At the time medical products are approved, we rarely know enough about their comparative safety and effectiveness vis-à-vis alternative therapies to advise patients and providers. Postmarket generation of evidence on rare adverse events following medical product exposure increasingly requires analysis of millions of longitudinal patient records that can provide complete capture of data on patient experiences. In the accompanying article by Pradhan et al. (Am J Epidemiology. 2022;191(8):1352-1367), the authors demonstrate how observational database studies are often the most practical approach, provided these databases are carefully chosen to be "fit for purpose." Distributed data networks with common data models have proliferated in the last 2 decades in pharmacoepidemiology, allowing efficient capture of patient data in a standardized and structured format across disparate real-world data sources. Use of common data models facilitates transparency by allowing standardized programming approaches that can be easily reproduced. The distributed data network architecture, combined with a common data approach, supports not only multisite observational studies but also pragmatic clinical trials. It also helps bridge international boundaries and further increases the sample size and diversity of study populations.
Subject(s)
Pharmacoepidemiology , Databases, Factual , Humans
ABSTRACT
Data-sharing improves epidemiologic research, but the sharing of data frustrates epidemiologic researchers. The inefficiencies of current methods and options for data-sharing are increasingly documented and easily understood by any study group that has shared its data and any researcher who has received shared data. In this issue of the Journal, Temprosa et al. (Am J Epidemiol. 2021;191(1):147-158) describe how the Consortium of Metabolomics Studies (COMETS) developed and deployed a flexible analytical platform to eliminate key pain points in large-scale metabolomics research. COMETS Analytics includes an online tool, but its cloud computing and technology are the supporting rather than the leading actors in this script. The COMETS team identified the need to standardize diverse and inconsistent metabolomics and covariate data and models across its many participating cohort studies, and then developed a flexible tool that gave its member studies choices about how they wanted to meet the consortium's analytical requirements. Different specialties will have different specific research needs and will probably continue to use and develop an array of diverse analytical and technical solutions for their projects. COMETS Analytics shows how important, and enabling, the upstream attention to data standards and data consistency is to producing high-quality metabolomics, consortia-based, and large-scale epidemiology research.
Subject(s)
Information Dissemination, Metabolomics, Epidemiologic Studies, Humans, Reference Standards
ABSTRACT
BACKGROUND: Statin treatment increases the risk of new-onset diabetes mellitus (NODM); however, data directly comparing the risk of NODM among individual statins are limited. We compared the risk of NODM between patients using pitavastatin and those using atorvastatin or rosuvastatin, using reliable, large-scale data. METHODS: Electronic health record data from ten hospitals, converted to the Observational Medical Outcomes Partnership Common Data Model (n = 14,605,368 patients), were used to identify new users of pitavastatin, atorvastatin, or rosuvastatin (atorvastatin + rosuvastatin) for ≥ 180 days without a previous history of diabetes or HbA1c level ≥ 5.7%. We conducted a cohort study using Cox regression analysis to examine the hazard ratio (HR) of NODM after propensity score matching (PSM) and then performed an aggregate meta-analysis of the HR. RESULTS: After 1:2 PSM, 10,238 new pitavastatin users (15,998 person-years of follow-up) and 18,605 atorvastatin + rosuvastatin users (33,477 person-years of follow-up) were pooled from 10 databases. The meta-analysis of the HRs demonstrated that pitavastatin resulted in a significantly lower risk of NODM than atorvastatin + rosuvastatin (HR 0.72; 95% CI 0.59-0.87). In sub-analysis, pitavastatin was associated with a lower risk of NODM than atorvastatin or rosuvastatin after 1:1 PSM (HR 0.69; CI 0.54-0.88 and HR 0.74; CI 0.55-0.99, respectively). A consistently low risk of NODM in pitavastatin users was observed when compared with low-to-moderate-intensity atorvastatin + rosuvastatin users (HR 0.78; CI 0.62-0.98). CONCLUSIONS: In this retrospective, multicenter, active-comparator, new-user cohort study, pitavastatin reduced the risk of NODM compared with atorvastatin or rosuvastatin.
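The aggregate meta-analysis step, in which each database contributes only its HR and confidence interval, can be illustrated with a fixed-effect inverse-variance pooling on the log scale. This is a minimal sketch under assumptions: the abstract does not state whether a fixed-effect or random-effects model was used, the function name `pool_fixed_effect` is hypothetical, and the site-level inputs below are invented for illustration, not study results.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def pool_fixed_effect(estimates):
    """Pool (hr, ci_low, ci_high) triples with inverse-variance weights.

    Each standard error is recovered from the width of the 95% CI on the
    log scale. Returns the pooled (hr, ci_low, ci_high).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for hr, low, high in estimates:
        se = (math.log(high) - math.log(low)) / (2 * Z_95)
        weight = 1.0 / se ** 2
        weighted_sum += weight * math.log(hr)
        weight_total += weight
    log_hr = weighted_sum / weight_total
    se_pooled = math.sqrt(1.0 / weight_total)
    return (
        math.exp(log_hr),
        math.exp(log_hr - Z_95 * se_pooled),
        math.exp(log_hr + Z_95 * se_pooled),
    )


# Hypothetical per-database hazard ratios with 95% CIs (not study data).
site_estimates = [(0.70, 0.50, 0.98), (0.80, 0.55, 1.16), (0.65, 0.40, 1.06)]
pooled_hr, pooled_low, pooled_high = pool_fixed_effect(site_estimates)
```

Because only summary estimates cross site boundaries, this kind of pooling fits a setting where patient-level records stay within each hospital's database.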
Subject(s)
Diabetes Mellitus, Hydroxymethylglutaryl-CoA Reductase Inhibitors, Atorvastatin/adverse effects, Cohort Studies, Diabetes Mellitus/diagnosis, Diabetes Mellitus/drug therapy, Diabetes Mellitus/epidemiology, Humans, Hydroxymethylglutaryl-CoA Reductase Inhibitors/adverse effects, Multicenter Studies as Topic, Quinolines, Retrospective Studies, Rosuvastatin Calcium/adverse effects
ABSTRACT
PURPOSE: The risk of second primary malignancy (SPM) after radioiodine (RAI) therapy has been continuously debated. The aim of this study is to identify the risk of SPM in thyroid cancer (TC) patients treated with RAI compared with TC patients not treated with RAI, using matched cohorts. METHODS: Retrospective propensity-matched cohorts were constructed across 4 hospitals in South Korea via the Observational Health Data Science and Informatics (OHDSI) network, with electronic health records converted to a common data model. TC patients who received RAI therapy constituted the target group, whereas TC patients without RAI therapy constituted the comparative group, with 1:1 propensity score matching. The hazard ratio (HR) from a Cox proportional hazards model was used to estimate the risk of SPM, and meta-analysis was performed to pool the HRs. RESULTS: Among a total of 24,318 patients, 5,374 patients from each group were analyzed (mean age 48.9 and 49.2, women 79.4% and 79.5% for target and comparative group, respectively). All hazard ratios of SPM in TC patients with RAI therapy were ≤ 1 based on the 95% confidence interval (CI) in the full and subgroup analyses according to thyroid cancer stage, time-at-risk period, SPM subtype (hematologic or non-hematologic), and initial age (< 30 years or ≥ 30 years). The HR within the target group was not significantly higher (< 1) in patients who received over 3.7 GBq of I-131 compared with patients who received less than 3.7 GBq of I-131 based on 95% CI. CONCLUSION: There was no significant difference in SPM risk between TC patients treated with I-131 and propensity-matched TC patients without I-131 therapy.
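The 1:1 propensity score matching mentioned in the methods can be illustrated with a greedy nearest-neighbour match without replacement. This is a sketch under assumptions: the abstract does not specify the matching algorithm or any caliper, and the function name `match_one_to_one`, the caliper of 0.05, and the scores below are all hypothetical.

```python
def match_one_to_one(treated, controls, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    treated, controls: dicts mapping a patient id to a propensity score.
    A treated patient stays unmatched if the closest remaining control is
    farther away than the caliper. Returns a list of (treated_id, control_id).
    """
    available = dict(controls)
    pairs = []
    # Process treated patients in order of increasing propensity score.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is used at most once
    return pairs


# Hypothetical propensity scores (not study data).
rai_group = {"t1": 0.30, "t2": 0.62, "t3": 0.90}
no_rai_group = {"c1": 0.28, "c2": 0.33, "c3": 0.60, "c4": 0.10}
pairs = match_one_to_one(rai_group, no_rai_group)
```

Here `t3` has no control within the caliper and is dropped, which is why a matched analysis set can be smaller than the treated cohort it starts from.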
Subject(s)
Second Primary Neoplasms, Thyroid Neoplasms, Adult, Data Science, Female, Humans, Informatics, Iodine Radioisotopes/adverse effects, Middle Aged, Second Primary Neoplasms/epidemiology, Second Primary Neoplasms/etiology, Retrospective Studies, Thyroid Neoplasms/radiotherapy
ABSTRACT
OBJECTIVE: More than one third of appropriately treated patients with epilepsy have continued seizures despite two or more medication trials, meeting criteria for drug-resistant epilepsy (DRE). Accurate and reliable identification of patients with DRE in observational data would enable large-scale, real-world comparative effectiveness research and improve access to specialized epilepsy care. In the present study, we aim to develop and compare the performance of computable phenotypes for DRE using the Observational Medical Outcomes Partnership (OMOP) Common Data Model. METHODS: We randomly sampled 600 patients from our academic medical center's electronic health record (EHR)-derived OMOP database meeting previously validated criteria for epilepsy (January 2015-August 2021). Two reviewers manually classified patients as having DRE, drug-responsive epilepsy, undefined drug responsiveness, or no epilepsy as of the last EHR encounter in the study period based on consensus definitions. Demographic characteristics and codes for diagnoses, antiseizure medications (ASMs), and procedures were tested for association with DRE. Algorithms combining permutations of these factors were applied to calculate sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) for DRE. The F1 score was used to compare overall performance. RESULTS: Among 412 patients with source record-confirmed epilepsy, 62 (15.0%) had DRE, 163 (39.6%) had drug-responsive epilepsy, 124 (30.0%) had undefined drug responsiveness, and 63 (15.3%) had insufficient records. The best performing phenotype for DRE in terms of the F1 score was the presence of ≥1 intractable epilepsy code and ≥2 unique non-gabapentinoid ASM exposures each with ≥90-day drug era (sensitivity = .661, specificity = .937, PPV = .594, NPV = .952, F1 score = .626). Several phenotypes achieved higher sensitivity at the expense of specificity and vice versa. 
SIGNIFICANCE: OMOP algorithms can identify DRE in EHR-derived data with varying tradeoffs between sensitivity and specificity. These computable phenotypes can be applied across the largest international network of standardized clinical databases for further validation, reproducible observational research, and improving access to appropriate care.
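The performance measures used to compare phenotypes all derive from a single 2x2 table of algorithm result against chart review. The sketch below shows the arithmetic. The confusion-matrix counts are illustrative values chosen here so that the outputs round to the metrics reported for the best-performing phenotype; the abstract does not publish the underlying table, so they should not be read as study data.

```python
def phenotype_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV, NPV and F1 from a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    # F1 is the harmonic mean of PPV (precision) and sensitivity (recall).
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


# Illustrative counts only (assumed, not reported in the abstract).
metrics = phenotype_metrics(tp=41, fp=28, fn=21, tn=416)
rounded = {name: round(value, 3) for name, value in metrics.items()}
```

With these counts, `rounded` gives sensitivity 0.661, specificity 0.937, PPV 0.594, NPV 0.952, and F1 0.626. Because F1 ignores true negatives, it is less flattered than NPV or specificity when the condition is uncommon in the sample.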
Subject(s)
Drug Resistant Epilepsy, Epilepsy, Humans, Electronic Health Records, Drug Resistant Epilepsy/diagnosis, Drug Resistant Epilepsy/drug therapy, Factual Databases, Data Collection, Algorithms, Epilepsy/diagnosis, Epilepsy/drug therapy
ABSTRACT
INTRODUCTION: There remains a need to optimize treatments and improve outcomes among patients with hematologic malignancies. The timely synthesis and analysis of real-world data could play a key role. OBJECTIVES: The Haematology Outcomes Network in Europe (HONEUR) is a federated data network (FDN) that aims to overcome the challenges of heterogeneous data collected from different registries, hospitals, and other databases in different countries. It has the functionality required to analyze data from various sources in a time-efficient manner, while preserving local data security and governance. With this, research studies can be performed that can increase knowledge and understanding of the management of patients with hematologic malignancies. METHODS: HONEUR uses the Observational Medical Outcomes Partnership (OMOP) common data model, which allows analysis scripts to be run by multiple sites using their own data, ultimately generating aggregated results. Furthermore, distributed analytics can be used to run statistical analyses across multiple sites, as if data were pooled. The external governance model ensures high-quality standards, while data ownership is retained locally. Twenty partners from nine countries are now participating, with data from more than 26 000 patients available for analysis. Research questions that can be addressed through HONEUR include assessments of natural disease history, treatment patterns, and clinical effectiveness. CONCLUSIONS: The HONEUR FDN marks an important step forward in increasing the value of information routinely captured by individual hospitals, registries and other database holders, thus enabling larger-scale studies to be undertaken rapidly and efficiently.
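The idea that sites run the same script on their own data and return only aggregates, yet the combined result matches a pooled analysis, can be shown with a simple mean and standard deviation. This is a minimal sketch and not HONEUR's actual software: the function names `local_summary` and `combine` and the per-site values are hypothetical.

```python
import math


def local_summary(values):
    """Runs inside a site: only these three aggregates leave the site."""
    return {
        "n": len(values),
        "sum": sum(values),
        "sum_sq": sum(v * v for v in values),
    }


def combine(summaries):
    """Runs at the coordinator: rebuild the pooled mean and sample SD."""
    n = sum(s["n"] for s in summaries)
    total = sum(s["sum"] for s in summaries)
    total_sq = sum(s["sum_sq"] for s in summaries)
    mean = total / n
    variance = (total_sq - n * mean ** 2) / (n - 1)
    return {"n": n, "mean": mean, "sd": math.sqrt(variance)}


# Hypothetical ages at diagnosis held at two sites (never shared row by row).
site_a = [61.0, 70.0, 58.0]
site_b = [66.0, 73.0]
result = combine([local_summary(site_a), local_summary(site_b)])
```

The combined mean and SD equal what would be obtained from the five values pooled in one place, which is the sense in which the analysis behaves "as if data were pooled". More complex models need more elaborate sufficient statistics or iterative schemes, which this sketch does not cover.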