ABSTRACT
The amount of research on the gathering and handling of healthcare data keeps growing. To support multi-center research, numerous institutions have sought to create a common data model (CDM). However, data quality issues continue to be a major obstacle in the development of CDM. To address these limitations, a data quality assessment system was created based on the representative data model OMOP CDM v5.3.1. Additionally, 2,433 advanced evaluation rules were created and incorporated into the system by mapping the rules of existing OMOP CDM quality assessment systems. The data quality of six hospitals was verified using the developed system and an overall error rate of 0.197% was confirmed. Finally, we proposed a plan for high-quality data generation and the evaluation of multi-center CDM quality.
Subject(s)
Data Accuracy, Data Management, Health Facilities, Hospitals, Databases, Factual, Delivery of Health Care, Electronic Health Records
ABSTRACT
Numerous studies make extensive use of healthcare data, including human materials and clinical information, and acknowledge its significance. However, limitations in data collection methods can affect the quality of healthcare data obtained from multiple institutions. Securing high-quality data related to human materials therefore requires research focused on data quality. This study validated the quality of data collected in 2020 from 16 institutions constituting the Korea Biobank Network using 104 validation rules. The validation rules were developed based on the DQ4HEALTH model and were divided into four dimensions: completeness, validity, accuracy, and uniqueness. The Korea Biobank Network collects and manages human materials and clinical information from multiple biobanks and is developing a common data model for data integration. The data quality verification revealed an error rate of 0.74%. Furthermore, the data from each institution were analyzed to examine the relationship between the institution's characteristics and its error count. A chi-square test indicated an association between institution and error count. To examine how error counts relate to the characteristics of each institution, a correlation analysis was conducted. The results, shown in a graph, revealed the relationship between the error count and the factors with high correlation coefficients. The findings suggest that the data quality was affected by biases in the evaluation system, including the institution's IT environment, infrastructure, and the number of collected samples. These results highlight the need to consider the scalability of research quality when evaluating clinical epidemiological information linked to human materials in future validation studies of data quality.
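As a rough illustration of rule-based validation organized by the four dimensions named above, the following minimal sketch runs a handful of checks over a few records and reports an error rate. The records, field names, and rules are hypothetical and are not taken from the DQ4HEALTH model or the study's 104 rules; the error rate is assumed here to be failed checks divided by checks performed, which the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical records; field names and values are illustrative only.
records = [
    {"sample_id": "S1", "sex": "F", "collected": "2020-03-01", "age": 54},
    {"sample_id": "S2", "sex": "X", "collected": "2020-05-12", "age": 61},
    {"sample_id": "S2", "sex": "M", "collected": "2020-07-30", "age": None},
]


@dataclass
class Rule:
    name: str
    dimension: str  # completeness, validity, accuracy, or uniqueness
    check: Callable[[dict, list], bool]  # returns True when the record passes


# Hypothetical rules, one per dimension.
rules = [
    Rule("age present", "completeness",
         lambda r, rows: r["age"] is not None),
    Rule("sex in code list", "validity",
         lambda r, rows: r["sex"] in {"F", "M"}),
    Rule("collected in 2020", "accuracy",
         lambda r, rows: r["collected"].startswith("2020-")),
    Rule("sample_id unique", "uniqueness",
         lambda r, rows: sum(o["sample_id"] == r["sample_id"] for o in rows) == 1),
]


def evaluate(rows, rule_set):
    """Count failed checks per dimension and return the overall error rate."""
    errors = {}
    checks = 0
    for rule in rule_set:
        for row in rows:
            checks += 1
            if not rule.check(row, rows):
                errors[rule.dimension] = errors.get(rule.dimension, 0) + 1
    # Assumed definition: failed checks / checks performed.
    return errors, sum(errors.values()) / checks


errors_by_dimension, error_rate = evaluate(records, rules)
print(errors_by_dimension, f"{error_rate:.2%}")
```

Keeping each rule tagged with its dimension makes it straightforward to break the error count down by dimension or, with an added institution field, by site.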
Subject(s)
Biological Specimen Banks, Data Accuracy, Humans, Specimen Handling/methods, Delivery of Health Care, Republic of Korea
ABSTRACT
Allopurinol is the first-line agent for patients with gout, including those with moderate-to-severe chronic kidney disease. However, increased thyroid-stimulating hormone (TSH) levels are observed in patients with long-term allopurinol treatment. This large-scale, nested case-control, retrospective observational study analysed the association between allopurinol use and increased TSH levels. A common data model based on an electronic medical record database of 19,200,973 patients from seven hospitals between January 1997 and September 2020 was used. Individuals aged > 19 years in South Korea with at least one record of a blood TSH test were included. Data of 59,307 cases with TSH levels > 4.5 mIU/L and 236,508 controls matched for sex, age (± 5), and cohort registration date (± 30 days) were analysed. An association between the risk of increased TSH and allopurinol use was observed in participants from five hospitals. A meta-analysis (I² = 0) showed that the OR was 1.51 (95% confidence interval: 1.32-1.72) in both the fixed and random effects models. In the allopurinol group, increased TSH did not significantly affect free thyroxine or thyroxine levels. After the index date, some diseases were likely to occur in patients with subclinical hypothyroidism and hypothyroidism. Allopurinol administration may induce subclinical hypothyroidism.
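For readers unfamiliar with how site-level odds ratios are pooled, the sketch below shows one common approach: inverse-variance fixed-effect pooling on the log scale, with Cochran's Q and I² as heterogeneity measures. The abstract does not state which pooling method was used, so this is an assumption, and the three input estimates are hypothetical values chosen for illustration, not the study's hospital-level results.

```python
import math

# Hypothetical per-site estimates: (odds ratio, 95% CI lower, 95% CI upper).
# These are NOT the study's hospital-level results.
sites = [
    (1.40, 1.05, 1.87),
    (1.55, 1.20, 2.00),
    (1.60, 1.10, 2.33),
]


def pool_fixed(estimates, z=1.96):
    """Inverse-variance fixed-effect pooling of odds ratios on the log scale."""
    log_ors, weights = [], []
    for odds_ratio, lower, upper in estimates:
        # Recover the standard error of log(OR) from the 95% CI width.
        se = (math.log(upper) - math.log(lower)) / (2 * z)
        log_ors.append(math.log(odds_ratio))
        weights.append(1 / se ** 2)

    w_sum = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / w_sum
    se_pooled = math.sqrt(1 / w_sum)

    # Cochran's Q and I^2 (share of variation attributed to heterogeneity).
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_ors))
    df = len(estimates) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    return (
        math.exp(pooled),
        math.exp(pooled - z * se_pooled),
        math.exp(pooled + z * se_pooled),
        i2,
    )


pooled_or, ci_low, ci_high, i_squared = pool_fixed(sites)
print(f"OR {pooled_or:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f}), I2 = {i_squared:.0%}")
```

When Q does not exceed its degrees of freedom, I² is truncated to zero and the DerSimonian-Laird between-study variance is zero as well, so the random-effects estimate coincides with the fixed-effect one. That is consistent with an abstract reporting the same OR under both models.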
Subject(s)
Allopurinol/adverse effects, Allopurinol/therapeutic use, Thyrotropin/blood, Adult, Case-Control Studies, Female, Humans, Hyperthyroidism, Hypothyroidism/complications, Male, Odds Ratio, Republic of Korea, Retrospective Studies, Rheumatology/methods, Risk, Risk Factors, Thyroid Function Tests, Thyroxine/blood, Young Adult
ABSTRACT
We expanded and constructed a Common Data Model (CDM) based on hospital EHR data to enable the analysis and comparison of adverse drug reactions (ADRs) across external organizations with different data structures. This is significant because institutions that have built the same type of CDM can conduct joint research, analysis, and comparisons, and because it provides a basis for running the same study simultaneously on multiple data sources.