Data harmonization and federated learning for multi-cohort dementia research using the OMOP common data model: A Netherlands consortium of dementia cohorts case study.
Mateus, Pedro; Moonen, Justine; Beran, Magdalena; Jaarsma, Eva; van der Landen, Sophie M; Heuvelink, Joost; Birhanu, Mahlet; Harms, Alexander G J; Bron, Esther; Wolters, Frank J; Cats, Davy; Mei, Hailiang; Oomens, Julie; Jansen, Willemijn; Schram, Miranda T; Dekker, Andre; Bermejo, Inigo.
Affiliation
  • Mateus P; Department of Radiation Oncology (Maastro), GROW School for Oncology and Reproduction, Maastricht University Medical Centre+, Maastricht, Netherlands. Electronic address: pedro.mateus@maastro.nl.
  • Moonen J; Alzheimer Center Amsterdam, Neurology, Vrije Universiteit Amsterdam, Amsterdam UMC location VUmc, Amsterdam, Netherlands; Amsterdam Neuroscience, Neurodegeneration, Amsterdam, Netherlands.
  • Beran M; Department of Internal Medicine, School for Cardiovascular Diseases (CARIM), Maastricht University, Maastricht, Netherlands; Department of Epidemiology and Global Health, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, Netherlands.
  • Jaarsma E; Center for Nutrition, Prevention, and Health Services, National Institute for Public Health and the Environment (RIVM), Bilthoven, Netherlands; Amsterdam UMC location Vrije Universiteit Amsterdam, Epidemiology and Data Science, Amsterdam, Netherlands.
  • van der Landen SM; Alzheimer Center Amsterdam, Neurology, Vrije Universiteit Amsterdam, Amsterdam UMC location VUmc, Amsterdam, Netherlands; Amsterdam Neuroscience, Neurodegeneration, Amsterdam, Netherlands.
  • Heuvelink J; Alzheimer Center Amsterdam, Neurology, Vrije Universiteit Amsterdam, Amsterdam UMC location VUmc, Amsterdam, Netherlands.
  • Birhanu M; Biomedical Imaging Group Rotterdam, Dept. Radiology & Nuclear Medicine, Erasmus MC - University Medical Center Rotterdam, Rotterdam, Netherlands.
  • Harms AGJ; Biomedical Imaging Group Rotterdam, Dept. Radiology & Nuclear Medicine, Erasmus MC - University Medical Center Rotterdam, Rotterdam, Netherlands.
  • Bron E; Biomedical Imaging Group Rotterdam, Dept. Radiology & Nuclear Medicine, Erasmus MC - University Medical Center Rotterdam, Rotterdam, Netherlands.
  • Wolters FJ; Erasmus MC - University Medical Centre Rotterdam, Departments of Epidemiology and Radiology & Nuclear Medicine, Netherlands.
  • Cats D; Sequencing Analysis Support Core, Department of Biomedical Data Sciences, Leiden University Medical Center, Netherlands.
  • Mei H; Sequencing Analysis Support Core, Department of Biomedical Data Sciences, Leiden University Medical Center, Netherlands.
  • Oomens J; Department of Psychiatry and Neuropsychology, School for Mental Health and Neuroscience, Alzheimer Center Limburg, Maastricht University, Netherlands.
  • Jansen W; Department of Psychiatry and Neuropsychology, School for Mental Health and Neuroscience, Alzheimer Center Limburg, Maastricht University, Netherlands.
  • Schram MT; Cardiovascular Research Institute Maastricht (CARIM), Maastricht University, Maastricht, Netherlands; Department of Internal Medicine, Maastricht University Medical Centre, Maastricht, Netherlands; MHeNS School for Mental Health and Neuroscience, Maastricht University, Maastricht, Netherlands; Heart
  • Dekker A; Department of Radiation Oncology (Maastro), GROW School for Oncology and Reproduction, Maastricht University Medical Centre+, Maastricht, Netherlands.
  • Bermejo I; Department of Radiation Oncology (Maastro), GROW School for Oncology and Reproduction, Maastricht University Medical Centre+, Maastricht, Netherlands.
J Biomed Inform ; 155: 104661, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38806105
ABSTRACT

BACKGROUND:

Establishing collaborations between cohort studies has been fundamental for progress in health research. However, such collaborations are hampered by heterogeneous data representations across cohorts and legal constraints on data sharing. The first arises from a lack of consensus on standards for data collection and representation across cohort studies and is usually tackled through data harmonization. The second is increasingly important because of heightened awareness of privacy protection and stricter regulations, such as the GDPR. Federated learning has emerged as a privacy-preserving alternative to transferring data between institutions: the data are analyzed in a decentralized manner.
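To illustrate the decentralized analysis idea in general terms, the following minimal Python sketch computes a pooled mean and sample variance from per-site aggregates only, so that no individual-level record leaves a site. It is not the infrastructure or the algorithms used in the study; the names `LocalSummary`, `local_summary`, and `federated_mean_variance` are hypothetical, and a real deployment would add safeguards (for example minimum cell sizes) that are omitted here.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class LocalSummary:
    """Aggregates a site is willing to share (hypothetical structure)."""
    n: int           # number of observations at the site
    total: float     # sum of the values
    total_sq: float  # sum of the squared values


def local_summary(values: Iterable[float]) -> LocalSummary:
    """Runs inside each cohort's environment; only the summary leaves the site."""
    vals = list(values)
    return LocalSummary(
        n=len(vals),
        total=sum(vals),
        total_sq=sum(v * v for v in vals),
    )


def federated_mean_variance(summaries: List[LocalSummary]) -> Tuple[float, float]:
    """Runs at the coordinating node; combines site summaries into pooled statistics."""
    n = sum(s.n for s in summaries)
    if n < 2:
        raise ValueError("need at least two observations across all sites")
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    # Sample variance from the pooled sum of squares.
    variance = (total_sq - n * mean * mean) / (n - 1)
    return mean, variance


# Example with made-up values for three sites.
site_values = [[70.0, 72.0], [80.0], [65.0, 75.0, 85.0]]
pooled_mean, pooled_var = federated_mean_variance(
    [local_summary(v) for v in site_values]
)
```

The pooled result equals what a central analysis of the combined values would give, which is the property that makes this kind of aggregate exchange a substitute for data transfer.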

METHODS:

In this study, we set up a federated learning infrastructure for a consortium of nine Dutch cohorts with data relevant to the etiology of dementia, including an extract, transform, and load (ETL) pipeline for data harmonization. Additionally, we assessed the challenges of transforming and standardizing cohort data using the Observational Medical Outcomes Partnership (OMOP) common data model (CDM) and evaluated our tool in one of the cohorts using federated algorithms.
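As a rough illustration of what the transform step of such an ETL pipeline can involve, the sketch below maps one cohort-specific record to rows shaped like the OMOP CDM `measurement` table. It is not the authors' tool: the source variable names, the `CONCEPT_MAP` dictionary, and the concept IDs in it are hypothetical placeholders. The one OMOP convention it relies on is that concept ID 0 denotes "no matching concept", with the original variable name retained in `measurement_source_value`.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List

# Hypothetical mapping from cohort variable names to concept IDs.
# These IDs are placeholders, NOT real OMOP vocabulary entries.
CONCEPT_MAP: Dict[str, int] = {
    "mmse_total": 9_000_001,
    "sbp_mmhg": 9_000_002,
}

NO_MATCHING_CONCEPT = 0  # OMOP convention for values without a standard concept


@dataclass(frozen=True)
class MeasurementRow:
    """Subset of the OMOP CDM measurement table columns."""
    person_id: int
    measurement_concept_id: int
    measurement_date: date
    value_as_number: float
    measurement_source_value: str


def transform(record: Dict[str, str]) -> List[MeasurementRow]:
    """Turn one wide cohort record into long-format measurement rows."""
    person_id = int(record["id"])
    visit_date = date.fromisoformat(record["visit_date"])
    rows: List[MeasurementRow] = []
    for name, raw in record.items():
        if name in ("id", "visit_date") or raw in (None, ""):
            continue  # skip identifiers and missing values
        rows.append(
            MeasurementRow(
                person_id=person_id,
                # Cohort-specific fields without a mapping fall back to 0.
                measurement_concept_id=CONCEPT_MAP.get(name, NO_MATCHING_CONCEPT),
                measurement_date=visit_date,
                value_as_number=float(raw),
                measurement_source_value=name,
            )
        )
    return rows


# Example with a made-up record; "custom_scale" has no mapping.
rows = transform(
    {"id": "1", "visit_date": "2020-01-15", "mmse_total": "28", "custom_scale": "3"}
)
```

Keeping the source variable name alongside the concept ID means unmapped, cohort-specific fields are still loaded and can be mapped later if a suitable vocabulary entry becomes available.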

RESULTS:

We successfully applied our ETL tool and observed complete coverage of the cohorts' data by the OMOP CDM. The OMOP CDM facilitated data representation and standardization, but we identified limitations for cohort-specific data fields and in the scope of the available vocabularies. Specific challenges arise in a multi-cohort federated collaboration due to technical constraints in local environments, data heterogeneity, and lack of direct access to the data.

CONCLUSION:

In this article, we describe the solutions to the challenges and limitations encountered in our study. Our study shows the potential of federated learning as a privacy-preserving solution for multi-cohort studies, one that enhances reproducibility and reuse of both data and analyses.

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Dementia Limit: Humans Country/Region as subject: Europe Language: En Journal: J Biomed Inform Journal subject: MEDICAL INFORMATICS Year: 2024 Document type: Article
