Search | VHL Regional Portal

Secondary use of patient data within decentralized studies using the example of rare diseases in Germany: A data scientist's exploration of process and lessons learned.

Zoch, Michele; Gierschner, Christian; Andreeff, Anne-Katrin; Henke, Elisa; Sedlmayr, Martin; Müller, Gabriele; Tippmann, Jenny; Hebestreit, Helge; Choukair, Daniela; Hoffmann, Georg F; Fritz-Kebede, Fleur; Toepfner, Nicole; Berner, Reinhard; Biergans, Stephanie; Verbücheln, Raphael; Schaaf, Jannik; Fleck, Julia; Wirth, Felix Nikolaus; Schepers, Josef; Prasser, Fabian.

Digit Health ; 10: 20552076241265219, 2024.

Article in English | MEDLINE | ID: mdl-39130526

ABSTRACT

Objective: Unlocking the potential of routine medical data for clinical research requires the analysis of data from multiple healthcare institutions. However, according to German data protection regulations, data often cannot leave the individual institutions and decentralized approaches are needed. Decentralized studies face challenges regarding coordination, technical infrastructure, interoperability and regulatory compliance. Rare diseases are an important prototype research focus for decentralized data analyses, as patients are rare by definition and adequate cohort sizes can only be reached if data from multiple sites is combined. Methods: Within the project "Collaboration on Rare Diseases", decentralized studies focusing on four rare diseases (cystic fibrosis, phenylketonuria, Kawasaki disease, multisystem inflammatory syndrome in children) were conducted at 17 German university hospitals. To this end, a data management process for decentralized studies was developed by an interdisciplinary team of experts from medicine, public health and data science. Along the process, lessons learned were formulated and discussed. Results: The process consists of eight steps and includes sub-processes for the definition of medical use cases, script development and data management. The lessons learned concern, on the one hand, the organization and administration of the studies (collaboration of experts, use of standardized forms and publication of project information) and, on the other hand, the development of scripts and analysis (dependency on the database, use of standards and open source tools, feedback loops, anonymization). Conclusions: This work captures central challenges and describes possible solutions and can hence serve as a solid basis for the implementation and conduct of similar decentralized studies.

EasySMPC: a simple but powerful no-code tool for practical secure multiparty computation.

Wirth, Felix Nikolaus; Kussel, Tobias; Müller, Armin; Hamacher, Kay; Prasser, Fabian.

BMC Bioinformatics ; 23(1): 531, 2022 Dec 09.

Article in English | MEDLINE | ID: mdl-36494612

ABSTRACT

BACKGROUND: Modern biomedical research is data-driven and relies heavily on the re-use and sharing of data. Biomedical data, however, is subject to strict data protection requirements. Due to the complexity of the data required and the scale of data use, obtaining informed consent is often infeasible. Other methods, such as anonymization or federation, in turn have their own limitations. Secure multi-party computation (SMPC) is a cryptographic technology for distributed calculations, which brings formally provable security and privacy guarantees and can be used to implement a wide range of analytical approaches. As a relatively new technology, SMPC is still rarely used in real-world biomedical data sharing activities due to several barriers, including its technical complexity and lack of usability. RESULTS: To overcome these barriers, we have developed the tool EasySMPC, which is implemented in Java as a cross-platform, stand-alone desktop application provided as open-source software. The tool makes use of the SMPC method Arithmetic Secret Sharing, which allows pre-defined sets of variables to be summed securely across different parties in two rounds of communication (input sharing and output reconstruction), and integrates this method into a graphical user interface. No additional software services need to be set up or configured, as EasySMPC uses the most widespread digital communication channel available: e-mails. No cryptographic keys need to be exchanged between the parties and e-mails are exchanged automatically by the software. To demonstrate the practicability of our solution, we evaluated its performance in a wide range of data sharing scenarios. The results of our evaluation show that our approach is scalable (summing up 10,000 variables between 20 parties takes less than 300 s) and that the number of participants is the essential factor.
CONCLUSIONS: We have developed an easy-to-use "no-code solution" for performing secure joint calculations on biomedical data using SMPC protocols, which is suitable for use by scientists without IT expertise and which has no special infrastructure requirements. We believe that innovative approaches to data sharing with SMPC are needed to foster the translation of complex protocols into practice.
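
The two rounds named in the abstract above (input sharing and output reconstruction) can be illustrated with a minimal sketch of additive secret sharing. This is not EasySMPC's code: the modulus, the function names `share` and `secure_sum`, and the in-memory "exchange" are assumptions made for illustration, and the e-mail transport is omitted entirely. Inputs are assumed to be non-negative integers whose sum stays below the modulus.

```python
import secrets

# Assumption: a modulus large enough that the true sum never wraps around.
MODULUS = 2**61 - 1


def share(value, n_parties, modulus=MODULUS):
    """Split `value` into n additive shares that sum to `value` mod `modulus`."""
    # The first n-1 shares are uniformly random; the last one makes the sum fit.
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % modulus
    return shares + [last]


def secure_sum(private_values, modulus=MODULUS):
    """Simulate all parties in one process and return the sum of their inputs."""
    n = len(private_values)
    # Round 1 (input sharing): party i splits its value and sends share j to party j.
    distributed = [share(v, n, modulus) for v in private_values]
    # Each party j adds up the shares it received; a single share reveals nothing.
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % modulus for j in range(n)
    ]
    # Round 2 (output reconstruction): the partial sums are exchanged and added.
    return sum(partial_sums) % modulus


if __name__ == "__main__":
    # Hypothetical example: three sites each hold a private patient count.
    print(secure_sum([12, 7, 30]))
```

In this sketch no party ever sees another party's raw value, only random-looking shares and partial sums, which is the property that makes the summation privacy-preserving.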

Subject(s)

Biomedical Research , Computer Security , Humans , Information Dissemination , Software

Privacy-preserving data sharing infrastructures for medical research: systematization and comparison.

Wirth, Felix Nikolaus; Meurers, Thierry; Johns, Marco; Prasser, Fabian.

BMC Med Inform Decis Mak ; 21(1): 242, 2021 Aug 12.

Article in English | MEDLINE | ID: mdl-34384406

ABSTRACT

BACKGROUND: Data sharing is considered a crucial part of modern medical research. Unfortunately, despite its advantages, it often faces obstacles, especially data privacy challenges. As a result, various approaches and infrastructures have been developed that aim to ensure that patients and research participants remain anonymous when data is shared. However, privacy protection typically comes at a cost, e.g. restrictions regarding the types of analyses that can be performed on shared data. What is lacking is a systematization making the trade-offs taken by different approaches transparent. The aim of the work described in this paper was to develop a systematization for the degree of privacy protection provided and the trade-offs taken by different data sharing methods. Based on this contribution, we categorized popular data sharing approaches and identified research gaps by analyzing combinations of promising properties and features that are not yet supported by existing approaches. METHODS: The systematization consists of different axes. Three axes relate to privacy protection aspects and were adopted from the popular Five Safes Framework: (1) safe data, addressing privacy at the input level, (2) safe settings, addressing privacy during shared processing, and (3) safe outputs, addressing privacy protection of analysis results. Three additional axes address the usefulness of approaches: (4) support for de-duplication, to enable the reconciliation of data belonging to the same individuals, (5) flexibility, to be able to adapt to different data analysis requirements, and (6) scalability, to maintain performance with increasing complexity of shared data or common analysis processes. 
RESULTS: Using the systematization, we identified three different categories of approaches: distributed data analyses, which exchange anonymous aggregated data, secure multi-party computation protocols, which exchange encrypted data, and data enclaves, which store pooled individual-level data in secure environments for access for analysis purposes. We identified important research gaps, including a lack of approaches enabling the de-duplication of horizontally distributed data or providing a high degree of flexibility. CONCLUSIONS: There are fundamental differences between different data sharing approaches and several gaps in their functionality that may be interesting to investigate in future work. Our systematization can make the properties of privacy-preserving data sharing infrastructures more transparent and support decision makers and regulatory authorities with a better understanding of the trade-offs taken.

Subject(s)

Biomedical Research , Privacy , Computer Security , Humans , Information Dissemination

A Comprehensive Portal for Clinical and Translational Data Warehouses.

Johns, Marco; Müller, Armin; Wirth, Felix Nikolaus; Prasser, Fabian.

Stud Health Technol Inform ; 281: 462-466, 2021 May 27.

Article in English | MEDLINE | ID: mdl-34042786

ABSTRACT

Data-driven methods in biomedical research can help to obtain new insights into the development, progression and therapy of diseases. Clinical and translational data warehouses such as Informatics for Integrating Biology and the Bedside (i2b2) and tranSMART are important solutions for this. Of the well-known FAIR data principles, which address the aspects of findability, accessibility, interoperability and reusability, we focus on findability in this paper. For this purpose, we describe a portal solution that acts as a catalogue for a wide range of data warehouse instances, featuring a central access point and links to training material, such as user manuals and video tutorials. Moreover, the portal provides an overview of the status of multiple warehouses for developers and a set of statistics about the data currently loaded. Due to its modular design and the use of modern web technologies, the portal is easy to extend and customize to reflect different corporate designs and institutional requirements.

Subject(s)

Biomedical Research , Data Warehousing , Informatics

Citizen-Centered Mobile Health Apps Collecting Individual-Level Spatial Data for Infectious Disease Management: Scoping Review.

Wirth, Felix Nikolaus; Johns, Marco; Meurers, Thierry; Prasser, Fabian.

JMIR Mhealth Uhealth ; 8(11): e22594, 2020 Nov 10.

Article in English | MEDLINE | ID: mdl-33074833

ABSTRACT

BACKGROUND: The novel coronavirus SARS-CoV-2 rapidly spread around the world, causing the disease COVID-19. To contain the virus, much hope is placed on participatory surveillance using mobile apps, such as automated digital contact tracing, but broad adoption is an important prerequisite for associated interventions to be effective. Data protection aspects are a critical factor for adoption, and privacy risks of solutions developed often need to be balanced against their functionalities. This is reflected by an intensive discussion in the public and the scientific community about privacy-preserving approaches. OBJECTIVE: Our aim is to inform the current discussions and to support the development of solutions providing an optimal balance between privacy protection and pandemic control. To this end, we present a systematic analysis of existing literature on citizen-centered surveillance solutions collecting individual-level spatial data. Our main hypothesis is that there are dependencies between the following dimensions: the use cases supported, the technology used to collect spatial data, the specific diseases focused on, and data protection measures implemented. METHODS: We searched PubMed and IEEE Xplore with a search string combining terms from the area of infectious disease management with terms describing spatial surveillance technologies to identify studies published between 2010 and 2020. After a two-step eligibility assessment process, 27 articles were selected for the final analysis. We collected data on the four dimensions described as well as metadata, which we then analyzed by calculating univariate and bivariate frequency distributions. RESULTS: We identified four different use cases, which focused on individual surveillance and public health (most common: digital contact tracing). We found that the solutions described were highly specialized, with 89% (24/27) of the articles covering one use case only. 
Moreover, we identified eight different technologies used for collecting spatial data (most common: GPS receivers) and five different diseases covered (most common: COVID-19). Finally, we also identified six different data protection measures (most common: pseudonymization). As hypothesized, we identified relationships between the dimensions. We found that for highly infectious diseases such as COVID-19 the most common use case was contact tracing, typically based on Bluetooth technology. For managing vector-borne diseases, use cases require absolute positions, which are typically measured using GPS. Absolute spatial locations are also important for further use cases relevant to the management of other infectious diseases. CONCLUSIONS: We see a large potential for future solutions supporting multiple use cases by combining different technologies (eg, Bluetooth and GPS). For this to be successful, however, adequate privacy-protection measures must be implemented. Technologies currently used in this context probably cannot offer enough protection. We, therefore, recommend that future solutions should consider the use of modern privacy-enhancing techniques (eg, from the area of secure multiparty computing and differential privacy).
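
The univariate and bivariate frequency distributions mentioned in the methods of the abstract above can be sketched as follows. The records, the function names and the two-column layout are hypothetical and only illustrate the kind of tabulation described; they are not the review's data or code. The last helper reproduces the one percentage the abstract reports (24 of 27 articles).

```python
from collections import Counter

# Hypothetical records: one (use case, technology) pair per reviewed article.
articles = [
    ("contact tracing", "Bluetooth"),
    ("contact tracing", "Bluetooth"),
    ("contact tracing", "GPS"),
    ("vector-borne disease mapping", "GPS"),
]


def univariate(records, index):
    """Count how often each value of one dimension occurs."""
    return Counter(record[index] for record in records)


def bivariate(records, i, j):
    """Count how often each combination of two dimensions occurs."""
    return Counter((record[i], record[j]) for record in records)


def share_percent(count, total):
    """Express a count as a rounded percentage of the total."""
    return round(100 * count / total)


if __name__ == "__main__":
    print(univariate(articles, 0))
    print(bivariate(articles, 0, 1))
    print(share_percent(24, 27))
```

A bivariate table of this kind is what lets relationships between dimensions, such as use case and collection technology, show up as unevenly filled cells.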

Subject(s)

COVID-19/prevention & control , COVID-19/transmission , Contact Tracing/methods , Mobile Applications , Public Health Surveillance/methods , Spatio-Temporal Analysis , Computer Security , Humans , Pandemics , Privacy
